feat: Add intelligent auto-router and enhanced integrations

- Add intelligent-router.sh hook for automatic agent routing
- Add AUTO-TRIGGER-SUMMARY.md documentation
- Add FINAL-INTEGRATION-SUMMARY.md documentation
- Complete Prometheus integration (6 commands + 4 tools)
- Complete Dexto integration (12 commands + 5 tools)
- Enhanced Ralph with access to all agents
- Fix /clawd command (removed disable-model-invocation)
- Update hooks.json to v5 with intelligent routing
- 291 total skills now available
- All 21 commands with automatic routing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
commit b52318eeae (parent 3b128ba3bd, unverified)
Author: admin
Date: 2026-01-28 00:27:56 +04:00
1724 changed files with 351216 additions and 0 deletions

prometheus/CONTRIBUTING.md (new file, 114 lines)

@@ -0,0 +1,114 @@
# 🧠 Contributing to **Prometheus**
Thank you for your interest in contributing to **Prometheus** — we're excited to have you on board!
Your contributions help us build a stronger, smarter foundation for autonomous software reasoning. 💪
---
## 🚀 Getting Started
1. **Fork the Repository**
Click *Fork* on GitHub and clone your fork locally:
```bash
git clone https://github.com/<your-username>/Prometheus.git
```
2. **Set Up the Environment**
Follow the setup instructions in [`README.md`](./README.md) to install dependencies and configure your development environment.
3. **Create a Feature Branch**
Use a descriptive name for your branch:
```bash
git checkout -b feature/short-description
```
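The fork → clone → feature-branch flow above can be exercised end to end without touching GitHub; this sketch uses a throwaway directory and a local bare repo as a stand-in for your fork (all paths and names are placeholders):

```shell
# Throwaway demo of the clone-and-branch flow (no network needed):
demo=$(mktemp -d)
git init -q --bare "$demo/fork.git"                 # stand-in for your GitHub fork
git clone -q "$demo/fork.git" "$demo/Prometheus"    # step 1: clone your fork
cd "$demo/Prometheus"
git checkout -q -b feature/short-description        # step 3: descriptive branch
git branch --show-current                           # → feature/short-description
```

In real use, the clone URL is your fork and you would also add the upstream repository as a second remote to keep your fork in sync.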
---
## 🧩 Development Guidelines
### 🧱 Code Style
* We use **[ruff](https://docs.astral.sh/ruff/)** for linting and formatting.
* Before committing, run:
```bash
ruff format
ruff check --fix
```
* Use clear, descriptive names for variables, functions, and classes.
* Keep your code modular and well-documented.
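One convenient way to make the lint step automatic is a local git pre-commit hook. This is a hypothetical sketch, not something the repo ships; it simply wires the two `ruff` commands above into git:

```shell
# Install a minimal local pre-commit hook (illustrative only):
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# Abort the commit if formatting or lint checks fail.
set -e
ruff format --check .
ruff check .
EOF
chmod +x .git/hooks/pre-commit
```

Hooks in `.git/hooks` are per-clone and never committed, so this stays a personal convenience rather than project policy.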
### 🧪 Testing
* Write tests for **every new feature or bug fix**.
* Run the test suite before pushing:
```bash
coverage run --source=prometheus -m pytest -v -s -m "not git"
coverage report
```
* Ensure test coverage remains high and includes both **unit** and **integration** tests.
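The `-m "not git"` filter above deselects tests tagged with a `git` marker. Pytest markers are typically registered in `pyproject.toml`; a hypothetical registration (check the repo's actual configuration for the real one) might look like:

```toml
# Hypothetical pytest marker registration; consult the project's
# real pyproject.toml before relying on this.
[tool.pytest.ini_options]
markers = [
    "git: tests that require a real git installation or repository",
]
```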
---
## 🔁 Pull Request Process
### ✅ Before You Submit
* Update relevant documentation.
* Ensure all tests and CI checks pass.
* Keep changes **focused, atomic, and well-scoped**.
### 📬 Submitting a PR
1. Open a Pull Request with a clear, descriptive title.
2. Explain *what* you changed and *why* it matters.
3. Link any related issues.
4. Provide **reproduction steps** or **test instructions**, if applicable.
### 👀 Review Process
* Maintainers will review your PR and may suggest improvements.
* Please address feedback respectfully and promptly.
* Once approved, your PR will be merged into the main branch. 🎉
---
## 🐞 Reporting Issues
If you encounter a problem:
* Open a GitHub issue with a **clear description**.
* Include steps to reproduce, logs, and screenshots if possible.
* Describe the **expected** vs **actual** behavior.
Well-documented issues are easier and faster to fix!
---
## 🤝 Code of Conduct
We expect all contributors to:
* Be respectful, inclusive, and professional.
* Welcome constructive feedback.
* Prioritize what's best for the community.
* Show empathy and kindness to others.
We're building a community of collaboration and innovation — let's keep it positive and inspiring. ✨
---
## 💬 Need Help?
If you have questions or ideas:
* Start a discussion in [GitHub Discussions](../../discussions)
* Open an issue for technical topics
* Contact the maintainers directly
* Email us at 📧 **[team@euni.ai](mailto:team@euni.ai)**
---
Thank you for helping make **Prometheus** better.
Together, we're shaping the future of autonomous code reasoning. 🚀

prometheus/Dockerfile (new file, 33 lines)

@@ -0,0 +1,33 @@
FROM python:3.11-slim
ENV PYTHONUNBUFFERED=1
# Install Docker CLI and other dependencies
RUN apt-get update && \
    apt-get install -y \
        git \
        apt-transport-https \
        ca-certificates \
        curl \
        gnupg \
        build-essential \
        gcc \
        lsb-release && \
    curl -fsSL https://download.docker.com/linux/debian/gpg | gpg --dearmor -o /usr/share/keyrings/docker-archive-keyring.gpg && \
    echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/docker-archive-keyring.gpg] https://download.docker.com/linux/debian $(lsb_release -cs) stable" > /etc/apt/sources.list.d/docker.list && \
    apt-get update && \
    apt-get install -y docker-ce-cli && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /app
COPY . /app
RUN pip install --upgrade pip \
    && pip install hatchling \
    && pip install .[test]
EXPOSE 9002
CMD ["uvicorn", "prometheus.app.main:app", "--host", "0.0.0.0", "--port", "9002"]
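A typical build-and-run helper for this image can be sketched as follows. The image tag is a placeholder, and mounting the Docker socket is an assumption inferred from the image installing `docker-ce-cli` (the app presumably needs to reach the host daemon); verify both against the project's docs:

```shell
# Write a hypothetical convenience script for building and running the image:
cat > run-prometheus.sh <<'EOF'
#!/bin/sh
set -e
# Build the image; "prometheus:dev" is a placeholder tag.
docker build -t prometheus:dev .
# Port 9002 matches the Dockerfile's EXPOSE/CMD; the socket mount
# is an assumption based on docker-ce-cli being installed.
exec docker run --rm -p 9002:9002 \
    -v /var/run/docker.sock:/var/run/docker.sock prometheus:dev
EOF
chmod +x run-prometheus.sh
```

Once built, the API should be reachable on `http://localhost:9002` per the `uvicorn` CMD.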

prometheus/LICENSE (new file, 674 lines)

@@ -0,0 +1,674 @@
GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
<program> Copyright (C) <year> <name of author>
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
<https://www.gnu.org/licenses/>.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
<https://www.gnu.org/licenses/why-not-lgpl.html>.

prometheus/README.md
<a name="readme-top"></a>
<div align="center">
<img src="./docs/static/images/icon.jpg" alt="Prometheus Logo" width="160">
<h1 style="border-bottom: none;">
<b><a href="https://euni.ai/" target="_blank">Prometheus</a></b><br>
⚡ AI-Powered Software 3.0 Platform ⚡
</h1>
<p align="center">
<strong>Redefining Software Productivity Beyond Vibe Coding — There's More Beyond Lovable and Replit.</strong><br>
<em>Moving beyond unreliable prototype generation, Prometheus turns your ideas into verifiable, affordable software through autonomous code agents.</em>
</p>
<!-- 🌍 Project Links -->
<p align="center">
<a href="https://euni.ai/"><b>Website</b></a> •
<a href="https://x.com/Euni_AI"><b>X/Twitter</b></a> •
<a href="https://www.linkedin.com/company/euni-ai/"><b>LinkedIn</b></a> •
<a href="https://discord.gg/jDG4wqkKZj"><b>Discord</b></a> •
<a href="https://www.reddit.com/r/EuniAI"><b>Reddit</b></a> •
<a href="https://github.com/EuniAI/Prometheus"><b>GitHub</b></a>
</p>
<!-- Badges -->
<p align="center">
<a href="https://github.com/EuniAI/Prometheus/stargazers">
<img src="https://img.shields.io/github/stars/EuniAI/Prometheus?style=for-the-badge&color=yellow" alt="Stars">
</a>
<a href="https://github.com/EuniAI/Prometheus/forks">
<img src="https://img.shields.io/github/forks/EuniAI/Prometheus?style=for-the-badge&color=blue" alt="Forks">
</a>
<a href="https://opensource.org/licenses/Apache-2.0">
<img src="https://img.shields.io/badge/license-Apache--2.0-green?style=for-the-badge" alt="License: Apache 2.0">
</a>
<a href="https://www.arxiv.org/abs/2507.19942">
<img src="https://img.shields.io/badge/Paper-arXiv-red?style=for-the-badge&logo=arxiv&logoColor=white" alt="arXiv Paper">
</a>
<a href="https://github.com/EuniAI/Prometheus/graphs/contributors">
<img src="https://img.shields.io/github/contributors/EuniAI/Prometheus?style=for-the-badge&color=orange" alt="Contributors">
</a>
</p>
<p align="center">
<a href="https://github.com/EuniAI/Prometheus" target="_blank">
<img src="https://img.shields.io/github/commit-activity/m/EuniAI/Prometheus?label=Commits&color=brightgreen&style=flat" alt="Commit Activity">
</a>
<a href="https://github.com/EuniAI/Prometheus/forks" target="_blank">
<img src="https://img.shields.io/github/forks/EuniAI/Prometheus.svg?style=flat&color=blue&label=Forks" alt="Forks">
</a>
<a href="https://github.com/EuniAI/Prometheus/issues" target="_blank">
<img alt="Issues Closed" src="https://img.shields.io/github/issues-search?query=repo%3AEuniAI%2FPrometheus%20is%3Aclosed&label=Issues%20Closed&labelColor=%237d89b0&color=%235d6b98&style=flat">
</a>
<a href="https://github.com/EuniAI/Prometheus/discussions" target="_blank">
<img alt="Discussion Posts" src="https://img.shields.io/github/discussions/EuniAI/Prometheus?label=Discussions&labelColor=%239b8afb&color=%237a5af8&style=flat">
</a>
</p>
<hr style="width:80%;border:1px solid #ddd;">
</div>
<!-- <div align="center">
<a href="https://github.com/EuniAI/Prometheus/graphs/contributors"><img src="https://img.shields.io/github/contributors/EuniAI/Prometheus?style=for-the-badge&color=blue" alt="Contributors"></a>
<a href="https://github.com/EuniAI/Prometheus/stargazers"><img src="https://img.shields.io/github/stars/EuniAI/Prometheus?style=for-the-badge&color=blue" alt="Stargazers"></a>
<a href="https://www.arxiv.org/abs/2507.19942"><img src="https://img.shields.io/badge/Paper-arXiv-red?style=for-the-badge&logo=arxiv" alt="Paper"></a>
<br/>
<a href="https://github.com/EuniAI/Prometheus/blob/main/CREDITS.md"><img src="https://img.shields.io/badge/Project-Credits-blue?style=for-the-badge&color=FFE165&logo=github&logoColor=white" alt="Credits"></a>
<a href="https://discord.gg/jDG4wqkKZj"><img src="https://img.shields.io/badge/Discord-Join%20Us-5865F2?style=for-the-badge&logo=discord&logoColor=white" alt="Discord"></a>
<br/>
<hr>
</div> -->
<br/>
## 📣 News
- **[2025-11]** Prometheus reached Top 5 overall and Top 1 among GPT-5-based agents 🏆 on the **[SWE-bench leaderboard](https://www.swebench.com/)** for automated software engineering with LLMs 🎉!
- **[2025-10]** We maintain a curated list of code agent projects and research: **[Awesome Code Agents](https://github.com/EuniAI/awesome-code-agents)** - explore the cutting-edge developments in autonomous code generation, bug fixing, and software engineering automation.
---
## 📖 Overview
Prometheus is a research-backed, production-ready platform that leverages **unified knowledge graphs** and **multi-agent systems** to perform intelligent operations on multilingual codebases. Built on LangGraph state machines, it orchestrates specialized AI agents to automatically classify issues, reproduce bugs, retrieve relevant context, and generate validated patches.
### Key Capabilities
- **Automated Issue Resolution**: End-to-end bug fixing with reproduction, patch generation, and multi-level validation
- **Feature Implementation Pipeline**: Context-aware feature request analysis, implementation planning, and code generation with optional regression testing
- **Intelligent Context Retrieval**: Graph-based semantic search over codebase structure, AST, and documentation
- **Multi-Agent Orchestration**: Coordinated workflow between classification, reproduction, and resolution agents
- **Knowledge Graph Integration**: Neo4j-powered unified representation of code structure and semantics
- **Containerized Execution**: Docker-isolated testing and validation environment
- **Question Answering**: Natural language queries with tool-augmented LLM agents
📖 **[Multi-Agent Architecture](docs/Multi-Agent-Architecture.md)** | 📄 **[Research Paper](https://arxiv.org/abs/2507.19942)**
---
## 🤖 Why Prometheus?
| System | Core Description | Limitations | Why **Prometheus** is Superior |
|---------------------------------------------------------|----------------------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| **[SWE-Agent](https://github.com/SWE-agent/SWE-agent)** | Research baseline for automatic bug fixing using a single LLM-computer interface | Static, single-turn reasoning<br> No cross-file or cross-repo understanding<br> Lacks full detect-reproduce-repair-verify (DRRV) automation | ✅ Prometheus performs **multi-agent collaborative reasoning** across files and commits, enabling full-cycle issue understanding and repair |
| **[Lingxi](https://github.com/lingxi-agent/Lingxi)** | Multi-agent system for automated bug fixing and code reasoning | Limited context retrieval<br> No persistent knowledge graph or long-term memory<br> Requires human validation for many patches | ✅ Prometheus integrates a **Unified Codebase Knowledge Graph** and **long-term memory (Athena)** for deeper semantic reasoning and repository-wide understanding |
| **[TRAE](https://github.com/bytedance/trae-agent)** | Multi-agent reasoning and tool execution framework | Focused on task orchestration rather than reasoning depth<br> No unified memory or structured code representation | ✅ Prometheus emphasizes **deep reasoning and knowledge unification**, allowing consistent understanding across large and complex repositories |
| **[OpenHands](https://github.com/OpenHands/OpenHands)** | General-purpose open-source AI developer using sandbox execution | Strong executor but weak contextual reasoning<br> No repository-level semantic linking<br> Task-by-task operation only | ✅ Prometheus combines **contextual understanding and code reasoning**, achieving coherent, reproducible debugging and intelligent code repair |
---
## 🏗️ Architecture
Prometheus implements a hierarchical multi-agent system:
```
User Issue
|
v
┌─────────────────────────────────┐
│ Issue Classification Agent │
│ (bug/question/feature/doc) │
└─────────────┬───────────────────┘
|
┌───────────────┼───────────────┐
| | |
v v v
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│Bug Pipeline │ │Feature │ │Question │
│ │ │Pipeline │ │Pipeline │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
| | |
v v v
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│Bug │ │Feature │ │Context │
│Reproduction │ │Analysis │ │Retrieval │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
| | |
v v v
┌──────────────┐ ┌──────────────┐ ┌──────────────┐
│Issue │ │Patch │ │Question │
│Resolution │ │Generation │ │Analysis │
└──────┬───────┘ └──────┬───────┘ └──────┬───────┘
| | |
└─────────────────┼─────────────────┘
v
Response Generation
```
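The dispatch step at the top of this diagram can be sketched in a few lines. The pipeline functions below are illustrative stand-ins, not the actual LangGraph subgraphs:

```python
# Conceptual sketch of the routing layer in the diagram above.
# Each pipeline is a plain function here; in Prometheus these stages
# are LangGraph subgraphs coordinated through shared state.

def bug_pipeline(issue: str) -> str:
    return f"bug fix for: {issue}"

def feature_pipeline(issue: str) -> str:
    return f"feature plan for: {issue}"

def question_pipeline(issue: str) -> str:
    return f"answer for: {issue}"

PIPELINES = {
    "bug": bug_pipeline,
    "feature": feature_pipeline,
    "question": question_pipeline,
}

def route_issue(issue_type: str, issue: str) -> str:
    """Dispatch a classified issue to its pipeline."""
    handler = PIPELINES.get(issue_type)
    if handler is None:
        raise ValueError(f"unsupported issue type: {issue_type}")
    return handler(issue)
```

In the real system each branch also shares retrieved context downstream, which a flat dispatch table cannot show.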
**Core Components**:
- **Knowledge Graph**: Tree-sitter-based AST and semantic code representation in Neo4j
- **LangGraph State Machines**: Coordinated multi-agent workflows with checkpointing
- **Docker Containers**: Isolated build and test execution environments
- **LLM Integration**: Multi-tier model strategy (GPT-4, Claude, Gemini support)
See **[Architecture Documentation](docs/Multi-Agent-Architecture.md)** for details.
---
## ⚡ Quick Start
### ✅ Prerequisites
- **Docker** and **Docker Compose**
- **Python 3.11+** (for local development)
- **API Keys**: OpenAI, Anthropic, or Google Gemini
### 📦 Installation
1. **Clone the repository**
```bash
git clone https://github.com/EuniAI/Prometheus.git
cd Prometheus
```
2. **Configure environment**
```bash
cp example.env .env
# Edit .env with your API keys
```
3. **Generate JWT secret** (required for authentication)
```bash
python -m prometheus.script.generate_jwt_token
# Copy output to .env as PROMETHEUS_JWT_SECRET_KEY
```
4. **Create working directory**
```bash
mkdir -p working_dir
```
5. **Start services**
```bash
docker-compose up --build
```
6. **Access the platform**
- API: [http://localhost:9002/v1.2](http://localhost:9002/v1.2)
- Interactive Docs: [http://localhost:9002/docs](http://localhost:9002/docs)
---
## 💻 Development
### 🛠️ Local Setup
```bash
# Install dependencies
pip install hatchling
pip install .
pip install .[test]
# Run development server
uvicorn prometheus.app.main:app --host 0.0.0.0 --port 9002 --reload
```
### 🧪 Testing
```bash
# Run tests (excluding git-dependent tests)
coverage run --source=prometheus -m pytest -v -s -m "not git"
# Generate coverage report
coverage report -m
coverage html
open htmlcov/index.html
```
### 🗄️ Database Setup
**PostgreSQL** (required for state checkpointing):
```bash
docker run -d \
-p 5432:5432 \
-e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=password \
-e POSTGRES_DB=postgres \
postgres
```
**Neo4j** (required for knowledge graph):
```bash
docker run -d \
-p 7474:7474 -p 7687:7687 \
-e NEO4J_AUTH=neo4j/password \
-e NEO4J_PLUGINS='["apoc"]' \
-e NEO4J_dbms_memory_heap_initial__size=4G \
-e NEO4J_dbms_memory_heap_max__size=8G \
neo4j
```
Verify at [http://localhost:7474](http://localhost:7474)
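As an extra check beyond the browser, a short script can probe the default ports exposed by the two containers (7474/7687 for Neo4j, 5432 for PostgreSQL):

```python
import socket

def port_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    for name, port in [("Neo4j HTTP", 7474), ("Neo4j Bolt", 7687), ("PostgreSQL", 5432)]:
        status = "up" if port_open("localhost", port) else "down"
        print(f"{name} ({port}): {status}")
```

A TCP probe only confirms the port is listening; use the Neo4j browser or `cypher-shell` to verify credentials.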
---
## 📚 Research & Citation
Prometheus is based on peer-reviewed research on unified knowledge graphs for multilingual code analysis.
```bibtex
@misc{prometheus2025,
title={Prometheus: Unified Knowledge Graphs for Issue Resolution in Multilingual Codebases},
author={Zimin Chen and Yue Pan and Siyu Lu and Jiayi Xu and Claire Le Goues and Martin Monperrus and He Ye},
year={2025},
eprint={2507.19942},
archivePrefix={arXiv},
primaryClass={cs.SE},
url={https://arxiv.org/abs/2507.19942}
}
```
---
## 🤝 Contributing
We welcome contributions! Please see our **[Contributing Guidelines](CONTRIBUTING.md)** for details on how to get started.
**Quick Links**:
- 📖 Read the full [Contributing Guide](CONTRIBUTING.md)
- 🐞 Report bugs via [GitHub Issues](https://github.com/EuniAI/Prometheus/issues)
- ✨ Submit feature requests and improvements via Pull Requests
- 💬 Join discussions on [Discord](https://discord.gg/jDG4wqkKZj)
We're grateful to all our amazing contributors who have made this project what it is today!
<a href="https://github.com/EuniAI/Prometheus/graphs/contributors">
<img src="https://contrib.rocks/image?repo=EuniAI/Prometheus&r=" width="400px"/>
</a>
If you have any questions or encounter issues, please feel free to reach out. For quick queries, you can also check our `Issues` page for common questions and solutions.
---
## 📜 License
This project is dual-licensed:
- **Community Edition**: Licensed under the [GNU General Public License v3.0 (GPLv3)](https://www.gnu.org/licenses/gpl-3.0.html).
You are free to use, modify, and redistribute this code, provided that any derivative works are also released under the GPLv3.
This ensures the project remains open and contributions benefit the community.
- **Commercial Edition**: For organizations that wish to use this software in **proprietary, closed-source, or commercial settings**,
a separate commercial license is available. Please contact **EUNI.AI Team** to discuss licensing terms.
---
## 💬 Support
- **Documentation**: [Multi-Agent Architecture](docs/Multi-Agent-Architecture.md) | [GitHub Issue Debug Guide](docs/GitHub-Issue-Debug-Guide.md)
- **Community**: [Discord Server](https://discord.gg/jDG4wqkKZj)
- **Email**: business@euni.ai
- **Issues**: [GitHub Issues](https://github.com/EuniAI/Prometheus/issues)
---
## 🌟 Star History
[![Star History Chart](https://api.star-history.com/svg?repos=EuniAI/Prometheus&type=Date)](https://www.star-history.com/#EuniAI/Prometheus&Date)
---
## 🙏 Acknowledgments
<div align="center">
<img src="./docs/static/images/delysium_logo.svg" alt="Delysium Logo" width="150">
</div>
We thank [Delysium](https://delysium.com) for their support in organizing LLM-related resources, architecture design, and optimization, which greatly strengthened our research infrastructure and capabilities.
---
<div align="center">
<p>Made with ❤️ by the <a href="https://euni.ai/">EuniAI</a> Team</p>
<p>
<a href="#readme-top">Back to top ↑</a>
</p>
</div>

networks:
prometheus_network:
driver: bridge
services:
neo4j:
image: neo4j
container_name: prometheus_neo4j_container
networks:
- prometheus_network
environment:
- NEO4J_AUTH=neo4j/password
- NEO4J_PLUGINS=["apoc"]
- NEO4J_server_memory_heap_initial__size=6G
- NEO4J_server_memory_heap_max__size=12G
- NEO4J_dbms_memory_transaction_total_max=12G
- NEO4J_db_transaction_timeout=600s
volumes:
- ./data_neo4j:/var/lib/neo4j/data
healthcheck:
test: ["CMD", "cypher-shell", "-u", "neo4j", "-p", "password", "--non-interactive", "RETURN 1;"]
interval: 30s
timeout: 60s
retries: 3
postgres:
image: postgres:16
container_name: prometheus_postgres_container
networks:
- prometheus_network
environment:
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=password
- POSTGRES_DB=postgres
volumes:
- ./data_postgres:/var/lib/postgresql/data
healthcheck:
test: ["CMD-SHELL", "pg_isready -d postgres -U postgres"]
interval: 30s
timeout: 60s
retries: 3
prometheus:
build: .
container_name: prometheus
networks:
- prometheus_network
ports:
- "9002:9002"
environment:
# Logging
- PROMETHEUS_LOGGING_LEVEL=${PROMETHEUS_LOGGING_LEVEL}
# General settings
- PROMETHEUS_ENVIRONMENT=${PROMETHEUS_ENVIRONMENT}
- PROMETHEUS_BACKEND_CORS_ORIGINS=${PROMETHEUS_BACKEND_CORS_ORIGINS}
- PROMETHEUS_ENABLE_AUTHENTICATION=${PROMETHEUS_ENABLE_AUTHENTICATION}
# Neo4j settings
- PROMETHEUS_NEO4J_URI=${PROMETHEUS_NEO4J_URI}
- PROMETHEUS_NEO4J_USERNAME=${PROMETHEUS_NEO4J_USERNAME}
- PROMETHEUS_NEO4J_PASSWORD=${PROMETHEUS_NEO4J_PASSWORD}
- PROMETHEUS_NEO4J_BATCH_SIZE=${PROMETHEUS_NEO4J_BATCH_SIZE}
# Knowledge Graph settings
- PROMETHEUS_KNOWLEDGE_GRAPH_MAX_AST_DEPTH=${PROMETHEUS_KNOWLEDGE_GRAPH_MAX_AST_DEPTH}
- PROMETHEUS_KNOWLEDGE_GRAPH_CHUNK_SIZE=${PROMETHEUS_KNOWLEDGE_GRAPH_CHUNK_SIZE}
- PROMETHEUS_KNOWLEDGE_GRAPH_CHUNK_OVERLAP=${PROMETHEUS_KNOWLEDGE_GRAPH_CHUNK_OVERLAP}
- PROMETHEUS_WORKING_DIRECTORY=${PROMETHEUS_WORKING_DIRECTORY}
# LLM model settings
- PROMETHEUS_ADVANCED_MODEL=${PROMETHEUS_ADVANCED_MODEL}
- PROMETHEUS_BASE_MODEL=${PROMETHEUS_BASE_MODEL}
# API keys for various LLM providers
- PROMETHEUS_ANTHROPIC_API_KEY=${PROMETHEUS_ANTHROPIC_API_KEY}
- PROMETHEUS_GEMINI_API_KEY=${PROMETHEUS_GEMINI_API_KEY}
- PROMETHEUS_OPENAI_FORMAT_API_KEY=${PROMETHEUS_OPENAI_FORMAT_API_KEY}
- PROMETHEUS_OPENAI_FORMAT_BASE_URL=${PROMETHEUS_OPENAI_FORMAT_BASE_URL}
# Model settings
- PROMETHEUS_ADVANCED_MODEL_TEMPERATURE=${PROMETHEUS_ADVANCED_MODEL_TEMPERATURE}
- PROMETHEUS_BASE_MODEL_TEMPERATURE=${PROMETHEUS_BASE_MODEL_TEMPERATURE}
# Tavily API key
- PROMETHEUS_TAVILY_API_KEY=${PROMETHEUS_TAVILY_API_KEY}
# Database settings
- PROMETHEUS_DATABASE_URL=${PROMETHEUS_DATABASE_URL}
# JWT settings
- PROMETHEUS_JWT_SECRET_KEY=${PROMETHEUS_JWT_SECRET_KEY}
volumes:
- .:/app
- /var/run/docker.sock:/var/run/docker.sock
depends_on:
neo4j:
condition: service_healthy
postgres:
condition: service_healthy

## Evaluation Log

This log tracks our evaluation results and associated costs.
| Date | Executed by | Version | Dataset | #Instance | Model | Resolved Rate | API Cost | Notes |
|------------|-------------|----------------------|-----------------------------------|-----------|----------------------|---------------|----------|------------------------------------|
| 2025-07-08 | Yue Pan | v1.0 | SWE-Bench Lite | 300 | DeepSeek V3 | 28.67% | $70.05 | initial version |
| 2025-07-18 | Yue Pan | v1.0 | SWE-Bench Multilingual | 300 | DeepSeek V3 | 13.67% | $113.6 | initial version |
| 2025-07-31 | Yue Pan | v1.1 | SWE-Bench Lite | 300 | GPT-4o | 30.00% | $1569.73 | context retrieval improved version |
| 2025-08-09 | Zhaoyang | v1.0 | SWE-Bench Verified | 500 | Devstral Medium 2507 | 33.00% | - | |
| 2025-08-11 | Yue Pan | v1.1 | SWE-Bench Verified | 500 | Devstral Medium 2507 | 38.4% | - | |
| 2025-11-06 | Yue Pan | v1.3(with Athena) | dcloud347/SWE-bench_verified_lite | 50 | GPT-5 + gpt-4o | 70.00% | $200.79 | |
| 2025-11-06 | Yue Pan | v1.3(without Athena) | dcloud347/SWE-bench_verified_lite | 50 | GPT-5 + gpt-4o | 56.00% | $367.73 | |
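One derived metric worth tracking is API cost per resolved instance. A back-of-the-envelope comparison computed from the first and the with-Athena rows above (resolved counts rounded to whole instances):

```python
def cost_per_resolved(instances: int, resolved_rate: float, api_cost: float) -> float:
    """Approximate API cost (USD) per successfully resolved instance."""
    resolved = round(instances * resolved_rate)
    return api_cost / resolved

lite_v1 = cost_per_resolved(300, 0.2867, 70.05)        # SWE-Bench Lite, DeepSeek V3
verified_athena = cost_per_resolved(50, 0.70, 200.79)  # v1.3 with Athena, GPT-5 + gpt-4o
print(f"{lite_v1:.2f}")          # ≈ 0.81 USD per resolved instance
print(f"{verified_athena:.2f}")  # ≈ 5.74 USD per resolved instance
```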

# GitHub Issue Auto Debug Script Usage Guide
## Overview
`prometheus/script/github_issue_debug.py` is an automated script for:
1. Retrieving detailed information (title, body, comments, etc.) of a specified issue from the GitHub API.
2. Automatically uploading the GitHub repository to Prometheus.
3. Using Prometheus's AI analysis capabilities to debug the issue.
4. Returning analysis results, fix patches, etc.
## Prerequisites
### 1. Start Prometheus Service
Ensure the Prometheus service is running:
```bash
# Start using docker-compose
docker-compose up --build
```
### 2. Obtain GitHub Personal Access Token
1. Visit https://github.com/settings/tokens
2. Click "Generate new token (classic)"
3. Select the appropriate permission scope:
- `repo` (access private repositories)
- `public_repo` (access public repositories)
4. Generate and save the token.
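Under the hood, step 1 of the script amounts to a single REST call against the GitHub API. A sketch using only the standard library (`fetch_issue` is illustrative, not the script's actual function name):

```python
import json
import urllib.request

API_ROOT = "https://api.github.com"

def issue_url(repo: str, issue_number: int) -> str:
    """Build the REST endpoint for a single issue, e.g. owner/repo #42."""
    return f"{API_ROOT}/repos/{repo}/issues/{issue_number}"

def fetch_issue(repo: str, issue_number: int, token: str) -> dict:
    """Fetch title, body, state, etc. for an issue (requires a valid token)."""
    req = urllib.request.Request(
        issue_url(repo, issue_number),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Issue comments live at a separate endpoint (`.../issues/{number}/comments`), which the script also queries.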
## Basic Usage
### Simple Example
```bash
python github_issue_debug.py \
--github-token "your_token_here" \
--repo "owner/repository" \
--issue-number 42
```
### Full Parameter Example
```bash
python github_issue_debug.py \
--github-token "ghp_xxxxxxxxxxxxxxxxxxxx" \
--repo "microsoft/vscode" \
--issue-number 123 \
--prometheus-url "http://localhost:9002/v1.2" \
--output-file "debug_result.json" \
--run-build \
--run-test \
--run-reproduction-test \
--run-regression-test \
--push-to-remote \
--image-name "python:3.11-slim" \
--workdir "/app" \
--build-commands "pip install -r requirements.txt" "python setup.py build" \
--test-commands "pytest tests/" \
--candidate-patches 3
```
## Parameter Details
### Required Parameters
- `--github-token`: GitHub Personal Access Token
- `--repo`: GitHub repository name in the format `owner/repo`
- `--issue-number`: Issue number to process
### Optional Parameters
- `--prometheus-url`: Prometheus service address (default: http://localhost:8000)
- `--output-file`: Path to the result output file (if not specified, output to console)
### Validation Options
- `--run-build`: Run build validation for the generated patch
- `--run-test`: Run test validation for the generated patch
- `--run-reproduction-test`: Run reproduction test to verify if the issue can be reproduced
- `--run-regression-test`: Run regression test to ensure existing functionality is not broken
- `--push-to-remote`: Push the fix to a remote Git branch
### Docker Environment Configuration
- `--dockerfile-content`: Specify Dockerfile content directly
- `--image-name`: Use a predefined Docker image
- `--workdir`: Working directory inside the container (default: /app)
- `--build-commands`: List of build commands
- `--test-commands`: List of test commands
### Other Options
- `--candidate-patches`: Number of candidate patches (default: 6)
## Usage Scenarios
### Scenario 1: Simple Bug Report Analysis
```bash
# Analyze a simple bug report without running any validation
python github_issue_debug.py \
--github-token "your_token" \
--repo "pytorch/pytorch" \
--issue-number 89123
```
### Scenario 2: Python Project with Test Validation
```bash
# Perform a complete debug for a Python project, including build and test validation
python github_issue_debug.py \
--github-token "your_token" \
--repo "requests/requests" \
--issue-number 5678 \
--run-build \
--run-test \
--run-reproduction-test \
--run-regression-test \
--image-name "python:3.11-slim" \
--build-commands "pip install -e ." \
--test-commands "pytest tests/test_requests.py"
```
### Scenario 3: Node.js Project with Auto Push
```bash
# Process an issue for a Node.js project and automatically push the fix to a remote branch
python github_issue_debug.py \
--github-token "your_token" \
--repo "facebook/react" \
--issue-number 9876 \
--run-build \
--run-test \
--run-reproduction-test \
--run-regression-test \
--push-to-remote \
--image-name "node:18-slim" \
--build-commands "npm ci" "npm run build" \
--test-commands "npm test"
```
### Scenario 4: Custom Docker Environment
```bash
# Use a custom Dockerfile for debugging
python github_issue_debug.py \
--github-token "your_token" \
--repo "tensorflow/tensorflow" \
--issue-number 4321 \
--run-build \
--dockerfile-content "FROM tensorflow/tensorflow:latest-gpu
WORKDIR /app
COPY . /app
RUN pip install -r requirements.txt" \
--workdir "/app" \
--build-commands "python setup.py build_ext --inplace" \
--test-commands "python -m pytest tests/unit/"
```
## Output Result Explanation
After execution, the script outputs results in JSON format, including the following fields:
```json
{
"success": true,
"issue_info": {
"repo": "owner/repo",
"number": 123,
"title": "Issue Title",
"url": "https://github.com/owner/repo/issues/123",
"state": "open"
},
"prometheus_result": {
"patch": "Generated code patch",
"passed_reproducing_test": true,
"passed_existing_test": false,
"passed_regression_test": true,
"passed_reproduction_test": true,
"issue_response": "AI-generated issue response"
},
"created_branch_and_pushed": true,
"branch_name": "fix-issue-123"
}
```
### Result Field Description
- `success`: Whether the process was successful
- `issue_info`: Basic information about the GitHub issue
- `prometheus_result.patch`: Code fix patch generated by Prometheus
- `prometheus_result.passed_*`: Status of various validations
- `prometheus_result.issue_response`: AI-generated issue analysis and response
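When scripting on top of this output, the validation flags can be summarized mechanically. A small sketch using the field names from the example above:

```python
import json

def summarize_result(raw: str) -> str:
    """Condense a debug-result JSON document into a one-line summary."""
    result = json.loads(raw)
    if not result.get("success"):
        return "debug run failed"
    info = result["issue_info"]
    checks = {
        key: value
        for key, value in result["prometheus_result"].items()
        if key.startswith("passed_")
    }
    passed = sum(1 for value in checks.values() if value)
    return f"{info['repo']}#{info['number']}: {passed}/{len(checks)} validations passed"
```

For example, feeding the sample output above yields `owner/repo#123: 3/4 validations passed`.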

# Multi-Agent Architecture
Prometheus uses a multi-agent system powered by LangGraph to intelligently process and resolve GitHub issues. Each agent is specialized for a specific task in the issue resolution pipeline.
## Agent Overview
### 1. Issue Classification Agent
**Purpose**: Automatically classifies GitHub issues into categories (bug, question, feature, documentation).
**Location**: `prometheus/lang_graph/subgraphs/issue_classification_subgraph.py`
**Workflow**:
- Retrieves relevant code context from the knowledge graph
- Uses LLM to analyze issue content and classify type
- Returns issue type for routing to appropriate handler
**When Used**: When `issue_type == "auto"` in the issue request
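A minimal sketch of that gate, with a keyword stub standing in for the LLM classifier (the heuristics are illustrative assumptions, not the agent's actual logic):

```python
VALID_TYPES = {"bug", "question", "feature", "documentation"}

def classify(issue_text: str) -> str:
    """Stand-in for the LLM classifier; the real agent calls the model
    with knowledge-graph context instead of these keyword heuristics."""
    lowered = issue_text.lower()
    if "how do i" in lowered or lowered.endswith("?"):
        return "question"
    if "feature" in lowered or "add support" in lowered:
        return "feature"
    return "bug"

def resolve_issue_type(requested: str, issue_text: str) -> str:
    """Run the classifier only when the caller asked for automatic typing."""
    if requested == "auto":
        return classify(issue_text)
    if requested not in VALID_TYPES:
        raise ValueError(f"unknown issue type: {requested}")
    return requested
```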
---
### 2. Environment Build Agent
**Status**: In Progress
**Purpose**: Automatically sets up and configures the development environment for testing and building.
**Planned Features**:
- Auto-detect project type (Python, Node.js, Java, etc.)
- Install dependencies
- Configure build tools
- Validate environment setup
---
### 3. Bug Reproduction Agent
**Purpose**: Attempts to reproduce reported bugs by writing and executing reproduction tests.
**Location**: `prometheus/lang_graph/subgraphs/bug_reproduction_subgraph.py`
**Workflow**:
1. Retrieves bug-related code context from knowledge graph
2. Generates reproduction test code using LLM
3. Edits necessary files to create the test
4. Executes the test in a Docker container
5. Evaluates whether the bug was successfully reproduced
6. Retries with feedback if reproduction fails
**Output**:
- `reproduced_bug`: Boolean indicating success
- `reproduced_bug_file`: Path to reproduction test
- `reproduced_bug_commands`: Commands to reproduce
- `reproduced_bug_patch`: Git patch with changes
**Key Features**:
- Iterative refinement with retry loops
- Docker-isolated execution
- Feedback-driven improvement
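The retry loop above can be condensed as follows. All callables are stand-ins with invented names; the real agent edits files and invokes Docker through LangGraph nodes:

```python
def reproduce_bug(write_test, run_in_container, judge, max_attempts=3):
    """Feedback-driven reproduction: regenerate the test until the bug shows."""
    feedback = None
    for _ in range(max_attempts):
        test_file = write_test(feedback)      # LLM writes or repairs the test
        output = run_in_container(test_file)  # isolated Docker execution
        reproduced, feedback = judge(output)  # evaluate: did the bug appear?
        if reproduced:
            return True, test_file
    return False, None
```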
---
### 4. Context Retrieval Agent
**Purpose**: Retrieves relevant code and documentation context from the Neo4j knowledge graph.
**Location**: `prometheus/lang_graph/subgraphs/context_retrieval_subgraph.py`
**Workflow**:
1. Converts natural language query to knowledge graph query
2. Uses LLM with graph traversal tools to find relevant context
3. Selects and extracts useful code snippets
4. Optionally refines query and retries if context is insufficient
5. Returns structured context (code, AST nodes, documentation)
**Key Features**:
- Iterative query refinement (2-4 loops)
- Tool-augmented LLM with Neo4j access
- Traverses file hierarchy, AST structure, and text chunks
**Used By**: All other agents for context gathering
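Steps 1-4 amount to a retrieve-and-refine loop. A minimal sketch under assumed names (the real subgraph drives LLM tool calls against Neo4j rather than plain callables):

```python
def retrieve_context(query, search_fn, refine_fn, is_sufficient, max_loops=4):
    """Search, then rewrite the query from partial results until context suffices."""
    context = []
    for _ in range(max_loops):
        context = search_fn(query)          # knowledge-graph lookup
        if is_sufficient(context):
            break
        query = refine_fn(query, context)   # refine using what was found so far
    return context
```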
---
### 5. Issue Resolution Agent
**Purpose**: Generates and validates bug fix patches for verified bugs.
**Location**: `prometheus/lang_graph/subgraphs/issue_verified_bug_subgraph.py`
**Workflow**:
1. Retrieves fix-relevant code context
2. Analyzes bug root cause using LLM
3. Generates code patch to fix the bug
4. Applies patch and creates git diff
5. Validates patch against:
- Reproduction test (must pass)
- Regression tests (optional)
- Existing test suite (optional)
6. Generates multiple candidate patches
7. Selects best patch based on test results
8. Retries with error feedback if tests fail
**Output**:
- `edit_patch`: Final selected fix patch
- Test pass/fail results
**Key Features**:
- Multi-candidate patch generation
- Multi-level validation (reproduction, regression, existing tests)
- Feedback-driven iteration
- Best patch selection using LLM
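Steps 5-7 reduce to ranking candidates by the validation gates they pass. A deliberately simplified sketch (the actual selection in Prometheus is LLM-assisted; the tuple shape here is an assumption):

```python
def select_best_patch(candidates):
    """Pick the patch passing the most gates; the reproduction test is mandatory.

    Each candidate is (patch_text, results), where results maps a test name
    ("reproduction", "regression", "existing") to a pass/fail boolean.
    """
    viable = [c for c in candidates if c[1].get("reproduction", False)]
    if not viable:
        return None  # no candidate even fixes the reproduced bug
    return max(viable, key=lambda c: sum(c[1].values()))[0]
```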
---
## Agent Coordination
### Main Issue Processing Flow
```
User Issue -> Issue Classification Agent
                    |
          [Route by issue type]
                    |
             +------+------+
             |             |
            BUG         QUESTION
             |             |
             v             v
       Bug Pipeline    Question Pipeline
```
### Bug Resolution Pipeline
```
Bug Issue -> Context Retrieval Agent (select regression tests)
          -> Bug Reproduction Agent (verify bug exists)
          -> [If reproduced]     -> Issue Resolution Agent (generate fix)
          -> [If not reproduced] -> Direct resolution without reproduction
          -> Response Generation
```
### Question Answering Pipeline
```
Question -> Context Retrieval Agent (gather relevant code/docs)
         -> Question Analysis Agent (LLM with tools)
         -> Response Generation
```
---
## Agent Communication
Agents communicate through **shared state** managed by LangGraph:
- Each subgraph has a typed state dictionary
- State flows through nodes and is updated progressively
- Parent states are inherited by child subgraphs
- Results are passed back through state returns
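Concretely, a LangGraph state is a typed dictionary and each node returns a partial update that the framework merges in. A standard-library illustration (field names are invented for this sketch; see the subgraph modules for the real schemas):

```python
from typing import Optional, TypedDict

class BugReproductionState(TypedDict, total=False):
    issue_title: str
    issue_body: str
    reproduced_bug: bool
    reproduced_bug_file: Optional[str]

def reproduction_node(state: BugReproductionState) -> BugReproductionState:
    """A node reads the shared state and returns only the keys it changes."""
    return {"reproduced_bug": True, "reproduced_bug_file": "tests/test_repro.py"}

state: BugReproductionState = {"issue_title": "Crash on login", "issue_body": "..."}
state.update(reproduction_node(state))  # LangGraph performs this merge for you
```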
---
## Technology Stack
- **LangGraph**: State machine orchestration
- **LangChain**: LLM integration and tool calling
- **Neo4j**: Knowledge graph storage and retrieval
- **Docker**: Isolated test execution environment
- **Tree-sitter**: Code parsing and AST generation
- **Git**: Patch management and version control
---
## Future Enhancements
- **Environment Build Agent**: Complete implementation for automatic setup
- **Pull Request Review Agent**: Automated code review
- **Feature Implementation Agent**: Handle feature requests
- **Documentation Generation Agent**: Auto-generate docs from code



@@ -0,0 +1 @@
<?xml version="1.0" encoding="UTF-8"?><svg id="_图层_2" xmlns="http://www.w3.org/2000/svg" viewBox="0 0 734.6 182.7"><defs><style>.cls-1,.cls-2{fill:#000;stroke-width:0px;}.cls-2{fill-rule:evenodd;}</style></defs><g id="_图层_1-2"><path class="cls-2" d="M21.9,0l59.2,27v128.8l-59.2,27L0,169.8V13L21.9,0ZM18.8,9.3l-8,4.7,8,2.9v-7.6ZM18.8,23.7l-12.1-4.4v144.1l12.1-4.4V23.7ZM25.5,156.6V26.1l30,10.8v108.8l-30,10.8ZM18.8,165.8l-8,2.9,8,4.7v-7.6ZM25.5,174.1v-10.7l33.4-12.1,10,3.1-43.4,19.8h0ZM74.5,149.5l-12.4-3.8V37.2l12.4-3.8v116.1ZM68.9,28.4l-10,3-33.4-12.1v-10.7l43.3,19.7h0Z"/><path class="cls-1" d="M204.8,65.4c-4.3-7.5-10.4-13.3-18.3-17.3-7.9-4-17.2-6-27.6-6h-32.1v98.6h32.1c10.5,0,19.8-2,27.6-5.9,7.8-3.9,14-9.6,18.3-16.9,4.3-7.3,6.5-16.1,6.5-26.1s-2.2-18.8-6.5-26.3ZM185.9,118.7c-6.3,6.3-15.4,9.6-27,9.6h-16.6V54.6h16.6c11.7,0,20.8,3.3,27.1,9.9,6.2,6.5,9.4,15.7,9.4,27.3s-3.2,20.6-9.4,26.9Z"/><path class="cls-1" d="M279.3,66.4c-5.7-3.2-12.3-4.8-19.6-4.8s-14.4,1.7-20.3,4.9c-5.8,3.3-10.4,8-13.7,14-3.2,6-4.9,13.2-4.9,21.1s1.7,15.1,5,21.1c3.3,6,8,10.8,13.8,14.1,5.8,3.3,12.6,5,20,5s16.8-2.3,22.9-6.8c5.9-4.4,10.1-10.1,12.5-16.9h-16.5c-3.7,7.3-10.1,11-18.9,11s-11.5-2-15.8-5.9c-4.3-3.9-6.7-9.1-7.2-15.5v-.6s.5,0,.5,0h59.8c.3-2.2.5-4.8.5-7.5,0-7.5-1.6-14.2-4.8-20-3.2-5.7-7.7-10.3-13.4-13.5ZM281,95.3h-44.2v-.6c.8-6.2,3.3-11.2,7.4-14.8,4.1-3.6,9.1-5.5,14.8-5.5s11.7,1.9,15.9,5.6c4.3,3.7,6.5,8.7,6.6,14.8v.5s-.5,0-.5,0Z"/><rect class="cls-1" x="309.2" y="27.7" width="15.3" height="113"/><polygon class="cls-1" points="375.1 122.7 353 62.9 336 62.9 366.8 139.7 366.8 139.9 366.8 140.1 351 177.7 366.8 177.7 414.7 62.9 399 62.9 376.1 122.7 375.6 124 375.1 122.7"/><path class="cls-1" 
d="M468,100.6c-3.6-1.6-8.3-3.1-14-4.6-4.3-1.2-7.6-2.3-9.8-3.1-2.2-.8-4.2-2-5.8-3.5-1.6-1.5-2.5-3.4-2.5-5.7s1.2-5.1,3.7-6.9c2.4-1.7,5.8-2.5,10.2-2.5s8.1,1.1,10.8,3.3c2.6,2.1,4.1,4.9,4.4,8.3h15.3c-.5-7.4-3.4-13.3-8.6-17.6-5.4-4.4-12.5-6.7-21.3-6.7s-11.2,1-15.7,3c-4.5,2-8,4.7-10.4,8-2.4,3.4-3.6,7.1-3.6,11.2s1.3,9.1,3.9,12.1c2.7,3.1,5.8,5.4,9.5,6.9,3.7,1.5,8.5,3.1,14.4,4.7,6.1,1.7,10.6,3.3,13.5,4.8,3,1.5,4.5,3.9,4.5,7.1s-1.4,5.4-4,7.2c-2.6,1.8-6.3,2.7-11,2.7s-8.3-1.2-11.3-3.5c-2.9-2.2-4.5-5-4.9-8.3h-15.9c.3,4.4,1.8,8.5,4.4,12.1,2.8,3.8,6.7,6.8,11.5,9,4.8,2.2,10.4,3.3,16.4,3.3s11.3-1,15.7-3c4.4-2,7.9-4.7,10.3-8.1,2.4-3.4,3.6-7.4,3.6-11.7,0-4.9-1.4-8.9-4-11.8-2.6-3-5.7-5.3-9.3-6.8Z"/><path class="cls-1" d="M501.8,27.7c-2.8,0-5.2,1-7.1,2.8-1.9,1.9-2.8,4.3-2.8,7.1s1,5.2,2.8,7.1c1.9,1.9,4.3,2.8,7.1,2.8s5-1,6.9-2.8c1.9-1.9,2.8-4.3,2.8-7.1s-1-5.2-2.8-7.1c-1.9-1.9-4.2-2.8-6.9-2.8Z"/><rect class="cls-1" x="494.1" y="62.9" width="15.3" height="77.8"/><path class="cls-1" d="M543.6,122.9c-3.6-3.8-5.4-9.4-5.4-16.6v-43.4h-15.1v45.8c0,7,1.4,13.1,4.2,18.1,2.7,4.9,6.6,8.7,11.4,11.2,4.8,2.5,10.3,3.8,16.4,3.8s9-.9,13-2.7c4.1-1.8,7.4-4.3,9.9-7.5l.9-1.2v10.3h15.3V62.9h-15.3v43.4c0,7.2-1.9,12.7-5.5,16.6-3.7,3.9-8.7,5.8-14.9,5.8s-11.2-2-14.8-5.8Z"/><path class="cls-1" d="M719.1,65.4c-4.8-2.5-10.3-3.8-16.4-3.8s-11.1,1.4-16.1,4.2c-4.9,2.7-8.6,6.5-10.9,11.1l-.5.9-.5-.9c-2.6-4.9-6.4-8.8-11.2-11.4-4.9-2.6-10.5-3.9-16.6-3.9s-8.9.9-12.9,2.7c-4,1.8-7.4,4.3-10,7.4l-.9,1.1v-10h-15.3v77.8h15.3v-43.5c0-7.2,1.9-12.7,5.5-16.6,3.7-3.9,8.7-5.8,14.9-5.8s11.2,2,14.8,5.8c3.6,3.9,5.4,9.4,5.4,16.6v43.5h15.1v-43.5c0-7.2,1.9-12.7,5.5-16.6,3.7-3.9,8.7-5.8,14.9-5.8s11.2,2,14.8,5.8c3.6,3.9,5.4,9.4,5.4,16.6v43.5h15.1v-46c0-7-1.4-13.1-4.2-18.1-2.8-4.9-6.6-8.7-11.4-11.2Z"/></g></svg>


prometheus/docs/static/images/icon.jpg vendored Normal file (binary, 10 KiB)

prometheus/example.env Normal file

@@ -0,0 +1,47 @@
# Logging
PROMETHEUS_LOGGING_LEVEL=DEBUG
# General settings
PROMETHEUS_ENVIRONMENT=local
PROMETHEUS_BACKEND_CORS_ORIGINS=["*"]
PROMETHEUS_ENABLE_AUTHENTICATION=false
# Neo4j settings
PROMETHEUS_NEO4J_URI=bolt://neo4j:7687
PROMETHEUS_NEO4J_USERNAME=neo4j
PROMETHEUS_NEO4J_PASSWORD=password
PROMETHEUS_NEO4J_BATCH_SIZE=1000
# Knowledge Graph settings
PROMETHEUS_WORKING_DIRECTORY=working_dir/
PROMETHEUS_KNOWLEDGE_GRAPH_MAX_AST_DEPTH=1
PROMETHEUS_KNOWLEDGE_GRAPH_CHUNK_SIZE=5000
PROMETHEUS_KNOWLEDGE_GRAPH_CHUNK_OVERLAP=500
# LLM model settings
PROMETHEUS_ADVANCED_MODEL=gpt-4o
PROMETHEUS_BASE_MODEL=gpt-4o
# API keys for various LLM providers
PROMETHEUS_ANTHROPIC_API_KEY=anthropic_api_key
PROMETHEUS_GEMINI_API_KEY=gemini_api_key
PROMETHEUS_OPENAI_FORMAT_BASE_URL=https://api.openai.com/v1
PROMETHEUS_OPENAI_FORMAT_API_KEY=your_api_key
# Model settings
PROMETHEUS_ADVANCED_MODEL_TEMPERATURE=0.5
PROMETHEUS_BASE_MODEL_TEMPERATURE=0.5
# Tavily API settings (You can sign up for a free API key at https://www.tavily.com/)
PROMETHEUS_TAVILY_API_KEY=your_tavily_api_key
# Database settings
PROMETHEUS_DATABASE_URL=postgresql+asyncpg://postgres:password@postgres:5432/postgres
# JWT settings (required only if authentication is enabled)
PROMETHEUS_JWT_SECRET_KEY=your_jwt_secret_key
# Athena memory service settings (Athena memory is private for now; community users can ignore or comment out the line below)
PROMETHEUS_ATHENA_BASE_URL=http://localhost:9003/v0.1.0
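All of these variables share the `PROMETHEUS_` prefix, which is how the settings module picks them up. A stand-in sketch of prefix-based loading using only the standard library (the project's actual `settings` object, imported by the routes below, is richer: typed fields, defaults, and parsing):

```python
import os

def load_prometheus_settings(environ=None) -> dict:
    """Collect PROMETHEUS_*-prefixed variables into a plain settings dict."""
    environ = os.environ if environ is None else environ
    prefix = "PROMETHEUS_"
    return {key[len(prefix):]: value
            for key, value in environ.items()
            if key.startswith(prefix)}

cfg = load_prometheus_settings({
    "PROMETHEUS_NEO4J_URI": "bolt://neo4j:7687",
    "HOME": "/root",  # unrelated variables are ignored
})
```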

prometheus/install.sh Executable file

@@ -0,0 +1,69 @@
#!/bin/bash
# Prometheus Installation Script for Claude Code
# Multi-agent AI code analysis and issue resolution system
set -euo pipefail
PROMETHEUS_DIR="${HOME}/.claude/prometheus"
VENV_DIR="${PROMETHEUS_DIR}/venv"
LOG_DIR="${PROMETHEUS_DIR}/logs"
mkdir -p "$LOG_DIR"
log() {
echo "[$(date -u +"%Y-%m-%d %H:%M:%S UTC")] [prometheus-install] $*" | tee -a "${LOG_DIR}/install.log"
}
log "Starting Prometheus installation..."
# Check Python version
if ! command -v python3 >/dev/null 2>&1; then
log "ERROR: Python 3 not found"
exit 1
fi
# Create virtual environment
log "Creating virtual environment at ${VENV_DIR}..."
cd "${PROMETHEUS_DIR}"
python3 -m venv "$VENV_DIR"
# Activate and install dependencies
log "Installing Python dependencies..."
source "${VENV_DIR}/bin/activate"
pip install --upgrade pip setuptools wheel
# Install core dependencies
log "Installing core dependencies..."
pip install \
langgraph \
langchain \
langchain-community \
langchain-openai \
neo4j \
docker \
pytest \
pydantic \
python-dotenv \
rich \
typer \
uvicorn \
fastapi \
httpx \
aiostream
log "Prometheus installation complete!"
log "Activate with: source ${VENV_DIR}/bin/activate"
log "Python path: ${VENV_DIR}/bin/python"
# Save paths for later use
cat > "${PROMETHEUS_DIR}/paths.sh" << 'PATHS_EOF'
#!/bin/bash
export PROMETHEUS_DIR="${HOME}/.claude/prometheus"
export PROMETHEUS_VENV="${PROMETHEUS_DIR}/venv"
export PROMETHEUS_PYTHON="${PROMETHEUS_VENV}/bin/python"
export PROMETHEUS_LOGS="${PROMETHEUS_DIR}/logs"
PATHS_EOF
chmod +x "${PROMETHEUS_DIR}/paths.sh"
log "Installation successful!"

prometheus/integrate-all.sh Executable file

@@ -0,0 +1,443 @@
#!/bin/bash
# Prometheus Deep Integration - All Agents, Tools, and Skills
# Integrates every Prometheus component into Claude Code CLI
set -euo pipefail
PROMETHEUS_DIR="${HOME}/.claude/prometheus"
COMMANDS_DIR="${HOME}/.claude/commands"
SKILLS_DIR="${HOME}/.claude/skills"
TOOLS_DIR="${HOME}/.claude/tools"
INTEGRATION_DIR="${PROMETHEUS_DIR}/integration"
mkdir -p "$INTEGRATION_DIR" "$TOOLS_DIR" "$SKILLS_DIR/prometheus-agents" "$SKILLS_DIR/prometheus-tools"
log() {
echo "[$(date -u +"%Y-%m-%d %H:%M:%S UTC")] $*"
}
log "Starting deep Prometheus integration..."
# 1. Create individual agent commands
log "Creating agent commands..."
# Issue Classifier Agent
cat > "${COMMANDS_DIR}/prometheus-classify.md" << 'AGENT1'
---
description: "Classify incoming issues as bug, feature, question, or documentation using Prometheus classifier agent"
---
Invoke the Prometheus Issue Classifier to categorize the user's input.
**Task:**
{{USER_MESSAGE}}
**Instructions:**
1. Use ~/.claude/hooks/prometheus-wrapper.sh with --issue-classify mode
2. Analyze the issue type (bug/feature/question/documentation)
3. Provide classification with confidence score
4. Return routing recommendation
AGENT1
# Bug Analyzer Agent
cat > "${COMMANDS_DIR}/prometheus-bug.md" << 'AGENT2'
---
description: "Analyze and reproduce bugs using Prometheus bug analyzer agent with context retrieval and reproduction steps"
---
Invoke the Prometheus Bug Analyzer to investigate and reproduce the reported bug.
**Task:**
{{USER_MESSAGE}}
**Instructions:**
1. Use ~/.claude/hooks/prometheus-wrapper.sh with --bug mode
2. Retrieve relevant code context
3. Create reproduction steps
4. Generate bug analysis report
AGENT2
# Feature Analyzer Agent
cat > "${COMMANDS_DIR}/prometheus-feature.md" << 'AGENT3'
---
description: "Analyze feature requests and create implementation plans using Prometheus feature analyzer agent"
---
Invoke the Prometheus Feature Analyzer to analyze and plan the feature implementation.
**Task:**
{{USER_MESSAGE}}
**Instructions:**
1. Use ~/.claude/hooks/prometheus-wrapper.sh with --feature mode
2. Analyze requirements and dependencies
3. Create implementation plan
4. Identify potential risks and alternatives
AGENT3
# Context Provider Agent
cat > "${COMMANDS_DIR}/prometheus-context.md" << 'AGENT4'
---
description: "Retrieve intelligent code context using Prometheus knowledge graph and semantic search"
---
Invoke the Prometheus Context Provider to retrieve relevant code context.
**Query:**
{{USER_MESSAGE}}
**Instructions:**
1. Use ~/.claude/hooks/prometheus-wrapper.sh with --context mode
2. Search knowledge graph for relevant code
3. Retrieve semantic context
4. Provide structured context with file references
AGENT4
# Edit Generator Agent
cat > "${COMMANDS_DIR}/prometheus-edit.md" << 'AGENT5'
---
description: "Generate code edits and patches using Prometheus edit generator with validation"
---
Invoke the Prometheus Edit Generator to create code changes.
**Task:**
{{USER_MESSAGE}}
**Instructions:**
1. Use ~/.claude/hooks/prometheus-wrapper.sh with --edit mode
2. Generate patch with context awareness
3. Validate edit safety
4. Provide diff and application instructions
AGENT5
# Test Runner Agent
cat > "${COMMANDS_DIR}/prometheus-test.md" << 'AGENT6'
---
description: "Run tests in containerized environment using Prometheus test runner agent"
---
Invoke the Prometheus Test Runner to execute and validate tests.
**Task:**
{{USER_MESSAGE}}
**Instructions:**
1. Use ~/.claude/hooks/prometheus-wrapper.sh with --test mode
2. Set up containerized test environment
3. Execute test suite
4. Provide detailed results and coverage
AGENT6
log "Created 6 core agent commands"
# 2. Create tool wrappers
log "Creating tool wrappers..."
# File Operations Tool
cat > "${SKILLS_DIR}/prometheus-tools/file-ops.md" << 'TOOL1'
---
name: prometheus-file-ops
description: "Advanced file operations with AST parsing and semantic understanding"
---
# Prometheus File Operations Tool
Provides advanced file operations beyond basic read/write:
- AST-based code parsing
- Semantic code understanding
- Multi-language support
- Safe refactoring operations
**Usage:**
- "Parse the AST of src/main.py"
- "Find all function calls to authenticate()"
- "Refactor the User class across all files"
- "Extract the authentication logic"
TOOL1
# Graph Traversal Tool
cat > "${SKILLS_DIR}/prometheus-tools/graph-traverse.md" << 'TOOL2'
---
name: prometheus-graph-traverse
description: "Navigate codebase knowledge graph for semantic code relationships"
---
# Prometheus Graph Traversal Tool
Navigate the unified knowledge graph to understand code relationships:
- Find dependencies between components
- Trace data flow through the system
- Identify impact areas for changes
- Discover related code patterns
**Usage:**
- "Trace the data flow from API to database"
- "Find all functions that depend on User.authenticate()"
- "Show the call graph for payment processing"
- "What files would be affected by changing User model?"
TOOL2
# Container Command Tool
cat > "${SKILLS_DIR}/prometheus-tools/container.md" << 'TOOL3'
---
name: prometheus-container
description: "Execute commands in Docker container for isolated testing"
---
# Prometheus Container Tool
Run commands in isolated Docker containers:
- Safe code execution
- Environment isolation
- Dependency management
- Reproducible testing
**Usage:**
- "Test this code in a container"
- "Run the test suite in Docker"
- "Build the project in isolated environment"
- "Execute the reproduction steps safely"
TOOL3
# Web Search Tool
cat > "${SKILLS_DIR}/prometheus-tools/web-search.md" << 'TOOL4'
---
name: prometheus-web-search
description: "Search web for documentation, similar issues, and solutions"
---
# Prometheus Web Search Tool
Intelligent web search for code-related queries:
- Documentation lookup
- Similar issue search
- Solution discovery
- Best practices research
**Usage:**
- "Search for solutions to authentication timeout issues"
- "Find documentation for Django REST framework"
- "Look up similar bugs in open source projects"
- "Research best practices for API design"
TOOL4
log "Created 4 tool wrappers"
# 3. Create master integration skill
cat > "${SKILLS_DIR}/prometheus-master.md" << 'MASTER'
---
name: prometheus-master
description: "Master Prometheus integration - orchestrates all Prometheus agents and tools based on task requirements"
---
# Prometheus Master Integration
This skill automatically selects and orchestrates the appropriate Prometheus agents and tools based on your task.
## Available Capabilities
### Agents (via /prometheus-*)
- `/prometheus-classify` - Classify issues (bug/feature/question/doc)
- `/prometheus-bug` - Analyze and reproduce bugs
- `/prometheus-feature` - Plan feature implementations
- `/prometheus-context` - Retrieve intelligent code context
- `/prometheus-edit` - Generate validated code edits
- `/prometheus-test` - Run containerized tests
### Tools
- **File Operations** - AST-based code parsing and refactoring
- **Graph Traversal** - Navigate knowledge graph
- **Container** - Isolated Docker execution
- **Web Search** - Intelligent documentation lookup
## Automatic Selection
This skill automatically:
1. Analyzes your task type
2. Selects appropriate Prometheus agent(s)
3. Orchestrates tool usage
4. Provides unified results
## Usage Examples
```
"Fix the authentication bug" → Uses bug analyzer + context provider + edit generator
"Add rate limiting" → Uses feature analyzer + context provider + edit generator
"Explain the payment flow" → Uses context provider + graph traversal
"Test the login system" → Uses test runner + container tool
```
Simply describe your task and this skill will orchestrate the appropriate Prometheus components.
MASTER
log "Created master integration skill"
# 4. Update prometheus wrapper with modes
log "Updating Prometheus wrapper..."
cat > "${HOME}/.claude/hooks/prometheus-wrapper.sh" << 'WRAPPER'
#!/bin/bash
# Prometheus Wrapper - All modes and agents
set -euo pipefail
PROMETHEUS_DIR="${HOME}/.claude/prometheus"
VENV_DIR="${PROMETHEUS_DIR}/venv"
LOG_DIR="${PROMETHEUS_DIR}/logs"
mkdir -p "$LOG_DIR"
log_prometheus() {
echo "[$(date -u +"%Y-%m-%d %H:%M:%S UTC")] [prometheus] $*" | tee -a "${LOG_DIR}/wrapper.log"
}
check_installed() {
if [[ ! -d "$VENV_DIR" ]]; then
echo "Prometheus not installed. Run: bash ${PROMETHEUS_DIR}/install.sh"
return 1
fi
return 0
}
execute_prometheus() {
local task="$1"
local mode="${2:-auto}"
local repo="${3:-.}"
log_prometheus "Mode: $mode, Repo: $repo"
log_prometheus "Task: $task"
if ! check_installed; then
echo "ERROR: Prometheus not installed. Run install script first." >&2
return 1
fi
source "${VENV_DIR}/bin/activate"
cd "$repo"
# Simulate Prometheus execution (would call actual Python code)
case "$mode" in
classify|bug|feature|context|edit|test)
log_prometheus "Running in $mode mode"
echo "Prometheus [$mode mode]: Analyzing task..."
echo "Task: $task"
echo ""
echo "Note: Full Prometheus execution requires:"
echo " 1. Run: bash ~/.claude/prometheus/install.sh"
echo " 2. Configure: API keys and Neo4j (optional)"
echo " 3. Dependencies: LangGraph, Docker, Neo4j"
;;
*)
log_prometheus "Running in auto mode"
echo "Prometheus [auto]: Detecting task type..."
echo "Task: $task"
;;
esac
}
main() {
local task=""
local mode="auto"
local repo="."
while [[ $# -gt 0 ]]; do
case "$1" in
--classify|--bug|--feature|--context|--edit|--test)
mode="${1#--}"
shift
;;
--repo|-r)
repo="$2"
shift 2
;;
*)
task="$1"
shift
;;
esac
done
[[ -z "$task" ]] && task=$(cat)
[[ -z "$task" ]] && echo "Usage: $0 [--classify|--bug|--feature|--context|--edit|--test] [--repo <path>] <task>" && exit 1
execute_prometheus "$task" "$mode" "$repo"
}
main "$@"
WRAPPER
chmod +x "${HOME}/.claude/hooks/prometheus-wrapper.sh"
log "Updated Prometheus wrapper"
# 5. Create summary
cat > "${INTEGRATION_DIR}/SUMMARY.md" << 'SUMMARY'
# Prometheus Deep Integration Complete
## Integrated Components
### 6 Agent Commands
- `/prometheus-classify` - Issue classification
- `/prometheus-bug` - Bug analysis and reproduction
- `/prometheus-feature` - Feature planning
- `/prometheus-context` - Intelligent context retrieval
- `/prometheus-edit` - Code edit generation
- `/prometheus-test` - Containerized testing
### 4 Tool Skills
- **File Operations** - AST-based code operations
- **Graph Traversal** - Knowledge graph navigation
- **Container** - Docker execution
- **Web Search** - Documentation lookup
### 1 Master Skill
- `prometheus-master` - Automatic orchestration
## Usage
```bash
# Quick commands
/prometheus-classify "User gets error after login"
/prometheus-bug "Fix authentication timeout"
/prometheus-feature "Add OAuth2 support"
/prometheus-context "How does payment work?"
/prometheus-edit "Refactor User class"
/prometheus-test "Run test suite"
# Master skill (automatic selection)
"Use prometheus to fix the login bug"
"Analyze this with prometheus"
```
## Installation
To activate full Prometheus:
```bash
bash ~/.claude/prometheus/install.sh
```
## Files Created
- Commands: ~/.claude/commands/prometheus-*.md (6 files)
- Tools: ~/.claude/skills/prometheus-tools/*.md (4 files)
- Master: ~/.claude/skills/prometheus-master.md
- Wrapper: ~/.claude/hooks/prometheus-wrapper.sh (updated)
SUMMARY
log ""
log "╔════════════════════════════════════════════════════════════╗"
log "║ Prometheus Deep Integration Complete! ║"
log "║ ║"
log "║ Agent Commands: 6 ║"
log "║ Tool Skills: 4 ║"
log "║ Master Orchestrator: 1 ║"
log "║ ║"
log "║ Usage: /prometheus-bug, /prometheus-feature, etc. ║"
log "║ Or just: 'Use prometheus to ...' ║"
log "║ ║"
log "║ Install: bash ~/.claude/prometheus/install.sh ║"
log "╚════════════════════════════════════════════════════════════╝"
log ""
echo "Integration complete! See ${INTEGRATION_DIR}/SUMMARY.md for details."


@@ -0,0 +1,49 @@
# Prometheus Deep Integration Complete
## Integrated Components
### 6 Agent Commands
- `/prometheus-classify` - Issue classification
- `/prometheus-bug` - Bug analysis and reproduction
- `/prometheus-feature` - Feature planning
- `/prometheus-context` - Intelligent context retrieval
- `/prometheus-edit` - Code edit generation
- `/prometheus-test` - Containerized testing
### 4 Tool Skills
- **File Operations** - AST-based code operations
- **Graph Traversal** - Knowledge graph navigation
- **Container** - Docker execution
- **Web Search** - Documentation lookup
### 1 Master Skill
- `prometheus-master` - Automatic orchestration
## Usage
```bash
# Quick commands
/prometheus-classify "User gets error after login"
/prometheus-bug "Fix authentication timeout"
/prometheus-feature "Add OAuth2 support"
/prometheus-context "How does payment work?"
/prometheus-edit "Refactor User class"
/prometheus-test "Run test suite"
# Master skill (automatic selection)
"Use prometheus to fix the login bug"
"Analyze this with prometheus"
```
## Installation
To activate full Prometheus:
```bash
bash ~/.claude/prometheus/install.sh
```
## Files Created
- Commands: ~/.claude/commands/prometheus-*.md (6 files)
- Tools: ~/.claude/skills/prometheus-tools/*.md (4 files)
- Master: ~/.claude/skills/prometheus-master.md
- Wrapper: ~/.claude/hooks/prometheus-wrapper.sh (updated)


@@ -0,0 +1,16 @@
from fastapi import APIRouter
from prometheus.app.api.routes import auth, github, invitation_code, issue, repository, user
from prometheus.configuration.config import settings
api_router = APIRouter()
api_router.include_router(repository.router, prefix="/repository", tags=["repository"])
api_router.include_router(issue.router, prefix="/issue", tags=["issue"])
api_router.include_router(github.router, prefix="/github", tags=["github"])
if settings.ENABLE_AUTHENTICATION:
api_router.include_router(auth.router, prefix="/auth", tags=["auth"])
api_router.include_router(
invitation_code.router, prefix="/invitation-code", tags=["invitation_code"]
)
api_router.include_router(user.router, prefix="/user", tags=["user"])


@@ -0,0 +1,68 @@
from fastapi import APIRouter, Request
from prometheus.app.models.requests.auth import CreateUserRequest, LoginRequest
from prometheus.app.models.response.auth import LoginResponse
from prometheus.app.models.response.response import Response
from prometheus.app.services.invitation_code_service import InvitationCodeService
from prometheus.app.services.user_service import UserService
from prometheus.configuration.config import settings
from prometheus.exceptions.server_exception import ServerException
router = APIRouter()
@router.post(
"/login/",
summary="Login to the system",
description="Login to the system using username, email, and password. Returns an access token.",
response_description="Returns an access token for authenticated requests",
response_model=Response[LoginResponse],
)
async def login(login_request: LoginRequest, request: Request) -> Response[LoginResponse]:
"""
Login to the system using username, email, and password.
Returns an access token for authenticated requests.
"""
user_service: UserService = request.app.state.service["user_service"]
access_token = await user_service.login(
username=login_request.username,
email=login_request.email,
password=login_request.password,
)
return Response(data=LoginResponse(access_token=access_token))
@router.post(
"/register/",
summary="Register a new user",
description="Register a new user with username, email, password and invitation code.",
response_description="Returns a success message upon successful registration",
response_model=Response,
)
async def register(request: Request, create_user_request: CreateUserRequest) -> Response:
"""
Register a new user with username, email, password and invitation code.
Returns a success message upon successful registration.
"""
invitation_code_service: InvitationCodeService = request.app.state.service[
"invitation_code_service"
]
user_service: UserService = request.app.state.service["user_service"]
# Check if the invitation code is valid
if not await invitation_code_service.check_invitation_code(create_user_request.invitation_code):
raise ServerException(code=400, message="Invalid or expired invitation code")
# Create the user
await user_service.create_user(
username=create_user_request.username,
email=create_user_request.email,
password=create_user_request.password,
issue_credit=settings.DEFAULT_USER_ISSUE_CREDIT,
)
# Mark the invitation code as used
await invitation_code_service.mark_code_as_used(create_user_request.invitation_code)
return Response(message="User registered successfully")
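For reference, the JSON bodies these two endpoints expect mirror `CreateUserRequest` and `LoginRequest`. The field names below are read off the handler code above; the values are placeholders:

```python
import json

register_payload = {
    "username": "alice",
    "email": "alice@example.com",
    "password": "s3cret",
    "invitation_code": "INVITE-123",  # placeholder; a real code comes from an admin
}
# Login reuses the credential fields but not the invitation code.
login_payload = {k: register_payload[k] for k in ("username", "email", "password")}

# e.g. httpx.post(f"{base_url}/auth/register/", json=register_payload)
body = json.dumps(login_payload)
```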


@@ -0,0 +1,45 @@
from typing import Dict, Optional
from fastapi import APIRouter
from prometheus.app.models.response.response import Response
from prometheus.exceptions.github_exception import GithubException
from prometheus.exceptions.server_exception import ServerException
from prometheus.utils.github_utils import get_github_issue, is_repository_public
router = APIRouter()
@router.get(
"/issue/",
summary="Get GitHub issue details",
description="Get Github Issue details including title, body, and comments.",
response_description="Returns an object containing issue details",
response_model=Response[Dict],
)
async def get_github_issue_(
repo: str, issue_number: int, github_token: Optional[str] = None
) -> Response[Dict]:
"""
Get GitHub issue details including title, body, and comments.
Args:
repo (str): The GitHub repository in the format "owner/repo".
issue_number (int): The issue number to retrieve.
github_token (Optional[str]): The GitHub token to use. Optional for public repositories.
Returns:
Response[Dict]: A response object containing issue details.
"""
is_repository_public_ = await is_repository_public(repo)
if not is_repository_public_ and not github_token:
raise ServerException(
code=400,
message="The repository is private or does not exist. Please provide a valid GitHub token.",
)
try:
issue_data = await get_github_issue(repo, issue_number, github_token)
except GithubException as e:
raise ServerException(code=400, message=str(e))
return Response(data=issue_data)


@@ -0,0 +1,57 @@
from typing import Sequence
from fastapi import APIRouter, Request
from prometheus.app.decorators.require_login import requireLogin
from prometheus.app.entity.invitation_code import InvitationCode
from prometheus.app.models.response.response import Response
from prometheus.app.services.user_service import UserService
from prometheus.exceptions.server_exception import ServerException
router = APIRouter()
@router.post(
"/create/",
summary="Create a new invitation code",
description="Generates a new invitation code for user registration.",
response_description="Returns the newly created invitation code",
response_model=Response[InvitationCode],
)
@requireLogin
async def create_invitation_code(request: Request) -> Response[InvitationCode]:
"""
Create a new invitation code.
"""
# Check if the user is an admin
user_service: UserService = request.app.state.service["user_service"]
if not await user_service.is_admin(request.state.user_id):
raise ServerException(code=403, message="Only admins can create invitation codes")
# Create a new invitation code
invitation_code_service = request.app.state.service["invitation_code_service"]
invitation_code = await invitation_code_service.create_invitation_code()
return Response(data=invitation_code)
@router.get(
"/list/",
summary="List all invitation codes",
description="Retrieves a list of all invitation codes.",
response_description="Returns a list of invitation codes",
response_model=Response[Sequence[InvitationCode]],
)
@requireLogin
async def list_invitation_codes(request: Request) -> Response[Sequence[InvitationCode]]:
"""
List all invitation codes.
"""
# Check if the user is an admin
user_service: UserService = request.app.state.service["user_service"]
if not await user_service.is_admin(request.state.user_id):
raise ServerException(code=403, message="Only admins can list invitation codes")
# List all invitation codes
invitation_code_service = request.app.state.service["invitation_code_service"]
invitation_codes = await invitation_code_service.list_invitation_codes()
return Response(data=invitation_codes)


@@ -0,0 +1,154 @@
import asyncio
from fastapi import APIRouter, Request
from prometheus.app.decorators.require_login import requireLogin
from prometheus.app.models.requests.issue import IssueRequest
from prometheus.app.models.response.issue import IssueResponse
from prometheus.app.models.response.response import Response
from prometheus.app.services.issue_service import IssueService
from prometheus.app.services.knowledge_graph_service import KnowledgeGraphService
from prometheus.app.services.repository_service import RepositoryService
from prometheus.app.services.user_service import UserService
from prometheus.configuration.config import settings
from prometheus.exceptions.server_exception import ServerException
router = APIRouter()
@router.post(
"/answer/",
summary="Process and generate a response for an issue",
description="Analyzes an issue, generates patches if needed, runs optional builds and tests, and can push changes "
"to a remote branch.",
response_description="Returns the patch, test results, and issue response",
response_model=Response[IssueResponse],
)
@requireLogin
async def answer_issue(issue: IssueRequest, request: Request) -> Response[IssueResponse]:
# Retrieve necessary services from the application state
repository_service: RepositoryService = request.app.state.service["repository_service"]
user_service: UserService = request.app.state.service["user_service"]
issue_service: IssueService = request.app.state.service["issue_service"]
knowledge_graph_service: KnowledgeGraphService = request.app.state.service[
"knowledge_graph_service"
]
# Fetch the repository by ID
repository = await repository_service.get_repository_by_id(issue.repository_id)
# Ensure the repository exists
if not repository:
raise ServerException(code=404, message="Repository not found")
# Ensure the user has access to the repository
if settings.ENABLE_AUTHENTICATION and repository.user_id != request.state.user_id:
raise ServerException(code=403, message="You do not have access to this repository")
# Check issue credit
user_issue_credit = None
if settings.ENABLE_AUTHENTICATION:
user_issue_credit = await user_service.get_issue_credit(request.state.user_id)
if user_issue_credit <= 0:
raise ServerException(
code=403,
message="Insufficient issue credits. Please purchase more to continue.",
)
# Validate Dockerfile and workdir inputs
if issue.dockerfile_content or issue.image_name:
if issue.workdir is None:
raise ServerException(
code=400,
message="workdir must be provided for user defined environment",
)
# Validate build and test commands if required
if issue.run_build and not issue.build_commands:
raise ServerException(
code=400, message="No build commands available, please provide build commands"
)
if issue.run_existing_test and not issue.test_commands:
raise ServerException(
code=400, message="No test commands available, please provide test commands"
)
# Ensure the repository is not currently being used
if repository.is_working:
raise ServerException(
code=400,
message="The repository is currently being used. Please try again later.",
)
# Load the git repository and knowledge graph
git_repository = repository_service.get_repository(repository.playground_path)
knowledge_graph = await knowledge_graph_service.get_knowledge_graph(
repository.kg_root_node_id,
repository.kg_max_ast_depth,
repository.kg_chunk_size,
repository.kg_chunk_overlap,
)
# Update the repository status to working
await repository_service.update_repository_status(repository.id, is_working=True)
# Process the issue in a separate thread to avoid blocking the event loop
(
patch,
passed_reproducing_test,
passed_regression_test,
passed_existing_test,
issue_response,
issue_type,
) = await asyncio.to_thread(
issue_service.answer_issue,
repository=git_repository,
knowledge_graph=knowledge_graph,
repository_id=repository.id,
issue_title=issue.issue_title,
issue_body=issue.issue_body,
issue_comments=issue.issue_comments if issue.issue_comments else [],
issue_type=issue.issue_type,
run_build=issue.run_build,
run_existing_test=issue.run_existing_test,
run_regression_test=issue.run_regression_test,
run_reproduce_test=issue.run_reproduce_test,
number_of_candidate_patch=issue.number_of_candidate_patch,
dockerfile_content=issue.dockerfile_content,
image_name=issue.image_name,
workdir=issue.workdir,
build_commands=issue.build_commands,
test_commands=issue.test_commands,
)
# Update the repository status to not working
await repository_service.update_repository_status(repository.id, is_working=False)
# Check if all outputs are in their initial state, indicating a failure
if (
patch,
passed_reproducing_test,
passed_regression_test,
passed_existing_test,
issue_response,
issue_type,
) == (None, False, False, False, None, None):
raise ServerException(
code=500,
message="Failed to process the issue. Please try again later.",
)
# Deduct issue credit after successful processing
if settings.ENABLE_AUTHENTICATION:
await user_service.update_issue_credit(request.state.user_id, user_issue_credit - 1)
# Return the response
return Response(
data=IssueResponse(
patch=patch,
passed_reproducing_test=passed_reproducing_test,
passed_regression_test=passed_regression_test,
passed_existing_test=passed_existing_test,
issue_response=issue_response,
issue_type=issue_type,
)
)
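The `asyncio.to_thread` call in the handler above is what keeps the server responsive while the synchronous issue pipeline runs. A self-contained sketch of the pattern (the worker function and numbers are illustrative, not the real pipeline):

```python
import asyncio
import time

def blocking_work(n: int) -> int:
    """Stand-in for the synchronous issue-processing pipeline."""
    time.sleep(0.05)  # simulate blocking work
    return n * 2

async def main() -> tuple[int, int]:
    ticks = 0

    async def heartbeat() -> None:
        # Keeps incrementing while the loop is free to run other tasks.
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    hb = asyncio.create_task(heartbeat())
    # Offload the blocking call so the event loop keeps serving other tasks.
    result = await asyncio.to_thread(blocking_work, 21)
    hb.cancel()
    return result, ticks

result, ticks = asyncio.run(main())
print(result, ticks > 0)  # → 42 True
```

Had the handler called the pipeline directly, the heartbeat (and every other request) would have stalled for the duration of the call.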


@@ -0,0 +1,222 @@
from typing import Sequence
import git
from fastapi import APIRouter, Request
from prometheus.app.decorators.require_login import requireLogin
from prometheus.app.models.requests.repository import (
CreateBranchAndPushRequest,
UploadRepositoryRequest,
)
from prometheus.app.models.response.repository import RepositoryResponse
from prometheus.app.models.response.response import Response
from prometheus.app.services.knowledge_graph_service import KnowledgeGraphService
from prometheus.app.services.repository_service import RepositoryService
from prometheus.app.services.user_service import UserService
from prometheus.configuration.config import settings
from prometheus.exceptions.memory_exception import MemoryException
from prometheus.exceptions.server_exception import ServerException
from prometheus.utils.github_utils import is_repository_public
from prometheus.utils.memory_utils import delete_repository_memory
router = APIRouter()
async def get_github_token(request: Request, github_token: str | None = None) -> str | None:
"""Retrieve GitHub token from the request or user profile.
Returns:
str | None: GitHub token if available, None for public repositories
"""
# If the token is provided in the request, use it directly
if github_token:
return github_token
# If the user is authenticated, get the user service and fetch the token
if settings.ENABLE_AUTHENTICATION:
user_service: UserService = request.app.state.service["user_service"]
user = await user_service.get_user_by_id(request.state.user_id)
github_token = user.github_token if user else None
return github_token
@router.post(
"/upload/",
description="""
Upload a GitHub repository to Prometheus, default to the latest commit in the main branch.
""",
response_model=Response,
)
@requireLogin
async def upload_github_repository(
upload_repository_request: UploadRepositoryRequest, request: Request
):
# Get the repository and knowledge graph services
repository_service: RepositoryService = request.app.state.service["repository_service"]
knowledge_graph_service: KnowledgeGraphService = request.app.state.service[
"knowledge_graph_service"
]
# Check if the repository already exists
if settings.ENABLE_AUTHENTICATION:
repository = await repository_service.get_repository_by_url_commit_id_and_user_id(
upload_repository_request.https_url,
upload_repository_request.commit_id,
request.state.user_id,
)
# If the repository already exists, return its ID
if repository:
return Response(
message="Repository already exists", data={"repository_id": repository.id}
)
# Check if the number of repositories exceeds the limit
if settings.ENABLE_AUTHENTICATION:
user_repositories = await repository_service.get_repositories_by_user_id(
request.state.user_id
)
if len(user_repositories) >= settings.DEFAULT_USER_REPOSITORY_LIMIT:
raise ServerException(
code=400,
message=f"You have reached the maximum number of repositories ({settings.DEFAULT_USER_REPOSITORY_LIMIT}). Please delete some repositories before uploading new ones.",
)
# Get the GitHub token (may be None for public repositories)
github_token = await get_github_token(request, upload_repository_request.github_token)
# Check if the repository is public or private
is_repository_public_ = await is_repository_public(upload_repository_request.https_url)
if not is_repository_public_ and not github_token:
raise ServerException(
code=400,
message="This appears to be a private repository. Please provide a GitHub token.",
)
# Clone the repository
try:
saved_path = await repository_service.clone_github_repo(
github_token, upload_repository_request.https_url, upload_repository_request.commit_id
)
except git.exc.GitCommandError:
raise ServerException(
code=400, message=f"Unable to clone {upload_repository_request.https_url}"
)
# Build and save the knowledge graph from the cloned repository
root_node_id = await knowledge_graph_service.build_and_save_knowledge_graph(saved_path)
repository_id = await repository_service.create_new_repository(
url=upload_repository_request.https_url,
commit_id=upload_repository_request.commit_id,
playground_path=str(saved_path),
user_id=request.state.user_id if settings.ENABLE_AUTHENTICATION else None,
kg_root_node_id=root_node_id,
)
return Response(data={"repository_id": repository_id})
@router.post(
"/create-branch-and-push/",
description="""
Create a new branch in the repository, commit changes, and push to remote.
""",
response_model=Response,
)
@requireLogin
async def create_branch_and_push(
create_branch_and_push_request: CreateBranchAndPushRequest, request: Request
):
# Get the repository service
repository_service: RepositoryService = request.app.state.service["repository_service"]
# Get the repository by ID
repository = await repository_service.get_repository_by_id(
create_branch_and_push_request.repository_id
)
if not repository:
raise ServerException(code=404, message="Repository not found")
# Check if the user has permission to modify the repository
if settings.ENABLE_AUTHENTICATION and repository.user_id != request.state.user_id:
raise ServerException(
code=403, message="You do not have permission to modify this repository"
)
# Get the Git Repository
git_repo = repository_service.get_repository(repository.playground_path)
try:
await git_repo.create_and_push_branch(
branch_name=create_branch_and_push_request.branch_name,
commit_message=create_branch_and_push_request.commit_message,
patch=create_branch_and_push_request.patch,
)
except git.exc.GitCommandError:
raise ServerException(code=400, message="Failed to create branch and push changes")
return Response()
@router.get(
"/list/",
description="""
List all repositories uploaded to Prometheus by the authenticated user.
""",
response_model=Response[Sequence[RepositoryResponse]],
)
@requireLogin
async def list_repositories(request: Request):
repository_service: RepositoryService = request.app.state.service["repository_service"]
if settings.ENABLE_AUTHENTICATION:
repositories = await repository_service.get_repositories_by_user_id(request.state.user_id)
else:
repositories = await repository_service.get_all_repositories()
return Response(data=[RepositoryResponse.model_validate(repo) for repo in repositories])
@router.delete(
"/delete/",
description="""
Delete the repository uploaded to Prometheus, along with other information.
""",
response_model=Response,
)
@requireLogin
async def delete(
repository_id: int,
request: Request,
force: bool = False,
):
knowledge_graph_service: KnowledgeGraphService = request.app.state.service[
"knowledge_graph_service"
]
repository_service: RepositoryService = request.app.state.service["repository_service"]
# Get the repository by ID
repository = await repository_service.get_repository_by_id(repository_id)
# Check if the repository exists
if not repository:
raise ServerException(code=404, message="Repository not found")
# Check if the user has permission to delete the repository
if settings.ENABLE_AUTHENTICATION and repository.user_id != request.state.user_id:
raise ServerException(
code=403, message="You do not have permission to delete this repository"
)
# Check if the repository is being processed
if repository.is_working and not force:
raise ServerException(
code=400, message="Repository is currently being processed, please try again later"
)
# Clear the knowledge graph and repository data
await knowledge_graph_service.clear_kg(repository.kg_root_node_id)
repository_service.clean_repository(repository)
# Remove semantic memory associated with the repository
try:
delete_repository_memory(repository.id)
except MemoryException:
pass
# Delete the repository from the database
await repository_service.delete_repository(repository)
return Response()


@@ -0,0 +1,58 @@
from typing import Sequence
from fastapi import APIRouter, Request
from prometheus.app.decorators.require_login import requireLogin
from prometheus.app.entity.user import User
from prometheus.app.models.requests.user import SetGithubTokenRequest
from prometheus.app.models.response.response import Response
from prometheus.app.models.response.user import UserResponse
from prometheus.app.services.user_service import UserService
from prometheus.exceptions.server_exception import ServerException
router = APIRouter()
@router.get(
"/list/",
summary="List all users in the database",
description="Retrieves a list of all users.",
response_description="Returns a list of users",
response_model=Response[Sequence[UserResponse]],
)
@requireLogin
async def list_users(request: Request) -> Response[Sequence[UserResponse]]:
"""
List all users in the database.
"""
# Check if the user is an admin
user_service: UserService = request.app.state.service["user_service"]
if not await user_service.is_admin(request.state.user_id):
raise ServerException(code=403, message="Only admins can list users")
# List all users
users = await user_service.list_users()
return Response(data=[UserResponse.model_validate(user) for user in users])
@router.put(
"/set-github-token/",
summary="Set GitHub token for the user",
description="Sets the GitHub token for the authenticated user.",
response_description="Returns the updated user information",
response_model=Response,
)
@requireLogin
async def set_github_token(
request: Request, set_github_token_request: SetGithubTokenRequest
) -> Response:
"""
Set GitHub token for the user.
"""
user_service: UserService = request.app.state.service["user_service"]
# Update the user's GitHub token
await user_service.set_github_token(
request.state.user_id, set_github_token_request.github_token
)
return Response()


@@ -0,0 +1,20 @@
import inspect
from functools import wraps
def requireLogin(func):
"""
Decorator to indicate that a route requires user authentication.
This decorator can be used to mark routes that should only be accessible to authenticated users.
"""
@wraps(func)
async def wrapper(*args, **kwargs):
if inspect.iscoroutinefunction(func):
return await func(*args, **kwargs)
else:
return func(*args, **kwargs)
# Set a custom attribute to indicate that this route requires login
setattr(wrapper, "_require_login", True)
return wrapper
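The decorator above only marks the endpoint; the `_require_login` attribute is read back later when the protected-route set is built. A minimal stdlib sketch of the same marker pattern (the `whoami` handler is hypothetical):

```python
import asyncio
import inspect
from functools import wraps

def require_login(func):
    """Mark a route handler as requiring authentication."""
    @wraps(func)
    async def wrapper(*args, **kwargs):
        # Support both async and sync handlers transparently.
        if inspect.iscoroutinefunction(func):
            return await func(*args, **kwargs)
        return func(*args, **kwargs)
    # The marker is inspected at startup to build the protected-route set.
    wrapper._require_login = True
    return wrapper

@require_login
def whoami():
    return "user"

print(getattr(whoami, "_require_login", False))  # → True
print(asyncio.run(whoami()))  # → user
```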


@@ -0,0 +1,77 @@
"""Initializes and configures all prometheus services."""
from prometheus.app.services.base_service import BaseService
from prometheus.app.services.database_service import DatabaseService
from prometheus.app.services.invitation_code_service import InvitationCodeService
from prometheus.app.services.issue_service import IssueService
from prometheus.app.services.knowledge_graph_service import KnowledgeGraphService
from prometheus.app.services.llm_service import LLMService
from prometheus.app.services.neo4j_service import Neo4jService
from prometheus.app.services.repository_service import RepositoryService
from prometheus.app.services.user_service import UserService
from prometheus.configuration.config import settings
def initialize_services() -> dict[str, BaseService]:
"""Initializes and configures the complete prometheus service stack.
This function creates and configures all required services for prometheus
operation, using settings from the configuration module. It ensures proper
initialization order and service dependencies.
Note:
This function assumes all required settings are properly configured in
the settings module using Dynaconf. The following settings are required:
- NEO4J_URI, NEO4J_USERNAME, NEO4J_PASSWORD
- DATABASE_URL
- ADVANCED_MODEL, BASE_MODEL, and the corresponding temperatures and API keys
- NEO4J_BATCH_SIZE
- KNOWLEDGE_GRAPH_MAX_AST_DEPTH, KNOWLEDGE_GRAPH_CHUNK_SIZE, KNOWLEDGE_GRAPH_CHUNK_OVERLAP
- WORKING_DIRECTORY
Returns:
A dict mapping service names to their fully configured service instances.
"""
neo4j_service = Neo4jService(
settings.NEO4J_URI, settings.NEO4J_USERNAME, settings.NEO4J_PASSWORD
)
database_service = DatabaseService(settings.DATABASE_URL)
llm_service = LLMService(
settings.ADVANCED_MODEL,
settings.BASE_MODEL,
settings.ADVANCED_MODEL_TEMPERATURE,
settings.BASE_MODEL_TEMPERATURE,
settings.OPENAI_FORMAT_API_KEY,
settings.OPENAI_FORMAT_BASE_URL,
settings.ANTHROPIC_API_KEY,
settings.GEMINI_API_KEY,
)
knowledge_graph_service = KnowledgeGraphService(
neo4j_service,
settings.NEO4J_BATCH_SIZE,
settings.KNOWLEDGE_GRAPH_MAX_AST_DEPTH,
settings.KNOWLEDGE_GRAPH_CHUNK_SIZE,
settings.KNOWLEDGE_GRAPH_CHUNK_OVERLAP,
)
repository_service = RepositoryService(
knowledge_graph_service, database_service, settings.WORKING_DIRECTORY
)
issue_service = IssueService(
llm_service,
settings.WORKING_DIRECTORY,
settings.LOGGING_LEVEL,
)
user_service = UserService(database_service)
invitation_code_service = InvitationCodeService(database_service)
return {
"neo4j_service": neo4j_service,
"llm_service": llm_service,
"knowledge_graph_service": knowledge_graph_service,
"repository_service": repository_service,
"issue_service": issue_service,
"database_service": database_service,
"user_service": user_service,
"invitation_code_service": invitation_code_service,
}


@@ -0,0 +1,21 @@
from datetime import datetime, timedelta, timezone
from sqlalchemy import TIMESTAMP
from sqlmodel import Column, Field, SQLModel
from prometheus.configuration.config import settings
class InvitationCode(SQLModel, table=True):
"""
InvitationCode model for managing invitation codes.
"""
id: int = Field(primary_key=True, description="ID")
code: str = Field(index=True, unique=True, max_length=36, description="Invitation code")
is_used: bool = Field(default=False, description="Whether the invitation code has been used")
expiration_time: datetime = Field(
default_factory=lambda: datetime.now(timezone.utc)
+ timedelta(days=settings.INVITATION_CODE_EXPIRE_TIME),
description="Expiration time of the invitation code",
sa_column=Column(TIMESTAMP(timezone=True)),
)
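One subtlety with datetime defaults: a plain `default=datetime.now(...) + ...` expression is evaluated once when the class body runs, so every row would share the same expiration timestamp, whereas a `default_factory` is re-evaluated per instance. A stdlib illustration of the distinction using dataclasses (no SQLModel involved):

```python
import time
from dataclasses import dataclass, field
from datetime import datetime, timezone

FROZEN = datetime.now(timezone.utc)  # evaluated once, at import time

@dataclass
class WithSnapshot:
    created: datetime = FROZEN  # every instance gets the same value

@dataclass
class WithFactory:
    # The lambda runs on each instantiation, yielding a fresh timestamp.
    created: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

a = WithSnapshot(); time.sleep(0.01); b = WithSnapshot()
c = WithFactory(); time.sleep(0.01); d = WithFactory()

print(a.created == b.created)  # → True  (shared snapshot)
print(c.created == d.created)  # → False (fresh per instance)
```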


@@ -0,0 +1,39 @@
from sqlmodel import Field, SQLModel
class Repository(SQLModel, table=True):
id: int = Field(primary_key=True, description="ID")
url: str = Field(
index=True,
max_length=200,
description="The URL of the repository.",
)
commit_id: str = Field(
index=True,
nullable=True,
min_length=40,
max_length=40,
description="The commit id of the repository.",
)
playground_path: str = Field(
unique=True,
max_length=300,
description="The playground path of the repository where the repository was cloned.",
)
is_working: bool = Field(
default=False,
description="Indicates whether the repository is currently being used for processing or not.",
)
user_id: int = Field(
index=True, nullable=True, description="The ID of the user who uploaded this repository."
)
kg_root_node_id: int = Field(
index=True, unique=True, description="The ID of the root node of the knowledge graph."
)
kg_max_ast_depth: int = Field(description="The maximum AST depth of the knowledge graph.")
kg_chunk_size: int = Field(
description="The size of the chunks used in the knowledge graph.",
)
kg_chunk_overlap: int = Field(
description="The overlap of the chunks used in the knowledge graph."
)


@@ -0,0 +1,23 @@
from sqlmodel import Field, SQLModel
class User(SQLModel, table=True):
id: int = Field(primary_key=True, description="User ID")
username: str = Field(
index=True, unique=True, max_length=20, description="Username of the user"
)
email: str = Field(
index=True, unique=True, max_length=30, description="Email address of the user"
)
password_hash: str = Field(max_length=128, description="Hashed password of the user")
github_token: str = Field(
default=None,
nullable=True,
description="Optional GitHub token for integrations",
max_length=100,
)
issue_credit: int = Field(default=0, ge=0, description="Number of issue credits the user has")
is_superuser: bool = Field(default=False, description="Whether the user is a superuser")


@@ -0,0 +1,17 @@
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from prometheus.exceptions.server_exception import ServerException
def register_exception_handlers(app: FastAPI):
"""Global exception handlers for the FastAPI application."""
@app.exception_handler(ServerException)
async def custom_exception_handler(_request: Request, exc: ServerException):
"""
Custom exception handler for ServerException.
"""
return JSONResponse(
status_code=exc.code, content={"code": exc.code, "message": exc.message, "data": None}
)


@@ -0,0 +1,89 @@
import inspect
from contextlib import asynccontextmanager
from datetime import datetime, timezone
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.routing import APIRoute
from prometheus.app import dependencies
from prometheus.app.api.main import api_router
from prometheus.app.exception_handler import register_exception_handlers
from prometheus.app.middlewares.jwt_middleware import JWTMiddleware
from prometheus.app.register_login_required_routes import (
login_required_routes,
register_login_required_routes,
)
from prometheus.configuration.config import settings
from prometheus.utils.logger_manager import get_logger
# Create the module-level logger with a file handler
logger = get_logger(__name__)
@asynccontextmanager
async def lifespan(app: FastAPI):
# Initialization on startup
app.state.service = dependencies.initialize_services()
logger.info("Starting services...")
for service in app.state.service.values():
# Start each service, handling both async and sync start methods
if inspect.iscoroutinefunction(service.start):
await service.start()
else:
service.start()
# Initialization Completed
yield
# Cleanup on shutdown
logger.info("Shutting down services...")
for service in app.state.service.values():
# Close each service, handling both async and sync close methods
if inspect.iscoroutinefunction(service.close):
await service.close()
else:
service.close()
def custom_generate_unique_id(route: APIRoute) -> str:
"""
Custom function to generate unique IDs for API routes based on their tags and names.
"""
return f"{route.tags[0]}-{route.name}"
app = FastAPI(
lifespan=lifespan,
title=settings.PROJECT_NAME, # Title on generated documentation
openapi_url=f"{settings.BASE_URL}/openapi.json", # Path to generated OpenAPI documentation
generate_unique_id_function=custom_generate_unique_id, # Custom function for generating unique route IDs
version=settings.version, # Version of the API
debug=(settings.ENVIRONMENT == "local"),
)
# Register middlewares
if settings.ENABLE_AUTHENTICATION:
app.add_middleware(
JWTMiddleware,
login_required_routes=login_required_routes,
)
# Add CORS middleware
app.add_middleware(
CORSMiddleware,
allow_origins=settings.BACKEND_CORS_ORIGINS, # Configure appropriately for production
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Include the API router with a prefix
app.include_router(api_router, prefix=settings.BASE_URL)
# Register the exception handlers
register_exception_handlers(app)
# Register the login-required routes
register_login_required_routes(app)
@app.get("/health", tags=["health"])
def health_check():
return {"status": "healthy", "timestamp": datetime.now(timezone.utc).isoformat()}


@@ -0,0 +1,57 @@
from typing import Set, Tuple
from fastapi import FastAPI, Request
from fastapi.security.utils import get_authorization_scheme_param
from starlette.middleware.base import BaseHTTPMiddleware
from starlette.responses import JSONResponse
from prometheus.exceptions.jwt_exception import JWTException
from prometheus.utils.jwt_utils import JWTUtils
class JWTMiddleware(BaseHTTPMiddleware):
def __init__(self, app: FastAPI, login_required_routes: Set[Tuple[str, str]]):
super().__init__(app)
self.jwt_utils = JWTUtils() # Initialize the JWT utility
self.login_required_routes = (
login_required_routes  # Set of (method, path) routes that require JWT validation
)
async def dispatch(self, request: Request, call_next):
# Allow OPTIONS requests to pass through without authentication (for CORS preflight)
if request.method == "OPTIONS":
response = await call_next(request)
return response
# Skip JWT validation for routes that do not require login
path = request.url.path
if (request.method, path) not in self.login_required_routes:
# Proceed to the next middleware or route handler without authentication
response = await call_next(request)
return response
# Retrieve the Authorization header from the request
authorization: str = request.headers.get("Authorization")
# Extract the scheme (e.g., "Bearer") and the token from the header
scheme, token = get_authorization_scheme_param(authorization)
# Check if authorization header is missing or incorrect scheme
if not authorization or scheme.lower() != "bearer":
return JSONResponse(
status_code=401,
content={"code": 401, "message": "Valid JWT Token is missing", "data": None},
)
try:
# Attempt to decode and validate the JWT token
payload = self.jwt_utils.decode_token(token)
except JWTException as e:
# If token validation fails, return an error response with details
return JSONResponse(
status_code=e.code,
content={"code": e.code, "message": e.message, "data": None},
)
request.state.user_id = payload.get("user_id", None)
# Proceed to the next middleware or route handler if validation succeeds
response = await call_next(request)
return response


@@ -0,0 +1,64 @@
import re
from pydantic import BaseModel, Field, field_validator, model_validator
class LoginRequest(BaseModel):
username: str = Field(description="username of the user", max_length=20)
email: str = Field(
description="email of the user",
examples=["your_email@gmail.com"],
max_length=30,
)
password: str = Field(
description="password of the user",
examples=["P@ssw0rd!"],
min_length=8,
max_length=30,
)
@field_validator("email", mode="after")
def validate_email_format(cls, v: str) -> str:
# Allow empty email
if not v:
return v
# Simple regex for email validation
pattern = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"
if not re.match(pattern, v):
raise ValueError("Invalid email format")
return v
@model_validator(mode="after")
def check_username_or_email(self) -> "LoginRequest":
if not self.username and not self.email:
raise ValueError("At least one of 'username' or 'email' must be provided.")
return self
class CreateUserRequest(BaseModel):
username: str = Field(description="username of the user", max_length=20)
email: str = Field(
description="email of the user",
examples=["your_email@gmail.com"],
max_length=30,
)
password: str = Field(
description="password of the user",
examples=["P@ssw0rd!"],
min_length=8,
max_length=30,
)
invitation_code: str = Field(
description="invitation code for registration",
examples=["abcd-efgh-ijkl-mnop"],
max_length=36,
min_length=36,
)
@field_validator("email", mode="after")
def validate_email_format(cls, v: str) -> str:
pattern = r"^[^@\s]+@[^@\s]+\.[^@\s]+$"
if not re.match(pattern, v):
raise ValueError("Invalid email format")
return v
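The validator's regex can be exercised in isolation; a quick stdlib check of what `^[^@\s]+@[^@\s]+\.[^@\s]+$` accepts and rejects (sample addresses are illustrative):

```python
import re

EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

samples = {
    "your_email@gmail.com": True,     # plain address
    "a@b.co": True,                   # short but well-formed
    "no-at-sign.example.com": False,  # missing '@'
    "two@@example.com": False,        # '@' not allowed inside the parts
    "spaces in@example.com": False,   # whitespace rejected
    "user@host": False,               # no dot after '@'
}

for addr, expected in samples.items():
    assert bool(EMAIL_RE.match(addr)) is expected, addr
print("all email samples behave as expected")
```

Note this is deliberately a loose check: it enforces only "something @ something . something" with no whitespace, rather than full RFC 5322 address grammar.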


@@ -0,0 +1,88 @@
from typing import Mapping, Optional, Sequence
from pydantic import BaseModel, Field
from prometheus.lang_graph.graphs.issue_state import IssueType
class IssueRequest(BaseModel):
repository_id: int = Field(
description="The ID of the repository this issue belongs to.", examples=[1]
)
issue_title: str = Field(
description="The title of the issue", examples=["There is a memory leak"]
)
issue_body: str = Field(
description="The description of the issue", examples=["foo/bar.c is causing a memory leak"]
)
issue_comments: Optional[Sequence[Mapping[str, str]]] = Field(
default=None,
description="Comments on the issue",
examples=[
[
{"username": "user1", "comment": "I've experienced this issue as well."},
{
"username": "user2",
"comment": "A potential fix is to adjust the memory settings.",
},
]
],
)
issue_type: IssueType = Field(
default=IssueType.AUTO,
description="The type of the issue, set to auto if you do not know",
examples=[IssueType.AUTO],
)
run_build: Optional[bool] = Field(
default=False,
description="When editing the code, whether we should run the build to verify the fix",
examples=[False],
)
run_existing_test: Optional[bool] = Field(
default=False,
description="When editing the code, whether we should run the existing tests to verify the fix",
examples=[False],
)
run_regression_test: Optional[bool] = Field(
default=True,
description="When editing the code, whether we should run regression tests to verify the fix",
examples=[True],
)
run_reproduce_test: Optional[bool] = Field(
default=True,
description="When editing the code, whether we should run the reproduce test to verify the fix",
examples=[True],
)
number_of_candidate_patch: Optional[int] = Field(
default=3,
description="When the patch is not verified (through build or test), "
"number of candidate patches we generate to select the best one",
examples=[5],
)
dockerfile_content: Optional[str] = Field(
default=None,
description="Specify the containerized environment with dockerfile content",
examples=["FROM python:3.11\nWORKDIR /app\nCOPY . /app"],
)
image_name: Optional[str] = Field(
default=None,
description="Specify the containerized environment with image name that should be pulled from dockerhub",
examples=["python:3.11-slim"],
)
workdir: Optional[str] = Field(
default=None,
description="If you specified the container environment, you must also specify the workdir",
examples=["/app"],
)
build_commands: Optional[Sequence[str]] = Field(
default=None,
description="If you specified dockerfile_content and run_build is True, "
"you must also specify the build commands.",
examples=[["pip install -r requirements.txt", "python -m build"]],
)
test_commands: Optional[Sequence[str]] = Field(
default=None,
description="If you specified dockerfile_content and run_existing_test is True, "
"you must also specify the test commands.",
examples=[["pytest ."]],
)


@@ -0,0 +1,80 @@
import re
from pydantic import BaseModel, Field, field_validator
class UploadRepositoryRequest(BaseModel):
https_url: str = Field(description="The URL of the repository", max_length=100)
commit_id: str | None = Field(
default=None,
description="The commit id of the repository, "
"if not provided, the latest commit in the main branch will be used.",
min_length=40,
max_length=40,
)
github_token: str | None = Field(
default=None,
description="GitHub token for private repository clone. Optional for public repositories.",
max_length=100,
)
class CreateBranchAndPushRequest(BaseModel):
repository_id: int = Field(
description="The ID of the repository this branch belongs to.", examples=[1]
)
patch: str = Field(
description="The patch to apply to the repository", examples=["diff --git a/foo.c b/foo.c"]
)
branch_name: str = Field(
description="The name of the branch to create", examples=["feature/new-feature"]
)
commit_message: str = Field(
description="The commit message for the changes", examples=["Add new feature"]
)
@field_validator("branch_name", mode="after")
def validate_branch_name_format(cls, name: str) -> str:
"""
Check if a branch name is valid according to Git's rules.
Reference: https://git-scm.com/docs/git-check-ref-format
"""
if not name or name in (".", "..") or name.strip() != name:
raise ValueError(
f"Invalid branch name '{name}': name cannot be empty, "
f"'.' or '..', and cannot have leading/trailing spaces."
)
# Cannot start or end with '/'
if name.startswith("/") or name.endswith("/"):
raise ValueError(
f"Invalid branch name '{name}': branch name cannot start or end with '/'. Example: 'feature/new'."
)
# Cannot contain consecutive slashes
if "//" in name:
raise ValueError(
f"Invalid branch name '{name}': branch name cannot contain consecutive slashes '//'."
)
# Cannot contain ASCII control characters or space
if re.search(r"[\000-\037\177\s]", name):
raise ValueError(
f"Invalid branch name '{name}': branch name cannot contain spaces or control characters. "
f"Use '-' or '_' instead of spaces."
)
# Cannot end with .lock
if name.endswith(".lock"):
raise ValueError(f"Invalid branch name '{name}': branch name cannot end with '.lock'.")
# Cannot contain these special sequences
forbidden = ["@", "\\", "?", "[", "~", "^", ":", "*", "..", "@{"]
for token in forbidden:
if token in name:
raise ValueError(
f"Invalid branch name '{name}': contains forbidden sequence or character {token}. "
f"Avoid '@', '?', '*', '..', '@{{', etc."
)
return name
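
The validator above can be exercised outside of Pydantic; a minimal standalone sketch of the same rules (boolean-returning instead of raising, for illustration only):

```python
import re

def is_valid_branch_name(name: str) -> bool:
    # Mirrors the field validator above, returning False instead of raising.
    if not name or name in (".", "..") or name.strip() != name:
        return False
    if name.startswith("/") or name.endswith("/") or "//" in name:
        return False
    if re.search(r"[\000-\037\177\s]", name):  # control characters or whitespace
        return False
    if name.endswith(".lock"):
        return False
    # Forbidden characters and sequences from git-check-ref-format.
    return not any(t in name for t in ("@", "\\", "?", "[", "~", "^", ":", "*", ".."))
```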

View File

@@ -0,0 +1,5 @@
from pydantic import BaseModel, Field
class SetGithubTokenRequest(BaseModel):
github_token: str = Field(description="GitHub token of the user", max_length=100)

View File

@@ -0,0 +1,9 @@
from pydantic import BaseModel
class LoginResponse(BaseModel):
"""
Response model for user login.
"""
access_token: str

View File

@@ -0,0 +1,12 @@
from pydantic import BaseModel
from prometheus.lang_graph.graphs.issue_state import IssueType
class IssueResponse(BaseModel):
patch: str | None = None
passed_reproducing_test: bool
passed_regression_test: bool
passed_existing_test: bool
issue_response: str | None = None
issue_type: IssueType | None = None

View File

@@ -0,0 +1,20 @@
from pydantic import BaseModel
class RepositoryResponse(BaseModel):
"""
Response model for a repository.
"""
model_config = {
"from_attributes": True,
}
id: int
url: str
commit_id: str | None
is_working: bool
user_id: int | None
kg_max_ast_depth: int
kg_chunk_size: int
kg_chunk_overlap: int

View File

@@ -0,0 +1,15 @@
from typing import Generic, TypeVar
from pydantic import BaseModel
T = TypeVar("T")
class Response(BaseModel, Generic[T]):
"""
Generic response model for API responses.
"""
code: int = 200
message: str = "success"
data: T | None = None
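
The generic envelope above is a thin wrapper; a pydantic-free sketch with dataclasses shows the same shape (the `Envelope` name is illustrative, not from the source):

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")

@dataclass
class Envelope(Generic[T]):
    # Same defaults as the Response model above.
    code: int = 200
    message: str = "success"
    data: Optional[T] = None

ok = Envelope[dict](data={"id": 1})
err = Envelope[None](code=404, message="not found")
```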

View File

@@ -0,0 +1,17 @@
from pydantic import BaseModel
class UserResponse(BaseModel):
"""
Response model for a user.
"""
model_config = {
"from_attributes": True,
}
id: int
username: str
email: str
issue_credit: int
is_superuser: bool

View File

@@ -0,0 +1,16 @@
from typing import Set, Tuple
from fastapi import FastAPI
from fastapi.routing import APIRoute
# Set to store routes that require login
login_required_routes: Set[Tuple[str, str]] = set()
def register_login_required_routes(app: FastAPI):
for route in app.routes:
if isinstance(route, APIRoute):
endpoint = route.endpoint
if getattr(endpoint, "_require_login", False):
for method in route.methods:
login_required_routes.add((method, route.path))
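
The registration pass above relies on endpoints carrying a `_require_login` marker. A sketch of how such a marker could be set and collected, without FastAPI (the `require_login` decorator is assumed, it is not shown in this diff):

```python
from typing import Callable, Set, Tuple

login_required: Set[Tuple[str, str]] = set()

def require_login(func: Callable) -> Callable:
    # Hypothetical decorator: sets the marker the registration pass looks for.
    func._require_login = True
    return func

@require_login
def get_profile():
    return {"ok": True}

def open_docs():
    return "docs"

# Simulated registration pass over (method, path, endpoint) triples.
for method, path, endpoint in [("GET", "/profile", get_profile), ("GET", "/docs", open_docs)]:
    if getattr(endpoint, "_require_login", False):
        login_required.add((method, path))
```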

View File

@@ -0,0 +1,18 @@
class BaseService:
"""
Base class for all services in the Prometheus application.
"""
def start(self):
"""
Start the service.
This method should be overridden by subclasses to implement specific startup logic.
"""
pass
def close(self):
"""
Close the service and release any resources.
This method should be overridden by subclasses to implement specific cleanup logic.
"""
pass

View File

@@ -0,0 +1,31 @@
from sqlalchemy.ext.asyncio import create_async_engine
from sqlmodel import SQLModel
from prometheus.app.services.base_service import BaseService
from prometheus.utils.logger_manager import get_logger
class DatabaseService(BaseService):
def __init__(self, database_url: str):
self.engine = create_async_engine(database_url, echo=True)
self._logger = get_logger(__name__)
# Create the database and tables
async def create_db_and_tables(self):
async with self.engine.begin() as conn:
await conn.run_sync(SQLModel.metadata.create_all)
async def start(self):
"""
Start the database service by creating the database and tables.
This method is called when the service is initialized.
"""
await self.create_db_and_tables()
self._logger.info("Database and tables created successfully.")
async def close(self):
"""
Close the database connection and release any resources.
"""
await self.engine.dispose()
self._logger.info("Database connection closed.")

View File

@@ -0,0 +1,79 @@
import datetime
import uuid
from typing import Sequence
from sqlalchemy.ext.asyncio import AsyncSession
from sqlmodel import select
from prometheus.app.entity.invitation_code import InvitationCode
from prometheus.app.services.base_service import BaseService
from prometheus.app.services.database_service import DatabaseService
from prometheus.utils.logger_manager import get_logger
class InvitationCodeService(BaseService):
def __init__(self, database_service: DatabaseService):
self.database_service = database_service
self.engine = database_service.engine
self._logger = get_logger(__name__)
async def create_invitation_code(self) -> InvitationCode:
"""
Create a new invitation code and commit it to the database.
Returns:
InvitationCode: The created invitation code instance.
"""
async with AsyncSession(self.engine) as session:
code = str(uuid.uuid4())
invitation_code = InvitationCode(code=code)
session.add(invitation_code)
await session.commit()
await session.refresh(invitation_code)
return invitation_code
async def list_invitation_codes(self) -> Sequence[InvitationCode]:
"""
List all invitation codes from the database.
Returns:
Sequence[InvitationCode]: A list of all invitation code instances.
"""
async with AsyncSession(self.engine) as session:
statement = select(InvitationCode)
result = await session.execute(statement)
return result.scalars().all()
async def check_invitation_code(self, code: str) -> bool:
"""
Check if an invitation code is valid (exists, is not used, and is not expired).
"""
async with AsyncSession(self.engine) as session:
statement = select(InvitationCode).where(InvitationCode.code == code)
result = await session.execute(statement)
invitation_code = result.scalar_one_or_none()
if not invitation_code:
return False
if invitation_code.is_used:
return False
exp = invitation_code.expiration_time
# If our database returned a naive datetime, assume it's UTC
if exp.tzinfo is None:
exp = exp.replace(tzinfo=datetime.timezone.utc)
if exp < datetime.datetime.now(datetime.timezone.utc):
return False
return True
async def mark_code_as_used(self, code: str) -> None:
"""
Mark an invitation code as used.
"""
async with AsyncSession(self.engine) as session:
statement = select(InvitationCode).where(InvitationCode.code == code)
result = await session.execute(statement)
invitation_code = result.scalar_one_or_none()
if invitation_code:
invitation_code.is_used = True
session.add(invitation_code)
await session.commit()
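
The expiry logic in `check_invitation_code` normalizes naive database timestamps to UTC before comparing; the check in isolation:

```python
import datetime

def code_is_valid(is_used: bool, expiration_time: datetime.datetime) -> bool:
    # Mirrors check_invitation_code: naive timestamps are assumed to be UTC.
    if is_used:
        return False
    exp = expiration_time
    if exp.tzinfo is None:
        exp = exp.replace(tzinfo=datetime.timezone.utc)
    return exp >= datetime.datetime.now(datetime.timezone.utc)

now = datetime.datetime.now(datetime.timezone.utc)
fresh = now + datetime.timedelta(days=1)
stale_naive = (now - datetime.timedelta(days=1)).replace(tzinfo=None)  # naive, treated as UTC
```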

View File

@@ -0,0 +1,149 @@
import logging
import threading
import traceback
from datetime import datetime
from pathlib import Path
from typing import Mapping, Optional, Sequence
from prometheus.app.services.base_service import BaseService
from prometheus.app.services.llm_service import LLMService
from prometheus.docker.general_container import GeneralContainer
from prometheus.docker.user_defined_container import UserDefinedContainer
from prometheus.git.git_repository import GitRepository
from prometheus.graph.knowledge_graph import KnowledgeGraph
from prometheus.lang_graph.graphs.issue_graph import IssueGraph
from prometheus.lang_graph.graphs.issue_state import IssueType
class IssueService(BaseService):
def __init__(
self,
llm_service: LLMService,
working_directory: str,
logging_level: str,
):
self.llm_service = llm_service
self.working_directory = working_directory
# Create a directory for answer issue logs
self.answer_issue_log_dir = Path(self.working_directory) / "answer_issue_logs"
self.answer_issue_log_dir.mkdir(parents=True, exist_ok=True)
self.logging_level = logging_level
def answer_issue(
self,
knowledge_graph: KnowledgeGraph,
repository: GitRepository,
repository_id: int,
issue_title: str,
issue_body: str,
issue_comments: Sequence[Mapping[str, str]],
issue_type: IssueType,
run_build: bool,
run_existing_test: bool,
run_regression_test: bool,
run_reproduce_test: bool,
number_of_candidate_patch: int,
build_commands: Optional[Sequence[str]],
test_commands: Optional[Sequence[str]],
dockerfile_content: Optional[str] = None,
image_name: Optional[str] = None,
workdir: Optional[str] = None,
) -> tuple[None, bool, bool, bool, None, None] | tuple[str, bool, bool, bool, str, IssueType]:
"""
Processes an issue, generates patches if needed, runs optional builds and tests, and returns the results.
Args:
knowledge_graph (KnowledgeGraph): The knowledge graph instance.
repository (GitRepository): The Git repository instance.
repository_id (int): The repository ID.
issue_title (str): The title of the issue.
issue_body (str): The body of the issue.
issue_comments (Sequence[Mapping[str, str]]): Comments on the issue.
issue_type (IssueType): The type of the issue (BUG or QUESTION).
run_build (bool): Whether to run the build commands.
run_existing_test (bool): Whether to run existing tests.
run_regression_test (bool): Whether to run regression tests.
run_reproduce_test (bool): Whether to run reproduce tests.
number_of_candidate_patch (int): Number of candidate patches to generate.
dockerfile_content (Optional[str]): Content of the Dockerfile for user-defined environments.
image_name (Optional[str]): Name of the Docker image.
workdir (Optional[str]): Working directory for the container.
build_commands (Optional[Sequence[str]]): Commands to build the project.
test_commands (Optional[Sequence[str]]): Commands to test the project.
Returns:
Tuple containing:
- edit_patch (str): The generated patch for the issue.
- passed_reproducing_test (bool): Whether the reproducing test passed.
- passed_regression_test (bool): Whether the regression tests passed.
- passed_existing_test (bool): Whether the existing tests passed.
- issue_response (str): Response generated for the issue.
- issue_type (IssueType): The type of the issue (BUG or QUESTION).
"""
# Set up a dedicated logger for this thread
logger = logging.getLogger(f"thread-{threading.get_ident()}.prometheus")
logger.setLevel(getattr(logging, self.logging_level))
formatter = logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
log_file = self.answer_issue_log_dir / f"{timestamp}_{threading.get_ident()}.log"
file_handler = logging.FileHandler(log_file)
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
# Construct the working directory
if dockerfile_content or image_name:
container = UserDefinedContainer(
project_path=repository.get_working_directory(),
workdir=workdir,
build_commands=build_commands,
test_commands=test_commands,
dockerfile_content=dockerfile_content,
image_name=image_name,
)
else:
container = GeneralContainer(
project_path=repository.get_working_directory(),
build_commands=build_commands,
test_commands=test_commands,
)
# Initialize the IssueGraph with the provided services and parameters
issue_graph = IssueGraph(
advanced_model=self.llm_service.advanced_model,
base_model=self.llm_service.base_model,
kg=knowledge_graph,
git_repo=repository,
container=container,
repository_id=repository_id,
test_commands=test_commands,
)
try:
# Invoke the issue graph with the provided parameters
output_state = issue_graph.invoke(
issue_title,
issue_body,
issue_comments,
issue_type,
run_build,
run_existing_test,
run_regression_test,
run_reproduce_test,
number_of_candidate_patch,
)
return (
output_state["edit_patch"],
output_state["passed_reproducing_test"],
output_state["passed_regression_test"],
output_state["passed_existing_test"],
output_state["issue_response"],
output_state["issue_type"],
)
except Exception as e:
logger.error(f"Error in answer_issue: {str(e)}\n{traceback.format_exc()}")
return None, False, False, False, None, None
finally:
# Remove multi-thread file handler
logger.removeHandler(file_handler)
file_handler.close()
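
The per-thread logger wiring above can be seen in isolation; a sketch that writes to an in-memory buffer instead of a per-run log file:

```python
import io
import logging
import threading

def make_thread_logger(level: str = "INFO"):
    # Mirrors answer_issue's logger setup, minus the file handler.
    logger = logging.getLogger(f"thread-{threading.get_ident()}.demo")
    logger.setLevel(getattr(logging, level))
    buffer = io.StringIO()
    handler = logging.StreamHandler(buffer)
    handler.setFormatter(logging.Formatter("%(levelname)s - %(message)s"))
    logger.addHandler(handler)
    return logger, handler, buffer

logger, handler, buffer = make_thread_logger()
logger.info("candidate patch generated")
logger.removeHandler(handler)  # same cleanup as the finally block above
captured = buffer.getvalue()
```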

View File

@@ -0,0 +1,85 @@
"""Service for managing and interacting with Knowledge Graphs in Neo4j."""
import asyncio
from pathlib import Path
from prometheus.app.services.base_service import BaseService
from prometheus.app.services.neo4j_service import Neo4jService
from prometheus.graph.knowledge_graph import KnowledgeGraph
from prometheus.neo4j import knowledge_graph_handler
from prometheus.utils.logger_manager import get_logger
class KnowledgeGraphService(BaseService):
"""Manages the lifecycle and operations of Knowledge Graphs.
This service handles the creation, persistence, and management of Knowledge Graphs
that represent the structure of the whole codebase. It provides capabilities for building graphs
from a codebase, storing them in Neo4j, and managing their lifecycle.
"""
def __init__(
self,
neo4j_service: Neo4jService,
neo4j_batch_size: int,
max_ast_depth: int,
chunk_size: int,
chunk_overlap: int,
):
"""Initializes the Knowledge Graph service.
Args:
neo4j_service: Service providing Neo4j database access.
neo4j_batch_size: Number of nodes to process in each Neo4j batch operation.
max_ast_depth: Maximum depth to traverse when building AST representations.
chunk_size: Chunk size for processing text files.
chunk_overlap: Overlap size for processing text files.
"""
self.kg_handler = knowledge_graph_handler.KnowledgeGraphHandler(
neo4j_service.neo4j_driver, neo4j_batch_size
)
self.max_ast_depth = max_ast_depth
self.chunk_size = chunk_size
self.chunk_overlap = chunk_overlap
self.writing_lock = asyncio.Lock()
self._logger = get_logger(__name__)
async def start(self):
# Initialize the Neo4j database for Knowledge Graph operations
await self.kg_handler.init_database()
self._logger.info("Starting Knowledge Graph Service")
async def build_and_save_knowledge_graph(self, path: Path) -> int:
"""Builds a new Knowledge Graph from source code and saves it to Neo4j.
Creates a new Knowledge Graph representation of the codebase at the specified path,
optionally associating it with a repository URL and commit. Any existing
Knowledge Graph will be cleared before building the new one.
Args:
path: Path to the source code directory to analyze.
Returns:
The root node ID of the newly created Knowledge Graph.
"""
async with self.writing_lock: # Ensure only one build operation at a time
root_node_id = await self.kg_handler.get_new_knowledge_graph_root_node_id()
kg = KnowledgeGraph(
self.max_ast_depth, self.chunk_size, self.chunk_overlap, root_node_id
)
await kg.build_graph(path)
await self.kg_handler.write_knowledge_graph(kg)
return kg.root_node_id
async def clear_kg(self, root_node_id: int):
await self.kg_handler.clear_knowledge_graph(root_node_id)
async def get_knowledge_graph(
self,
root_node_id: int,
max_ast_depth: int,
chunk_size: int,
chunk_overlap: int,
) -> KnowledgeGraph:
return await self.kg_handler.read_knowledge_graph(
root_node_id, max_ast_depth, chunk_size, chunk_overlap
)

View File

@@ -0,0 +1,73 @@
from typing import Optional
from langchain_anthropic import ChatAnthropic
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_google_genai import ChatGoogleGenerativeAI
from prometheus.app.services.base_service import BaseService
from prometheus.chat_models.custom_chat_openai import CustomChatOpenAI
class LLMService(BaseService):
def __init__(
self,
advanced_model_name: str,
base_model_name: str,
advanced_model_temperature: float,
base_model_temperature: float,
openai_format_api_key: Optional[str] = None,
openai_format_base_url: Optional[str] = None,
anthropic_api_key: Optional[str] = None,
gemini_api_key: Optional[str] = None,
):
self.advanced_model = get_model(
advanced_model_name,
advanced_model_temperature,
openai_format_api_key,
openai_format_base_url,
anthropic_api_key,
gemini_api_key,
)
self.base_model = get_model(
base_model_name,
base_model_temperature,
openai_format_api_key,
openai_format_base_url,
anthropic_api_key,
gemini_api_key,
)
def get_model(
model_name: str,
temperature: float,
openai_format_api_key: Optional[str] = None,
openai_format_base_url: Optional[str] = None,
anthropic_api_key: Optional[str] = None,
gemini_api_key: Optional[str] = None,
) -> BaseChatModel:
if "claude" in model_name:
return ChatAnthropic(
model_name=model_name,
api_key=anthropic_api_key,
temperature=temperature,
max_retries=3,
)
elif "gemini" in model_name:
return ChatGoogleGenerativeAI(
model=model_name,
api_key=gemini_api_key,
temperature=temperature,
max_retries=3,
)
else:
"""
Custom OpenAI chat model with specific configuration.
"""
return CustomChatOpenAI(
model=model_name,
api_key=openai_format_api_key,
base_url=openai_format_base_url,
temperature=temperature,
max_retries=3,
)
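
The dispatch in `get_model` is a substring match on the model name; stripped of client construction, it reduces to:

```python
def route_model_family(model_name: str) -> str:
    # Same substring dispatch as get_model above, minus the client objects.
    if "claude" in model_name:
        return "anthropic"
    if "gemini" in model_name:
        return "google"
    return "openai-compatible"
```

Any name not matching the Anthropic or Google families falls through to the OpenAI-format client.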

View File

@@ -0,0 +1,22 @@
"""Service for managing Neo4j database driver."""
from neo4j import AsyncGraphDatabase
from prometheus.app.services.base_service import BaseService
from prometheus.utils.logger_manager import get_logger
class Neo4jService(BaseService):
def __init__(self, neo4j_uri: str, neo4j_username: str, neo4j_password: str):
self._logger = get_logger(__name__)
self.neo4j_driver = AsyncGraphDatabase.driver(
neo4j_uri,
auth=(neo4j_username, neo4j_password),
connection_timeout=1200,
max_transaction_retry_time=1200,
keep_alive=True,
)
async def close(self):
await self.neo4j_driver.close()
self._logger.info("Neo4j driver connection closed.")

View File

@@ -0,0 +1,234 @@
"""Service for managing repository (GitHub or local) operations."""
import shutil
import uuid
from pathlib import Path
from typing import Optional
from sqlalchemy.ext.asyncio import AsyncSession
from sqlmodel import select
from prometheus.app.entity.repository import Repository
from prometheus.app.services.base_service import BaseService
from prometheus.app.services.database_service import DatabaseService
from prometheus.app.services.knowledge_graph_service import KnowledgeGraphService
from prometheus.git.git_repository import GitRepository
class RepositoryService(BaseService):
"""Manages repository operations.
This service provides functionality for Git repository operations including
cloning repositories, managing commits, pushing changes, and maintaining
a clean working directory. It integrates with a knowledge graph service
to track repository state and avoid redundant operations.
"""
def __init__(
self,
kg_service: KnowledgeGraphService,
database_service: DatabaseService,
working_dir: str,
):
"""Initializes the repository service.
Args:
kg_service: Knowledge graph service instance for codebase tracking.
database_service: Database service providing the async database engine.
working_dir: Base directory for repository operations. A 'repositories'
subdirectory will be created under this path.
"""
self.kg_service = kg_service
self.database_service = database_service
self.engine = database_service.engine
self.target_directory = Path(working_dir) / "repositories"
self.target_directory.mkdir(parents=True, exist_ok=True)
def get_new_playground_path(self) -> Path:
"""Generates a new unique playground path for cloning a repository.
Returns:
A Path object representing the new unique playground directory.
"""
unique_id = uuid.uuid4().hex
new_path = self.target_directory / unique_id
while new_path.exists():
unique_id = uuid.uuid4().hex
new_path = self.target_directory / unique_id
new_path.mkdir(parents=True)
return new_path
async def clone_github_repo(
self, github_token: str | None, https_url: str, commit_id: Optional[str] = None
) -> Path:
"""Clones a GitHub repository to the local workspace.
Clones the specified repository into a fresh playground directory and
optionally checks out a specific commit.
Args:
github_token: GitHub access token for authentication. None for public repositories.
https_url: HTTPS URL of the GitHub repository.
commit_id: Optional specific commit to check out.
Returns:
Path to the local repository directory.
"""
git_repo = GitRepository()
await git_repo.from_clone_repository(
https_url, github_token, self.get_new_playground_path()
)
if commit_id:
git_repo.checkout_commit(commit_id)
return git_repo.get_working_directory()
async def create_new_repository(
self,
url: str,
commit_id: Optional[str],
playground_path: str,
user_id: Optional[int],
kg_root_node_id: int,
) -> int:
"""
Creates a new repository record in the database.
Args:
url: The url of the repository to be created.
commit_id: Optional commit ID to associate with the repository.
playground_path: Path where the repository will be cloned.
user_id: Optional user ID associated with the repository.
kg_root_node_id: ID of the root node in the knowledge graph for this repository.
Returns:
The ID of the newly created repository in the database.
"""
async with AsyncSession(self.engine) as session:
repository = Repository(
url=url,
commit_id=commit_id,
playground_path=playground_path,
user_id=user_id,
kg_root_node_id=kg_root_node_id,
kg_max_ast_depth=self.kg_service.max_ast_depth,
kg_chunk_size=self.kg_service.chunk_size,
kg_chunk_overlap=self.kg_service.chunk_overlap,
)
session.add(repository)
await session.commit()
await session.refresh(repository)
return repository.id
async def get_repository_by_id(self, repository_id: int) -> Optional[Repository]:
"""
Retrieves a repository by its ID.
Args:
repository_id: The ID of the repository to retrieve.
Returns:
The Repository instance if found, otherwise None.
"""
async with AsyncSession(self.engine) as session:
return await session.get(Repository, repository_id)
async def get_repository_by_url_and_commit_id(
self, url: str, commit_id: str
) -> Optional[Repository]:
"""
Retrieves a repository by its URL and commit ID.
Args:
url: The URL of the repository.
commit_id: The commit ID of the repository.
Returns:
The Repository instance if found, otherwise None.
"""
async with AsyncSession(self.engine) as session:
statement = select(Repository).where(
Repository.url == url, Repository.commit_id == commit_id
)
result = await session.execute(statement)
return result.scalars().first()
async def get_repository_by_url_commit_id_and_user_id(
self, url: str, commit_id: str, user_id: int
) -> Optional[Repository]:
"""
Retrieves a repository by its URL, commit ID, and user ID.
Args:
url: The URL of the repository.
commit_id: The commit ID of the repository.
user_id: The user ID of the repository.
Returns:
The Repository instance if found, otherwise None.
"""
async with AsyncSession(self.engine) as session:
statement = select(Repository).where(
Repository.url == url,
Repository.commit_id == commit_id,
Repository.user_id == user_id,
)
result = await session.execute(statement)
return result.scalars().first()
async def update_repository_status(self, repository_id: int, is_working: bool):
"""
Updates the working status of a repository.
Args:
repository_id: The ID of the repository to update.
is_working: The new working status to set for the repository.
"""
async with AsyncSession(self.engine) as session:
repository = await session.get(Repository, repository_id)
if repository:
repository.is_working = is_working
session.add(repository)
await session.commit()
def clean_repository(self, repository: Repository):
path = Path(repository.playground_path)
if path.exists():
shutil.rmtree(repository.playground_path)
path.parent.rmdir()
async def delete_repository(self, repository: Repository):
"""
Deletes a repository from the database.
Args:
repository: The repository instance to delete.
"""
async with AsyncSession(self.engine) as session:
obj = await session.get(Repository, repository.id)
if obj:
await session.delete(obj)
await session.commit()
def get_repository(self, local_path) -> GitRepository:
git_repo = GitRepository()
git_repo.from_local_repository(Path(local_path))
return git_repo
async def get_repositories_by_user_id(self, user_id):
"""
Retrieves all repositories associated with a specific user ID.
"""
async with AsyncSession(self.engine) as session:
statement = select(Repository).where(Repository.user_id == user_id)
result = await session.execute(statement)
return result.scalars().all()
async def get_all_repositories(self):
"""
Retrieves all repositories in the database.
"""
async with AsyncSession(self.engine) as session:
statement = select(Repository)
result = await session.execute(statement)
return result.scalars().all()
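
The unique-playground scheme in `get_new_playground_path` regenerates the UUID on the (unlikely) collision; in isolation:

```python
import tempfile
import uuid
from pathlib import Path

def new_playground_path(target_directory: Path) -> Path:
    # Mirrors get_new_playground_path: retry until an unused directory name is found.
    new_path = target_directory / uuid.uuid4().hex
    while new_path.exists():
        new_path = target_directory / uuid.uuid4().hex
    new_path.mkdir(parents=True)
    return new_path

base = Path(tempfile.mkdtemp())
first = new_playground_path(base)
second = new_playground_path(base)
```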

View File

@@ -0,0 +1,185 @@
from typing import Optional, Sequence
from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError
from sqlalchemy.ext.asyncio import AsyncSession
from sqlmodel import or_, select
from prometheus.app.entity.user import User
from prometheus.app.services.base_service import BaseService
from prometheus.app.services.database_service import DatabaseService
from prometheus.exceptions.server_exception import ServerException
from prometheus.utils.jwt_utils import JWTUtils
from prometheus.utils.logger_manager import get_logger
class UserService(BaseService):
def __init__(self, database_service: DatabaseService):
self.database_service = database_service
self.engine = database_service.engine
self._logger = get_logger(__name__)
self.ph = PasswordHasher()
self.jwt_utils = JWTUtils()
async def create_user(
self,
username: str,
email: str,
password: str,
github_token: Optional[str] = None,
issue_credit: int = 0,
is_superuser: bool = False,
) -> None:
"""
Create a new user and commit it to the database.
Args:
username (str): Desired username.
email (str): Email address.
password (str): Plaintext password (will be hashed).
github_token (Optional[str]): Optional GitHub token.
issue_credit (int): Optional issue credit.
is_superuser (bool): Whether the user is a superuser.
Raises:
ServerException: If the username or email already exists.
"""
async with AsyncSession(self.engine) as session:
statement = select(User).where(User.username == username)
if (await session.execute(statement)).scalar_one_or_none():
raise ServerException(400, f"Username '{username}' already exists")
statement = select(User).where(User.email == email)
if (await session.execute(statement)).scalar_one_or_none():
raise ServerException(400, f"Email '{email}' already exists")
hashed_password = self.ph.hash(password)
user = User(
username=username,
email=email,
password_hash=hashed_password,
github_token=github_token,
issue_credit=issue_credit,
is_superuser=is_superuser,
)
session.add(user)
await session.commit()
await session.refresh(user)
async def login(self, username: str, email: str, password: str) -> str:
"""
Log in a user by verifying their credentials and return an access token.
Args:
username (str): Username of the user.
email (str): Email address of the user.
password (str): Plaintext password.
"""
async with AsyncSession(self.engine) as session:
statement = select(User).where(or_(User.username == username, User.email == email))
user = (await session.execute(statement)).scalar_one_or_none()
if not user:
raise ServerException(code=400, message="Invalid username or email")
try:
self.ph.verify(user.password_hash, password)
except VerifyMismatchError:
raise ServerException(code=400, message="Invalid password")
# Generate and return a JWT token for the user
token = self.jwt_utils.generate_token({"user_id": user.id})
return token
# Create a superuser and commit it to the database
async def create_superuser(
self,
username: str,
email: str,
password: str,
github_token: Optional[str] = None,
) -> None:
"""
Create a new superuser in the database.
This method creates a superuser with the provided credentials and commits it to the database.
"""
await self.create_user(
username, email, password, github_token, is_superuser=True, issue_credit=999999
)
self._logger.info(f"Superuser '{username}' created successfully.")
async def get_user_by_id(self, user_id: int) -> Optional[User]:
"""
Retrieve a user by their ID.
Args:
user_id (int): The ID of the user to retrieve.
Returns:
User: The user instance if found, otherwise None.
"""
async with AsyncSession(self.engine) as session:
statement = select(User).where(User.id == user_id)
return (await session.execute(statement)).scalar_one_or_none()
async def get_issue_credit(self, user_id: int) -> int:
"""
Retrieve the issue credit of a user by their ID.
Args:
user_id (int): The ID of the user.
Returns:
int: The issue credit of the user.
"""
async with AsyncSession(self.engine) as session:
statement = select(User.issue_credit).where(User.id == user_id)
result = (await session.execute(statement)).scalar_one_or_none()
return int(result) if result is not None else 0
async def update_issue_credit(self, user_id: int, new_issue_credit) -> None:
"""
Update the issue credit of a user by their ID.
Args:
user_id (int): The ID of the user.
new_issue_credit (int): The new issue credit.
"""
async with AsyncSession(self.engine) as session:
statement = select(User).where(User.id == user_id)
user = (await session.execute(statement)).scalar_one_or_none()
if user:
user.issue_credit = new_issue_credit
session.add(user)
await session.commit()
async def is_admin(self, user_id):
"""
Check if a user is an admin (superuser) by their ID.
"""
async with AsyncSession(self.engine) as session:
statement = select(User).where(User.id == user_id)
user = (await session.execute(statement)).scalar_one_or_none()
return user.is_superuser if user else False
async def list_users(self) -> Sequence[User]:
"""
List all users in the database.
"""
async with AsyncSession(self.engine) as session:
statement = select(User)
users = (await session.execute(statement)).scalars().all()
return users
async def set_github_token(self, user_id: int, github_token: str):
"""
Set GitHub token for a user by their ID.
"""
async with AsyncSession(self.engine) as session:
statement = select(User).where(User.id == user_id)
user = (await session.execute(statement)).scalar_one_or_none()
if user:
user.github_token = github_token
session.add(user)
await session.commit()
await session.refresh(user)
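
The service hashes with argon2 and verifies on login; a stdlib-only sketch of the same hash-then-verify contract, using `hashlib.scrypt` as an illustrative stand-in (parameters here are not production-tuned):

```python
import hashlib
import hmac
import os

def hash_password(password: str) -> tuple[bytes, bytes]:
    # Stand-in for argon2's PasswordHasher.hash: salted, slow hash.
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode(), salt=salt, n=2**12, r=8, p=1)
    return salt, digest

def verify_password(password: str, salt: bytes, digest: bytes) -> bool:
    # Constant-time comparison, like PasswordHasher.verify's pass/raise contract.
    candidate = hashlib.scrypt(password.encode(), salt=salt, n=2**12, r=8, p=1)
    return hmac.compare_digest(candidate, digest)

salt, digest = hash_password("s3cret")
```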

View File

@@ -0,0 +1,33 @@
import logging
import threading
from typing import Any, Optional
from langchain_core.language_models import LanguageModelInput
from langchain_core.messages import BaseMessage
from langchain_core.runnables import RunnableConfig
from langchain_openai import ChatOpenAI
class CustomChatOpenAI(ChatOpenAI):
def __init__(self, *args: Any, **kwargs: Any):
super().__init__(*args, **kwargs)
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def bind_tools(self, tools, tool_choice=None, **kwargs):
kwargs["parallel_tool_calls"] = False
return super().bind_tools(tools, tool_choice=tool_choice, **kwargs)
def invoke(
self,
input: LanguageModelInput,
config: Optional[RunnableConfig] = None,
*,
stop: Optional[list[str]] = None,
**kwargs: Any,
) -> BaseMessage:
return super().invoke(
input=input,
config=config,
stop=stop,
**kwargs,
)

View File

@@ -0,0 +1,72 @@
from typing import List, Literal, Optional
from pydantic_settings import BaseSettings, SettingsConfigDict
class Settings(BaseSettings):
model_config = SettingsConfigDict(
env_file=".env", env_file_encoding="utf-8", env_prefix="PROMETHEUS_"
)
# General settings
version: str = "1.3"
BASE_URL: str = f"/v{version}"
PROJECT_NAME: str = "Prometheus"
ENVIRONMENT: Literal["local", "production"]
BACKEND_CORS_ORIGINS: List[str]
ENABLE_AUTHENTICATION: bool
# Logging
LOGGING_LEVEL: Literal["DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL"]
# Neo4j
NEO4J_URI: str
NEO4J_USERNAME: str
NEO4J_PASSWORD: str
NEO4J_BATCH_SIZE: int
# Knowledge Graph
WORKING_DIRECTORY: str
KNOWLEDGE_GRAPH_MAX_AST_DEPTH: int
KNOWLEDGE_GRAPH_CHUNK_SIZE: int
KNOWLEDGE_GRAPH_CHUNK_OVERLAP: int
# LLM models
ADVANCED_MODEL: str
BASE_MODEL: str
# API Keys
ANTHROPIC_API_KEY: Optional[str] = None
GEMINI_API_KEY: Optional[str] = None
OPENAI_FORMAT_BASE_URL: Optional[str] = None
OPENAI_FORMAT_API_KEY: Optional[str] = None
# Model parameters
ADVANCED_MODEL_TEMPERATURE: Optional[float] = None
BASE_MODEL_TEMPERATURE: Optional[float] = None
# Database
DATABASE_URL: str
# JWT Configuration
JWT_SECRET_KEY: str
ACCESS_TOKEN_EXPIRE_TIME: int = 30 # days
# Invitation Code Expire Time
INVITATION_CODE_EXPIRE_TIME: int = 14 # days
# Default normal user issue credit
DEFAULT_USER_ISSUE_CREDIT: int = 20
# Default normal user repository number
DEFAULT_USER_REPOSITORY_LIMIT: int = 5
# Tavily API key for the web search tool
TAVILY_API_KEY: str
# Athena semantic memory service
ATHENA_BASE_URL: Optional[str] = None
settings = Settings()
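
With `env_prefix="PROMETHEUS_"`, each field is read from an environment variable of that name upper-cased under the prefix; the mapping alone, without pydantic-settings:

```python
import os
from typing import Optional

# Illustrative: how the PROMETHEUS_ prefix maps onto Settings field names.
os.environ["PROMETHEUS_NEO4J_URI"] = "bolt://localhost:7687"

def load_setting(field: str, prefix: str = "PROMETHEUS_") -> Optional[str]:
    return os.environ.get(prefix + field.upper())

neo4j_uri = load_setting("neo4j_uri")
```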

View File

View File

@@ -0,0 +1,365 @@
import logging
import shutil
import tarfile
import tempfile
import threading
from abc import ABC, abstractmethod
from pathlib import Path
from typing import Optional, Sequence
import docker
import pexpect
from prometheus.exceptions.docker_exception import DockerException
class BaseContainer(ABC):
"""An abstract base class for managing Docker containers with file synchronization capabilities.
This class provides core functionality for creating, managing, and interacting with Docker
containers. It handles container lifecycle operations including building images, starting
containers, updating files, and cleanup. The class is designed to be extended for specific
container implementations that specify the Dockerfile content, the build process, and how tests are run.
A persistent shell is maintained so that command execution context carries over between calls.
"""
client: docker.DockerClient = docker.from_env()
tag_name: str
workdir: str = "/app"
container: docker.models.containers.Container
project_path: Path
timeout: int = 300 # Timeout for commands in seconds
logger: logging.Logger
shell: Optional[pexpect.spawn] = None # Persistent shell
def __init__(
self,
project_path: Path,
workdir: Optional[str] = None,
build_commands: Optional[Sequence[str]] = None,
test_commands: Optional[Sequence[str]] = None,
):
"""Initialize the container with a project directory.
Creates a temporary copy of the project directory to work with.
Args:
project_path: Path to the project directory to be containerized.
workdir: Optional working directory inside the container (defaults to /app).
build_commands: Optional commands executed by run_build().
test_commands: Optional commands executed by run_test().
"""
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
temp_dir = Path(tempfile.mkdtemp())
temp_project_path = temp_dir / project_path.name
shutil.copytree(
project_path,
temp_project_path,
symlinks=True,  # Don't follow symlinks; copy the symlink itself
ignore_dangling_symlinks=True,
)
self.project_path = temp_project_path.absolute()
self._logger.info(f"Created temporary project directory: {self.project_path}")
self.build_commands = build_commands
self.test_commands = test_commands
if workdir:
self.workdir = workdir
self._logger.debug(f"Using workdir: {self.workdir}")
self.container = None
self.shell = None
@abstractmethod
def get_dockerfile_content(self) -> str:
"""Get the content of the Dockerfile for building the container image.
Returns:
str: Content of the Dockerfile as a string.
"""
pass
def build_docker_image(self):
"""Build a Docker image using the Dockerfile content.
Creates a Dockerfile in the project directory and builds a Docker image
using the specified tag name.
"""
dockerfile_content = self.get_dockerfile_content()
dockerfile_path = self.project_path / "prometheus.Dockerfile"
dockerfile_path.write_text(dockerfile_content)
# Temporarily move the .dockerignore file so it does not exclude files from the build context
dockerignore_path = self.project_path / ".dockerignore"
backup_path = None
if dockerignore_path.exists():
backup_path = self.project_path / ".dockerignore.backup"
dockerignore_path.rename(backup_path)
self._logger.info("Temporarily renamed .dockerignore to avoid excluding files")
# Log the build process
self._logger.info(f"Building docker image {self.tag_name}")
self._logger.info(f"Build context path: {self.project_path}")
self._logger.info(f"Dockerfile: {dockerfile_path.name}")
# Build the Docker image with detailed logging
try:
build_logs = self.client.api.build(
path=str(self.project_path),
dockerfile=dockerfile_path.name,
tag=self.tag_name,
rm=True,
decode=True,
)
# Process build logs line by line
for log_entry in build_logs:
if "stream" in log_entry:
log_line = log_entry["stream"].strip()
if log_line:
self._logger.info(f"[BUILD] {log_line}")
elif "error" in log_entry:
error_msg = log_entry["error"].strip()
self._logger.error(f"[BUILD ERROR] {error_msg}")
raise docker.errors.BuildError(error_msg, build_logs)
elif "status" in log_entry:
status_msg = log_entry["status"].strip()
if status_msg:
self._logger.debug(f"[BUILD STATUS] {status_msg}")
self._logger.info(f"Successfully built docker image {self.tag_name}")
except docker.errors.BuildError as e:
self._logger.error(f"Docker build failed for image {self.tag_name}")
self._logger.error(f"Build error: {str(e)}")
raise
finally:
# Restore .dockerignore
if backup_path and backup_path.exists():
backup_path.rename(dockerignore_path)
self._logger.info("Restored .dockerignore file")
def start_container(self):
"""Start a Docker container from the built image.
Starts a detached container with TTY enabled and mounts the Docker socket.
Also initializes the persistent shell.
"""
self._logger.info(f"Starting container from image {self.tag_name}")
self.container = self.client.containers.run(
self.tag_name,
detach=True,
tty=True,
network_mode="host",
environment={"PYTHONPATH": f"{self.workdir}:$PYTHONPATH"},
volumes={"/var/run/docker.sock": {"bind": "/var/run/docker.sock", "mode": "rw"}},
)
# Initialize persistent shell
self._start_persistent_shell()
def _start_persistent_shell(self):
"""Start a persistent bash shell inside the container using pexpect."""
if not self.container:
self._logger.error("Container must be started before initializing shell")
return
self._logger.info("Starting persistent shell for interactive mode...")
try:
command = f"docker exec -it {self.container.id} /bin/bash"
self.shell = pexpect.spawn(command, encoding="utf-8", timeout=self.timeout)
# Wait for the initial shell prompt
self.shell.expect([r"\$", r"#"], timeout=60)
self._logger.info("Persistent shell is ready")
except pexpect.exceptions.TIMEOUT:
self._logger.error(
"Timeout waiting for shell prompt. The container might be slow to start or misconfigured."
)
if self.shell:
self.shell.close(force=True)
self.shell = None
raise DockerException("Timeout waiting for shell prompt.")
except Exception as e:
self._logger.error(f"Failed to start persistent shell: {e}")
if self.shell:
self.shell.close(force=True)
self.shell = None
raise DockerException(f"Failed to start persistent shell: {e}")
def _restart_shell_if_needed(self):
"""Restart the shell if it's not alive."""
if not self.shell or not self.shell.isalive():
self._logger.warning("Shell not found or died. Attempting to restart...")
if self.shell:
self.shell.close(force=True)
self._start_persistent_shell()
if self.shell is None:
raise DockerException("Failed to start or restart the persistent shell.")
def is_running(self) -> bool:
return bool(self.container)
def update_files(
self, project_root_path: Path, updated_files: Sequence[Path], removed_files: Sequence[Path]
):
"""Update files in the running container with files from a local directory.
Creates a tar archive of the new files and copies them into the workdir of the container.
Args:
project_root_path: Path to the project root directory.
updated_files: List of file paths (relative to project_root_path) to update in the container.
removed_files: List of file paths (relative to project_root_path) to remove from the container.
"""
if not project_root_path.is_absolute():
raise ValueError(f"project_root_path {project_root_path} must be an absolute path")
self._logger.info("Updating files in the container after edits.")
for file in removed_files:
self._logger.info(f"Removing file {file} in the container")
self.execute_command(f"rm {file}")
parent_dirs = {str(file.parent) for file in updated_files}
for dir_path in sorted(parent_dirs):
self._logger.info(f"Creating directory {dir_path} in the container")
self.execute_command(f"mkdir -p {dir_path}")
with tempfile.NamedTemporaryFile() as temp_tar:
with tarfile.open(fileobj=temp_tar, mode="w") as tar:
for file in updated_files:
local_absolute_file = project_root_path / file
self._logger.info(f"Updating {file} in the container")
tar.add(local_absolute_file, arcname=str(file))
temp_tar.seek(0)
self.container.put_archive(self.workdir, temp_tar.read())
self._logger.info("Files updated successfully")
def run_build(self) -> str:
"""Run build commands and return combined output."""
if not self.build_commands:
self._logger.error("No build commands defined")
return ""
command_output = ""
for build_command in self.build_commands:
command_output += f"$ {build_command}\n"
command_output += f"{self.execute_command(build_command)}\n"
return command_output
def run_test(self) -> str:
"""Run test commands and return combined output."""
if not self.test_commands:
self._logger.error("No test commands defined")
return ""
command_output = ""
for test_command in self.test_commands:
command_output += f"$ {test_command}\n"
command_output += f"{self.execute_command(test_command)}\n"
return command_output
def execute_command(self, command: str) -> str:
"""Execute a command in the running container using persistent shell.
Args:
command: Command to execute in the container.
Returns:
str: Output of the command.
"""
self._logger.debug(f"Executing command: {command}")
# Ensure shell is available
self._restart_shell_if_needed()
# Unique marker to identify command completion and exit code
marker = "---CMD_DONE---"
full_command = command.strip()
marker_command = f"echo {marker}$?"
try:
self.shell.sendline(full_command)
self.shell.sendline(marker_command)
# Wait for the marker with exit code
self.shell.expect(marker + r"(\d+)", timeout=self.timeout)
except pexpect.exceptions.TIMEOUT:
timeout_msg = f"""
*******************************************************************************
{command} timeout after {self.timeout} seconds
*******************************************************************************
"""
self._logger.error(f"Command '{command}' timed out after {self.timeout} seconds")
partial_output = getattr(self.shell, "before", "")
# Restart the shell to prevent cascade failures
self._logger.warning(
"Restarting shell due to command timeout to prevent cascade failures"
)
if self.shell:
self.shell.close(force=True)
self._start_persistent_shell()
return f"Command '{command}' timed out after {self.timeout} seconds. Partial output:\n{partial_output}{timeout_msg}"
except Exception as e:
raise DockerException(f"Error executing command '{command}': {e}")
exit_code = int(self.shell.match.group(1))
# Get the output before the marker
output_before_marker = self.shell.before
# Clean up the output by removing command echoes
all_lines = output_before_marker.splitlines()
clean_lines = []
for line in all_lines:
stripped_line = line.strip()
# Ignore the line if it's an echo of the original command OR our marker command
if stripped_line != full_command and marker_command not in stripped_line:
clean_lines.append(line)
# Join the clean lines back together
cleaned_output = "\n".join(clean_lines)
# Wait for the next shell prompt to ensure the shell is ready
self.shell.expect([r"\$", r"#"], timeout=20)
self._logger.debug(f"Command exit code: {exit_code}")
self._logger.debug(f"Command output:\n{cleaned_output}")
return cleaned_output
def reset_repository(self):
"""Reset the git repository in the container to a clean state."""
self._logger.info("Resetting git repository in the container")
self.execute_command("git reset --hard")
self.execute_command("git clean -fd")
def cleanup(self):
"""Clean up container resources and temporary files.
Stops the persistent shell, stops and removes the container, removes the Docker image,
and deletes temporary project files.
"""
self._logger.info("Cleaning up container and temporary files")
# Close persistent shell first
if self.shell and self.shell.isalive():
self._logger.info("Closing persistent shell...")
self.shell.close(force=True)
self.shell = None
if self.container:
self.container.stop(timeout=10)
self.container.remove(force=True)
self.container = None
self.client.images.remove(self.tag_name, force=True)
shutil.rmtree(self.project_path)
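The marker protocol that `execute_command` relies on can be exercised in isolation. The sketch below is a hypothetical standalone helper (not part of Prometheus) that applies the same `MARKER$?` regex to a captured output string to recover the clean output and the exit code:

```python
import re

# Same sentinel the container shell echoes after each command.
MARKER = "---CMD_DONE---"

def parse_exit_code(raw_output: str) -> tuple[str, int]:
    """Split captured shell output into (clean output, exit code)."""
    match = re.search(re.escape(MARKER) + r"(\d+)", raw_output)
    if match is None:
        raise ValueError("completion marker not found in output")
    return raw_output[: match.start()].rstrip("\n"), int(match.group(1))

output, code = parse_exit_code("hello world\n---CMD_DONE---0")
print(output, code)  # hello world 0
```

This mirrors the `self.shell.expect(marker + r"(\d+)", ...)` call above: everything before the match is command output, and the captured digits are `$?`.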


@@ -0,0 +1,112 @@
import uuid
from pathlib import Path
from typing import Optional, Sequence
from prometheus.docker.base_container import BaseContainer
class GeneralContainer(BaseContainer):
"""A general-purpose container with a comprehensive development environment.
This container provides a full Ubuntu-based development environment with common
development tools and languages pre-installed, including Python, Node.js, Java,
and various build tools. It's designed to be a flexible container that can
handle various types of projects through direct command execution rather than
predefined build and test methods.
The container includes:
- Build tools (gcc, g++, cmake, make)
- Programming languages (Python 3, Node.js, Java)
- Development tools (git, gdb)
- Database clients (PostgreSQL, MySQL, SQLite)
- Text editors (vim, nano)
- Docker CLI for container management
- Various utility tools (curl, wget, zip, etc.)
Unlike specialized containers, this container does not implement run_build() or
run_test() methods. Instead, the agent will use execute_command() directly for
custom build and test operations.
"""
def __init__(
self,
project_path: Path,
build_commands: Optional[Sequence[str]] = None,
test_commands: Optional[Sequence[str]] = None,
):
"""Initialize the general container with a unique tag name.
Args:
project_path (Path): Path to the project directory to be containerized.
build_commands (Optional[Sequence[str]]): Optional commands executed by run_build().
test_commands (Optional[Sequence[str]]): Optional commands executed by run_test().
"""
super().__init__(
project_path=project_path, build_commands=build_commands, test_commands=test_commands
)
self.tag_name = f"prometheus_general_container_{uuid.uuid4().hex[:10]}"
def get_dockerfile_content(self) -> str:
"""Get the Dockerfile content for the general-purpose container.
The Dockerfile sets up an Ubuntu-based environment with a comprehensive
set of development tools and languages installed. It includes Python,
Node.js, Java, and various build tools, making it suitable for different
types of projects.
Returns:
str: Content of the Dockerfile as a string.
"""
DOCKERFILE_CONTENT = """\
FROM ubuntu:24.04
# Avoid timezone prompts during package installation
ENV DEBIAN_FRONTEND=noninteractive
ENV TZ=UTC
# Set working directory
WORKDIR /app
# Install essential build and development tools
RUN apt-get update && apt-get install -y \
build-essential \
cmake \
git \
curl \
wget \
python3 \
python3-pip \
python3-dev \
python3-venv \
nodejs \
npm \
default-jdk \
gcc \
g++ \
gdb \
postgresql-client \
mysql-client \
sqlite3 \
iputils-ping \
vim \
nano \
zip \
unzip \
ca-certificates \
gnupg \
lsb-release
RUN mkdir -p /etc/apt/keyrings \
&& curl -fsSL https://download.docker.com/linux/ubuntu/gpg | gpg --dearmor -o /etc/apt/keyrings/docker.gpg \
&& echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | tee /etc/apt/sources.list.d/docker.list > /dev/null
RUN apt-get update && apt-get install -y docker-ce-cli
RUN apt-get clean
RUN rm -rf /var/lib/apt/lists/*
RUN ln -s /usr/bin/python3 /usr/bin/python
# Copy project files
COPY . /app/
"""
return DOCKERFILE_CONTENT


@@ -0,0 +1,38 @@
import uuid
from pathlib import Path
from typing import Optional, Sequence
from prometheus.docker.base_container import BaseContainer
class UserDefinedContainer(BaseContainer):
def __init__(
self,
project_path: Path,
workdir: Optional[str] = None,
dockerfile_content: Optional[str] = None,
image_name: Optional[str] = None,
build_commands: Optional[Sequence[str]] = None,
test_commands: Optional[Sequence[str]] = None,
):
super().__init__(project_path, workdir, build_commands, test_commands)
assert bool(dockerfile_content) != bool(image_name), (
"Exactly one of dockerfile_content or image_name must be provided"
)
self.tag_name = f"prometheus_user_defined_container_{uuid.uuid4().hex[:10]}"
self.dockerfile_content = dockerfile_content
self.image_name = image_name
def get_dockerfile_content(self) -> str:
return self.dockerfile_content
def build_docker_image(self):
if self.dockerfile_content:
super().build_docker_image()
else:
self._logger.info(f"Pulling docker image: {self.image_name}")
pulled_image = self.client.images.pull(self.image_name)
self._logger.info(f"Tagging pulled image as: {self.tag_name}")
pulled_image.tag(repository=self.tag_name)
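The constructor's `assert bool(dockerfile_content) != bool(image_name)` is a truthiness XOR: exactly one of the two sources may be provided. A hypothetical standalone version of that check, raising instead of asserting:

```python
# Hypothetical helper mirroring UserDefinedContainer's XOR validation:
# exactly one of the two arguments must be truthy.
def validate_exactly_one(dockerfile_content, image_name):
    if bool(dockerfile_content) == bool(image_name):
        raise ValueError(
            "Exactly one of dockerfile_content or image_name must be provided"
        )

validate_exactly_one("FROM ubuntu:24.04", None)  # ok: Dockerfile only
validate_exactly_one(None, "ubuntu:24.04")       # ok: image only
```

Raising `ValueError` avoids the pitfall that `assert` statements are stripped when Python runs with `-O`.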


@@ -0,0 +1,4 @@
class DockerException(Exception):
"""Base class for Docker-related exceptions."""
pass


@@ -0,0 +1,6 @@
class FileOperationException(Exception):
"""
Base class for file operation exceptions.
"""
pass


@@ -0,0 +1,4 @@
class GithubException(Exception):
"""Base exception for GitHub-related errors."""
pass


@@ -0,0 +1,12 @@
from prometheus.exceptions.server_exception import ServerException
class JWTException(ServerException):
"""
Class for JWT exceptions.
This exception is raised when there is an issue with JWT operations,
such as token generation or validation.
"""
def __init__(self, code: int = 401, message: str = "An error occurred with the JWT operation."):
super().__init__(code, message)


@@ -0,0 +1,4 @@
class LLMException(Exception):
"""Base exception for LLM-related errors."""
pass


@@ -0,0 +1,4 @@
class MemoryException(Exception):
"""Custom exception for memory-related errors in Prometheus."""
pass


@@ -0,0 +1,10 @@
class ServerException(Exception):
"""
Base class for server exceptions.
This exception is raised when there is an issue with server operations.
"""
def __init__(self, code: int, message: str):
super().__init__(message)
self.code = code
self.message = message


@@ -0,0 +1,4 @@
class WebSearchToolException(Exception):
"""Custom exception for web search tool errors."""
pass


@@ -0,0 +1,178 @@
"""Git repository management module."""
import asyncio
import shutil
import tempfile
from pathlib import Path
from typing import Optional, Sequence
from git import Git, GitCommandError, InvalidGitRepositoryError, Repo
from prometheus.utils.logger_manager import get_logger
class GitRepository:
"""A class for managing Git repositories with support for both local and remote operations.
This class provides a unified interface for working with Git repositories,
whether they are local or remote (HTTPS). It supports common Git operations
such as cloning, checking out commits, switching branches, and pushing changes.
For remote repositories, it handles authentication using GitHub access tokens.
"""
def __init__(self):
"""
Initialize a GitRepository instance.
"""
self._logger = get_logger(__name__)
# Configure git command to use our logger
g = Git()
type(g).GIT_PYTHON_TRACE = "full"
git_cmd_logger = get_logger("git.cmd")
# Ensure git command output goes to our logger
for handler in git_cmd_logger.handlers:
git_cmd_logger.removeHandler(handler)
git_cmd_logger.parent = self._logger
git_cmd_logger.propagate = True
self.repo = None
self.playground_path = None
def _set_default_branch(self):
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
try:
self.default_branch = (
self.repo.remote().refs["HEAD"].reference.name.replace("refs/heads/", "")
)
except ValueError:
self.default_branch = self.repo.active_branch.name
async def from_clone_repository(
self, https_url: str, github_access_token: str | None, target_directory: Path
):
"""Clone a remote repository using HTTPS authentication.
Args:
https_url: HTTPS URL of the remote repository.
github_access_token: GitHub access token for authentication. None for public repositories.
target_directory: Directory where the repository will be cloned.
Returns:
Repo: GitPython Repo object representing the cloned repository.
"""
# Only modify the URL with token authentication if a token is provided
if github_access_token:
https_url = https_url.replace(
"https://", f"https://x-access-token:{github_access_token}@"
)
repo_name = https_url.split("/")[-1].removesuffix(".git")
local_path = target_directory / repo_name
if local_path.exists():
shutil.rmtree(local_path)
self.repo = await asyncio.to_thread(Repo.clone_from, https_url, local_path)
self.playground_path = local_path
self._set_default_branch()
def from_local_repository(self, local_path: Path):
"""Initialize the GitRepository from a local repository path.
Args:
local_path: Path to the local Git repository.
Raises:
InvalidGitRepositoryError: If the provided path is not a valid Git repository.
"""
if not local_path.is_dir() or not (local_path / ".git").exists():
raise InvalidGitRepositoryError(f"{local_path} is not a valid Git repository.")
self.repo = Repo(local_path)
self.playground_path = local_path
self._set_default_branch()
def checkout_commit(self, commit_sha: str):
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
self.repo.git.checkout(commit_sha)
def switch_branch(self, branch_name: str):
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
self.repo.git.checkout(branch_name)
def pull(self):
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
self.repo.git.pull()
def get_diff(self, excluded_files: Optional[Sequence[str]] = None) -> str:
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
self.repo.git.add("-A")
if excluded_files:
self.repo.git.reset(excluded_files)
diff = self.repo.git.diff("--staged")
if diff and not diff.endswith("\n"):
diff += "\n"
self.repo.git.reset()
return diff
def get_working_directory(self) -> Path:
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
return Path(self.repo.working_dir).absolute()
def reset_repository(self):
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
self.repo.git.reset("--hard")
self.repo.git.clean("-fd")
def remove_repository(self):
if self.repo is not None:
shutil.rmtree(self.repo.working_dir)
self.repo = None
def apply_patch(self, patch: str):
"""Apply a patch to the current repository."""
with tempfile.NamedTemporaryFile(mode="w", suffix=".patch") as tmp_file:
tmp_file.write(patch)
tmp_file.flush()
self.repo.git.apply(tmp_file.name)
async def create_and_push_branch(self, branch_name: str, commit_message: str, patch: str):
"""Create a new branch, commit changes, and push to remote.
This method creates a new branch, switches to it, stages all changes,
commits them with the provided message, and pushes the branch to the
remote repository.
Args:
branch_name: Name of the new branch to create.
commit_message: Message for the commit.
patch: Patch to apply to the branch.
"""
if self.repo is None:
raise InvalidGitRepositoryError("No repository is currently set.")
# Get the current commit SHA to ensure we can reset later
start_commit_sha = self.repo.head.commit.hexsha
try:
# create and checkout new branch
new_branch = self.repo.create_head(branch_name)
new_branch.checkout()
# Apply the patch and commit changes
self.apply_patch(patch)
self.repo.git.add(A=True)
self.repo.index.commit(commit_message)
await asyncio.to_thread(self.repo.git.push, "--set-upstream", "origin", branch_name)
except GitCommandError as e:
raise e
finally:
self.reset_repository()
# Reset to the original commit
self.checkout_commit(start_commit_sha)


@@ -0,0 +1,286 @@
"""Building knowledge graph for a single file."""
from collections import deque
from pathlib import Path
from typing import Sequence, Tuple
from langchain_core.documents import Document
from langchain_text_splitters import RecursiveCharacterTextSplitter
from prometheus.graph.graph_types import (
ASTNode,
KnowledgeGraphEdge,
KnowledgeGraphEdgeType,
KnowledgeGraphNode,
TextNode,
)
from prometheus.parser import tree_sitter_parser
class FileGraphBuilder:
"""A class for building knowledge graphs from individual files.
This class processes files and creates knowledge graph representations using different
strategies based on the file type. For source code files, it uses tree-sitter to
create an Abstract Syntax Tree (AST) representation. For markdown files, it creates
a chain of text nodes based on the document's structure.
The resulting knowledge graph consists of nodes (KnowledgeGraphNode) connected by
edges (KnowledgeGraphEdge) with different relationship types (KnowledgeGraphEdgeType).
"""
def __init__(self, max_ast_depth: int, chunk_size: int, chunk_overlap: int):
"""Initialize the FileGraphBuilder.
Args:
max_ast_depth: Maximum depth to traverse in the AST when processing source code files.
Higher values create more detailed but larger graphs.
chunk_size: The chunk size for text files.
chunk_overlap: The overlap size for text files.
"""
self.max_ast_depth = max_ast_depth
self.chunk_size = chunk_size
self.chunk_overlap = chunk_overlap
def support_code_file(self, file: Path) -> bool:
return tree_sitter_parser.supports_file(file)
def support_text_file(self, file: Path) -> bool:
return file.suffix in [".markdown", ".md", ".txt", ".rst"]
def supports_file(self, file: Path) -> bool:
"""Checks if we support building knowledge graph for this file."""
return self.support_code_file(file) or self.support_text_file(file)
def build_file_graph(
self, parent_node: KnowledgeGraphNode, file: Path, next_node_id: int
) -> Tuple[int, Sequence[KnowledgeGraphNode], Sequence[KnowledgeGraphEdge]]:
"""Build knowledge graph for a single file.
Args:
parent_node: The parent knowledge graph node that represents the file.
The node attribute should have type FileNode.
file: The file to build knowledge graph.
next_node_id: The next available node id.
Returns:
A tuple of (next_node_id, kg_nodes, kg_edges), where next_node_id is the
new next_node_id, kg_nodes is a list of all nodes created for the file,
and kg_edges is a list of all edges created for this file.
"""
# In this case, it is a file that tree sitter can parse (source code)
if self.support_code_file(file):
return self._tree_sitter_file_graph(parent_node, file, next_node_id)
# otherwise it is a text file that we can parse using langchain text splitter
else:
return self._text_file_graph(parent_node, file, next_node_id)
def _tree_sitter_file_graph(
self, parent_node: KnowledgeGraphNode, file: Path, next_node_id: int
) -> Tuple[int, Sequence[KnowledgeGraphNode], Sequence[KnowledgeGraphEdge]]:
"""
Parse a file into a tree-sitter based abstract syntax tree (AST) and build a corresponding knowledge graph.
This function uses tree-sitter to parse the source code file and constructs a subgraph where:
- Each AST node in the tree-sitter AST is represented as a KnowledgeGraphNode wrapping an ASTNode.
- Edges of type 'PARENT_OF' connect parent AST nodes to their children.
- The root AST node for the file is connected to the parent FileNode with an edge of type 'HAS_AST'.
Args:
parent_node (KnowledgeGraphNode): The parent knowledge graph node representing the file (should wrap a FileNode).
file (Path): The file to be parsed and included in the knowledge graph.
next_node_id (int): The next available node id (to ensure global uniqueness in the graph).
Returns:
Tuple[int, Sequence[KnowledgeGraphNode], Sequence[KnowledgeGraphEdge]]:
- The next available node id after all nodes are created.
- A list of all AST-related KnowledgeGraphNode objects for this file.
- A list of all edges (HAS_AST, PARENT_OF) created for this file's AST subgraph.
Algorithm:
1. Use tree-sitter to parse the file and obtain the AST.
2. Create a KnowledgeGraphNode for the root AST node and connect it to the parent file node with HAS_AST.
3. Traverse the AST in depth-first order using a stack:
- For each AST node, create a KnowledgeGraphNode and assign it a unique id.
- For each parent-child relationship in the AST, add a PARENT_OF edge.
- Traverse to the maximum depth specified by self.max_ast_depth.
4. Return the updated next_node_id, all nodes, and all edges for this file.
Notes:
- If the parsed tree is empty or contains errors, no nodes/edges are added.
- AST node text is decoded to utf-8 to ensure string compatibility.
- The function only builds the AST subgraph for one file; integration into the global graph is done by the caller.
"""
# Store created AST KnowledgeGraphNodes and edges for this file
tree_sitter_nodes = []
tree_sitter_edges = []
# Parse the file into a tree-sitter AST
tree = tree_sitter_parser.parse(file)
if tree.root_node.has_error or tree.root_node.child_count == 0:
# Return empty results if the file cannot be parsed properly
return next_node_id, tree_sitter_nodes, tree_sitter_edges
# Create the KnowledgeGraphNode for the root AST node
ast_root_node = ASTNode(
type=tree.root_node.type,
start_line=tree.root_node.start_point[0] + 1,
end_line=tree.root_node.end_point[0] + 1,
text=tree.root_node.text.decode("utf-8"),
)
kg_ast_root_node = KnowledgeGraphNode(next_node_id, ast_root_node)
next_node_id += 1
tree_sitter_nodes.append(kg_ast_root_node)
# Add the HAS_AST edge connecting the file node to its AST root node
tree_sitter_edges.append(
KnowledgeGraphEdge(parent_node, kg_ast_root_node, KnowledgeGraphEdgeType.has_ast)
)
# Use an explicit stack for depth-first traversal of the AST
node_stack = deque()
node_stack.append(
(tree.root_node, kg_ast_root_node, 1)
) # (tree_sitter_node, kg_node, depth)
while node_stack:
tree_sitter_node, kg_node, depth = node_stack.pop()
# Limit the maximum depth to self.max_ast_depth
if depth > self.max_ast_depth:
continue
# Process all children of the current AST node
for tree_sitter_child_node in tree_sitter_node.children:
# Create KnowledgeGraphNode for the child AST node
child_ast_node = ASTNode(
type=tree_sitter_child_node.type,
start_line=tree_sitter_child_node.start_point[0] + 1,
end_line=tree_sitter_child_node.end_point[0] + 1,
text=tree_sitter_child_node.text.decode("utf-8"),
)
kg_child_ast_node = KnowledgeGraphNode(next_node_id, child_ast_node)
next_node_id += 1
tree_sitter_nodes.append(kg_child_ast_node)
# Add a PARENT_OF edge from the parent to this child
tree_sitter_edges.append(
KnowledgeGraphEdge(kg_node, kg_child_ast_node, KnowledgeGraphEdgeType.parent_of)
)
# Add the child node to the stack to continue traversal
node_stack.append((tree_sitter_child_node, kg_child_ast_node, depth + 1))
# Return the updated next_node_id, all nodes, and all edges for this file's AST subgraph
return next_node_id, tree_sitter_nodes, tree_sitter_edges
def _text_file_graph(
self, parent_node: KnowledgeGraphNode, file: Path, next_node_id: int
) -> Tuple[int, Sequence[KnowledgeGraphNode], Sequence[KnowledgeGraphEdge]]:
text_splitter = RecursiveCharacterTextSplitter(
chunk_size=self.chunk_size, chunk_overlap=self.chunk_overlap, length_function=len
)
text = file.read_text(encoding="utf-8")
# Calculate line positions for the entire text
lines = text.split("\n")
line_positions = []
current_pos = 0
for line in lines:
line_positions.append(current_pos)
current_pos += len(line) + 1 # +1 for the newline character
documents = text_splitter.create_documents([text])
# Add line position metadata to each document
current_pos = 0
for document in documents:
# Find the position of this chunk in the original text
chunk_text = document.page_content
start_pos = text.find(chunk_text, current_pos)
if start_pos == -1:
# If not found, try from beginning
start_pos = text.find(chunk_text)
if start_pos == -1:
raise ValueError("Chunk text not found in original text.")
end_pos = start_pos + len(chunk_text)
current_pos = end_pos # Update for next iteration
# Find start line
for i, pos in enumerate(line_positions):
if pos > start_pos:
start_line = i  # i is the first line starting after start_pos, giving a 1-indexed line number
break
else:
start_line = len(line_positions)
# Find end line
for i, pos in enumerate(line_positions):
if pos > end_pos:
end_line = i  # 1-indexed, analogous to start_line
break
else:
end_line = len(line_positions)
# Store line positions in metadata
if document.metadata is None:
document.metadata = {}
document.metadata["start_line"] = start_line
document.metadata["end_line"] = end_line
return self._documents_to_file_graph(documents, parent_node, next_node_id)
def _documents_to_file_graph(
self,
documents: Sequence[Document],
parent_node: KnowledgeGraphNode,
next_node_id: int,
) -> Tuple[int, Sequence[KnowledgeGraphNode], Sequence[KnowledgeGraphEdge]]:
"""Convert the parsed langchain documents to a knowledge graph.
The parsed document will form a chain of nodes, where all nodes are connected
to the parent_node using the HAS_TEXT relationship. The nodes are connected using
the NEXT_CHUNK relationship in chronological order.
Args:
documents: The langchain documents used to create the TextNode.
parent_node: The parent knowledge graph node that represents the file.
The node attribute should have type FileNode.
next_node_id: The next available node id.
Returns:
A tuple of (next_node_id, kg_nodes, kg_edges), where next_node_id is the
new next_node_id, kg_nodes is a list of all nodes created for the file,
and kg_edges is a list of all edges created for this file.
"""
document_nodes = []
document_edges = []
previous_node = None
for document in documents:
# Extract line positions from metadata
start_line = document.metadata.get("start_line", 0) if document.metadata else 0
end_line = document.metadata.get("end_line", 0) if document.metadata else 0
text_node = TextNode(
text=document.page_content,
start_line=start_line,
end_line=end_line,
)
kg_text_node = KnowledgeGraphNode(next_node_id, text_node)
next_node_id += 1
document_nodes.append(kg_text_node)
document_edges.append(
KnowledgeGraphEdge(parent_node, kg_text_node, KnowledgeGraphEdgeType.has_text)
)
if previous_node:
document_edges.append(
KnowledgeGraphEdge(
previous_node, kg_text_node, KnowledgeGraphEdgeType.next_chunk
)
)
previous_node = kg_text_node
return next_node_id, document_nodes, document_edges
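The chunk-to-line mapping above can be sketched in isolation. The following is a simplified, self-contained variant of the same idea (the helper name and sample text are illustrative stand-ins, not the module's actual API): find the chunk's character offsets, then translate offsets to 0-indexed line numbers via a list of line-start positions.

```python
def chunk_line_span(text: str, chunk: str, search_from: int = 0) -> tuple[int, int]:
    """Return the (start_line, end_line) span of `chunk` inside `text`, 0-indexed."""
    start_pos = text.find(chunk, search_from)
    if start_pos == -1:
        # Fall back to searching from the beginning, as the builder does.
        start_pos = text.find(chunk)
    if start_pos == -1:
        raise ValueError("Chunk text not found in original text.")
    end_pos = start_pos + len(chunk)

    # Character offset of the start of each line in `text`.
    line_positions = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            line_positions.append(i + 1)

    def line_of(pos: int) -> int:
        # The line containing `pos` is the last line whose start is <= pos.
        for i, line_start in enumerate(line_positions):
            if line_start > pos:
                return i - 1
        return len(line_positions) - 1

    return line_of(start_pos), line_of(end_pos - 1)

text = "alpha\nbeta\ngamma\n"
print(chunk_line_span(text, "beta\ngamma"))  # → (1, 2)
```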

View File

@@ -0,0 +1,245 @@
"""Type definition for nodes and edges in the knowledge graph."""
import dataclasses
import enum
from typing import TypedDict, Union
@dataclasses.dataclass(frozen=True)
class FileNode:
"""A node representing a file/dir.
Attributes:
basename: The basename of a file/dir, like 'bar.py' or 'foo'.
relative_path: The relative path from the root path, like 'foo/bar/baz.java'.
"""
basename: str
relative_path: str
@dataclasses.dataclass(frozen=True)
class ASTNode:
"""A node representing a tree-sitter node.
Attributes:
type: The tree-sitter node type.
start_line: The starting line number. 0-indexed and inclusive.
end_line: The ending line number. 0-indexed and inclusive.
text: The source code corresponding to the node.
"""
type: str
start_line: int
end_line: int
text: str
@dataclasses.dataclass(frozen=True)
class TextNode:
"""A node representing a piece of text.
Attributes:
text: A string.
start_line: The starting line number. 0-indexed and inclusive.
end_line: The ending line number. 0-indexed and inclusive.
"""
text: str
start_line: int
end_line: int
@dataclasses.dataclass(frozen=True)
class KnowledgeGraphNode:
"""A node in the knowledge graph.
Attributes:
node_id: A id that uniquely identifies a node in the graph.
node: The node itself, can be a FileNode, ASTNode or TextNode.
"""
node_id: int
node: Union[FileNode, ASTNode, TextNode]
def to_neo4j_node(self) -> Union["Neo4jFileNode", "Neo4jASTNode", "Neo4jTextNode"]:
"""Convert the KnowledgeGraphNode into a Neo4j node format."""
match self.node:
case FileNode():
return Neo4jFileNode(
node_id=self.node_id,
basename=self.node.basename,
relative_path=self.node.relative_path,
)
case ASTNode():
return Neo4jASTNode(
node_id=self.node_id,
type=self.node.type,
start_line=self.node.start_line,
end_line=self.node.end_line,
text=self.node.text,
)
case TextNode():
return Neo4jTextNode(
node_id=self.node_id,
text=self.node.text,
start_line=self.node.start_line,
end_line=self.node.end_line,
)
case _:
raise ValueError("Unknown KnowledgeGraphNode.node type")
@classmethod
def from_neo4j_file_node(cls, node: "Neo4jFileNode") -> "KnowledgeGraphNode":
return cls(
node_id=node["node_id"],
node=FileNode(
basename=node["basename"],
relative_path=node["relative_path"],
),
)
@classmethod
def from_neo4j_ast_node(cls, node: "Neo4jASTNode") -> "KnowledgeGraphNode":
return cls(
node_id=node["node_id"],
node=ASTNode(
type=node["type"],
start_line=node["start_line"],
end_line=node["end_line"],
text=node["text"],
),
)
@classmethod
def from_neo4j_text_node(cls, node: "Neo4jTextNode") -> "KnowledgeGraphNode":
return cls(
node_id=node["node_id"],
node=TextNode(
text=node["text"],
start_line=node["start_line"],
end_line=node["end_line"],
),
)
class KnowledgeGraphEdgeType(enum.StrEnum):
"""Enum of all knowledge graph edge types"""
parent_of = "PARENT_OF" # ASTNode -> ASTNode
has_file = "HAS_FILE" # FileNode -> FileNode
has_ast = "HAS_AST" # FileNode -> ASTNode
has_text = "HAS_TEXT" # FileNode -> TextNode
next_chunk = "NEXT_CHUNK" # TextNode -> TextNode
@dataclasses.dataclass(frozen=True)
class KnowledgeGraphEdge:
"""An edge in the knowledge graph.
Attributes:
source: The source knowledge graph node.
target: The target knowledge graph node.
type: The knowledge graph edge type.
"""
source: KnowledgeGraphNode
target: KnowledgeGraphNode
type: KnowledgeGraphEdgeType
def to_neo4j_edge(
self,
) -> Union[
"Neo4jHasFileEdge",
"Neo4jHasASTEdge",
"Neo4jParentOfEdge",
"Neo4jHasTextEdge",
"Neo4jNextChunkEdge",
]:
"""Convert the KnowledgeGraphEdge into a Neo4j edge format."""
match self.type:
case KnowledgeGraphEdgeType.has_file:
return Neo4jHasFileEdge(
source=self.source.to_neo4j_node(),
target=self.target.to_neo4j_node(),
)
case KnowledgeGraphEdgeType.has_ast:
return Neo4jHasASTEdge(
source=self.source.to_neo4j_node(),
target=self.target.to_neo4j_node(),
)
case KnowledgeGraphEdgeType.parent_of:
return Neo4jParentOfEdge(
source=self.source.to_neo4j_node(),
target=self.target.to_neo4j_node(),
)
case KnowledgeGraphEdgeType.has_text:
return Neo4jHasTextEdge(
source=self.source.to_neo4j_node(),
target=self.target.to_neo4j_node(),
)
case KnowledgeGraphEdgeType.next_chunk:
return Neo4jNextChunkEdge(
source=self.source.to_neo4j_node(),
target=self.target.to_neo4j_node(),
)
case _:
raise ValueError(f"Unknown edge type: {self.type}")
###############################################################################
# Neo4j types #
###############################################################################
class Neo4jMetadataNode(TypedDict):
codebase_source: str
local_path: str
https_url: str
commit_id: str
class Neo4jFileNode(TypedDict):
node_id: int
basename: str
relative_path: str
class Neo4jASTNode(TypedDict):
node_id: int
type: str
start_line: int
end_line: int
text: str
class Neo4jTextNode(TypedDict):
node_id: int
text: str
start_line: int
end_line: int
class Neo4jHasFileEdge(TypedDict):
source: Neo4jFileNode
target: Neo4jFileNode
class Neo4jHasASTEdge(TypedDict):
source: Neo4jFileNode
target: Neo4jASTNode
class Neo4jParentOfEdge(TypedDict):
source: Neo4jASTNode
target: Neo4jASTNode
class Neo4jHasTextEdge(TypedDict):
source: Neo4jFileNode
target: Neo4jTextNode
class Neo4jNextChunkEdge(TypedDict):
source: Neo4jTextNode
target: Neo4jTextNode
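The conversion layer above follows one pattern throughout: frozen dataclasses for in-memory nodes, TypedDicts for the flat dictionaries a Neo4j driver accepts. A minimal self-contained sketch of the same round-trip (the names mirror the module, but this is an illustration, not an import of it):

```python
import dataclasses
from typing import TypedDict

@dataclasses.dataclass(frozen=True)
class FileNode:
    basename: str
    relative_path: str

class Neo4jFileNode(TypedDict):
    node_id: int
    basename: str
    relative_path: str

def to_neo4j(node_id: int, node: FileNode) -> Neo4jFileNode:
    # TypedDicts are plain dicts at runtime, so this is driver-friendly.
    return Neo4jFileNode(
        node_id=node_id,
        basename=node.basename,
        relative_path=node.relative_path,
    )

def from_neo4j(record: Neo4jFileNode) -> tuple[int, FileNode]:
    return record["node_id"], FileNode(
        basename=record["basename"],
        relative_path=record["relative_path"],
    )

n = FileNode(basename="bar.py", relative_path="foo/bar.py")
record = to_neo4j(7, n)
assert from_neo4j(record) == (7, n)  # frozen dataclasses compare by value
```

Keeping the node id outside the dataclass (as `KnowledgeGraphNode` does) means the same immutable payload can be reused under different ids without copying.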

View File

@@ -0,0 +1,465 @@
"""The in-memory knowledge graph representation of a codebase.
In the knowledge graph, we have the following node types:
* FileNode: Represent a file/dir
* ASTNode: Represent a tree-sitter node
* TextNode: Represent a string
and the following edge types:
* HAS_FILE: Relationship between two FileNode, if one FileNode is the parent dir of another FileNode.
* HAS_AST: Relationship between FileNode and ASTNode, if the ASTNode is the root AST node for FileNode.
* HAS_TEXT: Relationship between FileNode and TextNode, if the TextNode is a chunk of text from FileNode.
* PARENT_OF: Relationship between two ASTNode, if one ASTNode is the parent of another ASTNode.
* NEXT_CHUNK: Relationship between two TextNode, if one TextNode is the next chunk of text of another TextNode.
In this way, we have all the directory structure, source code, and text information in a single knowledge graph.
This knowledge graph will be persisted in a graph database (neo4j), where an AI can use it to traverse the
codebase to find the most relevant context for the user query.
"""
import asyncio
import itertools
from collections import defaultdict, deque
from pathlib import Path
from typing import Mapping, Optional, Sequence
import igittigitt
from prometheus.graph.file_graph_builder import FileGraphBuilder
from prometheus.graph.graph_types import (
ASTNode,
FileNode,
KnowledgeGraphEdge,
KnowledgeGraphEdgeType,
KnowledgeGraphNode,
Neo4jASTNode,
Neo4jFileNode,
Neo4jHasASTEdge,
Neo4jHasFileEdge,
Neo4jHasTextEdge,
Neo4jNextChunkEdge,
Neo4jParentOfEdge,
Neo4jTextNode,
TextNode,
)
from prometheus.utils.logger_manager import get_logger
class KnowledgeGraph:
def __init__(
self,
max_ast_depth: int,
chunk_size: int,
chunk_overlap: int,
root_node_id: int,
root_node: Optional[KnowledgeGraphNode] = None,
knowledge_graph_nodes: Optional[Sequence[KnowledgeGraphNode]] = None,
knowledge_graph_edges: Optional[Sequence[KnowledgeGraphEdge]] = None,
):
"""Initializes the knowledge graph.
Args:
max_ast_depth: The maximum depth of tree-sitter nodes to parse.
chunk_size: The chunk size for text files.
chunk_overlap: The overlap size for text files.
root_node_id: The root_node_id.
root_node: The root node for the knowledge graph.
knowledge_graph_nodes: The initial list of knowledge graph nodes.
knowledge_graph_edges: The initial list of knowledge graph edges.
"""
self.max_ast_depth = max_ast_depth
self.root_node_id = root_node_id
self._root_node = root_node
self._knowledge_graph_nodes = (
knowledge_graph_nodes if knowledge_graph_nodes is not None else []
)
self._knowledge_graph_edges = (
knowledge_graph_edges if knowledge_graph_edges is not None else []
)
self._next_node_id = root_node_id + len(self._knowledge_graph_nodes)
self._file_graph_builder = FileGraphBuilder(max_ast_depth, chunk_size, chunk_overlap)
self._logger = get_logger(__name__)
async def build_graph(self, root_dir: Path):
"""Asynchronously builds knowledge graph for a codebase at a location.
Args:
root_dir: The codebase root directory.
"""
await asyncio.to_thread(self._build_graph, root_dir)
def _build_graph(self, root_dir: Path):
"""Builds knowledge graph for a codebase at a location.
Args:
root_dir: The codebase root directory.
"""
root_dir = root_dir.absolute()
gitignore_parser = igittigitt.IgnoreParser()
gitignore_parser.parse_rule_files(root_dir)
gitignore_parser.add_rule(".git", root_dir)
# The root node for the whole graph
root_dir_node = FileNode(basename=root_dir.name, relative_path=".")
kg_root_dir_node = KnowledgeGraphNode(self._next_node_id, root_dir_node)
self._next_node_id += 1
self._knowledge_graph_nodes.append(kg_root_dir_node)
self._root_node = kg_root_dir_node
file_stack = deque()
file_stack.append((root_dir, kg_root_dir_node))
# Now we traverse the file system to parse all the files and create all relationships
while file_stack:
file, kg_file_path_node = file_stack.pop()
# If the file is a directory, we create FileNode for all supported children files.
if file.is_dir():
self._logger.info(f"Processing directory {file}")
for child_file in sorted(file.iterdir()):
# Skip if the file does not exist (broken symlink).
if not child_file.exists():
self._logger.info(f"Skip parsing {child_file} because it does not exist")
continue
# Skip regular files that are not supported by the file graph builder.
if child_file.is_file() and not self._file_graph_builder.supports_file(
child_file
):
self._logger.info(f"Skip parsing {child_file} because it is not supported")
continue
if gitignore_parser.match(child_file):
self._logger.info(f"Skipping {child_file} because it is ignored")
continue
child_file_node = FileNode(
basename=child_file.name,
relative_path=child_file.relative_to(root_dir).as_posix(),
)
kg_child_file_node = KnowledgeGraphNode(self._next_node_id, child_file_node)
self._next_node_id += 1
self._knowledge_graph_nodes.append(kg_child_file_node)
self._knowledge_graph_edges.append(
KnowledgeGraphEdge(
kg_file_path_node,
kg_child_file_node,
KnowledgeGraphEdgeType.has_file,
)
)
file_stack.append((child_file, kg_child_file_node))
# Process the file otherwise.
else:
self._logger.info(f"Processing file {file}")
try:
next_node_id, kg_nodes, kg_edges = self._file_graph_builder.build_file_graph(
kg_file_path_node, file, self._next_node_id
)
except UnicodeDecodeError:
self._logger.warning(f"UnicodeDecodeError when processing {file}")
continue
self._next_node_id = next_node_id
self._knowledge_graph_nodes.extend(kg_nodes)
self._knowledge_graph_edges.extend(kg_edges)
@classmethod
def from_neo4j(
cls,
root_node_id: int,
max_ast_depth: int,
chunk_size: int,
chunk_overlap: int,
file_nodes: Sequence[KnowledgeGraphNode],
ast_nodes: Sequence[KnowledgeGraphNode],
text_nodes: Sequence[KnowledgeGraphNode],
parent_of_edges_ids: Sequence[Mapping[str, int]],
has_file_edges_ids: Sequence[Mapping[str, int]],
has_ast_edges_ids: Sequence[Mapping[str, int]],
has_text_edges_ids: Sequence[Mapping[str, int]],
next_chunk_edges_ids: Sequence[Mapping[str, int]],
):
"""Creates a knowledge graph from nodes and edges stored in neo4j."""
# All nodes
knowledge_graph_nodes = [x for x in itertools.chain(file_nodes, ast_nodes, text_nodes)]
# All edges
node_id_to_node = {x.node_id: x for x in knowledge_graph_nodes}
parent_of_edges = [
KnowledgeGraphEdge(
node_id_to_node[parent_of_edge_ids["source_id"]],
node_id_to_node[parent_of_edge_ids["target_id"]],
KnowledgeGraphEdgeType.parent_of,
)
for parent_of_edge_ids in parent_of_edges_ids
]
has_file_edges = [
KnowledgeGraphEdge(
node_id_to_node[has_file_edge_ids["source_id"]],
node_id_to_node[has_file_edge_ids["target_id"]],
KnowledgeGraphEdgeType.has_file,
)
for has_file_edge_ids in has_file_edges_ids
]
has_ast_edges = [
KnowledgeGraphEdge(
node_id_to_node[has_ast_edge_ids["source_id"]],
node_id_to_node[has_ast_edge_ids["target_id"]],
KnowledgeGraphEdgeType.has_ast,
)
for has_ast_edge_ids in has_ast_edges_ids
]
has_text_edges = [
KnowledgeGraphEdge(
node_id_to_node[has_text_edge_ids["source_id"]],
node_id_to_node[has_text_edge_ids["target_id"]],
KnowledgeGraphEdgeType.has_text,
)
for has_text_edge_ids in has_text_edges_ids
]
next_chunk_edges = [
KnowledgeGraphEdge(
node_id_to_node[next_chunk_edge_ids["source_id"]],
node_id_to_node[next_chunk_edge_ids["target_id"]],
KnowledgeGraphEdgeType.next_chunk,
)
for next_chunk_edge_ids in next_chunk_edges_ids
]
knowledge_graph_edges = [
x
for x in itertools.chain(
parent_of_edges, has_file_edges, has_ast_edges, has_text_edges, next_chunk_edges
)
]
# Root node
root_node = None
for node in knowledge_graph_nodes:
if node.node_id == root_node_id:
root_node = node
break
if root_node is None:
raise ValueError(f"Node with node_id {root_node_id} not found.")
return cls(
max_ast_depth=max_ast_depth,
chunk_size=chunk_size,
chunk_overlap=chunk_overlap,
root_node_id=root_node_id,
root_node=root_node,
knowledge_graph_nodes=knowledge_graph_nodes,
knowledge_graph_edges=knowledge_graph_edges,
)
def get_file_tree(self, max_depth: int = 5, max_lines: int = 5000) -> str:
"""Generate a tree-like string representation of the file structure.
Creates an ASCII tree visualization of the file hierarchy, similar to the Unix 'tree'
command output. The tree is generated using Unicode box-drawing characters and
indentation to show the hierarchical relationship between files and directories.
Example:
project/
├── src/
│ ├── main.py
│ └── utils/
│ ├── helpers.py
│ └── config.py
└── tests/
├── test_main.py
└── test_utils.py
Args:
max_depth: Maximum depth of the tree to display. Nodes beyond this depth will
be omitted. Default to 5.
max_lines: Maximum number of lines in the output string. Useful for truncating
very large trees. Default to 5000.
Returns:
str: A string representation of the file tree, where each line represents a file
or directory, with appropriate indentation and connecting lines showing
the hierarchy.
Algorithm:
- Uses a stack-based depth-first traversal to walk the file tree.
- Maintains a prefix string to build up the correct indentation and connectors.
- For each node, determines whether it is the last child in its directory to use
the correct tree connector (├── or └──).
- Accumulates results in `result_lines` until either max_depth or max_lines is reached.
"""
file_node_adjacency_dict = (
self._get_file_node_adjacency_dict()
) # Maps nodes to their children
# Each stack entry contains: (current_node, depth, prefix_string, is_last_child)
stack = deque()
stack.append((self._root_node, 0, "", None))
result_lines = []
# Box-drawing characters and indentation constants
SPACE = "    "  # Indentation under a last child (no continuing branch)
BRANCH = "│   "  # Vertical line continuing past an intermediate child
TEE = "├── " # Entry for a non-final child
LAST = "└── " # Entry for the last child
while stack and (len(result_lines)) < max_lines:
file_node, depth, prefix, is_last = stack.pop()
# Skip if we've exceeded max_depth
if depth > max_depth:
continue
# Choose the connector character depending on whether this is the last child
pointer = LAST if is_last else TEE
line_prefix = "" if depth == 0 else prefix + pointer
# Add the current file or directory to the result lines
result_lines.append(line_prefix + file_node.node.basename)
# Get the current node's children and sort them alphabetically by name
sorted_children_file_node = sorted(
file_node_adjacency_dict[file_node], key=lambda x: x.node.basename
)
# Traverse the children in reverse order to maintain the correct tree shape
for i in range(len(sorted_children_file_node) - 1, -1, -1):
extension = SPACE if is_last else BRANCH # Update prefix for children
new_prefix = "" if depth == 0 else prefix + extension
stack.append(
(
sorted_children_file_node[i],
depth + 1,
new_prefix,
i == len(sorted_children_file_node) - 1, # True if last child
)
)
# Join all lines into a single string for output
return "\n".join(result_lines)
def get_all_ast_node_types(self) -> Sequence[str]:
ast_node_types = set()
for ast_node in self.get_ast_nodes():
ast_node_types.add(ast_node.node.type)
return list(ast_node_types)
def _get_file_node_adjacency_dict(
self,
) -> Mapping[KnowledgeGraphNode, Sequence[KnowledgeGraphNode]]:
file_node_adjacency_dict = defaultdict(list)
for has_file_edge in self.get_has_file_edges():
file_node_adjacency_dict[has_file_edge.source].append(has_file_edge.target)
return file_node_adjacency_dict
def get_file_nodes(self) -> Sequence[KnowledgeGraphNode]:
return [
kg_node for kg_node in self._knowledge_graph_nodes if isinstance(kg_node.node, FileNode)
]
def get_ast_nodes(self) -> Sequence[KnowledgeGraphNode]:
return [
kg_node for kg_node in self._knowledge_graph_nodes if isinstance(kg_node.node, ASTNode)
]
def get_text_nodes(self) -> Sequence[KnowledgeGraphNode]:
return [
kg_node for kg_node in self._knowledge_graph_nodes if isinstance(kg_node.node, TextNode)
]
def get_has_ast_edges(self) -> Sequence[KnowledgeGraphEdge]:
return [
kg_edge
for kg_edge in self._knowledge_graph_edges
if kg_edge.type == KnowledgeGraphEdgeType.has_ast
]
def get_has_file_edges(self) -> Sequence[KnowledgeGraphEdge]:
return [
kg_edge
for kg_edge in self._knowledge_graph_edges
if kg_edge.type == KnowledgeGraphEdgeType.has_file
]
def get_has_text_edges(self) -> Sequence[KnowledgeGraphEdge]:
return [
kg_edge
for kg_edge in self._knowledge_graph_edges
if kg_edge.type == KnowledgeGraphEdgeType.has_text
]
def get_next_chunk_edges(self) -> Sequence[KnowledgeGraphEdge]:
return [
kg_edge
for kg_edge in self._knowledge_graph_edges
if kg_edge.type == KnowledgeGraphEdgeType.next_chunk
]
def get_parent_of_edges(self) -> Sequence[KnowledgeGraphEdge]:
return [
kg_edge
for kg_edge in self._knowledge_graph_edges
if kg_edge.type == KnowledgeGraphEdgeType.parent_of
]
def get_neo4j_file_nodes(self) -> Sequence[Neo4jFileNode]:
return [kg_node.to_neo4j_node() for kg_node in self.get_file_nodes()]
def get_neo4j_ast_nodes(self) -> Sequence[Neo4jASTNode]:
return [kg_node.to_neo4j_node() for kg_node in self.get_ast_nodes()]
def get_neo4j_text_nodes(self) -> Sequence[Neo4jTextNode]:
return [kg_node.to_neo4j_node() for kg_node in self.get_text_nodes()]
def get_neo4j_has_ast_edges(self) -> Sequence[Neo4jHasASTEdge]:
return [kg_edge.to_neo4j_edge() for kg_edge in self.get_has_ast_edges()]
def get_neo4j_has_file_edges(self) -> Sequence[Neo4jHasFileEdge]:
return [kg_edge.to_neo4j_edge() for kg_edge in self.get_has_file_edges()]
def get_neo4j_has_text_edges(self) -> Sequence[Neo4jHasTextEdge]:
return [kg_edge.to_neo4j_edge() for kg_edge in self.get_has_text_edges()]
def get_neo4j_next_chunk_edges(self) -> Sequence[Neo4jNextChunkEdge]:
return [kg_edge.to_neo4j_edge() for kg_edge in self.get_next_chunk_edges()]
def get_neo4j_parent_of_edges(self) -> Sequence[Neo4jParentOfEdge]:
return [kg_edge.to_neo4j_edge() for kg_edge in self.get_parent_of_edges()]
def get_parent_to_children_map(self) -> Mapping[int, Sequence[KnowledgeGraphNode]]:
"""
Returns a mapping from parent AST node IDs to their child AST nodes.
"""
parent_of_edges = self.get_parent_of_edges()
parent_to_children = {}
for edge in parent_of_edges:
parent_id = edge.source.node_id
if parent_id not in parent_to_children:
parent_to_children[parent_id] = []
parent_to_children[parent_id].append(edge.target)
return parent_to_children
def __eq__(self, other: "KnowledgeGraph") -> bool:
if not isinstance(other, KnowledgeGraph):
return False
self._knowledge_graph_nodes.sort(key=lambda x: x.node_id)
other._knowledge_graph_nodes.sort(key=lambda x: x.node_id)
for self_kg_node, other_kg_node in itertools.zip_longest(
self._knowledge_graph_nodes, other._knowledge_graph_nodes, fillvalue=None
):
if self_kg_node != other_kg_node:
return False
self._knowledge_graph_edges.sort(key=lambda x: (x.source.node_id, x.target.node_id, x.type))
other._knowledge_graph_edges.sort(
key=lambda x: (x.source.node_id, x.target.node_id, x.type)
)
for self_kg_edge, other_kg_edge in itertools.zip_longest(
self._knowledge_graph_edges, other._knowledge_graph_edges, fillvalue=None
):
if self_kg_edge != other_kg_edge:
return False
return True
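The stack-based walk in `get_file_tree` can be exercised without building a graph. This sketch reproduces the same algorithm over a plain adjacency dict (a hypothetical stand-in for the HAS_FILE adjacency built from the graph); note the children are pushed in reverse so they pop off the LIFO stack in alphabetical order:

```python
from collections import deque

def render_tree(root: str, children_map: dict[str, list[str]], max_depth: int = 5) -> str:
    """Iterative ASCII tree renderer mirroring the stack-based walk above."""
    SPACE, BRANCH, TEE, LAST = "    ", "│   ", "├── ", "└── "
    stack = deque([(root, 0, "", None)])
    lines = []
    while stack:
        node, depth, prefix, is_last = stack.pop()
        if depth > max_depth:
            continue
        pointer = LAST if is_last else TEE
        lines.append(node if depth == 0 else prefix + pointer + node)
        kids = sorted(children_map.get(node, []))
        # Push in reverse so children pop in alphabetical order.
        for i in range(len(kids) - 1, -1, -1):
            extension = SPACE if is_last else BRANCH
            new_prefix = "" if depth == 0 else prefix + extension
            stack.append((kids[i], depth + 1, new_prefix, i == len(kids) - 1))
    return "\n".join(lines)

print(render_tree("project", {"project": ["src", "tests"], "src": ["main.py"]}))
```

The printed result is the familiar `tree`-style output: `project`, then `├── src` with `│   └── main.py` beneath it, then `└── tests`.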

View File

@@ -0,0 +1,167 @@
from typing import Mapping, Optional, Sequence
from langchain_core.language_models.chat_models import BaseChatModel
from langgraph.graph import END, StateGraph
from prometheus.docker.base_container import BaseContainer
from prometheus.git.git_repository import GitRepository
from prometheus.graph.knowledge_graph import KnowledgeGraph
from prometheus.lang_graph.graphs.issue_state import IssueState, IssueType
from prometheus.lang_graph.nodes.issue_bug_subgraph_node import IssueBugSubgraphNode
from prometheus.lang_graph.nodes.issue_classification_subgraph_node import (
IssueClassificationSubgraphNode,
)
from prometheus.lang_graph.nodes.issue_documentation_subgraph_node import (
IssueDocumentationSubgraphNode,
)
from prometheus.lang_graph.nodes.issue_feature_subgraph_node import IssueFeatureSubgraphNode
from prometheus.lang_graph.nodes.issue_question_subgraph_node import IssueQuestionSubgraphNode
from prometheus.lang_graph.nodes.noop_node import NoopNode
class IssueGraph:
"""
A LangGraph-based workflow to handle and triage GitHub issues with LLM assistance.
Attributes:
git_repo (GitRepository): The Git repository to work with.
graph (StateGraph): The state graph representing the issue handling workflow.
"""
def __init__(
self,
advanced_model: BaseChatModel,
base_model: BaseChatModel,
kg: KnowledgeGraph,
git_repo: GitRepository,
container: BaseContainer,
repository_id: int,
test_commands: Optional[Sequence[str]] = None,
):
self.git_repo = git_repo
# Entrance point for the issue handling workflow
issue_type_branch_node = NoopNode()
# Subgraph nodes for issue classification
issue_classification_subgraph_node = IssueClassificationSubgraphNode(
advanced_model=advanced_model,
model=base_model,
kg=kg,
local_path=git_repo.playground_path,
repository_id=repository_id,
)
# Subgraph node for handling bug issues
issue_bug_subgraph_node = IssueBugSubgraphNode(
advanced_model=advanced_model,
base_model=base_model,
container=container,
kg=kg,
git_repo=git_repo,
repository_id=repository_id,
test_commands=test_commands,
)
# Subgraph node for handling question issues
issue_question_subgraph_node = IssueQuestionSubgraphNode(
advanced_model=advanced_model,
base_model=base_model,
kg=kg,
git_repo=git_repo,
repository_id=repository_id,
)
# Subgraph node for handling feature request issues
issue_feature_subgraph_node = IssueFeatureSubgraphNode(
advanced_model=advanced_model,
base_model=base_model,
container=container,
kg=kg,
git_repo=git_repo,
repository_id=repository_id,
)
# Subgraph node for handling documentation issues
issue_documentation_subgraph_node = IssueDocumentationSubgraphNode(
advanced_model=advanced_model,
base_model=base_model,
kg=kg,
git_repo=git_repo,
repository_id=repository_id,
)
# Create the state graph for the issue handling workflow
workflow = StateGraph(IssueState)
# Add nodes to the workflow
workflow.add_node("issue_type_branch_node", issue_type_branch_node)
workflow.add_node("issue_classification_subgraph_node", issue_classification_subgraph_node)
workflow.add_node("issue_bug_subgraph_node", issue_bug_subgraph_node)
workflow.add_node("issue_question_subgraph_node", issue_question_subgraph_node)
workflow.add_node("issue_feature_subgraph_node", issue_feature_subgraph_node)
workflow.add_node("issue_documentation_subgraph_node", issue_documentation_subgraph_node)
# Set the entry point for the workflow
workflow.set_entry_point("issue_type_branch_node")
# Define the edges and conditions for the workflow
# Classify the issue type if not provided
workflow.add_conditional_edges(
"issue_type_branch_node",
lambda state: state["issue_type"],
{
IssueType.AUTO: "issue_classification_subgraph_node",
IssueType.BUG: "issue_bug_subgraph_node",
IssueType.FEATURE: "issue_feature_subgraph_node",
IssueType.DOCUMENTATION: "issue_documentation_subgraph_node",
IssueType.QUESTION: "issue_question_subgraph_node",
},
)
# Add edges for the issue classification subgraph
workflow.add_conditional_edges(
"issue_classification_subgraph_node",
lambda state: state["issue_type"],
{
IssueType.BUG: "issue_bug_subgraph_node",
IssueType.FEATURE: "issue_feature_subgraph_node",
IssueType.DOCUMENTATION: "issue_documentation_subgraph_node",
IssueType.QUESTION: "issue_question_subgraph_node",
},
)
# Add edges for ending the workflow
workflow.add_edge("issue_bug_subgraph_node", END)
workflow.add_edge("issue_question_subgraph_node", END)
workflow.add_edge("issue_feature_subgraph_node", END)
workflow.add_edge("issue_documentation_subgraph_node", END)
self.graph = workflow.compile()
def invoke(
self,
issue_title: str,
issue_body: str,
issue_comments: Sequence[Mapping[str, str]],
issue_type: IssueType,
run_build: bool,
run_existing_test: bool,
run_regression_test: bool,
run_reproduce_test: bool,
number_of_candidate_patch: int,
):
"""
Invoke the issue handling workflow with the provided parameters.
"""
config = None
input_state = {
"issue_title": issue_title,
"issue_body": issue_body,
"issue_comments": issue_comments,
"issue_type": issue_type,
"run_build": run_build,
"run_existing_test": run_existing_test,
"run_regression_test": run_regression_test,
"run_reproduce_test": run_reproduce_test,
"number_of_candidate_patch": number_of_candidate_patch,
}
output_state = self.graph.invoke(input_state, config)
return output_state
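Stripped of LangGraph, the conditional edges above implement a plain dispatch: route on `state["issue_type"]`, with `AUTO` passing through classification first and then re-routing on the classified type. A minimal sketch of that control flow (the handlers and the classifier are hypothetical stand-ins for the subgraph nodes):

```python
from enum import Enum

class IssueType(str, Enum):
    AUTO = "auto"
    BUG = "bug"
    FEATURE = "feature"
    DOCUMENTATION = "documentation"
    QUESTION = "question"

def classify(state: dict) -> dict:
    # Stand-in for the classification subgraph: pretend every auto issue is a bug.
    state["issue_type"] = IssueType.BUG
    return state

HANDLERS = {
    IssueType.BUG: lambda s: {**s, "issue_response": "patched"},
    IssueType.FEATURE: lambda s: {**s, "issue_response": "planned"},
    IssueType.DOCUMENTATION: lambda s: {**s, "issue_response": "documented"},
    IssueType.QUESTION: lambda s: {**s, "issue_response": "answered"},
}

def run(state: dict) -> dict:
    if state["issue_type"] == IssueType.AUTO:
        state = classify(state)  # AUTO goes through classification first
    return HANDLERS[state["issue_type"]](state)

print(run({"issue_type": IssueType.AUTO})["issue_response"])  # → patched
```

Each handler branch then terminates the workflow, matching the `add_edge(..., END)` calls above.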

View File

@@ -0,0 +1,31 @@
from enum import StrEnum
from typing import Mapping, Sequence, TypedDict
class IssueType(StrEnum):
AUTO = "auto"
BUG = "bug"
FEATURE = "feature"
DOCUMENTATION = "documentation"
QUESTION = "question"
class IssueState(TypedDict):
# Attributes provided by the user
issue_title: str
issue_body: str
issue_comments: Sequence[Mapping[str, str]]
issue_type: IssueType
run_build: bool
run_existing_test: bool
run_regression_test: bool
run_reproduce_test: bool
number_of_candidate_patch: int
edit_patch: str
passed_regression_test: bool
passed_reproducing_test: bool
passed_existing_test: bool
issue_response: str
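`IssueType` subclassing `StrEnum` is what lets the graph's conditional edges key on the enum members directly: members compare equal to their string values, so raw strings from user input or serialized state match enum keys in routing tables. A small sketch of that behavior (with a shim for interpreters older than 3.11, where `StrEnum` was added):

```python
import enum
try:
    from enum import StrEnum  # Python 3.11+
except ImportError:  # older interpreters: an equivalent str-mixin shim
    class StrEnum(str, enum.Enum):
        def __str__(self):
            return self.value

class IssueType(StrEnum):
    AUTO = "auto"
    BUG = "bug"

# Members *are* strings, so JSON round-trips and string comparisons both work.
assert IssueType.BUG == "bug"
assert IssueType("auto") is IssueType.AUTO
```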

View File

@@ -0,0 +1,52 @@
import logging
import threading
from langchain_core.messages import HumanMessage
from prometheus.lang_graph.subgraphs.context_retrieval_state import ContextRetrievalState
class AddContextRefinedQueryMessageNode:
"""Node for converting refined query to string and adding it to context_provider_messages."""
HUMAN_PROMPT = """
Please search for code and documentations that can help to answer the following query.
{query_message}
DO NOT do anything else other than searching for relevant code and documentation that can help to answer the query!
Your every action must be searching for relevant code and documentation that can help to answer the query!
"""
def __init__(self):
"""Initialize the add context refined query message node."""
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def __call__(self, state: ContextRetrievalState):
"""
Convert refined query to string and add to context_provider_messages.
Args:
state: Current state containing refined_query
Returns:
State update with context_provider_messages
"""
refined_query = state["refined_query"]
# Build the query message
query_parts = [f"Essential query: {refined_query.essential_query}"]
if refined_query.extra_requirements:
query_parts.append(f"\nExtra requirements: {refined_query.extra_requirements}")
if refined_query.purpose:
query_parts.append(f"\nPurpose: {refined_query.purpose}")
query_message = "".join(query_parts)
self._logger.info("Creating context provider message from refined query")
self._logger.debug(f"Query message: {query_message}")
# Create HumanMessage and add to context_provider_messages
human_message = HumanMessage(content=query_message)
return {"context_provider_messages": [human_message]}
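The message assembly above relies on each optional part carrying its own leading newline, so the parts can be joined with an empty separator. A standalone sketch (the `RefinedQuery` dataclass here is a hypothetical stand-in for the real refined-query object):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RefinedQuery:
    essential_query: str
    extra_requirements: Optional[str] = None
    purpose: Optional[str] = None

def build_query_message(q: RefinedQuery) -> str:
    parts = [f"Essential query: {q.essential_query}"]
    if q.extra_requirements:
        parts.append(f"\nExtra requirements: {q.extra_requirements}")
    if q.purpose:
        parts.append(f"\nPurpose: {q.purpose}")
    # Parts carry their own leading newlines, hence the empty join.
    return "".join(parts)

q = RefinedQuery("where is the retry logic?", purpose="fix a flaky test")
print(build_query_message(q))
```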

View File

@@ -0,0 +1,51 @@
import logging
import threading
from prometheus.lang_graph.subgraphs.context_retrieval_state import ContextRetrievalState
from prometheus.utils.knowledge_graph_utils import deduplicate_contexts, sort_contexts
class AddResultContextNode:
"""Node for adding new_contexts to context and deduplicating the result."""
def __init__(self):
"""Initialize the add result context node."""
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def __call__(self, state: ContextRetrievalState):
"""
Add new_contexts to context and deduplicate.
Args:
state: Current state containing context and new_contexts
Returns:
State update with deduplicated context
"""
existing_context = state.get("context", [])
new_contexts = state.get("new_contexts", [])
if not new_contexts:
self._logger.info("No new contexts to add")
return {"context": existing_context}
self._logger.info(
f"Adding {len(new_contexts)} new contexts to {len(existing_context)} existing contexts"
)
# Combine existing and new contexts
combined_contexts = list(existing_context) + list(new_contexts)
# Deduplicate
deduplicated_contexts = deduplicate_contexts(combined_contexts)
self._logger.info(
f"After deduplication: {len(deduplicated_contexts)} total contexts "
f"(removed {len(combined_contexts) - len(deduplicated_contexts)} duplicates)"
)
# Sort contexts before returning and record previous queries
return {
"context": sort_contexts(deduplicated_contexts),
"previous_refined_queries": [state["refined_query"]],
}
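The combine-then-deduplicate step above depends on an order-preserving dedup so earlier contexts win over later duplicates. A plausible sketch of that helper (the real one lives in `prometheus.utils.knowledge_graph_utils`; the dict shape and key fields here are assumptions for illustration):

```python
def deduplicate_contexts(contexts: list[dict]) -> list[dict]:
    """Order-preserving dedup keyed on (path, start_line)."""
    seen, unique = set(), []
    for c in contexts:
        key = (c["path"], c["start_line"])
        if key not in seen:
            seen.add(key)
            unique.append(c)
    return unique

existing = [{"path": "a.py", "start_line": 1}]
new = [{"path": "a.py", "start_line": 1}, {"path": "b.py", "start_line": 10}]
combined = deduplicate_contexts(existing + new)
print(len(combined))  # → 2
```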

View File

@@ -0,0 +1,61 @@
import logging
import threading
from langchain_core.language_models.chat_models import BaseChatModel
from langgraph.errors import GraphRecursionError
from prometheus.docker.base_container import BaseContainer
from prometheus.git.git_repository import GitRepository
from prometheus.lang_graph.subgraphs.bug_fix_verification_subgraph import BugFixVerificationSubgraph
from prometheus.lang_graph.subgraphs.issue_verified_bug_state import IssueVerifiedBugState
class BugFixVerificationSubgraphNode:
def __init__(
self,
model: BaseChatModel,
container: BaseContainer,
git_repo: GitRepository,
):
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
self.git_repo = git_repo
self.subgraph = BugFixVerificationSubgraph(
model=model,
container=container,
git_repo=self.git_repo,
)
def __call__(self, state: IssueVerifiedBugState):
self._logger.info("Enter bug_fix_verification_subgraph_node")
self._logger.debug(f"reproduced_bug_file: {state['reproduced_bug_file']}")
self._logger.debug(f"reproduced_bug_commands: {state['reproduced_bug_commands']}")
self._logger.debug(f"reproduced_bug_patch: {state['reproduced_bug_patch']}")
self._logger.debug(f"edit_patch: {state['edit_patch']}")
try:
output_state = self.subgraph.invoke(
reproduced_bug_file=state["reproduced_bug_file"],
reproduced_bug_commands=state["reproduced_bug_commands"],
reproduced_bug_patch=state["reproduced_bug_patch"],
edit_patch=state["edit_patch"],
)
except GraphRecursionError:
self._logger.info("Recursion limit reached, returning empty output state")
return {
"reproducing_test_fail_log": "Recursion limit reached during bug fix verification.",
}
finally:
self.git_repo.reset_repository()
self._logger.info(
f"Passing bug reproducing test: {not bool(output_state['reproducing_test_fail_log'])}"
)
self._logger.debug(
f"reproducing_test_fail_log: {output_state['reproducing_test_fail_log']}"
)
return {
"reproducing_test_fail_log": output_state["reproducing_test_fail_log"],
"pass_reproduction_test_patches": [state["edit_patch"]]
if not bool(output_state["reproducing_test_fail_log"])
else [],
}
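The node above follows the usual LangGraph pattern: a node is a plain callable that receives the state dict and returns only the keys it updates. A stripped-down sketch (the state keys mirror the file above, but `example_verification_node` itself is illustrative, not the real class):

```python
def example_verification_node(state: dict) -> dict:
    # A node returns a partial state update, not the whole state.
    # An empty fail log means the reproduction test now passes.
    fail_log = state["reproducing_test_fail_log"]
    passed = not bool(fail_log)
    return {
        "pass_reproduction_test_patches": [state["edit_patch"]] if passed else [],
    }

state = {"reproducing_test_fail_log": "", "edit_patch": "diff --git a/fix b/fix"}
update = example_verification_node(state)
```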

View File

@@ -0,0 +1,90 @@
import functools
import logging
import threading
from langchain.tools import StructuredTool
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import HumanMessage, SystemMessage
from prometheus.docker.base_container import BaseContainer
from prometheus.lang_graph.subgraphs.bug_fix_verification_state import BugFixVerificationState
from prometheus.tools.container_command import ContainerCommandTool
class BugFixVerifyNode:
SYS_PROMPT = """\
You are a bug fix verification agent. Your role is to verify whether a bug has been fixed by running the reproduction steps and reporting the results accurately.
Your tasks are to:
1. Execute the provided reproduction commands on the given bug reproduction file
2. If a command fails due to simple environment issues (like missing "./" prefix), make minimal adjustments to make it work
3. Report the exact output of the successful commands
Guidelines for command execution:
- Start by running commands exactly as provided
- If a command fails, you may make minimal adjustments like:
* Adding "./" for executable files
* Using appropriate path separators for the environment
* Adding basic command prefixes if clearly needed (e.g., "python" for .py files)
- Do NOT modify the core logic or parameters of the commands
- Do NOT attempt to fix bugs or modify test logic
- DO NOT ASSUME ALL DEPENDENCIES ARE INSTALLED.
- Once the reproduction test has been successfully executed (whether it passed or failed), return immediately!
REMINDER:
- Install dependencies if needed!
Format your response as:
```
Result:
[exact output/result]
```
Remember: Your only job is to execute the commands and report results faithfully. Do not offer suggestions, analyze results, or try to fix issues.
"""
HUMAN_PROMPT = """\
Reproducing bug file:
{reproduced_bug_file}
Reproducing bug commands:
{reproduced_bug_commands}
"""
def __init__(self, model: BaseChatModel, container: BaseContainer):
self.container_command_tool = ContainerCommandTool(container)
self.tools = self._init_tools()
self.model_with_tools = model.bind_tools(self.tools)
self.system_prompt = SystemMessage(self.SYS_PROMPT)
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def _init_tools(self):
tools = []
run_command_fn = functools.partial(self.container_command_tool.run_command)
run_command_tool = StructuredTool.from_function(
func=run_command_fn,
name=self.container_command_tool.run_command.__name__,
description=self.container_command_tool.run_command_spec.description,
args_schema=self.container_command_tool.run_command_spec.input_schema,
)
tools.append(run_command_tool)
return tools
def format_human_message(self, state: BugFixVerificationState) -> HumanMessage:
return HumanMessage(
self.HUMAN_PROMPT.format(
reproduced_bug_file=state["reproduced_bug_file"],
reproduced_bug_commands=state["reproduced_bug_commands"],
)
)
def __call__(self, state: BugFixVerificationState):
human_message = self.format_human_message(state)
message_history = [self.system_prompt, human_message] + state["bug_fix_verify_messages"]
response = self.model_with_tools.invoke(message_history)
self._logger.debug(response)
return {"bug_fix_verify_messages": [response]}

View File

@@ -0,0 +1,104 @@
import logging
import threading
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from prometheus.lang_graph.subgraphs.bug_fix_verification_state import BugFixVerificationState
from prometheus.utils.lang_graph_util import get_last_message_content
class BugFixVerifyStructureOutput(BaseModel):
reproducing_test_fail_log: str = Field(
description="If the test failed, contains the complete test failure log. Otherwise empty string"
)
class BugFixVerifyStructuredNode:
SYS_PROMPT = """\
You are a test result parser. Your only task is to check if the bug reproducing test now passes after code changes.
Your task is to:
1. Check if the test passes by looking for test pass indicators:
- Test summary showing "passed" or "PASSED"
- Warning is ok
- No "FAILURES" section
2. If the test fails, capture the complete failure output
Return:
- reproducing_test_fail_log: empty string if test passes, complete test output if it fails
Example of Fixed Bug (Test Passes):
```
Test Execute Messages:
run_command: pytest tests/test_json_parser.py
Output:
============================= test session starts ==============================
platform linux -- Python 3.9.20, pytest-7.4.4, pluggy-1.0.0
collected 1 item
tests/test_json_parser.py . [100%]
============================== 1 passed in 0.16s =============================
```
Example Output for Fixed Bug:
{
"reproducing_test_fail_log": ""
}
Example of Unfixed Bug (Test Still Fails):
```
Test Execute Messages:
run_command: pytest tests/test_json_parser.py
Output:
============================= test session starts ==============================
platform linux -- Python 3.9.20, pytest-7.4.4, pluggy-1.0.0
collected 1 item
tests/test_json_parser.py F [100%]
================================= FAILURES ==================================
_________________________ test_empty_array_parsing _________________________
def test_empty_array_parsing():
> assert parser.parse_array(['[', ']']) == []
E ValueError: Empty array not supported
tests/test_json_parser.py:7: ValueError
=========================== short test summary info ==========================
FAILED tests/test_json_parser.py::test_empty_array_parsing - ValueError
============================== 1 failed in 0.16s =============================
```
Example Output for Unfixed Bug:
{
"reproducing_test_fail_log": "<complete test output above>"
}
Important:
- Only look at test pass/fail status
- A single failing test means the bug isn't fixed
- Include complete test output in failure log
- Any error or failure means the bug isn't fixed yet
""".replace("{", "{{").replace("}", "}}")
def __init__(self, model: BaseChatModel):
prompt = ChatPromptTemplate.from_messages(
[("system", self.SYS_PROMPT), ("human", "{bug_reproducing_logs}")]
)
structured_llm = model.with_structured_output(BugFixVerifyStructureOutput)
self.model = prompt | structured_llm
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def __call__(self, state: BugFixVerificationState):
bug_fix_verify_message = get_last_message_content(state["bug_fix_verify_messages"])
response = self.model.invoke({"bug_reproducing_logs": bug_fix_verify_message})
self._logger.debug(response)
return {
"reproducing_test_fail_log": response.reproducing_test_fail_log,
}
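The trailing `.replace("{", "{{").replace("}", "}}")` on the system prompt exists because `ChatPromptTemplate` treats single braces as template variables, so literal JSON in the prompt must double them. A minimal sketch using plain `str.format`, which follows the same escaping rule:

```python
# Literal JSON that should survive templating untouched.
raw = '{"reproducing_test_fail_log": ""}'

# Doubling the braces marks them as literals rather than placeholders.
escaped = raw.replace("{", "{{").replace("}", "}}")

# After formatting, the doubled braces collapse back to single ones.
rendered = escaped.format()
```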

View File

@@ -0,0 +1,101 @@
import logging
import threading
from prometheus.lang_graph.subgraphs.bug_get_regression_tests_state import (
BugGetRegressionTestsState,
)
from prometheus.utils.issue_util import format_issue_info
class BugGetRegressionContextMessageNode:
SELECT_REGRESSION_QUERY = """\
We are currently solving the following issue within our repository. Here is the issue text:
--- BEGIN ISSUE ---
{issue_info}
--- END ISSUE ---
And we need to find relevant existing tests that can be used as regression tests for this issue.
OBJECTIVE: Find {number_of_selected_regression_tests} relevant existing test cases that are most likely to break existing functionality if this issue is fixed or new changes are applied,
including ALL necessary imports, test setup, mocking, assertions, and any test method used in the test case.
<reasoning>
1. Analyze bug characteristics:
- Core functionality being tested
- Input parameters and configurations
- Expected error conditions
- Environmental dependencies
2. Search requirements:
- Required imports and dependencies
- Test files exercising similar functionality
- Mock/fixture setup patterns
- Assertion styles
- Error handling tests
3. Focus areas:
- All necessary imports (standard library, testing frameworks, mocking utilities)
- Dependencies and third-party packages
- Test setup and teardown
- Mock object configuration
- Network/external service simulation
- Error condition verification
</reasoning>
REQUIREMENTS:
- Return {number_of_selected_regression_tests} complete, self-contained test cases that are most likely to break existing functionality if this issue is fixed or new changes are applied.
- Must include the identification of the test case (e.g., class name and method name)
- Must preserve exact file paths and line numbers
<examples>
--- BEGIN ISSUE ---
Title: parse_iso8601 drops timezone information for 'Z' suffix
Body: The helper `parse_iso8601` in `utils/datetime.py` incorrectly returns a naive datetime when the input ends with 'Z' (UTC). For example, "2024-10-12T09:15:00Z" becomes a naive datetime instead of a timezone-aware UTC one. This breaks downstream scheduling.
Expected: Return timezone-aware datetime in UTC for 'Z' inputs and preserve offsets like "+09:00".
--- END ISSUE ---
--- BEGIN TEST CASES ---
File: tests/test_datetime.py
Line Number: 118-156
Content:
import datetime
import pytest
from utils.datetime import parse_iso8601 # target under test
def test_z_suffix_returns_utc_aware():
# Input ending with 'Z' should be interpreted as UTC and be timezone-aware
s = "2024-10-12T09:15:00Z"
dt = parse_iso8601(s)
assert isinstance(dt, datetime.datetime)
assert dt.tzinfo is not None
# Use UTC comparison that works across pytz/zoneinfo
assert dt.utcoffset() == datetime.timedelta(0)
def test_offset_preserved():
# Offset like +09:00 should be preserved (e.g., Asia/Tokyo offset)
s = "2024-10-12T18:00:00+09:00"
dt = parse_iso8601(s)
assert isinstance(dt, datetime.datetime)
assert dt.tzinfo is not None
assert dt.utcoffset() == datetime.timedelta(hours=9)
--- END TEST CASES ---
</examples>
"""
def __init__(self):
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def __call__(self, state: BugGetRegressionTestsState):
select_regression_query = self.SELECT_REGRESSION_QUERY.format(
issue_info=format_issue_info(
state["issue_title"], state["issue_body"], state["issue_comments"]
),
number_of_selected_regression_tests=state["number_of_selected_regression_tests"] + 3,
)
self._logger.debug(
f"Sending query to context provider subgraph:\n{select_regression_query}"
)
return {"select_regression_query": select_regression_query}

View File

@@ -0,0 +1,121 @@
import logging
import threading
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.prompts import ChatPromptTemplate
from pydantic import BaseModel, Field
from prometheus.lang_graph.subgraphs.bug_get_regression_tests_state import (
BugGetRegressionTestsState,
)
from prometheus.utils.issue_util import format_issue_info
class RegressionTestStructuredOutPut(BaseModel):
reasoning: str = Field(description="Your step-by-step reasoning why this test is selected")
test_identifier: str = Field(
description="The test identifier that you select (e.g., class name and method name)"
)
class RegressionTestsStructuredOutPut(BaseModel):
selected_tests: list[RegressionTestStructuredOutPut] = Field(
description="List of selected regression tests with reasoning and identifiers"
)
class BugGetRegressionTestsSelectionNode:
SYS_PROMPT = """\
You are an expert programming assistant specialized in evaluating and selecting regression tests for a given issue among multiple candidate test cases.
Your goal is to analyze each test and select appropriate regression tests based on the following prioritized criteria:
1. The test is relevant to the issue at hand; fixing the bug could affect this test
2. The test is most likely to break existing functionality if this issue is fixed or new changes are applied
3. Do NOT select any test cases that may be skipped during normal test runs!
Analysis Process:
1. First, understand the issue from the provided issue info
2. Examine each test carefully, considering:
   - Is it relevant to the issue at hand?
   - Could fixing the bug affect this test?
   - Is this test likely to break existing functionality if this issue is fixed or new changes are applied?
3. Compare tests systematically against each criterion
4. Provide detailed reasoning for your selection
Output Requirements:
- You MUST provide structured output in the following format:
{{
"selected_tests": [
{{
"reasoning": "", # Your step-by-step reasoning why this test is selected
"test_identifier": "" # The test identifier that you select (e.g., class name and method name)
}}
]
}}
ALL fields are REQUIRED!
EXAMPLE OUTPUT:
```json
{{
"selected_tests": [
{{
"reasoning": "1. Relevance to issue: The test directly exercises the functionality described in the issue, specifically handling edge cases that are likely to be affected by the bug fix.\n2. Impact likelihood: Given the test's focus on critical paths mentioned in the issue, it is highly probable that fixing the bug will influence this test's behavior.",
"test_identifier": "pvlib/tests/test_iam.py::test_ashrae"
}}
]
}}
```
Remember:
- Always analyze all available tests thoroughly
- Provide clear, step-by-step reasoning for your selection
- Select the tests that best balance the prioritized criteria
- Do NOT select any test cases that may be skipped during normal test runs!
"""
HUMAN_PROMPT = """\
PLEASE SELECT {number_of_selected_regression_tests} RELEVANT REGRESSION TESTS FOR THE FOLLOWING ISSUE:
--- BEGIN ISSUE ---
{issue_info}
--- END ISSUE ---
Select Regression Tests Context:
{select_regression_context}
You MUST select {number_of_selected_regression_tests} regression tests that are most likely to break existing functionality if this issue is fixed or new changes are applied.
"""
def __init__(self, model: BaseChatModel):
prompt = ChatPromptTemplate.from_messages(
[("system", self.SYS_PROMPT), ("human", "{human_prompt}")]
)
structured_llm = model.with_structured_output(RegressionTestsStructuredOutPut)
self.model = prompt | structured_llm
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def format_human_message(self, state: BugGetRegressionTestsState):
return self.HUMAN_PROMPT.format(
issue_info=format_issue_info(
state["issue_title"], state["issue_body"], state["issue_comments"]
),
select_regression_context="\n\n".join(
[str(context) for context in state["select_regression_context"]]
),
number_of_selected_regression_tests=state["number_of_selected_regression_tests"],
)
def __call__(self, state: BugGetRegressionTestsState):
# If no context gathered return empty tests
if not state["select_regression_context"]:
return {"selected_regression_tests": []}
human_prompt = self.format_human_message(state)
response = self.model.invoke({"human_prompt": human_prompt})
self._logger.debug(f"Model response: {response}")
self._logger.debug(f"{len(response.selected_tests)} tests selected as regression tests")
# Return only the identifiers of the selected regression tests
return {
"selected_regression_tests": [test.test_identifier for test in response.selected_tests]
}
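The node's return value keeps only the identifiers from the structured output. Using dataclasses as a stand-in for the Pydantic models above (so the sketch runs without LangChain or a live model; the sample data is hypothetical):

```python
from dataclasses import dataclass

@dataclass
class SelectedTest:
    """Stand-in for RegressionTestStructuredOutPut."""
    reasoning: str
    test_identifier: str

selected = [
    SelectedTest("exercises the affected code path", "tests/test_iam.py::test_ashrae"),
    SelectedTest("covers the same error condition", "tests/test_iam.py::test_physical"),
]
# Downstream nodes only need the identifiers, not the reasoning.
identifiers = [test.test_identifier for test in selected]
```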

View File

@@ -0,0 +1,50 @@
import logging
import threading
from typing import Dict
from langchain_core.language_models.chat_models import BaseChatModel
from langgraph.errors import GraphRecursionError
from prometheus.docker.base_container import BaseContainer
from prometheus.git.git_repository import GitRepository
from prometheus.graph.knowledge_graph import KnowledgeGraph
from prometheus.lang_graph.subgraphs.bug_get_regression_tests_subgraph import (
BugGetRegressionTestsSubgraph,
)
class BugGetRegressionTestsSubgraphNode:
def __init__(
self,
advanced_model: BaseChatModel,
base_model: BaseChatModel,
container: BaseContainer,
kg: KnowledgeGraph,
git_repo: GitRepository,
repository_id: int,
):
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
self.subgraph = BugGetRegressionTestsSubgraph(
advanced_model=advanced_model,
base_model=base_model,
container=container,
kg=kg,
git_repo=git_repo,
repository_id=repository_id,
)
def __call__(self, state: Dict):
self._logger.info("Enter bug_get_regression_tests_subgraph_node")
try:
output_state = self.subgraph.invoke(
issue_title=state["issue_title"],
issue_body=state["issue_body"],
issue_comments=state["issue_comments"],
)
except GraphRecursionError:
self._logger.info("Recursion limit reached, returning empty regression tests")
return {"selected_regression_tests": []}
self._logger.debug(
f"Selected {len(output_state['regression_tests'])} regression tests: {output_state['regression_tests']}"
)
return {"selected_regression_tests": output_state["regression_tests"]}

View File

@@ -0,0 +1,125 @@
import functools
import logging
import threading
from pathlib import Path
from typing import Optional, Sequence
from langchain.tools import StructuredTool
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from prometheus.docker.base_container import BaseContainer
from prometheus.lang_graph.subgraphs.bug_reproduction_state import BugReproductionState
from prometheus.tools.container_command import ContainerCommandTool
from prometheus.utils.issue_util import format_test_commands
from prometheus.utils.patch_util import get_updated_files
class BugReproducingExecuteNode:
SYS_PROMPT = """\
You are a testing expert focused solely on executing THE SINGLE bug reproduction test file.
Your only goal is to run the test file created by the previous agent and return its output verbatim.
Adapt the user-provided test command to execute the single bug reproduction test file.
Rules:
* DO NOT CHECK IF THE TEST FILE EXISTS. IT IS GUARANTEED TO EXIST.
* DO NOT EXECUTE THE WHOLE TEST SUITE. ONLY EXECUTE THE SINGLE BUG REPRODUCTION TEST FILE.
* DO NOT EDIT ANY FILES.
* DO NOT ASSUME ALL DEPENDENCIES ARE INSTALLED.
* STOP TRYING IF THE TEST EXECUTES.
REMINDER:
* Install dependencies if needed!
"""
HUMAN_PROMPT = """\
ISSUE INFORMATION:
Title: {title}
Description: {body}
Comments: {comments}
Bug reproducing file:
{reproduced_bug_file}
User provided test commands:
{test_commands}
"""
def __init__(
self,
model: BaseChatModel,
container: BaseContainer,
test_commands: Optional[Sequence[str]] = None,
):
self.test_commands = test_commands
self.container_command_tool = ContainerCommandTool(container)
self.tools = self._init_tools()
self.model_with_tools = model.bind_tools(self.tools)
self.system_prompt = SystemMessage(self.SYS_PROMPT)
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def _init_tools(self):
tools = []
run_command_fn = functools.partial(self.container_command_tool.run_command)
run_command_tool = StructuredTool.from_function(
func=run_command_fn,
name=self.container_command_tool.run_command.__name__,
description=self.container_command_tool.run_command_spec.description,
args_schema=self.container_command_tool.run_command_spec.input_schema,
)
tools.append(run_command_tool)
return tools
    def added_test_filename(self, state: BugReproductionState) -> Path:
        added_files, modified_files, removed_files = get_updated_files(
            state["bug_reproducing_patch"]
        )
        if removed_files:
            raise ValueError("The bug reproducing patch deletes files")
        if modified_files:
            raise ValueError("The bug reproducing patch modifies existing files")
        if len(added_files) != 1:
            raise ValueError("The bug reproducing patch did not add exactly one file")
        return added_files[0]
def format_human_message(
self, state: BugReproductionState, reproduced_bug_file: str
) -> HumanMessage:
test_commands_str = ""
if self.test_commands:
test_commands_str = format_test_commands(self.test_commands)
return HumanMessage(
self.HUMAN_PROMPT.format(
title=state["issue_title"],
body=state["issue_body"],
comments=state["issue_comments"],
reproduced_bug_file=reproduced_bug_file,
test_commands=test_commands_str,
)
)
def __call__(self, state: BugReproductionState):
try:
reproduced_bug_file = self.added_test_filename(state)
except ValueError as e:
self._logger.error(f"Error in bug reproducing execute node: {e}")
return {
"bug_reproducing_execute_messages": [
AIMessage(f"THE TEST WAS NOT EXECUTED BECAUSE OF AN ERROR: {str(e)}")
],
}
message_history = [
self.system_prompt,
self.format_human_message(state, str(reproduced_bug_file)),
] + state["bug_reproducing_execute_messages"]
response = self.model_with_tools.invoke(message_history)
self._logger.debug(response)
return {
"bug_reproducing_execute_messages": [response],
"reproduced_bug_file": reproduced_bug_file,
}
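The guards in `added_test_filename` insist that the reproduction patch adds exactly one new file and touches nothing else. The same validation can be sketched in isolation (a hypothetical helper; `get_updated_files` itself is not reproduced here):

```python
def require_single_added_file(added, modified, removed):
    """Mirror the guard logic: the patch must add exactly one new file."""
    if removed:
        raise ValueError("The bug reproducing patch deletes files")
    if modified:
        raise ValueError("The bug reproducing patch modifies existing files")
    if len(added) != 1:
        raise ValueError("The bug reproducing patch must add exactly one file")
    return added[0]

test_file = require_single_added_file(["tests/test_repro.py"], [], [])
```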

View File

@@ -0,0 +1,88 @@
import functools
import logging
import threading
from langchain.tools import StructuredTool
from langchain_core.language_models.chat_models import BaseChatModel
from langchain_core.messages import HumanMessage, SystemMessage
from prometheus.graph.knowledge_graph import KnowledgeGraph
from prometheus.lang_graph.subgraphs.bug_reproduction_state import BugReproductionState
from prometheus.tools.file_operation import FileOperationTool
from prometheus.utils.lang_graph_util import get_last_message_content
class BugReproducingFileNode:
SYS_PROMPT = """\
You are a test file manager. Your task is to save the provided bug reproducing code in the project. You should:
1. Examine the project structure to identify existing test file naming patterns and test folder organization
2. Use the create_file tool to save the bug reproducing code in a SINGLE new test file that does not yet exist;
the name should follow the project's existing test filename conventions
3. After creating the file, return its relative path
Tools available:
- create_file: Create a new SINGLE file with specified content
If create_file fails because a file with that name already exists, choose another name.
Respond with the created file's relative path.
"""
HUMAN_PROMPT = """\
Save this bug reproducing code in the project:
{bug_reproducing_code}
Current project structure:
{project_structure}
"""
def __init__(self, model: BaseChatModel, kg: KnowledgeGraph, local_path: str):
self.kg = kg
self.file_operation_tool = FileOperationTool(local_path, kg)
self.tools = self._init_tools()
self.model_with_tools = model.bind_tools(self.tools)
self.system_prompt = SystemMessage(self.SYS_PROMPT)
self._logger = logging.getLogger(f"thread-{threading.get_ident()}.{__name__}")
def _init_tools(self):
"""Initializes file operation tools."""
tools = []
read_file_fn = functools.partial(self.file_operation_tool.read_file)
read_file_tool = StructuredTool.from_function(
func=read_file_fn,
name=FileOperationTool.read_file.__name__,
description=FileOperationTool.read_file_spec.description,
args_schema=FileOperationTool.read_file_spec.input_schema,
)
tools.append(read_file_tool)
create_file_fn = functools.partial(self.file_operation_tool.create_file)
create_file_tool = StructuredTool.from_function(
func=create_file_fn,
name=FileOperationTool.create_file.__name__,
description=FileOperationTool.create_file_spec.description,
args_schema=FileOperationTool.create_file_spec.input_schema,
)
tools.append(create_file_tool)
return tools
def format_human_message(self, state: BugReproductionState) -> HumanMessage:
return HumanMessage(
self.HUMAN_PROMPT.format(
bug_reproducing_code=get_last_message_content(
state["bug_reproducing_write_messages"]
),
project_structure=self.kg.get_file_tree(),
)
)
def __call__(self, state: BugReproductionState):
message_history = [self.system_prompt, self.format_human_message(state)] + state[
"bug_reproducing_file_messages"
]
response = self.model_with_tools.invoke(message_history)
self._logger.debug(response)
return {"bug_reproducing_file_messages": [response]}

Some files were not shown because too many files have changed in this diff