Multi-Agents will have redundant content across different sections #548
Comments
Hey @DandinPower that's a great discovery and definitely a huge improvement to the experience. At the moment (mostly for cost purposes) the report indeed does not take previous subtopics into consideration. Enabling this would mean every iteration has to see the entire report generated before it, which would be very costly in terms of LLM calls. I guess it's a tradeoff, but definitely something we can consider adding as an option.
Hello @assafelovic,
Hey, I played with the researcher and noticed the same issue. IMO the tradeoff is well worth it, since the alternative is to remove redundant sections by hand. Maintaining coherence across the entire report seems paramount for real-world applications and complex subjects.
Hello @antoremin, I have forked the repo and written a brief draft about the modification design. If you are interested, you are welcome to check the draft.
Hey @DandinPower would love help with a PR for this! Currently working on a new front-end UX/UI.
Hello @assafelovic, thank you for your invitation! I am willing to help with a PR after I finish the workflow feature I mentioned before.
@DandinPower @antoremin Thought about it deeper and I think there's a pretty decent approach that can work well:
Hello @assafelovic @antoremin, I think this is a very good approach to reducing redundant content across different subsections. It also allows for generating content fully in parallel, since we can retrieve the data independently. Additionally, moving extra LLM calls to embedding models and vector searches is generally quicker and cheaper. However, I have some concerns about the vector similarity-based approach:
The figure shows the three subsection topic embeddings (red points) sitting close to each other, while the gray points are previously written content chunks. The pale yellow circles represent the retrieval range implied by the similarity threshold we set. This illustrates how hard it is to pick a threshold that retrieves everything we want without also pulling in redundant content.
This is a great point! But consider that what we want is for the LLM to take previously written sections into consideration when writing the next ones. So it might be enough to retrieve similar chunks just as reasoning context for generating new content. I'd be happy to see how we can push this forward. Anyone want to take a stab at it? @antoremin @DandinPower
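To make the retrieval idea concrete, here is a minimal sketch assuming a generic embedding helper; the `embed` function, `WrittenContentStore` class, and threshold value are illustrative and not part of the GPT Researcher codebase. Previously written chunks are embedded and stored, and only the chunks most similar to the next subsection's topic get injected into the writer's prompt instead of the whole report:

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

class WrittenContentStore:
    """In-memory store of previously written chunks and their embeddings (illustrative)."""

    def __init__(self, embed):
        self.embed = embed            # hypothetical embedding function: str -> np.ndarray
        self.chunks: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, chunk: str) -> None:
        self.chunks.append(chunk)
        self.vectors.append(self.embed(chunk))

    def similar(self, topic: str, threshold: float = 0.75, k: int = 5) -> list[str]:
        """Return up to k previously written chunks relevant to the next subsection topic."""
        query = self.embed(topic)
        scored = [(cosine(query, v), c) for v, c in zip(self.vectors, self.chunks)]
        scored = [s for s in scored if s[0] >= threshold]
        return [c for _, c in sorted(scored, key=lambda s: s[0], reverse=True)[:k]]

# The retrieved chunks would then be injected into the subsection writer's prompt,
# e.g. "Avoid repeating the following points, which are already covered: ..."
```

The threshold is exactly the tricky part raised above: too low and the writer sees almost everything (back to the cost problem), too high and overlapping chunks slip through unnoticed.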
Hey @assafelovic @antoremin, if I've got this right, I'm up for helping push this idea forward. Here's a quick plan on how we could tweak the current workflow:
This sounds like a good plan, @DandinPower! I have experience with converting transcripts of interviews, which may jump between topics, into a coherent report that keeps details (unlike what a "summarize this meeting" prompt would produce). My workflow was:
There is still some redundancy, but not as much as when not using the vectordb, and token efficiency increases a lot. In your proposed workflow I could imagine the editor (or someone else) gathering abstract summary knowledge from the researchers and writing a guided outline with headlines, plus which facts/questions need to be addressed where (a rough sketch of such an outline is below).
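One possible shape for such a guided outline, purely as an illustration (the `SectionPlan` name and its fields are made up, not anything GPT Researcher defines), is a structured plan where each section owns its facts/questions and knows what the other sections own:

```python
from dataclasses import dataclass, field

@dataclass
class SectionPlan:
    headline: str                    # section title decided by the editor
    must_cover: list[str]            # facts/questions this section owns
    covered_elsewhere: list[str] = field(default_factory=list)  # topics other sections own

outline = [
    SectionPlan(
        headline="Background",
        must_cover=["definition of the topic", "why it matters"],
        covered_elsewhere=["benchmark results", "future directions"],
    ),
    SectionPlan(
        headline="Current approaches",
        must_cover=["main techniques", "benchmark results"],
        covered_elsewhere=["definition of the topic"],
    ),
]

# Each writer receives only its own SectionPlan, so redundancy is handled at the
# planning step instead of being cleaned up after the sections are written.
```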
This sounds great, guys! Who is helping lead this PR? :)
@danieldekay Thank you for sharing your experience and workflow! Your approach with interview transcripts is quite interesting and offers some valuable insights. @assafelovic I'm willing to take the lead on this PR and implement the improvements we've been discussing. Is there anything specific you'd like me to focus on or consider as I develop this feature?
@DandinPower, LangGraph also has a human-in-the-loop feature, and we might also want to ask a human for editorial feedback at the highest abstraction level of the report structure. This could be what it takes to go from 60% quality to 85%.
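For reference, here is a minimal LangGraph human-in-the-loop sketch (not GPT Researcher code; the node names, state fields, and placeholder node bodies are illustrative). It uses the `interrupt_before` / `update_state` pattern so a human can review or edit the plan at the highest abstraction level before any sections are written:

```python
from typing import TypedDict
from langgraph.graph import StateGraph, END
from langgraph.checkpoint.memory import MemorySaver

class ReportState(TypedDict):
    outline: str
    feedback: str

def plan_outline(state: ReportState) -> dict:
    # Stand-in for the planner/editor agent proposing the report structure.
    return {"outline": "1. Background\n2. Current approaches\n3. Outlook"}

def write_sections(state: ReportState) -> dict:
    # Stand-in for the writer agents, which would use the outline and feedback.
    return {}

builder = StateGraph(ReportState)
builder.add_node("plan", plan_outline)
builder.add_node("write", write_sections)
builder.set_entry_point("plan")
builder.add_edge("plan", "write")
builder.add_edge("write", END)

# Pausing before "write" lets a human review the outline before any sections exist.
app = builder.compile(checkpointer=MemorySaver(), interrupt_before=["write"])

config = {"configurable": {"thread_id": "report-1"}}
app.invoke({"outline": "", "feedback": ""}, config)                 # runs "plan", then pauses
app.update_state(config, {"feedback": "Merge sections 2 and 3."})   # human edits the plan
app.invoke(None, config)                                            # resumes into "write"
```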
@danieldekay That sounds like a good feature! In my opinion, there are two approaches to incorporating human feedback into the workflow:
@DandinPower ping me on Discord if you'd like, or we can open a channel for this feature and invite whoever would like to contribute/test it. Generally keep in mind that GPT Researcher is in charge of generating a research report based on research tasks, and the long detailed report leverages it multiple times. I assume the logic should be
@assafelovic Okay, I'll ping you on Discord later! Thanks for your guidance. I will make sure I understand the detailed report logic before pushing forward.
@DandinPower, I am also reading the STORM paper (https://arxiv.org/pdf/2402.14207), which has loads more insights into possible processes.
@danieldekay Hey! Thanks for introducing this paper. I think the current GPT Researcher detailed report type is also inspired by this paper. I will take a look at it.
Hey @DandinPower we're all eager to see this go live! :D
Hey @assafelovic, I was previously busy, but now I have more time to push this forward! I am eager to see this move forward too!
Additionally, I noticed that having long descriptions with explicit section logic helps (follow_guidelines does not work for me; it throws an error, which is reported in the discussion and also in #684).
Hey all, we've released a proposed solution for this. Please try it out and reopen this issue if you still find it a problem!
Hello, I have tried the multi-agent implementation found in the multi-agents folder. I also read the multi-agent blog first. I discovered that even though each subsection focuses on a different topic, it is easy for the content to overlap and discuss the same concepts across different subsections. As a result, the final report often contains a lot of redundant content, which is not useful at all.

I initially tried to add guidelines like "Each subsection must not have redundant content across different subsections." However, since the reviewer and reviser agents only work on individual subsections, the guidelines can only be applied at the subsection level, not to the entire report. Consequently, the reviewers and revisers are unaware of what is happening in other subsections, making it difficult to resolve the redundancy problem.
I am considering a solution where we have a chief reviewer and reviser. After each subsection completes its research, the chief reviewer and reviser would evaluate the final research across all subsections, ensure there is no redundant content, and provide revision directions to each subsection group to restart their research.
I think this kind of workflow will make the whole process more complex, increase token waste, and cause higher latency. However, I believe that if we can set global guidelines, such as "Each subsection must not have redundant content across different subsections," it can improve the final report's robustness and usefulness.
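Purely as an illustration of that idea, a chief reviewer step could look roughly like the sketch below; the `call_llm` helper, the prompt wording, and the data shapes are hypothetical and not part of the multi-agents code:

```python
def chief_review(subsections: dict[str, str], call_llm) -> dict[str, str]:
    """Review all drafted subsections together and return revision directions per subsection.

    `subsections` maps a subsection title to its drafted text; `call_llm` is a
    hypothetical helper that sends a prompt to the model and returns its reply.
    """
    full_draft = "\n\n".join(f"## {title}\n{text}" for title, text in subsections.items())
    directions = {}
    for title in subsections:
        prompt = (
            "You are the chief reviewer of a multi-section research report.\n"
            "Here is the full draft:\n\n" + full_draft + "\n\n"
            f"List the points in the section '{title}' that repeat content already covered "
            "in other sections, and state what this section should focus on instead. "
            "Reply with 'OK' if there is no redundancy."
        )
        directions[title] = call_llm(prompt)
    # Each subsection group would then revise (or restart) using its own direction.
    return directions
```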