@danorlando commented Nov 22, 2024

  1. Implements the new "Corrective RAG" agent architecture.
  2. Re-implements the standard RAG architecture (i.e., "chat_retrieval").
  3. Implements a state registry for managing agent state through the query-to-final-generation lifecycle from a centralized location. This should make adding new agent architectures more straightforward, since each architecture follows a common interface for formatting its architecture-specific message schemas. The state registry is the single home for any architecture-specific state management logic, keeping that logic out of places that would become harder and harder to maintain as new architectures are added.
  4. Also adds debug scripts for each of the agent architectures, which allow the agents to be tested directly without having to go through the runs API. (Sidenote: I think this is how we should do test automation for the different agent architectures.)
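The state-registry idea in point 3 could be sketched roughly like this. Everything here (the `register_state` decorator, `format_state`, and the CRAG state fields) is a hypothetical illustration of the pattern, not the PR's actual implementation:

```python
# Minimal sketch of a state registry: each agent architecture registers a
# formatter for its own state schema, so architecture-specific logic lives
# in one centralized place. All names here are illustrative.
from typing import Callable, Dict

# Registry mapping architecture name -> state formatter
_STATE_REGISTRY: Dict[str, Callable[[dict], dict]] = {}

def register_state(architecture: str):
    """Decorator that registers a state formatter for an architecture."""
    def decorator(fn: Callable[[dict], dict]):
        _STATE_REGISTRY[architecture] = fn
        return fn
    return decorator

def format_state(architecture: str, raw_state: dict) -> dict:
    """Look up the architecture's formatter and apply it."""
    return _STATE_REGISTRY[architecture](raw_state)

@register_state("corrective_rag")
def format_crag_state(raw_state: dict) -> dict:
    # A CRAG-style agent would track extra fields such as the graded
    # documents and the number of corrective iterations performed.
    return {
        "messages": raw_state.get("messages", []),
        "documents": raw_state.get("documents", []),
        "iterations": raw_state.get("iterations", 0),
    }
```

Adding a new architecture then means registering one formatter rather than threading new state logic through shared code paths.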

Steps to test

In Postman...

  1. First, create an assistant by calling the assistants API with the body:
{
  "name": "Corrective RAG Test",
  "config": {
    "configurable": {
        "type": "corrective_rag",
        "agent_type": "GPT 4o Mini",
        "interrupt_before_action": false,
        "system_prompt": "You are a helpful assistant.",
        "enable_web_search": true,
        "relevance_threshold": 0.5,
        "max_corrective_iterations": 3,
        "question_rewriter_prompt": "You are an expert at reformulating questions to be clearer and more effective for search."
    }
  },
  "file_ids": [],
  "public": true
}

Be sure to make note of the assistant_id that is returned.
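If you'd rather build the request body in code than paste it into Postman, a small helper like this works. The function name and parameter defaults are illustrative, not part of the PR:

```python
# Build the assistant-creation body shown above as a JSON string.
# The function name and defaults are assumptions for illustration.
import json

def build_crag_assistant_payload(name: str,
                                 relevance_threshold: float = 0.5,
                                 max_corrective_iterations: int = 3) -> str:
    """Return the corrective_rag assistant config as a JSON request body."""
    payload = {
        "name": name,
        "config": {
            "configurable": {
                "type": "corrective_rag",
                "agent_type": "GPT 4o Mini",
                "interrupt_before_action": False,
                "system_prompt": "You are a helpful assistant.",
                "enable_web_search": True,
                "relevance_threshold": relevance_threshold,
                "max_corrective_iterations": max_corrective_iterations,
            }
        },
        "file_ids": [],
        "public": True,
    }
    return json.dumps(payload, indent=2)
```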

  2. Ingest a document by calling /rag/ingest with a file you uploaded:
{
  "files": [
    "e2c7447b-0443-43a6-9cfc-c342dc4aff24"
  ],
  "purpose": "assistants",
  "namespace": "<assistant_id returned from step 1>"
}
  3. You can test the agent directly by running `python debug_crag_agent.py --query "<your_query>"`. You'll need to replace `assistant_id` in `assistant_config` in debug_crag_agent.py with your new assistant ID.
  4. Use the same process for standard RAG with `python debug_chat_retrieval.py --query ...`. (Note that `retrieval_config` in `DEFAULT_AGENT_CONFIG` of debug_chat_retrieval.py does not currently do anything, so don't worry about it being set to use ollama.)
  5. Test using the API by calling /runs/stream:
{
    // "assistant_id": "2cb8752f-1b2b-4400-8b7a-7dd195f420d6", // chat_retrieval
    "assistant_id": "0f7c920a-e71a-498a-9165-4cbffcef4eaf", // crag
    // "assistant_id": "46cdb912-679e-4efa-a92f-b787983b0e71", // tools agent
    "input": [
        {
            "content": "What is automatic prompt optimization?",
            "additional_kwargs": {},
            "type": "human",
            "example": false
        }
    ]
}
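To script the /runs/stream call, a query can be wrapped in the human-message input schema shown above. The helper name is an assumption; the message fields mirror the body in step 5:

```python
# Wrap a user query in the human-message schema expected by /runs/stream.
# Helper name is illustrative, not from the PR.
import json

def build_stream_request(assistant_id: str, query: str) -> str:
    """Return a /runs/stream request body as a JSON string."""
    payload = {
        "assistant_id": assistant_id,
        "input": [
            {
                "content": query,
                "additional_kwargs": {},
                "type": "human",
                "example": False,
            }
        ],
    }
    return json.dumps(payload, indent=2)
```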

@danorlando danorlando changed the title 37 self reflection Implement Corrective RAG and Implement Agent State Registry Nov 22, 2024
@danorlando danorlando changed the title Implement Corrective RAG and Implement Agent State Registry Implement Corrective RAG and Agent State Registry Nov 22, 2024
@danorlando danorlando marked this pull request as ready for review December 4, 2024 16:32
@danorlando danorlando changed the title Implement Corrective RAG and Agent State Registry Implement Corrective RAG and Agent State Registry, Re-implement Standard RAG Dec 4, 2024
krishokr commented Dec 15, 2024

[screenshot of the error]

Getting this error when trying to run: python debug_crag_agent.py --query "<your_query>"

I was also unsure what to include in the query (my document was some random article I pulled about how to decide which running plan to choose) - maybe I don't fully understand. Does the CRAG architecture simply grade other agents, or is it both grading and returning a response?

Given the above, what kind of a query would make sense for the CRAG agent?
