Add RAG support for man page retrieval #33

Open: ncoop57 wants to merge 7 commits into main

ncoop57 commented Feb 8, 2025

This PR adds Retrieval Augmented Generation (RAG) support to enhance ShellSage's responses by retrieving relevant man pages.

Key changes:

  • Added RAG functionality using LanceDB and sentence transformers
  • Created new rag.py module for man page indexing and retrieval
  • Added CLI command 'ssage_index' to build the man page vector database
  • Modified core.py to optionally include retrieved man pages in prompts
  • Added new dependencies in settings.ini under 'rag_requirements'
  • Updated config to support RAG options (use_retrieval, retrieve_limit)
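
As a rough illustration of the indexing and retrieval flow listed above, here is a minimal, dependency-free sketch. It is not the code in rag.py: the bag-of-words `embed` function is a toy stand-in for a sentence-transformer model, the in-memory list stands in for the LanceDB table, and all function names and the sample man page text are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_sentences(text, max_sentences=2):
    """Split text into chunks of at most max_sentences sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return [" ".join(sentences[i:i + max_sentences])
            for i in range(0, len(sentences), max_sentences)]


def embed(text):
    """Toy bag-of-words vector; a real index would use a sentence-transformer model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def search(query, index, limit=2, threshold=0.1):
    """Return up to `limit` (score, package, chunk) hits scoring at least `threshold`."""
    q = embed(query)
    scored = [(cosine(q, vec), name, chunk) for name, chunk, vec in index]
    hits = [h for h in scored if h[0] >= threshold]
    return sorted(hits, key=lambda h: h[0], reverse=True)[:limit]


# Hypothetical man page excerpts, used only for this illustration.
man_pages = {
    "tar": "tar creates archive files. Use tar -x to extract an archive.",
    "grep": "grep searches files for lines matching a pattern. "
            "Use grep -r to search directories.",
}

# Index step: one entry per chunk, keeping the package name with each chunk.
index = [(name, chunk, embed(chunk))
         for name, text in man_pages.items()
         for chunk in chunk_sentences(text, max_sentences=1)]

results = search("extract an archive", index)
```

Each hit keeps the package name next to the chunk, so the retrieved text can be attributed to its man page when it is passed on to the prompt.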

To use RAG:

  1. Install with 'pip install shell_sage[rag]'
  2. Run 'ssage_index' to build the vector database
  3. Use 'ssage --use-retrieval' to enable RAG in queries

The retrieved man pages are added to the prompt context to help the LLM provide more accurate command-line assistance.
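
A minimal sketch of how retrieved pages might be folded into the prompt is shown below. The option names `use_retrieval` and `retrieve_limit` come from this PR; the `build_prompt` function, the `<man_page>` wrapper, and the sample documents are assumptions for illustration and may not match what core.py actually does.

```python
def build_prompt(query, retrieved, use_retrieval=False, retrieve_limit=3):
    """Prepend up to retrieve_limit retrieved man pages to the user query.

    `retrieved` is a list of (package_name, text) pairs, best match first.
    The wrapper format is a hypothetical choice, not ShellSage's real one.
    """
    if not use_retrieval or not retrieved:
        return query  # retrieval is optional: leave the prompt unchanged
    docs = retrieved[:retrieve_limit]
    context = "\n\n".join(
        f'<man_page name="{name}">\n{text}\n</man_page>' for name, text in docs
    )
    return f"{context}\n\n{query}"


# Hypothetical retrieved documents for illustration.
docs = [("tar", "Use tar -x to extract an archive."),
        ("grep", "Use grep -r to search directories.")]

plain = build_prompt("How do I unpack a .tar file?", docs)
augmented = build_prompt("How do I unpack a .tar file?", docs,
                         use_retrieval=True, retrieve_limit=1)
```

With retrieval disabled the query is returned untouched, which keeps the feature strictly opt-in.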

ncoop57 added 6 commits January 31, 2025 00:32
- Add LanceDB vector store for man pages
- Implement sentence chunking and embedding
- Add search functionality using cosine similarity
- Add optional RAG dependencies (lancedb, chonkie, sentence_transformers)
- Create RAG functionality to search and retrieve relevant man pages
- Add --retrieve flag to CLI for enabling RAG features
- Move database to ~/.cache/shell_sage/db
- Add search threshold and limit parameters
- Clean up code and remove unused cells
- Lower default similarity threshold for better results
- Add package name to retrieved docs
- Add ssage_index CLI command
- Clean up notebook organization
- Add hybrid search combining text and vector similarity
- Implement LinearCombinationReranker for better results
- Rename retrieve flag to use_retrieval for consistency
- Clean up code and improve progress reporting
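
The idea behind combining text and vector similarity can be sketched as a weighted sum of the two scores. This is a simplified illustration, not LanceDB's LinearCombinationReranker: the function name, the default weight, the assumption that both scores are already normalized to [0, 1], and the sample scores are all hypothetical.

```python
def linear_combination(vector_scores, text_scores, weight=0.7):
    """Rank documents by weight * vector score + (1 - weight) * text score.

    Both inputs map a document name to a score assumed to lie in [0, 1].
    A document missing from one result set contributes 0.0 for that side.
    """
    names = set(vector_scores) | set(text_scores)
    combined = {
        name: weight * vector_scores.get(name, 0.0)
        + (1 - weight) * text_scores.get(name, 0.0)
        for name in names
    }
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical scores for illustration only.
vector_scores = {"tar": 0.9, "gzip": 0.4}
text_scores = {"gzip": 1.0, "zip": 0.5}

vector_heavy = [name for name, _ in linear_combination(vector_scores, text_scores, weight=0.7)]
text_heavy = [name for name, _ in linear_combination(vector_scores, text_scores, weight=0.3)]
```

Shifting the weight toward the text score promotes exact keyword matches such as command names, while a higher weight favors semantically similar pages.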

gitnotebooks bot commented Feb 8, 2025

Found 3 changed notebooks. Review the changes at https://app.gitnotebooks.com/AnswerDotAI/shell_sage/pull/33
