This is an interactive Python demo of the Rocchio Algorithm for relevance feedback in information retrieval. It uses a collection of movie plot summaries as the document corpus and shows how search results improve over multiple rounds of feedback.
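The Rocchio Algorithm is a classic information retrieval method that refines a search query using user feedback about which results are relevant and which are not. It moves the query vector toward the centroid of the relevant documents and away from the centroid of the non-relevant ones: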
\[ \vec{q}_{\text{new}} = \alpha \vec{q} + \frac{\beta}{|D_r|} \sum_{\vec{d}_r \in D_r} \vec{d}_r - \frac{\gamma}{|D_{nr}|} \sum_{\vec{d}_{nr} \in D_{nr}} \vec{d}_{nr} \]
Where:
- \( \vec{q} \): Original query vector
- \( D_r \): Relevant documents
- \( D_{nr} \): Irrelevant documents
- \( \alpha, \beta, \gamma \): Weights (default: 1.0, 0.75, 0.25)
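
The update can be sketched in a few lines of plain Python. This is a minimal illustration of the formula, not necessarily how the demo implements it: the function names `centroid` and `rocchio_update`, the use of plain lists as term-weight vectors, and the choice to skip a feedback term when its document set is empty are all assumptions made for this example.

```python
def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(component) / n for component in zip(*vectors)]


def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.25):
    """Return alpha*q + beta*centroid(D_r) - gamma*centroid(D_nr).

    Hypothetical helper: an empty feedback set contributes nothing,
    which also avoids dividing by zero.
    """
    new_query = [alpha * w for w in query]
    if relevant:
        new_query = [q + beta * c
                     for q, c in zip(new_query, centroid(relevant))]
    if non_relevant:
        new_query = [q - gamma * c
                     for q, c in zip(new_query, centroid(non_relevant))]
    return new_query


# Toy three-term vocabulary: the query mentions only term 0,
# relevant documents mention term 1, the irrelevant one mentions term 2.
q = [1.0, 0.0, 0.0]
relevant_docs = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
irrelevant_docs = [[0.0, 0.0, 1.0]]

q_new = rocchio_update(q, relevant_docs, irrelevant_docs)
print(q_new)  # [1.0, 0.75, -0.25]
```

With the default weights, the query keeps its original term, gains weight on the term shared by the relevant documents, and receives a negative weight on the term from the irrelevant document. Implementations often clip such negative weights to zero before running the next search; this sketch leaves them untouched so that it matches the formula exactly.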