Deeno Search Engine


The search project I've always wanted to work on.

Deeno is a search engine project that I am just beginning to undertake. The goal is to understand what it takes to build search, so I'm building everything from scratch: the UI, the microservices, and the data pipelines.

When it's complete, I picture it offering search over the entire Wikipedia dataset, powered by microservices and Spark jobs written from scratch.

This project is built to have three components:

  1. Web interface built with Angular 15.
  2. Microservices built with the Spring framework.
  3. Indexers built using Apache Spark that update a Redis cluster.

I work on this when I have time off from classes and work, so it can get quiet here at times, but it's one step at a time.

The plan:

  1. Deploy a simple inverted-index-based retrieval system on the cloud.
  2. Graduate to ranked retrieval.
  3. Move to vector space retrieval using deep learning representations.
  4. Integrate question answering ability using language models.

I'm at step 1 now, and once the infrastructure is up and running, things should really accelerate. Stay tuned!
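
As a rough illustration of what step 1 involves (this is not code from the repository), a minimal in-memory inverted index with boolean AND retrieval might look like the Python sketch below. The function names `tokenize`, `build_index`, and `search` and the toy corpus are assumptions made for the example; the project itself plans to build its index with Spark and keep it in Redis.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    # Look up each term's posting set; unknown terms yield an empty set.
    postings = [index.get(term, set()) for term in terms]
    return sorted(set.intersection(*postings))


# Hypothetical toy corpus standing in for Wikipedia articles.
docs = {
    1: "Apache Spark is a cluster computing framework",
    2: "Redis is an in-memory data store",
    3: "Spark jobs can write results to a Redis cluster",
}
index = build_index(docs)
print(search(index, "spark cluster"))  # [1, 3]
```

Ranked retrieval (step 2) would replace the plain set intersection with a scoring function over the same postings, which is why the inverted index comes first.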

Setting up the infrastructure

I've started configuring the infrastructure for this project on Google Cloud as follows:

Deeno Architecture.png

With this in place, I can build individual containers like so:

gcloud builds submit --tag [IMAGE] /Users/cksash/Documents/proj/search/api/flask-aisearch

Deploy and run them individually if I wish:

gcloud run deploy flask-aisearch --image [IMAGE]

And run the entire project in the correct order (defined by the dependencies in service.yaml):

gcloud run services replace service.yaml