Vheektoh/NLP-similarity-search-using-embeddings

NLP similarity search using embeddings and Faiss

Overview

This project shows how some basic but important text preprocessing steps are applied to data using the regular expression library. Then, using embeddings, the key of each word is hashed to an ID so that similarity searches can be carried out much like dictionary lookups. The transformer from Hugging Face played a big role in the project: it was used to get the embeddings of the sentences, i.e., the word meanings.
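The flow described above (clean the text, embed it, store each vector under an ID, then search) can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual code: `toy_embed` is a hypothetical hashed bag-of-words stand-in for the Hugging Face transformer embeddings, `IdIndex` is a hypothetical brute-force stand-in for a Faiss index, and the cleaning rules and sample entries are assumptions made up for the example.

```python
import math
import re
import zlib


def clean_text(text):
    """Lowercase, replace non-letters with spaces, and collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def toy_embed(text, dim=256):
    """Hypothetical stand-in for a transformer embedding: hashed bag of words."""
    vec = [0.0] * dim
    for token in clean_text(text).split():
        # crc32 is deterministic across runs, unlike the built-in hash() for str.
        vec[zlib.crc32(token.encode("utf-8")) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class IdIndex:
    """Brute-force vector index keyed by integer IDs (stand-in for a Faiss index)."""

    def __init__(self):
        self._vectors = {}

    def add_with_ids(self, vectors, ids):
        """Store each vector under its ID."""
        for vec, item_id in zip(vectors, ids):
            self._vectors[item_id] = vec

    def search(self, query, k=1):
        """Return the k (id, score) pairs with the highest dot product."""
        scored = [
            (item_id, sum(q * v for q, v in zip(query, vec)))
            for item_id, vec in self._vectors.items()
        ]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


# Made-up sample entries: ID -> "word: meaning".
meanings = {
    101: "apple: a round fruit that grows on trees",
    102: "river: a large natural stream of water",
    103: "violin: a string instrument played with a bow",
}

index = IdIndex()
index.add_with_ids([toy_embed(m) for m in meanings.values()], list(meanings))

best_id, score = index.search(toy_embed("A fruit that grows on trees?"), k=1)[0]
print(best_id, meanings[best_id])
```

Because the vectors are normalised, the dot product here equals cosine similarity. In a real setup the toy embedding would be replaced by model embeddings and the brute-force loop by a Faiss index, which is built to scale to far more vectors.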

Data Used

The data used for this project is CSV data containing words and their meanings for the letters A-Z, which was joined together using the os library before the preprocessing stages.
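One way the joining step might look is sketched below. It assumes the data is split into several CSV files in one folder that all share the same header row; the folder layout, the function name `combine_csv_files`, and the column names are hypothetical and not taken from the project.

```python
import csv
import os


def combine_csv_files(folder, output_path):
    """Concatenate every .csv file in `folder` into one file with a single header.

    Returns the number of data rows written.
    """
    rows_written = 0
    header_written = False
    output_abs = os.path.abspath(output_path)
    # List the inputs before the output file is created.
    names = sorted(os.listdir(folder))

    with open(output_path, "w", newline="", encoding="utf-8") as out_file:
        writer = csv.writer(out_file)
        for name in names:
            path = os.path.join(folder, name)
            # Skip non-CSV files and the output file itself.
            if not name.lower().endswith(".csv"):
                continue
            if os.path.abspath(path) == output_abs:
                continue
            with open(path, newline="", encoding="utf-8") as in_file:
                reader = csv.reader(in_file)
                header = next(reader, None)
                if header is None:
                    continue  # empty file
                if not header_written:
                    writer.writerow(header)
                    header_written = True
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

A call such as `combine_csv_files("data", "combined.csv")` would then produce a single file ready for the preprocessing stages, with `data` and `combined.csv` standing in for whatever paths are actually used.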

Real-Life Applications

  • Information retrieval
  • Text classification systems
  • Recommendation systems for products and services
  • Document clustering

Summary

This project was modeled on the similarity search offered by the Cohere library. It aimed to make use of several important libraries and to walk through the steps involved in preprocessing dirty data of this kind. Similar steps would apply wherever this form of search is needed.
