NLP similarity search using embeddings and Faiss

Overview

This project shows how basic but important text preprocessing steps are applied to raw data using the regular expression library. Each word's meaning is then turned into an embedding, and the key of each word is hashed to an ID, so that a similarity search over the embeddings behaves like a dictionary lookup. A transformer model from Hugging Face plays a central role in the project: it produces the embeddings of the sentences, i.e., the word meanings.

Data Used

The data used for this project is a dictionary in CSV form: words and their meanings for the letters A to Z, joined together using the os library before the preprocessing stages.
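The joining step might look something like the sketch below. It assumes the dictionary is stored as several CSV files in one folder and that the files have no header row. Neither detail is confirmed here, and `join_csv_files` is a hypothetical helper name.

```python
import csv
import os


def join_csv_files(folder, output_path):
    """Concatenate every .csv file in `folder` into `output_path`.

    Returns the number of rows written. Files are read in sorted name order.
    """
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        for name in sorted(os.listdir(folder)):
            path = os.path.join(folder, name)
            if not name.lower().endswith(".csv"):
                continue
            # Skip the output file itself if it lives in the same folder.
            if os.path.abspath(path) == os.path.abspath(output_path):
                continue
            with open(path, newline="", encoding="utf-8") as src:
                for row in csv.reader(src):
                    if row:  # ignore blank lines
                        writer.writerow(row)
                        rows_written += 1
    return rows_written
```

If the files do carry a header row, it would need to be skipped for every file after the first.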

Applications in Real Life

Information retrieval
Text classification systems
Recommendation systems for products and services
Document clustering

Summary

This project follows an approach similar to the similarity search offered by the Cohere library. It aims to make use of several important libraries and to walk through the preprocessing steps needed for such dirty data. Similar steps would apply wherever a search of this kind is needed.
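As an example of the kind of regular expression cleaning mentioned above, here is a minimal sketch. The specific rules (lowercasing, stripping HTML-like tags, dropping punctuation, collapsing whitespace) are assumptions about what "dirty" means for this data, not the notebook's exact steps, and `clean_text` is a hypothetical name.

```python
import re


def clean_text(text):
    """Normalise a raw dictionary entry before it is embedded."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)        # drop HTML-like tags
    text = re.sub(r"[^a-z0-9\s']", " ", text)   # replace punctuation and symbols
    text = re.sub(r"\s+", " ", text)            # collapse runs of whitespace
    return text.strip()


cleaned = clean_text("  Abandon (v.):  to LEAVE <b>behind</b>!! ")
```

Here `cleaned` ends up as `"abandon v to leave behind"`. Which characters to keep depends on the embedding model, since some models make use of punctuation and casing.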

Vheektoh/NLP-similarity-search-using-embeddings

Folders and files

Latest commit

History

Repository files navigation

NLP similarity search using embeddings and the Faiss

Overview

Data Used

Application in Real life

Summary

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages