Query-Engine-on-MapReudce-Results

This is a project for CPSC5330 Big Data Analytics at Seattle University. Author: Hao Li

This program uses Hadoop MapReduce to generate a TF-IDF score for each term in the target documents.
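For readers unfamiliar with TF-IDF, the computation can be sketched in plain Python. This is a minimal in-memory illustration, not the project's MapReduce code: the function name `tfidf_scores`, the whitespace tokenizer, and the exact formula (term count divided by document length, times `log(N / df)`) are all assumptions, since the weighting variant the project uses is not stated here.

```python
import math
from collections import Counter


def tfidf_scores(docs):
    """Return {(term, doc_id): score} for a {doc_id: text} mapping.

    Hypothetical helper for illustration; assumes tf = count / doc length
    and idf = log(number of docs / docs containing the term).
    """
    # "Map"-like step: count how often each term occurs in each document.
    term_counts = {
        doc_id: Counter(text.lower().split()) for doc_id, text in docs.items()
    }

    # "Reduce"-like step: count how many documents contain each term.
    doc_freq = Counter()
    for counts in term_counts.values():
        doc_freq.update(counts.keys())

    n_docs = len(docs)
    scores = {}
    for doc_id, counts in term_counts.items():
        total_terms = sum(counts.values())
        for term, count in counts.items():
            tf = count / total_terms
            idf = math.log(n_docs / doc_freq[term])
            scores[(term, doc_id)] = tf * idf
    return scores
```

With this formula, a term that appears in every document gets a score of zero, so only terms that distinguish one document from another contribute to ranking.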

The system then prompts the user for a query and uses the stored TF-IDF scores to return the top 5 most relevant documents for that query.
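One common way to rank documents against a query is to sum the TF-IDF scores of the query terms per document and keep the highest totals. The sketch below assumes that approach and assumes the scores are available as a `{(term, doc_id): score}` dictionary; the function name `top_documents` is hypothetical, and the project itself may rank differently (for example, through its HiveQL queries).

```python
import heapq
from collections import defaultdict


def top_documents(query, scores, k=5):
    """Return up to k (doc_id, total_score) pairs, best first.

    Hypothetical helper: `scores` maps (term, doc_id) to a TF-IDF score.
    """
    query_terms = set(query.lower().split())

    # Accumulate the score of every query term found in each document.
    totals = defaultdict(float)
    for (term, doc_id), score in scores.items():
        if term in query_terms:
            totals[doc_id] += score

    # Keep only the k documents with the largest accumulated score.
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])
```

Documents that contain none of the query terms never enter `totals`, so the result can hold fewer than 5 entries.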

For now, the HiveQL queries still need to be modified to generate more accurate results.

