HadoopSQL

An implementation of a SQL engine using Hadoop MapReduce

Shell- The shell.py script provides a command line interface for typing in the SQL queries. It also handles some basic parsing and gives out some appropriate error messages.

query_handler - Detailed parsing is done in the query_handler. It parses the query and calls the appropriate functions which dynamically create the MapReducecode code and it also runs them and produces the output.

MR_utils - This script contains some methods that is used for writing MapReduce code. The MapReduce code for select, project and aggregation functions exists here. It also has two other methods for checking if a give database or a table exists.

Instructions to run

The engine has been tested on Hadoop-3.2.0. It also needs a Python version 3.6+ To use the SQL engine, first start the Hadoop daemons using start-dfs.sh and start-yarn.sh.

Start the shell using python3.7 shell.py or python3.6 shell.py.

Syntax for using the engine

To create a new database

create database <database_name>

To load a table in the created database

load <database_name/table_name.csv> as <column_name1: datatype, column_name2: datatype ...>

Queries that can be run on the table

select <column_names (*)> from <database_name/table_name.csv> where <condition>
"where <condition>" is optional in above query

select count(<column_names>) | min(<column_names>) | max(<column_names>) from <database_name/table_name.csv> where <condition>

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
MR_utils.py		MR_utils.py
README.md		README.md
query_handler.py		query_handler.py
shell.py		shell.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

HadoopSQL

Instructions to run

Syntax for using the engine

To create a new database

To load a table in the created database

Queries that can be run on the table

About

Uh oh!

Releases

Packages

Languages

hritvikpatel4/HadoopSQL

Folders and files

Latest commit

History

Repository files navigation

HadoopSQL

Instructions to run

Syntax for using the engine

To create a new database

To load a table in the created database

Queries that can be run on the table

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages