# Quick Start
Once you have successfully deployed Jumbune, you can get started quickly.

Start Jumbune by executing the `bin/startWeb` script. In a browser, navigate to the Jumbune home page at `localhost:8080` if Jumbune is deployed on the same machine, or at `<Jumbune machine IP>:8080` otherwise. You may then choose one of the following options:
- Create a YAML file
- Upload a YAML
- Choose an existing YAML file from Jumbune repository
If creating a new YAML file, select the modules you wish to execute. Then fill in the details, such as the Name node, Data node, and job jar, and click "Validate". Once your YAML is successfully validated, simply click the "Run" button. You will be taken to the results page, where you will see the outcome in the form of intuitive graphs and details.
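The start-up and navigation described above can be sketched as follows. The install path and hostname are assumptions, not part of the documented steps; substitute your own values:

```shell
# Assumed install location; replace with your actual Jumbune home
JUMBUNE_HOME=/opt/jumbune

# Start the Jumbune web server
"$JUMBUNE_HOME/bin/startWeb"

# Build the UI URL: use localhost when Jumbune runs on this machine,
# otherwise use the Jumbune machine's IP
JUMBUNE_HOST=localhost
JUMBUNE_URL="http://${JUMBUNE_HOST}:8080"
echo "Open $JUMBUNE_URL in your browser"
```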

If you wish to execute the Jumbune sample examples instead, follow the steps below to get started quickly.

## Running Jumbune Sample Jobs

- Upload one of the YAMLs found in the `/examples/resources/sample yaml/` directory.
- The sample job jars are found in the `/examples/example-distribution/` directory.
### Word Count

- Upload the sample input file to HDFS using the following command (ensure that the target path does not already exist on HDFS and that the user has the appropriate permissions to put the data file on HDFS):

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/PREPROCESSED/data1 /Jumbune/Demo/input/PREPROCESSED/data1`

- Upload the sample WordCount YAML (`/examples/resources/sample yaml/WordCountSample.yaml`).
- Edit the Name node and Data node information.
- In the 'M/R Jobs' tab, select the WordCount sample jar, either by specifying its path on the Jumbune machine or by uploading it from the local machine.
- Validate and run the job.
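The upload step above can be sketched end to end. The `JUMBUNE_HOME` value and the use of `-mkdir -p` and `-ls` for verification are assumptions added for illustration, not part of the documented steps:

```shell
# Assumed Jumbune install directory; adjust to your environment
JUMBUNE_HOME=/opt/jumbune
HDFS_INPUT=/Jumbune/Demo/input/PREPROCESSED/data1

# Create the parent directory on HDFS (no-op if it already exists),
# then upload the sample data file
bin/hadoop fs -mkdir -p "$(dirname "$HDFS_INPUT")"
bin/hadoop fs -put "$JUMBUNE_HOME/examples/resources/data/PREPROCESSED/data1" "$HDFS_INPUT"

# Confirm the file landed where the sample YAML expects it
bin/hadoop fs -ls "$HDFS_INPUT"
```

The same pattern applies to the other sample jobs below; only the local data file and HDFS destination change.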
### Movie Rating (for profiling)

- Upload the sample input file to HDFS using the following command (ensure that the target path does not already exist on HDFS and that the user has the appropriate permissions to put the data file on HDFS):

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/u.data /Jumbune/examples/regex`

- Upload the sample YAML (`/examples/resources/sample yaml/MovieRatingSample.yaml`).
- In the 'M/R Jobs' tab, select the movie rating sample jar, either by specifying its path on the Jumbune machine or by uploading it from the local machine.
- Edit the Name node and Data node information.
- Validate and run the job.
### Bank Defaulters (for debugging)

- Upload the sample input file to HDFS using the following command (ensure that the target path does not already exist on HDFS and that the user has the appropriate permissions to put the data file on HDFS):

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/defaulterlistdata.txt /Jumbune/examples/defaulter`

- Upload the sample YAML (`/examples/resources/sample yaml/BankDefaultersSample.yaml`).
- In the 'M/R Jobs' tab, select the bank defaulters sample jar, either by specifying its path on the Jumbune machine or by uploading it from the local machine.
- Edit the Name node and Data node information.
- Validate and run the job.
### US Region Portout (for debugging)

- Upload the sample input files to HDFS using the following commands (ensure that the target paths do not already exist on HDFS and that the user has the appropriate permissions to put the data files on HDFS):

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/PREPROCESSED/data1 /Jumbune/Demo/input/PREPROCESSED/data1`

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/PREPROCESSED/data2 /Jumbune/Demo/input/PREPROCESSED/data2`

- Upload the sample YAML (`/examples/resources/sample yaml/USRegionPortOutSample.yaml`).
- In the 'M/R Jobs' tab, select the US region portout sample jar, either by specifying its path on the Jumbune machine or by uploading it from the local machine.
- Edit the Name node and Data node information.
- Validate and run the job.
### Clickstream Analysis (for debugging)

- Upload the sample input file to HDFS using the following command (ensure that the target path does not already exist on HDFS and that the user has the appropriate permissions to put the data file on HDFS):

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/clickstream.tsv /Jumbune/clickstreamdata`

- Upload the sample YAML (`/examples/resources/sample yaml/ClickstreamSample.yaml`).
- In the 'M/R Jobs' tab, select the clickstream sample jar, either by specifying its path on the Jumbune machine or by uploading it from the local machine.
- Edit the Name node and Data node information.
- Validate and run the job.
### Sensor Data (for HDFS validation)

- Upload the sample input file to HDFS using the following command (ensure that the target path does not already exist on HDFS and that the user has the appropriate permissions to put the data file on HDFS):

  `bin/hadoop fs -put <Jumbune_Home>/examples/resources/data/sensor_data /Jumbune/sensordata`

- Upload the sample YAML (`/examples/resources/sample yaml/SensorDataSample.yaml`).
- Edit the Name node and Data node information.
- Validate and run the job.
NOTE:
- The examples use `GenericOptionsParser`, so do not provide class name information; select the 'Job Class defined in the Jar Manifest' option instead.
- Ensure that the output path specified in the YAML file does not already exist on HDFS.
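One way to honour the output-path note is to check for and remove the path before running. This is a sketch, and `OUTPUT_PATH` below is a hypothetical example, not a path taken from any sample YAML; use the output path from your own YAML file, and be certain it holds nothing you need before removing it:

```shell
# Hypothetical output path; replace with the one from your YAML file
OUTPUT_PATH=/Jumbune/Demo/output

# "hadoop fs -test -d" exits 0 when the directory exists on HDFS;
# if it does, remove it recursively so the job does not fail
if bin/hadoop fs -test -d "$OUTPUT_PATH" 2>/dev/null; then
  bin/hadoop fs -rm -r "$OUTPUT_PATH"
fi
```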