Skip to content

sequenceiq/docker-kylin

Folders and files

NameName
Last commit message
Last commit date

Latest commit

author
Richard Doktorics
Feb 15, 2016
03f9fa7 · Feb 15, 2016

History

46 Commits
Nov 10, 2014
Oct 27, 2014
Feb 13, 2016
Jul 21, 2015
Jul 21, 2015
Jul 23, 2015
Jul 23, 2015
Jan 25, 2015
Jul 22, 2015
Jul 22, 2015
Jul 23, 2015

Repository files navigation

##Docker Kylin

Kylin is an open source distributed Analytical Engine from eBay to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop to support TB to PB size analysis.

This is a Docker container for Kylin to help you quick start with. In order to start first pull the container from the Docker repository.

docker pull sequenceiq/kylin:0.7.2

We have provided a few helper functions for your convenience.

wget https://raw.githubusercontent.com/sequenceiq/docker-kylin/master/ambari-functions
source ambari-functions

Once the container is pulled you are ready to start playing with Kylin. Use the following functions from ambari-functions (make sure you source it).

 kylin-deploy-cluster 1

(This will start a hadoop cluster with 1 node; It is okay to start the cluster with more than 1 nodes; Kylin will run in the first node 'amb0')

Once the container is up and running you can reach out to the Kylin UI. First you will need to find the IP address of the container:

docker inspect -f '{{ .NetworkSettings.IPAddress }}' amb0

Once you know the IP address you can use the Kylin UI:

http://<container_ip>:7070/kylin (username: ADMIN, password: KYLIN, select a sample project like "learn_kylin" from the project list).

There are a sample Cube has been defined for you, but haven't been built; Select a Cube and then click "Action -> Build", pick up a date later than 2014-01-01 as the end date; After the job be submitted, watch and track the job progress in the "Jobs" tab until it finished successfully; Once the Cube's status is changed to "READY", you can write some SQL queries in the "Query" tab, for example: select part_dt, sum(price) as total_selled, count(distinct seller_id) as sellers from kylin_sales group by part_dt order by part_dt