## Simple Dockerized AsterixDB

This project aims to make it easy to get started with [AsterixDB](https://asterixdb.apache.org/). It is based on Docker and [Docker Compose](https://docs.docker.com/compose/). Currently, the following features are supported:

* The [sample cluster](https://asterixdb.apache.org/docs/0.9.6/ncservice.html#quickstart) (which consists of one cluster controller and two node controllers) in a single container
* HDFS (with the name node and data nodes in their own respective containers)
| 7 | + |
### Starting AsterixDB without HDFS

If you do not need HDFS, you can use the Docker image without `docker-compose`:

```bash
docker run --rm -it -p 19002:19002 -p 19006:19006 ingomuellernet/asterixdb
```
| 15 | + |
### Starting AsterixDB with HDFS

The following should be enough to bring up all required services:

```bash
docker-compose up
```
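To keep the services running in the background, Compose can also be started in detached mode and checked afterwards. A minimal sketch (the service name `asterixdb` is an assumption based on this project's layout; check your compose file for the actual names):

```bash
# Start all services in the background
docker-compose up -d

# List the services and their current state
docker-compose ps

# Follow the logs of one service (service name assumed; adapt to your compose file)
docker-compose logs -f asterixdb
```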
| 23 | + |
### Varying the Number of HDFS Data Nodes

To change the number of HDFS data nodes, use the `--scale` flag of `docker-compose`:

```bash
docker-compose up --scale datanode=3
```
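One way to confirm that all data nodes have registered with the name node is to ask HDFS for a cluster report. A sketch, assuming the name node container is called `docker-asterixdb_namenode_1` (run `docker ps` to find the name on your machine):

```bash
# Print the cluster report; each live data node appears as its own "Name:" entry
docker exec -it docker-asterixdb_namenode_1 hdfs dfsadmin -report

# Or just count the registered data nodes
docker exec -it docker-asterixdb_namenode_1 hdfs dfsadmin -report | grep -c "^Name:"
```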
| 31 | + |
### Building the Image Locally

The above command uses a pre-built [Docker image](https://hub.docker.com/r/ingomuellernet/asterixdb). If you want the image to be built locally, run the following instead:

```bash
docker-compose --file docker-compose-local.yml up
```

If you are behind a corporate firewall, you have to configure Maven (which is used to build part of AsterixDB) as follows before running the above command:

```bash
export MAVEN_OPTS="-Dhttp.proxyHost=your.proxy.com -Dhttp.proxyPort=3128 -Dhttps.proxyHost=your.proxy.com -Dhttps.proxyPort=3128"
```
| 45 | + |
### Uploading Data to HDFS

The `data/` folder is mounted into the HDFS name node container, from where you can upload files using the HDFS client in that container (`docker-asterixdb_namenode_1` may have a different name on your machine; run `docker ps` to find out):

```bash
docker exec -it docker-asterixdb_namenode_1 hadoop fs -mkdir /dataset
docker exec -it docker-asterixdb_namenode_1 hadoop fs -put /data/file.parquet /dataset/
docker exec -it docker-asterixdb_namenode_1 hadoop fs -ls /dataset
```
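To double-check that the upload succeeded, the same client can report the size of each file in the target directory (again assuming the container name `docker-asterixdb_namenode_1`):

```bash
# Show the size of each uploaded file in human-readable units
docker exec -it docker-asterixdb_namenode_1 hadoop fs -du -h /dataset
```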
| 55 | + |
### Running Queries

Once started, you should be able to use the server by accessing the web interface at http://localhost:19006. Alternatively, the [REST API](https://ci.apache.org/projects/asterixdb/api.html) is accessible on the standard port (19002).
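For scripted access, statements can be posted to the query service with `curl`. A minimal sketch, assuming the default query endpoint `/query/service` on port 19002:

```bash
# POST a SQL++ statement to the query service; the response is JSON
curl -s http://localhost:19002/query/service \
     --data-urlencode 'statement=SELECT "hello" AS greeting;'
```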
| 59 | + |
### Creating an External Table

Suppose you have the following file `test.json`:

```json
{"s": "hello world", "i": 42}
```

Upload it to `/dataset/test.json` on HDFS as described above. Then run the following in the web interface:

```SQL
CREATE TYPE t1 AS OPEN {};

CREATE EXTERNAL DATASET Test(t1)
USING HDFS
  (("hdfs"="hdfs://namenode:8020"),
   ("path"="/dataset/test.json"),
   ("input-format"="text-input-format"),
   ("format"="json"));
```
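Once the dataset exists, it can be queried from the shell as well, for example via the REST query service (assuming the default endpoint on port 19002):

```bash
# Return every record of the external dataset as plain JSON values
curl -s http://localhost:19002/query/service \
     --data-urlencode 'statement=SELECT VALUE t FROM Test t;'
```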