Commit 02bca3d

Add README and docker-compose file for local building.
1 parent 3b12bbd commit 02bca3d

2 files changed

+124
-0
lines changed

README.md

+80
@@ -0,0 +1,80 @@
## Simple Dockerized AsterixDB

This project aims to make it easy to get started with [AsterixDB](https://asterixdb.apache.org/). It is based on Docker and [Docker Compose](https://docs.docker.com/compose/). Currently, the following features are supported:

* The [sample cluster](https://asterixdb.apache.org/docs/0.9.6/ncservice.html#quickstart) (which consists of one cluster controller and two node controllers) in a single container
* HDFS (with the name node and the data nodes in their own respective containers)

### Starting AsterixDB without HDFS

If you do not need HDFS, you can use the docker image without `docker-compose`:

```bash
docker run --rm -it -p 19002:19002 -p 19006:19006 ingomuellernet/asterixdb
```

### Starting AsterixDB with HDFS

The following should be enough to bring up all required services:

```bash
docker-compose up
```

### Varying the number of HDFS Data Nodes

To change the number of HDFS data nodes, use the `--scale` flag of docker-compose:

```bash
docker-compose up --scale datanode=3
```

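To check that the additional data nodes have registered with the name node, one option is to ask HDFS for its cluster report. The following is a minimal sketch, assuming the name node container is called `docker-asterixdb_namenode_1` (yours may differ; see `docker ps`) and that the report contains a line of the form `Live datanodes (N):`. `live_datanodes` and `NAMENODE` are hypothetical names used only for illustration.

```bash
# Assumption: name of the name node container; check `docker ps` on your machine.
NAMENODE="${NAMENODE:-docker-asterixdb_namenode_1}"

# Hypothetical helper: prints the number of live HDFS data nodes.
live_datanodes() {
  # `hdfs dfsadmin -report` prints a summary that is assumed to include a line
  # such as "Live datanodes (3):"; extract the number between the parentheses.
  docker exec "$NAMENODE" hdfs dfsadmin -report \
    | sed -n 's/^Live datanodes (\([0-9][0-9]*\)):.*/\1/p'
}
```

After starting the services with `--scale datanode=3`, calling `live_datanodes` should print `3` once all data nodes have joined.
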
### Building the Image Locally

The commands above use a pre-built [Docker image](https://hub.docker.com/r/ingomuellernet/asterixdb). If you want the image to be built locally, run the following instead:

```bash
docker-compose --file docker-compose-local.yml up
```

If you are behind a corporate firewall, you will have to configure Maven (which is used to build part of the image) as follows before running the command above:

```bash
export MAVEN_OPTS="-Dhttp.proxyHost=your.proxy.com -Dhttp.proxyPort=3128 -Dhttps.proxyHost=your.proxy.com -Dhttps.proxyPort=3128"
```

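Since the proxy host and port each appear twice in `MAVEN_OPTS`, it can be less error-prone to assemble the value from two arguments. The following is a minimal sketch; `maven_proxy_opts` is a hypothetical helper, and `your.proxy.com` and `3128` are placeholders for your own proxy settings.

```bash
# Hypothetical helper: builds the Maven proxy settings for HTTP and HTTPS
# from a single host and port.
maven_proxy_opts() {
  host="$1"
  port="$2"
  printf '%s' "-Dhttp.proxyHost=$host -Dhttp.proxyPort=$port -Dhttps.proxyHost=$host -Dhttps.proxyPort=$port"
}

# Export the value so that docker-compose can pass it on as a build argument.
export MAVEN_OPTS="$(maven_proxy_opts your.proxy.com 3128)"
```
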
### Uploading Data to HDFS

The `data/` folder is mounted into the HDFS namenode container, from where you can upload its contents using the HDFS client in that container (`docker-asterixdb_namenode_1` may have a different name on your machine; run `docker ps` to find out):

```bash
docker exec -it docker-asterixdb_namenode_1 hadoop fs -mkdir /dataset
docker exec -it docker-asterixdb_namenode_1 hadoop fs -put /data/file.parquet /dataset/
docker exec -it docker-asterixdb_namenode_1 hadoop fs -ls /dataset
```

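If you would rather not look up the container name by hand, `docker ps` can filter by name. The following is a minimal sketch, assuming that exactly one running container has `namenode` in its name; `namenode_container` is a hypothetical helper.

```bash
# Hypothetical helper: prints the name of the first running container
# whose name contains "namenode".
namenode_container() {
  docker ps --filter "name=namenode" --format '{{.Names}}' | head -n 1
}

# Example usage (requires the services to be running):
#   docker exec -it "$(namenode_container)" hadoop fs -ls /dataset
```
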
### Running Queries

Once started, you should be able to use the server by accessing the web interface at http://localhost:19006. Alternatively, the [REST API](https://ci.apache.org/projects/asterixdb/api.html) is accessible on the standard port (19002).

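For scripted access, a statement can also be sent to the REST API with `curl`. The following is a minimal sketch, assuming the query service is reachable at `/query/service` on port 19002 of `localhost`; `run_query` and `ASTERIX_URL` are hypothetical names used only for illustration.

```bash
# Assumption: URL of the query service; adjust host and port to your setup.
ASTERIX_URL="${ASTERIX_URL:-http://localhost:19002/query/service}"

# Hypothetical helper: POSTs one statement and prints the JSON response.
run_query() {
  curl --silent --show-error --data-urlencode "statement=$1" "$ASTERIX_URL"
}

# Example usage (requires a running server):
#   run_query 'SELECT VALUE 1;'
```
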
### Creating an External Table

Suppose you have the following file `test.json`:

```json
{"s": "hello world", "i": 42}
```

Upload it to `/dataset/test.json` on HDFS as described above. Then run the following in the web interface:

```SQL
CREATE TYPE t1 AS OPEN {};

CREATE EXTERNAL DATASET Test(t1)
USING HDFS
(("hdfs"="hdfs://namenode:8020"),
 ("path"="/dataset/test.json"),
 ("input-format"="text-input-format"),
 ("format"="json"));
```

docker-compose-local.yml

+44
@@ -0,0 +1,44 @@
version: "3.7"
services:
  asterixdb:
    build:
      context: ./image
      args:
        - MAVEN_OPTS=${MAVEN_OPTS}
    depends_on:
      - datanode
      - namenode
    ports:
      - "19002:19002"
      - "19006:19006"
    command: ["wait-for-it", "namenode:8020", "-t", "20",
              "--", "/opt/asterixdb-entrypoint.sh"]

  namenode:
    build:
      context: ./image
      args:
        - MAVEN_OPTS=${MAVEN_OPTS}
    volumes:
      - ./data/:/data/:ro
    expose:
      - "8020"
    ports:
      - "9870:9870"
    command: ["hdfs", "namenode"]

  datanode:
    build:
      context: ./image
      args:
        - MAVEN_OPTS=${MAVEN_OPTS}
    depends_on:
      - namenode
    ports:
      - "36000-36999:9864"
    expose:
      - "9866"
      - "9867"
      - "46759"
    command: ["wait-for-it", "namenode:8020", "-t", "10",
              "--", "hdfs", "datanode"]
