Commit 35be659

Add documentation for file-based metastore
1 parent a6d82fe commit 35be659

3 files changed

+81
-0
lines changed

presto-docs/src/main/sphinx/connector.rst

+1
@@ -17,6 +17,7 @@ from different data sources.
1717
connector/deltalake
1818
connector/druid
1919
connector/elasticsearch
20+
connector/file-based-metastore
2021
connector/googlesheets
2122
connector/hana
2223
connector/hive
@@ -0,0 +1,78 @@
1+
====================
2+
File-Based Metastore
3+
====================
4+
5+
.. contents::
6+
:local:
7+
:backlinks: none
8+
:depth: 1
9+
10+
Overview
11+
^^^^^^^^
12+
13+
For testing or development purposes, Presto can be configured to use a filesystem
14+
directory as a Hive Metastore. This can be a directory on the local filesystem
15+
or a non-local filesystem such as Amazon S3.
16+
17+
The file-based metastore works only with the following connectors:
18+
19+
* :doc:`/connector/deltalake`
20+
* :doc:`/connector/hive`
21+
* :doc:`/connector/hudi`
22+
* :doc:`/connector/iceberg`
23+
24+
Configuring a File-Based Metastore
25+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
26+
27+
1. In ``etc/catalog/``, find the catalog properties file for the supported
28+
connector.
29+
30+
2. In the catalog properties file, set the following properties:
31+
32+
.. code-block:: none
33+
34+
hive.metastore=file
35+
hive.metastore.catalog.dir=file:///<catalog-dir>
36+
37+
Replace ``<catalog-dir>`` in the example with the path to a directory on an
38+
accessible filesystem.
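
As a concrete illustration, a Hive catalog file might look like the following
sketch. The file name ``etc/catalog/hive.properties`` and the directory
``/data/hive_data`` are assumptions, chosen to match the folder paths used in
the examples on this page; substitute your own catalog file and directory.

.. code-block:: none

    # etc/catalog/hive.properties (hypothetical file name)
    connector.name=hive-hadoop2
    # Use a filesystem directory instead of a Hive Metastore service
    hive.metastore=file
    # Assumed local directory; must be accessible to Presto
    hive.metastore.catalog.dir=file:///data/hive_data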
39+
40+
Using a File-Based Warehouse
41+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
42+
43+
Create a schema:
44+
45+
.. code-block:: none
46+
47+
CREATE SCHEMA hive.warehouse;
48+
49+
Assuming ``<catalog-dir>`` is set to ``/data/hive_data``, this query creates the folder ``/data/hive_data/warehouse``.
50+
51+
Create a table using any file format the connector supports. For example, if the
52+
Hive connector is being configured:
53+
54+
.. code-block:: none
55+
56+
CREATE TABLE hive.warehouse.orders_csv("order_name" varchar, "quantity" varchar) WITH (format = 'CSV');
57+
CREATE TABLE hive.warehouse.orders_parquet("order_name" varchar, "quantity" int) WITH (format = 'PARQUET');
58+
59+
These queries create the folders ``/data/hive_data/warehouse/orders_csv`` and
60+
``/data/hive_data/warehouse/orders_parquet``. Users can insert data into and
61+
query these tables.
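
For instance, a minimal sketch of writing a row to the Parquet table created
above and reading it back. The row values are illustrative only and do not come
from a real dataset.

.. code-block:: none

    -- Insert one illustrative row (order_name varchar, quantity int)
    INSERT INTO hive.warehouse.orders_parquet VALUES ('books', 100);

    -- Read the table back
    SELECT order_name, quantity FROM hive.warehouse.orders_parquet;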
62+
63+
Reading Existing Data Files with a File-Based Metastore
64+
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
65+
66+
To read existing data files, the metastore needs to know the file schema. File
67+
formats such as Parquet that contain the schema need no additional work, but
68+
for other file formats such as CSV, the user must do one of the following:
69+
70+
* manually specify the schema as shown in the example above
71+
* provide ``.prestoSchema`` and ``.prestoPermissions`` files
72+
73+
Once the table is created with the required schema, users can move existing
74+
data files to the table folder.
75+
76+
For example, a CSV file ``orders.csv`` with contents ``books, 100`` can be
77+
moved to ``/data/hive_data/warehouse/orders_csv`` and then queried with Presto.
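
A sketch of such a query, assuming the ``hive.warehouse.orders_csv`` table
defined earlier and that ``orders.csv`` has already been moved into the table
folder. Both columns of that table are ``varchar``, so the second statement
shows one way to obtain a numeric quantity; the ``trim`` call is a precaution in
case the space after the comma in ``books, 100`` is preserved in the value.

.. code-block:: none

    -- Read the rows exactly as stored in the CSV file
    SELECT order_name, quantity FROM hive.warehouse.orders_csv;

    -- Convert the varchar quantity to an integer, tolerating padding
    SELECT order_name, CAST(trim(quantity) AS integer) AS quantity
    FROM hive.warehouse.orders_csv;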
78+

presto-docs/src/main/sphinx/installation/deployment.rst

+2
@@ -7,6 +7,8 @@ Deploying Presto
77
:backlinks: none
88
:depth: 1
99

10+
.. _Installing Presto:
11+
1012
Installing Presto
1113
-----------------
1214
