Commit b3a734e

Merge branch 'data-graph-databricks-setup' of https://github.com/segmentio/segment-docs into data-graph-databricks-setup
2 parents a6adb0b + e0a31d6 commit b3a734e

1 file changed

+13
-8
lines changed

src/unify/linked-profiles/setup-guides/databricks-setup.md

Lines changed: 13 additions & 8 deletions
@@ -12,7 +12,7 @@ On this page, you'll learn how to connect your Databricks data warehouse to the

 ## Set up Databricks credentials

-Sign into Databricks with admin permissions to create new resources and provide the Data Graph with the necessary permissions.
+Sign in to Databricks with admin permissions to create new resources and provide the Data Graph with the necessary permissions.

 Segment assumes that you already have a workspace that includes the datasets you'd like to use for the Data Graph. Segment recommends setting up a new Service Principal user with only the permissions to access the required catalogs and schemas.

@@ -23,7 +23,7 @@ Segment recommends that you set up a new Service Principal user. If you already
 If you want to create a new Service Principal user, complete the following substeps:

 #### Substep 1: Create a new Service Principal user
-1. Log into the Databricks UI as an Admin.
+1. Log in to the Databricks UI as an Admin.
 2. Click **User Management**.
 3. Select the **Service principals** tab.
 4. Click **Add Service Principal**.
@@ -38,7 +38,7 @@ If you want to create a new Service Principal user, complete the following subst
 > If you already have a warehouse you'd like to use, you can move on to the next substep, [Substep 2: Add your Service Principal user to Warehouse User Lists](#substep-2-add-your-service-principal-user-to-warehouse-user-lists). If you need to create a new warehouse first, see the [Create a new warehouse](#create-a-new-warehouse) before completing the next substep.

 #### Substep 2: Add your Service Principal user to Warehouse User Lists
-1. Log into the Databricks UI as an Admin.
+1. Log in to the Databricks UI as an Admin.
 2. Navigate to SQL Warehouses.
 3. Select your warehouse and click **Permissions**.
 4. Add the Service Principal user and grant them “Can use” access.
@@ -59,7 +59,7 @@ To confirm that your Service Principal user has "Can use" permission:

 Segment requires write access to a catalog to create a schema for internal bookkeeping, and to store checkpoint tables for the queries that are executed.

-Segment recommends creating an empty catalog for this purpose by running the SQL below. This is also the catalog that you'll be required to specify when setting up your Databricks integration in the Segment app.
+Segment recommends creating an empty catalog for this purpose by running the following SQL. This is also the catalog that you'll be required to specify when setting up your Databricks integration in the Segment app.

 ```sql
 CREATE CATALOG IF NOT EXISTS `SEGMENT_LINKED_PROFILES_DB`;
@@ -71,17 +71,17 @@ GRANT SELECT ON CATALOG `SEGMENT_LINKED_PROFILES_DB` TO `${client_id}`;

 ### Step 3: Grant read-only access to the Profiles Sync catalog

-Run the SQL below to grant the Data Graph read-only access to the Profiles Sync catalog:
+Run the following SQL to grant the Data Graph read-only access to the Profiles Sync catalog:

 ```sql
 GRANT USAGE, SELECT, USE SCHEMA ON CATALOG `${profiles_sync_catalog}` TO `${client_id}`;
 ```

 ### Step 4: Grant read-only access to additional catalogs for the Data Graph
-Run the SQL below to grant your Service Principal user read-only access to any additional catalogs you want to use for the Data Graph:
+Run the following SQL to grant your Service Principal user read-only access to any additional catalogs you want to use for the Data Graph:

 ```sql
--- Run the SQL below for each catalog you want to use for the Segment Data Graph
+-- Run this command for each catalog you want to use for the Segment Data Graph
 GRANT USAGE, SELECT, USE SCHEMA ON CATALOG `${catalog}` TO `${client_id}`;
 ```

@@ -114,7 +114,7 @@ GRANT SELECT ON TABLE `${table_2}` TO `${client_id}`;

 ### Step 5: Validate the permissions of your Service Principal user

-Sign into the [Databricks CLI with your Client ID secret](https://docs.databricks.com/en/dev-tools/cli/authentication.html#oauth-machine-to-machine-m2m-authentication){:target="_blank”} and run the following SQL to verify the Service Principal user has the correct permissions for a given table.
+Sign in to the [Databricks CLI with your Client ID secret](https://docs.databricks.com/en/dev-tools/cli/authentication.html#oauth-machine-to-machine-m2m-authentication){:target="_blank”} and run the following SQL to verify the Service Principal user has the correct permissions for a given table.

 > success ""
 > If this command succeeds, you can view the table.
@@ -140,6 +140,11 @@ After identifying the following settings, continue setting up the Data Graph by

 ## Additional set up for warehouse permissions

+#### Create a new warehouse
+1. Log in to your workspace as an Admin in the Databricks UI.
+2. Navigate to SQL Warehouses and click **Create SQL Warehouse**.
+3. Enter a name for your warehouse, select a cluster size, and click **Create**.
+
 ### Update user access for Segment Reverse ETL catalog
 Run the following SQL if you run into an error on the Segment app indicating that the user doesn’t have sufficient privileges on an existing `_segment_reverse_etl` schema.
