Commit ffd2e76

fix(doc): Update permissions in redshift quickstart doc (#14909)
1 parent 18d3eec commit ffd2e76

1 file changed: +54 -15 lines changed
docs/quick-ingestion-guides/redshift/setup.md

Lines changed: 54 additions & 15 deletions
@@ -20,9 +20,11 @@ CREATE USER datahub WITH PASSWORD 'Datahub1234';
 
 ## Redshift Setup
 
-1. Grant the following permissions to your `datahub` user. For most users, the **minimal set** below will be sufficient:
+1. Grant the following permissions to your `datahub` user. For most users, the **recommended set** below will be sufficient:
 
-### Minimal Required Permissions (Recommended)
+### Recommended Permissions
+
+For a typical provisioned cluster with default settings:
 
 ```sql
 -- Core system access (required for lineage and usage statistics)
@@ -48,13 +50,26 @@ GRANT SELECT ON pg_catalog.pg_attrdef TO datahub;
 -- Datashare lineage (enabled by default)
 GRANT SELECT ON pg_catalog.svv_datashares TO datahub;
 
--- Choose ONE based on your Redshift type:
--- For Provisioned Clusters:
+-- Provisioned cluster materialized views
 GRANT SELECT ON pg_catalog.stv_mv_info TO datahub;
+```
+
+### Additional Permissions Based on Your Configuration
+
+**For Serverless Workgroups:**
+
+```sql
+-- Use these instead of stv_mv_info (from Provisioned section above)
+GRANT SELECT ON pg_catalog.svv_user_info TO datahub;
+GRANT SELECT ON pg_catalog.svv_mv_info TO datahub;
+```
+
+**For Shared Databases (Datashare Consumers):**
 
--- For Serverless Workgroups:
--- GRANT SELECT ON pg_catalog.svv_user_info TO datahub;
--- GRANT SELECT ON pg_catalog.svv_mv_info TO datahub;
+```sql
+-- Required when is_shared_database = True
+GRANT SELECT ON pg_catalog.svv_redshift_tables TO datahub;
+GRANT SELECT ON pg_catalog.svv_redshift_columns TO datahub;
 ```
 
 ### Data Access Permissions (Required for Profiling/Classification)
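
The configuration-dependent grants in the hunk above read as a small decision table: provisioned clusters use `stv_mv_info`, serverless workgroups use `svv_user_info` and `svv_mv_info` instead, and datashare consumers add `svv_redshift_tables` and `svv_redshift_columns`. The following is a minimal Python sketch of that selection. The function name `config_dependent_grants` and its parameters are hypothetical, and it covers only the grants visible in this hunk, not the core grants elided from the diff.

```python
from typing import List


def config_dependent_grants(
    serverless: bool = False,
    shared_database: bool = False,
    user: str = "datahub",
) -> List[str]:
    """Build the configuration-dependent GRANT statements shown in the hunk.

    Hypothetical helper for illustration; it only assembles SQL text.
    """
    # Datashare lineage is enabled by default, so this view is always included.
    views = ["svv_datashares"]
    if serverless:
        # Serverless workgroups use these instead of stv_mv_info.
        views += ["svv_user_info", "svv_mv_info"]
    else:
        # Provisioned cluster materialized views.
        views.append("stv_mv_info")
    if shared_database:
        # Needed when is_shared_database = True.
        views += ["svv_redshift_tables", "svv_redshift_columns"]
    return [f"GRANT SELECT ON pg_catalog.{view} TO {user};" for view in views]
```

For example, `print("\n".join(config_dependent_grants(serverless=True)))` would list the statements for a serverless workgroup that does not consume datashares.
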
@@ -71,22 +86,46 @@ GRANT SELECT ON ALL TABLES IN SCHEMA public TO datahub;
 GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
 
 -- For production environments (future tables/views):
--- IMPORTANT: Only works for objects created by the user running this command
+-- IMPORTANT: Default privileges only apply to objects created by the user who runs this command
+-- Option 1: If you (as admin) will create all future tables/views:
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
 ALTER DEFAULT PRIVILEGES IN SCHEMA your_schema_name GRANT SELECT ON TABLES TO datahub;
 ALTER DEFAULT PRIVILEGES IN SCHEMA your_schema_name GRANT SELECT ON VIEWS TO datahub;
---
--- Alternative: Run this periodically to catch all new objects regardless of creator:
--- GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
+
+-- Option 2: If other users will create tables/views, run this for each user:
+-- ALTER DEFAULT PRIVILEGES FOR ROLE other_user_name IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
+-- ALTER DEFAULT PRIVILEGES FOR ROLE other_user_name IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
+
+-- Option 3: For all future users (requires superuser):
+-- ALTER DEFAULT PRIVILEGES FOR ALL ROLES IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
+-- ALTER DEFAULT PRIVILEGES FOR ALL ROLES IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
 ```
 
-### Additional Permissions (Only if needed)
+:::caution Data Access vs Metadata Access
+
+**The permissions are split into two categories:**
+
+1. **System table permissions** (above) - Required for metadata extraction, lineage, and usage statistics
+2. **Data access permissions** (this section) - Required for data profiling, classification, and any feature that reads actual table content
+
+**Default privileges only apply to objects created by the user who ran the ALTER DEFAULT PRIVILEGES command.** If multiple users create tables in your schemas, you need to:
+
+1. **Run the commands as each user**, OR
+2. **Use `FOR ROLE other_user_name`** for each user who creates objects, OR
+3. **Use `FOR ALL ROLES`** (requires superuser privileges)
+
+**Common gotcha**: If User A runs `ALTER DEFAULT PRIVILEGES` and User B creates a table, DataHub won't have access to User B's table unless you used Option 2 or 3 above.
+
+**Alternative approach**: Instead of default privileges, consider using a scheduled job to periodically grant access to new tables:
 
 ```sql
--- Only if using shared databases (datashare consumers):
--- GRANT SELECT ON pg_catalog.svv_redshift_tables TO datahub;
--- GRANT SELECT ON pg_catalog.svv_redshift_columns TO datahub;
+-- Run this periodically to catch new tables
+GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
 ```
 
+:::
+
 ## Next Steps
 
 Once you've confirmed all of the above in Redshift, it's time to [move on](configuration.md) to configure the actual ingestion source within the DataHub UI.
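
The "alternative approach" added in this commit, re-granting access on a schedule instead of relying on default privileges, amounts to re-issuing one `GRANT` per schema. The following is a minimal Python sketch under stated assumptions: the helper `periodic_grant_statements` and the identifier check are hypothetical and not part of the guide. It only builds the SQL text and leaves scheduling and execution to whatever job runner and Redshift client you already use.

```python
import re
from typing import Iterable, List

# Conservative pattern for unquoted identifiers. This check is an assumption
# of the sketch, not a rule taken from the guide.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def periodic_grant_statements(
    schemas: Iterable[str], user: str = "datahub"
) -> List[str]:
    """Return one re-grant statement per schema for a scheduled job to run."""
    if not _IDENTIFIER.match(user):
        raise ValueError(f"unexpected user name: {user!r}")
    statements = []
    for schema in schemas:
        if not _IDENTIFIER.match(schema):
            raise ValueError(f"unexpected schema name: {schema!r}")
        # Same statement as in the diff; re-running it covers tables created
        # since the last run, regardless of who created them.
        statements.append(
            f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {user};"
        )
    return statements
```

Rejecting names that are not plain identifiers keeps the sketch from assembling unintended SQL when the schema list comes from configuration.
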
