docs/quick-ingestion-guides/redshift/setup.md
Lines changed: 54 additions & 15 deletions
@@ -20,9 +20,11 @@ CREATE USER datahub WITH PASSWORD 'Datahub1234';
 
 ## Redshift Setup
 
-1. Grant the following permissions to your `datahub` user. For most users, the **minimal set** below will be sufficient:
+1. Grant the following permissions to your `datahub` user. For most users, the **recommended set** below will be sufficient:
 
-### Minimal Required Permissions (Recommended)
+### Recommended Permissions
+
+For a typical provisioned cluster with default settings:
 
 ```sql
 -- Core system access (required for lineage and usage statistics)
@@ -48,13 +50,26 @@ GRANT SELECT ON pg_catalog.pg_attrdef TO datahub;
 -- Datashare lineage (enabled by default)
 GRANT SELECT ON pg_catalog.svv_datashares TO datahub;
 
--- Choose ONE based on your Redshift type:
--- For Provisioned Clusters:
+-- Provisioned cluster materialized views
 GRANT SELECT ON pg_catalog.stv_mv_info TO datahub;
+```
+
+### Additional Permissions Based on Your Configuration
+
+**For Serverless Workgroups:**
+
+```sql
+-- Use these instead of stv_mv_info (from Provisioned section above)
+GRANT SELECT ON pg_catalog.svv_user_info TO datahub;
+GRANT SELECT ON pg_catalog.svv_mv_info TO datahub;
+```
+
+**For Shared Databases (Datashare Consumers):**
 
--- For Serverless Workgroups:
--- GRANT SELECT ON pg_catalog.svv_user_info TO datahub;
--- GRANT SELECT ON pg_catalog.svv_mv_info TO datahub;
+```sql
+-- Required when is_shared_database = True
+GRANT SELECT ON pg_catalog.svv_redshift_tables TO datahub;
+GRANT SELECT ON pg_catalog.svv_redshift_columns TO datahub;
 ```
 
 ### Data Access Permissions (Required for Profiling/Classification)
@@ -71,22 +86,46 @@ GRANT SELECT ON ALL TABLES IN SCHEMA public TO datahub;
 GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
 
 -- For production environments (future tables/views):
--- IMPORTANT: Only works for objects created by the user running this command
+-- IMPORTANT: Default privileges only apply to objects created by the user who runs this command
+-- Option 1: If you (as admin) will create all future tables/views:
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
+ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
 ALTER DEFAULT PRIVILEGES IN SCHEMA your_schema_name GRANT SELECT ON TABLES TO datahub;
 ALTER DEFAULT PRIVILEGES IN SCHEMA your_schema_name GRANT SELECT ON VIEWS TO datahub;
---
--- Alternative: Run this periodically to catch all new objects regardless of creator:
--- GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
+
+-- Option 2: If other users will create tables/views, run this for each user:
+-- ALTER DEFAULT PRIVILEGES FOR ROLE other_user_name IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
+-- ALTER DEFAULT PRIVILEGES FOR ROLE other_user_name IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
+
+-- Option 3: For all future users (requires superuser):
+-- ALTER DEFAULT PRIVILEGES FOR ALL ROLES IN SCHEMA public GRANT SELECT ON TABLES TO datahub;
+-- ALTER DEFAULT PRIVILEGES FOR ALL ROLES IN SCHEMA public GRANT SELECT ON VIEWS TO datahub;
 ```
 
-### Additional Permissions (Only if needed)
+:::caution Data Access vs Metadata Access
+
+**The permissions are split into two categories:**
+
+1. **System table permissions** (above) - Required for metadata extraction, lineage, and usage statistics
+2. **Data access permissions** (this section) - Required for data profiling, classification, and any feature that reads actual table content
+
+**Default privileges only apply to objects created by the user who ran the ALTER DEFAULT PRIVILEGES command.** If multiple users create tables in your schemas, you need to:
+
+1. **Run the commands as each user**, OR
+2. **Use `FOR ROLE other_user_name`** for each user who creates objects, OR
+3. **Use `FOR ALL ROLES`** (requires superuser privileges)
+
+**Common gotcha**: If User A runs `ALTER DEFAULT PRIVILEGES` and User B creates a table, DataHub won't have access to User B's table unless you used Option 2 or 3 above.
+
+**Alternative approach**: Instead of default privileges, consider using a scheduled job to periodically grant access to new tables:
 
 ```sql
--- Only if using shared databases (datashare consumers):
--- GRANT SELECT ON pg_catalog.svv_redshift_tables TO datahub;
--- GRANT SELECT ON pg_catalog.svv_redshift_columns TO datahub;
+-- Run this periodically to catch new tables
+GRANT SELECT ON ALL TABLES IN SCHEMA your_schema_name TO datahub;
 ```
 
+:::
+
 ## Next Steps
 
 Once you've confirmed all of the above in Redshift, it's time to [move on](configuration.md) to configure the actual ingestion source within the DataHub UI.
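
The "scheduled job" alternative mentioned in the caution block of this diff could be sketched roughly as follows. This is only an illustration under stated assumptions: `build_grant_statements` is a hypothetical helper that is not part of DataHub or of this PR, the schema names are placeholders, and actually running the statements against Redshift (with whichever client and scheduler you already use) is left out. It only generates the same `GRANT SELECT ON ALL TABLES IN SCHEMA ...` statement shown in the guide, once per schema.

```python
import re

# Hypothetical helper for illustration; not part of DataHub or this PR.
# Accept only plain, unquoted identifiers so that arbitrary text is never
# interpolated into the generated SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_grant_statements(schemas, grantee="datahub"):
    """Return one GRANT SELECT statement per schema, in input order."""
    for name in [grantee, *schemas]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsupported identifier: {name!r}")
    return [
        f"GRANT SELECT ON ALL TABLES IN SCHEMA {schema} TO {grantee};"
        for schema in schemas
    ]


if __name__ == "__main__":
    # Placeholder schema names; replace with the schemas DataHub should read.
    for statement in build_grant_statements(["public", "your_schema_name"]):
        print(statement)
```

A periodic job would feed each generated statement to a Redshift session owned by a user allowed to grant on those tables. Generating the statements separately from executing them keeps the list easy to review before it is scheduled.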