Commit 57abd0e

Merge pull request #2248 from segmentio/bigquery-storage-updates
BigQuery content edits
2 parents bc5c5cd + 9b1490c commit 57abd0e

2 files changed, 33 additions and 38 deletions


src/_includes/content/warehouse-sync-sched.md

Lines changed: 1 addition & 1 deletion
@@ -1,4 +1,4 @@
-Your data will be available in Warehouses between 24 and 48 hours from your first sync. Your warehouse then syncs once, or twice a day depending on your [Segment Plan](https://segment.com/pricing).
+Your data will be available in Warehouses between 24 and 48 hours from your first sync. Your warehouse then syncs once or twice a day depending on your [Segment Plan](https://segment.com/pricing).
 
 Segment allows Business Tier (BT) customers to schedule the time and frequency of warehouse data syncs.
 

src/connections/storage/catalog/bigquery/index.md

Lines changed: 32 additions & 37 deletions
@@ -16,26 +16,22 @@ process to pull raw events and objects and load them into your BigQuery cluster.
 Using BigQuery through Segment means you'll get a fully managed data pipeline
 loaded into one of the most powerful and cost-effective data warehouses today.
 
-If you notice any gaps,
-out-dated information or want to leave some feedback to help us improve
-our documentation, [let us know](https://segment.com/help/contact)!
-
 ## Getting Started
 
-First, you'll want to enable BigQuery for your Google Cloud project. Then, you
-will create a Service Account for Segment to use. Last, you will create the
-warehouse in Segment.
+To store your Segment data in BigQuery, complete the following steps:
+- [Enable BigQuery for your Google Cloud project](#create-a-project-and-enable-bigquery)
+- [Create a GCP service account for Segment to assume](#create-a-service-account-for-segment)
+- [Create a warehouse in the Segment app](#create-the-warehouse-in-segment)
 
 ### Create a Project and Enable BigQuery
 
 1. Navigate to the [Google Developers Console](https://console.developers.google.com/)
 2. Configure [Cloud Platform](https://console.cloud.google.com/):
 - If you don't have a project already, [create one](https://support.google.com/cloud/answer/6251787?hl=en&ref_topic=6158848).
 - If you have an existing project, you will need to [enable the BigQuery API](https://cloud.google.com/bigquery/quickstart-web-ui).
-Once you've done so, you should see BigQuery in the ["Resources" section](https://cl.ly/0W2i2I2B2R0M) of Cloud Platform.
-- **Note:** make sure [billing is enabled](https://support.google.com/cloud/answer/6293499#enable-billing) on your project,
-otherwise Segment will not be able to write into the cluster.
-3. Copy your project ID, as you will need it later.
+Once you've done so, you should see BigQuery in the "Resources" section of Cloud Platform.
+- **Note:** make sure [billing is enabled](https://support.google.com/cloud/answer/6293499#enable-billing) on your project, or Segment will not be able to write into the cluster.
+3. Copy the project ID. You will need it when you create a warehouse source in the Segment app.
 
 ### Create a Service Account for Segment
 

@@ -44,7 +40,7 @@ for more information.
 
 1. From the Navigation panel on the left, go to **IAM & admin** > **Service accounts**
 2. Click **Create Service Account** along the top
-3. Enter a name (for example: "segment-warehouses") and click **Create**
+3. Enter a name for the service account (for example: "segment-warehouses") and click **Create**
 4. When assigning permissions, make sure to grant the following roles:
 - `BigQuery Data Owner`
 - `BigQuery Job User`
@@ -55,12 +51,12 @@ The downloaded file will be used to create your warehouse in the next section.
 
 1. In Segment, go to **Workspace** > **Add destination** > Search for "BigQuery"
 2. Select **BigQuery**
-3. Enter your project ID in the **Project** field
-4. Copy the contents of the credentials (the JSON key) into the **Credentials** field
-5. (Optional) Enter a [region code](https://cloud.google.com/compute/docs/regions-zones/) in the **Location** field (the default will be "US")
+3. Add a name for the destination to the **Name your destination** field
+4. Enter your project ID in the **Project** field
+5. Copy the contents of the credentials (the JSON key) into the **Credentials** field <br/>
+**Optional:** Enter a [region code](https://cloud.google.com/compute/docs/regions-zones/) in the **Location** field (the default will be "US")
 6. Click **Connect**
-7. if Segment is able to successfully connect with the **Project ID** and **Credentials**,
-the warehouse will be created and your first sync should begin shortly
+7. If Segment can connect with the provided **Project ID** and **Credentials**, a warehouse will be created and your first sync should begin shortly
 
 ### Schema
 
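
The **Credentials** step in the hunk above takes the contents of the downloaded JSON key. As a minimal Python sketch, the key could be checked locally before it is pasted into the Segment app. This assumes the file is a standard Google Cloud service account JSON key. The helper name `check_service_account_key` is hypothetical and is not part of Segment or Google tooling, and the checks Segment runs on **Connect** may differ.

```python
import json

# Fields present in a standard Google Cloud service account JSON key.
REQUIRED_FIELDS = ("type", "project_id", "private_key", "client_email")


def check_service_account_key(key_json):
    """Return a list of problems found in the key text; an empty list means none.

    Hypothetical helper for illustration only.
    """
    try:
        key = json.loads(key_json)
    except json.JSONDecodeError as err:
        return ["not valid JSON: %s" % err]
    if not isinstance(key, dict):
        return ["key must be a JSON object"]

    # Report every required field that is absent.
    problems = ["missing field: %s" % name for name in REQUIRED_FIELDS if name not in key]
    # A service account key declares its type explicitly.
    if "type" in key and key["type"] != "service_account":
        problems.append("type is %r, expected 'service_account'" % key["type"])
    return problems
```

An empty result only means the text looks like a service account key. It says nothing about whether the account holds the `BigQuery Data Owner` and `BigQuery Job User` roles.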

@@ -92,10 +88,10 @@ from <project-id>.<source-name>.<collection-name>$20160809
 #### Views
 
 A [view](https://cloud.google.com/bigquery/querying-data#views) is a virtual
-table defined by a SQL query. We use views in our de-duplication process to
+table defined by a SQL query. Segment uses views in the de-duplication process to
 ensure that events that you are querying unique events, and the latest objects
-from third-party data. All our views are set up to show information from the last
-60 days. Whenever possible, we recommend that you query from these views.
+from third-party data. All Segment views are set up to show information from the last
+60 days. Whenever possible, query from these views.
 
 Views are appended with `_view` , which you can query like this:
 
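
The `_view` naming convention described in this hunk can be sketched in Python as follows. Only the `<project-id>.<source-name>.<collection-name>_view` pattern comes from the document. The helper name `view_reference` and the sample identifiers in the usage note are hypothetical.

```python
def view_reference(project_id, source_name, collection_name):
    """Build `<project-id>.<source-name>.<collection-name>_view`.

    Hypothetical helper for illustration only.
    """
    for part in (project_id, source_name, collection_name):
        if not part:
            raise ValueError("project, source, and collection are all required")
    # The de-duplicated view has the table name with `_view` appended.
    return "%s.%s.%s_view" % (project_id, source_name, collection_name)
```

For example, `view_reference("my-project", "my_source", "tracks")` returns `my-project.my_source.tracks_view`. How the reference is quoted inside a query depends on the SQL dialect in use, which this sketch leaves out.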

@@ -108,31 +104,31 @@ from <project-id>.<source-name>.<collection-name>_view
 
 For early customers using BigQuery with Segment, rather than providing Segment
 with credentials, access was granted to a shared Service Account
-(`[email protected]`). While convenient early
-adopters, this presents potential security risks that we would prefer to address
+(`[email protected]`). While convenient for early
+adopters, this presented potential security risks that Segment would prefer to address
 proactively.
 
-Starting in **March 2019**, we're going to start requiring BigQuery customers to
-create their own Service Accounts and provide us with those credentials instead.
+As of **March 2019**, Segment requires BigQuery customers to
+create their own Service Accounts and provide the app with those credentials instead.
 In addition, any attempts to update warehouse connection settings will also
 require these credentials. This effectively deprecates the shared Service
-Account, and in the future it will be deactivated completely.
+Account.
 
-In order to stay ahead of this, make sure to migrate your warehouse by following
+To stay ahead of this change, migrate your warehouse by following
 the instructions in the "Create a Service Account for Segment" section above.
 Then, head to your warehouse's connection settings and update with the
-**Credentials** you created along the way.
+**Credentials** you created.
 
 
 ## Best Practices
 
 ### Use views
 
 BigQuery charges based on the amount of data scanned by your queries. Views are
-a derived view over your tables that we use for de-duplication of events.
-Therefore, we recommend you query a specific view whenever possible to avoid
+a derived view over your tables that Segment uses for de-duplication of events.
+Therefore, Segment recommends you query a specific view whenever possible to avoid
 duplicate events and historical objects. It's important to note that BigQuery
-views are not cached:
+views are not cached.
 
 > BigQuery's views are logical views, not materialized views, which means that
 > the query that defines the view is re-executed every time the view is queried.
@@ -159,12 +155,11 @@ querying sub-sets of tables.
 Absolutely! You will just need to modify one of the references to 60 in the view
 definition to the number of days of your choosing.
 
-We chose 60 days as it suits the needs for most of our customers. However,
+Segment chose 60 days as it suits the needs of most customers. However,
 you're welcome to update the definition of the view as long as the name stays
 the same.
 
-Here is the base query we use when first setting up your views. We are leaving
-in the placeholders (`%s.%s.%s`) where you would want to include the project,
+Here is the base query Segment uses when first setting up your views. Included in the base query are the placeholders (`%s.%s.%s`) that you would want to include the project,
 dataset and table (in that order).
 
 ```sql
@@ -196,14 +191,14 @@ costs.
 You can connect to BigQuery using a BI tool like Mode or Looker, or query
 directly from the BigQuery console.
 
-BigQuery now supports standard SQL, which you can enable using their query UI.
-This does not work with views, or with a query that utilizes table range
+BigQuery now supports standard SQL, which you can enable using their query UI.
+This does not work with views, or with a query that uses table range
 functions.
 
 ### Does Segment support streaming inserts?
 
 Segment's connector does not support streaming inserts at this time. If you have
-a need for streaming data into BigQuery, [contact us](https://segment.com/requests/integrations/).
+a need for streaming data into BigQuery, [contact Segment support](https://segment.com/requests/integrations/).
 
 ### Can I customize my sync schedule?
 

@@ -215,5 +210,5 @@ a need for streaming data into BigQuery, [contact us](https://segment.com/reques
 
 ### I'm seeing duplicates in my tables.
 
-This behavior is expected. We only de-duplicate data in your views. See the
+This behavior is expected. Segment only de-duplicates data in your views. See the
 section on [views](#views) for more details.
