Merged
44 commits
b3308dc
Upgrade dependencies and theme components
mscwilson Sep 8, 2025
d07e8cd
Upgrade algolia
mscwilson Sep 9, 2025
5bfa05d
Fix compilation errors
mscwilson Sep 9, 2025
6b1d398
Update swizzled component imports
mscwilson Sep 9, 2025
086600c
Fix build problems
mscwilson Sep 9, 2025
c6d26ca
Remove unnecessary dependency resolution
mscwilson Sep 9, 2025
be277c9
Upgraded_CodeBlockComponent
AH-Avalanche Sep 16, 2025
17e1ccc
Fix build command error
AH-Avalanche Sep 19, 2025
b14b099
Add missing dependency
mscwilson Sep 29, 2025
3e7c856
Specify Docusaurus v3.9
mscwilson Sep 29, 2025
7ef73de
Replace caution admons with warnings
mscwilson Sep 29, 2025
556cfb9
Fix some tables
mscwilson Sep 29, 2025
41d050e
Fix more tables
mscwilson Sep 29, 2025
2f2fa2f
Update ReactMarkdown tables
mscwilson Sep 29, 2025
19bd5a2
Add warning for broken anchors
mscwilson Sep 29, 2025
1f79250
Add v4 feature flags
mscwilson Sep 29, 2025
7f93cb4
Add v4 dependency
mscwilson Sep 29, 2025
4ad59c8
Merge branch 'main' into upgrade-v3
mscwilson Sep 29, 2025
892a2f4
Fix build problems
mscwilson Sep 29, 2025
1d4ac30
Remove runnable codeblocks
mscwilson Sep 29, 2025
d947b4d
Remove custom CodeBlock components
mscwilson Sep 29, 2025
5cf69a9
Restore github codeblocks
mscwilson Sep 29, 2025
20071d6
Ignore js in dependency
mscwilson Sep 30, 2025
31e5147
Set codeblock button opacity
mscwilson Sep 30, 2025
4a72aa0
Fix Tailwind/Docusaurus integration and improve UI components
AH-Avalanche Oct 2, 2025
ef702ab
Fixed-SmallbreadCrumbBugs
AH-Avalanche Oct 8, 2025
a6804bb
Tutorial Alignment Fix
AH-Avalanche Oct 8, 2025
25cdd41
Overriding FB default layout page
AH-Avalanche Oct 9, 2025
5de46c9
Tutorial Progress component and Layout update
AH-Avalanche Oct 13, 2025
79051c3
Bug fix - Prev & next buttons missing from tutorial
AH-Avalanche Oct 13, 2025
a96aee5
TutorialPage_Mobile_Upgrade
AH-Avalanche Oct 14, 2025
72d0888
Tutorial Landing page Upgrade
AH-Avalanche Oct 16, 2025
e2f9abd
new components imports
AH-Avalanche Oct 16, 2025
ea6fc4f
Filter and Search Bug Update
AH-Avalanche Oct 16, 2025
5e5d156
Update index.tsx
AH-Avalanche Oct 17, 2025
60d7d45
Body Overflow Issue
AH-Avalanche Oct 21, 2025
1d8905f
Style Fixes to Match production
AH-Avalanche Oct 23, 2025
25b7e00
Position the search centrally
AH-Avalanche Oct 23, 2025
132d87a
Bug Fix Table overflow + Codeblock Button
AH-Avalanche Oct 28, 2025
b4936b6
Tutorial Page Light and dark modification
AH-Avalanche Oct 28, 2025
344253a
Tutorial Completion bug
AH-Avalanche Nov 4, 2025
0504387
Merge branch 'main' into upgrade-v3
mscwilson Nov 4, 2025
16ab95d
Fix broken links
mscwilson Nov 4, 2025
c01026c
Use the latest patch version
mscwilson Nov 4, 2025
8 changes: 4 additions & 4 deletions .vscode/snowplow-docs.code-snippets
@@ -88,11 +88,11 @@
"body": [":::info", "", "$0", "", ":::"],
"description": "Info admonition"
},
"Caution Admonition": {
"Warning Admonition": {
"scope": "markdown",
"prefix": "admon_caution",
"body": [":::caution", "", "$0", "", ":::"],
"description": "Caution admonition"
"prefix": "admon_warning",
"body": [":::warning", "", "$0", "", ":::"],
"description": "Warning admonition"
},
"Danger Admonition": {
"scope": "markdown",
3 changes: 3 additions & 0 deletions babel.config.js
@@ -1,3 +1,6 @@
module.exports = {
presets: [require.resolve('@docusaurus/core/lib/babel/preset')],
plugins: [],
// Ignore problematic plugin files during transpilation
ignore: ['**/node_modules/docusaurus-theme-github-codeblock/**/*.js'],
}
(file path not shown)
@@ -1,7 +1,6 @@
```mdx-code-block
import Mermaid from '@theme/Mermaid';
import Link from '@docusaurus/Link';
```

<p>On {props.cloud}, the BigQuery Streaming Loader continually pulls events from {props.stream} and writes to BigQuery using the <Link to="https://cloud.google.com/bigquery/docs/write-api">BigQuery Storage API</Link>.</p>

<Mermaid value={`
(file path not shown)
@@ -1,7 +1,7 @@
```mdx-code-block
import Link from '@docusaurus/Link';
```

<table>
<tbody>
<tr>
<td><code>batching.maxBytes</code></td>
<td>Optional. Default value <code>10000000</code>. Events are emitted to BigQuery when the batch reaches this size in bytes</td>
@@ -17,36 +17,36 @@ import Link from '@docusaurus/Link';
<tr>
<td><code>cpuParallelism.parseBytesFactor</code></td>
<td>
Optional. Default value <code>0.1</code>.
Controls how many batches of bytes we can parse into enriched events simultaneously.
E.g. If there are 2 cores and <code>parseBytesFactor = 0.1</code> then only one batch gets processed at a time.
Adjusting this value can cause the app to use more or less of the available CPU.
<p>Optional. Default value <code>0.1</code>.</p>
<p>Controls how many batches of bytes we can parse into enriched events simultaneously.</p>
<p>E.g. If there are 2 cores and <code>parseBytesFactor = 0.1</code> then only one batch gets processed at a time.</p>
<p>Adjusting this value can cause the app to use more or less of the available CPU.</p>
</td>
</tr>
<tr>
<td><code>cpuParallelism.transformFactor</code></td>
<td>
Optional. Default value <code>0.75</code>.
Controls how many batches of enriched events we can transform into BigQuery format simultaneously.
E.g. If there are 4 cores and <code>transformFactor = 0.75</code> then 3 batches gets processed in parallel.
Adjusting this value can cause the app to use more or less of the available CPU.
<p>Optional. Default value <code>0.75</code>.</p>
<p>Controls how many batches of enriched events we can transform into BigQuery format simultaneously.</p>
<p>E.g. If there are 4 cores and <code>transformFactor = 0.75</code> then 3 batches get processed in parallel.</p>
<p>Adjusting this value can cause the app to use more or less of the available CPU.</p>
</td>
</tr>
<tr>
<td><code>retries.setupErrors.delay</code></td>
<td>
Optional. Default value <code>30 seconds</code>.
Configures exponential backoff on errors related to how BigQuery is set up for this loader.
Examples include authentication errors and permissions errors.
This class of errors are reported periodically to the monitoring webhook.
<p>Optional. Default value <code>30 seconds</code>.</p>
<p>Configures exponential backoff on errors related to how BigQuery is set up for this loader.</p>
<p>Examples include authentication errors and permissions errors.</p>
<p>This class of errors is reported periodically to the monitoring webhook.</p>
</td>
</tr>
<tr>
<td><code>retries.transientErrors.delay</code></td>
<td>
Optional. Default value <code>1 second</code>.
Configures exponential backoff on errors that are likely to be transient.
Examples include server errors and network errors.
<p>Optional. Default value <code>1 second</code>.</p>
<p>Configures exponential backoff on errors that are likely to be transient.</p>
<p>Examples include server errors and network errors.</p>
</td>
</tr>
<tr>
@@ -56,33 +56,34 @@ import Link from '@docusaurus/Link';
<tr>
<td><code>skipSchemas</code></td>
<td>
Optional, e.g. <code>["iglu:com.example/skipped1/jsonschema/1-0-0"]</code> or with wildcards <code>["iglu:com.example/skipped2/jsonschema/1-*-*"]</code>.
A list of schemas that won't be loaded to BigQuery.
This feature could be helpful when recovering from edge-case schemas which for some reason cannot be loaded to the table.
<p>Optional, e.g. <code>\["iglu:com.example/skipped1/jsonschema/1-0-0"]</code> or with wildcards <code>\["iglu:com.example/skipped2/jsonschema/1-*-*"]</code>.</p>
<p>A list of schemas that won't be loaded to BigQuery.</p>
<p>This feature could be helpful when recovering from edge-case schemas which for some reason cannot be loaded to the table.</p>
</td>
</tr>
<tr>
<td><code>legacyColumnMode</code></td>
<td>Optional. Default value <code>false</code>.
When this mode is enabled, the loader uses the legacy column style used by the v1 BigQuery loader.
For example, an entity for a <code>1-0-0</code> schema is loaded into a column ending in <code>_1_0_0</code>, instead of a column ending in <code>_1</code>.
This feature could be helpful when migrating from the v1 loader to the v2 loader.
<td>
<p>Optional. Default value <code>false</code>.</p>
<p>When this mode is enabled, the loader uses the legacy column style used by the v1 BigQuery loader.</p>
<p>For example, an entity for a <code>1-0-0</code> schema is loaded into a column ending in <code>_1_0_0</code>, instead of a column ending in <code>_1</code>.</p>
<p>This feature could be helpful when migrating from the v1 loader to the v2 loader.</p>
</td>
</tr>
<tr>
<td><code>legacyColumns</code></td>
<td>
Optional, e.g. <code>["iglu:com.example/legacy/jsonschema/1-0-0"]</code> or with wildcards <code>["iglu:com.example/legacy/jsonschema/1-*-*"]</code>.
Schemas for which to use the legacy column style used by the v1 BigQuery loader, even when <code>legacyColumnMode</code> is disabled.
<p>Optional, e.g. <code>\["iglu:com.example/legacy/jsonschema/1-0-0"]</code> or with wildcards <code>\["iglu:com.example/legacy/jsonschema/1-*-*"]</code>.</p>
<p>Schemas for which to use the legacy column style used by the v1 BigQuery loader, even when <code>legacyColumnMode</code> is disabled.</p>
</td>
</tr>
<tr>
<td><code>exitOnMissingIgluSchema</code></td>
<td>
Optional. Default value <code>true</code>.
Whether the loader should crash and exit if it fails to resolve an Iglu Schema.
We recommend `true` because Snowplow enriched events have already passed validation, so a missing schema normally indicates an error that needs addressing.
Change to <code>false</code> so events go the failed events stream instead of crashing the loader.
<p>Optional. Default value <code>true</code>.</p>
<p>Whether the loader should crash and exit if it fails to resolve an Iglu Schema.</p>
<p>We recommend <code>true</code> because Snowplow enriched events have already passed validation, so a missing schema normally indicates an error that needs addressing.</p>
<p>Change to <code>false</code> so events go to the failed events stream instead of crashing the loader.</p>
</td>
</tr>
<tr>
@@ -127,13 +128,15 @@ import Link from '@docusaurus/Link';
</tr>
<tr>
<td><code>telemetry.disable</code></td>
<td>Optional. Set to <code>true</code> to disable <Link to="/docs/getting-started-on-community-edition/telemetry/">telemetry</Link>.</td>
<td>Optional. Set to <code>true</code> to disable <Link to="/docs/get-started/self-hosted/telemetry/">telemetry</Link>.</td>
</tr>
<tr>
<td><code>telemetry.userProvidedId</code></td>
<td>Optional. See <Link to="/docs/getting-started-on-community-edition/telemetry/#how-can-i-help">here</Link> for more information.</td>
<td>Optional. See <Link to="/docs/get-started/self-hosted/telemetry/#how-can-i-help">here</Link> for more information.</td>
</tr>
<tr>
<td><code>http.client.maxConnectionsPerServer</code></td>
<td>Optional. Default value <code>4</code>. Configures the internal HTTP client used for the Iglu resolver, alerts, and telemetry. The maximum number of open HTTP requests to any single server at any one time.</td>
</tr>
</tbody>
</table>
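
To make the options in the table above concrete, here is a minimal, illustrative HOCON sketch. The dotted option paths, default values, and list examples are taken directly from the table; this is not the loader's full reference configuration, and the surrounding structure of a real config file may differ.

```hocon
{
  # Emit a batch to BigQuery once it reaches this many bytes (table default shown)
  batching.maxBytes = 10000000

  # Fractions of the available CPU used for parsing bytes and transforming enriched events
  cpuParallelism.parseBytesFactor = 0.1
  cpuParallelism.transformFactor = 0.75

  # Exponential backoff delays for setup errors vs. transient errors
  retries.setupErrors.delay = "30 seconds"
  retries.transientErrors.delay = "1 second"

  # Schemas that should not be loaded to BigQuery (wildcards allowed)
  skipSchemas = ["iglu:com.example/skipped1/jsonschema/1-0-0"]

  # Schemas that keep the v1-style column names even while legacyColumnMode stays disabled
  legacyColumns = ["iglu:com.example/legacy/jsonschema/1-*-*"]

  # Crash on unresolvable Iglu schemas (recommended) rather than sending events to the failed events stream
  exitOnMissingIgluSchema = true

  # Internal HTTP client used for the Iglu resolver, alerts and telemetry
  http.client.maxConnectionsPerServer = 4
}
```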
(file path not shown)
@@ -43,7 +43,7 @@ The name of each column is the name of the schema field converted to snake case.

:::

:::caution
:::warning

If an event or entity includes fields not defined in the schema, those fields will not be stored in the warehouse.

@@ -107,7 +107,7 @@ The name of each record field is the name of the schema field converted to snake

:::

:::caution
:::warning

If an event or entity includes fields not defined in the schema, those fields will not be stored in the warehouse.

@@ -196,7 +196,7 @@ The name of each record field is the name of the schema field converted to snake

:::

:::caution
:::warning

If an event or entity includes fields not defined in the schema, those fields will not be stored in the warehouse.

@@ -213,6 +213,53 @@ For example, suppose you have the following field in the schema:

It will be translated into a field called `last_name` (notice the underscore), of type `STRING`.

</TabItem>
<TabItem value="synapse" label="Synapse Analytics">

Each type of self-describing event and each type of entity get their own dedicated columns in the underlying data lake table. The name of such a column is composed of the schema vendor, schema name and major schema version (more on versioning [later](#versioning)).

The column name is prefixed by `unstruct_event_` for self-describing events, and by `contexts_` for entities. _(In case you were wondering, those are the legacy terms for self-describing events and entities, respectively.)_

:::note

All characters are converted to lowercase and all symbols (like `.`) are replaced with an underscore.

:::

Examples:

| Kind | Schema | Resulting column |
| --------------------- | ------------------------------------------- | -------------------------------------------------- |
| Self-describing event | `com.example/button_press/jsonschema/1-0-0` | `events.unstruct_event_com_example_button_press_1` |
| Entity | `com.example/user/jsonschema/1-0-0` | `events.contexts_com_example_user_1` |

The column will be formatted as JSON — an object for self-describing events and an array of objects for entities (because an event can have more than one entity attached).

Inside the JSON object, there will be fields corresponding to the fields in the schema.

:::note

The name of each JSON field is the name of the schema field converted to snake case.

:::

:::warning

If an event or entity includes fields not defined in the schema, those fields will not be stored in the data lake, and will not be available in Synapse.

:::

For example, suppose you have the following field in the schema:

```json
"lastName": {
"type": "string",
"maxLength": 100
}
```

It will be translated into a field called `last_name` (notice the underscore) inside the JSON object.
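
As a rough sketch of the two JSON shapes described above, suppose (purely for illustration) that both of the hypothetical `com.example` schemas from the table contain the `lastName` field shown in the example. The keys below are the resulting column names and the values are invented column contents: a single object for the self-describing event, and an array of objects for the entity, because an event can have more than one entity attached.

```json
{
  "events.unstruct_event_com_example_button_press_1": { "last_name": "Smith" },
  "events.contexts_com_example_user_1": [
    { "last_name": "Smith" },
    { "last_name": "Jones" }
  ]
}
```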

</TabItem>
</Tabs>

(file path not shown)
@@ -45,7 +45,7 @@ where `region` is one of `us-east-1`, `us-east-2`, `us-west-1`, `us-west-2`, `eu

## Configuring the EMR cluster

:::caution
:::warning

Starting from version `5.5.0`, the batch transformer requires Java 11 on EMR ([default is Java 8](https://docs.aws.amazon.com/emr/latest/ReleaseGuide/configuring-java8.html)). See the `bootstrapActionConfigs` section in the configuration below.
