Adding Signals Real-time personalisation accelerator #1284

---
title: Delivering Personalised Recommendations
position: 4
---

Now that you have defined your Snowplow Signals Attributes and created a View and Service to calculate them in real time, it's time to integrate these attribute values into the newspaper website to serve real-time personalised recommendations to users.

Within our NextJS app, we will create a `/api/recommended-articles` endpoint that will return a list of recommended articles based on a user's Snowplow Signals attributes.

In this tutorial, we will walk through the key components of the recommendation API endpoint. If you wish to see the complete code for the API, it is available in [this repository](https://github.com/snowplow-incubator/signals-media-demo).

If you are doing this within your own NextJS project, you will need to install the Snowplow Signals Node SDK:

```bash
npm install @snowplow/signals-node
```

---

### Initialising Snowplow Signals

You’ll need the following to connect to Snowplow Signals:

- **Signals API URL:** You can find this in the Snowplow BDP Console
- **Signals API Key + Key ID:** Generate these within the [Snowplow BDP Console](https://console.snowplowanalytics.com/credentials)
- **Snowplow Organisation ID:** Find this in the URL of any BDP Console page, e.g. `https://console.snowplowanalytics.com/<organisation-id>/homepage`

Initialise Signals by passing in the details above, as shown below:

```js
import { Signals } from "@snowplow/signals-node"

const signals = new Signals({
  baseUrl: process.env.SIGNALS_API_URL || "",
  apiKey: process.env.SIGNALS_API_KEY || "",
  apiKeyId: process.env.SIGNALS_API_KEY_ID || "",
  organizationId: process.env.SIGNALS_ORG_ID || "",
});
```
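
For local development, you'll also need to make these values available to the app as environment variables. A minimal `.env.local` sketch with placeholder values (use your own credentials):

```bash
# .env.local (placeholder values only)
SIGNALS_API_URL=https://your-signals-endpoint.snowplowanalytics.com
SIGNALS_API_KEY=your-api-key
SIGNALS_API_KEY_ID=your-api-key-id
SIGNALS_ORG_ID=your-organisation-id
```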

---

### Extracting the Snowplow User ID

Within the GET request handler, we will now extract the user's cookie ID from the API request's cookie headers and pass it into our Signals call to fetch the online attributes for that user.

In this example we are using the `domain_userid`, as specified in the `entity` property when creating the Signals View in the previous steps. You can follow similar steps if implementing with a `domain_sessionid` or a `network_userid`.

```js
import { cookies } from "next/headers"

const cookieStore = cookies()
// Update _sp_id.1fff to the name of your Snowplow cookie
const spCookie = cookieStore.get("_sp_id.1fff")?.value || "anonymous"

// The _sp_id cookie value is dot-separated: the first part is the
// domain_userid and the sixth part is the domain_sessionid
const spDomainUserId = spCookie.split(".")[0] || "anonymous"
const spDomainSessionId = spCookie.split(".")[5] || "anonymous"
```

---

### Retrieving the Snowplow Signals Attributes

Now that we have the cookie ID of the user we wish to fetch attributes for, we can make a request to the Snowplow Signals Profiles API.

We can fetch the attributes for a user by referring to the Service we created earlier:

```js
const attributes = await signals.getOnlineAttributes({
  entities: { domain_userid: [spDomainUserId] },
  service: "media_demo_service",
});
```
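
The returned object maps each attribute name to an array of values, one per entity value requested. An illustrative shape only (keys come from the attributes defined in your View; your values will differ):

```js
// Illustrative shape only, not actual output
const exampleAttributes = {
  article_last_read: ["Example Article Title"],
  article_category_business_read_count: [3],
  article_category_ai_read_count: [1],
};
```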

We can then extract the individual attributes to be used in our recommendation engine:

```js
const article_last_read = attributes['article_last_read']?.[0] ?? null;

const business_articles_read = attributes['article_category_business_read_count']?.[0] ?? 0;
const ai_articles_read = attributes['article_category_ai_read_count']?.[0] ?? 0;
const data_articles_read = attributes['article_category_data_read_count']?.[0] ?? 0;
const technology_articles_read = attributes['article_category_technology_read_count']?.[0] ?? 0;
```

---

### Creating a Recommendation

In this simplified demo application, recommendations are made by identifying the categories a user has read most frequently, or articles similar to the last article they read. This is a simple, yet powerful, way to personalise the experience for a user based on their behaviour.

In practice you may wish to implement much more sophisticated recommendation logic. For example, you could pass a user's attributes from Signals into a machine learning model and use its response to rank recommendations, enabling a more accurate and engaging personalised experience informed by the historical behaviour of other users.

You can see below how the recommendation was implemented in this application:

```js
// Identify the user's interests from their read counts, taking the top 2 categories
const interests = Object.entries({
  Business: business_articles_read,
  AI: ai_articles_read,
  Data: data_articles_read,
  Technology: technology_articles_read,
})
  .filter(([_, count]) => typeof count === "number" && count > 0)
  .sort(([, a], [, b]) => Number(b) - Number(a))
  .slice(0, 2)
  .map(([category]) => category);

// `articles` is the site's full article list; `articles_read` holds the slugs
// of articles the user has already read, so we don't recommend them again
let recommendedArticles = articles.filter(
  (article) =>
    interests.includes(article.category) && !articles_read.includes(article.slug)
)

// Limit to 3 articles and add a reason for each recommendation
recommendedArticles = recommendedArticles.slice(0, 3).map((article) => ({
  ...article,
  recommendationReason: `Based on your interest in ${article.category}`,
}))
```
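
Putting these pieces together, below is a minimal sketch of the complete route handler. The `articles` import and its module path are hypothetical stand-ins for your own article data source, and the overall shape is illustrative rather than the exact code from the demo repository:

```js
// app/api/recommended-articles/route.js (illustrative sketch)
import { NextResponse } from "next/server"
import { cookies } from "next/headers"
import { Signals } from "@snowplow/signals-node"
import { articles } from "@/lib/articles" // hypothetical local article list

const signals = new Signals({
  baseUrl: process.env.SIGNALS_API_URL || "",
  apiKey: process.env.SIGNALS_API_KEY || "",
  apiKeyId: process.env.SIGNALS_API_KEY_ID || "",
  organizationId: process.env.SIGNALS_ORG_ID || "",
});

export async function GET() {
  // Extract the domain_userid from the Snowplow cookie
  // (note: in recent Next.js versions cookies() is async and needs await)
  const spCookie = cookies().get("_sp_id.1fff")?.value || "anonymous"
  const spDomainUserId = spCookie.split(".")[0] || "anonymous"

  // Fetch the user's real-time attributes from the Signals Profiles API
  const attributes = await signals.getOnlineAttributes({
    entities: { domain_userid: [spDomainUserId] },
    service: "media_demo_service",
  });

  const counts = {
    Business: attributes["article_category_business_read_count"]?.[0] ?? 0,
    AI: attributes["article_category_ai_read_count"]?.[0] ?? 0,
    Data: attributes["article_category_data_read_count"]?.[0] ?? 0,
    Technology: attributes["article_category_technology_read_count"]?.[0] ?? 0,
  };

  // Top 2 categories the user has actually read
  const interests = Object.entries(counts)
    .filter(([, count]) => count > 0)
    .sort(([, a], [, b]) => b - a)
    .slice(0, 2)
    .map(([category]) => category);

  // Recommend up to 3 articles from those categories, with a reason attached
  const recommendedArticles = articles
    .filter((article) => interests.includes(article.category))
    .slice(0, 3)
    .map((article) => ({
      ...article,
      recommendationReason: `Based on your interest in ${article.category}`,
    }));

  return NextResponse.json({ recommendedArticles })
}
```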
---
title: Defining Snowplow Signals Attributes
position: 3
---

> **Note:** You can follow along with this guide by viewing the notebook in [this repository](https://github.com/snowplow-incubator/signals-media-demo).

### Install the Snowplow Signals Python SDK

Open a new Python notebook (or copy the notebook above) and install the Snowplow Signals Python SDK:

```bash
pip install snowplow-signals
```

> The Snowplow Signals SDK requires Python 3.11+.

---

### Connect to Snowplow Signals

You’ll need the following to connect to Snowplow Signals:

- **Signals API URL:** You can find this in the Snowplow BDP Console
- **Signals API Key + Key ID:** Generate these within the [Snowplow BDP Console](https://console.snowplowanalytics.com/credentials)
- **Snowplow Organisation ID:** Find this in the URL of any BDP Console page, e.g. `https://console.snowplowanalytics.com/<organisation-id>/homepage`

Insert these details into a Python Notebook to initialise the connection:

```python
from snowplow_signals import Signals

signals = Signals(
    api_url="YOUR_API_URL",
    api_key="YOUR_API_KEY",
    api_key_id="YOUR_API_KEY_ID",
    org_id="YOUR_ORG_ID",
)
```
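
Rather than hard-coding credentials in the notebook, you may prefer to load them from environment variables. A minimal sketch (the variable names here are our own choice):

```python
import os

from snowplow_signals import Signals

signals = Signals(
    api_url=os.environ["SIGNALS_API_URL"],
    api_key=os.environ["SIGNALS_API_KEY"],
    api_key_id=os.environ["SIGNALS_API_KEY_ID"],
    org_id=os.environ["SIGNALS_ORG_ID"],
)
```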

---

### Define Attributes

Attributes in Snowplow Signals represent specific facts about user behavior, calculated from events in your Snowplow pipeline.

For a newspaper recommendation engine, you might want to:

- Count how many times a user reads articles in each category (e.g., business, technology)
- Track the title of the last article a user read

Using these attributes, we can recommend articles from the categories a user finds engaging, as well as articles similar to the last one they read.

First, import the required classes:

```python
from snowplow_signals import Attribute, Event, Criteria, Criterion
```

#### Example: Count Business Article Reads

This attribute counts the number of `article_details` events where the `category` is `"business"`.

```python
article_category_business_read_count = Attribute(
    name="article_category_business_read_count",
    type="int32",
    aggregation="counter",
    events=[
        Event(
            vendor="com.snplow.example",
            name="article_details",
            version="1-0-0",
        )
    ],
    criteria=Criteria(
        all=[
            Criterion(
                property="unstruct_event_com_snplow_sales_aws_article_details_1:category",
                operator="=",
                value="business",
            )
        ]
    ),
)
```

Let's break down what that code does:
- We created an attribute named `article_category_business_read_count` and set its type to an integer.
- We defined the aggregation method as a `counter`, which makes the attribute count the number of matching events.
- Next, we defined the Snowplow event we wish to include in this attribute: in this case, the `article_details` event we defined earlier in this tutorial.
- Lastly, we filtered those events using `Criteria` so that only events with the `category` property set to `"business"` are counted.

Next, repeat the step above to create attributes for the rest of the article categories, as sketched below.
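
For example, a minimal sketch that generates the remaining category attributes in a loop, assuming the same event and property names as above:

```python
# Sketch: generate the remaining category attributes in a loop
category_attributes = {}
for category in ["ai", "data", "technology"]:
    category_attributes[category] = Attribute(
        name=f"article_category_{category}_read_count",
        type="int32",
        aggregation="counter",
        events=[
            Event(
                vendor="com.snplow.example",
                name="article_details",
                version="1-0-0",
            )
        ],
        criteria=Criteria(
            all=[
                Criterion(
                    property="unstruct_event_com_snplow_sales_aws_article_details_1:category",
                    operator="=",
                    value=category,
                )
            ]
        ),
    )

article_category_ai_read_count = category_attributes["ai"]
article_category_data_read_count = category_attributes["data"]
article_category_technology_read_count = category_attributes["technology"]
```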


#### Example: Last Article Read

Lastly, we want to define another attribute to hold the title of the last article a user has read. We can do that by defining the following attribute:

```python
article_last_read = Attribute(
    name="article_last_read",
    type="string",
    aggregation="last",
    events=[
        Event(
            vendor="com.snplow.example",
            name="article_details",
            version="1-0-0",
        )
    ],
    property="unstruct_event_com_snplow_sales_aws_article_details_1:name",
)
```

---

### Define a View & Service

Now that we have defined all our Attributes, the next step is to group them into a View and define a Service, which we will use to retrieve the attribute values on our website.

You can think of a View and a Service as the following:
- **Views** are versioned collections of attributes grouped by a common entity (e.g., `domain_sessionid` or `domain_userid`).
- **Services** are collections of Views that streamline the retrieval of attributes from multiple Views; they are what real-time applications query to retrieve attribute values.

You can define a View and a Service as follows:

```python
from snowplow_signals import View, Service

media_demo_view = View(
    name="media_demo_view",
    version=1,
    entity="domain_userid",  # Entity over which attributes are calculated
    online=True,  # Enable Snowplow Signals real-time calculation of attributes
    owner="[email protected]",
    attributes=[
        article_last_read,
        article_category_business_read_count,
        article_category_ai_read_count,
        article_category_data_read_count,
        article_category_technology_read_count,
    ],
)

media_demo_service = Service(
    name="media_demo_service",
    description="Media Demo Service",
    views=[media_demo_view],
    owner="[email protected]",
)
```

---

### Test Your View

Now that you have successfully created a View and Service for your attributes, you can test how your attribute values currently look in two ways.

#### Test Using the Data Warehouse

Signals provides a way to test how your attributes look using historical data. Run the following code, passing in the View you created earlier; it will generate a sample of attribute values calculated over the past hour of data from your atomic events table.

```python
data = signals.test(
    view=media_demo_view,
    app_ids=["website"],
)
print(data)
```

#### Test Using the Real-time Stream

Additionally, you may wish to see how the attributes are calculated in real-time for a certain identifier (user or session).

For example, you can generate events on your own device and use the Snowplow Inspector browser extension to identify your `domain_userid`, then see how the attributes are calculated.

You can do that by running the following code:

```python
response = signals.get_online_attributes(
    source=media_demo_service,
    identifiers="INSERT_YOUR_ID",  # UUID entity value from your View, e.g. a domain_userid or domain_sessionid
)

print(response.model_dump()["data"])
```
---
title: Introduction
position: 1
---

Personalisation is no longer a nice-to-have; consumers expect it. According to [McKinsey](https://www.mckinsey.com/capabilities/growth-marketing-and-sales/our-insights/enhancing-customer-experience-in-the-digital-age), 71% of consumers expect companies to deliver personalised interactions, and 76% get frustrated when that doesn't happen. Snowplow Signals allows companies to personalise the user experience based on behaviour, increasing engagement, conversion, and retention.

In this tutorial's example, we will personalise a newspaper website to increase customer engagement by recommending articles to readers based on their own reading habits. By keeping users engaged we can increase publisher ad revenue, or surface paid content that a user is more likely to engage with, helping convert them into a paying subscriber.

This tutorial guides you through using Snowplow Signals to create personalized recommendations for anonymous users based on their behavioral data. By the end, you'll understand how to configure Snowplow Signals and integrate them into your app to deliver personalized experiences.

We will use a NextJS newspaper website as the example, but the principles can be applied to many use cases. The focus is on turning Snowplow events into real-time Signals attributes rather than on the recommendation algorithm itself, which is deliberately simplified.

### Before Personalization

![Initial recommendation component showing no recommendations are available yet.](./screenshots/no-recommendations.png)

### After Personalization

![Personalized recommendation component showing articles based on user interests like Technology and Data.](./screenshots/signals-recommendations.png)

## High-Level Architecture Overview

The following diagram illustrates the data flow from the user's activity on the website to the personalized recommendation they see.

![Architecture diagram showing the flow: Newspaper Website sends Snowplow events to the Signals Event Stream, which feeds the Snowplow Signals Profiles API. A NextJS App API requests user attributes from the Profiles API to generate and display recommended articles in the Recommendation Component.](./screenshots/signals-media-demo-architecture.png)

### Key Components

1. **Website with Snowplow Events**: Users browse the website, and the Snowplow SDK sends `page_view` and `article_details` events. The `article_details` event includes the article's title and category (e.g., Business, Technology); see the tracking sketch after this list.
2. **Snowplow Signals Profiles API**: Snowplow events stream into Snowplow Signals, where they are aggregated in real-time based on predefined attributes.
3. **Recommendation API Service**: When a user visits the homepage, the website calls an internal API (`/api/recommended-articles`). This service fetches the user's real-time attributes from the Profiles API and recommends articles based on their most-read categories and articles similar to their last-read one.
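
As a reference, here is a minimal sketch of how the `article_details` event could be tracked with the Snowplow Browser Tracker. The schema URI is assembled from the vendor, name, and version used later in this tutorial; adjust it to match your own schema:

```js
import { trackSelfDescribingEvent } from "@snowplow/browser-tracker"

// Assumed schema URI, built from the vendor/name/version defined in this tutorial
trackSelfDescribingEvent({
  event: {
    schema: "iglu:com.snplow.example/article_details/jsonschema/1-0-0",
    data: {
      name: "Example Article Title", // article title
      category: "business", // article category
    },
  },
})
```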

## Prerequisites & Assumptions

This tutorial assumes you have a basic understanding of Snowplow Event Tracking and how APIs work.

The complete code for this example is available in [this repository](https://github.com/snowplow-incubator/signals-media-demo). To run it yourself, you will need:
* A running Snowplow pipeline.
* Snowplow Signals deployed on your pipeline.
`tutorials/signals-real-time-media-personalisation/meta.json`:

```json
{
  "title": "Real-time Recommendations with Snowplow Signals",
  "label": "Solution accelerator",
  "description": "How to build a real-time recommendation system with Snowplow Signals on a newspaper website"
}
```