Commit 16b686f

feat: add movie dataset
1 parent b9a2be0 commit 16b686f

5 files changed: +5869 -2 lines changed

playground/12-embeddings.ipynb

Lines changed: 120 additions & 0 deletions
@@ -0,0 +1,120 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import openai"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "from dotenv import dotenv_values\n",
+    "config = dotenv_values(\".env\")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "openai.api_key = config[\"OPENAI_API_KEY\"]"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Generating a single embedding"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "response = openai.Embedding.create(\n",
+    "    model=\"text-embedding-ada-002\",\n",
+    "    input=\"candy canes\"\n",
+    ")"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "response[\"data\"][0][\"embedding\"]"
+   ]
+  },
+  {
+   "attachments": {},
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "## Movies plotting with Atlas"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 5,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "import numpy as np"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "dataset_path = \"./datasets/movie_plots.csv\"\n",
+    "df = pd.read_csv(dataset_path)"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 12,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "movies = df[df[\"Origin/Ethnicity\"] == \"American\"].sort_values(\"Release Year\", ascending=False).head(500)"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": ".venv",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.5"
+  },
+  "orig_nbformat": 4
+ },
+ "nbformat": 4,
+ "nbformat_minor": 2
+}
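
Note on the notebook above: the two cells under "Generating a single embedding" call the API and then index into the result with `response["data"][0]["embedding"]`. The sketch below mirrors only that indexing step so it can run without an API key. `fake_response` and `first_embedding` are hypothetical names, the dict models just the keys the notebook reads, and the numbers are made up and far shorter than a real embedding.

```python
# Stand-in for the object returned by openai.Embedding.create in the notebook.
# Assumption: only the "data" -> [0] -> "embedding" path used in the notebook
# is modelled here; the values are invented for illustration.
fake_response = {
    "data": [
        {"embedding": [0.12, -0.03, 0.57]},
    ],
}


def first_embedding(response):
    """Return the embedding vector for the first input in the response."""
    return response["data"][0]["embedding"]


vector = first_embedding(fake_response)
print(len(vector), vector)
```

Wrapping the lookup in a small helper keeps the nested indexing in one place if the notebook later embeds many movie plots instead of a single string.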

playground/README.md

Lines changed: 13 additions & 1 deletion
@@ -77,6 +77,18 @@ You need to create a `.env` file with your `OPENAI_API_KEY`.
 
 - writing the playlist generating prompt.
 
-[Check the notebook](12-gpt-4-ai-spotify-playlist-generator.ipynb)
+[Check the notebook](11-gpt-4-ai-spotify-playlist-generator.ipynb)
+
+## Embeddings
+
+- generating a single embedding.
+- creating a movie embedding visualization with Atlas.
+- getting our movie data.
+- getting our movie data ready.
+- generating embeddings for 5000 movies.
+- visualizing our embeddings with atlas.
+- recommending movies using our embeddings.
+
+[Check the notebook](12-embeddings.ipynb)
 
 Based on [Mastering OpenAI Python APIs: Unleash the Power of GPT4](https://www.udemy.com/course/mastering-openai/) by Colt Steele (2023).
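
Note on the README list above: the last item, "recommending movies using our embeddings", is not implemented in the cells this commit adds. One common way to do it is to rank the other movies by cosine similarity to a query movie's vector. The following is a minimal sketch under that assumption, not the notebook's actual method. The function names, the movie titles, and the 3-dimensional vectors are all hypothetical, and plain Python is used instead of NumPy to keep the example dependency-free.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recommend(title, embeddings, k=2):
    """Return the k titles whose vectors are most similar to `title`'s vector."""
    query = embeddings[title]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in embeddings.items()
        if other != title  # never recommend the query movie itself
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


# Hypothetical toy vectors; real embeddings would come from the API.
toy = {
    "Space Saga": [0.9, 0.1, 0.0],
    "Star Quest": [0.8, 0.2, 0.1],
    "Romance in Rome": [0.0, 0.9, 0.4],
    "Love Letters": [0.1, 0.8, 0.5],
}

print(recommend("Space Saga", toy, k=1))
```

With a few hundred movies a linear scan like this is usually fast enough; a vector index only becomes worthwhile for much larger collections.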
