Commit cd7dfba

Simpler example in README, naming from "clusters" to "topics", expanded on notebook
1 parent 811df14 commit cd7dfba

4 files changed, +13548 -67 lines changed


README.md

+41 -2
@@ -34,6 +34,45 @@ Topically's first feature is to name clusters of texts based on their content. F
 <img src="./assets/topically-name_cluster.png" />
 
 
+# Usage Example
+Use Topically to name clusters in the course of topic modeling.
+
+```python
+import topically
+
+app = topically.Topically('cohere_api_key')
+
+example_texts = [
+# Three headlines from the machine learning subreddit
+"[Project] From books to presentations in 10s with AR + ML",
+"[D] A Demo from 1993 of 32-year-old Yann LeCun showing off the World's first Convolutional Network for Text Recognition",
+"[R] First Order Motion Model applied to animate paintings",
+
+# Three headlines from the investing subreddit
+"Robinhood and other brokers literally blocking purchase of $GME, $NOK, $BB, $AMC; allow sells",
+"United Airlines stock down over 5% premarket trading",
+"Bitcoin was nearly $20,000 a year ago today"]
+
+# We know the first three texts belong to one topic (topic 0), the last three belong to another topic (topic 1)
+example_topics = [0, 0, 0, 1, 1, 1]
+
+topic_names = app.name_topics((example_texts, example_topics))  # Optional: num_generations=5
+topic_names  # Run again to get new suggested names. More text examples should result in better names.
+
+```
+
+Output:
+```
+['Text recognition',
+'Text recognition',
+'Text recognition',
+'Stock Market Closing Bell',
+'Stock Market Closing Bell',
+'Stock Market Closing Bell']
+```
+
+In this simple example, we know the cluster assignments. In actual applications, a topic modeling library like BERTopic can cluster the texts for us, and then we can name them with topically.
+
 # Usage Example: Topically + BERTopic
 Use Topically to name clusters in the course of topic modeling with tools like BERTopic. Get the cluster assignments from BERTopic, and name the clusters with topically. Here's example code and a colab notebook demonstrating this.
 
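The output shown in the added README example holds one name per input text, so each name repeats for every text in its topic. Below is a minimal sketch, assuming that one-name-per-text shape, of collapsing the list into a lookup keyed by topic id. `names_by_topic` is a hypothetical helper written for illustration and is not part of Topically; the name list is copied from the README output rather than produced by an API call.

```python
def names_by_topic(topics, names):
    """Collapse a per-text list of names into one name per topic id.

    Hypothetical helper for illustration; not part of the Topically API.
    """
    mapping = {}
    for topic, name in zip(topics, names):
        # setdefault keeps the first name seen for each topic id
        mapping.setdefault(topic, name)
    return mapping


example_topics = [0, 0, 0, 1, 1, 1]

# Per-text names copied from the README output (no API call is made here)
per_text_names = ["Text recognition"] * 3 + ["Stock Market Closing Bell"] * 3

topic_lookup = names_by_topic(example_topics, per_text_names)
print(topic_lookup)
# {0: 'Text recognition', 1: 'Stock Market Closing Bell'}
```

A lookup like this is convenient when labeling plots or summary tables, where one row per topic is wanted instead of one row per text.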

@@ -56,9 +95,9 @@ df['topic'], probabilities = topic_model.fit_transform(df['title'], embeds)
 app = Topically('cohere_api_key')
 
 # name clusters
-df['cluster_names'] = app.name_clusters((df['title'], df['topic']))
+df['topic_names'] = app.name_topics((df['title'], df['topic']))
 
-df[['title', 'topic', 'cluster_names']]
+df[['title', 'topic', 'topic_names']]
 ```
 
 
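This hunk switches the README from `name_clusters` to `name_topics`. The README diff alone does not confirm how the library handles the old name, but if the method itself was renamed, code written against `name_clusters` would stop working. Below is a hedged sketch of one common way to soften such a rename: keep the old name as a thin alias that emits a `DeprecationWarning`. `NamerStub` is a hypothetical stand-in so the example runs offline; it is not the real `Topically` class, and its placeholder naming logic is an assumption.

```python
import warnings


class NamerStub:
    """Hypothetical stand-in for Topically so this sketch runs without an API key."""

    def name_topics(self, texts_and_topics):
        texts, topics = texts_and_topics
        # Placeholder naming only; the real library generates names from the texts
        return [f"topic {t}" for t in topics]

    def name_clusters(self, texts_and_topics):
        """Assumed deprecated alias that forwards to name_topics."""
        warnings.warn(
            "name_clusters is deprecated; use name_topics",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.name_topics(texts_and_topics)


app = NamerStub()

# Record warnings so the deprecation notice can be inspected
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    old_style = app.name_clusters((["a", "b"], [0, 1]))

new_style = app.name_topics((["a", "b"], [0, 1]))
```

Both calls return the same list, and only the old name triggers a warning, which gives callers time to migrate.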