"[](https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tutorials/streamlit_notebooks/CONTEXTUAL_WORD_MEANING.ipynb)\n",
"\n"
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"# **Infer word meaning from context**"
],
"cell_type": "markdown",
"metadata": {}
},
{
"source": [
"Compare the meaning of words in two different sentences and evaluate ambiguous pronouns."
"The `T5 Transformer` model is able to perform 18 different tasks (ref.: [this paper](https://arxiv.org/abs/1910.10683)). To infer word meaning from context, we can use the following tasks:\n",
"\n",
"- `wic`: Given a pair of sentences and an ambiguous word, classify whether the word has the same meaning in both sentences.\n",
"- `wsc-dpr`: Given a sentence with an ambiguous pronoun, predict what the pronoun refers to."
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"#TASK = 'wic'\n",
"TASK = 'wsc-dpr'"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"# Prefix to be used on the T5Transformer().setTask(<<prefix>>)\n",
"task_prefix = {\n",
" 'wic': 'wic pos::', \n",
" 'wsc-dpr': 'wsc:',\n",
" }"
]
},
{
"source": [
"## 4. Examples to try on the model"
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"text_lists = {\n",
" 'wic': [\"\"\"\n",
" pos:\n",
" sentence1: The expanded window will give us time to catch the thieves.\n",
" sentence2: You have a two-hour window of turning in your homework.\n",
" word: window\n",
"\"\"\"],\n",
" 'wsc-dpr': [\"\"\"The stable was very roomy, with four good stalls; a large swinging window opened into the yard , which made *it* pleasant and airy.\"\"\"]\n",
" }"
]
},
{
"source": [
"## 5. Define the Spark NLP pipeline"
],
"cell_type": "markdown",
"metadata": {}
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"t5_base download started this may take some time.\n",
"+-----------------------------------------------------------------------------------------------------------------------------------+------+\n| text|result|\n+-----------------------------------------------------------------------------------------------------------------------------------+------+\n|The stable was very roomy, with four good stalls; a large swinging window opened into the yard , which made *it* pleasant and airy.|[True]|\n+-----------------------------------------------------------------------------------------------------------------------------------+------+\n\n"