Commit f85a689

pushing for rtd

1 parent ca4b70e, commit f85a689

3 files changed: +36 / -20 lines

docs/day2/cyoa.rst (+22 -4)

@@ -4,7 +4,25 @@ Choose Your Own Adventure

 The Choose Your Own Adventures have been structured to allow for pure model exploration without worrying as much about the dataset or the training routine. In each notebook, you will find an implemented dataset loading routine as well as an implemented training routine. These two parts should be familiar to you from the last two days of content. What is not implemented is a model definition or its instantiation.

-It is up to you what you want to use! Do you want to build the Continuous Bag of Words (CBOW)? Use an RNN or a CNN? Do you want to combine the CNN and RNN? Try out whatever you like and see if you can get the highest accuracy in class!
+There are several kinds of NLP task you could solve:
+
+- Classification
+   - (day_2/CYOA-Amazon-Reviews) Sentiment Analysis with Amazon Reviews
+   - (day_2/CYOA-Movie-Reviews) Sentiment Analysis with Movie Reviews
+   - (day_2/CYOA-Surname-Classification) Surname Classification
+   - (day_2/CYOA-Twitter-Language-ID) Twitter Language ID
+   - (day_2/CYOA-CFPB-Classification) CFPB Classification
+   - News categorization with NLTK's Brown corpus
+   - Question Classification with NLTK's question dataset
+- Sequence Prediction
+   - Language Model with NLTK's Shakespeare dataset
+- Sequence Tagging
+   - NER with NLTK's CoNLL NER dataset
+   - Chunking with NLTK's CoNLL Chunking dataset
+
+The tasks that have notebooks are indicated. We did not include notebooks for every task, to narrow the scope of the in-class work.

 Strategies for Model Exploration
 --------------------------------
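The docs above note that the notebooks implement data loading and training but leave the model definition to you. As a point of reference for the kind of model definition expected, here is a minimal CBOW-style classifier sketch; all sizes (`vocab_size`, `emb_dim`, `num_classes`) are hypothetical stand-ins for values the notebook's vectorizer would provide.

```python
import torch
import torch.nn as nn

# Hypothetical sizes; the real values come from the notebook's dataset/vectorizer.
vocab_size, emb_dim, num_classes = 1000, 32, 5

class CBOWClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.fc = nn.Linear(emb_dim, num_classes)

    def forward(self, token_ids):
        # Average the word vectors ("bag of words"), then classify the pooled vector.
        pooled = self.emb(token_ids).mean(dim=1)
        return self.fc(pooled)

model = CBOWClassifier()
logits = model(torch.randint(0, vocab_size, (4, 10)))  # batch of 4 sequences, length 10
print(logits.shape)  # torch.Size([4, 5])
```

Instantiating the model and passing it to the provided training routine is then the only missing piece.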
@@ -50,12 +68,12 @@ More complicated models to try

 1. Better RNNs (Gated Recurrent Unit (GRU), Long Short Term Memory (LSTM))
    - Instead of using the simple RNN provided, use an RNN variant that has gating, such as the GRU or LSTM

+2. BiRNN
+   - Use the bi-directional RNN model from the day_2 notebook
+
 3. CNN + RNN
    - One thing you could try is to apply a CNN one or more times to create sequences of vectors that are informed by their neighboring vectors. Then, apply the RNN to learn a sequence model over these vectors. You can use the same method to pull out the final vector for each sequence, with one caveat: if you apply the CNN in a way that shrinks the sequence dimension, the indices of the final positions won't quite be right. One way around this is to have the CNN keep the sequence dimension the same size, by setting the `padding` argument to `kernel_size//2`. For example, if `kernel_size=3`, then `padding=1`; similarly, if `kernel_size=5`, then `padding=2`. The padding is added to both sides of the sequence dimension.
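The CNN + RNN idea above can be sketched as follows; all sizes here are illustrative assumptions, and `Conv1d` is used for the CNN step since it expects `(batch, channels, seq)` input.

```python
import torch
import torch.nn as nn

# Illustrative sizes, not taken from the notebooks.
batch, seq_len, emb_dim, hidden = 4, 12, 32, 64
x = torch.randn(batch, seq_len, emb_dim)

# padding=kernel_size//2 keeps the sequence dimension the same size.
kernel_size = 3
conv = nn.Conv1d(emb_dim, emb_dim, kernel_size=kernel_size, padding=kernel_size // 2)

# Conv1d wants (batch, channels, seq), so permute in and back out.
convolved = conv(x.permute(0, 2, 1)).permute(0, 2, 1)
assert convolved.shape[1] == seq_len  # sequence length preserved

# Apply an RNN (a GRU here) over the neighbor-informed vectors.
rnn = nn.GRU(emb_dim, hidden, batch_first=True)
outputs, _ = rnn(convolved)
print(outputs.shape)  # torch.Size([4, 12, 64])
```

Because the sequence length is unchanged, the usual indices for pulling out each sequence's final vector remain valid.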

-3. Deep Averaging Network
-   - The Deep Averaging Network is very similar to CBOW, but has one major difference: it applies an MLP to the pooled vectors.
-
 4. Using Attention
    - If you're feeling ambitious, try implementing attention!
    - One way to do attention is to use a Linear layer which maps feature vectors to scalars
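The Linear-layer attention mentioned above can be sketched like this: a `Linear(hidden, 1)` scores each position, a softmax over the sequence turns scores into weights, and a weighted sum pools the sequence. The sizes and the random `rnn_outputs` tensor are stand-ins for real RNN output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in for the output of an RNN: (batch, seq_len, hidden).
batch, seq_len, hidden = 4, 12, 64
rnn_outputs = torch.randn(batch, seq_len, hidden)

# A Linear layer maps each feature vector to a scalar attention score.
score_layer = nn.Linear(hidden, 1)
scores = score_layer(rnn_outputs).squeeze(-1)   # (batch, seq_len)

# Softmax over the sequence dimension -> weights that sum to 1 per example.
weights = F.softmax(scores, dim=1)

# Weighted sum of the feature vectors -> one pooled vector per example.
pooled = (weights.unsqueeze(-1) * rnn_outputs).sum(dim=1)
print(pooled.shape)  # torch.Size([4, 64])
```

The pooled vector can then replace the "final vector" that the simpler models feed to their classifier.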

docs/extras/load_pretrained_vectors.rst (+1 -5)

@@ -33,7 +33,6 @@ Then, we can load that embedding matrix:

 .. code-block:: python

     load_pretrained = True
-    embedding_size = 32
     pretrained_embeddings = None

     if load_pretrained:

@@ -47,7 +46,4 @@ And we can use it in an embedding layer:

 .. code-block:: python

-    emb = nn.Embedding(embedding_dim=embedding_size,
-                       num_embeddings=num_embeddings,
-                       padding_idx=0,
-                       _weight=pretrained_embeddings)
+    emb = nn.Embedding.from_pretrained(embeddings=pretrained_embeddings, freeze=False, padding_idx=0)
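The switch above replaces the private `_weight` argument with the public `nn.Embedding.from_pretrained` constructor. A self-contained sketch of the new form, with a random matrix standing in for real pretrained vectors:

```python
import torch
import torch.nn as nn

# Stand-in for a loaded pretrained matrix: 10 words, 32-dim vectors.
pretrained_embeddings = torch.randn(10, 32)

# from_pretrained initializes the layer from the matrix;
# freeze=False keeps the vectors trainable (freeze=True is the default).
emb = nn.Embedding.from_pretrained(embeddings=pretrained_embeddings,
                                   freeze=False, padding_idx=0)

token_ids = torch.tensor([1, 4, 0])  # 0 is the padding index
vectors = emb(token_ids)
print(vectors.shape)  # torch.Size([3, 32])
```

Note that `freeze=False` matters here: the default `freeze=True` would exclude the embedding weights from gradient updates.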

docs/index.rst (+13 -11)

@@ -15,18 +15,20 @@ Natural Language Processing (NLP) with PyTorch

    :caption: Day 1 Materials

    day1/solutions
+
+.. toctree::
+   :hidden:
+   :maxdepth: 3
+   :caption: Day 2 Materials
+
+   day2/warmup
+   day2/failfastprototypemode
+   day2/tensorfu1

 ..
-   .. toctree::
-      :hidden:
-      :maxdepth: 3
-      :caption: Day 2 Materials
-
-      day2/warmup
-      day2/failfastprototypemode
-      day2/tensorfu1
-      day2/tensorfu2
-      day2/cyoa
-      extras/index
+   day2/tensorfu2
+   day2/cyoa
+   extras/index

 ..
    day2/cyoa
