Comparison of three approaches for the Natural Language Processing task WiC

prosho-97/Word-in-Context-Disambiguation

Word-in-Context-Disambiguation

Comparison of three approaches for the Natural Language Processing task WiC.

Description

The task and the tested models are explained in the report.pdf file.

OS

I developed this code on an Ubuntu 20.04.2 LTS machine.

How to run

Requirements

  • conda;

  • docker, to avoid environment-related problems when running the code.

Notes

Unless otherwise stated, all commands here are expected to be run from the root directory of this project.

Setup Environment

To run test.sh, we need to perform two additional steps:

  • Install Docker
  • Set up a client

test.sh essentially sets up a server that exposes the model through a REST API and then queries this server to evaluate the model. So first, you need to install Docker:

curl -fsSL get.docker.com -o get-docker.sh
sudo sh get-docker.sh
rm get-docker.sh
sudo usermod -aG docker $USER

Unfortunately, for the last command to take effect, you need to log out and log back in. Do it before proceeding.

The model will be exposed through a REST server. In order to call it, we need a client. The client has been written in the evaluation script, but it needs some dependencies to run. We will be using conda to create the environment for this client.

conda create -n nlp2021-hw1 python=3.7
conda activate nlp2021-hw1
pip install -r requirements.txt

Run

test.sh is a simple bash script. To run it:

conda activate nlp2021-hw1
bash test.sh data/dev.jsonl

You can replace data/dev.jsonl with a different file, as long as that file has the same format.
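Before pointing test.sh at a different file, it can help to check that the file looks like data/dev.jsonl. The sketch below is only a rough heuristic, not part of the project: it assumes that "same format" means every line is a JSON object with the same field names as the reference file, and it does not check field types or values. The function names jsonl_keys and same_format are made up for this example.

```python
import json


def jsonl_keys(path):
    """Return the set of field names shared by every record in a JSONL file."""
    keys = None
    with open(path, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            record = json.loads(line)  # raises ValueError on malformed JSON
            if not isinstance(record, dict):
                raise ValueError(f"{path}:{line_no}: expected a JSON object")
            if keys is None:
                keys = set(record)
            elif set(record) != keys:
                raise ValueError(
                    f"{path}:{line_no}: fields differ from the first record"
                )
    if keys is None:
        raise ValueError(f"{path}: no records found")
    return keys


def same_format(reference_path, candidate_path):
    """Heuristic check: both files use exactly the same field names."""
    return jsonl_keys(reference_path) == jsonl_keys(candidate_path)
```

For example, same_format("data/dev.jsonl", "data/my_file.jsonl") returns True when both files use the same field names; data/my_file.jsonl is a placeholder path.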

Additional instructions

The StudentModel class takes two parameters. The first is the device. The second selects the model and defaults to "SE", so the best model is tested by default. The other two accepted values are "IDF-AVG" and "SIF", which select the other two implemented models. The stud folder contains the three corresponding IPython notebooks with the training code for each model.
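As a minimal sketch of how the model choice could be guarded, the helper below validates the name before constructing the model. It is an illustration only: build_model and VALID_MODEL_NAMES are hypothetical names that do not exist in the project, and the sketch assumes the class accepts the device and the model name as its two positional arguments, in that order.

```python
# The three values described above; "SE" is the default.
VALID_MODEL_NAMES = ("SE", "IDF-AVG", "SIF")


def build_model(model_cls, device, model_name="SE"):
    """Instantiate model_cls after checking the requested model name.

    model_cls is expected to be the project's StudentModel class.
    Assumption: it is called as model_cls(device, model_name).
    """
    if model_name not in VALID_MODEL_NAMES:
        raise ValueError(
            f"unknown model {model_name!r}; expected one of {VALID_MODEL_NAMES}"
        )
    return model_cls(device, model_name)
```

With StudentModel imported from wherever the project defines it, build_model(StudentModel, "cpu", "SIF") would select the SIF model, while a misspelled name fails early with a clear error.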

To run the code with the "SE" model, you need to download the glove.840B.300d vectors (they are also needed for the other two models) and the trained model, and put them in the model/ folder.
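A quick way to confirm that the vectors are in place before launching test.sh is sketched below. It assumes the vectors file keeps a name starting with glove.840B.300d; the exact file name and the name of the trained model file are not stated here, so the trained model is not checked. find_glove_files is a hypothetical helper, not part of the project.

```python
from pathlib import Path


def find_glove_files(model_dir="model"):
    """Return files in model_dir whose name starts with 'glove.840B.300d'.

    Assumption: the downloaded vectors keep this prefix in their file name.
    """
    model_dir = Path(model_dir)
    if not model_dir.is_dir():
        return []
    return sorted(
        p
        for p in model_dir.iterdir()
        if p.is_file() and p.name.startswith("glove.840B.300d")
    )


if __name__ == "__main__":
    found = find_glove_files()
    if found:
        print("Found vectors:", ", ".join(str(p) for p in found))
    else:
        print("No glove.840B.300d file found in model/")
```

Run it from the root directory of the project so that the relative model/ path resolves correctly.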
