diff --git a/README.md b/README.md
index 2136ff0..d22598f 100644
--- a/README.md
+++ b/README.md
@@ -1,10 +1,38 @@
-Code for the EMNLP paper, "[Bootstrapping Transliteration with Guided Discovery for Low-Resource Languages](http://shyamupa.com/papers/UKR18.pdf)".
+### Using Trained Models for Generating Transliterations

-[[https://github.com/shyamupa/hma-translit/blob/master/image.pdf|alt=model figure]]
+Download and untar the relevant trained model.
+Right now, models for [bengali](http://bilbo.cs.illinois.edu/~upadhya3/bengali.tar.gz), [kannada](http://bilbo.cs.illinois.edu/~upadhya3/kannada.tar.gz), and [hindi](http://bilbo.cs.illinois.edu/~upadhya3/hindi.tar.gz), trained on the NEWS2015 datasets, are available.

-Tested using pytorch version '0.3.1.post2' with python3.
+Each tarball contains the vocab files and the pytorch model.

-## Running the code
+#### Interactive Mode
+To run in interactive mode,
+
+```bash
+./load_and_test_model_interactive.sh hindi_data.vocab hindi.model
+```
+
+#### Get Predictions for Test Input
+1. First prepare a test file (let's call it `hindi.test`) such that each line contains a sequence of space-separated characters of each input token,
+
+```
+आ च र े क र
+आ च व ल
+```
+
+2. Then run the trained model on it using the following command,
+```bash
+./load_and_test_model_on_files.sh hindi_data.vocab hindi.model hindi.test hindi.test.out
+```
+This will generate output in the output file (`hindi.test.out`) as follows,
+
+```
+आ च र े क र	a c h a r e k a r;a c h a b e k a r;a a c h a r e k a r	-0.6695770507547368;-2.079195646460341;-2.465612842870943
+```
+
+where the 2nd column is the ';'-delimited output of the beam search (using a `beam_width` of 3) and the 3rd column contains the corresponding ';'-delimited scores for each item.
+That is, the model score for `a c h a r e k a r` was `-0.6695770507547368`.
+
+### Training Your Own Model

 1. First compile the C code for the aligner.
 ```bash
@@ -20,18 +48,28 @@ x1 x2 x3	y1 y2 y3 y4 y5
 where `x1x2x3` is the input word (`xi` is the character), and `y1y2y3y4y5` is the desired output (transliteration).
 Example train and test files for bengali are in data/ folder. There is a optional 3rd column marking whether the word is *native* or *foreign* (see the paper for these terms); this column can be ignored for most purposes.

-3. Run `train_model_on_files.sh` on your train (say `train.txt`) and dev file (say `dev.txt`) as follows,
+3. Create the vocab files and aligned data using `prepare_data.sh`,
+```bash
+./prepare_data.sh hindi_train.txt hindi_dev.txt 100 hindi_data.vocab hindi_data.aligned
 ```
-./train_model_on_files.sh train.txt dev.txt 100 translit.model
-```
-where 100 is the random seed and translit.model is the output model. Other parameters(see `utils/arguments.py` for options) can be specified by modifying the `train_model_on_files.sh` script appropriately.
+This will create two vocab files, `hindi_data.vocab.envoc` and `hindi_data.vocab.frvoc`, and a file `hindi_data.aligned` containing the (monotonically) aligned training data.
+
-4. Test the trained model as follows,
+4. Run `train_model_on_files.sh` with the vocab files, the aligned data, and your dev file (say `hindi_dev.txt`) as follows,
+```bash
+./train_model_on_files.sh hindi_data.vocab hindi_data.aligned hindi_dev.txt 100 hindi.model
 ```
-./load_and_test_model_on_files.sh train.txt test.txt translit.model 100 output.txt
+
+where 100 is the random seed and hindi.model is the output model.
+Other parameters like embedding size and hidden size (see `utils/arguments.py` for all options) can be specified by modifying the `train_model_on_files.sh` script appropriately.
+
+5. Test the trained model as follows,
+
+```bash
+./load_and_test_model_on_files.sh hindi_data.vocab hindi.model hindi_test.txt output.txt
 ```
 The output should report relevant metrics,
@@ -59,8 +97,12 @@ The output should report relevant metrics,
 There is also a interactive mode where one can input test words directly,

+```bash
+./load_and_test_model_interactive.sh hindi_data.vocab hindi.model
+```
+
+You will see a prompt to enter surface forms in the source writing script (see below).
 ```
-./load_and_test_model_interactive.sh train.txt translit.model 100
 ...
 ...
 :INFO: => loading checkpoint hindi.model
@@ -70,13 +112,3 @@
 enter surface:ओबामा
 [(-0.4624647759074629, 'o b a m a')]
 ```
-### Citation
-
-```
-@InProceedings{UKR18,
-  author = {Upadhyay, Shyam and Kodner, Jordan and Roth, Dan},
-  title = {Bootstrapping Transliteration with Guided Discovery for Low-Resource Languages},
-  booktitle = {EMNLP},
-  year = {2018},
-}
-```
diff --git a/create_data.sh b/create_data.sh
new file mode 100755
index 0000000..978c701
--- /dev/null
+++ b/create_data.sh
@@ -0,0 +1,30 @@
+#!/usr/bin/env bash
+ME=`basename $0`  # for usage message
+
+if [[ "$#" -ne 5 ]]; then  # number of args
+    echo "USAGE: ${ME} <ftrain> <ftest> <seed> <vocabfile> <aligned_file>"
+    exit
+fi
+ftrain=$1
+ftest=$2
+seed=$3
+vocabfile=$4
+aligned_file=$5
+
+time python -m seq2seq.prepare_data \
+     --ftrain ${ftrain} \
+     --ftest ${ftest} \
+     --vocabfile ${vocabfile} \
+     --aligned_file ${aligned_file} \
+     --seed ${seed}
+
+
+
+
+
+if [[ $? == 0 ]]  # success
+then
+    :  # do nothing
+else  # something went wrong
+    echo "SOME PROBLEM OCCURRED";  # report failure
+fi
diff --git a/load_and_test_model_interactive.sh b/load_and_test_model_interactive.sh
index 99f5868..6ddd4a7 100755
--- a/load_and_test_model_interactive.sh
+++ b/load_and_test_model_interactive.sh
@@ -1,21 +1,19 @@
 #!/usr/bin/env bash
 ME=`basename $0`  # for usage message
-if [ "$#" -ne 3 ]; then  # number of args
-    echo "USAGE: ${ME} <ftrain> <model> <seed>"
+if [[ "$#" -ne 2 ]]; then  # number of args
+    echo "USAGE: ${ME} <vocabfile> <model>"
     echo
     exit
 fi
-ftrain=$1
+vocabfile=$1
 model=$2
-seed=$3
-time python -m seq2seq.main \
-     --ftrain ${ftrain} \
+time python -m seq2seq.predict \
+     --vocabfile ${vocabfile} \
     --mono \
     --beam_width 1 \
    --restore ${model} \
-    --interactive \
-    --seed ${seed}
+    --interactive

 if [[ $? == 0 ]]  # success
 then
diff --git a/load_and_test_model_on_files.sh b/load_and_test_model_on_files.sh
index 713e5ac..eedf37a 100755
--- a/load_and_test_model_on_files.sh
+++ b/load_and_test_model_on_files.sh
@@ -1,24 +1,22 @@
 #!/usr/bin/env bash
 ME=`basename $0`  # for usage message
-if [ "$#" -ne 5 ]; then  # number of args
-    echo "USAGE: <ftrain> <ftest> <model> <seed> <outfile>"
+if [[ "$#" -ne 4 ]]; then  # number of args
+    echo "USAGE: <vocabfile> <model> <ftest> <outfile>"
     echo "$ME"
     exit
 fi
-ftrain=$1
-ftest=$2
-model=$3
-seed=$4
-out=$5
-time python -m seq2seq.main \
-     --ftrain ${ftrain} \
+vocabfile=$1
+model=$2
+ftest=$3
+outfile=$4
+time python -m seq2seq.predict \
+     --vocabfile ${vocabfile} \
     --ftest ${ftest} \
     --mono \
     --beam_width 1 \
    --restore ${model} \
-    --seed ${seed} \
-    --dump ${out}
+    --dump ${outfile}
diff --git a/readers/aligned_reader.py b/readers/aligned_reader.py
index 46ab9ab..2ef9599 100644
--- a/readers/aligned_reader.py
+++ b/readers/aligned_reader.py
@@ -1,20 +1,15 @@
 from __future__ import division
 from __future__ import print_function
-import sys
 import logging
+import random

-from seq2seq.lang import Lang
-from seq2seq.constants import ALIGN_SYMBOL
 from baseline import align_utils
-
-import random
-from collections import Counter
-# from seq2seq.main import oracle_action
+from seq2seq.constants import ALIGN_SYMBOL
 from seq2seq.constants import STEP
+
 # logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
-import argparse


 def safe_replace_spaces(s):
@@ -24,6 +19,29 @@ def safe_replace_spaces(s):
     return s


+def subsample_examples(examples, frac, single_token):
+    new_examples = []
+    for ex in examples:
+        fr, en, weight, is_eng = ex
+        frtokens, entokens = fr.split(" "), en.split(" ")
+        if len(frtokens) != len(entokens): continue
+        if single_token:
+            if len(frtokens) > 1 or len(entokens) > 1: continue
+        for frtok, entok in zip(frtokens, entokens):
+            new_examples.append((frtok, entok, weight, is_eng))
+    examples = new_examples
+    logging.info("new examples %d", len(examples))
+    # subsample if needed
+    random.shuffle(examples)
+    if frac < 1.0:
+        tmp = examples[0:int(frac * len(examples))]
+        examples = tmp
+    elif frac > 1.0:
+        tmp = examples[0:int(frac)]
+        examples = tmp
+    return examples
+
+
 def read_examples(fpath, native_or_eng="both", remove_spaces=False, weight=1.0):
     examples = []
     bad = 0
diff --git a/seq2seq/main.py b/seq2seq/main.py
index 5732fae..92d1f70 100644
--- a/seq2seq/main.py
+++ b/seq2seq/main.py
@@ -1,67 +1,27 @@
-import random
 import logging
+import random
 import sys

+import numpy as np
 import torch
 import torch.nn as nn
-import numpy as np

-from utils.arguments import PARSER
-from readers.aligned_reader import load_aligned_data, read_examples
-from seq2seq.constants import STEP
+from readers.aligned_reader import read_examples
 from seq2seq.evaluators.reporter import AccReporter, get_decoded_words
-from seq2seq.lang import Lang
+from seq2seq.model_utils import load_checkpoint, model_builder, setup_optimizers
+from seq2seq.prepare_data import langcodes, load_vocab_and_examples
 from seq2seq.runner import run
 from seq2seq.trainers.monotonic_train import MonotonicTrainer
-from seq2seq.model_utils import load_checkpoint, model_builder, setup_optimizers
+from utils.arguments import PARSER

 # logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s', level=logging.INFO)
 logging.basicConfig(format=':%(levelname)s: %(message)s', level=logging.INFO)

-
-def subsample_examples(examples, frac, single_token):
-    new_examples = []
-    for ex in examples:
-        fr, en, weight, is_eng = ex
-        frtokens, entokens = fr.split(" "), en.split(" ")
-        if len(frtokens) != len(entokens): continue
-        if single_token:
-            if len(frtokens) > 1 or len(entokens) > 1: continue
-        for frtok, entok in zip(frtokens, entokens):
-            new_examples.append((frtok, entok, weight, is_eng))
-    examples = new_examples
-    logging.info("new examples %d", len(examples))
-    # subsample if needed
-    random.shuffle(examples)
-    if frac < 1.0:
-        tmp = examples[0:int(frac * len(examples))]
-        examples = tmp
-    elif frac > 1.0:
-        tmp = examples[0:int(frac)]
-        examples = tmp
-    return examples
-
-
-def index_vocab(examples, fr_lang, en_lang):
-    for ex in examples:
-        raw_x, raw_y, xs, ys, weight, is_eng = ex
-        fr_lang.index_words(xs)
-        en_lang.index_words(ys)
-    logging.info("train size %d", len(examples))
-
-
-langcodes = {"hi": "hindi", "fa": "farsi", "ta": "tamil", "ba": "bengali", "ka": "kannada", "he": "hebrew",
-             "th": "thai"}
-
 if __name__ == '__main__':
     args = PARSER.parse_args()
     args = vars(args)
     logging.info(args)
-    batch_first = args["batch_first"]
-    device_id = args["device_id"]
     seed = args["seed"]
-    native_or_eng = args["nat_or_eng"]
-    single_token = args["single_token"]

     remove_spaces = True
     np.random.seed(seed)
@@ -71,31 +31,15 @@ def index_vocab(examples, fr_lang, en_lang):
     lang = langcodes[args["lang"]]

-    trainpath = "data/%s/%s_train_annotateEN" % (lang, lang) if args["ftrain"] is None else args["ftrain"]
-    testpath = "data/%s/%s_test_annotateEN" % (lang, lang) if args["ftest"] is None else args["ftest"]
-
-    examples = read_examples(fpath=trainpath,
-                             native_or_eng=native_or_eng,
-                             remove_spaces=remove_spaces)
+    testpath = args["ftest"]

-    examples = subsample_examples(examples=examples, frac=args["frac"], single_token=single_token)
-
-    fr_lang, en_lang = Lang(name="fr"), Lang(name="en")
-    examples = load_aligned_data(examples=examples,
-                                 mode="mcmc",
-                                 seed=seed)
-    index_vocab(examples, fr_lang, en_lang)
-    en_lang.index_word(STEP)
-    fr_lang.compute_maps()
-    en_lang.compute_maps()
-    # see_phrase_alignments(examples=examples)
+    fr_lang, en_lang, examples = load_vocab_and_examples(vocabfile=args["vocabfile"], aligned_file=args["aligned_file"])

     logging.info(fr_lang.word2index)
     logging.info(en_lang.word2index)

+    # ALWAYS READ ALL TEST EXAMPLES
     test = read_examples(fpath=testpath)
-    train = read_examples(fpath=trainpath)
-    train = [ex for ex in train if ' ' not in ex[0] and ' ' not in ex[1]]
     logging.info("input vocab: %d", fr_lang.n_words)
     logging.info("output vocab: %d", en_lang.n_words)
     logging.info("beam width: %d", args["beam_width"])
@@ -113,8 +57,6 @@ def index_vocab(examples, fr_lang, en_lang):
     # Begin!
     test_reporter = AccReporter(args=args, dump_file=args["dump"])
-    train_reporter = AccReporter(args=args,
-                                 dump_file=args["dump"] + ".train.txt" if args["dump"] is not None else None)

     if args["restore"]:
         if "," in args["restore"]:
@@ -147,5 +89,4 @@ def index_vocab(examples, fr_lang, en_lang):

     run(args=args, examples=examples, trainer=trainer, evaler=evaler,
         criterion=criterion,
-        train=train, test=test,
-        train_reporter=train_reporter, test_reporter=test_reporter)
+        test=test, test_reporter=test_reporter)
diff --git a/seq2seq/model_utils.py b/seq2seq/model_utils.py
index afd454a..61ae2aa 100644
--- a/seq2seq/model_utils.py
+++ b/seq2seq/model_utils.py
@@ -31,7 +31,7 @@ def setup_optimizers(args, encoder, decoder):

 def model_builder(args, fr_lang, en_lang):
     bidi = args["bidi"]
-    device_id = args["device_id"]
+    # device_id = args["device_id"]
     batch_first = args["batch_first"]
     vector_size = args["wdim"]
     hidden_size = args["hdim"]
@@ -73,9 +73,9 @@ def model_builder(args, fr_lang, en_lang):
     logging.info(encoder)
     logging.info(decoder)
     # Move models to GPU
-    if device_id is not None:
-        encoder.cuda(device_id)
-        decoder.cuda(device_id)
+    # if device_id is not None:
+    #     encoder.cuda(device_id)
+    #     decoder.cuda(device_id)
     return encoder, decoder, evaler
diff --git a/seq2seq/predict.py b/seq2seq/predict.py
new file mode 100644
index 0000000..f0a724b
--- /dev/null
+++ b/seq2seq/predict.py
@@ -0,0 +1,62 @@
+import logging
+import sys
+
+from seq2seq.evaluators.reporter import get_decoded_words
+from seq2seq.model_utils import load_checkpoint, model_builder, setup_optimizers
+from seq2seq.prepare_data import load_vocab
+from utils.arguments import PARSER
+
+logging.basicConfig(format=':%(levelname)s: %(message)s', level=logging.INFO)
+
+if __name__ == '__main__':
+    args = PARSER.parse_args()
+    args = vars(args)
+    logging.info(args)
+
+    fr_lang, en_lang = load_vocab(vocabfile=args["vocabfile"])
+    logging.info(fr_lang.word2index)
+    logging.info(en_lang.word2index)
+
+    logging.info("input vocab: %d", fr_lang.n_words)
+    logging.info("output vocab: %d", en_lang.n_words)
+    logging.info("beam width: %d", args["beam_width"])
+
+    # Initialize models
+    encoder, decoder, evaler = model_builder(args, fr_lang=fr_lang, en_lang=en_lang)
+    enc_opt, dec_opt, enc_sch, dec_sch = setup_optimizers(args=args, encoder=encoder, decoder=decoder)
+
+    if args["restore"]:
+        load_checkpoint(encoder=encoder, decoder=decoder,
+                        enc_opt=enc_opt, dec_opt=dec_opt,
+                        ckpt_path=args["restore"])
+    if args["interactive"]:
+        try:
+            while True:
+                surface = input("enter surface:")
+                surface = " ".join(list(surface))
+                print(surface)
+                x, y, weight, is_eng = surface, None, 1.0, False
+                decoded_outputs = evaler.infer_on_example(sentence=x)
+                scores_and_words = get_decoded_words(decoded_outputs)
+                decoded_words = [w for s, w in scores_and_words]
+                scores = [s for s, w in scores_and_words]
+                print(scores_and_words)
+        except KeyboardInterrupt:
+            print('interrupted!')
+            sys.exit(0)
+    else:
+        testpath = args["ftest"]
+        with open(args["dump"], "w") as out:
+            for idx, line in enumerate(open(testpath)):
+                surface = line.strip()
+                x, y, weight, is_eng = surface, None, 1.0, False
+                if idx > 0 and idx % 200 == 0:
+                    logging.info("running infer on example %d", idx)
+                decoded_outputs = evaler.infer_on_example(sentence=x)
+                scores_and_words = get_decoded_words(decoded_outputs)
+                # decoded_words = [w for s, w in scores_and_words]
+                # scores = [s for s, w in scores_and_words]
+                beam_outputs = ";".join([word for score, word in scores_and_words])
+                beam_scores = ";".join([str(score) for score, word in scores_and_words])
+                buf = f"{x}\t{beam_outputs}\t{beam_scores}\n"
+                out.write(buf)
diff --git a/seq2seq/prepare_data.py b/seq2seq/prepare_data.py
new file mode 100644
index 0000000..19a22e8
--- /dev/null
+++ b/seq2seq/prepare_data.py
@@ -0,0 +1,87 @@
+import logging
+import pickle
+import random
+
+import numpy as np
+import torch
+
+from utils.arguments import PARSER
+from readers.aligned_reader import load_aligned_data, read_examples, subsample_examples
+from seq2seq.lang import Lang
+from seq2seq.constants import STEP
+
+
+def index_vocab(examples, fr_lang, en_lang):
+    for ex in examples:
+        raw_x, raw_y, xs, ys, weight, is_eng = ex
+        fr_lang.index_words(xs)
+        en_lang.index_words(ys)
+    logging.info("train size %d", len(examples))
+
+
+def load_vocab_and_examples(vocabfile, aligned_file):
+    with open(vocabfile + ".frvoc", 'rb') as f:
+        fr_lang = pickle.load(f)
+    with open(vocabfile + ".envoc", 'rb') as f:
+        en_lang = pickle.load(f)
+    with open(aligned_file, 'rb') as f:
+        examples = pickle.load(f)
+    return fr_lang, en_lang, examples
+
+
+def load_vocab(vocabfile):
+    with open(vocabfile + ".frvoc", 'rb') as f:
+        fr_lang = pickle.load(f)
+    with open(vocabfile + ".envoc", 'rb') as f:
+        en_lang = pickle.load(f)
+    return fr_lang, en_lang
+
+
+def save_vocab_and_examples(fr_lang, en_lang, examples, vocabfile, aligned_file):
+    with open(vocabfile + ".frvoc", 'wb') as f:
+        pickle.dump(fr_lang, file=f)
+    with open(vocabfile + ".envoc", 'wb') as f:
+        pickle.dump(en_lang, file=f)
+    with open(aligned_file, 'wb') as f:
+        pickle.dump(examples, file=f)
+
+
+langcodes = {"hi": "hindi", "fa": "farsi", "ta": "tamil", "ba": "bengali", "ka": "kannada", "he": "hebrew",
+             "th": "thai"}
+
+if __name__ == '__main__':
+    args = PARSER.parse_args()
+    args = vars(args)
+    logging.info(args)
+    # batch_first = args["batch_first"]
+    # device_id = args["device_id"]
+    seed = args["seed"]
+    native_or_eng = args["nat_or_eng"]
+    single_token = args["single_token"]
+
+    remove_spaces = True
+    np.random.seed(seed)
+    random.seed(seed)
+    torch.manual_seed(seed)
+    torch.cuda.manual_seed(seed)
+
+    lang = langcodes[args["lang"]]
+
+    trainpath = "data/%s/%s_train_annotateEN" % (lang, lang) if args["ftrain"] is None else args["ftrain"]
+    testpath = "data/%s/%s_test_annotateEN" % (lang, lang) if args["ftest"] is None else args["ftest"]
+
+    examples = read_examples(fpath=trainpath,
+                             native_or_eng=native_or_eng,
+                             remove_spaces=remove_spaces)
+
+    examples = subsample_examples(examples=examples, frac=args["frac"], single_token=single_token)
+
+    fr_lang, en_lang = Lang(name="fr"), Lang(name="en")
+    examples = load_aligned_data(examples=examples,
+                                 mode="mcmc",
+                                 seed=seed)
+    index_vocab(examples, fr_lang, en_lang)
+    en_lang.index_word(STEP)
+    fr_lang.compute_maps()
+    en_lang.compute_maps()
+    save_vocab_and_examples(fr_lang, en_lang, examples, vocabfile=args["vocabfile"], aligned_file=args["aligned_file"])
diff --git a/seq2seq/runner.py b/seq2seq/runner.py
index ec664fc..cd591ca 100644
--- a/seq2seq/runner.py
+++ b/seq2seq/runner.py
@@ -8,7 +8,7 @@
 __author__ = 'Shyam'


-def run(args, examples, trainer, criterion, evaler, train, test, test_reporter, train_reporter):
+def run(args, examples, trainer, criterion, evaler, test, test_reporter, train=None, train_reporter=None):
     n_epochs = args["iters"]
     logging.info("training on %d examples for %d epochs", len(examples), n_epochs)
     random.shuffle(examples)
diff --git a/train_model_on_files.sh b/train_model_on_files.sh
index 98a2f8d..42d54ed 100755
--- a/train_model_on_files.sh
+++ b/train_model_on_files.sh
@@ -1,18 +1,20 @@
 #!/usr/bin/env bash
 ME=`basename $0`  # for usage message
-if [ "$#" -ne 4 ]; then  # number of args
-    echo "USAGE: ${ME} <ftrain> <ftest> <seed> <model>"
+if [[ "$#" -ne 5 ]]; then  # number of args
+    echo "USAGE: ${ME} <vocabfile> <aligned_file> <fdev> <seed> <model>"
     exit
 fi
-ftrain=$1
-ftest=$2
-seed=$3
-model=$4
+vocabfile=$1
+aligned_file=$2
+fdev=$3
+seed=$4
+model=$5
 time python -m seq2seq.main \
-     --ftrain ${ftrain} \
-     --ftest ${ftest} \
+     --vocabfile ${vocabfile} \
+     --aligned_file ${aligned_file} \
+     --ftest ${fdev} \
     --mono \
     --beam_width 1 \
     --save ${model} \
diff --git a/utils/arguments.py b/utils/arguments.py
index 088cb8c..d805df5 100644
--- a/utils/arguments.py
+++ b/utils/arguments.py
@@ -1,6 +1,6 @@
 import argparse

-PARSER = argparse.ArgumentParser(description='entity linker')
+PARSER = argparse.ArgumentParser(description='transliteration with monotonic attention')
 PARSER.add_argument('--iters', type=int, default=20, help='# train iters (default: 20)')
 PARSER.add_argument('--maxsteps', type=int, default=500000, help='# train iters (default: 5)')
 PARSER.add_argument('--batch_size', type=int, default=1, help='batch size (default: 1)')
@@ -32,8 +32,8 @@
 PARSER.add_argument('--ftest', type=str, help='test/val file')
 PARSER.add_argument('--frac', type=float, default=1.0, help='frac of train data')
 PARSER.add_argument('--dump', type=str, default=None, help='to dump test predictions')
-PARSER.add_argument('--device_id', type=int, default=None, help='gpu device')
-PARSER.add_argument('--ncands', type=int, default=20, help='ncands')
+# PARSER.add_argument('--device_id', type=int, default=None, help='gpu device')
+# PARSER.add_argument('--ncands', type=int, default=20, help='ncands')
 PARSER.add_argument('--no-bidi', dest='bidi', action='store_false', help='do not use bidirectional')
 PARSER.set_defaults(bidi=True)
 PARSER.add_argument('--no-batch-first', dest='batch_first', action='store_false', help='do not use batch first')
@@ -41,4 +41,6 @@
 PARSER.add_argument('--mono', dest='mono', action='store_true', help='use monotonic transliteration model')
 PARSER.add_argument('--interactive', action="store_true", dest="interactive")
 PARSER.add_argument('--outfile', action="store", dest="outfile")
+PARSER.add_argument('--vocabfile', action="store", dest="vocabfile")
+PARSER.add_argument('--aligned_file', action="store", dest="aligned_file")
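For reference, the block below is a minimal sketch of the end-to-end workflow implied by the README changes and the scripts in this diff. The `hindi_*` file names are illustrative placeholders, the aligner is assumed to have been compiled first (step 1 of the README), and note that the README text refers to the data-preparation step as `prepare_data.sh` while the script actually added here is `create_data.sh` (same argument order).

```bash
# Sketch of the two-stage pipeline: prepare data once, then train and predict.
# hindi_train.txt, hindi_dev.txt, hindi_test.txt are placeholder file names.

# 1. Align the training data and build the vocab pickles
#    (writes hindi_data.vocab.frvoc, hindi_data.vocab.envoc and hindi_data.aligned).
./create_data.sh hindi_train.txt hindi_dev.txt 100 hindi_data.vocab hindi_data.aligned

# 2. Train on the aligned data, evaluating on the dev file; 100 is the random seed.
./train_model_on_files.sh hindi_data.vocab hindi_data.aligned hindi_dev.txt 100 hindi.model

# 3. Batch predictions for a test file (one space-separated word per line).
./load_and_test_model_on_files.sh hindi_data.vocab hindi.model hindi_test.txt output.txt

# 4. Or query the trained model interactively.
./load_and_test_model_interactive.sh hindi_data.vocab hindi.model
```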