TransCoder builds on advances in natural-language processing (NLP), in particular unsupervised NMT. The model uses a Transformer-based sequence-to-sequence architecture consisting of an attention-based encoder and decoder. Obtaining a dataset for supervised learning would be difficult, as it would require many pairs of equivalent code samples in both the source and target languages, so the team opted to use monolingual datasets for unsupervised learning, applying three strategies. First, the model is trained on input sequences with random tokens masked; it must learn to predict the correct values for the masked tokens. Next, the model is trained on sequences that have been corrupted by randomly masking, shuffling, or removing tokens; it must learn to output the corrected sequence. Finally, two versions of these models are trained in parallel to perform back-translation: one learns to translate from the source language to the target language, and the other learns to translate back to the source. A sketch of the first two objectives follows below.
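As an illustrative sketch only (not the team's actual implementation), the following Python snippet shows how the masking and denoising objectives might prepare training examples from a tokenized code snippet. The function names, mask token, and probabilities are assumptions chosen for illustration.

```python
import random

MASK = "<mask>"

def mask_tokens(tokens, mask_prob=0.15):
    """Masked-token objective: hide random tokens; the model must recover them."""
    corrupted = [MASK if random.random() < mask_prob else t for t in tokens]
    return corrupted, tokens  # (corrupted input, original target)

def corrupt_tokens(tokens, mask_prob=0.1, drop_prob=0.1, shuffle_window=3):
    """Denoising objective: randomly mask, remove, and locally shuffle tokens;
    the model must reconstruct the original sequence."""
    noisy = []
    for t in tokens:
        r = random.random()
        if r < drop_prob:
            continue            # randomly remove the token
        elif r < drop_prob + mask_prob:
            noisy.append(MASK)  # randomly mask the token
        else:
            noisy.append(t)
    # Light local shuffle: each token drifts at most `shuffle_window` positions.
    keyed = sorted(enumerate(noisy),
                   key=lambda p: p[0] + random.uniform(0, shuffle_window))
    return [t for _, t in keyed], tokens

tokens = "int add ( int a , int b ) { return a + b ; }".split()
print(mask_tokens(tokens)[0])     # input for the masked-token objective
print(corrupt_tokens(tokens)[0])  # input for the denoising objective
```

In back-translation, the two directions would consume each other's outputs: translations produced by the source-to-target model serve as (noisy) training inputs for the target-to-source model, and vice versa.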