Commit e8d2767

Update README.md
1 parent 31e74f1 commit e8d2767

1 file changed: +23 -1 lines changed

README.md

@@ -49,9 +49,31 @@ If you find this repository helpful, feel free to cite our publication:
 url={http://ceur-ws.org/Vol-2932/paper2.pdf}
 }
 ```
+and check the latest version of the English ParaDetox in:
+```
+@inproceedings{logacheva-etal-2022-paradetox,
+title = "{P}ara{D}etox: Detoxification with Parallel Data",
+author = "Logacheva, Varvara and
+Dementieva, Daryna and
+Ustyantsev, Sergey and
+Moskovskiy, Daniil and
+Dale, David and
+Krotova, Irina and
+Semenov, Nikita and
+Panchenko, Alexander",
+booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
+month = may,
+year = "2022",
+address = "Dublin, Ireland",
+publisher = "Association for Computational Linguistics",
+url = "https://aclanthology.org/2022.acl-long.469",
+pages = "6804--6818",
+abstract = "We present a novel pipeline for the collection of parallel data for the detoxification task. We collect non-toxic paraphrases for over 10,000 English toxic sentences. We also show that this pipeline can be used to distill a large existing corpus of paraphrases to get toxic-neutral sentence pairs. We release two parallel corpora which can be used for the training of detoxification models. To the best of our knowledge, these are the first parallel datasets for this task.We describe our pipeline in detail to make it fast to set up for a new language or domain, thus contributing to faster and easier development of new parallel resources.We train several detoxification models on the collected data and compare them with several baselines and state-of-the-art unsupervised approaches. We conduct both automatic and manual evaluations. All models trained on parallel data outperform the state-of-the-art unsupervised models by a large margin. This suggests that our novel datasets can boost the performance of detoxification systems.",
+}
+```
 
 ***
 
 ## Contacts
 
-For any questions please contact Daryna Dementieva via [email](mailto:[email protected]) or [Telegram](https://t.me/dementyeva_ds).
+For any questions please contact Daryna Dementieva via [email](mailto:[email protected]).
