
This repository contains the data and code accompanying the paper "Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options" (AACL 2020).


nyu-mll/semi-automatic-nli


Semi-Automatic NLI Data Collection

This repository contains the data and code accompanying the paper "Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options".

Datasets

The five datasets described in the paper are available under the data/ directory: base_news, base_wiki, sim_news, sim_wiki, and translate_wiki. Each dataset comes with a training set and a test set, both in .jsonl format (one JSON object per line). Please refer to the paper for the statistics of each dataset.
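As a rough illustration of reading one of these .jsonl files, here is a minimal Python sketch. The path `data/base_news/train.jsonl` is only a guess at the layout, and the field names are not documented in this README, so the example prints the keys of the first record instead of assuming a schema. The helper `read_jsonl` is our own name, not part of this repository.

```python
import json
from pathlib import Path
from typing import Iterator, Union


def read_jsonl(path: Union[str, Path]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a .jsonl file."""
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)


if __name__ == "__main__":
    # Hypothetical path: the actual file names inside data/ are not listed
    # in this README, so adjust this to match the repository contents.
    train_path = Path("data/base_news/train.jsonl")
    if train_path.exists():
        examples = list(read_jsonl(train_path))
        print(len(examples), "examples")
        # Inspect the keys instead of assuming a particular schema.
        if examples:
            print(sorted(examples[0].keys()))
    else:
        print(f"{train_path} not found; check the data/ directory layout")
```

Reading line by line keeps memory use low for large files, and the same function should work for both the training and test splits.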

License

We use premises taken from the English Gigaword Fifth Edition, English Wikipedia and Simple Wikipedia (downloaded May 2020), and WikiMatrix. The English Gigaword is distributed under the LDC User Agreement license. Wikipedia is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC-BY-SA) and the GNU Free Documentation License (GFDL).

Experiments

The code used for the experiments in the paper can be found under scripts. Please follow the README in each sub-directory for more details. For experiments using jiant (we use v1.2), please follow the jiant documentation for installation and usage instructions.

Citation

@inproceedings{vania2020asking,
    title = "{Asking Crowdworkers to Write Entailment Examples: The Best of Bad Options}",
    author = "Vania, Clara  and
      Chen, Ruijie  and
      Bowman, Samuel R.",
    booktitle = "Proceedings of the 1st Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 10th International Joint Conference on Natural Language Processing",
    month = dec,
    year = "2020",
    address = "Online",
    publisher = "Association for Computational Linguistics"
}
