I am a postdoctoral researcher at Queen Mary University of London. My current focus is on label aggregation techniques and the training of models on data with disagreements.
I am part of the DALI project, where I am in charge of analysing the annotations collected with Phrase Detectives, a GWAP (game with a purpose) developed to gather labels for coreference resolution, which has collected over 5 million judgements.
A book on Statistical Methods for Annotation Analysis is in the works, in collaboration with Ron Artstein and Massimo Poesio. It will be part of the Synthesis Lectures on Human Language Technologies series (morganclaypool.com).
A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma and Udo Kruschwitz
In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2019
[pdf] [data] [bib]
A new workshop on aggregating and analysing crowdsourced annotations for NLP (AnnoNLP) has been accepted at EMNLP 2019.
A Probabilistic Annotation Model for Crowdsourcing Coreference
Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio
In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018
[pdf] [code] [bib]
Comparing Bayesian Models of Annotation
Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio
In Transactions of the Association for Computational Linguistics (TACL), 2018
[pdf] [bib] [video] [slides]