I am a postdoctoral researcher at Queen Mary University of London. My research focuses on probabilistic and neural models of annotation, used both to create resources and to train machine learning models more efficiently.
I am part of the DALI project (ERC Advanced Grant 695662), where I am in charge of making accessible the resources produced by Phrase Detectives, a game with a purpose designed to collect labels for coreference resolution. It is one of the most successful applications of crowdsourcing to an NLP task, with millions of labels collected so far.
News
[!!] A book on Statistical Methods for Annotation Analysis is to be released, in collaboration with Ron Artstein and Massimo Poesio. This is going to be part of the Synthesis Lectures on Human Language Technologies (morganclaypool.com) series.
[!] A tutorial on aggregating and learning from multiple annotators has been accepted at EACL 2021. More details on its website. You can watch the recorded video here.
A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
Massimo Poesio, Jon Chamberlain, Silviu Paun, Juntao Yu, Alexandra Uma and Udo Kruschwitz
In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 2019
[pdf] [data] [bib]
A new workshop on aggregating and analysing crowdsourced annotations for NLP (AnnoNLP) has been accepted at EMNLP 2019. More details here.
A Probabilistic Annotation Model for Crowdsourcing Coreference
Silviu Paun, Jon Chamberlain, Udo Kruschwitz, Juntao Yu, Massimo Poesio
In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2018
[pdf] [code] [bib]
Comparing Bayesian Models of Annotation
Silviu Paun, Bob Carpenter, Jon Chamberlain, Dirk Hovy, Udo Kruschwitz, Massimo Poesio
In Transactions of the Association for Computational Linguistics (TACL), 2018
[pdf] [bib] [video] [slides]