1 Department of Computer Engineering, North Tehran Branch, Islamic Azad University, Tehran, Iran.

2 Department of Computer Engineering, Shahid Beheshti University, Tehran, Iran.



Coreference resolution is one of the essential tasks of natural language
processing. It identifies all expressions in a text that refer to the
same real-world entity. Coreference resolution supports other areas of
natural language processing, such as information extraction, machine
translation, and question answering.
This article presents a new Persian coreference resolution corpus,
named the Mehr corpus. The primary goal is to develop a Persian
coreference corpus that addresses some shortcomings of the previous
Persian corpus while maintaining high inter-annotator agreement. This
corpus annotates coreference relations for noun phrases, named
entities, pronouns, and nested named entities. Two baseline pronoun
resolution systems are developed, and their results are reported. The
corpus comprises 400 documents and about 170k tokens. Annotation was
performed with the WebAnno tool.


