ACL Anthology Reference Corpus (ACL ARC)

This is the home page of the ACL Anthology Reference Corpus (ACL ARC), a corpus of scholarly publications about computational linguistics. The corpus is a canonicalized subset of the ACL Anthology covering publications up to February 2007, and consists of 10,921 articles. We hope this frozen corpus will be used as a benchmark for applications in scholarly and bibliometric data processing.

Download the corpus



Group Members

Tools and Related Links

Links to information about the corpus itself, alternative and related corpora and specific tools to process it.

Here we list some related tools for bibliographic processing, and related sites for bibliographic research.


Our efforts have been supported by the grassroots initiative call made by the ACL Exec at the 2007 ACL annual meeting in Prague. We would like to acknowledge the support of the ACL Exec in encouraging this form of collaboration.

Thanks also go to Behrang Qasemizadeh, PhD student in the Unit for Natural Language Processing, Digital Enterprise Research Institute, National University of Ireland, Galway (funded by Science Foundation Ireland), for his work on the SEPID ARC format, and to Martin Helmout of Southampton for his work checking the XML files and their schema.

Min-Yen Kan
Created on: Wed May 5 16:07:15 2004 | Version: 1.0 | Last modified: Sat Mar 29 00:26:41 2008