Named Entities Workshop: Shared Task on Transliteration
(NEWS 2009 : An ACL-IJCNLP 2009 Workshop)
Workshop Focus
Named Entities (NEs) play a critical role in Natural Language Processing (NLP) and Information Retrieval (IR) tasks such as search, machine translation, document clustering, summarization, and information extraction. While identifying and analyzing NEs in a given natural language is a challenging research problem in itself, the phenomenal growth in the Internet user population, especially in the non-English-speaking parts of the world, has extended this problem to the cross-language arena, making the handling of NEs in multiple languages critically important.
The purpose of this workshop is to bring together researchers interested in various aspects of NEs in natural language text. In addition, the NEWS workshop will feature a shared task on Machine Transliteration of NEs.
Topics of Interest
This workshop invites original research contributions on all aspects of NEs, including identification, analysis, extraction, mining, transformation, and applications of NEs to NLP and IR systems. Topics of interest include, but are not limited to, the following:
NE Analysis
- Distributional characteristics of NEs in mono- & multi-lingual corpora
- Orthographic/phonetic characteristics of NEs
- NE origin/genre recognition
- Social network analysis and entity resolution
NE Extraction
- Language-independent monolingual NE extraction
- Cross-language NE extraction
- General Techniques
- Specific datasets (Wikipedia, news, etc.)
- Unsupervised and semi-supervised methods for NE extraction
- Complex NEs, domain-specific term extraction
- NE set expansion
- Creation of annotated data
Machine Transliteration
- Computational phonology, incl. modeling of phonological rules, structure, behavior, etc.
- Transliteration modeling
- Phonetic, grapheme-to-phoneme and phoneme-to-grapheme conversions
- Statistical & machine learning based approaches, transliteration unit alignments
- Forward and backward transliterations
- Learning transliteration from comparable corpora
- Transliteration lexicon construction
- Romanization of Asian languages
- Transliteration evaluation metrics
Applications
- Monolingual and Cross-Language IR, Information Extraction and Management
- Machine Translation
- Question Answering
- Computational Journalism
Important Dates
Task Details to be announced soon
Research Paper Submission Deadline: 1-May-2009
Acceptance Notification: 1-Jun-2009
Camera-Ready Copy Deadline: 7-Jun-2009
Workshop Date: 7-Aug-2009
Shared Task on Transliteration
Transliteration of NEs is necessary in many applications, such as machine translation, corpus alignment, cross-language IR, information extraction and automatic lexicon acquisition. This calls for high-performance transliteration systems, which are the focus of the shared task in this workshop. Details of the task will be made available soon on the workshop homepage.
Organizing Committee
Haizhou Li, Institute for Infocomm Research
A Kumaran, Microsoft Research India
Sanjeev Khudanpur, Johns Hopkins University
Raghavendra Udupa, Microsoft Research India
Min Zhang, Institute for Infocomm Research
Monojit Choudhury, Microsoft Research India
Program Committee
Kalika Bali, Microsoft Research India
Rafael Banchs, UPC, Spain
Sivaji Bandyopadhyay, Univ of Jadavpur, India
Pushpak Bhattacharyya, IIT-Bombay, India
Monojit Choudhury, Microsoft Research India
Marta Ruiz Costa-jussà, UPC, Spain
Jianfeng Gao, Microsoft Research, USA
Gregory Grefenstette, Exalead, France
Sanjeev Khudanpur, Johns Hopkins University, USA
Kevin Knight, ISI, USA
Greg Kondrak, Univ of Alberta, Canada
Olivia Kwong, City Univ, Hong Kong
Gina-Anne Levow, Univ of Chicago, USA
Arul Menezes, Microsoft Research, USA
Jong-Hoon Oh, NICT, Japan
Yan Qu, Advertising.com, USA
Dan Roth, Univ of Illinois, Urbana-Champaign, USA
Sunita Sarawagi, IIT-Bombay, India
Sudeshna Sarkar, IIT-Kharagpur, India
Richard Sproat, Univ of Illinois, Urbana-Champaign, USA
Keh-Yih Su, Behavior Design Corporation, Taiwan
Raghavendra Udupa, Microsoft Research India
Vasudeva Varma, IIIT-Hyderabad, India
Min Zhang, Institute for Infocomm Research, Singapore
Workshop & Contact Information