NAMED ENTITY RECOGNITION (NER) FOR NEWS ARTICLES

Authors

  • Tejal Chavan, Department of Computer Engineering and Technology (AIDS), Dr. Vishwanath Karad MIT World Peace University, Pune, Maharashtra, India
  • Seema Patil, Department of Computer Engineering and Technology, Dr. Vishwanath Karad MIT World Peace University, Pune, Maharashtra, India

Keywords

Named Entity Recognition (NER), Conditional Random Fields (CRF), CoNLL-2003 Dataset, Information Extraction, Natural Language Processing (NLP), Feature Engineering

Abstract

Named Entity Recognition (NER) automates the extraction and categorization of named entities from text, enabling efficient information retrieval and analysis across domains. This paper presents a study of NER techniques, focusing on their application to news articles. The project employs Conditional Random Fields (CRF), a discriminative probabilistic model for sequence labeling, together with preprocessing and feature engineering for accurate entity recognition. The CoNLL-2003 dataset serves as the benchmark for training and evaluating the CRF model on entities such as persons, organizations, and locations.
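
To make the feature-engineering step concrete, the following is a minimal sketch of the kind of per-token features commonly fed to a CRF sequence labeler. The feature names, the helper functions token_features and sentence_features, and the example sentence with its BIO-style tags are all illustrative assumptions; the abstract does not specify the authors' actual feature set.

```python
def token_features(tokens, i):
    """Build a feature dict for the token at position i.

    Hypothetical feature set for illustration only; not taken from the paper.
    """
    word = tokens[i]
    features = {
        "bias": 1.0,
        "word.lower": word.lower(),
        "suffix3": word[-3:],          # last three characters
        "is_title": word.istitle(),    # e.g. "German"
        "is_upper": word.isupper(),    # e.g. "EU"
        "is_digit": word.isdigit(),
    }
    # Context features from the neighbouring tokens.
    if i > 0:
        prev = tokens[i - 1]
        features["prev.lower"] = prev.lower()
        features["prev.is_title"] = prev.istitle()
    else:
        features["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        nxt = tokens[i + 1]
        features["next.lower"] = nxt.lower()
        features["next.is_title"] = nxt.istitle()
    else:
        features["EOS"] = True  # end of sentence
    return features


def sentence_features(tokens):
    """Map a tokenized sentence to one feature dict per token."""
    return [token_features(tokens, i) for i in range(len(tokens))]


# Illustrative sentence with BIO-style entity tags (assumed for this sketch).
sentence = ["EU", "rejects", "German", "call", "to", "boycott", "British", "lamb", "."]
labels = ["B-ORG", "O", "B-MISC", "O", "O", "O", "B-MISC", "O", "O"]

X = sentence_features(sentence)
for token, label, feats in zip(sentence, labels, X):
    print(token, label, sorted(feats))
```

Lists of feature dictionaries paired with label sequences in this shape are the usual input to CRF toolkits; which toolkit and which features the authors used is not stated in the abstract.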

Published

2024-05-09

How to Cite

Tejal Chavan, & Seema Patil. (2024). NAMED ENTITY RECOGNITION (NER) FOR NEWS ARTICLES. INTERNATIONAL JOURNAL OF ARTIFICIAL INTELLIGENCE RESEARCH AND DEVELOPMENT (IJAIRD), 2(1), 103-112. https://iaeme-library.com/index.php/IJAIRD/article/view/IJAIRD_02_01_01