A. Pipitone, R. Pirrone

Cognitive Linguistics as the Underlying Framework for Semantic Annotation

Natural Language Processing

In recent years, many attempts have been made to design rule sets that extract semantic meaning from plain text and produce annotations, but very few approaches make extensive use of grammars. Current systems focus mainly on extracting the semantic role of the entities described in the text. This approach has limitations: in such applications, the semantic role is conceived merely as the meaning of the entities involved, without considering their context. For example, current semantic annotators often mark a date entity without indicating what kind of date it is, e.g., a birth date or a book publication date. Moreover, these systems use ontologies that have been developed specifically for the system's purposes and have reduced portability. Extensive use of both linguistic resources and semantic representations of the domain is needed in this scenario: the semantic representation of the domain addresses the semantic interpretation of the context, while NLP tools can help solve linguistic problems related to semantic annotation, such as synonymy, ambiguity, and co-reference. This work proposes a novel framework, inspired by Cognitive Linguistics theories, that aims to address the problems outlined above. In particular, our work is based on Construction Grammar (CxG). CxG defines a "construction" as a form-meaning pair. We use RDF triples in the domain ontology as the "semantic seeds" to build constructions. A suitable set of rules based on linguistic typology has been designed to infer semantics and syntax from the semantic seed and to combine them as the poles of constructions. A hierarchy of rules that infers syntactic patterns for either single words or sentences using WordNet and FrameNet has been designed to overcome the limitations of expressing the syntactic poles solely with the terms stated in the ontology.
As a consequence, semantic annotation of plain text is achieved by computing all possible
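
The idea of a construction seeded by an RDF triple can be illustrated with a minimal Python sketch. It is not the authors' rule set: the names `Triple`, `Construction`, `LEXICON`, `build_constructions`, and `annotate` are hypothetical, the small synonym table only stands in for a WordNet or FrameNet lookup, and the regular-expression pattern is an assumed, simplified syntactic pole.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Triple:
    """An RDF-style triple from the domain ontology (the "semantic seed")."""
    subject: str
    predicate: str
    obj: str


@dataclass(frozen=True)
class Construction:
    """A form-meaning pair: a syntactic pattern plus the triple it expresses."""
    form: re.Pattern   # syntactic pole (assumed here to be a regex)
    meaning: Triple    # semantic pole


# Hypothetical synonym table; a real system would query a lexical resource.
LEXICON: Dict[str, List[str]] = {"wrote": ["wrote", "authored", "penned"]}


def build_constructions(seed: Triple) -> List[Construction]:
    """Create one construction per lexical variant of the seed's predicate."""
    verbs = LEXICON.get(seed.predicate, [seed.predicate])
    return [
        Construction(
            form=re.compile(
                rf"^(?P<subject>.+?)\s+{re.escape(verb)}\s+(?P<object>.+?)\.?$"
            ),
            meaning=seed,
        )
        for verb in verbs
    ]


def annotate(sentence: str, constructions: List[Construction]) -> Optional[Dict[str, str]]:
    """Return a text-span -> ontology-class mapping for the first matching form."""
    for cxn in constructions:
        match = cxn.form.match(sentence.strip())
        if match:
            return {
                match.group("subject"): cxn.meaning.subject,
                match.group("object"): cxn.meaning.obj,
            }
    return None


seed = Triple("Author", "wrote", "Book")
constructions = build_constructions(seed)
result = annotate("Umberto Eco authored The Name of the Rose.", constructions)
```

In this sketch the sentence uses "authored" rather than the ontology term "wrote", yet it still matches because the syntactic pole was expanded with lexical variants while the semantic pole stays fixed to the seed triple.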

This article is also authored by Synbrain data scientists and collaborators.