Natural Language Processing
What is natural language processing?
Natural Language Processing (NLP) refers to the computer processing of natural language, for any purpose and at any depth of analysis. Natural language is the everyday language people use, such as English. The term is synonymous with human language and serves mainly to distinguish it from other kinds of language, such as computer languages.
Over the last few years, as the Internet has established itself through social media, written language has gained increasing importance. As a result, although we have access to an unprecedented amount of natural language data in digital form, only the most advanced NLP software is able to take advantage of it. CELI’s NLP technology therefore opens new doors and opportunities.
Enabling technology to introduce language intelligence
CELI designs and develops software components and resources for building applications that include language intelligence (semantic search engines, text mining, opinion mining):
- Written collections and corpora: generic corpora, for training and evaluation of NLP systems, or domain-specific corpora.
- Morphological lexicons, general or specialised: dictionaries containing morphosyntactic information about words.
- Semantic networks and words formalised according to Semantic Web standards, in line with AGID guidelines for semantic interoperability.
- Modules for Morphological analysis (see below for supported languages)
- Modules for supervised (or unsupervised) automated classification
- Modules for Semantic and Syntactic analysis, based on rules that can be customised for particular applications
- Modules for extracting information from free text
- Hybrid symbolic/statistical analysis systems that improve the balance between accuracy and robustness in settings with high linguistic variability
- Sentiment analysis and Opinion mining systems for opinions expressed about products, brands, etc.
- Semantic search engine to improve the value of information
- Modules for the automatic recognition of named entities, such as place names or first names
- Text Processing systems for phonetic transcriptions
Supported languages: Italian, English, French, Spanish, Catalan, Portuguese, German, Dutch, Swedish, Norwegian, Finnish, Danish, Polish, Russian, Belarusian, Estonian, Latvian, Lithuanian, Ukrainian, Greek, Turkish, Arabic, Hebrew, Armenian, Albanian, Croatian, Serbian, Czech, Slovenian, Slavic, Romanian, Bulgarian, Hungarian, Chinese and Japanese.
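To give a rough idea of how a morphological lexicon and a rule-based opinion mining module might fit together, here is a minimal sketch in Python. It is not CELI's software or API: the names `MORPH_LEXICON`, `POLARITY`, `NEGATIONS`, `lemmatize` and `opinion_score` are hypothetical, the word lists are toy examples, and the negation rule is a deliberate simplification.

```python
import re

# Hypothetical toy morphological lexicon: surface form -> (lemma, part of speech).
# A real lexicon would cover a whole language, not a handful of words.
MORPH_LEXICON = {
    "loved": ("love", "VERB"),
    "loves": ("love", "VERB"),
    "phones": ("phone", "NOUN"),
    "batteries": ("battery", "NOUN"),
    "better": ("good", "ADJ"),
    "worse": ("bad", "ADJ"),
}

# Hypothetical polarity lexicon keyed by lemma (+1 positive, -1 negative).
POLARITY = {"love": 1, "good": 1, "great": 1, "bad": -1, "hate": -1, "slow": -1}

# Words that flip the polarity of the next opinion word.
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def lemmatize(token):
    """Look up the lemma in the lexicon; unknown tokens are returned unchanged."""
    return MORPH_LEXICON.get(token, (token, "UNK"))[0]


def opinion_score(text):
    """Sum the polarities of opinion words, flipping the sign after a negation.

    Simplification: a negation stays active until the next opinion word,
    however far away that word is.
    """
    score = 0
    negate = False
    for token in tokenize(text):
        lemma = lemmatize(token)
        if lemma in NEGATIONS:
            negate = True
            continue
        polarity = POLARITY.get(lemma, 0)
        if polarity:
            score += -polarity if negate else polarity
            negate = False
    return score


if __name__ == "__main__":
    for sentence in ("I loved the screen", "The battery is not good"):
        print(sentence, "->", opinion_score(sentence))
```

The lemma lookup is what lets one polarity entry ("good") cover several inflected forms ("better"). Purely symbolic rules like these tend to be precise but brittle, which is one reason hybrid symbolic/statistical systems are used in practice.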