Entity Extraction

The entity extraction service detects references to people, places, companies, dates, and other entities mentioned in text.

For instance: “Both selected tenders were submitted on Monday: the first one by the trio of French businessmen [PERSON Pigasse-Bergé-Niel], the other one by [PERSON Claude Perdriel] ([COMPANY Le Nouvel Obs]) in association with [COMPANY Orange] ([COMPANY France Télécom]) and Spanish [COMPANY Prisa] ([COMPANY El Pais]).”
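
The bracketed notation in the example above can be turned into structured data. The following is a minimal sketch, assuming annotations always have the form [TYPE entity text] with an uppercase type and no nested brackets; the function name parse_annotations and the dictionary layout are illustrative and are not part of the service's actual output format.

```python
import re

# Matches the inline notation used in the example: [TYPE entity text]
ANNOTATION = re.compile(r"\[([A-Z]+) ([^\]]+)\]")


def parse_annotations(annotated):
    """Return (plain_text, entities); offsets refer to plain_text."""
    entities = []
    parts = []
    pos = 0       # current position in the annotated string
    out_len = 0   # length of the plain text built so far
    for match in ANNOTATION.finditer(annotated):
        # Copy the untouched text that precedes this annotation.
        before = annotated[pos:match.start()]
        parts.append(before)
        out_len += len(before)

        label, surface = match.group(1), match.group(2)
        entities.append({
            "type": label,
            "text": surface,
            "start": out_len,
            "end": out_len + len(surface),
        })
        # Keep only the entity text, dropping the brackets and the type.
        parts.append(surface)
        out_len += len(surface)
        pos = match.end()
    parts.append(annotated[pos:])
    return "".join(parts), entities


sample = "the other one by [PERSON Claude Perdriel] ([COMPANY Le Nouvel Obs])"
plain, found = parse_annotations(sample)
for entity in found:
    print(entity["type"], entity["text"], entity["start"], entity["end"])
```

Storing character offsets alongside the entity text lets downstream steps (indexing, highlighting, anonymization) locate each mention in the original text without searching for it again.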

The techniques used are “hybrid methods” that combine symbolic and statistical (machine learning) approaches. They make it possible to determine that in a sentence such as “Orange is not quoted on the stock market”, “Orange” refers to a company, whereas in “Our journey to Orange ended well”, “Orange” refers to a French city. Finally, in “I made orange marmalade”, “orange” refers to the fruit and not to a named entity, unlike in the two previous examples.
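
To illustrate the general idea of combining a symbolic step with a statistical one, here is a toy sketch. It is not the service's actual method: the gazetteer, the cue words, and their weights are all invented for this example, and the hand-set weights merely stand in for what a trained machine learning model would provide.

```python
import re

# Hypothetical gazetteer (symbolic knowledge): surface form -> candidate types.
GAZETTEER = {"Orange": ["COMPANY", "CITY"]}

# Hypothetical context cues with hand-set weights, standing in for a
# trained statistical model.
CUE_WEIGHTS = {
    "COMPANY": {"stock": 2.0, "market": 2.0, "quoted": 1.5, "shares": 1.5},
    "CITY": {"journey": 2.0, "travel": 1.5, "to": 0.5},
}


def classify(sentence, mention):
    """Return the most likely entity type for mention, or None."""
    # Symbolic step: case-sensitive lookup, so lowercase "orange"
    # (the fruit) yields no candidate at all.
    candidates = GAZETTEER.get(mention)
    if not candidates:
        return None
    if not re.search(r"\b" + re.escape(mention) + r"\b", sentence):
        return None

    # Statistical stand-in: score each candidate type by the context words.
    words = re.findall(r"\w+", sentence.lower())
    scores = {
        entity_type: sum(CUE_WEIGHTS[entity_type].get(w, 0.0) for w in words)
        for entity_type in candidates
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(classify("Orange is not quoted on the stock market", "Orange"))
print(classify("Our journey to Orange ended well", "Orange"))
print(classify("I made orange marmalade", "orange"))
```

The split mirrors the description above: the dictionary lookup proposes which readings are possible at all, and the context scores choose among them.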

Named entity recognition is crucial to any application that needs to understand the semantics of text. Uses range from semantic indexing (for search engines) to document anonymization, and include business intelligence applied to text, open source intelligence, etc.
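
As an example of one such application, document anonymization can be built on top of extracted entities. The sketch below assumes entities arrive as dictionaries with "type", "start", and "end" keys (character offsets, non-overlapping); that layout and the function name anonymize are assumptions made for illustration, not the service's documented output.

```python
def anonymize(text, entities, types=("PERSON",)):
    """Replace every entity of the given types with a <TYPE> placeholder."""
    out = []
    pos = 0
    # Process entities left to right so offsets into the original text stay valid.
    for entity in sorted(entities, key=lambda e: e["start"]):
        if entity["type"] not in types:
            continue
        out.append(text[pos:entity["start"]])
        out.append("<" + entity["type"] + ">")
        pos = entity["end"]
    out.append(text[pos:])
    return "".join(out)


text = "Claude Perdriel works with Orange."
entities = [
    {"type": "PERSON", "start": 0, "end": 15},
    {"type": "COMPANY", "start": 27, "end": 33},
]
print(anonymize(text, entities))
```

Passing a different types tuple, for instance ("PERSON", "COMPANY"), widens what gets masked without changing the rest of the code.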

Ho2S can provide dedicated access to an instance of the named entity extraction service, adjusted to your own needs in terms of business domains and custom applications.

Customization may concern either the identification of named entities that are not supported by the basic service (for instance products, brands, specific geographic locations, roles/titles of people in organizations, etc.), or entity extraction from very specific text inputs such as SMS, blogs, or structured text like balance sheets and invoices.