Concept Extraction

Concept Extraction is a service that extracts from a text the concepts that characterize it. For instance, from this article the service extracts concepts such as:

Google, Street View, cartography, official investigation, photography, Wi-Fi networks, data, safeguards, Mr. Blumenthal.

The service extracts the expressions that best describe the content of the article. These can be single words (“cartography”), normalized expressions (“official investigation”) or named entities (“Mr. Blumenthal”). Thanks to symbolic linguistic analysis, the service filters out word associations that are statistically significant but do not form a well-formed concept (e.g., “more sensitive”, “responsible”).
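The filtering step can be illustrated with a minimal sketch. It is not Ho2S's implementation: the tiny part-of-speech lexicon and the rule "optional adjectives followed by one or more nouns" are assumptions chosen only to reproduce the examples above, and the names `POS` and `is_well_formed` are hypothetical.

```python
# Toy part-of-speech lexicon (hypothetical; a real system would rely on
# a tagger and on grammars written by linguists).
POS = {
    "cartography": "NOUN",
    "official": "ADJ",
    "investigation": "NOUN",
    "more": "ADV",
    "sensitive": "ADJ",
    "responsible": "ADJ",
}


def is_well_formed(candidate: str) -> bool:
    """Accept candidates made of optional adjectives followed by nouns."""
    tags = [POS.get(word.lower()) for word in candidate.split()]
    if not tags or None in tags:
        return False  # empty candidate or unknown word: reject
    # Skip the leading adjectives, then require that only nouns remain.
    i = 0
    while i < len(tags) and tags[i] == "ADJ":
        i += 1
    rest = tags[i:]
    return len(rest) > 0 and all(tag == "NOUN" for tag in rest)


# Candidates that a purely statistical step might propose.
candidates = ["cartography", "official investigation", "more sensitive", "responsible"]
concepts = [c for c in candidates if is_well_formed(c)]
print(concepts)
```

Under these assumptions, “cartography” and “official investigation” are kept, while “more sensitive” and “responsible” are rejected because they contain no noun head.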

Concept extraction is crucial in all applications of semantic indexing and storage. It can also give the user a first look at the subject of a text (a summary). With concept extraction, one can easily create mashups between several sources of information (e.g., extracted concepts and Wikipedia articles).

Through extensive research, Ho2S has developed a hybrid technology for concept extraction. This technology is based on the interaction between machine learning algorithms and a functional analyzer that applies French-based grammars written by linguists. These grammars are applied both before the learning algorithms analyze the text (to clean the text and identify significant segments) and afterwards (to select the right candidates and identify expressions that are semantically equivalent).
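The three-phase organization described above can be sketched as follows. This is only an illustration under strong assumptions: a stopword split stands in for the pre-analysis grammars, raw frequency counting stands in for the learning algorithms, and a naive plural merge stands in for the detection of semantically equivalent expressions. All function names are hypothetical and the toy text is in English for readability.

```python
import re
from collections import Counter

# Hypothetical stopword list used to cut the text into segments.
STOPWORDS = {"the", "a", "an", "of", "is", "are", "and", "by", "to", "in"}


def pre_grammar(text):
    """Pre-phase: clean the text and cut it into candidate segments."""
    tokens = re.findall(r"[a-z][a-z\-]*|[.,;:!?]", text.lower())
    segments, current = [], []
    for tok in tokens:
        # Stopwords and punctuation close the current segment.
        if tok in STOPWORDS or not tok[0].isalpha():
            if current:
                segments.append(" ".join(current))
                current = []
        else:
            current.append(tok)
    if current:
        segments.append(" ".join(current))
    return segments


def score(segments):
    """Statistical phase (stand-in): count how often each segment occurs."""
    return Counter(segments)


def post_grammar(counts, min_count=2):
    """Post-phase: merge plural variants, then keep frequent candidates."""
    merged = Counter()
    for seg, n in counts.items():
        # Treat "xs" as a variant of "x" when "x" was also observed.
        key = seg[:-1] if seg.endswith("s") and seg[:-1] in counts else seg
        merged[key] += n
    return [seg for seg, n in merged.most_common() if n >= min_count]


text = ("The official investigation is ongoing. "
        "Official investigations are slow. The data is safe.")
concepts = post_grammar(score(pre_grammar(text)))
print(concepts)
```

The point of the sketch is the ordering: rule-based processing runs on both sides of the statistical step, first to prepare its input and then to consolidate its output.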

Ho2S can provide dedicated access to an instance of the concept extraction service configured to your requirements in terms of business domain and custom applications. Ho2S can also provide a customized system for matching the extracted terms against a thesaurus such as EuroVoc, MeSH or WordNet.
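As a rough idea of what such matching involves, the sketch below looks up extracted terms in a small in-memory thesaurus after normalizing case, accents and whitespace. The thesaurus entries and the identifiers `T001` and `T002` are invented for illustration and do not come from EuroVoc, MeSH or WordNet; a production system would load the real resource and would likely use fuzzier matching.

```python
import unicodedata

# Hypothetical mini-thesaurus: concept ID -> preferred label and synonyms.
THESAURUS = {
    "T001": {"label": "cartography", "synonyms": ["map making", "mapping"]},
    "T002": {"label": "data protection", "synonyms": ["safeguards", "data safeguards"]},
}


def normalize(term):
    """Lowercase, strip accents and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", term)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


# Build a lookup index from every normalized label or synonym to its concept ID.
INDEX = {}
for concept_id, entry in THESAURUS.items():
    for name in [entry["label"], *entry["synonyms"]]:
        INDEX[normalize(name)] = concept_id


def match(terms):
    """Map each extracted term to a thesaurus concept ID, or None if unmatched."""
    return {term: INDEX.get(normalize(term)) for term in terms}


result = match(["Cartography", "safeguards", "Mr. Blumenthal"])
print(result)
```

Terms with no entry, such as the named entity “Mr. Blumenthal” here, map to `None` so that the caller can decide whether to keep them as free keywords.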