Modern information systems are flooded with short texts that need to be classified. Typical examples are tickets from call centers and Customer Relationship Management (CRM) systems, answers to open questions in user-submitted questionnaires, call center transcriptions, tweets, and SMS messages. We offer technology built to tackle the problem of automatically classifying these texts. Our hybrid technology allows the user to rapidly design the classification plan best suited to his or her needs and have the system interpret it to classify any document. Two options are available:
- Local Classification: The classifier is installed in client-server mode on the customer’s premises. All nonfunctional aspects, such as security, load balancing, fault tolerance, etc., are dealt with by following the standard procedures in use by the customer.
- Service Based Classification: This is the most “agile” way of obtaining high-quality results with minimal integration cost and time. A client application sends our servers a “classification design document”, i.e. a document containing the minimal information needed to learn a classifier (the classification hierarchy itself, the description of the categories, a set of manually chosen examples for each category, etc.). The server returns an id. From then on, the client can use this id to classify new documents against the selected classification hierarchy.
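The service-based exchange described above can be sketched as follows. This is only an illustration of the flow (submit a design document, receive an id, classify with that id), not the actual service API: the class `ToyClassificationService`, its methods, the field names of the design document, and the word-overlap "learning" are all assumptions made for the example, and the remote server is replaced by an in-memory object with a flat category list.

```python
import re
import uuid
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ToyClassificationService:
    """Hypothetical in-memory stand-in for the remote classification server."""

    def __init__(self):
        self._models = {}

    def submit_design(self, design):
        """Learn a toy classifier from a design document and return its id."""
        model = {}
        for category in design["categories"]:
            # One bag of words per category, built from its description
            # and its manually chosen examples.
            words = Counter(tokenize(category["description"]))
            for example in category["examples"]:
                words.update(tokenize(example))
            model[category["name"]] = words
        classifier_id = uuid.uuid4().hex
        self._models[classifier_id] = model
        return classifier_id

    def classify(self, classifier_id, text):
        """Return the category whose bag of words overlaps most with the text."""
        model = self._models[classifier_id]
        tokens = tokenize(text)
        scores = {
            name: sum(words[token] for token in tokens)
            for name, words in model.items()
        }
        return max(scores, key=scores.get)


# Hypothetical classification design document (field names are assumptions).
design = {
    "categories": [
        {
            "name": "billing",
            "description": "invoices, payments and refunds",
            "examples": [
                "I was charged twice this month",
                "please refund my last invoice",
            ],
        },
        {
            "name": "technical",
            "description": "faults, outages and connection problems",
            "examples": [
                "my router keeps dropping the connection",
                "the app crashes on login",
            ],
        },
    ],
}

service = ToyClassificationService()
classifier_id = service.submit_design(design)  # the server returns an id
label = service.classify(classifier_id, "I need a refund for a double payment")
```

In a real integration the two method calls would be requests to the server, and the client would only need to store the returned id between them.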
Thanks to our rich classification matrix, we are able to provide accurate product classification based on simple product descriptions. This is a traditionally difficult task, as it often involves many hundreds of categories that are distinguished only by linguistic nuances. In this case, we offer a semi-structured classification system that combines automatically learned information with hand-coded non-defeasible rules, to ensure that no “unpredictable” results are obtained.
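One way to picture the combination of learned information and non-defeasible rules is a two-stage decision in which a hand-coded rule, when it matches, always overrides the learned prediction. The sketch below is an assumption about how such a system could be arranged, not a description of the actual product: the rule, the category names, the keyword weights, and the function names are all invented for illustration, and a keyword scorer stands in for a trained model.

```python
import re

# Hypothetical hand-coded rules as (pattern, category) pairs.
# They are checked first and cannot be overridden by the learned model.
HARD_RULES = [
    (re.compile(r"\bgluten[- ]free\b", re.IGNORECASE), "Food/Pasta/Gluten-free"),
]

# Hypothetical keyword weights standing in for automatically learned information.
KEYWORD_WEIGHTS = {
    "Food/Pasta/Durum wheat": {"pasta": 1.0, "spaghetti": 1.0, "durum": 2.0},
    "Food/Pasta/Gluten-free": {"pasta": 1.0, "corn": 1.0, "rice": 1.0},
}


def learned_classify(description):
    """Score each category by summing the weights of the tokens it knows."""
    tokens = re.findall(r"[a-z]+", description.lower())
    scores = {
        category: sum(weights.get(token, 0.0) for token in tokens)
        for category, weights in KEYWORD_WEIGHTS.items()
    }
    return max(scores, key=scores.get)


def classify_product(description):
    """Return (category, source), where source is 'rule' or 'learned'."""
    for pattern, category in HARD_RULES:
        if pattern.search(description):
            return category, "rule"
    return learned_classify(description), "learned"


overridden = classify_product("Gluten-free spaghetti, durum-style, 500 g")
fallback = classify_product("Durum wheat spaghetti 1 kg")
```

For the first description the learned scorer alone would pick the durum wheat category, because "spaghetti" and "durum" dominate the score; the rule overrides it, which is the kind of predictable behavior the hand-coded layer is meant to guarantee.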
Classifying longer documents is the most “traditional” version of document classification, the one applied, for instance, to news, web pages, corporate documents, and so on. Out-of-the-box products can already achieve reasonable topic classification performance on these kinds of documents, as the presence of large quantities of text can drive the learning process. What we add on top of this is a highly accurate functional classification layer, which is of paramount importance in driving, for instance, corporate workflows. Such a functional classification can detect document type, security and privacy level, sources, and other features that would be difficult to fit into a standard topic classification system.
- Availability of topic classification and functional classification
- Availability of rule-based and learning-based classification
- Effective on both long and short documents
- Available locally or as a web service
- Language aware
- Domain aware
- Fast integration