Technology

Ideal Applications

Multi-label Classification: Given a text, what subset of a list of labels best applies to that text? For example, given a financial news story, is it about China, commodities, and/or mining? This is a very difficult problem technically, especially as the list of labels gets large. We solve it with ease: live demo.
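
To make the multi-label setting concrete, here is a minimal sketch in Python. It assumes some model has already produced an independent score per label for one news story; the scores, the threshold of 0.5, and the function name select_labels are all hypothetical and are not taken from the NoNLP system.

```python
def select_labels(scores, threshold=0.5):
    """Return every label whose score meets the threshold.

    Unlike single-label classification, zero, one, or many labels
    may be returned for the same text.
    """
    return sorted(label for label, score in scores.items() if score >= threshold)


# Hypothetical per-label scores for one financial news story.
scores = {"China": 0.91, "commodities": 0.74, "mining": 0.33, "healthcare": 0.02}

labels = select_labels(scores)
print(labels)
```

The point of the sketch is only that each label is decided separately, which is why the problem grows harder as the label list grows.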

Decisions from text: Given a text, what is the probability that X will be true? For example, in healthcare X could be an acute outcome, such as a future diagnosis of severe sepsis or readmission to the Emergency Department within a month. We’ve just begun investigating such Risk Stratification with actual hospital data, and with actionable results. See preliminary work with Baystate Health (pdf).

Conceptual Search: Given a text, what previously seen text is the most conceptually similar to it? Word search is simple, but what if you have an article about “hearts” and you’d like it to match articles about “cardiac”? Or text about “physicians” that should automatically match text about “doctors” or “surgeons”? One of the properties of the NoNLP representation is that similar concepts map to similar vectors (while being aware of negations), a pattern that generalizes machine-learned models without the need for human programming.
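
If similar concepts map to similar vectors, conceptual search reduces to a nearest-neighbor lookup. The sketch below illustrates that idea with cosine similarity, a common choice that is assumed here rather than confirmed as the measure NoNLP uses. The three-dimensional vectors and document names are made up for illustration; real representations would be much longer.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector is treated as similar to nothing
    return dot / (norm_a * norm_b)


def most_similar(query, corpus):
    """Return the id of the stored vector closest to the query."""
    return max(corpus, key=lambda doc_id: cosine_similarity(query, corpus[doc_id]))


# Hypothetical vectors for previously seen texts.
corpus = {
    "cardiac_article": [0.9, 0.1, 0.0],
    "mining_article": [0.0, 0.2, 0.9],
    "surgeon_article": [0.1, 0.9, 0.1],
}

# Hypothetical vector for a new article about "hearts".
query_hearts = [0.8, 0.2, 0.1]

best = most_similar(query_hearts, corpus)
print(best)
```

No shared words are needed for the match; only the closeness of the vectors matters.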

Integrating Text with Data Modeling: Data scientists struggle to integrate unstructured data like text into their structured data models. The challenge is so great that often the text is simply neglected. However, the NoNLP representation is a fixed-length vector, so integrating with structured data is as easy as appending the structured values to the vector. Machine learning can then take place over both types of data at the same time.
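
The appending step can be sketched in a few lines of Python. The text vector, the structured fields (age, a prior-admission flag, a medication count), and the function name combine_features are hypothetical placeholders, not actual NoNLP output or API.

```python
def combine_features(text_vector, structured_features):
    """Concatenate a fixed-length text vector with structured values.

    The result is a single flat feature row that any standard
    learning algorithm can consume.
    """
    return list(text_vector) + list(structured_features)


# Hypothetical fixed-length representation of a clinical note.
text_vector = [0.12, -0.40, 0.88, 0.05]

# Hypothetical structured fields: age, prior admission (0/1), medication count.
structured = [67.0, 1.0, 3.0]

row = combine_features(text_vector, structured)
print(len(row))
```

Because every text yields a vector of the same length, every combined row has the same width, which is what most modeling tools expect of tabular input.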