
the Natural Language App, part 1


Introduction

Natural Language Processing (or NLP) is the art of taking human written language (or indeed human spoken language) and analyzing it to use it in some form or fashion.  Advances in natural language processing have made it possible to embed human language understanding in software applications.  Things such as personal assistants and bots are now commonplace.  The next step is a more integrated approach, the nl-app.  An nl-app is architecturally different and has other architectural concerns, but that is for part 2 of this article.

Before we start discussing this, we'll take a small detour through existing solutions and why I think there is a difference.

Personal assistants

There has been a series of new devices and assistants like Alexa, Echo, Google Home, Siri, Bixby and a few others.  These are stand-alone devices, usually with their own application API.  There is great potential for such devices to interface with the Internet of Things (IoT), online ordering and other use cases.  However, these devices aren't great at enhancing existing software products and have their own issues.  There is a lot of research available on speech-only interfaces.  Beeps and sounds are added to the device's "interface" to indicate to the user whether or not a message was understood.  More modern versions of such devices even have some form of display.

I believe it is still too early for such devices to become mainstream and replace our everyday experiences purely from an interface perspective.  Perhaps they were never intended to do so in the first place.  A hybrid solution is needed.

Online NLP Services

The three main cloud providers, Amazon, Microsoft and Google, have their own APIs for interfacing with a host of useful services for NLP.  There are other, more specialized providers too, like Cleverbot [3].  In doing so they've created a technical solution waiting for a problem.  A lot of organizations, having seen Google's Duplex [1], are now jumping on the bandwagon and creating applications consuming these services.  It was Winograd who, in 1968, showed that his SHRDLU [2] system could provide a fairly complete natural language set for exploring a limited domain.  Duplex is nothing new in that sense.  Google's addition of speech-to-text interfaces does add a new dimension to this problem.

These online NLP services have the advantage that they can be used in a server-less architecture [4].  In my experience, however, we usually require a tighter coupling between such services, and not at the client level of the stack.

Use of Natural Language in our own Software

the bot

A bot is primarily centered around what are called intents.  An intent is akin to a command, a coupling between a piece of natural language and an action (or a series of actions).  In bots, that action is usually a reply.  You can quickly construct a bot using online offerings.  By sticking to a limited domain you can quite quickly make your bot look intelligent too.
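
To make the idea of an intent concrete, here is a minimal sketch in Python.  It is an illustration only, not how any particular online offering works: the Intent class, the two example intents, the regular expressions and the reply texts are all hypothetical.  Each intent couples a pattern (the piece of natural language) to an action that produces a reply.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Intent:
    """Couples a piece of natural language (a pattern) to an action."""
    name: str
    pattern: "re.Pattern[str]"
    action: Callable[["re.Match[str]"], str]  # for a bot, the action yields a reply


def greet(match: "re.Match[str]") -> str:
    return "Hello! How can I help?"


def opening_hours(match: "re.Match[str]") -> str:
    # The named group "day" is a very crude form of entity extraction.
    return "We are open from 9am to 5pm on " + match.group("day").capitalize() + "."


# Hypothetical intents for a small, limited domain.
INTENTS = [
    Intent("greeting", re.compile(r"\b(hello|hi|hey)\b", re.IGNORECASE), greet),
    Intent("opening_hours", re.compile(r"\bopen on (?P<day>\w+)", re.IGNORECASE), opening_hours),
]


def reply(text: str) -> Optional[str]:
    """Return the reply of the first intent whose pattern matches, else None."""
    for intent in INTENTS:
        match = intent.pattern.search(text)
        if match:
            return intent.action(match)
    return None


if __name__ == "__main__":
    print(reply("Hi there"))
    print(reply("Are you open on monday?"))
```

Keeping the domain this small is exactly what makes such a bot look smarter than it is: anything outside the listed patterns simply gets no reply.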

External bots present more of an integration challenge.  Additional language processing is needed to deal with more complicated time requirements, entity recognition, and semantics in general.

Constructing your own bot, be it using neural-networks, semantic hashing [5], semantic vector spaces [6], or more traditional inverted indices, would enable you to add more complex processing.
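
As a rough sketch of the most traditional of these options, the following Python builds an inverted index from words to intents and picks the intent that shares the most words with the user's input.  The intent names, the example phrases and the whitespace tokenizer are assumptions made for illustration; a real system would add stemming, stop-word handling and better scoring.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional, Set

# Hypothetical training phrases per intent.
EXAMPLES: Dict[str, List[str]] = {
    "check_balance": ["what is my account balance", "show my balance"],
    "transfer_money": ["transfer money to savings", "send money to my account"],
}


def tokenize(text: str) -> List[str]:
    """Naive tokenizer: lower-case and split on whitespace."""
    return text.lower().split()


def build_index(examples: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """Inverted index: each word maps to the set of intents it occurs in."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for intent, phrases in examples.items():
        for phrase in phrases:
            for token in tokenize(phrase):
                index[token].add(intent)
    return index


def best_intent(index: Dict[str, Set[str]], text: str) -> Optional[str]:
    """Each distinct input word votes for every intent it is indexed under."""
    votes: Counter = Counter()
    for token in set(tokenize(text)):
        for intent in index.get(token, ()):
            votes[intent] += 1
    if not votes:
        return None
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    index = build_index(EXAMPLES)
    print(best_intent(index, "show balance"))
    print(best_intent(index, "transfer money"))
```

Because you own the index, this is the point where extra processing (entity recognition, time expressions, semantics) can be hooked in before or after the lookup.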

natural interfaces and the nl-app

We've got our rich widget sets.  GUIs are easier to learn, while traditional command-line interfaces (CLIs) are more powerful [7].  What if we could construct a CLI using natural language, but keep our widget sets?  Sort of like the ubiquitous Google search box, in addition to our existing buttons and interfaces.

A natural language command like "what was my average income between January and March of 2018" would take quite a few widgets to set up, and many more buttons or menus to deal with all the possible different commands.  Natural language, when used right, can be far more powerful than widgets.
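
To show what such a command has to turn into, here is a small Python sketch that maps that one sentence onto a structured query.  It is deliberately naive: the Query class, the regular expression and the supported aggregates are hypothetical, and a single pattern like this only covers one phrasing.  A real nl-app would need proper language processing rather than one regex.

```python
import calendar
import re
from dataclasses import dataclass
from datetime import date
from typing import Optional

MONTH_NAMES = ["january", "february", "march", "april", "may", "june", "july",
               "august", "september", "october", "november", "december"]
MONTHS = {name: number for number, name in enumerate(MONTH_NAMES, start=1)}


@dataclass
class Query:
    """Structured form of the command; what the widgets would otherwise collect."""
    aggregate: str
    field: str
    start: date
    end: date


# Hypothetical pattern covering exactly one way of asking the question.
PATTERN = re.compile(
    r"what (?:was|is) my (?P<agg>average|total) (?P<field>\w+) "
    r"between (?P<m1>\w+) and (?P<m2>\w+) of (?P<year>\d{4})",
    re.IGNORECASE,
)


def parse(text: str) -> Optional[Query]:
    """Turn the sentence into a Query, or None if it does not fit the pattern."""
    match = PATTERN.search(text)
    if match is None:
        return None
    first = MONTHS.get(match.group("m1").lower())
    last = MONTHS.get(match.group("m2").lower())
    if first is None or last is None:
        return None
    year = int(match.group("year"))
    # monthrange returns (weekday of first day, number of days in the month).
    last_day = calendar.monthrange(year, last)[1]
    return Query(
        aggregate=match.group("agg").lower(),
        field=match.group("field").lower(),
        start=date(year, first, 1),
        end=date(year, last, last_day),
    )


if __name__ == "__main__":
    print(parse("what was my average income between January and March of 2018"))
```

The resulting Query carries an aggregate, a field and a date range, which in a pure GUI would mean at least a drop-down, a field selector and two date pickers.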

References

[1]  https://www.theverge.com/2018/5/8/17332070/google-assistant-makes-phone-call-demo-duplex-io-2018
[2]  https://hci.stanford.edu/winograd/shrdlu/
[3]  https://www.cleverbot.com/
[4]  https://martinfowler.com/articles/serverless.html
[5]  https://www.jpinfotech.org/understanding-short-texts-semantic-enrichment-hashing/
[6]  https://github.com/peter3125/sentence2vec
[7]  https://www.cybrary.it/0p3n/command-line-interface-cli-vs-graphical-user-interface-gui/
