The Blog of Ian Mercer.

A strongly-typed natural language engine (C# NLP)

Please visit nlp.abodit.com for information about my natural language engine. The information below is out of date.

Here is an explanation of the natural language engine that powers my home automation system. It's a strongly-typed natural language engine in which tokens and sentences are defined in code. It currently understands sentences that control lights, heating, music, sprinklers and more. You can ask it who called, or tell it to play music in a particular room. It tells you when a car comes down the drive, when the traffic is bad on I-90, when there's fresh snow in the mountains, when it finds new podcasts from NPR, and much more.

The natural language engine itself is a separate component that I hope one day to use in other applications.

Existing Natural Language Engines

  • Have a large, STATIC dictionary data file
  • Can parse complex sentence structure
  • Hand back a tree of tokens (strings)
  • Don’t handle conversations

C# NLP Engine

  • Defines strongly-typed tokens in code
  • Uses type inheritance to model ‘is a’
  • Defines sentences in code
  • Rules engine executes sentences
  • Understands context (conversation history)

Sample conversation

[Image: sample conversation]

Goals

  • Make it easy to define tokens and sentences (not XML)

  • Safe, compile-time checked definition of the syntax and grammar (no XML)

  • Model real-world inheritance with C# class inheritance: ‘a labrador’ is ‘a dog’ is ‘an animal’ is ‘a thing’

  • Handle ambiguity, e.g.

    play something in the air tonight in the kitchen
    remind me at 4pm to call john at 5pm

C# NLP Engine Structure

Tokens - Token Definition

  • A hierarchy of Token-derived classes
  • Uses inheritance, e.g. TokenOn is a TokenOnOff is a TokenState is a Token. This allows a single sentence rule to handle multiple cases, e.g. On and Off
  • Derived from base Token class
  • Simple tokens are a set of words, e.g. « is | are »
  • Complex tokens have a parser, e.g. TokenDouble

A Simple Token Definition

public class TokenPersonalPronoun : TokenGenericNoun 
{
    internal static string wordz { get { return "he,him,she,her,them"; } } 
}
  • Recognizes any of the words specified
  • Can use inheritance (as in this example)
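
As a rough illustration of how a word list like this could be consumed, the sketch below reads the static wordz property by reflection and checks a word against it. The Token and TokenGenericNoun stubs and the WordTokenSketch helper are assumptions made for the example; the real engine's lookup is not shown in this post and may work quite differently.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Minimal stand-ins for the engine's base classes (assumed, not the real ones).
public class Token { }
public class TokenGenericNoun : Token { }

public class TokenPersonalPronoun : TokenGenericNoun
{
    internal static string wordz { get { return "he,him,she,her,them"; } }
}

// Hypothetical helper, for illustration only.
public static class WordTokenSketch
{
    // Reads the comma-separated 'wordz' property declared directly on a token type.
    public static HashSet<string> WordsFor(Type tokenType)
    {
        var words = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var prop = tokenType.GetProperty("wordz",
            BindingFlags.Static | BindingFlags.Public |
            BindingFlags.NonPublic | BindingFlags.DeclaredOnly);
        if (prop == null) return words;          // not a simple word token

        var csv = (string)prop.GetValue(null, null);
        foreach (var word in csv.Split(','))
            words.Add(word.Trim());
        return words;
    }

    // True when the word is one of the token type's words (case-insensitive).
    public static bool Recognizes(Type tokenType, string word)
    {
        return WordsFor(tokenType).Contains(word);
    }
}
```

With these definitions, WordTokenSketch.Recognizes(typeof(TokenPersonalPronoun), "She") is true, while a type with no wordz property, such as the Token stub, yields an empty word set.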

A Complex Token

public abstract class TokenNumber : Token 
{ 
    public static IEnumerable<TokenResult> Initialize(string input) { … 

  • Initialize method parses input and returns one or more possible parses.

TokenNumber is a good example:

  • Parses any numeric value and returns one or more of TokenInt, TokenLong, TokenIntOrdinal, TokenDouble, or TokenPercentage results.
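
The idea of returning several candidate parses for one input can be sketched as below. The NumberToken, IntToken, DoubleToken and PercentageToken types and the NumberParseSketch class are simplified stand-ins invented for this example; they are not the engine's TokenResult or TokenNumber types, and the real parser handles more cases (longs, ordinals and so on).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Simplified, hypothetical result types for illustration.
public abstract class NumberToken { }
public sealed class IntToken : NumberToken { public int Value; }
public sealed class DoubleToken : NumberToken { public double Value; }
public sealed class PercentageToken : NumberToken { public double Percent; }

public static class NumberParseSketch
{
    // Yields every interpretation of the input that parses successfully.
    public static IEnumerable<NumberToken> Parse(string input)
    {
        var text = input.Trim();

        // A trailing '%' is treated as a percentage and nothing else.
        if (text.EndsWith("%", StringComparison.Ordinal))
        {
            double pct;
            if (double.TryParse(text.Substring(0, text.Length - 1),
                    NumberStyles.Float, CultureInfo.InvariantCulture, out pct))
                yield return new PercentageToken { Percent = pct };
            yield break;
        }

        // "42" is both a valid int and a valid double, so both are returned
        // and the sentence rules decide which one fits.
        int i;
        if (int.TryParse(text, NumberStyles.Integer, CultureInfo.InvariantCulture, out i))
            yield return new IntToken { Value = i };

        double d;
        if (double.TryParse(text, NumberStyles.Float, CultureInfo.InvariantCulture, out d))
            yield return new DoubleToken { Value = d };
    }
}
```

Here "42" produces an int and a double candidate, "3.5" produces only a double, and "50%" produces only a percentage. Leaving the choice between candidates to the rule signatures is one way to keep the token parser free of context.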

The catch-all TokenPhrase

public class TokenPhrase : Token

TokenPhrase matches anything, especially anything in quote marks

e.g. add a reminder "call Bruno at 4pm"

The sentence signature to recognize this could be

    (…, TokenAdd, TokenReminder, TokenPhrase, TokenExactTime)

This would match the rule too …

    add a reminder discuss 6pm conference call with Bruno at 4pm

TemporalTokens

A complete set of tokens and related classes for representing time

  • Point in time, e.g. today at 5pm
  • Approximate time, e.g. who called at 5pm today
  • Finite sequence, e.g. every Thursday in May 2009
  • Infinite sequence, e.g. every Thursday
  • Ambiguous time with context, e.g. remind me on Tuesday (context means it is next Tuesday)
  • Null time
  • Unknowable/incomprehensible time

TemporalTokens (Cont.)

Code to merge any sequence of temporal tokens to the smallest canonical representation,

e.g.

the first thursday in may 2009
    -> {TIMETHEFIRST the first} + {THURSDAY thursday} + {MAY in may} + {INT 2009 -> 2009}
    -> [TEMPORALSETFINITESINGLEINTERVAL [Thursday 5/7/2009] ]
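
The date arithmetic behind the final step of that example can be sketched with plain DateTime. This is only the calendar calculation, assuming an ordinal, a weekday, a month and a year have already been extracted; OrdinalWeekdaySketch and NthWeekday are names made up for the example, and the engine's own merge code is not shown here.

```csharp
using System;

public static class OrdinalWeekdaySketch
{
    // Returns the n-th (1-based) occurrence of a weekday in the given month,
    // or null when the month has no such occurrence (e.g. a fifth Thursday).
    public static DateTime? NthWeekday(int year, int month, DayOfWeek day, int n)
    {
        if (n < 1) return null;

        var first = new DateTime(year, month, 1);

        // Days from the 1st of the month to the first matching weekday (0..6).
        int offset = ((int)day - (int)first.DayOfWeek + 7) % 7;

        var result = first.AddDays(offset + 7 * (n - 1));
        return result.Month == month ? result : (DateTime?)null;
    }
}
```

NthWeekday(2009, 5, DayOfWeek.Thursday, 1) returns 7 May 2009, which matches the [Thursday 5/7/2009] interval above.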

TemporalTokens (Cont.)

Finite TemporalClasses provide

A way to enumerate the DateTimeRanges they cover

All TemporalClasses provide

A LINQ expression generator and Entity-SQL expression generator allowing them to be used to query a database
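
To give a feel for what a LINQ expression generator for a time range might look like, here is a minimal sketch that turns a start and end time into a predicate expression over any entity with a DateTime column. DateTimeRangeSketch, ToPredicate and CallRecord are hypothetical names for this example; the engine's temporal classes and their generators are not reproduced here.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical entity with a timestamp column.
public class CallRecord
{
    public DateTime When { get; set; }
}

// Hypothetical half-open time range [Start, End).
public sealed class DateTimeRangeSketch
{
    public DateTime Start { get; }
    public DateTime End { get; }

    public DateTimeRangeSketch(DateTime start, DateTime end)
    {
        Start = start;
        End = end;
    }

    // Builds: x => selector(x) >= Start && selector(x) < End
    // as an expression tree, so a LINQ provider can translate it to a query.
    public Expression<Func<T, bool>> ToPredicate<T>(Expression<Func<T, DateTime>> selector)
    {
        var value = selector.Body;
        var body = Expression.AndAlso(
            Expression.GreaterThanOrEqual(value, Expression.Constant(Start)),
            Expression.LessThan(value, Expression.Constant(End)));
        return Expression.Lambda<Func<T, bool>>(body, selector.Parameters);
    }
}
```

A query such as "who called on 7 May 2009" could then, under these assumptions, become calls.Where(range.ToPredicate<CallRecord>(c => c.When)), with the range covering that day. Because the result is an expression tree and not a compiled delegate, it can be passed to an IQueryable source as well as compiled for in-memory use.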

Existing Token Types

  • Numbers (double, long, int, percentage, phone, temperature)
  • File names, Directories
  • URLs, Domain names
  • Names, Companies, Addresses
  • Rooms, Lights, Sensors, Sprinklers, …
  • States (On, Off, Dim, Bright, Loud, Quiet, …)
  • Units of Time, Weight, Distance
  • Songs, albums, artists, genres, tags
  • Temporal expressions
  • Commands, verbs, nouns, pronouns, …

Rules - A simple rule

	/// <summary>
	/// Set a light to a given state 
	///</summary>
	private static void LightState(NLPState st, TokenLight tlight, TokenStateOnOff ts)
	{
		if (ts.IsTrueState == true)
			tlight.ForceOn(st.Actor); 
		if (ts.IsTrueState == false)
			tlight.ForceOff(st.Actor);
		st.Say("I turned it " + ts.LowerCased); 
	}

Any method whose signature is an NLPState followed by one or more Token-derived parameters is a sentence rule: (NLPState, Token*)

Rule matching respects inheritance and supports variable repeats (array parameters), e.g. (NLPState st, TokenThing tt, TokenState tokenState, TokenTimeConstraint[] constraints)

Rules are discovered on startup using Reflection and an efficient parse graph is built allowing rapid detection and rejection of incoming sentences.
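
Discovery by reflection and inheritance-aware matching can be sketched as follows. This is a deliberately naive version under stated assumptions: the stub NLPState and token classes, SampleRules and RuleDiscoverySketch are all invented for the example, array parameters are ignored, and rules are tried in a linear scan where the real engine builds a parse graph.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Stub types standing in for the engine's classes (assumed shapes).
public class NLPState { public List<string> Said = new List<string>(); }
public class Token { }
public class TokenLight : Token { }
public class TokenState : Token { }
public class TokenOn : TokenState { }

public static class SampleRules
{
    // A rule: NLPState first, then Token-derived parameters.
    private static void LightState(NLPState st, TokenLight light, TokenState state)
    {
        st.Said.Add("light -> " + state.GetType().Name);
    }

    // Not a rule: the first parameter is not an NLPState.
    private static void NotARule(string s) { }
}

public static class RuleDiscoverySketch
{
    // Finds every static method shaped like (NLPState, Token, Token, ...).
    public static List<MethodInfo> Discover(Type container)
    {
        return container
            .GetMethods(BindingFlags.Static | BindingFlags.Public |
                        BindingFlags.NonPublic | BindingFlags.DeclaredOnly)
            .Where(IsRule)
            .ToList();
    }

    private static bool IsRule(MethodInfo m)
    {
        var p = m.GetParameters();
        return p.Length >= 2
            && p[0].ParameterType == typeof(NLPState)
            && p.Skip(1).All(x => typeof(Token).IsAssignableFrom(x.ParameterType));
    }

    // Invokes the first rule whose parameter types accept the given tokens.
    // IsInstanceOfType makes a TokenOn match a TokenState parameter.
    public static bool TryExecute(IEnumerable<MethodInfo> rules, NLPState st, params Token[] tokens)
    {
        foreach (var rule in rules)
        {
            var p = rule.GetParameters();
            if (p.Length != tokens.Length + 1) continue;

            bool match = true;
            for (int i = 0; i < tokens.Length; i++)
            {
                if (!p[i + 1].ParameterType.IsInstanceOfType(tokens[i])) { match = false; break; }
            }
            if (!match) continue;

            var args = new object[] { st }.Concat<object>(tokens).ToArray();
            rule.Invoke(null, args);
            return true;
        }
        return false;
    }
}
```

Calling Discover(typeof(SampleRules)) finds only LightState, and passing a TokenLight followed by a TokenOn executes it because TokenOn is a TokenState. The same tokens in the opposite order match no rule.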

State - NLPState

  • Every sentence method takes an NLPState first parameter
  • State includes RememberedObject(s) allowing sentences to react to anything that happened earlier in a conversation
  • Non-interactive uses can pass a dummy state
  • State can be per-user or per-conversation for non-realtime conversations like email

User Interfaces

  • Chat (e.g. Jabber/Gtalk)
  • Web chat
  • Email
  • Calendar (do X at time Y)
  • Rich client application

Summary

  • Strongly-typed natural language engine
  • Compile time checking, inheritance, …
  • Define tokens and sentences (rules) in C#
  • Strongly-typed tokens: numbers, percentages, times, dates, file names, urls, people, business objects, …
  • Builds an efficient parse graph
  • Tracks conversation history

  • Company names, locations, documents, …
  • From TimeExpressions
