Programming

Dynamically building ‘Or’ Expressions in LINQ

One common question on Stack Overflow concerns the creation of a LINQ expression that logically Ors together a set of predicates, where the expression has to be built dynamically. Creating the ‘And’ version is easy: you simply stack multiple ‘.Where’ clauses onto a query as you add each predicate. You can’t do the same for ‘Or’. The common responses are ‘use LINQKit’ or ‘use Dynamic LINQ’. LINQKit, however, adds the unfortunate ‘.AsExpandable()’ into the expression, which can cause problems in some circumstances, and Dynamic LINQ is not strongly typed, so it doesn’t survive renaming operations. Neither answer is ideal.

But there is another way: using a bit of Expression tree manipulation you can build an ‘Or’ expression dynamically while staying strongly typed. The code below achieves this.

using System;
using System.Linq;
using System.Linq.Expressions;
using System.Collections.Generic;

public static class ExpressionBuilder
{
  public static Expression<Func<T, bool>> True<T>() { return f => true; }
  public static Expression<Func<T, bool>> False<T>() { return f => false; }

  public static Expression<T> Compose<T>(this Expression<T> first, 
       Expression<T> second, 
       Func<Expression, Expression, Expression> merge)
  {
      // build parameter map (from parameters of second to parameters of first)
      var map = first.Parameters
                   .Select((f, i) => new { f, s = second.Parameters[i] })
                   .ToDictionary(p => p.s, p => p.f);

      // replace parameters in the second lambda expression with parameters from 
      // the first
      var secondBody = ParameterRebinder.ReplaceParameters(map, second.Body);
      // apply composition of lambda expression bodies to parameters from 
      // the first expression 
      return Expression.Lambda<T>(merge(first.Body, secondBody), first.Parameters);
  }

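  public static Expression<Func<T, bool>> And<T>(
      this Expression<Func<T, bool>> first,
      Expression<Func<T, bool>> second)
  {
      // AndAlso is the short-circuiting && (Expression.And is the non-short-circuiting &)
      return first.Compose(second, Expression.AndAlso);
  }

  public static Expression<Func<T, bool>> Or<T>(
      this Expression<Func<T, bool>> first,
      Expression<Func<T, bool>> second)
  {
      // OrElse is the short-circuiting || (Expression.Or is the non-short-circuiting |)
      return first.Compose(second, Expression.OrElse);
  }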

  public class ParameterRebinder : ExpressionVisitor
  {
      private readonly Dictionary<ParameterExpression, ParameterExpression> map;

      public ParameterRebinder(
          Dictionary<ParameterExpression, 
          ParameterExpression> map)
      {
          this.map = map??new Dictionary<ParameterExpression,ParameterExpression>();
      }

      public static Expression ReplaceParameters(
          Dictionary<ParameterExpression, 
          ParameterExpression> map, 
          Expression exp)
      {
          return new ParameterRebinder(map).Visit(exp);
      }

      protected override Expression VisitParameter(ParameterExpression p)
      {
          ParameterExpression replacement;
          if (map.TryGetValue(p, out replacement))
          {
              p = replacement;
          }
          return base.VisitParameter(p);
      }
  }
}
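
The dynamic part is then just a fold over however many predicates you have. Here is a minimal, self-contained sketch of that idea. The names OrDemo, Rebinder, AnyOf and Run are made up for illustration, and the local Or is a condensed single-parameter stand-in for the ExpressionBuilder above so that the sample compiles on its own. It runs against an in-memory IQueryable; how a particular LINQ provider translates the combined tree is not something this sketch can show.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Linq.Expressions;

public static class OrDemo
{
    // Condensed stand-in for ParameterRebinder: swaps one parameter for another.
    private sealed class Rebinder : ExpressionVisitor
    {
        private readonly ParameterExpression from, to;
        public Rebinder(ParameterExpression from, ParameterExpression to)
        {
            this.from = from;
            this.to = to;
        }

        protected override Expression VisitParameter(ParameterExpression p)
        {
            return p == from ? to : base.VisitParameter(p);
        }
    }

    // Joins two single-parameter predicates with a short-circuiting OR.
    public static Expression<Func<T, bool>> Or<T>(
        Expression<Func<T, bool>> first, Expression<Func<T, bool>> second)
    {
        var secondBody = new Rebinder(second.Parameters[0], first.Parameters[0])
            .Visit(second.Body);
        return Expression.Lambda<Func<T, bool>>(
            Expression.OrElse(first.Body, secondBody), first.Parameters);
    }

    // Folds any number of predicates into one expression.
    // The seed is 'false' so that an empty list matches nothing.
    public static Expression<Func<T, bool>> AnyOf<T>(
        IEnumerable<Expression<Func<T, bool>>> predicates)
    {
        Expression<Func<T, bool>> seed = x => false;
        return predicates.Aggregate(seed, (acc, p) => Or(acc, p));
    }

    public static int[] Run()
    {
        // In real code this list would be built at runtime, e.g. from user input.
        var predicates = new List<Expression<Func<int, bool>>>
        {
            n => n < 2,
            n => n == 5,
            n => n > 8
        };

        var filter = AnyOf(predicates);
        return Enumerable.Range(0, 10).AsQueryable().Where(filter).ToArray();
    }
}
```

Calling OrDemo.Run() filters the numbers 0 to 9 down to 0, 1, 5 and 9. Seeding the fold with a ‘false’ predicate is the same role the False<T>() helper plays in the class above.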

NB: Some of the ideas in this post came from other blog posts. I can’t find them right now, but if part of this was your idea I’d be happy to add a link to your blog.

VariableWithHistory – making persistence invisible, making history visible

In a typical .NET application variables have a short lifetime. When they go out of scope or the application ends their value is lost. Also, you cannot ask a variable what its value was 1 hour ago, or what its average, maximum or minimum value was yesterday.

Yet, such a variable would be extremely useful when writing a Home Automation System because you often need to make comparisons between a current value and some historical average, or between two ranges (e.g. was the kitchen more or less occupied than yesterday). Now, normally you wouldn’t want to mix persistence up with the representation of a value in your code (see ‘Separation of Concerns’), but in this case I decided that it was worth mixing the two concepts because the benefits of doing so were so great.

So I created a class called VariableWithHistory<T> which is the abstract base class for IntegerWithHistory, DoubleWithHistory, BoolWithHistory, StringWithHistory and a number of others.  The first property worth noting on these classes is the .Current property.  This always gives you the latest value that has been set.  Setting the .Current value stores both the value and the DateTime (Utc of course) at which the value became current.  A history of all past values is maintained in MongoDB up to some suitable limit per variable (each variable can have its own adjustable history size in bytes by using MongoDB’s capped collections).  If the new value is the same as the old one no update is made, the implicit behavior being that the value changed and stayed there until it changes again, so if you want to know what the value is now it is the same as the last change recorded.

With this new variable type in place any object in the house can have any number of persistent fields on it (bool occupied, double temperature, string triggeredBy, …).  Updating these values is as simple as assigning to their .Current property.  When the system loads, each value comes back with the value it had when the system was shut down.  To accomplish this every VariableWithHistory is given a unique id (based on the unique id of its parent, e.g. a room).

So far so good: shut down, restart, and the house doesn’t need to query a device to know if it’s on or off, and all the long-running Sequential Logic Blocks I use for rules (e.g. .Delay(days:2)) carry on running as if nothing happened.  This is particularly useful since I typically deploy a new version almost every day and some logic blocks have long delays built into them.

But besides providing simple recovery from a reboot, these persistent variables allow me to do some much more interesting things.

int CountTransitions(DateTimeRange range, T direction);
Counts how many transitions there have been to the value T in a given time range, e.g. how many times did the driveway alarm go ‘true’ this evening?

Dictionary<T, double> Fractional(DateTimeRange range);
Builds a histogram of all the values seen in the time range, e.g. 50% hot, 20% cold, 30% warm for a string variable that tracks temperature

DateTimeOffset LastChangedState
e.g. when was this sensor last triggered?

TimedValue<T> ValueAtTime(DateTimeOffset dt)
What was the value at a given time in the past, e.g. what was the temperature at the same time yesterday?
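
To make the ‘only changes are recorded’ behavior concrete, here is a rough in-memory sketch of how queries like these can be answered from a list of timestamped changes. It is not the real class: HistorySketch, Set and the (from, to) arguments are hypothetical names, the MongoDB persistence is left out entirely, and it assumes values are set in chronological order.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in; the real VariableWithHistory persists to MongoDB.
public class HistorySketch<T>
{
    // One entry per change, oldest first (assumes Set is called in time order).
    private readonly List<KeyValuePair<DateTimeOffset, T>> changes =
        new List<KeyValuePair<DateTimeOffset, T>>();

    public T Current
    {
        get { return changes.Count == 0 ? default(T) : changes[changes.Count - 1].Value; }
    }

    // Record a change only when the value differs from the latest one.
    public void Set(T value, DateTimeOffset at)
    {
        if (changes.Count > 0 && EqualityComparer<T>.Default.Equals(Current, value))
            return;
        changes.Add(new KeyValuePair<DateTimeOffset, T>(at, value));
    }

    // The value at a point in time is the last change at or before it.
    public T ValueAtTime(DateTimeOffset at)
    {
        var earlier = changes.Where(c => c.Key <= at).ToList();
        return earlier.Count == 0 ? default(T) : earlier[earlier.Count - 1].Value;
    }

    // Because repeats are never stored, every stored entry is a transition,
    // so counting transitions to a value is just counting matching entries.
    public int CountTransitions(DateTimeOffset from, DateTimeOffset to, T direction)
    {
        return changes.Count(c => c.Key >= from && c.Key < to
            && EqualityComparer<T>.Default.Equals(c.Value, direction));
    }
}
```

In this sketch the very first recorded value also counts as a transition; whether the real class treats it that way is not stated above.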

Each specific type of VariableWithHistory<T> may also have additional methods relevant to the type T.  For example, on DoubleWithHistory there is a method double Average(DateTimeOffset minValue, DateTimeOffset maxValue) which gets the average value over the specified time range.  On BoolWithHistory there is a method double PercentageTrue(DateTimeRange range) which you could use to find the average occupancy for a room yesterday.

My initial implementation waited for the database to write each update before allowing any queries but now I simply cache the Current value and assume that queries will probably get executed after updates and that the average temperature yesterday is close enough with or without the last 100ms of updates.  I did try to keep this class isolated from MongoDB but in the end the benefit of some of the atomic update capabilities in MongoDB made it easier to just take the dependency.

My previous implementation of this feature used my own in-memory database. MongoDB has slowed it down a bit, but I’ve gained the ability to archive terabytes of sensor data, which should prove useful for my next project: adding some machine learning to the system.

Neo4j Meetup in Seattle – some observations

I attended the Neo4j Meetup in Seattle this evening. It was an interesting tour around the internals of Neo4j and some of the design decisions behind how they store graphs in a database.

The most interesting thing about Neo4j is the Cypher query language, used to construct graph queries that follow relationships and evaluate conditions on the properties of relationships and nodes. Neo4j shows much promise in terms of being able to represent data in a very natural way and to query it using Cypher in ways that would bring SQL to its knees with join-upon-join-upon-join.

In an earlier blog post I lamented the lack of a single database solution that was the best of all worlds: relational + document + graph + semantic web. Tonight that feeling was compounded: Neo4j is a graph database but it’s missing several key features that could make it much more.

We were privileged to get a first-hand explanation of how Neo4j works internally, but what we saw looked like a work in progress: an unfinished implementation of something that could be so much better. Here are some of the things Neo4j needs to fix before I’ll give it a go:-

1) Stealing bits from one value to give to another to create odd word lengths like 23 bits is so 1980s. I cannot believe this is a worthwhile optimization to make in 2012. Neo should bite the bullet, upgrade their few existing customers and move to a more modern byte-aligned, 64-bit address space. I was equally amazed to see compression schemes implemented for text on disk while other obvious space-saving opportunities were omitted, like declaring some relationships to be one-way only (no reverse queries, thus no need to store the back link). It’s 2012: disk space is essentially limitless; I should never have to hit a file-size limit because someone decided to use 23, 28 or some other random number of bits instead of 64.

2) The extremely limited set of data types. If you want to store JSON you’d better support at least all the common JavaScript options including Dates. Frankly I don’t care if your database is written in Java; it exposes a web API using JSON so that’s what it should support. Also odd was the choice of a linked list, meandering its way through the file, as the way to store properties for a node. IMHO Neo4j should just switch to BSON and put a document size limit on nodes like MongoDB instead of carrying on down this bit-packing, linked-list approach to properties with a partial implementation of types.

3) The lack of file splitting at 2GB/4GB boundaries.

4) Putting nodes and relationships into separate files. Sure this simplifies the access pattern but it’s not going to give good locality to data on disk. An alignment based on disk block sizes with nodes and relationships packed into blocks seems likely to be a much better approach to minimizing disk seeks and reads.

5) Reliance on Lucene to provide indexing. Much as I appreciate Lucene, Neo4j needs built-in indexes; without them it’s impossible to optimize query plans across the graph and the indexes. MongoDB has a good selection of indexing options including 2D geo-spatial indexing; IMHO Neo4j should adopt the same set of options and offer queries that are both good relational database queries and good graph queries, rather than forcing their users to pick one or the other whilst handling the interop between two different systems.

In fact, in my ideal world Neo4j and MongoDB would just become one database: a document database that also has great graph-querying capabilities!

I’ll keep monitoring Neo4j, but in the meantime it’s full speed ahead with my own implementation of a graph database in MongoDB, with the added twist that relationships are all modeled as triples (just like in a semantic web triple-store). My graph-query language isn’t likely to be as powerful as Cypher any time soon, but I have indexes, the ability to query by relationships easily, and a robust implementation of properties on each node with support for all common data types. Through my interface-based approach to storing objects with multiple inheritance I also get strongly-typed result sets in C#.

Updated Release of the Abodit State Machine

I published a new version of the Abodit State Machine to NuGet this evening. You can find it here.

One breaking change in this version is that the state machine is now specified using three Type parameters instead of two:

public class OccupancyStateMachine : 
          StateMachine<OccupancyStateMachine, Event, BuildingArea>

The third type parameter, TContext, is a context object that can be passed in with every event occurrence or tick. This means that you don’t need to store any extraneous data in the state machine itself and can keep it as a pure representation of the state of the system.

In the example above I have an OccupancyStateMachine and the context is a BuildingArea. Each call to EventHappens now takes the event that happened and a BuildingArea object.
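
As a rough illustration of why passing the context in with each event keeps the machine itself lean, here is a deliberately tiny sketch. It is not the Abodit API: TinyState, TinyMachine, TinyDemo and the string events are hypothetical, and the context is just an int standing in for something like a timeout taken from a BuildingArea. It also shows one simple way hierarchical lookup can work: an event not handled by the current state is offered to its parent.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical miniature for illustration only; not the Abodit StateMachine.
public sealed class TinyState
{
    public readonly string Name;
    public readonly TinyState Parent;

    public TinyState(string name, TinyState parent = null)
    {
        Name = name;
        Parent = parent;
    }
}

public sealed class TinyMachine<TEvent, TContext>
{
    // Rules are keyed by (state, event); the handler receives the current
    // state and the per-event context, and returns the next state.
    private readonly Dictionary<Tuple<TinyState, TEvent>, Func<TinyState, TContext, TinyState>> rules =
        new Dictionary<Tuple<TinyState, TEvent>, Func<TinyState, TContext, TinyState>>();

    public TinyState Current { get; private set; }

    public TinyMachine(TinyState initial) { Current = initial; }

    public void When(TinyState state, TEvent ev, Func<TinyState, TContext, TinyState> handler)
    {
        rules[Tuple.Create(state, ev)] = handler;
    }

    // Walk from the current state up through its parents until a rule matches.
    // The context is supplied by the caller and never stored in the machine.
    public void EventHappens(TEvent ev, TContext context)
    {
        for (var s = Current; s != null; s = s.Parent)
        {
            Func<TinyState, TContext, TinyState> handler;
            if (rules.TryGetValue(Tuple.Create(s, ev), out handler))
            {
                Current = handler(Current, context);
                return;
            }
        }
    }
}

public static class TinyDemo
{
    public static string Run()
    {
        var notOccupied = new TinyState("Not occupied");
        var asleep = new TinyState("Asleep", notOccupied);
        var occupied = new TinyState("Occupied");

        var machine = new TinyMachine<string, int>(asleep);

        // The rule lives on the parent; the context is a made-up timeout in minutes.
        machine.When(notOccupied, "activity",
            (s, timeoutMinutes) => timeoutMinutes > 0 ? occupied : s);

        machine.EventHappens("activity", 15);
        return machine.Current.Name;
    }
}
```

TinyDemo.Run() returns "Occupied": the machine starts in Asleep, finds no rule there, and falls back to the rule defined on its parent state.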

When you define your state machine you will need to include 4 parameters in each lambda expression.

Here, for example, is the current state machine for a BuildingArea in my home automation. It uses a hierarchy of states with two base states: Not Occupied and Occupied. It has timers for activity within a room or for occupancy within rooms that are contained by a floor. Note how it also exposes an IObservable<State> so that other objects can subscribe to state machine changes. I didn’t want to take the Rx dependency in the state machine class itself but you can see how easy it is to hook it up.

Of interest also is the way I represent occupancy as three distinct states. The extra one, ‘Asleep’, represents a room that is not occupied in the sense that there is no motion there now, but there was at some point during the evening before.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using Abodit.StateMachine;
using log4net;
using Abodit.Units;
using AboditUnits.Units;
using System.Reactive.Subjects;
using System.Reactive.Linq;

namespace Abodit
{
    /// <summary>
    /// An Occupancy State machine handles not occupied, occupied, asleep
    /// </summary>
    [Serializable]
    public class OccupancyStateMachine : StateMachine<OccupancyStateMachine, Event, BuildingArea>
    {
        private readonly Subject<State> watch = new Subject<State>();
        public IObservable<State> Watch { get { return watch.AsObservable(); } }

        public override void OnStateChanging(StateMachine<OccupancyStateMachine, Event, BuildingArea>.State newState, BuildingArea context)
        {
            watch.OnNext(newState);
        }

        public static readonly State Starting = AddState("Starting");

        public static readonly State NotOccupied = AddState("Not occupied",
                (m, e, s, c) => { 
                                m.CancelScheduledEvent(eTick);          // Stop the clock
                                m.IsTimerRunning = false;
                                m.IsRecentlyOccupied = false;
                                m.IsHeavilyOccupied = false;
                                m.After(new TimeSpan(hours:0, minutes:5, seconds:0), e5MinutesSinceOccupied);
                                m.After(new TimeSpan(hours:24, minutes:0, seconds:0), e24hoursSinceOccupied);
                                m.After(new TimeSpan(hours:48, minutes:0, seconds:0), e48hoursSinceOccupied);
                             },
                (m, e, s, c) => { });

        public static readonly State NotOccupiedIn5Minutes = AddState("Not occupied in over 5 minutes",
                (m, e, s, c) => { },
                (m, e, s, c) => { }, NotOccupied);

        public static readonly State NotOccupiedInOver24Hours = AddState("Not occupied in over 24 hours",
                (m, e, s, c) => { },
                (m, e, s, c) => { }, NotOccupiedIn5Minutes);

        public static readonly State NotOccupiedInOver48Hours = AddState("Not occupied in over 48 hours",
                (m, e, s, c) => { },
                (m, e, s, c) => { }, NotOccupiedInOver24Hours);

        public static readonly State NotOccupiedInOver1Week = AddState("Not occupied in over 1 week",
                (m, e, s, c) => { },
                (m, e, s, c) => { }, NotOccupiedInOver48Hours);

        public static readonly State Asleep = AddState("Asleep",
                (m, e, s, c) =>
                {
                    // Set a timer going for 8:00 the next morning (local time)
                    var now = TimeProvider.Current.Now.LocalDateTime;
                    var morning = now.Hour < 8 ? now.Date.AddHours(8) : now.Date.AddDays(1).AddHours(8);
                    m.At(morning.ToUniversalTime(), eMorning);
                },
                (m, e, s, c) => { },
                parent:NotOccupied);

        public static readonly State Occupied = AddState("Occupied",
                (m, e, s, c) =>
                {
                    m.IsRecentlyOccupied = true;
                    // Add a timer that runs while we are occupied
                    m.Every(new TimeSpan(hours:0, minutes:0, seconds:10), eTick);
                    // And set a timer going to mark 5 minutes since occupied
                    m.After(new TimeSpan(hours:0, minutes:5, seconds:0), e5MinutesAfterBecomingOccupied);
                    m.CancelScheduledEvent(e5MinutesSinceOccupied);
                    m.CancelScheduledEvent(e24hoursSinceOccupied);
                    m.CancelScheduledEvent(e48hoursSinceOccupied);
                },
                (m, e, s, c) => { });

        public static readonly State HeavilyOccupied = AddState("Heavily occupied",
                (m, e, s, c) => { },
                (m, e, s, c) => { },
                parent:Occupied);

        private static readonly Event eStart = new Event("Starts");
        private static readonly Event eUserActivity = new Event("User activity");
        private static readonly Event eTick = new Event("Tick");
        private static readonly Event eTimeout = new Event("Timeout");
        private static readonly Event eMorning = new Event("Morning");
        private static readonly Event e5MinutesAfterBecomingOccupied = new Event("5 minutes after becoming occupied");
        private static readonly Event e5MinutesSinceOccupied = new Event("5 minutes since occupied");
        private static readonly Event e24hoursSinceOccupied = new Event("24 hours since occupied");
        private static readonly Event e48hoursSinceOccupied = new Event("48 hours since occupied");

        private static readonly Event eAllChildrenNotOccupied = new Event("No child occupied");
        private static readonly Event eAtLeastOneChildOccupied = new Event("At least one child occupied");

        private double decliningActivity = 0.0;         // Up 1000 every UserInput, down x0.92 (rateOfDecline) every 10-second tick
        private const int ActivityPerUserInput = 1000;
        private const double rateOfDecline = 0.92;

        public bool IsTimerRunning { get; set; }
        public bool IsRecentlyOccupied { get; set; }
        public bool IsHeavilyOccupied { get; set; }

        static OccupancyStateMachine()
        {
            // On startup we transition immediately to starting
            // but we want an event call to do this so we aren't doing any work
            // in the constructor, and so the initialization only happens when it's
            // a true 'cold start' not a 'warm start' from some database state
            Starting
                .When(eStart, (m, s, e, c) => { return NotOccupied; });

            // Note: This is a hierarchical state machine so NotOccupied includes Asleep
            NotOccupied
                .When(eAtLeastOneChildOccupied, (m, s, e, c) => 
                {
                    return Occupied;
                })
                .When(e5MinutesSinceOccupied, (m, s, e, c) =>
                {
                    // Could signal something??
                    return s;
                })
                .When(e24hoursSinceOccupied, (m, s, e, c) =>
                {
                    // Could signal something??
                    return s;
                })
                .When(e48hoursSinceOccupied, (m, s, e, c) =>
                {
                    // Could signal something??
                    return s;
                })
                .When(eUserActivity, (m, s, e, c) =>
                {
                    m.After(c.OccupancyTimeout, eTimeout);                // start a new timeout
                    m.IsTimerRunning = true;
                    return Occupied;
                });

            // Asleep is a substate of not occupied so no need for more logic on becoming occupied ...
            Asleep
                .When(eMorning, (m, s, e, c) =>
                {
                    // Eliminate Asleep if appropriate
                    return NotOccupied;
                });

            // Occupied includes recently occupied and heavily occupied ...
            Occupied
                .When(e5MinutesAfterBecomingOccupied, (m, s, e, c) => 
                {
                    m.IsRecentlyOccupied = false;
                    return s;
                })
                .When(eUserActivity, (m, s, e, c) =>
                {
                    // Accumulate activity ...
                    m.decliningActivity += ActivityPerUserInput;

                    m.CancelScheduledEvent(eTimeout);               // cancel the old timeout

                    m.After(c.OccupancyTimeout, eTimeout);                // start a new timeout
                    m.IsTimerRunning = true;

                    if (m.decliningActivity > 20 * ActivityPerUserInput)
                        return HeavilyOccupied;
                    else
                        return s;
                })
                .When(eAllChildrenNotOccupied, (m, s, e, c) =>
                    {
                        if (m.IsTimerRunning)
                        {
                            // If the timer is running ... wait until it runs out
                            return s;
                        }
                        else
                        {
                            DateTime nowLocal = TimeProvider.Current.Now.LocalDateTime;
                            if (nowLocal.Hour > 17)
                                return Asleep;
                            else
                                return NotOccupied;
                        }
                    })
                .When(eTick, (m, s, e, c) =>
                    {
                        m.decliningActivity *= rateOfDecline;
                        return s;
                    })
                .When(eTimeout, (m, s, e, c) =>
                    {
                        DateTime nowLocal = TimeProvider.Current.Now.LocalDateTime;
                        if (nowLocal.Hour > 17)
                            return Asleep;
                        else
                            return NotOccupied;
                    });

            HeavilyOccupied.When(eTick, (m, s, e, c) =>
            {
                // Same code as Occupied but this one will override if we are in HeavilyOccupied mode
                m.decliningActivity *= rateOfDecline;
                // Fall back to just occupied when ...
                if (m.decliningActivity < 0.2 * ActivityPerUserInput)
                    return Occupied;
                else
                    return s;

            });


        }

        public OccupancyStateMachine()
            : base(Starting)
        {
        }

        public OccupancyStateMachine(State initialState)
            : base(initialState)
        {
        }

        public override void Start()
        {
            this.EventHappens(eStart, null);
        }

        public void UserActivity(BuildingArea ba)
        {
            this.EventHappens(eUserActivity, ba);
        }

        public void AllChildrenNotOccupied(BuildingArea ba)
        {
            this.EventHappens(eAllChildrenNotOccupied, ba);
        }

        public void AtLeastOneChildOccupied(BuildingArea ba)
        {
            this.EventHappens(eAtLeastOneChildOccupied, ba);
        }
    }
}
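
The gap between the two thresholds in the listing above (enter Heavily occupied above 20 × 1000, fall back below 0.2 × 1000) gives the state a long tail. A small stand-alone calculation makes that concrete. ActivityDecay and TicksUntilFallBack are hypothetical names, the constants are copied from the listing, and the sketch assumes no further user activity arrives while the score decays.

```csharp
using System;

public static class ActivityDecay
{
    public const int ActivityPerUserInput = 1000;
    public const double RateOfDecline = 0.92;

    // Counts how many ticks it takes for an activity score to decay below
    // the fall-back threshold (0.2 * ActivityPerUserInput), assuming the
    // score is multiplied by RateOfDecline once per tick and nothing is added.
    public static int TicksUntilFallBack(double activity)
    {
        int ticks = 0;
        while (activity >= 0.2 * ActivityPerUserInput)
        {
            activity *= RateOfDecline;
            ticks++;
        }
        return ticks;
    }
}
```

Starting from 21 user inputs (a score of 21000, just over the entry threshold), TicksUntilFallBack returns 56. With the 10-second tick used above, that is a little over nine minutes of inactivity before the area drops back to plain Occupied.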

My first programme [sic]

At the risk of looking seriously old, here’s something found on a paper tape bearing the title “Ian’s First Programme” …

BEGIN INTEGER A,B,C,D,E,F,G,ANS'
READ A,B,C,D,E,F,G'
ANS:=(A+B+C+D+E+F+G)/7'
PRINT ANS'
END'

Can you identify the language and the computer it ran on?

Building a better .NET State Machine

[Note: Updated version on Nuget has slightly different API, see latest blog post.]

There are several state machine implementations for .NET out there but, sadly, none of them met all of the requirements I have for a state machine. These are:-

1) Well written using encapsulation and other good practices
2) Able to be easily serialized to disk
3) Able to handle temporal events easily (After … At … Every …)
4) Disk serialized form must expose a property saying when it next needs to be fetched from disk to run
5) Implements hierarchical states with entry and exit actions

So I built one, and have made the source code available on NuGet so you can add it to any project easily without any extra DLLs.

Look for “AboditStateMachine” on NuGet to download it. The download includes a sample state machine documented to show off some of its capabilities.

Defining states is easy: just give them a name and specify their parent state, if any:-

        public static readonly State UnVerified = AddState("UnVerified");

        public static readonly State Verified = AddState("Verified");

        // States are hierarchical.  If you are in state VerifiedRecently you are also in its parent state Verified.

        public static readonly State VerifiedRecently = AddState("Verified recently", parent: Verified);
        public static readonly State VerifiedAWhileAgo = AddState("Verified a while ago", parent: Verified);

You can use any other type that’s IEquatable as an Event type or you can use the provided Event class:

        private static Event eUserVerifiedEmail = new Event("User verified email");
        private static Event eScheduledCheck = new Event("Scheduled Check");
        private static Event eBeenHereAWhile = new Event("Been here a while");

The state machine itself is specified in a static constructor so it runs just once no matter how many instances of the state machine you create. Each method is provided with an instance of the state machine ‘m’ as well as the state ‘s’ and the event ‘e’ as appropriate:

        static DemoStatemachine()
        {
            UnVerified
                    .OnEnter((m, s, e) =>
                        {
                            // States can execute code when they are entered or when they are left
                            // In this case we start a timer to bug the user until they confirm their email
                            m.Every(new TimeSpan(hours: 10, minutes:0, seconds:0), eScheduledCheck);

                            // You can also set a reminder to happen at a specific time, or after a given interval just once
                            m.At(new DateTime(DateTime.Now.Year+1, 1, 1), eScheduledCheck);
                            m.After(new TimeSpan(hours: 24, minutes: 0, seconds: 0), eScheduledCheck);

                            // All necessary timing information is serialized with the state machine
                            // The serialized state machine also exposes a property showing when it next needs to be woken up
                            // External code will need to call the Tick(utc) method at that time to trigger the next temporal event
                        })
                    .When(eScheduledCheck, (m, s, e) =>
                    {
                        Trace.WriteLine("Here is where we would send a message to the user asking them to verify their email");
                        // We return the current state 's' rather than 'UnVerified' in case we are in a child state of 'Unverified'
                        // This makes it easy to handle hierarchical states and to either change to a different state or stay in the same state
                        return s;
                    })
                    .When(eUserVerifiedEmail, (m, s, e) =>
                    {
                        Trace.WriteLine("The user has verified their email address, we are done (almost)");
                        // Kill the scheduled check event, we no longer need it
                        m.CancelScheduledEvent(eScheduledCheck);
                        // Start a timer for one last transition
                        m.After(new TimeSpan(hours:24, minutes:0, seconds:0), eBeenHereAWhile);
                        return VerifiedRecently;
                    });

            VerifiedRecently
                    .When(eBeenHereAWhile, (m, s, e) =>
                    {
                        Trace.WriteLine("User has now been a member for over 24 hours - give them additional privileges for example");
                        // No need to cancel the eBeenHereAWhile event because it wasn't auto-repeating
                        //m.CancelScheduledEvent(eBeenHereAWhile);
                        return VerifiedAWhileAgo;
                    });

            Verified.OnEnter((m, s, e) => 
                {
                    Trace.WriteLine("The user is now fully verified");
                });

            VerifiedAWhileAgo.OnEnter((m, s, e) =>
                {
                    Trace.WriteLine("The user has been verified for over 24 hours");
                });

        }

With your state machine defined you can now create instances of it, trigger events on them, serialize them to disk, fetch them back, carry on eventing on them, …

            DemoStatemachine demoStateMachine = new DemoStatemachine(DemoStatemachine.UnVerified);

            // At the time specified in demoStateMachine.NextTimedEventAt you reload the state machine from disk and call
            demoStateMachine.Tick(DateTime.UtcNow);

            // When the user verifies their email address you call ...
            demoStateMachine.VerifiesEmail();

            // At any other time you can examine the current state, act on the state changed event, ...
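
To show the kind of bookkeeping that makes a ‘when do I next need waking up’ property possible, here is a small sketch of scheduled events with a tick method. It is an illustration under assumptions, not the AboditStateMachine source: TimerSketch and its members are hypothetical, events are plain strings, and a repeating event that is overdue several times over fires only once per Tick call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch of temporal-event bookkeeping; not the library's implementation.
public sealed class TimerSketch
{
    private sealed class Scheduled
    {
        public string Event;
        public DateTime DueUtc;
        public TimeSpan? Repeat;   // null for a one-shot event, set for a repeating one
    }

    private readonly List<Scheduled> pending = new List<Scheduled>();

    // One-shot event, due 'delay' after 'nowUtc'.
    public void After(DateTime nowUtc, TimeSpan delay, string ev)
    {
        pending.Add(new Scheduled { Event = ev, DueUtc = nowUtc + delay });
    }

    // Repeating event, first due one interval after 'nowUtc'.
    public void Every(DateTime nowUtc, TimeSpan interval, string ev)
    {
        pending.Add(new Scheduled { Event = ev, DueUtc = nowUtc + interval, Repeat = interval });
    }

    // The earliest due time: the moment a host would reload the machine and tick it.
    public DateTime? NextTimedEventAt
    {
        get { return pending.Count == 0 ? (DateTime?)null : pending.Min(p => p.DueUtc); }
    }

    // Fires everything that is due, oldest first, rescheduling repeating events
    // and dropping one-shot events. Returns the events fired by this call.
    public List<string> Tick(DateTime nowUtc)
    {
        var fired = new List<string>();
        var due = pending.Where(p => p.DueUtc <= nowUtc).OrderBy(p => p.DueUtc).ToList();
        foreach (var p in due)
        {
            fired.Add(p.Event);
            if (p.Repeat.HasValue)
                p.DueUtc += p.Repeat.Value;
            else
                pending.Remove(p);
        }
        return fired;
    }
}
```

Because the pending list is plain data, it can be serialized along with the rest of the machine, and a host only needs to persist the single NextTimedEventAt value to know when to load it again.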

I hope you find this new state machine implementation useful, and if you have any feedback, do please send it my way.

The Internet of Dogs

In my previous post about GreenGoose I described my initial experiences with this “Internet Of Things in a box” product. Recently I’ve been trying their API and have integrated it into my Home Automation System.

The initial integration was easy: I used the new ASP.NET WebApi Core Libraries (from NuGet) together with Newtonsoft Json.Net. GreenGoose’s datetime format is somewhat quirky but hopefully they’ll move to a more standard one soon. They are, however, also about to switch to OAuth so it’s going to require some more work when that happens.

Aside from a few simple WebAPI calls and some Json parsing the rest was just a matter of connecting up the appropriate TimeSeries classes that I use to track values that vary over time, declaring a few graphs, and deciding what to log. With that in place I can now spin up a home automation ‘sensor’ corresponding to any GreenGoose sensor Id and my home automation system will add all of the relevant graphs and charts, triggers and more for that device.

What’s interesting is that a single sensor can serve a couple of different purposes. The dog collar sensor, for example, polls regularly back to the base station, so it can be used to sense both how much exercise the dog has had and simply whether the dog is at home or not, which could be really handy for anyone with a dog that’s learned to ignore the invisible fence! Each sensor can, through the TimeSeries objects, also offer additional data and triggers that can be used elsewhere in the home, for example an alert if the dog was walked for less than half an hour in a day.

A simple state machine in C#

Within the Abodit Natural Language engine there is often a need to track the state of various elements of a conversation. For example, is the user logged in or not, have they verified their email address, what instructional text have we offered them so far, …

To make this easier I decided to add a simple state machine class to the Abodit utilities provided with my NLP Engine. There are, of course, a plethora of existing state machines on the web. Some of them are based on older .NET technology lacking use of generics and functional programming techniques. Others go overboard with fluent-style interfaces when a simple inheritance-based approach from an abstract base class would actually be simpler, less code, and more powerful. Most of them I didn’t discover until after I’d built this one. In any case, it’s always a good learning exercise to try to build something from scratch, so here goes …

First let’s take a look at the result. Here’s how you can define a state machine that derives from this new StateMachine class:

    public class LoginOutStatemachine : StateMachine<LoginOutStatemachine>
    {
        public static void ReportEnter(LoginOutStatemachine m, Event e, State state)
        {
            Console.WriteLine(m.User + " entered state " + state + " via " + e);
        }

        public static void ReportLeave(LoginOutStatemachine m, State state, Event e)
        {
            Console.WriteLine(m.User + " left state " + state + " via " + e);
        }

        public static State Initial = new State("Initial", ReportEnter, ReportLeave);
        public static State LoggedIn = new State("Logged In", ReportEnter, ReportLeave);
        public static State LoggedOut = new State("Logged Out", ReportEnter, ReportLeave);
        public static State Deleted = new State("Deleted", ReportEnter, ReportLeave);

        private static Event eLogsIn = new Event("Logs In");
        private static Event eLogsOut = new Event("Logs Out");
        private static Event eDeletesAccount = new Event("Account Deleted");

        static LoginOutStatemachine()
        {
            Initial
                    .When(eLogsIn, (m, s, e) => { Console.WriteLine("Logging in " + m.User); return LoggedIn; })
                    .When(eDeletesAccount, (m, s, e) => { Console.WriteLine("Deleting account " + m.User); return Deleted; });
            LoggedIn
                    .When(eLogsOut, (m, s, e) => { Console.WriteLine("Logging out " + m.User); return LoggedOut; })
                    .When(eDeletesAccount, (m, s, e) => { Console.WriteLine("Account deleted " + m.User); return Deleted; });
            LoggedOut
                    .When(eLogsIn, (m, s, e) => { Console.WriteLine("Logging in " + m.User); return LoggedIn; })
                    .When(eDeletesAccount, (m, s, e) => { Console.WriteLine("Account deleted " + m.User); return Deleted; });
        }

        public User User { get; private set; }

        public LoginOutStatemachine(State initial, User user)
            : base(initial)
        {
            this.User = user;
        }

        // Expose the events as public methods

        public void LogsIn()
        {
            this.EventHappens(eLogsIn);
        }

        public void LogsOut()
        {
            this.EventHappens(eLogsOut);
        }

        public void DeletesAccount()
        {
            this.EventHappens(eDeletesAccount);
        }
    }

As you can see, you define the states and the events for the state machine using static definitions. (Events trigger state changes and associated actions). Typically I’ll make the States public but the events private and instead provide method calls for each event that is allowed.

Each state can also have an Action that fires on entering the state and an action that fires on leaving the state and each action is provided with all of the parameters it might need (the state machine instance, the state it is going to or from, and the event that caused the transition to happen). In this case all of these entry and exit events are linked to the same method that simply reports what happened.

To define what happens when a given event is received by the state machine, you create the static constructor as shown. Then, using a fluent interface, you define the transitions out of each starting state by calling its When method, passing it the event and the action to take when that event happens in that state. The action must finish by returning the new state:

Initial
  .When(eLogsIn, (m, s, e) => { Console.WriteLine("Logging in " + m.User); return LoggedIn; })
  .When(eDeletesAccount, (m, s, e) => { Console.WriteLine("Deleting account " + m.User); return Deleted; });

The (m, s, e) parameters give you the state machine itself, the state you are coming from and the event that has been received. By passing your method all of these values I make it easy for you to access any properties of the state machine itself (e.g. a User object) and also allow you to write a single method that handles more than one event type or more than one initial state but which can still be parameterized by those values.

The other minor trick is that the StateMachine class is generic in the derived state machine class itself. This gives the base class access to `T` as the type of the inheriting state machine class, and thus to any additional properties you define there.
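The self-referencing generic is easier to see in isolation. Here is a minimal sketch of the pattern on its own; the names `Machine`, `UserMachine` and `Run` are made up for this illustration and are not part of the state machine code in this post.

```csharp
using System;

// The base class is generic in its own derived type (the "generic of self" pattern)
public abstract class Machine<T> where T : Machine<T>
{
    // Handlers receive the derived type, so they can use its extra properties without casting
    protected void Run(Action<T> handler)
    {
        handler((T)this);
    }
}

public class UserMachine : Machine<UserMachine>
{
    public string UserName { get; private set; }

    public UserMachine(string userName)
    {
        this.UserName = userName;
    }

    public void Greet()
    {
        // 'm' is typed as UserMachine here, not as the base class
        Run(m => Console.WriteLine("Hello " + m.UserName));
    }
}

public static class SelfGenericDemo
{
    public static void Main()
    {
        new UserMachine("alice").Greet();   // prints "Hello alice"
    }
}
```

The constraint `where T : Machine<T>` is what makes the cast `(T)this` legal inside the base class.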

Note how your state machine class can have properties like `User`, which allow the transition code to access any additional data it needs. You create an instance of the state machine for each user (all the heavy lifting is done in the static definition, so the state machine remains a light-weight object).
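As a rough usage sketch, creating and driving one instance might look like the code below. It is not stand-alone: it assumes it is compiled together with the `LoginOutStatemachine` class above and the `StateMachine` base class listed in this post. The `User` class here is a hypothetical stand-in, since the real one is not shown.

```csharp
using System;

// Hypothetical stand-in for whatever User type your application defines
public class User
{
    public string Name { get; private set; }
    public User(string name) { this.Name = name; }
    public override string ToString() { return this.Name; }
}

public static class LoginDemo
{
    public static void Main()
    {
        // One light-weight instance per user; the transitions live in the static definition
        var machine = new LoginOutStatemachine(LoginOutStatemachine.Initial, new User("alice"));

        machine.LogsIn();           // Initial -> Logged In
        machine.LogsIn();           // no transition is defined for this event from Logged In, so nothing happens
        machine.LogsOut();          // Logged In -> Logged Out
        machine.DeletesAccount();   // Logged Out -> Deleted

        Console.WriteLine(machine.CurrentState);   // prints "*Deleted*"
    }
}
```

Each call that changes state also writes the transition message and the leave/enter reports to the console, because every state was given `ReportEnter` and `ReportLeave`.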

In the case of the NLP engine you can pass an `IListener` in to the state machine constructor also so that you can `Say` messages back to the user. Since the state machine is such a light-weight object you can afford to create it for each message interaction with the user and the information you need to persist is just the current state (which I will soon make into a string lookup).

If you want to use the actual state machine in any of your own projects (gratis), here’s the current code:

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;

    /// <summary>
    /// A state machine allows you to track state and to take actions when states change
    /// This state machine provides a fluent interface for defining states and transitions
    /// </summary>
    /// <remarks>
    /// Nasty generic of self so we can refer to the inheriting class in here
    /// </remarks>
    [Serializable]
    [DebuggerDisplay("Current State = {CurrentState.Name}")]
    public abstract class StateMachine<T> where T:StateMachine<T>
    {
        public State CurrentState { get; set; }

        public StateMachine(State initial)
        {
            this.CurrentState = initial;
        }

        /// <summary>
        /// An event has happened, transition to next state
        /// </summary>
        public void EventHappens(Event @event)
        {
            this.CurrentState = this.CurrentState.OnEvent((T)this, @event);
        }

        /// <summary>
        /// An event that causes the state machine to transition to a new state
        /// </summary>
        /// <remarks>
        /// Defined as a nested class so that this state machine's events can only be used with it
        /// </remarks>
        [DebuggerDisplay("Event = {Name}")]
        public class Event
        {
            public string Name { get; private set; }
            public Event(string name)
            {
                this.Name = name;
            }
            public override string ToString()
            {
                return "~" + this.Name + "~";
            }
        }

        /// <summary>
        /// A state that the state machine can be in
        /// </summary>
        /// <remarks>
        /// Defined as a nested class so that this state machine's states can only be used with it
        /// </remarks>
        [DebuggerDisplay("State = {Name}")]
        public class State
        {
            /// <summary>
            /// The Name of this state
            /// </summary>
            public string Name { get; private set; }

            public Action<T, State, Event> ExitAction { get; private set; }
            public Action<T, Event, State> EntryAction { get; private set; }

            private readonly IDictionary<Event, Func<T, State, Event, State>> transitions = new Dictionary<Event, Func<T, State, Event, State>>();

            /// <summary>
            /// Create a new State with a name and an optional entry and exit action
            /// </summary>
            public State(string name, Action<T, Event, State> entryAction = null, Action<T, State, Event> exitAction = null)
            {
                this.Name = name;
                this.EntryAction = entryAction;
                this.ExitAction = exitAction;
            }

            public State When(Event @event, Func<T, State, Event, State> action)
            {
                transitions.Add(@event, action);
                return this;
            }

            public State OnEvent(T parent, Event @event)
            {
                Func<T, State, Event, State> transition = null;
                if (transitions.TryGetValue(@event, out transition))
                {
                    State newState = transition(parent, this, @event);
                    if (newState != this)
                    {
                        // Entry and exit actions only fire when CHANGING state
                        if (this.ExitAction != null) this.ExitAction(parent, this, @event);
                        if (newState.EntryAction != null) newState.EntryAction(parent, @event, newState);
                    }
                    return newState;
                }
                else
                    return this;        // did not change state
            }

            public override string ToString()
            {
                return "*" + this.Name + "*";
            }
        }
    }

For further reading on State Machines I recommend this Wikipedia Article.

MongoDB substring search with a difference

It’s quite common to want to search a database for a key that starts with a given string. In SQL you have LIKE and in MongoDB you have regular expressions:

db.customers.find( { name : { $regex : '^acme', $options: 'i' } } );

But what if you want to do the inverse of this? i.e. to search the database for the keys that are themselves substrings of the search string? For example, suppose you are trying to parse a block of text and you want to find phrases in the database that match the start of the current block of text. In SQL you would be dead in the water but with MongoDB you can create a RegEx that matches either the first word, or the first two words, or the first three words, … and so on.

We can construct a regular expression to do this. It might look something like: ^word1($| word2($| word3$))
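To see how a pattern of that shape behaves, here is a small sketch that tests one locally with .NET's `Regex` class rather than against MongoDB. The search text "the quick brown" and the candidate keys are invented for the illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ReversePrefixDemo
{
    public static void Main()
    {
        // Pattern written by hand for the search text "the quick brown"
        var pattern = new Regex("^the($| quick($| brown$))");

        // Hypothetical keys as they might be stored in the database
        string[] keys =
        {
            "the", "the quick", "the quick brown",   // whole-word prefixes of the search text
            "the quick brown fox", "the quic", "quick"
        };

        foreach (string key in keys)
        {
            Console.WriteLine("{0,-22} {1}", "'" + key + "'", pattern.IsMatch(key));
        }
    }
}
```

The first three keys match, because each is a whole-word prefix of the search text. The last three do not: one runs past the end of the search text, one stops in the middle of a word, and one does not start at the beginning.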

Here’s a C# method that can create the necessary regular expression:

        /// <summary>
        /// This generates a regular expression that matches as much of the given phrase as it can from a string
        /// i.e. a reverse prefix search where you want the database to supply the prefix and match it against your query
        /// useful for matching 'as much as possible from a given input'
        /// </summary>
        private string generatePrefixRegex(string phrase, bool atStart)
        {
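            // Requires: using System.Linq; using System.Text.RegularExpressions;
            string[] bits = phrase.Split(' ');
            string first = bits[0];
            string result = Regex.Escape(first);        // escape the first word too, like the rest

            // At the start of a sentence, if the first character is upper cased, we should also be looking for a lowercased version of it
            if (atStart && first.Length > 0 && char.IsUpper(first[0]))
            {
                result = string.Format("({0}|{1}){2}", char.ToLowerInvariant(first[0]), char.ToUpperInvariant(first[0]), Regex.Escape(first.Substring(1)));
            }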

            // Each additional word - either we end the string before it or we must include it

            foreach (var bit in bits.Skip(1))
            {
                result = result + "($| " + Regex.Escape(bit);
            }

            result = result + "$";                      // last word must end string

            foreach (var bit in bits.Skip(1))
            {
                result = result + ")";                 // close the expression
            }
            return "^" + result;                        // Must start at the start of a Name
        }

Programming a smart home with a fluent, domain-specific language

In response to a question I received recently, here is an example of the fluent extensions to C# that my home automation system uses to define how it should behave. In effect this is a domain-specific language for modeling home automation behavior as a finite state machine. Note how you can use either a purely declarative sequence or procedural code within a Do() block.

            Living.FloorSensor
                .Then(Entrance.HallBeam, 30)
                .Provided(Time.Bedtime)
                .ProvidedNot(Home.DinnerGuests)
                .PulseStretch(10 * 60)      // don't announce it too often - every 10 minutes
                .Do("Garage doors warning", () =>
            {
                if (Garage.GarageDoors.AreAnyOpen)
                {
                    FirstFloor.Kitchen.AllMediaZones.AnnounceString("Excuse me, I think you may want to close the garage doors.");
                }
            });

In many ways this is similar to the Reactive Framework from Microsoft, except that my work predates its availability. Unlike the Reactive Framework, my ‘sequential logic blocks’ also include persistence logic so that the entire state can be saved to disk (as it is every second). This is important because some transitions in the state machine might be several hours or even days long and you need to be able to restart the system and carry on exactly where you left off.

One key benefit of the declarative approach over the procedural approach is that the declarative approach can explain itself. So, the log entry for a light going on can show that the light was turned on because ‘it was dark’ and ‘we had visitors’ and ‘someone came into the room’. Compared to traditional home automation systems where you either have no logging at all or you have a log of what happened, this kind of logging is invaluable for figuring out what went wrong when the logic gets complicated. So in this example I should have moved the test for Garage.GarageDoors.AreAnyOpen out to a Provided clause which would allow it to be part of the reverse-chain logic explanation.
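The idea of a rule that can explain itself can be sketched in a few lines. This is not the code from my home automation system, just a much-simplified illustration: the `ExplainableRule` class and its `Provided` and `Evaluate` methods are hypothetical names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, much-simplified rule: a named action guarded by described conditions
public class ExplainableRule
{
    private readonly string action;
    private readonly List<KeyValuePair<string, Func<bool>>> conditions =
        new List<KeyValuePair<string, Func<bool>>>();

    public ExplainableRule(string action)
    {
        this.action = action;
    }

    // Each condition carries a human-readable description alongside its test
    public ExplainableRule Provided(string description, Func<bool> test)
    {
        conditions.Add(new KeyValuePair<string, Func<bool>>(description, test));
        return this;
    }

    // Returns a log line when every condition holds, or null when the rule does not fire
    public string Evaluate()
    {
        if (!conditions.All(c => c.Value())) return null;
        return action + " because " + string.Join(" and ", conditions.Select(c => "'" + c.Key + "'"));
    }
}

public static class ExplainDemo
{
    public static void Main()
    {
        string log = new ExplainableRule("Light turned on")
            .Provided("it was dark", () => true)
            .Provided("we had visitors", () => true)
            .Provided("someone came into the room", () => true)
            .Evaluate();

        Console.WriteLine(log);
        // prints: Light turned on because 'it was dark' and 'we had visitors' and 'someone came into the room'
    }
}
```

A test buried inside a procedural block has no description attached, which is why it cannot take part in this kind of explanation.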

Partial results can be captured at any point in these logic chains and then reused in other chains because each partial result is itself a Sensor device that implements the full fluent interface.

Ultimately I plan to hook the logging of what happened back up to the NLP engine, which will allow users to ask the home ‘Why did you put the driveway light on last night around 9pm?’ Beyond that, I plan to allow the logic itself to be defined using natural language.