My News
Class-free persistence and multiple inheritance in C# with MongoDB
May 4th
Much as I appreciate Object Relational Mappers and the C# type system, there’s a lot of work to do if you just want to create and persist a few objects. MongoDB alleviates a lot of that work with its BSON serialization code, which converts almost any object into BSON (a binary, JSON-like document format) and provides easy round-tripping with JSON.
But there’s no getting around the limitations of C# when it comes to multiple inheritance. You can use interfaces to get most of the benefits of multiple inheritance but implementing a tangled set of classes with multiple interfaces on them can lead to a lot of duplicate code.
What if there was a way to do multiple inheritance without ever having to write a class? What if we could simply declare a few interfaces and then ask for an object that implements all of them and a way to persist it to disk and get it back? What if we could later take one of those objects and add another interface to it? “Crazy talk” I hear you say!
Well, maybe not so crazy … take a look at the open source project impromptu-interface and you’ll see some of what you’ll need to make this reality. It can take a .NET dynamic object and turn it into an object that implements a specific interface.
Combine that with a simple MongoDB document store and some cunning logic to link the two together and voilà, we have persistent objects that can implement any interface dynamically, and there are absolutely no classes in sight anywhere!
Let’s take a look at it in use and then I’ll explain how it works. First, let’s define a few interfaces:
public interface ILegs
{
int Legs { get; set; }
}
public interface IMammal
{
double BodyTemperatureCelcius { get; set; }
}
// Interfaces can use multiple inheritance:
public interface IHuman: IMammal, ILegs
{
string Name { get; set; }
}
// We can have interfaces that apply to specific instances of a class: not all humans are carnivores
public interface ICarnivore
{
string Prey { get; set; }
}
Now let’s take a look at some code to create a few of these new dynamic documents and treat them as implementors of those interfaces. First we need a MongoDB connection:
MongoServer mongoServer = MongoServer.Create(ConnectionString);
MongoDatabase mongoDatabase = mongoServer.GetDatabase("Remember", credentials);
Next we grab a collection where we will persist our objects.
var sampleCollection = mongoDatabase.GetCollection<SimpleDocument>("Sample");
Now we can create some objects adding interfaces to them dynamically and we get to use those strongly typed interfaces to set properties on them.
var person1 = new SimpleDocument();
person1.AddLike<IHuman>().Name = "John";
person1.AddLike<ILegs>().Legs = 2;
person1.AddLike<ICarnivore>().Prey = "Cattle";
sampleCollection.Save(person1);
var monkey1 = new SimpleDocument();
monkey1.AddLike<IMammal>(); // mark as a mammal
monkey1.AddLike<ILegs>().Legs = 2;
monkey1.AddLike<ICarnivore>().Prey = "Bugs";
sampleCollection.Save(monkey1);
Yes, that’s it! That’s all we needed to do to create persisted objects that implement any collection of interfaces. Note how the IHuman is also an IMammal, because our code also supports inheritance among interfaces. We can load the objects back in from MongoDB and get strongly typed versions of them by using .AsLike<T>().
So next, let’s take a look at how we can query for objects that support a given interface and how we can get strongly typed objects back from MongoDB:
var query = Query.EQ("int", typeof(IHuman).Name);
var humans = sampleCollection.Find(query);
Console.WriteLine("Examine the raw documents");
foreach (var doc in humans)
{
Console.WriteLine(doc.ToJson());
}
Console.WriteLine("Use query results strongly typed");
foreach (IHuman human in humans.Select(m => m.AsLike<IHuman>()))
{
Console.WriteLine(human.Name);
}
Console.ReadKey();
So how does this ‘magic’ work? First we need a simple Document class. It can be any old object class, no special requirements. At the moment it does wrap these interface properties up in a document inside it called ‘prop’ making it just a little bit harder to query and index but still fairly easy.
/// <summary>
/// A very simple document object
/// </summary>
public class SimpleDocument : DynamicObject
{
public ObjectId Id { get; set; }
// All other properties are added dynamically and stored wrapped in another Document
[BsonElement("prop")]
protected BsonDocument properties = new BsonDocument();
/// <summary>
/// Interfaces that have been added to this object
/// </summary>
[BsonElement("int")]
protected HashSet<string> interfaces = new HashSet<string>();
/// <summary>
/// Add support for an interface to this document if it doesn't already have it
/// </summary>
public T AddLike<T>()
where T:class
{
interfaces.Add(typeof(T).Name);
foreach (var @interface in typeof(T).GetInterfaces())
interfaces.Add(@interface.Name);
return Impromptu.ActLike<T>(new Proxy(this.properties));
}
/// <summary>
/// Cast this object to an interface only if it has previously been created as one of that kind
/// </summary>
public T AsLike<T>()
where T : class
{
if (!this.interfaces.Contains(typeof(T).Name)) return null;
else return Impromptu.ActLike<T>(new Proxy(this.properties));
}
}
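The `GetInterfaces()` loop in `AddLike<T>` is what lets a document created as an `IHuman` also be found as an `IMammal`. Here is a minimal, dependency-free sketch of that bookkeeping. The `InterfaceNames` helper is hypothetical and not part of the original code; it only shows which names would end up in the `int` set.

```csharp
using System;
using System.Collections.Generic;

public interface ILegs { int Legs { get; set; } }
public interface IMammal { double BodyTemperatureCelcius { get; set; } }
public interface IHuman : IMammal, ILegs { string Name { get; set; } }

public static class InterfaceNames
{
    // Hypothetical helper mirroring the bookkeeping done in AddLike<T>():
    // the interface itself plus every interface it inherits from.
    public static HashSet<string> For<T>() where T : class
    {
        var names = new HashSet<string> { typeof(T).Name };
        foreach (var inherited in typeof(T).GetInterfaces())
            names.Add(inherited.Name);
        return names;
    }
}

public static class Program
{
    public static void Main()
    {
        // Lists IHuman, IMammal and ILegs (set order is not guaranteed).
        foreach (var name in InterfaceNames.For<IHuman>())
            Console.WriteLine(name);
    }
}
```

Because only the short type name is stored, two interfaces with the same name in different namespaces would collide; storing `FullName` instead would be one way around that.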
Then we need a simple proxy object to wrap up the properties as a dynamic object that we can feed to Impromptu:
public class Proxy : DynamicObject
{
public BsonDocument document { get; set; }
public Proxy(BsonDocument document)
{
this.document = document;
}
public override bool TryGetMember(GetMemberBinder binder, out object result)
{
BsonValue res;
// TryGetValue leaves res null when the member has never been set
this.document.TryGetValue(binder.Name, out res);
result = res == null ? null : res.RawValue;
return true; // We always support a member even if we don't have it in the dictionary
}
/// <summary>
/// Set a property (e.g. person1.Name = "Smith")
/// </summary>
public override bool TrySetMember(SetMemberBinder binder, object value)
{
// Add or overwrite, so that setting the same property twice is safe
this.document[binder.Name] = BsonValue.Create(value);
return true;
}
}
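To try the proxy idea without MongoDB or impromptu-interface installed, here is a minimal sketch that swaps the `BsonDocument` for a plain `Dictionary` and uses `dynamic` directly. The `DictionaryProxy` name and the dictionary backing store are assumptions made for illustration only.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Stand-in for Proxy: a Dictionary replaces the BsonDocument (an assumption).
public class DictionaryProxy : DynamicObject
{
    private readonly Dictionary<string, object> values;

    public DictionaryProxy(Dictionary<string, object> values)
    {
        this.values = values;
    }

    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        // A member that was never set reads as null instead of throwing.
        values.TryGetValue(binder.Name, out result);
        return true;
    }

    public override bool TrySetMember(SetMemberBinder binder, object value)
    {
        values[binder.Name] = value; // add or overwrite
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new Dictionary<string, object>();
        dynamic person = new DictionaryProxy(store);
        person.Name = "John";
        person.Legs = 2;
        Console.WriteLine(person.Name + " has " + person.Legs + " legs"); // John has 2 legs
        Console.WriteLine(store.Count);                                   // 2
    }
}
```

The values land in the backing dictionary, which is the part that would be persisted; the strongly typed interface view is a separate layer on top.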
And that’s it! There is no other code required. Multiple-inheritance and code-free persistent objects are now a reality! All you need to do is design some interfaces and objects spring magically to life and get persisted easily.
[NOTE: This is experimental code: it's a prototype of an idea that's been bugging me for some time as I look at how to meld Semantic Web classes which have multiple inheritance relationships with C# classes (that don't) and with MongoDB's document-centric storage format. Does everything really have to be stored in a triple-store or is there some hybrid where objects can be stored with their properties and triple-store statements can be reserved for more complex relationships? Can we get semantic web objects back as meaningful C# objects with strongly typed properties on them? It's an interesting challenge and this approach appears to have some merit as a way to solve it.]
Finally got the 1U Atom Server racked up
Apr 9th
The Atom server I added to my home network is finally installed in the rack and I’ve begun moving storage over to it and off the rather overloaded home automation server. The 1U immediately below it houses 4 SATA drives (mostly 2TB now) with USB/Firewire connections. Readers of my blog will recall my disdain for RAID as a “backup” technology (it’s for availability, not backup), so the storage scheme I use is original -> backup -> second backup, giving three copies of everything.
The Atom server is running 64-bit Windows Server and seems surprisingly fast. I plan to run MongoDB on it too.
The ‘Learning Database’ or ‘Why do we need so many different databases?’
Mar 24th
It’s 2011 and database management and design is still a tedious job. Software does more amazing things every day but when it comes to databases it’s back to 1970 in many ways.
My ideal database would borrow from RDBMS (like SQL Server), Document databases (like MongoDB), Graph Databases and Semantic Web Triple Stores; it would be the perfect hybrid of all of these and it would configure itself to be as efficient as possible answering queries.
Initially it might start storing data using a triple store format (since that’s one of the simplest forms of database and yet is capable of some of the richest expression of facts), it might use some graph database techniques to improve performance on the triple store. As triples accumulate regarding a specific subject (or object) it would automatically cluster them and use MongoDB-like documents to store the data in a way that can be retrieved efficiently (like I do in my MongoDB triple store). As more data piles up and the important columns become apparent it would switch some of the data to a relational database format and add any necessary indexes. All this would happen totally automatically.
It would of course also include map-reduce, using whatever language you want to use, and it would shard and replicate itself across servers automatically to maximize performance or reliability or whatever you want it to maximize. Advanced semantic reasoning capabilities would also be included so you can ask complex SPARQL queries expressing logic that you simply cannot do with relational databases today.
At this point the DBAs in the audience are freaking out, partly because they don’t have a job any more, but also because they want to ask all the usual questions: “what about constraints?”, “what about foreign keys?”, and of course “what about stored procedures?” because they can’t live without that Cobol-esque T-SQL syntax. Well, guess what DBAs, OWL is a much more powerful way of expressing all those constraints. Using OWL and other semantic web technologies you can build complex ontologies that define how the data is represented and the constraints on it. You can even create an Ontology for your rules, and an Ontology for your Ontology for your rules, … and so on.
I think this is where databases will end up, and I hope that CJ Date (one of the fathers of modern databases) will still be around to see it!
A Semantic Web / Ontology Driven Approach to CRM
Mar 22nd
Recently I’ve been spending some time working with Microsoft Dynamics CRM 2011. It’s certainly a powerful system with lots of flexibility for customization, but the more time I spend with it staring at endless database tables joined together in myriad ways, and the more time I spend discussing the meaning of different Entities and how they will be used within an organizational context … the more convinced I become that there has to be a better way!
The Semantic Web could well provide a better approach than this. For starters the complex data model would be gone – we’d use a triple store and have simple triples for all the data in the CRM. Normally the joins entailed by such an approach would make it prohibitively slow, but having seen all the tables and joins involved in MS-CRM I’m less concerned about that issue! There are also increasingly good triple-stores that can provide very efficient querying capabilities.
A completely open schema would allow users to add data that traditionally gets lumped into the Notes field in a CRM. For example “John likes baseball”.
A semantic approach would allow the organization to create an ontology to describe what they mean by ‘customer’ as opposed to what the CRM vendor meant by ‘customer’. Better yet they would be able to annotate their ontology with those descriptions and additional rules (e.g. constraints expressed in OWL) using the same approach.
A semantic approach would allow the organization to organize entities by much more than a simple object hierarchy. In MS-CRM for example when working with early-bound entities there is no relationship between a Customer and a Lead other than the fact that they are both Entities. Ideally you want to know that each is also a Person and as such will have a First Name and a Last Name etc. In the Semantic Web you aren’t limited to a single-hierarchy like you are in object-oriented languages, instead, you can create multiple ‘is a’ relationships. So, if you want to define that, say, baseball is a sport and soccer is a sport you can add that fact to your Ontology. With that fact in there you can now deduce the fact that “John likes sport” using a semantic reasoner.
A semantic approach would allow for much more powerful querying capabilities. For example, a query like “find all customers who like sports but haven’t been to an event in the last six months and send them an invite to the game on Saturday” could be mostly translated into a SPARQL query.
There’s the odd academic reference to using Ontologies with CRM, e.g. this one but no commercial products that I know of that have taken this approach.
Are there any? and what do you think of the idea? Comments as always greatly appreciated.
Extending C# to understand the language of the semantic web
Feb 5th
I was inspired by a question on semanticoverflow.com which asked if there was a language in which the concepts of the Semantic Web could be expressed directly, i.e. you could write statements and perform reasoning directly in the code without lots of parentheses, strings and function calls.
Of course the big issue with putting the semantic web into .NET is the lack of multiple inheritance. In the semantic web the class ‘lion’ can inherit from the ‘big cat’ class and also from the ‘carnivorous animals’ class and also from the ‘furry creatures’ class etc. In C# you have to pick one and implement the rest as interfaces. But, since C# 4.0 we have the dynamic type. Could that be used to simulate multiple inheritance and to build objects that behave like their semantic web counterparts?
The DynamicObject in C# allows us to perform late binding and essentially to add methods and properties at runtime. Could I use that so you can write a statement like “canine.subClassOf.mammal();” which would be a complete Semantic Web statement like you might find in a normal triple store but written in C# without any ‘mess’ around it. Could I use that same syntax to query the triple store to ask questions like “if (lion.subClassOf.animal) …” where a statement without a method invocation would be a query against the triple store using a reasoner capable of at least simple transitive closure? Could I also create a syntax for properties so you could say “lion.Color(“yellow”)” to set a property called Color on a lion?
Well, after one evening of experimenting I have found a way to do just that. Without any other declarations you can write code like this:
dynamic g = new Graph("graph");
// this line declares both a mammal and an animal
g.mammal.subClassOf.animal();
// we can add properties to a class
g.mammal.Label("Mammal");
// add a subclass below that
g.carnivore.subClassOf.mammal();
// create the cat family
g.felidae.subClassOf.carnivore();
// define what the wild things are - a separate hierarchy of things
g.wild.subClassOf.domesticity();
// back to the cat family tree
g.pantherinae.subClassOf.felidae();
// these are all wild (multiple inheritance at work!)
g.pantherinae.subClassOf.wild();
g.lion.subClassOf.pantherinae();
// experiment with properties
// these are stored directly on the object not in the triple store
g.lion.Color("Yellow");
// complete the family tree for this branch of the cat family
g.tiger.subClassOf.pantherinae();
g.jaguar.subClassOf.pantherinae();
g.leopard.subClassOf.pantherinae();
g.snowLeopard.subClassOf.leopard();
Behind the scenes dynamic objects are used to construct partial statements and then full statements, and those full statements are added to the graph. Note that I’m not using full Uris here because they wouldn’t work syntactically, but there’s no reason each entity couldn’t be given a Uri property behind the scenes that is local to the graph that’s being used to contain it.
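To make the "partial statement, then full statement" idea concrete, here is a minimal sketch of how such a fluent syntax could be built on `DynamicObject`. It is not the code from the experiment: `MiniGraph`, `SubjectPart` and `PartialStatement` are hypothetical names, and the query only checks for a directly stated triple, with no inference.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical miniature of the idea; all names are illustrative only.
public class MiniGraph : DynamicObject
{
    public readonly List<Tuple<string, string, string>> Statements =
        new List<Tuple<string, string, string>>();

    // g.mammal -> a subject waiting for a predicate
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = new SubjectPart(this, binder.Name);
        return true;
    }
}

public class SubjectPart : DynamicObject
{
    private readonly MiniGraph graph;
    private readonly string subject;

    public SubjectPart(MiniGraph graph, string subject)
    {
        this.graph = graph;
        this.subject = subject;
    }

    // g.mammal.subClassOf -> a partial statement waiting for an object
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = new PartialStatement(graph, subject, binder.Name);
        return true;
    }
}

public class PartialStatement : DynamicObject
{
    private readonly MiniGraph graph;
    private readonly string subject;
    private readonly string predicate;

    public PartialStatement(MiniGraph graph, string subject, string predicate)
    {
        this.graph = graph;
        this.subject = subject;
        this.predicate = predicate;
    }

    // g.mammal.subClassOf.animal() -> assert the full statement
    public override bool TryInvokeMember(
        InvokeMemberBinder binder, object[] args, out object result)
    {
        graph.Statements.Add(Tuple.Create(subject, predicate, binder.Name));
        result = null;
        return true;
    }

    // g.mammal.subClassOf.animal (no call) -> was this stated directly?
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = graph.Statements.Contains(
            Tuple.Create(subject, predicate, binder.Name));
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        var graph = new MiniGraph();
        dynamic g = graph;
        g.mammal.subClassOf.animal();
        g.lion.subClassOf.mammal();
        bool direct = g.lion.subClassOf.mammal;   // stated directly
        bool indirect = g.lion.subClassOf.animal; // would need inference
        Console.WriteLine(direct + " " + indirect); // True False
    }
}
```

The distinction between "member access" and "member invocation" is what separates a query from an assertion in this sketch.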
Querying works as expected: just write the semantic statement you want to test. One slight catch is that I’ve made the query return an enumeration of the proof steps used to prove it rather than just a simple bool value. So use `.Any()` on it to see if there is any proof.
// Note that we never said that cheeta is a mammal directly.
// We need to use inference to get the answer.
// The result is an enumeration of all the ways to prove that
// a cheeta is a mammal
var isCheetaAMammal = g.cheeta.subClassOf.mammal;
// we use .Any() just to see if there's a way to prove it
Console.WriteLine("Cheeta is a mammal : " + isCheetaAMammal.Any());
Behind the scenes the simple statement “g.cheeta.subClassOf.mammal” will take each statement made and expand the subject and object using a logical argument process known as simple entailment. The explanation it gives for this query might be:
because [cheeta.subClassOf.felinae], [felinae.subClassOf.felidae], [felidae.subClassOf.mammal]
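A proof like this can be found with a simple search along subClassOf links. The following is a minimal sketch under stated assumptions: the `SubClassReasoner` class is hypothetical, the graph is a plain dictionary of direct parents, and it returns only the first chain it finds rather than an enumeration of every proof as the experiment does.

```csharp
using System;
using System.Collections.Generic;

public static class SubClassReasoner
{
    // Returns one chain of direct statements linking 'from' to 'to',
    // or null when no chain exists. Hypothetical helper.
    public static List<string> Prove(
        IDictionary<string, List<string>> parents, string from, string to)
    {
        return Search(parents, from, to, new HashSet<string>());
    }

    private static List<string> Search(
        IDictionary<string, List<string>> parents, string from, string to,
        HashSet<string> visited)
    {
        if (!visited.Add(from)) return null; // already explored: avoids cycles
        List<string> direct;
        if (!parents.TryGetValue(from, out direct)) return null;
        foreach (var parent in direct)
        {
            var step = "[" + from + ".subClassOf." + parent + "]";
            if (parent == to) return new List<string> { step };
            var rest = Search(parents, parent, to, visited);
            if (rest != null)
            {
                rest.Insert(0, step); // prepend this step to the chain found below
                return rest;
            }
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative data only; each entry is a directly stated subClassOf link.
        var parents = new Dictionary<string, List<string>>
        {
            { "cheeta", new List<string> { "felinae" } },
            { "felinae", new List<string> { "felidae" } },
            { "felidae", new List<string> { "carnivore" } },
            { "carnivore", new List<string> { "mammal" } }
        };
        var proof = SubClassReasoner.Prove(parents, "cheeta", "mammal");
        Console.WriteLine(proof == null ? "no proof" : "because " + string.Join(", ", proof));
    }
}
```

With the data above every step in the printed chain is a directly stated link, so the chain passes through carnivore on its way to mammal.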
As you can see, integrating Semantic Web concepts [almost] directly into the programming language is a pretty powerful idea. We are still nowhere close to the syntactic power of Prolog or F# but I was surprised how far vanilla C# could get with dynamic types and a fluent builder. I hope to explore this further and to publish the code sometime. It may well be “the world’s smallest triple store and reasoner”!
This code will hopefully also allow folks wanting to experiment with core semantic web concepts to do so without the ‘overhead’ of a full-blown triple store, reasoner and lots of RDF and angle brackets! When I first came to the Semantic Web I was amazed how much emphasis there was on serialization formats (which are boring to most software folks) and how little there was on language features and algorithms for manipulating graphs (the interesting stuff). With this experiment I hope to create code that focuses on the interesting bits.
The same concept could be applied to other in-memory graphs allowing a fluent, dynamic way to represent graph structures in code. There’s also no reason it has to be limited to in-memory graphs, the code could equally well store all statements in some external triple store.
The code for this experiment is available on bitbucket: https://bitbucket.org/ianmercer/semantic-fluent-dynamic-csharp
Web site crawler and link checker (free)
Jan 13th
In a previous post I provided a utility called LinkChecker that is a web site crawler and link checker. The idea behind LinkChecker is that you can include it in your continuous integration scripts and thus check your web site either regularly or after every deployment. Unlike a simple ping check, this one will fail if you’ve broken any links within your site or have SEO issues. It will also fail just once for each site change and then pass again the next time you run it. This feature means that in a continuous integration system like TeamCity you can get an email or other alert each time your site (or perhaps your competitor’s site) changes.
As promised in that post, a new version is now available. There are many improvements under the covers but one obvious new feature is the ability to dump all the text content of a site into a text file. Simply append -dump filename.txt to the command line and you’ll get a complete text dump of any site. The dump includes page titles and all visible text on the page (it excludes embedded script and CSS automatically). It also excludes any element with an ID or CLASS that includes one of the words “footer”, “header”, “sidebar”, “feedback” so you don’t get lots of duplicate header and footer information in the dump. I plan to make this more extensible in future to allow other words to be added to the ignore list.
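The exclusion rule can be pictured as a simple predicate over an element's id and class attributes. This is only a sketch of the rule as described, not LinkChecker's actual code: `DumpFilter` is a hypothetical name and the case-insensitive matching is an assumption.

```csharp
using System;
using System.Linq;

public static class DumpFilter
{
    // The four words named in the post; a configurable list would replace this.
    private static readonly string[] IgnoredWords =
        { "footer", "header", "sidebar", "feedback" };

    // True when either attribute mentions one of the ignored words.
    public static bool ShouldSkip(string id, string cssClass)
    {
        return Mentions(id) || Mentions(cssClass);
    }

    private static bool Mentions(string attribute)
    {
        if (string.IsNullOrEmpty(attribute)) return false;
        // Case-insensitive substring match (an assumption of this sketch).
        return IgnoredWords.Any(word =>
            attribute.IndexOf(word, StringComparison.OrdinalIgnoreCase) >= 0);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(DumpFilter.ShouldSkip("page-footer", null));  // True
        Console.WriteLine(DumpFilter.ShouldSkip("content", "article")); // False
    }
}
```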
One technique you can use with this new ‘dump’ option is to dump a copy of your site after each deployment and then check it into source control. Now if there’s ever any need to go back to see when a particular word or paragraph was changed on your site you have a complete record. You could for example use this to maintain a text copy of your WordPress blog, or perhaps to keep an eye on someone else’s blog or Facebook page to see when they added or removed a particular story.
Download the new version here: LinkCheck (requires Windows XP or later with .NET 4 installed; unzip and run).
Please consult the original article for more information.
LinkCheck is free; it doesn’t make any callbacks and doesn’t use any personal data. Use at your own risk. If you like it, please link to this blog from your own blog or post a link on Twitter, thanks!
File and image upload security considerations and best practices
Jan 7th
Many web sites offer the ability to upload files. Whether it’s a simple JPG for an avatar, or a larger image for a photo gallery or perhaps an arbitrary file for a file cabinet type application there are several security considerations you need to take into account and some best practices for dealing with them. Here’s a partial list of some of the steps you should take if you are implementing this capability on your own site.
- Don’t put the files within your normal web site directory structure.
- Remove all path information from the uploaded file. In particular, never let the user specify which directory on the server the file is going to go in by, for example, allowing relative paths to be specified.
- Don’t use the original file name the user gave you, store it in your database if you need it, but use a generated file name for the file on disk, e.g. use a Guid as the file name. (You can add a content disposition header with the original file name for downloads but the path and file name on the server shouldn’t be something the user can influence).
- Don’t trust image files – resize them and offer only the resized version for subsequent download. Some images can contain corrupted metadata that has in the past been able to attack vulnerabilities in some image manipulation software. Creating a clean, well-formed image file defends against this.
- Don’t trust mime types or file extensions, examine the file headers or open the file and manipulate it to make sure it’s what it claims to be.
- Limit the upload size and time.
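A few of the steps above can be sketched in code. This is a minimal illustration rather than a complete upload handler: `UploadSketch` and its method names are hypothetical, and a real implementation would also resize images and enforce size and time limits.

```csharp
using System;

public static class UploadSketch
{
    // Strip any client-supplied directory parts. The result is only for
    // display or for storing in the database, never for the path on disk.
    public static string DisplayName(string clientFileName)
    {
        // Handle both separator styles regardless of the host OS.
        int lastSeparator = clientFileName.LastIndexOfAny(new[] { '/', '\\' });
        return clientFileName.Substring(lastSeparator + 1);
    }

    // The on-disk name is generated, so the user cannot influence it.
    public static string StorageName()
    {
        return Guid.NewGuid().ToString("N");
    }

    // Check the file header instead of trusting the extension or MIME type.
    // JPEG data starts with the bytes FF D8 FF.
    public static bool LooksLikeJpeg(byte[] header)
    {
        return header != null && header.Length >= 3
            && header[0] == 0xFF && header[1] == 0xD8 && header[2] == 0xFF;
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(UploadSketch.DisplayName("..\\..\\windows\\evil.jpg")); // evil.jpg
        Console.WriteLine(UploadSketch.StorageName().Length);                      // 32
        Console.WriteLine(UploadSketch.LooksLikeJpeg(new byte[] { 0xFF, 0xD8, 0xFF, 0xE0 })); // True
    }
}
```

A header check like this is only a first filter; re-encoding the image, as the list recommends, is what actually produces a clean file.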
Algorithm Complexity and the ‘Which one will be faster?’ question
Jan 7th
Over and over on Stackoverflow.com people ask a question of the form ‘If I have an algorithm of complexity A, e.g. O(n log n), and an algorithm of complexity B, which one will be faster?’ Developers obsess over using the cleverest algorithm available with the lowest theoretical bound on complexity, thinking that they will make their code run faster by doing so. To many developers the simple array scan is anathema; they think they should always use a SortedCollection with binary search, or some heap or tree that can lower the complexity from O(n²) to O(n log n).
To all these developers I say “You cannot use the complexity of an algorithm to tell you which algorithm is fastest on typical data sets on modern CPUs”.
I make that statement from bitter experience having found and implemented a search that could search vast quantities of data in a time proportional to the size of the key being used to search and not to the size of the data being searched. This clever suffix-tree approach to searching was beaten by a simple dumb array scan! How did that happen?
First you need to understand that there are constants involved. An algorithm might be O(n) but the actual execution time for each step might be A microseconds. Another algorithm of O(n log n) might take B microseconds to execute each step. If A >> B then for many values of n the second algorithm will be faster, despite its worse complexity. Each algorithm may also involve a certain setup time, and that too can dwarf execution time when n is small.
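As a rough worked example of how much the constants matter, suppose (purely for illustration) that a linear algorithm costs $c_{\mathrm{lin}}$ per step and an $n \log n$ algorithm costs $c_{\log}$ per step, ignoring setup time. The algorithm with the worse complexity is still the faster one whenever:

```latex
\[
c_{\log}\, n \log_2 n \;<\; c_{\mathrm{lin}}\, n
\quad\Longleftrightarrow\quad
\log_2 n \;<\; \frac{c_{\mathrm{lin}}}{c_{\log}}
\quad\Longleftrightarrow\quad
n \;<\; 2^{\,c_{\mathrm{lin}}/c_{\log}}
\]
```

With an assumed ratio of $c_{\mathrm{lin}}/c_{\log} = 20$, the crossover is at $n = 2^{20} = 1{,}048{,}576$, so the "worse" algorithm wins for every input of up to about a million items. The ratio is a made-up figure; real per-step costs have to be measured.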
But here’s the catch: even if you think the number of add and multiply instructions between the two algorithms is similar, the CPU you are executing them on may have a very different point of view because modern x86 CPUs have in effect been optimized for dumb programmers. They are incredibly fast at processing sequential memory locations in tight loops that can fit in on-chip cache. Give them a much more complex tree-based O(n log n) algorithm and they now have to go off-chip and access scattered memory locations. The results are sometimes quite surprising and can quickly push that O(n log n) algorithm out of contention for values of n less than several million.
For most practical algorithms running on typical datasets the only way to be sure that algorithm A is faster than algorithm B is to profile it.
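A quick way to start measuring is a small timing harness like the one below. It is a rough sketch, not a rigorous benchmark: `ScanVersusSearch` is a hypothetical name, the sizes and repetition count are arbitrary, and a serious comparison would add warm-up runs and use your real data and access patterns.

```csharp
using System;
using System.Diagnostics;

public static class ScanVersusSearch
{
    // The "dumb" approach: walk the array until the key is found.
    public static int LinearIndexOf(int[] sorted, int key)
    {
        for (int i = 0; i < sorted.Length; i++)
            if (sorted[i] == key) return i;
        return -1;
    }

    // Time many repetitions of a lookup and return the elapsed Stopwatch ticks.
    public static long TimeTicks(Func<int> lookup, int repetitions)
    {
        int sink = 0; // accumulate results so the calls are not optimized away
        var watch = Stopwatch.StartNew();
        for (int r = 0; r < repetitions; r++) sink += lookup();
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.ElapsedTicks;
    }

    public static void Main()
    {
        foreach (var n in new[] { 8, 64, 512, 4096 })
        {
            var data = new int[n];
            for (int i = 0; i < n; i++) data[i] = i * 2; // sorted input
            int key = data[n / 2];
            long scan = TimeTicks(() => LinearIndexOf(data, key), 100000);
            long binary = TimeTicks(() => Array.BinarySearch(data, key), 100000);
            Console.WriteLine("n={0}: scan={1} ticks, binary search={2} ticks",
                n, scan, binary);
        }
    }
}
```

The numbers will differ from machine to machine, which is the point: the crossover size is something to measure, not to derive from the big-O notation.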
The other huge catch is that even when you have profiled it and found that your complex algorithm is faster, that’s not the end of the story. Modern CPUs are now so fast that memory bottlenecks are often the real problem. If your clever algorithm uses more memory bandwidth than the dumb algorithm it may well affect other threads running on the same CPU. If it consumes so much memory that paging comes into play then any advantage it had in CPU cycles has evaporated entirely.
An example I saw once involved someone trying to save some CPU cycles by storing the result in a private member variable. Adding that member variable made the class larger, and as a result fewer copies of it (and all other classes) could fit in memory, so the application overall ran slower. Sure, the developer could claim that his method was now 50% faster than before, but the net effect was a deterioration in the overall system performance.
A Semantic Web ontology / triple Store built on MongoDB
Jan 5th
In a previous blog post I discussed building a Semantic Triple Store using SQL Server. That approach works fine but I’m struck by how many joins are needed to get any results from the data, and as I look to storing much larger ontologies containing billions of triples there are many potential scalability issues with this approach. So over the past few evenings I decided to try a different approach and created a semantic store based on MongoDB.
In the MongoDB version of my semantic store I take a different approach to storing the basic building blocks of semantic knowledge representation. For starters I decided that typical ABox and TBox knowledge have really quite different storage requirements, and that smashing all the complex TBox assertions into simple triples and stringing them together with meta fields, only to immediately join them back up whenever needed, just seemed like a bad idea from the NOSQL / document-database perspective.
TBox/ABox: In the ABox you typically find simple triples of the form X-predicate-Y. These store simple assertions about individuals and classes. In the TBox you typically find complex sequents, that’s to say complex logic statements having a head (or consequent) and a body (or antecedents). The head is ‘entailed’ by the body, which means that if you can satisfy all of the body statements then the head is true. In a traditional store all the ABox assertions can be represented as triples and all the complex TBox assertions use quads with a meta field that is used solely to rebuild the sequent with a head and a body. The ABox/TBox distinction is however arbitrary (see http://www.semanticoverflow.com/questions/1107/why-is-it-necessary-to-split-reasoning-into-t-box-and-a-box).
I also decided that I wanted to use ObjectIds as the primary way of referring to any Entity in the store. Using the full Uri for every Entity is of course possible and MongoDB could have used that as the index, but I wanted to make this efficient and easily shardable across multiple MongoDB servers. The MongoDB ObjectId is ideal for that purpose and will make queries and indexing more efficient.
The first step then was to create a collection that would hold Entities and would permit the mapping from Uri to ObjectId. That was easy: an Entity type inheriting from a Resource type produces a simple document like the one shown below. An index on Uri with a unique condition ensures that it’s easy to look up any Entity by Uri and that there can only ever be one mapping to an Id for any Uri.
RESOURCES COLLECTION - SAMPLE DOCUMENT
{
"_id": "4d243af69b1f26166cb7606b",
"_t": "Entity",
"Uri": "http://www.w3.org/1999/02/22-rdf-syntax-ns#first"
}
Although I should use a proper Uri for every Entity I also decided to allow arbitrary strings to be used here so if you are building a simple ontology that never needs to go beyond the bounds of this one system you can forgo namespaces and http:// prefixes and just put a string there, e.g. “SELLS”. Since every Entity reference is immediately mapped to an Id and that Id is used throughout the rest of the system it really doesn’t matter much.
The next step was to represent simple ABox assertions. Rather than storing each assertion as its own document I created a document that could hold several assertions all related to the same subject. Of course, if there are too many assertions you’ll still need to split them up into separate documents but that’s easy to do. This move was mainly a convenience for developing the system as it makes it easy to look at all the assertions made concerning a single Entity using MongoVue or the Mongo command line interface but I’m hoping it will also help performance as typical access patterns need to bring in all of the statements concerning a given Entity.
Where a statement requires a literal the literal is stored directly in the document and since literals don’t have Uris there is no entry in the resources collection.
To make searches for statements easy and fast I added an array field “SPO” which stores the set of all Ids mentioned anywhere in any of the statements in the document. This array is indexed in MongoDB using the array indexing feature which makes it very efficient to find and fetch every document that mentions a particular Entity. If the Entity only ever appears in the subject position in statements that search will result in possibly just one document coming back which contains all of the assertions about that Entity. For example:
STATEMENTGROUPS COLLECTION - SAMPLE DOCUMENT
{
"_id": "4d243af99b1f26166cb760c6",
"SPO": [
"4d243af69b1f26166cb7606f",
"4d243af69b1f26166cb76079",
"4d243af69b1f26166cb7607c"
],
"Statements": [
{
"_id": "4d243af99b1f26166cb760c5",
"Subject": {
"_t": "Entity",
"_id": "4d243af69b1f26166cb7606f",
"Uri": "GROCERYSTORE"
},
"Predicate": {
"_t": "Entity",
"_id": "4d243af69b1f26166cb7607c",
"Uri": "SELLS"
},
"Object": {
"_t": "Entity",
"_id": "4d243af69b1f26166cb76079",
"Uri": "DAIRY"
}
}
... more statements here ...
]
}
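The effect of indexing the “SPO” array can be illustrated in memory. The sketch below is an assumption-laden stand-in, not the store's code: plain strings replace ObjectIds, and a LINQ lookup plays the role that the MongoDB array index plays in the real collection. `Statement`, `StatementGroup` and `SpoIndex` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Statement
{
    public readonly string Subject, Predicate, Object;

    public Statement(string subject, string predicate, string obj)
    {
        Subject = subject;
        Predicate = predicate;
        Object = obj;
    }
}

public class StatementGroup
{
    public readonly List<Statement> Statements = new List<Statement>();

    // Every id mentioned in any position of any statement in this group.
    public HashSet<string> SPO
    {
        get
        {
            return new HashSet<string>(Statements.SelectMany(
                s => new[] { s.Subject, s.Predicate, s.Object }));
        }
    }
}

public static class SpoIndex
{
    // In-memory stand-in for the array index: id -> groups that mention it.
    public static ILookup<string, StatementGroup> Build(
        IEnumerable<StatementGroup> groups)
    {
        return groups
            .SelectMany(g => g.SPO.Select(id => new { id, g }))
            .ToLookup(x => x.id, x => x.g);
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new StatementGroup();
        store.Statements.Add(new Statement("GROCERYSTORE", "SELLS", "DAIRY"));
        store.Statements.Add(new Statement("GROCERYSTORE", "SELLS", "BREAD"));

        var dairy = new StatementGroup();
        dairy.Statements.Add(new Statement("DAIRY", "SUBCLASSOF", "FOOD"));

        var index = SpoIndex.Build(new[] { store, dairy });
        Console.WriteLine(index["DAIRY"].Count());        // 2
        Console.WriteLine(index["GROCERYSTORE"].Count()); // 1
    }
}
```

GROCERYSTORE only ever appears as a subject, so a single group comes back for it, which matches the access pattern described above.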
The third and final collection I created is used to store TBox sequents consisting of a head (consequent) and a body (antecedents). Once again I added an array which indexes all of the Entities mentioned anywhere in any of the statements used in the sequent. Below that I have an array of Antecedent statements and then a single Consequent statement. Although the statements don’t really need the full serialized version of an Entity (all they need is the _id) I include the Uri and type for each Entity for now. Variables also have Id values but, unlike Entities, variables are not stored in the Resources collection; they exist only in the Rule collection as part of sequents. Variables have no meaning outside a sequent unless they are bound to some other value.
RULE COLLECTION - SAMPLE DOCUMENT
{
"_id": "4d243af99b1f26166cb76102",
"References": [
"4d243af69b1f26166cb7607d",
"4d243af99b1f26166cb760f8",
"4d243af99b1f26166cb760fa",
"4d243af99b1f26166cb760fc",
"4d243af99b1f26166cb760fe"
],
"Antecedents": [
{
"_id": "4d243af99b1f26166cb760ff",
"Subject": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760f8",
"Uri": "V3-Subclass8"
},
"Predicate": {
"_t": "Entity",
"_id": "4d243af69b1f26166cb7607d",
"Uri": "rdfs:subClassOf"
},
"Object": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760fa",
"Uri": "V3-Class9"
}
},
{
"_id": "4d243af99b1f26166cb76100",
"Subject": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760fa",
"Uri": "V3-Class9"
},
"Predicate": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760fc",
"Uri": "V3-Predicate10"
},
"Object": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760fe",
"Uri": "V3-Something11"
}
}
],
"Consequent": {
"_id": "4d243af99b1f26166cb76101",
"Subject": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760f8",
"Uri": "V3-Subclass8"
},
"Predicate": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760fc",
"Uri": "V3-Predicate10"
},
"Object": {
"_t": "Variable",
"_id": "4d243af99b1f26166cb760fe",
"Uri": "V3-Something11"
}
}
}
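The sample rule says, roughly, "if X is a subclass of C and C has some property, then X has it too". The sketch below shows one way such a sequent could be applied to a set of facts by binding variables. It is illustrative only and not the reasoner used with the store: `RuleSketch` is a hypothetical name, terms starting with "?" stand in for the Variable documents, and plain strings stand in for ObjectIds.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RuleSketch
{
    private static bool IsVariable(string term)
    {
        return term.Length > 0 && term[0] == '?';
    }

    // Try to extend 'bindings' so that 'pattern' matches 'fact'; null on failure.
    private static Dictionary<string, string> Match(
        string[] pattern, string[] fact, Dictionary<string, string> bindings)
    {
        var result = new Dictionary<string, string>(bindings);
        for (int i = 0; i < 3; i++)
        {
            string term = pattern[i], bound;
            if (!IsVariable(term))
            {
                if (term != fact[i]) return null;      // constant mismatch
            }
            else if (result.TryGetValue(term, out bound))
            {
                if (bound != fact[i]) return null;     // conflicts with earlier binding
            }
            else
            {
                result[term] = fact[i];                // new binding
            }
        }
        return result;
    }

    // Satisfy every antecedent in turn, then instantiate the consequent.
    // Assumes every consequent variable also appears in an antecedent.
    public static List<string[]> Apply(
        string[][] antecedents, string[] consequent, List<string[]> facts)
    {
        var solutions = new List<Dictionary<string, string>>
            { new Dictionary<string, string>() };
        foreach (var pattern in antecedents)
        {
            var next = new List<Dictionary<string, string>>();
            foreach (var bindings in solutions)
                foreach (var fact in facts)
                {
                    var extended = Match(pattern, fact, bindings);
                    if (extended != null) next.Add(extended);
                }
            solutions = next;
        }
        return solutions
            .Select(b => consequent.Select(t => IsVariable(t) ? b[t] : t).ToArray())
            .ToList();
    }

    public static void Main()
    {
        var facts = new List<string[]>
        {
            new[] { "lion", "rdfs:subClassOf", "cat" },
            new[] { "cat", "hasCovering", "fur" }
        };
        var antecedents = new[]
        {
            new[] { "?sub", "rdfs:subClassOf", "?cls" },
            new[] { "?cls", "?p", "?o" }
        };
        var consequent = new[] { "?sub", "?p", "?o" };

        foreach (var derived in Apply(antecedents, consequent, facts))
            Console.WriteLine(string.Join(" ", derived)); // lion hasCovering fur
    }
}
```

Run to a fixed point (re-applying the rule to its own output), the same sequent also yields the transitivity of subClassOf, since the predicate variable can itself bind to rdfs:subClassOf.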
That is essentially the whole semantic store. I connected it up to a reasoner and have successfully run a few test cases against it. Next time I get a chance to experiment with this technology I plan to try loading a larger ontology and will rework the reasoner so that it can work directly against the database instead of taking in-memory copies of most queries that it performs.
At this point this is JUST AN EXPERIMENT but hopefully someone will find this blog entry useful. I hope later to connect this up to the home automation system so that it can begin reasoning across an ontology of the house and a set of ABox assertions about its current and past state.
Since I’m still relatively new to the semantic web I’d welcome feedback on this approach to storing ontologies in NOSQL databases from any experienced semanticists.
Silencing the annoying SpotBot beep – a simple hack
Dec 28th
If you have pets or children a SpotBot is an essential accessory. It’s also handy for parties where a spilled wine glass can be cleaned up in minutes hands-free while you continue talking to your guests. The SpotBot is small and can be brought in, dropped on the spot and a few minutes later it’s all gone. No paper towels, no apologetic guests on hands and knees. Sure, it’s noisy but it’s a lot less intrusive than a full-size steam cleaning vacuum cleaner.
The only downside to this indispensable appliance is the beep noise which begins as soon as it’s finished cleaning and goes on forever until you stop it. Not only is it persistent, it’s also very very loud. Searching the internet for a solution produced an article which didn’t even identify where the beep was coming from and had a solution that involved programming a micro-controller and soldering it in to the upper circuit board! That’s clearly not an option for most people and I wanted a simpler solution so today I took it apart and figured out how to silence it.
Below are the steps I used to quieten our SpotBot. Make sure it’s unplugged first and remove all the liquid containers from it. Proceed at your own risk and although this process involves no electrical changes you should not attempt it unless you are comfortable opening and reassembling electronic devices.

The first step is to open the SpotBot. To do this you need to remove all of the screws on the top side of the unit, including the two that are hidden under the panel to the right shown here. There is no need to remove any of the screws on the underside of the unit. Once you have the unit separated into two parts you may also need to remove the zip tie around the wires in the middle to give you more room to work. Up inside the top half of the unit you’ll see a small box made out of flexible white plastic. Open that box and you’ll see a circuit board, and on the front-left corner of that circuit board you’ll see a small piezoelectric device that is responsible for emitting the beep.

You could at this stage undo more screws and get the board out so you can unsolder the beeper or add a resistor in line with it, but as a less drastic solution that can reduce the volume to a much more tolerable level you can try the simple fix of applying a large blob of glue over the piezoelectric device to quieten it down.
Once the glue has set, reassemble the unit and enjoy your new, slightly quieter SpotBot. As the glue cures the sound will get quieter.