Extending C# to understand the language of the semantic web
I was inspired by a question on semanticoverflow.com which asked if there was a language in which the concepts of the Semantic Web could be expressed directly, i.e. you could write statements and perform reasoning directly in the code without lots of parentheses, strings and function calls.
Of course the big issue with putting the semantic web into .NET is the lack of multiple inheritance. In the semantic web the class 'lion' can inherit from the 'big cat' class and also from the 'carnivorous animals' class and also from the 'furry creatures' class etc. In C# you have to pick one and implement the rest as interfaces. But, since C# 4.0 we have the dynamic type. Could that be used to simulate multiple inheritance and to build objects that behave like their semantic web counterparts?
The DynamicObject class in C# allows us to perform late binding and, essentially, to add methods and properties at runtime. Could I use that so you can write a statement like "canine.subClassOf.mammal();", which would be a complete Semantic Web statement like you might find in a normal triple store, but written in C# without any 'mess' around it? Could I use that same syntax to query the triple store to ask questions like "if (lion.subClassOf.animal) ..." where a statement without a method invocation would be a query against the triple store using a reasoner capable of at least simple transitive closure? Could I also create a syntax for properties so you could say lion.Color("yellow") to set a property called Color on a lion?
Well, after one evening of experimenting I have found a way to do just that. Without any other declarations you can write code like this:
dynamic g = new Graph("graph");
// this line declares both a mammal and an animal
g.mammal.subClassOf.animal();
// we can add properties to a class
g.mammal.Label("Mammal");
// add a subclass below that
g.carnivore.subClassOf.mammal();
// create the cat family
g.felidae.subClassOf.carnivore();
// define what the wild things are - a separate hierarchy of things
g.wild.subClassOf.domesticity();
// back to the cat family tree
g.pantherinae.subClassOf.felidae();
// these ones are all wild (multiple inheritance at work!)
g.pantherinae.subClassOf.wild();
g.lion.subClassOf.pantherinae();
// experiment with properties
// these are stored directly on the object not in the triple store
g.lion.Color("Yellow");
// complete the family tree for this branch of the cat family
g.tiger.subClassOf.pantherinae();
g.jaguar.subClassOf.pantherinae();
g.leopard.subClassOf.pantherinae();
g.snowLeopard.subClassOf.leopard();
Behind the scenes, dynamic objects are used to construct partial statements and then full statements, and those full statements are added to the graph. Note that I'm not using full URIs here because they wouldn't work syntactically, but there's no reason each entity couldn't be given a Uri property behind the scenes that is local to the graph that's being used to contain it.
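To make the "partial statement, then full statement" idea concrete, here is a minimal sketch of how it could be wired up with DynamicObject. It is not the code from this experiment: the names SketchGraph, Entity and PartialStatement are hypothetical, properties such as Color are left out, and a query here is only a direct lookup that returns a bool rather than an enumeration of proofs.

```csharp
using System;
using System.Collections.Generic;
using System.Dynamic;

// Hypothetical names throughout; a sketch, not the published implementation.
public sealed class SketchGraph : DynamicObject
{
    // Every asserted (subject, predicate, object) statement.
    public readonly HashSet<Tuple<string, string, string>> Triples =
        new HashSet<Tuple<string, string, string>>();

    // g.canine -> an Entity named "canine"
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = new Entity(this, binder.Name);
        return true;
    }
}

public sealed class Entity : DynamicObject
{
    private readonly SketchGraph _graph;
    private readonly string _name;

    public Entity(SketchGraph graph, string name) { _graph = graph; _name = name; }

    // canine.subClassOf -> a partial statement (subject + predicate)
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = new PartialStatement(_graph, _name, binder.Name);
        return true;
    }
}

public sealed class PartialStatement : DynamicObject
{
    private readonly SketchGraph _graph;
    private readonly string _subject;
    private readonly string _predicate;

    public PartialStatement(SketchGraph graph, string subject, string predicate)
    {
        _graph = graph; _subject = subject; _predicate = predicate;
    }

    // canine.subClassOf.mammal() -> method call: assert the full statement
    public override bool TryInvokeMember(InvokeMemberBinder binder, object[] args, out object result)
    {
        _graph.Triples.Add(Tuple.Create(_subject, _predicate, binder.Name));
        result = null;
        return true;
    }

    // canine.subClassOf.mammal -> no call: query (direct lookup only in this sketch)
    public override bool TryGetMember(GetMemberBinder binder, out object result)
    {
        result = _graph.Triples.Contains(Tuple.Create(_subject, _predicate, binder.Name));
        return true;
    }
}

public static class Program
{
    public static void Main()
    {
        dynamic g = new SketchGraph();
        g.canine.subClassOf.mammal();                  // assert a statement
        bool known = g.canine.subClassOf.mammal;       // query it back
        bool unknown = g.canine.subClassOf.reptile;    // never asserted
        Console.WriteLine(known + " " + unknown);
    }
}
```

The trick is that the runtime binder distinguishes a member access from a member invocation, so the same dotted name can either assert or query depending on whether it ends in parentheses. One caveat of this approach is that names of real members (ToString, Equals, Triples and so on) bind to those members first and cannot be used as entity names.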
Querying works as expected: just write the semantic statement you want to test. One slight catch is that I've made the query return an enumeration of the proof steps used to prove it rather than just a simple bool value. So use `.Any()` on it to see if there is any proof.
// Note that we never said that cheetah is a mammal directly.
// We need to use inference to get the answer.
// The result is an enumeration of all the ways to prove that
// a cheetah is a mammal
var isCheetahAMammal = g.cheeta.subClassOf.mammal;
// we use .Any() just to see if there's a way to prove it
Console.WriteLine("Cheetah is a mammal : " + isCheetahAMammal.Any());
Behind the scenes, the simple statement "g.cheeta.subClassOf.mammal" will take each statement made and expand the subject and object using a logical argument process known as simple entailment. The explanation it gives for this query might be:
> because [cheeta.subClassOf.felinae], [felinae.subClassOf.felidae],
> [felidae.subClassOf.mammal]
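As a rough illustration of how such an explanation could be produced, the sketch below walks the subClassOf statements depth-first and lazily yields every chain that links the subject to the target. It assumes only transitivity of subClassOf and is not the reasoner from this experiment; ProofSearch, Prove and Walk are hypothetical names, and the three statements in Main are illustrative ones chosen to match the explanation quoted above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper; a sketch of transitive closure with proof steps.
public static class ProofSearch
{
    // Lazily yields each chain of direct statements from subject to target.
    // Written as an iterator so every enumeration starts with fresh state.
    public static IEnumerable<List<string>> Prove(
        ILookup<string, string> parents, string subject, string target)
    {
        var steps = new List<string>();
        var onPath = new HashSet<string>();
        foreach (var proof in Walk(parents, subject, target, steps, onPath))
            yield return proof;
    }

    private static IEnumerable<List<string>> Walk(
        ILookup<string, string> parents, string current, string target,
        List<string> steps, HashSet<string> onPath)
    {
        if (!onPath.Add(current)) yield break;       // already on this path: avoid cycles
        foreach (var parent in parents[current])     // a missing key gives an empty sequence
        {
            steps.Add("[" + current + ".subClassOf." + parent + "]");
            if (parent == target)
            {
                yield return new List<string>(steps); // copy: steps keeps changing
            }
            else
            {
                foreach (var proof in Walk(parents, parent, target, steps, onPath))
                    yield return proof;
            }
            steps.RemoveAt(steps.Count - 1);
        }
        onPath.Remove(current);
    }
}

public static class Program
{
    public static void Main()
    {
        // Illustrative (child, parent) statements only.
        var statements = new[]
        {
            Tuple.Create("cheeta", "felinae"),
            Tuple.Create("felinae", "felidae"),
            Tuple.Create("felidae", "mammal"),
        };
        var parents = statements.ToLookup(s => s.Item1, s => s.Item2);

        var proofs = ProofSearch.Prove(parents, "cheeta", "mammal");
        Console.WriteLine(proofs.Any());              // stops at the first proof found
        foreach (var proof in proofs)
            Console.WriteLine("because " + string.Join(", ", proof));
    }
}
```

Because the search is lazy, calling `.Any()` stops as soon as one proof is found, while enumerating the whole result lists every alternative chain, which is where multiple inheritance shows up as more than one proof.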
As you can see, integrating Semantic Web concepts [almost] directly into the programming language is a pretty powerful idea. We are still nowhere close to the syntactic power of Prolog or F#, but I was surprised how far vanilla C# could get with dynamic types and a fluent builder. I hope to explore this further and to publish the code sometime. It may well be "the world's smallest triple store and reasoner"!
This code will hopefully also allow folks wanting to experiment with core semantic web concepts to do so without the 'overhead' of a full-blown triple store, reasoner and lots of RDF and angle brackets! When I first came to the Semantic Web I was amazed how much emphasis there was on serialization formats (which are boring to most software folks) and how little there was on language features and algorithms for manipulating graphs (the interesting stuff). With this experiment I hope to create code that focuses on the interesting bits.
The same concept could be applied to other in-memory graphs, allowing a fluent, dynamic way to represent graph structures in code. There's also no reason it has to be limited to in-memory graphs; the code could equally well store all statements in some external triple store.
The code for this experiment is available on bitbucket: https://bitbucket.org/ianmercer/semantic-fluent-dynamic-csharp