The Blog of Ian Mercer.

MongoDB - Map-Reduce coming from C#

People coming from traditional relational database thinking and LINQ sometimes struggle to understand map-reduce. One way to understand it is to realize that it's actually the simple composition of some LINQ operators with which you may already be familiar.

Map-reduce is in effect a **SelectMany()** followed by a **GroupBy()** followed by an **Aggregate()** operation.

In a SelectMany() you are projecting a sequence, but each element can become multiple elements. This is equivalent to using multiple emit statements in your map operation. The map operation can also choose not to call emit at all, which is like having a Where() clause inside your SelectMany() operation.
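Here is a minimal sketch of that map step in plain JavaScript, the language MongoDB map functions are written in. The sample documents, the `tags` field, and the local `emit` stand-in are assumptions made for illustration; in MongoDB itself `emit` is provided by the server.

```javascript
// Hypothetical sample documents; the field names are illustrative only.
const docs = [
  { title: "A", tags: ["mongodb", "linq"] },
  { title: "B", tags: [] }, // emits nothing, like a Where() filtering it out
  { title: "C", tags: ["mongodb"] },
];

// Local stand-in for MongoDB's emit(): collects the pairs the map step produces.
const emitted = [];
function emit(key, value) {
  emitted.push({ key, value });
}

// Written the way a MongoDB map function is: the current document is `this`.
function map() {
  for (const tag of this.tags) {
    emit(tag, 1); // one emit per tag, so one document can yield many pairs
  }
}

for (const doc of docs) {
  map.call(doc);
}

console.log(emitted);
// three pairs: mongodb, linq, mongodb, each with value 1
```

In LINQ terms this is roughly `docs.SelectMany(d => d.Tags.Select(t => new { Key = t, Value = 1 }))`.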

In a GroupBy() you are collecting elements with the same key, which is what map-reduce does with the key that you emit from the map operation.
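MongoDB does this grouping for you between the map and reduce steps, so you never write it yourself. The sketch below only shows the idea, using hypothetical emitted pairs and a plain `Map` as the grouping structure.

```javascript
// Hypothetical pairs, as a map step might have emitted them.
const pairs = [
  { key: "mongodb", value: 1 },
  { key: "linq", value: 1 },
  { key: "mongodb", value: 1 },
];

// Collect the values for each key, roughly GroupBy(p => p.Key, p => p.Value).
const groups = new Map();
for (const { key, value } of pairs) {
  if (!groups.has(key)) {
    groups.set(key, []);
  }
  groups.get(key).push(value);
}

console.log(groups);
// "mongodb" maps to [1, 1] and "linq" maps to [1]
```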

In the Aggregate() or reduce step you are taking the collection of values associated with each group key and combining them in some way to produce one result for each key. Often this combination is simply summing a '1' that the map step emitted with each key, which gives you a count, but sometimes it's more complicated.
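The counting case might look like the sketch below. The `reduce(key, values)` signature matches what MongoDB expects of a reduce function; the grouped input and the loop that drives it are assumptions standing in for what the server does.

```javascript
// Hypothetical grouped output of the map step: key -> emitted values.
const grouped = new Map([
  ["mongodb", [1, 1]],
  ["linq", [1]],
]);

// Same shape as a MongoDB reduce function: a key and an array of values.
function reduce(key, values) {
  // Like values.Aggregate(0, (sum, v) => sum + v) in LINQ.
  return values.reduce((sum, v) => sum + v, 0);
}

// One result per key.
const counts = {};
for (const [key, values] of grouped) {
  counts[key] = reduce(key, values);
}

console.log(counts); // { mongodb: 2, linq: 1 }
```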

One thing you should be aware of with map-reduce in MongoDB is that the reduce operation must accept and output the same data type, because it may be applied repeatedly to partial sets of the grouped data, with earlier outputs fed back in as inputs. The C# analogue would be an Aggregate() operation applied repeatedly to partial sequences, with each partial result combined with the remaining elements, until a single final value remains.
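To make that constraint concrete, here is a sketch of a reduce whose output has the same shape as each of its inputs, so a partial result can be passed back in alongside values it has not seen yet. The `{ count, total }` shape and the sample values are assumptions chosen for illustration.

```javascript
// Output has the same shape as each input value: { count, total }.
function reduce(key, values) {
  let count = 0;
  let total = 0;
  for (const v of values) {
    count += v.count;
    total += v.total;
  }
  return { count, total };
}

// Hypothetical values emitted for a single key.
const values = [
  { count: 1, total: 10 },
  { count: 1, total: 20 },
  { count: 1, total: 30 },
  { count: 1, total: 40 },
];

// Reduce everything in one call.
const allAtOnce = reduce("k", values);

// Reduce a partial batch first, then feed that result back in with the rest.
const partial = reduce("k", values.slice(0, 2));
const inStages = reduce("k", [partial, ...values.slice(2)]);

console.log(allAtOnce); // { count: 4, total: 100 }
console.log(inStages);  // { count: 4, total: 100 }
```

If reduce returned the average directly instead, a partial result could not be combined correctly with further values, because the count behind it would be lost. Keeping both `count` and `total` and dividing only at the very end avoids that.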
