Posts tagged .NET
MongoDB substring search with a difference
Nov 25th
It’s quite common to want to search a database for a key that starts with a given string. In SQL you have LIKE and in MongoDB you have regular expressions:
db.customers.find( { name : { $regex : '^acme', $options: 'i' } } );
But what if you want the inverse, i.e. to find the keys in the database that are themselves prefixes of the search string? For example, suppose you are parsing a block of text and you want to find phrases in the database that match the start of the current block of text. This is awkward to express in SQL, but with MongoDB you can create a regular expression that matches either the first word, or the first two words, or the first three words, and so on.
We can construct a regular expression to do this; it might look something like: ^word1($| word2($| word3$))
Here’s a C# method that can create the necessary regular expression:
/// <summary>
/// Generates a regular expression that matches as much of the given phrase as it can from a string,
/// i.e. a reverse prefix search where the database supplies the prefix and it is matched against your query.
/// Useful for matching 'as much as possible from a given input'.
/// Requires: using System.Linq; using System.Text.RegularExpressions;
/// </summary>
private string generatePrefixRegex(string phrase, bool atStart)
{
    string[] bits = phrase.Split(' ');
    string first = bits[0];
    // Escape the first word too, so punctuation in it is matched literally
    string result = Regex.Escape(first);
    // At the start of a sentence, if the first character is upper cased, we should also be looking for a lowercased version of it
    if (atStart && first.Length > 0 && char.IsUpper(first[0]))
    {
        result = string.Format("({0}|{1}){2}", char.ToLowerInvariant(first[0]), char.ToUpperInvariant(first[0]), Regex.Escape(first.Substring(1)));
    }
    // Each additional word - either we end the string before it or we must include it
    foreach (var bit in bits.Skip(1))
    {
        result = result + "($| " + Regex.Escape(bit);
    }
    result = result + "$"; // last word must end string
    foreach (var bit in bits.Skip(1))
    {
        result = result + ")"; // close the expression
    }
    return "^" + result; // Must start at the start of a Name
}
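As a quick way to see the pattern at work without a database, here is a minimal sketch in JavaScript (the language of the mongo shell). It is an illustration only, not a port of the C# method above: the helper names buildPrefixRegex and escapeRegex and the sample customer names are made up, and it leaves out the atStart capitalisation handling.

```javascript
// Hypothetical helper: escape regex metacharacters so each word matches literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build ^word1($| word2($| word3$)) for a phrase of any length.
function buildPrefixRegex(phrase) {
  const words = phrase.split(" ").map(escapeRegex);
  let pattern = words[0];
  for (const word of words.slice(1)) {
    pattern += "($| " + word; // either the name ends here or the next word follows
  }
  pattern += "$"; // the last word must end the name
  pattern += ")".repeat(words.length - 1); // close every group that was opened
  return "^" + pattern;
}

// Sample data (made up): names as they might be stored in the database.
const names = ["Acme", "Acme Widgets", "Acme Widgets Ltd", "Widgets"];
const pattern = buildPrefixRegex("acme widgets of seattle");
const regex = new RegExp(pattern, "i");
const matches = names.filter((name) => regex.test(name));

console.log(pattern); // ^acme($| widgets($| of($| seattle$)))
console.log(matches); // [ 'Acme', 'Acme Widgets' ]
```

Only the stored names that are whole-word prefixes of the input match; "Acme Widgets Ltd" is rejected because the input continues with "of", not "Ltd".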
Home network crawler – cataloging every file on the home LAN with C# and MongoDB
Aug 22nd

Map-Reduce in action: The glaciers in Greenland 'map' the canyon walls into streams of rocks called lateral moraine. As the glaciers merge these rocks are 'reduced' into streams in the middle called 'medial' moraine. (A photo I took over Greenland this summer.)
I’m not a huge fan of RAID arrays – they mostly mean there’s another component to go wrong (the controller card) and when they do go wrong you can lose all your data just as easily as if it were all on one drive. I prefer a multiple copy strategy, an “Amazon S3 for the home” if you like. The downside of this is that there are multiple copies of each file across the home network and as I have several generations of hard drives the mapping from primary to secondary to tertiary is complex and hard to manage! It’s also really hard to find a single file when there are so many places to look and it’s nigh on impossible to be sure that I have the necessary three copies of every important file in the right places at all times.
So this weekend I embarked on a small project to catalog every file, directory and storage volume on the entire home network including drives that are only sometimes connected. The software has been running all weekend and is close to cataloging everything. It’s found 5 million files so far representing over 6TB of data!
The architecture I chose for this software was an agent that runs on each PC to catalog all of the attached volumes. This client uploads all the directories and files that it finds to a MongoDB database running on the same Atom server as the main storage array. The poor little Atom server’s 4GB of RAM has been in constant use but the server has remained responsive, in part because it boots from an SSD drive.
Each volume, directory and file is represented by a document in MongoDB in a single collection. The agent calculates an MD5 hash for each file and extracts metadata from MP3, WMA and JPG files. It also stores all of the key file dates (created, updated, accessed) and references to parent directories, volume identifiers and the currently connected PC. It does not assume that a volume is always connected to the same computer – you can unplug an external drive from one and put it somewhere else and it will all work just fine.
I implemented a re-startable tree scan that uses a couple of DateTime stamps to be able to determine which directories need to be scanned during the current pass and which ones have already been scanned. Any agent can be killed at any time and restarted and it will carry on walking the directory tree right where it left off. It will even continue correctly in the case where you move a volume from one PC to another.
Each agent uses the Task Parallel Library’s Parallel.ForEach to crawl each volume in parallel and to parse multiple files from each directory simultaneously.
By storing all of the file metadata in MongoDB it’s easy to use Map-Reduce to calculate some interesting statistics for the files on the network.
For example, to create a summary of file sizes I can use a Map function:
function Map() {
    if (this.Size && this._t == "FileInformation")
    {
        var size = this.Size;
        if (size < 1024)
            emit("kb", {count:1, size:this.Size});
        else if (size < 1024*1024)
            emit("mb", {count:1, size:this.Size});
        else if (size < 1024*1024*1024)
            emit("gb", {count:1, size:this.Size});
        else if (size < 1024*1024*1024*1024)
            emit("tb", {count:1, size:this.Size});
        else
            emit("tb+", {count:1, size:this.Size});
    }
}
and a reduce function:
function Reduce(key, arr_values) {
    var count = 0;
    var size = 0;
    for (var i in arr_values)
    {
        count = count + arr_values[i].count;
        size = size + arr_values[i].size;
    }
    return {count:count, size:size};
}
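A map/reduce pair like this can be tried out on a handful of sample documents before it is run on the server. The sketch below is a simplified in-memory stand-in, not MongoDB's implementation: it groups the emitted values by key and calls reduce once per key, whereas the server may call reduce several times on partial results. The runMapReduce helper, the reduced set of buckets and the sample documents are all made up for illustration.

```javascript
// Simplified stand-in for map-reduce: run map over every document,
// group the emitted values by key, then reduce each group once.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = (key, value) => {
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(value);
  };
  for (const doc of docs) {
    map.call(doc, emit); // the document is available as `this`, as in MongoDB
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce(key, values);
  }
  return result;
}

// Same idea as the Map function above, with fewer buckets; emit is passed in
// explicitly here instead of being a global.
function mapBySizeBucket(emit) {
  if (this.Size && this._t === "FileInformation") {
    const bucket = this.Size < 1024 ? "kb" : this.Size < 1024 * 1024 ? "mb" : "gb+";
    emit(bucket, { count: 1, size: this.Size });
  }
}

function reduceTotals(key, values) {
  let count = 0;
  let size = 0;
  for (const value of values) {
    count += value.count;
    size += value.size;
  }
  return { count, size };
}

// Sample documents (made up).
const sampleDocs = [
  { _t: "FileInformation", Size: 500 },
  { _t: "FileInformation", Size: 2048 },
  { _t: "FileInformation", Size: 200 },
  { _t: "DirectoryInformation" },
];

const summary = runMapReduce(sampleDocs, mapBySizeBucket, reduceTotals);
console.log(summary); // { kb: { count: 2, size: 700 }, mb: { count: 1, size: 2048 } }
```

Because the reduce function returns a value of the same shape as the values it receives, it stays correct when the server re-reduces partial results.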
Map-Reduce operations like this take about 20 minutes to run (on the Atom server with just 4GB of RAM) whereas any query serviced by one of the indexes on the MongoDB collection is almost instantaneous.
I’ve been using the excellent MongoVue to run simple map-reduce scripts like this and to keep track of how quickly the database is growing.
Map-reduce can also be used to find duplicate files – by emitting the MD5 hash as the key and some information about the file as the value I can find every copy of every file across every computer on the home network.
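The grouping behind that duplicate search can be sketched in a few lines of plain JavaScript. This is an illustration over an in-memory array rather than the map-reduce job itself, and the field names MD5 and Path, like the sample records, are assumptions because the schema is not listed here.

```javascript
// Group file records by content hash and keep only hashes seen more than once.
function findDuplicates(files) {
  const byHash = new Map();
  for (const file of files) {
    const paths = byHash.get(file.MD5) || [];
    paths.push(file.Path);
    byHash.set(file.MD5, paths);
  }
  return [...byHash.entries()]
    .filter(([, paths]) => paths.length > 1)
    .map(([hash, paths]) => ({ hash, copies: paths.length, paths }));
}

// Sample records (made up).
const catalog = [
  { MD5: "aaa111", Path: "\\\\pc1\\photos\\IMG_0228.jpg" },
  { MD5: "bbb222", Path: "\\\\pc1\\docs\\budget.xls" },
  { MD5: "aaa111", Path: "\\\\server\\backup\\IMG_0228.jpg" },
];

const duplicates = findDuplicates(catalog);
console.log(duplicates.length);    // 1
console.log(duplicates[0].copies); // 2
```

The same grouping also answers the opposite question: any hash whose group has fewer than three paths marks a file that does not yet have its three copies.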
Since I have the file name and metadata for every file on the home network I can also easily find any file using MongoDB’s regex matching feature against the path.
The Hard Parts
For starters you’ll need a library that can handle long file names. Then you’ll need to fix it to provide at least the functionality that FileInfo and DirectoryInfo give you in .NET.
Next you’ll need to learn about reparse-points and hard-links and you’ll need to skip over them because with them in place the file system is not a tree; it’s a cyclical graph in which a simple crawler will quickly get confused or stuck.
You’ll also want to store the NTFS file Id and the unique Volume ID for every file so you can track it when the file is moved or the removable drive is connected to a different computer.
So how well does it work?
This all seems to work really well. Nearly every volume has now been cataloged. It’s located about 5M files occupying over 6TB of space. The worst case offender for the number of copies of the same file is 100+. I’ve used the find feature in MongoDB to find a file I was missing and I’m better able to plan how to arrange directories and file generations across the various hard drives I have.
What’s next
Well, of course this needs to be connected to the home automation system and my Natural Language engine so you can ask “send a copy of IMG_0228 from last week to X” or “where are all the spreadsheets I created last year?” That will be fairly easy.
After that I hope to incorporate backup features into the agents too so they can automatically keep the required number of copies of each file according to its importance. I’d also like to set up a rotating set of external drives that go in the fire safe when not connected and when they are connected they get updated with the latest copies of all the important files.
I’d also like to be able to get the agents to move whole groups of directories around between drives as juggling the directory layout each time a new hard drive is added to the system is always a time consuming process.
Comments or Questions?
Does everyone else have a hard time managing multiple computers, hard drives, directories and multiple copies of files? What tools do you use to do this? Is there anything commercially available that I could have used instead? Would a tool like this be useful to you? Should I publish the code somewhere? Comments and questions are always welcome here or on twitter.
Stop writing rude software! Use LASTINPUTINFO instead.
Aug 19th
Can you imagine what life would be like if people behaved like software programs do?
You’d be working away on something when someone would interrupt, steal your attention, and demand a response. You’d be interrupted in the middle of sentences all the time and while you were dealing with one interruption someone else could come up and interrupt you again.
You wouldn’t put up with people like that so why do you put up with software that behaves that way?
Windows itself is one of the worst offenders. The dreaded dialog announcing that updates have been installed and that it wants to reboot right this instant has caused me significant inconvenience in the past: it steals focus, grabs the next Return keystroke, and assumes I really did want to reboot right now, right in the middle of a blog post!
There really is no excuse for writing rude software. Windows includes an API function called GetLastInputInfo (it fills in a LASTINPUTINFO structure) that tells you when the user last typed or moved the mouse, so you can delay your annoying toast pop-up, or worse, that focus-stealing modal dialog, until you think the user is ready for it. The C# code below shows how to use this API call to get the number of seconds since the last user input. Simply delay your notification or dialog until an appropriate time has passed (e.g. 5 seconds) and only then interrupt the user.
Background processing
Similarly, if your background processing is hammering the disk drive you can make it more polite and throttle it back when the user is active on their computer. (You did, of course, do all that background processing on a lower priority thread, didn’t you?)
One other area you might want to consider is using BITS to download files instead of hammering the user’s internet connection to fetch files in the background.
The Code
So here’s the code you should use from today to make your software polite:
using System;
using System.Runtime.InteropServices;

public static class Input
{
    [DllImport("User32.dll")]
    private static extern bool GetLastInputInfo(ref LASTINPUTINFO plii);

    [StructLayout(LayoutKind.Sequential)]
    private struct LASTINPUTINFO
    {
        public uint cbSize;
        public uint dwTime;
    }

    /// <summary>
    /// How many seconds since last user input
    /// </summary>
    public static double SecondsSinceLastInput()
    {
        LASTINPUTINFO lastInput = new LASTINPUTINFO();
        lastInput.cbSize = (uint)Marshal.SizeOf(lastInput);
        if (!GetLastInputInfo(ref lastInput))
        {
            return 0; // call failed: assume the user is active rather than interrupt them
        }
        // Both values are millisecond tick counts; unsigned subtraction stays correct when the counter wraps
        uint idle = unchecked((uint)Environment.TickCount) - lastInput.dwTime;
        return idle / 1000.0;
    }
}
C# Natural Language Engine connected to Microsoft Dynamics CRM 2011 Online
Jun 5th
In an earlier post I discussed some ideas around a Semantic CRM.
Recently I’ve been doing some clean up work on my C# Natural Language Engine and decided to do a quick test connecting it to a real CRM. As you may know from reading my blog, this natural language engine is already heavily used in my home automation system to control lights, sprinklers, HVAC, music and more and to query caller ID logs and other information.
I recently refactored it to use the Autofac dependency injection framework and in the process realized just how close my NLP engine is to ASP.NET MVC 3 in its basic structure and philosophy! To use it you create Controller classes and put action methods in them. Those controller classes use Autofac to get all of the dependencies they may need (services like an email service, a repository, a user service, an HTML email formatting service, …) and each method in them represents a specific sentence parse using the various token types that the NLP engine supports. Unlike ASP.NET MVC 3 there is no route registration; the method itself represents the route (i.e. sentence structure) that is used to decide which method to call. Internally my NLP engine has its own code to match incoming words and phrases to tokens and then on to the action methods. In a sense the engine itself is one big dependency injection framework working against the action methods. I sometimes wish ASP.NET MVC 3 had the same route-registration-free approach to designing web applications (but also appreciate all the reasons why it doesn’t).
Another improvement I made recently to the NLP Engine was to develop a connector for the Twilio SMS service. This means that my home automation system can now accept SMS messages as well as all the other communication formats it supports: email, web chat, XMPP chat and direct URL commands. My Twilio connector to NLP supports message splitting and batching so it will buffer up outgoing messages to reach the limit of a single SMS and will send that. This lowers SMS charges and also allows responses that are longer than a single SMS message.
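The batching idea can be sketched as follows. This is not the connector's actual code: it is a minimal illustration that assumes a limit of 160 characters per SMS, joins buffered messages with a newline, and splits oversized messages at the limit without regard to word boundaries. The name batchMessages and the sample replies are made up.

```javascript
const SMS_LIMIT = 160; // assumed single-SMS length

// Pack outgoing messages into as few SMS-sized batches as possible,
// splitting any single message that is longer than the limit.
function batchMessages(messages, limit = SMS_LIMIT) {
  const batches = [];
  let current = "";
  for (const message of messages) {
    let rest = message;
    // An oversized message is cut into full-size pieces first.
    while (rest.length > limit) {
      if (current) {
        batches.push(current);
        current = "";
      }
      batches.push(rest.slice(0, limit));
      rest = rest.slice(limit);
    }
    if (rest.length === 0) continue;
    const candidate = current ? current + "\n" + rest : rest;
    if (candidate.length <= limit) {
      current = candidate; // still fits in the SMS being built
    } else {
      batches.push(current); // flush and start a new SMS
      current = rest;
    }
  }
  if (current) batches.push(current);
  return batches;
}

const replies = ["Lights on", "Heating set to 68", "x".repeat(200)];
const batches = batchMessages(replies);
console.log(batches.map((batch) => batch.length)); // [ 27, 160, 40 ]
```

The two short replies travel together in one SMS, while the 200-character reply is split across two.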
Using this new, improved version of my Natural Language Engine I decided to try connecting it to a CRM. I chose Microsoft Dynamics CRM 2011 and elected to use the strongly-typed, early-bound objects that you can generate for any instance of the CRM service. I added some simple sentences in an NLPRules project that allow you to tell it who you met, and to input some of their details. Unlike a traditional forms-based approach the user can decide what information to enter and what order to enter it in. The Natural Language Engine supports the concept of a conversation and can remember what you were discussing, allowing a much more natural style of conversation than some simple rule-based engines and even allowing it to ask questions and get answers from the user.
Here’s a screenshot showing a sample conversation using Google Talk (XMPP/Jabber) and the resulting CRM record in Microsoft CRM 2011 Online. You could have the same conversation over SMS or email.
Based on my limited testing this looks like another promising area where a truly fluent, conversational-style natural language engine could play a significant role. Note how it understands email addresses, phone numbers and such like and in code these all become strongly typed objects. Where it really excels is in temporal expressions where it can understand things like “who called on a Saturday in May last year?” and can construct an efficient SQL query from that.
Consultancy
Sep 1st
Software Consultancy Serving the Greater Seattle Area

Our consulting service is now open for business.
We can help you with:-
- Business planning
- Program management
- Software development for the Microsoft .NET platform
- Architectural advice and review
- Custom software development (offsite or onsite)
- Migrating web applications to ASP.NET MVC 2
- Migration to Entity Framework 4
- Complex threading using the Task Parallel Library
- Performance optimization (memory and CPU)
- Search engine optimization (SEO)
- Software lifecycle management
- Source code control and continuous integration
- Authoring documents for patent applications
- Recruiting support (interviewing candidates)
We specialize in web applications, web services, and technical programming. If your requirements mandate a strong algorithmic or mathematical approach we have the expertise to help you. We are a local consultancy able to meet with you in Seattle, Redmond, or Bellevue at short notice. We have over twenty years of industry experience developing and shipping world-class software products. We have a holistic and pragmatic approach to problem solving. We offer full life-cycle support from inception through planning, development and deployment and into monitoring, maintenance and trouble shooting.
Please contact us on (425) 522-2040 or by email to schedule an initial meeting or to discuss your needs.
Constrained parallelism for the Task Parallel Library
Sep 1st
When developing .NET applications there is often the need to execute multiple background processes, for example, fetching and rendering different size thumbnails for images. Typically you queue actions like these onto the thread pool. But in the case of thumbnail generation you typically want to fetch a base image first and then perform the resize operations on it. If five web pages each request a different thumbnail size simultaneously you may end up fetching the same image five times before processing it. Of course, you can add file based locking around this to ensure that only the first one gets to fetch the data but it would be much better if you could instead instruct the Task Parallel Library to execute co-dependent tasks sequentially.
The new Task Parallel Library has continuations that allow one task to chain onto the end of a previous task but you still need a way to track all the tasks currently active so you can find the other task to chain onto. In a multi-threaded ASP.NET environment that’s not so easy.
Below is a TaskFactory that gives you constrained parallelism allowing you to queue up tasks in such a way that no two tasks with the same key will execute in parallel. To use it you simply create a new TaskFactorySequentiallyByKey and then call StartNewChainByKey() with a suitable key, e.g. “RENDERimage12345.jpg”. This method returns a normal Task object that you can Wait on or add more continuations. All the usual TaskFactory constructor options are provided so you can have a different TaskScheduler, common cancellation token, and other options.
Note also that it expects an Action<CancellationToken> not just a plain Action. This is so your Action can be polite and monitor the cancellation token to know when to stop early. If you don’t need that you can always pass in a closure that tosses the CancellationToken, i.e. (token) => MyAction().
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Threading;
using System.Diagnostics;

namespace Utility
{
    /// <summary>
    /// The TaskFactorySequentiallyByKey factory limits concurrency when actions are passed with the same key. Those actions are executed sequentially
    /// and never in parallel.
    /// </summary>
    /// <remarks>
    /// For example, you have an action to fetch an image from the web to a local hard drive and then render a specific size of thumbnail for it.
    /// The action includes code to check if the original image is already on disk, if not it fetches it.
    /// It then checks if the correct size thumbnail has been rendered, if not it renders it.
    /// You want to be able to fire off requests for thumbnails from multiple different ASP.NET web pages and ensure that any two requests for the
    /// same original image are executed sequentially so that the image is only fetched once from the web before both thumbnail renders run.
    /// </remarks>
    public class TaskFactorySequentiallyByKey : TaskFactory
    {
        /// <summary>
        /// Tasks currently queued based on key
        /// </summary>
        Dictionary<string, Task> inUse = new Dictionary<string, Task>();

        public TaskFactorySequentiallyByKey()
            : base()
        { }

        public TaskFactorySequentiallyByKey(CancellationToken cancellationToken)
            : base(cancellationToken)
        { }

        public TaskFactorySequentiallyByKey(TaskScheduler scheduler)
            : base(scheduler)
        { }

        public TaskFactorySequentiallyByKey(TaskCreationOptions creationOptions, TaskContinuationOptions continuationOptions)
            : base(creationOptions, continuationOptions)
        { }

        public TaskFactorySequentiallyByKey(CancellationToken cancellationToken, TaskCreationOptions creationOptions, TaskContinuationOptions continuationOptions, TaskScheduler scheduler)
            : base(cancellationToken, creationOptions, continuationOptions, scheduler)
        { }

        protected virtual void FinishedUsing(string key, Task taskThatJustCompleted)
        {
            lock (this.inUse)
            {
                // If the key is present AND it points to the task that just finished THEN we are done
                // and can clear the key for the next task that comes in ...
                if (this.inUse.ContainsKey(key))
                {
                    if (this.inUse[key] == taskThatJustCompleted)
                    {
                        this.inUse.Remove(key);
                        Debug.WriteLine("Finished using " + key + " completely");
                    }
                    else
                    {
                        // A later task has been chained onto this key, so it stays in use
                        Debug.WriteLine("Finished an item for " + key);
                    }
                }
            }
        }

        /// <summary>
        /// Queue an action but prevent parallel execution of items having the same key. Instead, run them sequentially.
        /// </summary>
        /// <remarks>
        /// This allows you to, for example, queue up tasks to fetch an image from the web to a cache and render a thumbnail for it at different sizes
        /// while ensuring that the image is only fetched to the cache once before each different size thumbnail is generated
        /// </remarks>
        public Task StartNewChainByKey(string key, Action<CancellationToken> action)
        {
            return StartNewChainByKey(key, action, base.CancellationToken);
        }

        /// <summary>
        /// Queue an action but prevent parallel execution of items having the same key. Instead, run them sequentially.
        /// </summary>
        /// <remarks>
        /// This allows you to, for example, queue up tasks to fetch an image from the web to a cache and render a thumbnail for it at different sizes
        /// while ensuring that the image is only fetched to the cache once before each different size thumbnail is generated
        /// </remarks>
        public Task StartNewChainByKey(string key, Action<CancellationToken> action, CancellationToken cancellationToken)
        {
            // Honour both the caller's token and the factory-wide token
            CancellationToken combined = cancellationToken == base.CancellationToken ? base.CancellationToken :
                CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, base.CancellationToken).Token;
            lock (inUse)
            {
                Task result;
                if (inUse.TryGetValue(key, out result))
                {
                    // chain the supplied action after it ...
                    result = result.ContinueWith((task) => action(combined), combined);
                    // And then schedule a completion check after that
                    result.ContinueWith((task) => FinishedUsing(key, task));
                    // Update the dictionary so that it tracks the new LAST task in line, not any of the earlier ones
                    inUse[key] = result;
                    Debug.WriteLine("Chained onto " + key);
                    return result;
                }
                // otherwise simply create it and start it after remembering that the key is in use
                result = new Task(() => action(combined), combined);
                inUse.Add(key, result);
                // queue up the check after it
                result.ContinueWith((task) => FinishedUsing(key, task));
                Debug.WriteLine("Starting a new action for " + key);
                // And finally start it
                result.Start(this.Scheduler);
                return result;
            }
        }
    }
}
Singleton tasks: A TaskFactory for the Task Parallel Library with ‘run-only-one’ semantics
Sep 1st
When developing .NET applications there is often the need to execute some slow background process repeatedly. For example, fetching a feed from a remote site, updating a user’s last logged in time, … etc. Typically you queue actions like these onto the thread pool. But under load that becomes problematic as requests may be coming in faster than you can service them, the queue builds up and you are now executing multiple requests for the same action when you only really needed to do one. Even when not under load, if two users request a web page that requires the same image to be loaded and resized for display you only want to fetch it and resize it once. What you really want is an intelligent work queue that can coalesce multiple requests for the same action into a single action that gets executed just once.
The new Task Parallel Library doesn’t have anything that can handle these ‘run-only-one’ actions directly but it does have all the necessary building blocks to build one by creating a new TaskFactory and using Task continuations.
Below is a TaskFactory that gives you ‘run-only-one’ actions. To use it you simply create a new TaskFactoryLimitOneByKey and then call StartNewOrUseExisting() with a suitable key, e.g. “FETCH/cache/image12345.jpg”. This method returns a normal Task object that you can Wait on or add more continuations. All the usual TaskFactory constructor options are provided so you can have a different TaskScheduler, common cancellation token, and other options.
Note also that it expects an Action<CancellationToken> not just a plain Action. This is so your Action can be polite and monitor the cancellation token to know when to stop early. If you don’t need that you can always pass in a closure that tosses the CancellationToken, i.e. (token) => MyAction().
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Threading;
using System.Diagnostics;

namespace Utility
{
    /// <summary>
    /// A task factory where Tasks are queued up with a key and only one of that key is allowed to exist either in the queue or executing
    /// </summary>
    /// <remarks>
    /// This is useful for tasks like fetching a file from backing store, or updating local information from a remote service
    /// You want to be able to queue up a Task to go do the work but you don't want it to happen 5 times in quick succession
    /// NB: This does not absolve you from using file locking and other techniques in your method to handle simultaneous requests,
    /// it just greatly reduces the chances of it happening. Another example would be updating a user's last logged in data in a
    /// database. Under heavy load the queue to write to the database may be getting long and you don't want to update it for the same
    /// user repeatedly if you can avoid it with a single write.
    /// </remarks>
    public class TaskFactoryLimitOneByKey : TaskFactory
    {
        /// <summary>
        /// Tasks currently queued based on key
        /// </summary>
        Dictionary<string, Task> inUse = new Dictionary<string, Task>();

        public TaskFactoryLimitOneByKey()
            : base()
        { }

        public TaskFactoryLimitOneByKey(CancellationToken cancellationToken)
            : base(cancellationToken)
        { }

        public TaskFactoryLimitOneByKey(TaskScheduler scheduler)
            : base(scheduler)
        { }

        public TaskFactoryLimitOneByKey(TaskCreationOptions creationOptions, TaskContinuationOptions continuationOptions)
            : base(creationOptions, continuationOptions)
        { }

        public TaskFactoryLimitOneByKey(CancellationToken cancellationToken, TaskCreationOptions creationOptions, TaskContinuationOptions continuationOptions, TaskScheduler scheduler)
            : base(cancellationToken, creationOptions, continuationOptions, scheduler)
        { }

        protected virtual void FinishedUsing(string key, Task taskThatJustCompleted)
        {
            lock (this.inUse)
            {
                // If the key is present AND it points to the task that just finished THEN we are done
                // and can clear the key so that the next task coming in using it will get to execute ...
                if (this.inUse.ContainsKey(key))
                {
                    if (this.inUse[key] == taskThatJustCompleted)
                    {
                        this.inUse.Remove(key);
                        Debug.WriteLine("Finished using " + key + " completely");
                    }
                    else
                    {
                        Debug.WriteLine("Finished an item for " + key);
                    }
                }
            }
        }

        /// <summary>
        /// Queue only one of a given action based on a key. A singleton pattern for Tasks with the same key.
        /// </summary>
        /// <remarks>
        /// This allows you to queue up a request to, for example, render a file based on the file name
        /// Even if multiple users all request the file at the same time, only one render will ever run
        /// and they can all wait on that Task to complete.
        /// </remarks>
        public Task StartNewOrUseExisting(string key, Action<CancellationToken> action)
        {
            return StartNewOrUseExisting(key, action, base.CancellationToken);
        }

        /// <summary>
        /// Queue only one of a given action based on a key. A singleton pattern for Tasks with the same key.
        /// </summary>
        /// <remarks>
        /// This allows you to queue up a request to, for example, render a file based on the file name
        /// Even if multiple users all request the file at the same time, only one render will ever run
        /// and they can all wait on that Task to complete.
        /// </remarks>
        public Task StartNewOrUseExisting(string key, Action<CancellationToken> action, CancellationToken cancellationToken)
        {
            // Honour both the caller's token and the factory-wide token
            CancellationToken combined = cancellationToken == base.CancellationToken ? base.CancellationToken :
                CancellationTokenSource.CreateLinkedTokenSource(cancellationToken, base.CancellationToken).Token;
            lock (inUse)
            {
                if (inUse.ContainsKey(key))
                {
                    Debug.WriteLine("Reusing existing action for " + key);
                    return inUse[key]; // and toss the new action away
                }
                // otherwise, make a new one and add it ... with a continuation on the end to pull it off ...
                Task result = new Task(() => action(combined), combined);
                inUse.Add(key, result);
                // queue up the check after it
                result.ContinueWith((finished) => this.FinishedUsing(key, result));
                Debug.WriteLine("Starting a new action for " + key);
                // and finally start it
                result.Start(this.Scheduler);
                return result;
            }
        }
    }
}
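The coalescing idea is not specific to the Task Parallel Library. As a rough illustration, here is the same 'run-only-one' pattern sketched in JavaScript with promises standing in for Tasks. It is not a port of the class above: the LimitOneByKey class, the fetchImage action and the second cache key are made up, and cancellation is left out entirely.

```javascript
// Keeps at most one pending promise per key; later requests share it.
class LimitOneByKey {
  constructor() {
    this.inUse = new Map(); // key -> promise for the action in flight
  }

  startNewOrUseExisting(key, action) {
    const existing = this.inUse.get(key);
    if (existing) {
      return existing; // reuse the pending work and toss the new action away
    }
    // Run the action asynchronously, then free the key whether it succeeded or failed.
    const task = Promise.resolve().then(action);
    const forget = () => this.inUse.delete(key);
    task.then(forget, forget);
    this.inUse.set(key, task);
    return task;
  }
}

const factory = new LimitOneByKey();
let fetchCount = 0;
const fetchImage = () => {
  fetchCount += 1; // stands in for a slow fetch-and-resize
  return "image bytes";
};

const first = factory.startNewOrUseExisting("FETCH/cache/image12345.jpg", fetchImage);
const second = factory.startNewOrUseExisting("FETCH/cache/image12345.jpg", fetchImage);
const other = factory.startNewOrUseExisting("FETCH/cache/image67890.jpg", fetchImage);

console.log(first === second); // true: both callers wait on the same work
console.log(first === other);  // false: a different key gets its own work
Promise.all([first, second, other]).then(() => console.log(fetchCount)); // 2
```

As with the C# version, the key is released only once the action has finished, so a request that arrives afterwards starts fresh work.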
GDI+ Image.FromFile has a problem – here’s how to fix it
Jul 30th
In GDI+ you can call Image.FromFile to load an image from a file. BUT there are several issues with this call, the biggest being that GDI+ will keep the file open long after you are done with it. Here is an image loader that gets around this issue.
If you are running a high volume web site, and your images are on a SAN, you’ll find this technique necessary to prevent an eventual exhaustion of file handles.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Drawing;
using System.IO;
using System.Data;

namespace Utility
{
    public static class ImageLoader
    {
        // This isn't going to help much - you'll run out of memory anyway on very large images - but if you are keeping several in memory it might ...
        public const int MaximumImageDimension = 10000;

        /// <summary>
        /// Method to safely load an image from a file without leaving the file open,
        /// also gets the size down to a manageable size in the case of HUGE images
        /// </summary>
        /// <returns>An Image - don't forget to dispose of it later</returns>
        public static Image LoadImage(string filePath)
        {
            try
            {
                FileInfo fi = new FileInfo(filePath);
                if (!fi.Exists) throw new FileNotFoundException("Cannot find image");
                if (fi.Length == 0) throw new FileNotFoundException("Zero length image file ");
                // Image.FromFile is known to leave files open, so we use a stream instead to read it
                using (FileStream fs = new FileStream(filePath, FileMode.Open, FileAccess.Read))
                {
                    if (!fs.CanRead) throw new FileLoadException("Cannot read file stream");
                    if (fs.Length == 0) throw new FileLoadException("File stream zero length");
                    using (Image original = Image.FromStream(fs))
                    {
                        // Make a copy of the file in memory, then release the one GDI+ gave us
                        // thus ensuring that all file handles are closed properly (which GDI+ doesn't do for us in a timely fashion)
                        int width = original.Width;
                        int height = original.Height;
                        if (width == 0) throw new DataException("Bad image dimension width=0");
                        if (height == 0) throw new DataException("Bad image dimension height=0");
                        // Now shrink it to Max size to control memory consumption
                        // (Math.Max keeps the scaled side at least 1 pixel for extreme aspect ratios)
                        if (width > MaximumImageDimension)
                        {
                            height = Math.Max(1, height * MaximumImageDimension / width);
                            width = MaximumImageDimension;
                        }
                        if (height > MaximumImageDimension)
                        {
                            width = Math.Max(1, width * MaximumImageDimension / height);
                            height = MaximumImageDimension;
                        }
                        Bitmap copy = new Bitmap(width, height);
                        using (Graphics graphics = Graphics.FromImage(copy))
                        {
                            graphics.DrawImage(original, 0, 0, copy.Width, copy.Height);
                        }
                        return copy;
                    }
                }
            }
            catch (Exception ex)
            {
                // The indexer overwrites an existing entry instead of throwing on a duplicate key
                ex.Data["FileName"] = filePath;
                throw;
            }
        }
    }
}
A simple redirect route handler for ASP.NET 3.5 routing
Apr 20th
ASP.NET 3.5 Routing is a very powerful tool not just for registering routes for newer ASP.NET MVC applications but also for adding SEO friendly routes to older Webforms (ASPX) applications, or for routing multiple URLs to a single page. But that’s not all it can do. You can create your own IRouteHandler and then have complete control over what to do with any incoming HttpRequest.
Here for example is a way to do a permanent redirect when a given route is matched. To use it you might, for example, do:-
routes.Add(new Route("sample.aspx", new RedirectRouteHandler("/home/start")));
Here is the RedirectRouteHandler that can turn any request into a 301 redirect for you:-
// Requires: using System.Web; using System.Web.Routing;

/// <summary>
/// Redirect Route Handler
/// </summary>
public class RedirectRouteHandler : IRouteHandler
{
    private string newUrl;

    public RedirectRouteHandler(string newUrl)
    {
        this.newUrl = newUrl;
    }

    public IHttpHandler GetHttpHandler(RequestContext requestContext)
    {
        return new RedirectHandler(newUrl);
    }
}

/// <summary>
/// <para>Redirecting MVC handler</para>
/// </summary>
public class RedirectHandler : IHttpHandler
{
    private string newUrl;

    public RedirectHandler(string newUrl)
    {
        this.newUrl = newUrl;
    }

    public bool IsReusable
    {
        get { return true; }
    }

    public void ProcessRequest(HttpContext httpContext)
    {
        // Send a permanent redirect pointing at the new URL
        httpContext.Response.Status = "301 Moved Permanently";
        httpContext.Response.StatusCode = 301;
        httpContext.Response.AppendHeader("Location", newUrl);
        return;
    }
}
Note: I’m not saying this is the best or only way to handle this. You’ll want to look at Url Rewriting and the Application and Request Routing module for IIS7 in particular.
Why functional programming and LINQ is often better than procedural code
Apr 15th
Functional programming is a relatively new component in the C# language. It can potentially replace for-loops in many situations with simpler code, but the question remains ‘what’s wrong with a good old for loop?’
Here are some of the reasons I think functional programming is important and in particular how LINQ can improve the readability, maintainability, and parallelizability (if there were such a word) of your code:
- Functional approaches are potentially easier to parallelize either manually using PLINQ or by the compiler. As CPUs move to even more cores this may become more important.
- Functional approaches make it easier to achieve lazy evaluation in multi-step processes because you can pass the intermediate results to the next step as a simple variable which hasn’t been evaluated fully yet rather than evaluating the first step entirely and then passing a collection to the next step (or without using a separate method and a yield statement to achieve the same procedurally).
- Functional approaches are often shorter and easier to read.
- Functional approaches often eliminate complex conditional bodies within for loops (e.g. if statements and ‘continue’ statements) because you can break the for loop down into logical steps – selecting all the elements that match, doing an operation on them, …
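The last point is easiest to see side by side. The sketch below uses JavaScript array methods as stand-ins for LINQ's Where, Select and Sum, because the shape of the code is the same in both languages; the file list and the function names are made up for illustration.

```javascript
// Sample data (made up).
const files = [
  { name: "holiday.jpg", size: 2048 },
  { name: "icon.jpg", size: 512 },
  { name: "notes.txt", size: 4096 },
  { name: "portrait.jpg", size: 3000 },
];

// Procedural version: the filtering is tangled up with the accumulation.
function totalLargeJpgLoop(items) {
  let total = 0;
  for (const file of items) {
    if (!file.name.endsWith(".jpg")) continue;
    if (file.size < 1024) continue;
    total += file.size;
  }
  return total;
}

// Functional version: each logical step is its own stage in the pipeline.
function totalLargeJpgPipeline(items) {
  return items
    .filter((file) => file.name.endsWith(".jpg")) // select the elements that match
    .filter((file) => file.size >= 1024)
    .map((file) => file.size)                     // project what we need
    .reduce((sum, size) => sum + size, 0);        // aggregate
}

console.log(totalLargeJpgLoop(files));     // 5048
console.log(totalLargeJpgPipeline(files)); // 5048
```

One difference to keep in mind: JavaScript's array methods evaluate eagerly and build intermediate arrays, whereas LINQ operators are lazy, so the lazy-evaluation point above applies to LINQ but not to this sketch.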
These days I opt for the functional syntax more often than not and fall back to for-loops when:-
A. The body of the loop contains complex logic that cannot be disentangled into a cleaner sequential application of functions and it is simply easier to write a for-loop with the complex conditional code in it.
B. The task is inherently not functional, i.e. has side effects.
C. The task needs exception handling in it. Sure you can write big lambda blocks with try catch in them but at some point it becomes easier and cleaner just to use a for-loop.
