The Blog of Ian Mercer.

Algorithm Complexity and the 'Which one will be faster?' question

Over and over on Stackoverflow.com people ask a question of the form 'If I have an algorithm of complexity A, e.g. O(n.log(n)), and an algorithm of complexity B, which one will be faster?' Developers obsess over using the cleverest algorithm available with the lowest theoretical bound on complexity thinking that they will make their code run faster by doing so. To many developers the simple array scan is anathema, they think they should always use a SortedCollection with binary search, or some heap or tree that can lower the complexity from O(n squared) to O(n.log(n)).

To all these developers I say "You cannot use the complexity of an algorithm to tell you which algorithm is fastest on typical data sets on modern CPUs".

I make that statement from bitter experience having found and implemented a search that could search vast quantities of data in a time proportional to the size of the key being used to search and not to the size of the data being searched. This clever suffix-tree approach to searching was beaten by a simple dumb array scan! How did that happen?

First you need to understand that there are constants involved. An algorithm might be O(n) but the actual instruction execution time for each step might be A microseconds. Another algorithm of O(n.log(n)) might take B microseconds to execute each step. If B >> A then for many values of n the first algorithm will be faster. Each algorithm may also involve a certain set up time and that too can dwarf execution time when n is small.

But here's the catch: even if you think the number of add and multiply instructions between the two algorithms is similar, the CPU you are executing them on may have a very different point of view because modern x86 CPUs have in effect been optimized for dumb programmers. They are incredibly fast at processing sequential memory locations in tight loops that can fit in on-chip RAM. Give them a much more complex tree-based O(n log(n)) algorithm and they now have to go off-chip and access scattered memory locations. The results are sometimes quite surprising and can quickly push that O(n log(n)) algorithm out of contention for values of n less than several million.

For most practical algorithms running on typical datasets the only way to be sure that algorithm A is faster than algorithm B is to profile it.

The other huge catch is that even when you have profiled it and found that your complex algorithm is faster, that's not the end of the story. Modern CPUs are now so fast that memory bottlenecks are often the real problem. If your clever algorithm uses more memory bandwidth than the dumb algorithm it may well affect other threads running on the same CPU. If it consumes so much memory that paging comes into play then any advantage it had in CPU cycles has evaporated entirely.

An example I saw once involved someone trying to save some CPU cycles by storing the result in a private member variable. Adding that member variable made the class larger and as a result less copies of it (and all other classes) could fit in memory and as a result the application overall ran slower. Sure the developer could claim that his method was now 50% faster than before but the net effect was a deterioration in the overall system performance.

Related Stories

Why smarthomes are hard

Why automated learning is hard for a smart home. The perils of over-fitting, under-fitting and how the general unpredictable nature of life makes it hard to build a system that learns your behavior.

Ian Mercer
Ian Mercer

ATAN curve for probabilities

In a home automation system we often want to convert a measurement into a probability. The ATAN curve is one of my favorite curves for this as it's easy to map overything onto a 0.0-1.0 range.

Ian Mercer
Ian Mercer

Home Construction Advice

Several years ago we did a major remodel. I did all of the finish electrical myself and supervised all of the rough-in electrical. I also put in all of the electrical system and water in our barn. I have opinions ...

Ian Mercer
Ian Mercer

T-Mobile home internet

I'm testing a T-Mobile Home Internet device as a backup to XFinity and a way to offload half our monthly traffic to avoid the XFinity 1.2TB cap

Ian Mercer
Ian Mercer

My love/hate relationship with Stackoverflow

Stackoverflow is a terrific source of information but can also be infuriating.

Ian Mercer
Ian Mercer

iBeacon meetup in Seattle - January 2015

My notes on the iBeacon meetup in Seattle held in January 2015

Ian Mercer
Ian Mercer

Xamarin Forms Application For Home Automation

Building a Xamarin Forms application to control my home automation system

Ian Mercer
Ian Mercer

JSON Patch - a C# implementation

Ian Mercer
Ian Mercer

Websites should stop using passwords for login!

A slightly radical idea to eliminate passwords from many of the websites you use just occasionally

Ian Mercer
Ian Mercer

Dynamically building 'Or' Expressions in LINQ

How to create a LINQ expression that logically ORs together a set of predicates

Ian Mercer
Ian Mercer

VariableWithHistory - making persistence invisible, making history visible

A novel approach to adding history to variables in a programming language

Ian Mercer
Ian Mercer

Neo4j Meetup in Seattle - some observations

Some observations from a meetup in Seattle on graph databases and Neo4j

Ian Mercer
Ian Mercer

Updated Release of the Abodit State Machine

A hierarchical state machine for .NET

Ian Mercer
Ian Mercer

My first programme [sic]

At the risk of looking seriously old, here's something found on a paper tape

Ian Mercer
Ian Mercer

Building a better .NET State Machine

A state machine for .NET that I've released on Nuget

Ian Mercer
Ian Mercer

The Internet of Dogs

Connecting our dog into the home automation

Ian Mercer
Ian Mercer

A simple state machine in C#

State machines are useful in many contexts but especially for home automation

Ian Mercer
Ian Mercer

Convert a property getter to a setter

Ian Mercer
Ian Mercer

Finally got the 1U Atom Server racked up

Ian Mercer
Ian Mercer

Timelapse video using the GoPro HD Hero

Ian Mercer
Ian Mercer

MongoDB Map-Reduce - Hints and Tips

Ian Mercer
Ian Mercer

Weather Forecasting for Home Automation

Ian Mercer
Ian Mercer

Lengthening short Urls in C#

Ian Mercer
Ian Mercer

I should have created Four Square ...

Ian Mercer
Ian Mercer

How can I tell if my house is smart?

Ian Mercer
Ian Mercer

Asian Gadgets

Ian Mercer
Ian Mercer

A great video explaining the Semantic Web

Ian Mercer
Ian Mercer

Why don't you trust your build system?

Ian Mercer
Ian Mercer

Interesting Twitter Posts March 15th-

Ian Mercer
Ian Mercer

Twitter links for Week beginning March 8th

Ian Mercer
Ian Mercer

ASP.NET MVC SEO - Solution Part 1

Ian Mercer
Ian Mercer

Elliott 803 - An Early Computer

Ian Mercer
Ian Mercer

Facebook, social gaming and points

Ian Mercer
Ian Mercer

Building sitemap.xml for SEO ASP.NET MVC

Ian Mercer
Ian Mercer

Useful Twitter links Feb 8-Feb 15 2010

Ian Mercer
Ian Mercer

Continuous Integration -> Continuous Deployment

What is "quality" in terms of a released software product or website?

Ian Mercer
Ian Mercer

When will people learn to backup?

A rant about RAID

Ian Mercer
Ian Mercer

Making a bootable Windows 7 USB Memory Stick

Here's how I made a bootable USB memory stick for Windows 7

Ian Mercer
Ian Mercer

Tip: getting the index in a foreeach statement

A tip on using LINQ's Select expression with an index

Ian Mercer
Ian Mercer

Future proof your home with a new conduit system?

Running conduit can be expensive but maybe you don't need one to every room

Ian Mercer
Ian Mercer

SQL Server - error: 18456, severity: 14, state: 38 - Incorrect Login

A rant about developers using the same message for different errors

Ian Mercer
Ian Mercer

WCF and the SYSTEM account

Namespace reservations and http.sys, my, oh my!

Ian Mercer
Ian Mercer

404 errors on IIS6 with ASP.NET 4 Beta 2

Ian Mercer
Ian Mercer

Mixed mode assembly errors after upgrade to .NET 4 Beta 2

Fixing this error was fairly simple

Ian Mercer
Ian Mercer

Balloon Boy was much ado about nothing - Twitter

Some of the more witty comments on Twitter about the Balloon Boy hoax

Ian Mercer
Ian Mercer

The EntityContainer name could not be determined

How to fix the exception "the entitycontainer" name could not be determined

Ian Mercer
Ian Mercer

Shortened URLs should be treated like a Codec ...

Expanding URLs would help users decide whether or not to click a link

Ian Mercer
Ian Mercer

Tagging File Systems

Isn't it time we stopped knowing which drive our file is on?

Ian Mercer
Ian Mercer

A great site for developing and testing regular expressions

Just a link to a site I found useful

Ian Mercer
Ian Mercer

WMPnetwk.exe started using 50% of my CPU

Uninstalling Windows Media Player - the end of an era

Ian Mercer
Ian Mercer

Introducing Jigsaw menus

A novel UI for menus that combines a breadcrumb and a menu in one visual metaphor

Ian Mercer
Ian Mercer

Entity Framework in .NET 4

Ian Mercer
Ian Mercer

Fix for IE's overflow:hidden problem

Ian Mercer
Ian Mercer

A better Tail program for Windows

A comparison of tail programs for Windows

Ian Mercer
Ian Mercer

Measuring website browser performance

Found this great resource on website performance

Ian Mercer
Ian Mercer

Amazon Instance vs Dedicated Server comparison

Some benchmark performance for Amazon vs a dedicated server

Ian Mercer
Ian Mercer

System.Data.EntitySqlException

Hints for dealing with this exception

Ian Mercer
Ian Mercer

Agile Software Development is Like Sailing

You cannot tack too often when sailing or you get nowhere. Agile is a bit like that.

Ian Mercer
Ian Mercer

Exception Handling using Exception.Data

My latest article on CodeProject covers the lesser known Exception.Data property

Ian Mercer
Ian Mercer

Javascript error reporting

Sending client-side errors back to a server for analysis

Ian Mercer
Ian Mercer

AntiVirus Software is the Worst Software!

When your anti-virus software starts stealing your personal data, it's time to remove it!

Ian Mercer
Ian Mercer

ASP.NET Custom Validation

How to solve a problem encountered with custom validation in ASP.NET

Ian Mercer
Ian Mercer

Optimization Advice

Some advice on software optimization

Ian Mercer
Ian Mercer

Linq's missing link

LinqKit came in handy back in 2009

Ian Mercer
Ian Mercer

It's all about disk speed

Why disk speed is the most critical aspect for most modern PCs and servers

Ian Mercer
Ian Mercer

Google Chart API

Ian Mercer
Ian Mercer

Cache optimized scanning of pairwise combinations of values

Using space-filling curves to optimize caching

Ian Mercer
Ian Mercer

Threading and User Interfaces

A rant about how few software programs get threading right

Ian Mercer
Ian Mercer

Take out the trash!

Why Windows shutdown takes so long

Ian Mercer
Ian Mercer

Giving up on Internet Explorer

Ian Mercer
Ian Mercer

New Home Automation Server

Ian Mercer
Ian Mercer

Dell upgrades - a pricey way to go

Ian Mercer
Ian Mercer

Programming mostly C#

Ian's advice on programming

Ian Mercer
Ian Mercer