Mathias Brandewinder on .NET, F#, VSTO and Excel development, and quantitative analysis / machine learning.
by Mathias 6. May 2012 12:48

I am digging back into the Bumblebee code base, to clean it up before talking at the New England F# user group in Boston in June. As usual for me, it’s a humbling experience to face my own code, 6 months later, or, if you are an incorrigible optimist, it’s great to see that I am so much smarter today than a few months ago…

In any case, while toying with one of the samples, I noted that performance was degrading pretty steeply as the size of the problem was increasing. Most of the action revolved around producing random shuffles of a list, so I figured it would be interesting to look into it and see where I messed up how this could be improved upon.

Here is the original code, a quick-and-dirty implementation of the Fisher-Yates shuffle:

open System

let swap fst snd i =
   if i = fst then snd else
   if i = snd then fst else

let shuffle items (rng: Random) =
   let rec shuffleTo items upTo =
      match upTo with
      | 0 -> items
      | _ ->
         let fst = rng.Next(upTo)
         let shuffled = List.permute (swap fst (upTo - 1)) items
         shuffleTo shuffled (upTo - 1)
   let length = List.length items
   shuffleTo items length

let main argv = 
     let test = [1..10000]
     let random = new Random()
     let shuffled = shuffle test random

Running the test case in fsi, using #time, produces the following:

Real: 00:00:42.735, CPU: 00:00:42.734, GC gen0: 2307, gen1: 6, gen2: 0

(Digression: #time is absolutely awesome – just typing #time;; in a fsi session will automatically display performance information, allowing to quickly tweak a function and fine-tune it “on the fly”. I wish I had known about it earlier.)

My initial assumption was that the problem revolved around performing multiple permutations of a List. However, I figured it would be interesting to take the opportunity and use the Performance Analysis tools provided in VS11 – and here is what I got:


Uh-oh. Looks like the shuffle is spending most of its time doing comparisons in the swap function – and what the hell is HashCompare.GenericEqualityIntrinsic doing in here? Something is off.

Looking into the swap function provides a hint:


F# has identified that the function could be made generic. It’s great, but in our case it comes with overhead, because we simply want to compare integers. Let’s mark the function as inline, to avoid that problem (we could also make the function non-generic, by marking one of the inputs as integer):



by Mathias 17. October 2010 16:13

The current project I am working on requires writing large amount of data to Excel worksheets. In this type of situation, I create an array with all the data I want to write, and set the value of the entire target range at once. I know from experience that this method is much faster than writing cells one by one, but I was curious about how much faster, so I wrote a little test, writing larger and larger chunks of data and measuring the speed of both methods:

private static void WriteArray(int rows, int columns, Worksheet worksheet)
   var data = new object[rows, columns];
   for (var row = 1; row <= rows; row++)
      for (var column = 1; column <= columns; column++)
         data[row - 1, column - 1] = "Test";

   var startCell = (Range)worksheet.Cells[1, 1];
   var endCell = (Range)worksheet.Cells[rows, columns];
   var writeRange = worksheet.Range[startCell, endCell];

   writeRange.Value2 = data;
private static void WriteCellByCell(int rows, int columns, Worksheet worksheet)
   for (var row = 1; row <= rows; row++)
      for (var column = 1; column <= columns; column++)
         var cell = (Range)worksheet.Cells[row, column];
         cell.Value2 = "Test";

Clearly, the array approach is the way to go, performing close to 1000 times faster per cell. It also seems to improve as size increases, but that would require a bit more careful testing.



by Mathias 13. June 2010 12:30

In my last post I explored how ExcelDNA can be used to write high-performance UDFs for Excel, calling .Net code without the overhead of VSTO. Using .Net instead of VBA for intensive computations already yields a nice improvement. Still, I regretted that ExcelDNA supports .Net up to 3.5 only, which puts the Task Parallel Library off limits – and is too bad  because the TPL is just totally awesome to leverage the power of multi-cores.

As it turned out, this isn’t totally correct. Govert  Van Drimmelen (the man behind ExcelDNA) and Jon Skeet (the Chuck Norris of .Net) pointed that while the Task Parallel Library is a .Net 4.0 library, the Reactive Extensions for .Net 3.5 contains an unsupported 3.5 version of the TPL – which means that it should be possible to get parallelism to work with ExcelDNA.

This isn’t a pressing need of mine, so I thought I would leave that alone, and wait for the 4.0 version of ExcelDNA. Yeah right. Between my natural curiosity, Ross McLean’s comment (have fun at the Excel UK Dev Conference!), and the fact that I really want to know if I could get the Walkenbach test to run under 1 second, without too much of an effort, I had to check. And the good news is, yep, it works.

Last time we saw how to turn an average PC into a top-notch performer; let’s see how we can inject some parallelism to get a smoking hot calculation engine.


by Mathias 7. June 2010 10:23

Some time ago, I came across ExcelDNA, an open-source library designed to integrate .Net into Excel, via a  post by the Grumpy One, who described it as an interesting way to get Excel to talk to a compiled library. Sounds right down my alley, but I still managed to let 6 months pass until I finally tried it.

This reminded me of another post, by J-Walk this time, where he uses a random walk simulation in VBA to benchmark system performance. Back then, I ran the VBA code, and also the equivalent C# in a console app, out of curiosity: 11.38 seconds, vs. 2.73 seconds. Why not try the same experiment, and see if we can get the best of both worlds and bring some of the C# power into Excel via ExcelDNA?

So I created a Class Library, with the following method, a close equivalent to the VBA benchmark code:

public class Experiment
  public static string RandomWalk()
     var stopwatch = new Stopwatch();
     var position = 0;
     var random = new Random();
     for (var run = 0; run < 100000000; run++)
        if (random.Next(0, 2) == 0)
     var elapsed = (double)stopwatch.ElapsedMilliseconds / 1000d;
     return "Position: " + position.ToString() + ", Time: " + elapsed.ToString();


by Mathias 3. October 2009 15:10

The current version of Akin, my free Excel worksheet comparison application, has been out for a bit now, and people have sent me some interested suggestions on how to make it better. However, my biggest personal issue so far has been speed. Opening large file hasn’t been an issue, but displaying comparisons of large worksheets (say, 200 x 200 cells) was taking a long time. The typical user for Akin is likely to be working with large files (tracking differences wouldn’t be an issue otherwise), so I had to do something about it.

I have bitten the bullet – I changed the design, and completely re-wrote the user interface where the comparison is displayed, and I hope that you will be pleased with the performance improvement. Where a 200 x 200 cells comparison took over 20 seconds to display, a 500 x 500 cells comparison is now virtually instantaneous. While I was at it, I did some cosmetic improvements on the looks as well.

You can download the new version here. Now that this performance problem is out of the way, I can get back to implementing the features that have been suggested so far. Stay tuned!


Comment RSS