Mathias Brandewinder on .NET, VSTO and Excel development, and quantitative analysis.
by Mathias 6. May 2012 12:48

I am digging back into the Bumblebee code base, to clean it up before talking at the New England F# user group in Boston in June. As usual for me, it’s a humbling experience to face my own code, 6 months later, or, if you are an incorrigible optimist, it’s great to see that I am so much smarter today than a few months ago…

In any case, while toying with one of the samples, I noted that performance was degrading pretty steeply as the size of the problem was increasing. Most of the action revolved around producing random shuffles of a list, so I figured it would be interesting to look into it and see where I messed up how this could be improved upon.

Here is the original code, a quick-and-dirty implementation of the Fisher-Yates shuffle:

open System

let swap fst snd i =
   if i = fst then snd else
   if i = snd then fst else
   i

let shuffle items (rng: Random) =
   let rec shuffleTo items upTo =
      match upTo with
      | 0 -> items
      | _ ->
         let fst = rng.Next(upTo)
         let shuffled = List.permute (swap fst (upTo - 1)) items
         shuffleTo shuffled (upTo - 1)
   let length = List.length items
   shuffleTo items length

[<EntryPoint>]
let main argv = 
     let test = [1..10000]
     let random = new Random()
     let shuffled = shuffle test random
     System.Console.WriteLine("done...")
     0

Running the test case in fsi, using #time, produces the following:

Real: 00:00:42.735, CPU: 00:00:42.734, GC gen0: 2307, gen1: 6, gen2: 0

(Digression: #time is absolutely awesome – just typing #time;; in a fsi session will automatically display performance information, allowing to quickly tweak a function and fine-tune it “on the fly”. I wish I had known about it earlier.)

My initial assumption was that the problem revolved around performing multiple permutations of a List. However, I figured it would be interesting to take the opportunity and use the Performance Analysis tools provided in VS11 – and here is what I got:

image

Uh-oh. Looks like the shuffle is spending most of its time doing comparisons in the swap function – and what the hell is HashCompare.GenericEqualityIntrinsic doing in here? Something is off.

Looking into the swap function provides a hint:

image

F# has identified that the function could be made generic. It’s great, but in our case it comes with overhead, because we simply want to compare integers. Let’s mark the function as inline, to avoid that problem (we could also make the function non-generic, by marking one of the inputs as integer):

image

More...

by Mathias 21. April 2012 05:34

I presented “For Those About to Mock”, an introduction to Mocking for C# developers, at the San Francisco chapter of Bay .NET last week, and promised I would make the material available.

Here it is: you can download the code “For Those About to Mock” here.

Thanks for everyone who made it, it was a great crowd with lots of good questions – I had a great time!

by Mathias 7. April 2012 10:50

For no clear reason, I got interested in Convex Hull algorithms, and decided to see how it would look in F#. First, if you wonder what a Convex Hull is, imagine that you have a set of points in a plane – say, a board – and that you planted a thumbtack on each point. Now take an elastic band, stretch it, and wrap it around the thumbtacks. The elastic band will cling to the outermost tacks, leaving some tacks untouched. The convex hull is the set of tacks that are in contact with the elastic band; it is convex, because if you take any pair of points from the original set, the segment connecting them remains inside the hull.

The picture below illustrates the idea - the blue thumbtacks define the Convex Hull; all the yellow tacks are included within the elastic band, without touching it.

Convex-Hull

There are a few algorithms around to identify the Convex Hull of a set of points in 2 dimensions; I decided to go with Andrew’s monotone chain, because of its simplicity.

The insight of the algorithm is to observe that if you start from the leftmost tack, and follow the elastic downwards, the elastic turns only clockwise, until it reaches the rightmost tack. Similarly, starting from the right upwards, only clockwise turns happen, until the rightmost tack is reached. Given that the left- and right-most tacks belong to the convex hull, the algorithm constructs the upper and lower part of the hull by progressively constructing sequences that contain only clockwise turns.

More...

by Mathias 1. April 2012 09:47

Last week’s StackOverflow newsletter contained a fun problem I had never seen before: Bipartite Matching. Here is the problem:

There are N starting points (purple) and N target points (green) in 2D. I want an algorithm that connects starting points to target points by a line segment (brown) without any of these segments intersecting (red) and while minimizing the cumulative length of all segments.

Image from the original post on StackOverflow

I figured it would be fun to try out Bumblebee, my artificial bee colony library, on the problem. As the accepted answer points out, the constraint that no segment should intersect is redundant, and we only need to worry about minimizing the cumulative length, because reducing the length implies removing intersections.

As usual with Bumblebee, I’ll go first with the dumbest thing that could work. The solution involves matching points from two lists, so we’ll define a record type for Point and represent a Solution as two (ordered) lists of points, packed in a Tuple:

type Point = { X: float; Y: float }
let points = 100
let firstList = [ for i in 0 .. points -> { X = (float)i ; Y = float(i) } ]
let secondList =  [ for i in 0 .. points -> { X = (float)i ; Y = float(i) } ]

let root = firstList, secondList
We’ll start with a silly problem, where the 2 lists are identical: the trivial solution here is to match each point with itself, resulting in a zero-length, which will be convenient to see how well the algorithm is doing and how far it is from the optimum.

How can we Evaluate the quality of a solution? We need to pair up the points of each of the lists, compute the distance of each pair, and sum them up – fairly straightforward:

let distance pair =
   ((fst pair).X - (snd pair).X) ** 2.0 + ((fst pair).Y - (snd pair).Y) ** 2.0

let evaluate = fun (solution: Point list * Point list) -> 
   List.zip (fst solution) (snd solution)
   |> List.sumBy (fun p -> – distance p)

More...

by Mathias 25. March 2012 10:53

In a previous post, I looked at creating a Sierpinski triangle using F# and WPF. One of the pieces I was not too happy about was the function I used to transform a Triangle into a next generation triangle:

type Point = { X:float; Y:float }
type Triangle = { A:Point; B:Point; C:Point }

let transform (p1, p2, p3) =
   let x1 = p1.X + 0.5 * (p2.X - p1.X) + 0.5 * (p3.X - p1.X)
   let y1 = p1.Y + 0.5 * (p2.Y - p1.Y) + 0.5 * (p3.Y - p1.Y)
   let x2 = p1.X + 1.0 * (p2.X - p1.X) + 0.5 * (p3.X - p1.X)
   let y2 = p1.Y + 1.0 * (p2.Y - p1.Y) + 0.5 * (p3.Y - p1.Y)
   let x3 = p1.X + 0.5 * (p2.X - p1.X) + 1.0 * (p3.X - p1.X)
   let y3 = p1.Y + 0.5 * (p2.Y - p1.Y) + 1.0 * (p3.Y - p1.Y)
   { A = { X = x1; Y = y1 }; B = { X = x2; Y = y2 }; C= { X = x3; Y = y3 }}

Per se, there is nothing wrong with the transform function: it takes 3 points (the triangle corners), and returns a new Triangle. However, what is being “done” to the triangle is not very expressive – and the code looks rather ugly, with clear duplication (the exact same operation is repeated on the X and Y coordinates of every point).

Bringing back blurry memories from past geometry classes, it seems we are missing the notion of a Vector. What we are doing here is taking corner p1 of the Triangle, and adding a linear combinations of the edges p1, p2 and p1, p3 to it, which can be seen as 2 Vectors (p2 – p1) and (p3 – p1). Restated that way, here is what the transform function is really doing:

A –> A + 0.5 x AB + 0.5 x AC

A –> A + 1.0 x AB + 0.5 AC

A –> A + 0.5 x AB + 1.0 x AC 

In graphical form, the first transformation can be represented as follows:

image

In order to achieve this, we need to define a few elements: a Vector, obviously, a way to create a Vector from two Points, to add Vectors, to scale a Vector by a scalar, and to translate a Point by a Vector. Let’s do it:

type Vector = 
   { dX:float; dY:float }
   static member (+) (v1, v2) = { dX = v1.dX + v2.dX; dY = v1.dY + v2.dY }
   static member (*) (sc, v) = { dX = sc * v.dX; dY = sc * v.dY }

type Point = 
   { X:float; Y:float }
   static member (+) (p, v) = { X = p.X + v.dX; Y = p.Y + v.dY }
   static member (-) (p2, p1) = { dX = p2.X - p1.X; dY = p2.Y - p1.Y }

type Triangle = { A:Point; B:Point; C:Point }

Thanks to operators overloading, the transform function can now be re-phrased in a much more palatable way:

let transform (p1:Point, p2, p3) =
   let a = p1 + 0.5 * (p2 - p1) + 0.5 * (p3 - p1)
   let b = p1 + 1.0 * (p2 - p1) + 0.5 * (p3 - p1)
   let c = p1 + 0.5 * (p2 - p1) + 1.0 * (p3 - p1)
   { A = a; B = b; C = c }

… and we are done. The code (posted on fsSnip.net) works exactly as before, but it’s way clearer.

It can also be tweaked more easily now. I got curious about what would happen if slightly different transformations were applied, and the results can be pretty fun. For instance, with a minor modification of the transform function…

let transform (p1:Point, p2, p3) =
   let a = p1 + 0.55 * (p2 - p1) + 0.5 * (p3 - p1)
   let b = p1 + 1.05 * (p2 - p1) + 0.45 * (p3 - p1)
   let c = p1 + 0.5 * (p2 - p1) + 0.95 * (p3 - p1)
   { A = a; B = b; C = c }

… we get the following, bloated “Sierpinski triangle”:

BeerPinski-triangle

Add a bit of transparency, some more tweaks of the linear combinations,

let transform (p1:Point, p2, p3) =
   let a = p1 + 0.3 * (p2 - p1) + 0.6 * (p3 - p1)
   let b = p1 + 0.8 * (p2 - p1) + 0.3 * (p3 - p1)
   let c = p1 + 0.6 * (p2 - p1) + 1.1 * (p3 - p1)
   { A = a; B = b; C = c }

and things get much wilder:

SnowflakePinski-triangle

I don’t think these are really Sierpinski triangles any more, but I had lots of fun playing with this, and figured someone else might enjoy it, too… If you find a nice new combination, post it in the comments!

Source code: fsSnip.net

Comments

Comment RSS