Mathias Brandewinder on .NET, VSTO and Excel development, and quantitative analysis.
by Mathias 24. July 2010 17:35

When I moved to California a few years back, I soon realized that to get anything done in Silicon Valley, you pretty much have to have a car. So I purchased my first car. Fast forward to today: I live in San Francisco now, and I have noticed that I am driving less and less. A bicycle is very convenient in my neighborhood, and I don’t have to commute to work on a daily basis. Which got me thinking – do I really need a car? Relying on public transportation alone is not an option, because coverage is too spotty, but what about using a car sharing service?

The two major services available in my area are ZipCar and CityCarShare. Their pricing systems are largely similar; they both:

  • charge by the hour of usage,
  • charge a higher rate on weekends,
  • offer a discount for full-day rental,
  • have a pay-as-you-go option, and better rates with minimum commitment plans.

Both include gas, with one difference: ZipCar charges by the hour only, whereas CityCarShare has hybrid pricing, with a lower per-hour cost plus a per-mile cost.

By contrast, when you own a car, you

  • pay a large upfront investment (buying the car),
  • recoup some of the upfront cost if you resell eventually,
  • pay regular fixed costs (insurance, registration taxes, garage),
  • pay by the mile (gas),
  • pay some additional costs, like maintenance, which are somewhat linked to mileage.

In addition to that, you bear the risk that your car gets damaged or totaled in an accident.


by Mathias 28. June 2010 13:14

A client recently asked me a fun probability question, which revolved around figuring out the probability of success of a research program. In a simplified form, here is the problem: imagine that you have multiple labs, each developing products which have independent probabilities of succeeding – what is the probability that more than a certain number of products are eventually successful?

Let’s illustrate with a simple example. Product A has a 30% probability of success, and product B a 60% probability of success. Combining these into a probability tree, we work out that there is an 18% chance of both products succeeding, an 18% + 12% + 42% = 72% chance of 1 or more products succeeding, and a 28% chance of total failure.

[Figure: SimpleBinaryTree, the probability tree for products A and B]

It’s not a very complicated theoretical problem. Practically, however, when the number of products increases, the number of outcomes becomes large, fairly fast – and working out every single combination by hand is extremely tedious.

Fortunately, using a simple trick, we can generate these combinations with minimal effort. The representation of integers in base 2 is a decomposition in powers of 2, resulting in a unique sequence of 0 and 1. In our simplified example, if we consider the numbers 0, 1, 2 and 3, their decomposition is

0 = 0 x 2^1 + 0 x 2^0 –> 00

1 = 0 x 2^1 + 1 x 2^0 –> 01

2 = 1 x 2^1 + 0 x 2^0 –> 10

3 = 1 x 2^1 + 1 x 2^0 –> 11

As a result, if we consider a 1 to encode the success of a product, and a 0 its failure, the binary representation of the integers from 0 to 3 gives us all possible outcomes for our two-product scenario.
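To make the trick concrete, here is a minimal sketch in Python (not the language used elsewhere on this blog, and not necessarily how the full post implements it). It walks through the integers from 0 to 2^n - 1, reads each bit as the success or failure of one product, and accumulates the probability of each number of successes. The function name `success_count_distribution` is made up for this illustration.

```python
def success_count_distribution(probabilities):
    """Return a list where entry k is the probability of exactly k successes.

    Each integer from 0 to 2**n - 1 encodes one outcome: bit i set to 1
    means product i succeeds, 0 means it fails. Products are assumed
    to be independent, as in the problem statement.
    """
    n = len(probabilities)
    distribution = [0.0] * (n + 1)
    for outcome in range(2 ** n):
        outcome_probability = 1.0
        successes = 0
        for i, p_success in enumerate(probabilities):
            if (outcome >> i) & 1:
                # Bit i is 1: product i succeeds.
                outcome_probability *= p_success
                successes += 1
            else:
                # Bit i is 0: product i fails.
                outcome_probability *= 1.0 - p_success
        distribution[successes] += outcome_probability
    return distribution


# Two-product example: A succeeds with 30%, B with 60%.
dist = success_count_distribution([0.3, 0.6])
at_least_one = sum(dist[1:])
print([round(p, 2) for p in dist])  # [0.28, 0.54, 0.18]
print(round(at_least_one, 2))       # 0.72
```

The number of outcomes doubles with every product added, so this brute-force enumeration is only practical for a moderate number of products.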


by Mathias 13. June 2010 12:30

In my last post I explored how ExcelDNA can be used to write high-performance UDFs for Excel, calling .Net code without the overhead of VSTO. Using .Net instead of VBA for intensive computations already yields a nice improvement. Still, I regretted that ExcelDNA supports .Net up to 3.5 only, which puts the Task Parallel Library off limits – too bad, because the TPL is just totally awesome for leveraging the power of multiple cores.

As it turned out, this isn’t totally correct. Govert Van Drimmelen (the man behind ExcelDNA) and Jon Skeet (the Chuck Norris of .Net) pointed out that while the Task Parallel Library is a .Net 4.0 library, the Reactive Extensions for .Net 3.5 contain an unsupported 3.5 version of the TPL – which means that it should be possible to get parallelism to work with ExcelDNA.

This isn’t a pressing need of mine, so I thought I would leave that alone, and wait for the 4.0 version of ExcelDNA. Yeah right. Between my natural curiosity, Ross McLean’s comment (have fun at the Excel UK Dev Conference!), and the fact that I really want to know if I could get the Walkenbach test to run under 1 second, without too much of an effort, I had to check. And the good news is, yep, it works.

Last time we saw how to turn an average PC into a top-notch performer; let’s see how we can inject some parallelism to get a smoking hot calculation engine.


by Mathias 28. May 2010 08:29

Via the INFORMS newsletter, I found out about this cool competition (trains and analytics, how much cooler can it get?): can you find the best plan to refuel the locomotives of a railroad company?

Create a cost-effective plan to fuel the locomotives that power a railroad's trains. Specify how many fuel trucks to contract at each yard and how much fuel to dispense into the locomotives of trains that run over a specified time horizon. Ensure that no locomotive runs out of fuel en route between yards. Sounds easy? We'll see about that!

Learn more about the challenge at the competition website, and register by June 15, for glory, bragging rights, a shot at a first prize of $2,500 – and help spare some poor chaps an unpleasant day:

by Mathias 14. January 2010 13:41

In the previous installment, we discussed the dynamics of a (very) simple network of queues, and showed how much extra capacity was required to accommodate the build-up of population inside the queue, based on two factors: the rate at which people enter and leave the queue.

Today, we will look at a related question. Last time we determined the expected queue size at equilibrium, given the flow of people into the queue. This time, we want to consider the reverse problem: if you knew how many people are in the queue at equilibrium, what population breakdown would you expect between the two queues?

The question may sound theoretical – it isn’t. If you knew the total size of a market, the relative preferences of consumers between the products, and how long it takes them to replace their product, then determining how many consumers would be using each product at any given time is equivalent to the question we are considering.

Let’s illustrate with a fictional example. Imagine there is a disease, which can be treated two ways – using a blue pill, or a red pill. Doctors prescribe the blue pill to 25% of the patients, and the red one to 75%. The blue pill treatment takes 5 weeks, and the red pill treatment 8 (which we convert to average rates of exit of 0.2 and 0.125 per week). Suppose you knew that currently, 1000 people were under treatment: how many patients would you expect to be treated with a blue pill?

(picture from www.hackthematrix.org)
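One way to approach the question, sketched below in Python, rests on an assumption: the system is at equilibrium, so for each pill the number of patients leaving per week (population times exit rate) equals the number entering (prescription share times total inflow). Under that assumption, each population is proportional to its prescription share multiplied by its treatment duration. The function name `equilibrium_split` is hypothetical, and this may not be the derivation used in the full post.

```python
def equilibrium_split(total_population, shares, durations):
    """Split a steady-state population across treatments.

    Assumes equilibrium: for each treatment, inflow equals outflow, so
    its population is proportional to share * duration (equivalently,
    share / exit rate).
    """
    weights = [share * duration for share, duration in zip(shares, durations)]
    total_weight = sum(weights)
    return [total_population * weight / total_weight for weight in weights]


# Blue pill: 25% of prescriptions, 5 weeks. Red pill: 75%, 8 weeks.
blue, red = equilibrium_split(1000, [0.25, 0.75], [5, 8])
print(round(blue), round(red))  # 172 828
```

Note that the blue pill accounts for 25% of prescriptions but only about 17% of patients under treatment, because its shorter duration means patients leave that queue faster.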

