Mathias Brandewinder on .NET, VSTO and Excel development, and quantitative analysis.
by Mathias 28. June 2010 13:14

A client asked me recently a fun probability question, which revolved around figuring out the probability of success of a research program. In a simplified form, here is the problem: imagine that you have multiple labs, each developing products which have independent probabilities of succeeding – what is the probability of more than a certain number of products being eventually successful?

Let’s illustrate on a simple example. Product A has a 30% probability of success, and product B a 60% probability of success. Combining these into a probability tree, we work out that there is an 18% chance of having 2 products successful, 18% + 12 % + 42% = 72% chance of having 1 or more products succeed, and 28% chances of a total failure.

SimpleBinaryTree

It’s not a very complicated theoretical problem. Practically, however, when the number of products increases, the number of outcomes becomes large, fairly fast – and working out every single combination by hand is extremely tedious.

Fortunately, using a simple trick, we can generate these combinations with minimal effort. The representation of integers in base 2 is a decomposition in powers of 2, resulting in a unique sequence of 0 and 1. In our simplified example, if we consider the numbers 0, 1, 2 and 3, their decomposition is

0 = 0 x 2^2 + 0 x 2^1 –> 00

1 = 0 x 2^2 + 1 ^ 2^1 –> 01

2 = 1 x 2^2 + 0 x 2^1 –> 10

3 = 1 x 2^2 + 1 x 2^2 –> 11

As a result, if if consider a 1 to encode the success of a product, and a 0 its failure, the binary representation of integers from 0 to 3 gives us all possible outcomes for our two-products scenario.

More...

by Mathias 14. January 2010 13:41

In the previous installment, we discussed the dynamics of a (very) simple network of queues, and showed how much extra capacity was required to accommodate the build-up of population inside the queue, based on two factors: the rate at which people enter and leave the queue.

Today, we will look at a related question. Last time we determined the expected queue size at equilibrium, given the flow of people into the queue. This time, we want to consider the reverse problem: if you knew how many people are in the queue at equilibrium, what population breakdown would you expect between the two queues?

The question may sound theoretical – it isn’t. If you knew the total size of a market, the relative preferences of consumers between the products, and how long it takes them to replace their product, then determining how many consumers would be using each product at any given time is equivalent to the question we are considering.

thechoice Let’s illustrate on a fictional example. Imagine there is a disease, which can be treated two ways – using a blue pill, or a red pill. Doctors prescribe the blue pill to 25% of the patients, and the red one to 75%. The blue pill treatment takes 5 weeks, and the red pill treatment 8 (which we convert to average rates of exit of 0.2 and 0.125 per week). Suppose you knew that currently, 1000 people were under treatment: how many patients would you expect to be treated with a blue pill?

(picture from www.hackthematrix.org)

More...

by Mathias 20. September 2009 07:10

I was recently inspired by an article in OR/MS mag to write 2 posts on using Excel data tables to resolve optimization problems. As it turns out, these puzzles are not only published in the magazine: they also have their online home at Puzzlor.com. So if you like optimization and math problems in general, and like puzzles, check it out!

by Mathias 18. September 2009 06:12

ghostmantisI found a bug in my code the other day. It happens to everybody - apparently I am not the only one to write bugs – but the bug itself surprised me. In my experience, once you know a piece of code is buggy, it’s usually not too difficult to figure out what the origin of the problem might be (fixing it might). This bug surprised me, because I knew exactly the 10 lines of code where it was taking place, and yet I had no idea what was going on – I just couldn’t see the bug, even though it was floating in plain sight (hint: the pun is intended).

Here is the context. The code reads a double and converts it into a year and a quarter, based on the following convention: the input is of the form yyyy.q, for instance, 2010.2 represents the second quarter of 2010. Anything after the 2nd decimal is ignored, 2010.0 is “rounded up” to 1st quarter, and 2010.5 and above rounded down to 4th quarter.

Here is my original code:

public class DateConverter
{
    public static int ExtractYear(double dateAsDouble)
    {
        int year = (int)dateAsDouble;
        return year;
    }

    public static int ExtractQuarter(double dateAsDouble)
    {
        int year = ExtractYear(dateAsDouble);
        int quarter = (int)(10 * (Math.Round(dateAsDouble, 1) - (double)year));
        if (quarter < 1)
        {
            quarter = 1;
        }
        if (quarter > 4)
        {
            quarter = 4;
        }
        return quarter;
    }
}

Can you spot the bug?

More...

by Mathias 3. September 2009 17:50

In my last post, I illustrated how to quickly to pick the best value from a selection to get the optimal result, by using Excel Data Tables. This time, we will see how to pick the best possible pair of values.

We are trying to figure out which 2 bridges we should build, in order to minimize the overall travel time for the inhabitants of the island.

Island

I worked out the math for one bridge last time. We will start we a similar setup, but adjust our spreadsheet so that for each islander, we compute the travelling distance for 2 bridges, and select the shortest route.

image

The ranges B1 and B2 are named Bridge1 and Bridge2. Column I now contains the formula computing the shortest route for each islander. For row 5 for instance, the formula is

=MIN(D5+E5-H5,F5+G5-H5)

Cell I10 is the total of the vertical distances travelled by each individual.

We can select from 4 bridge locations: 2, 4, 7 and 12. What we need is to find out which 2 numbers give us the lowest total travel. Let’s build our data table, this time using 2 bridge positions.

More...

Comments

Comment RSS