### 100 Locomotives Problem

100th Puzzle \m/ \m/

Source: "Fifty Challenging Problems in Probability" by Mosteller F.

Problem:
A railroad numbers its locomotives in order 1,2,3.. N. One day you see a locomotive and see that its number is 100. Guess how many locomotives the company has.

You have looked at 5 locomotives and the largest number you have seen is 100. Again guess how many locomotives does the company have?

Update(26/03/10):
This is not an exact question. I need to see various approaches to solve this problem. I see this as a real problem: I saw some locomotives and then someone asks me - Guess the number of locomotives.
Since I have not provided enough data, all "creative" answers are correct.

1. The first part is easy.Assuming each locomotive has probability 1/N of being spotted, the Expected number of locomotive spotted at a point of time is (N+1)/2.Which yields N=199.

For the 2nd part, the expected value of the maximum number seen is:
1/C(n,5) * ( n*C(n,4) + (n-1)*C(n-1,4) +........+ 4*C(4,4))

Couldn't solve this for n.
But a general intuition says that the 5 randomly chosen numbers would in a long run be centered around equi-spaced points in the interval (1,N) i.e. around N/6, 2N/6,....5N/6 respectively.
So that the expected value of maximum is 5N/6 or N=120.

2. @Aman:- That makes no sense. Perhaps there's some part I'm missing here.

You have to maximise P(N|100). What you have done is to solve E[x|N]=100. The two have no obvious connection.

I doubt if you can solve this unless you have some sort of idea about the distribution of N. Clearly if it is a real company, N cannot be arbitrarily large(there's a limit to how many rail carriages can be made from the Earth's crust)

Thus, we could probably assume a gaussian distribution for N centered around some mean with some variance and then maximise P(N|100).

3. @Tiger- The difference in our views is due to the way we have interpreted the question. You have treated N as a random variable and I have treated it as an unknown which we need to guess.

In my case, the random variable is the number of the vehicle spotted on a particular day whose density I have assumed to be uniform. The only dilemma I have in my mind is that I have used Expectation measure to predict the outcome of a single trial, which in practice is ought to give erroneous results.
But I had no other choice.

Conclusively, I guess the question should be more specific about what assumptions we need to make.

4. @Everyone,
This is not an exact question. This is more of a qualitative question. I need to see various approaches to solve this problem. I see this as a real problem: I am seeing locomotives and someone asks me: "Guess the number of locomotives".

Since I have not provided enough data, all "creative" answers are correct.