### Coupon Collector Problem

Source: Peter Winkler book, Placement quant company interview, Wikipedia

Problem:
1) How many times do you need to flip a coin before you expect to see both heads and tails?
2) How many times do you need to roll a die before you expect to see all the numbers 1-6?
3) A company — say Coca-Cola, for concreteness — is holding a contest where everyone who collects one each of n different "coupons" wins some prize. You get a coupon with each purchase of a Coke, and each coupon is equally likely. What’s the expected number of Cokes you have to buy in order to collect all the coupons?

Update (10/01/10): Solution posted by Giridhar (IITK alumnus, Yahoo! Engineer) in comments!!

1. when do you expect an event to happen? i guess when its probability is more than 1/2. if thats the case, here is the solution.
solving the last part first, let us calculate the probability of the event not happening, that is, he is not able to collect at least one coupon, that is, every time he takes a coupon, it is out of the n-1 remaining coupons. probability of that happening is (n-1)/n. let k be the number of coke bottles required. then possibility of event not happening is n*((n-1)/n)^k (we multiply by n because it could be any of the n coupons he doesn't get). this should be less than 1/2. thus, k is the ceiling function of log(1/2n)/log((n-1)/n).
the first two parts are similar. n being 2 in the first part, and 6 in the second. the answers being 3 for the first part and 14 for the second.
is it right?

2. actually the ceiling function of log(1/2n)/log((n-1)/n)+epsylon, to make it greater than 50%

3. @suyash..
In how many chances I expect E to happen? is equivalent to
What is the expectation value of the number of chances for E to happen.

For eg: Expected number of times I need roll a dice to get a 2 is 6.
This is because probability of getting 2 in x rolls is (5/6)^(x-1)1/6 = 5^(x-1)/6^x

So, Expected value = sigma (x=1 to infinity) (x/5)*(5/6)^x
Expected value = 6

4. Divide the whole process in to stages. Each stage ends with collecting a coupon which is new ( different from the coupons we already have )

Let C_i be the random variable which denotes the number of coupons we buy in the stage i.

Let C be the random variable which denoted the number of coupons we buy in order to get all n coupons
Now,
C= C_1 + C_2 + ... + C_n
Note C_1 = 1;
In general, while we are during stage i , we already have with us (i-1) different coupons. So, the probability of getting a new coupon is p_i = n-(i-1)/n. Also note that each C_i is a Geometrically distributed random variable with success probability equal to p_i. So, E(C_i) = 1/p_i = n/(n-i+1).
By using Linearity of Expectations,
E(C) = E(C_1) +E(C_2) + ...+E(C_n)
= n(1/n + 1/(n-1)+ ... +1)
~ nlogn.

5. @Giridhar.. Nice solution.. Thanx

Just finding the answers using the way suggested by Giridhar:

1) 2/2+2/1 = 3
2) 6/6 + 6/5 + 6/4 + 6/3 + 6/2 + 6/1 = 14.7
3) n*(H_n) ~ n(ln n)