No, the problem I bring is not from the land of Harry Potter
One of the standard ways of applying collaborative filtering here is to use a similarity measure, a popular one being cosine similarity. You represent product A as a vector over all customers (a 1 for each customer who bought it), and do the same for product B. Then sim(A, B) = A.B / |A||B|. If A is the product a user has already bought, you wish to find the B that maximizes sim(A, B). One can see that if a Harry Potter book, H, shares a large enough set of customers with A, it may drown out products with fewer purchases. Also, cosine similarity is symmetric, which does not appear to be a desirable property here: if, say, someone who buys "Seven Habits of Highly Effective People" is also likely to buy a Harry Potter book, it does not follow that a Harry Potter buyer should be recommended "Seven Habits".
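To make this concrete, here is a minimal sketch of cosine similarity over binary purchase vectors, represented as sets of customer IDs (the products and purchase data are invented for illustration). It also shows the symmetry property mentioned above:

```python
import math

# Toy purchase data: for each product, the set of customer IDs who bought it.
# All names and numbers here are made up for illustration.
purchases = {
    "A":            {1, 2, 3, 4},
    "B":            {1, 2, 3},                    # small niche product, overlaps heavily with A
    "harry_potter": {1, 2, 3, 4, 5, 6, 7, 8},     # bestseller bought by nearly everyone
}

def cosine_sim(a: set, b: set) -> float:
    """For binary vectors, sim(A, B) = |A ∩ B| / (sqrt(|A|) * sqrt(|B|))."""
    if not a or not b:
        return 0.0
    return len(a & b) / (math.sqrt(len(a)) * math.sqrt(len(b)))

# Symmetry: sim(A, B) == sim(B, A), regardless of which product was bought first.
print(cosine_sim(purchases["A"], purchases["B"]))
print(cosine_sim(purchases["B"], purchases["A"]))
print(cosine_sim(purchases["A"], purchases["harry_potter"]))
```

Note that the sqrt(|B|) in the denominator does penalize popular products somewhat, but a bestseller's raw overlap |A ∩ B| can still be large enough to outrank niche products with stronger proportional affinity.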
That formulation of the problem (what is the customer likely to buy next) leads to a probabilistic definition for recommendations: P(B|A) = P(AB)/P(A), i.e. what is the probability of a customer buying B, given that s/he purchased A? Since A is a given, we want to find the B that maximizes P(AB) in the numerator on the RHS. As Richard Cole points out in the comments below, data sparsity often poses problems here, i.e. you might find P(AB) to be zero for some Bs. So as not to discount events that simply have not occurred in the training data yet, the numerator and denominator are often boosted with priors. He also points out other approaches to dealing with data sparsity below. I suppose quantifying the effect of the Harry Potter problem is hard because it depends entirely on the dataset, and getting access to real-world datasets is hard. The Netflix challenge (see below) is the best-known publicly available commercial dataset. I guess I should try out these approaches on it!
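Here is a minimal sketch of that estimator with additive (Laplace) smoothing as the prior boost, on an invented toy dataset; the product names, the prior strength alpha, and the catalogue size are all assumptions for illustration:

```python
# Estimate P(B|A) from co-purchase counts, with additive smoothing so that
# pairs never seen together still get a small nonzero probability.
from collections import Counter
from itertools import combinations

# Toy baskets: each set is one customer's purchase history (made up).
baskets = [
    {"seven_habits", "good_to_great"},
    {"seven_habits", "harry_potter"},
    {"harry_potter", "lotr"},
    {"harry_potter"},
]

count_single = Counter()
count_pair = Counter()
for basket in baskets:
    for p in basket:
        count_single[p] += 1
    for pair in combinations(sorted(basket), 2):
        count_pair[pair] += 1

def p_b_given_a(b: str, a: str, alpha: float = 1.0, n_products: int = 4) -> float:
    """Smoothed P(B|A) = (count(A and B) + alpha) / (count(A) + alpha * n_products)."""
    pair = tuple(sorted((a, b)))
    return (count_pair[pair] + alpha) / (count_single[a] + alpha * n_products)

# Rank candidate recommendations for a customer who just bought "seven_habits".
candidates = [p for p in count_single if p != "seven_habits"]
ranked = sorted(candidates, key=lambda b: p_b_given_a(b, "seven_habits"), reverse=True)
print(ranked)
```

Unlike cosine similarity, this score is asymmetric: in the toy data above, P(harry_potter | seven_habits) and P(seven_habits | harry_potter) come out different, which matches the intuition that recommendations should be directional.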
Recommended reading:
- Greg Linden's article in IEEE Spectrum
- The Netflix challenge was an interesting event on this front; here's the winning team's strategy. Note though that the focus there was more on extracting temporal relations from a movie reviews database than on dealing with the Harry Potter problem directly