Coding Challenge

The attached text file "Artist_lists_small.txt" contains the favorite musical artists of 1000 users from LastFM. Each line is a list of up to 50 artists, formatted as follows:
Radiohead,Pulp,Morrissey,Delays,Stereophonics,Blur,Suede,Sleeper,The La's,Super Furry Animals\n
Band of Horses,Iggy Pop,The Velvet Underground,Radiohead,The Decemberists,Morrissey,Television\n
etc.
Write a program that, using this file as input, produces an output file containing a list of pairs of artists which appear TOGETHER in at least fifty different lists. For example, in the above sample, Radiohead and Morrissey appear together twice, but every other pair appears only once.
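
To make the definition of "appearing together" concrete, here is a minimal Java sketch that counts co-occurrences on the two sample lines only. The class name SamplePairCount and the tab-separated pair key are illustrative assumptions, not part of the challenge. Note that it enumerates every pair in each list, so it only illustrates the counting rule on a tiny sample and is not a solution for the full file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper class, used only to illustrate the counting rule.
class SamplePairCount {
    // Returns, for each unordered pair of artists, the number of lists containing both.
    static Map<String, Integer> countPairs(List<String> lines) {
        Map<String, Integer> counts = new HashMap<>();
        for (String line : lines) {
            // Sort and de-duplicate so that each pair has a single canonical key.
            TreeSet<String> unique = new TreeSet<>();
            for (String name : line.split(",")) {
                String trimmed = name.trim();
                if (!trimmed.isEmpty()) {
                    unique.add(trimmed);
                }
            }
            List<String> artists = new ArrayList<>(unique);
            for (int i = 0; i < artists.size(); i++) {
                for (int j = i + 1; j < artists.size(); j++) {
                    // Assumption: artist names never contain a tab character.
                    counts.merge(artists.get(i) + "\t" + artists.get(j), 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            "Radiohead,Pulp,Morrissey,Delays,Stereophonics,Blur,Suede,Sleeper,The La's,Super Furry Animals",
            "Band of Horses,Iggy Pop,The Velvet Underground,Radiohead,The Decemberists,Morrissey,Television");
        // Print only the pairs that occur in more than one list.
        countPairs(sample).forEach((pair, n) -> {
            if (n > 1) {
                System.out.println(pair.replace("\t", " & ") + ": " + n);
            }
        });
        // prints "Morrissey & Radiohead: 2"
    }
}
```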
Your solution cannot store a list of all possible pairs of bands (don't use a 'brute force' approach). Your solution MAY return a best guess, i.e. pairs which appear together at least 50 times with high probability, as long as you explain why this tradeoff improves the performance of the algorithm. Please include, either in comments or in a separate file, a brief one-paragraph description of any optimizations you made and how they impact the run-time of the algorithm.
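
One possible way to respect the constraint above is Apriori-style pruning, a technique borrowed from frequent-itemset mining: a pair can only appear together in at least 50 lists if each of its two artists appears in at least 50 lists on its own. The sketch below is offered as one option rather than the intended solution. It makes two passes, counting artists first and then counting only pairs whose members both passed the threshold, so the full set of possible pairs is never built, and the result is exact rather than probabilistic. The class name FrequentPairs, the output file name "pairs_output.txt", and the UTF-8 encoding are assumptions that the challenge text does not specify.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical class name; a sketch of Apriori-style pruning, not a reference solution.
class FrequentPairs {
    // Splits one input line into a sorted, de-duplicated list of artist names.
    static List<String> parse(String line) {
        TreeSet<String> unique = new TreeSet<>();
        for (String name : line.split(",")) {
            String trimmed = name.trim();
            if (!trimmed.isEmpty()) {
                unique.add(trimmed);
            }
        }
        return new ArrayList<>(unique);
    }

    // Returns "artistA,artistB" for every pair found together in at least `threshold` lists.
    static List<String> frequentPairs(List<String> lines, int threshold) {
        // Pass 1: count the number of lists each individual artist appears in.
        Map<String, Integer> artistCount = new HashMap<>();
        for (String line : lines) {
            for (String artist : parse(line)) {
                artistCount.merge(artist, 1, Integer::sum);
            }
        }
        // Pass 2: count pairs, skipping artists that cannot be part of a frequent pair.
        Map<String, Integer> pairCount = new HashMap<>();
        for (String line : lines) {
            List<String> kept = new ArrayList<>();
            for (String artist : parse(line)) {
                if (artistCount.get(artist) >= threshold) {
                    kept.add(artist);
                }
            }
            for (int i = 0; i < kept.size(); i++) {
                for (int j = i + 1; j < kept.size(); j++) {
                    // Assumption: artist names never contain a tab character.
                    pairCount.merge(kept.get(i) + "\t" + kept.get(j), 1, Integer::sum);
                }
            }
        }
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : pairCount.entrySet()) {
            if (entry.getValue() >= threshold) {
                result.add(entry.getKey().replace("\t", ","));
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Input name comes from the challenge; the encoding and output name are assumed.
        List<String> lines = Files.readAllLines(
            Paths.get("Artist_lists_small.txt"), StandardCharsets.UTF_8);
        Files.write(Paths.get("pairs_output.txt"), frequentPairs(lines, 50), StandardCharsets.UTF_8);
    }
}
```

With at most 50 artists per list, a single list contributes at most 50 * 49 / 2 = 1225 pairs, so even without pruning the second pass performs at most 1,225,000 increments over 1000 lists. Pruning shrinks both that work and the size of the pair map, since only artists that reach the threshold on their own are ever paired.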
Your solution should preferably be implemented in Java or PHP. Other languages may be considered on a case-by-case basis.