Hi DiS tech team.
I've just searched Scroobius Pip. On looking at his profile I noticed that Nestor had rated him 0/10.
I then noticed that the 'Users who liked...' function said 'Users who liked Scroobius Pip also liked Bad Sandwich'.
I thought 'NO-ONE likes Bad Sandwich', so I went to have a look.
I noticed that Nestor had also rated Bad Sandwich 0/10.
So clearly the technology doesn't take into account HOW the bands it links were actually rated. It should really say 'Users who commented on artist x also commented on artist y'.
Can this be refined in any way?
I'm so negative :(
I didn't want to convey that impression :(
I disagree with you on Pip but there should be minus ratings for bands like Bad Sandwich.
don't start that again!
there was originally a negative ratings system on this incarnation of DiS and EVERYONE kicked off about it.
oops, I didn't think people were actually serious about that!
It's based on correlation
correlation between negatives counts just as much as that between positives
-1 * -1 = 1
which is a roundabout way of saying, for now, nope, deal with it :)
Guess we could change the language though, to 'opinions on Scroobius Pip are highly correlated with opinions on Bad Sandwich', but that doesn't read so easily.
That hurt my brain
Oh yes, I should say
that the ratings system currently treats 5 as the midpoint, ratings below 5 as negative, above 5 as positive.
This is explained on the page about it.
There are other more sophisticated algorithms we could use, but they're pretty expensive/tricky/specialised things to implement, so it depends mainly on what DiS's priorities are.
what are DiS' priorities?
out of interest
I see
It did cross my mind that both artists have a relatively low number of user comments, therefore this instance could be out of the ordinary (generally for any band you'll have more than 10 people rating them 9 or above etc.).
But cheers for explaining it!
Yeah,
when there aren't very many ratings, it's easy for one extreme case to skew the 'Users who liked...' result.
It's a pretty simple cosine similarity between ratings vectors (if that helps at all :).
I am well aware of better techniques, but no longer have the time to implement them just for fun. If someone can get enough value from them to pay me, though :)
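For anyone curious, here is a rough sketch of what a cosine similarity between ratings vectors might look like. It is not the actual DiS code. It assumes each artist is a vector with one entry per user, that ratings are shifted so 5 becomes 0, and that a user who hasn't rated an artist counts as 0 (neutral). The function name and every user other than Nestor are made up for illustration.

```python
import math

MIDPOINT = 5  # ratings below 5 count as negative, above 5 as positive


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two artists' ratings.

    Each argument maps user name -> rating on a 0-10 scale.
    Assumption: a user who has not rated an artist contributes 0.
    """
    a = {user: r - MIDPOINT for user, r in ratings_a.items()}
    b = {user: r - MIDPOINT for user, r in ratings_b.items()}

    # Only users who rated both artists contribute to the dot product.
    dot = sum(a[user] * b[user] for user in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))

    if norm_a == 0 or norm_b == 0:
        return 0.0  # nothing but neutral ratings: no signal either way
    return dot / (norm_a * norm_b)


# One user gives both artists 0/10 and nobody else has rated them:
# the two vectors point the same way, so the similarity is maximal.
lonely = cosine_similarity({"nestor": 0}, {"nestor": 0})
print(lonely)  # 1.0

# With more (hypothetical) raters, that one extreme case is diluted.
diluted = cosine_similarity(
    {"nestor": 0, "user_a": 9, "user_b": 8},
    {"nestor": 0, "user_c": 6},
)
print(diluted)
```

With only one rater the score is as high as it can go, which is why two barely-rated artists can end up linked by a single 0/10.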
Some more explanation
Someone gives Scroobius Pip and Bad Sandwich both highly negative ratings --> positive correlation
Someone gives Scroobius Pip and Bad Sandwich both highly positive ratings --> positive correlation
Someone gives Scroobius Pip a good rating and Bad Sandwich a bad rating --> negative correlation
Someone gives Scroobius Pip a bad rating and Bad Sandwich a good rating --> negative correlation
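The four cases above come down to the sign of a product once each rating is measured from the midpoint of 5. A small sketch, assuming 0 and 10 as the "highly negative" and "highly positive" ratings (the function name is made up for illustration):

```python
MIDPOINT = 5  # ratings below 5 count as negative, above 5 as positive


def contribution(rating_pip, rating_sandwich):
    """One user's contribution to the correlation between two artists."""
    return (rating_pip - MIDPOINT) * (rating_sandwich - MIDPOINT)


both_bad = contribution(0, 0)        # -5 * -5 = 25  -> positive
both_good = contribution(10, 10)     #  5 *  5 = 25  -> positive
good_then_bad = contribution(10, 0)  #  5 * -5 = -25 -> negative
bad_then_good = contribution(0, 10)  # -5 *  5 = -25 -> negative

print(both_bad, both_good, good_then_bad, bad_then_good)
```

Two bad ratings multiply out to the same positive number as two good ones, which is the "-1 * -1 = 1" point.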
Yeah, I get it.
And I have no clue how to explain this, but including positive correlations between negative ratings in the algorithm can be useful in the context, because people who hate bands of a certain genre tend to hate them all alike. So if someone gives Oasis 0 and Stereophonics 0 - as MOR indie bands - it's a reasonable-ish indication that someone with different musical taste, someone who likes Oasis, for example, will also like Stereophonics.
Now I know how it works I'm much more comfortable with it. As you can tell from my original post I thought it was much more simplistic than it is.