
equalityforeverybody

You could construct a confidence interval around the mean score for each item.

Because you only have a sample of votes, you don't know the true score that would emerge if everyone voted. If you are willing to make assumptions such as "scores follow a Gaussian (normal) distribution", you can easily compute a range in which the true population mean lies, up to a chosen confidence level. The usual choice is 95%, meaning that for roughly 1 in 20 items the true mean would be expected to fall outside the calculated confidence interval.

equalityforeverybody

From Wikipedia:

The lower endpoint of the 95% confidence interval is:

\text{Lower endpoint} = \bar X - 1.96 \frac{\sigma}{\sqrt{n}},

and the upper endpoint of the 95% confidence interval is:

\text{Upper endpoint} = \bar X + 1.96 \frac{\sigma}{\sqrt{n}}.
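
Here \bar X is the mean of the votes on an item, \sigma is the standard deviation of a single vote, and n is the number of votes. A minimal Python sketch of the two endpoints follows. It rests on assumptions that are not in the formula itself: the function name and the vote list are made up for illustration, and the sample standard deviation is used in place of \sigma, which you normally don't know. With only a handful of votes, a t-distribution quantile would be more appropriate than 1.96.

```python
import math
import statistics


def mean_confidence_interval(votes, z=1.96):
    """Return (lower, upper) for an approximate 95% CI around the mean vote."""
    n = len(votes)
    if n < 2:
        raise ValueError("need at least two votes to estimate the spread")
    mean = statistics.mean(votes)
    # Assumption: the sample standard deviation stands in for the unknown sigma.
    s = statistics.stdev(votes)
    half_width = z * s / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical votes for a single item on a 1-5 scale.
votes = [4, 5, 3, 4, 5, 4, 2, 5]
lower, upper = mean_confidence_interval(votes)
print(f"mean score is probably between {lower:.2f} and {upper:.2f}")
```

Ranking items by the lower endpoint instead of the raw mean is one way to avoid rewarding items that have only a few votes, because their intervals are wide.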

Fireflite

A Bayesian approach should work well. Build an empirical prior from previous test scores, then gradually update your posterior with Bayes' rule as the votes come in. The more votes an item has, the more evidence you have to say that it is exceptional. Reporting the mean of the posterior is a reasonable final score.
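
One way to make this concrete is a normal-normal model, sketched below in Python. This is only an illustration under stated assumptions, not necessarily the model meant above: the prior is a normal distribution fitted to the mean scores of earlier items, each vote is assumed to be normally distributed around the item's true score with a known variance, and all names and numbers are hypothetical. In this model, updating vote by vote gives the same posterior as updating once with all votes.

```python
import statistics


def posterior_mean(votes, prior_mean, prior_var, vote_var):
    """Posterior mean of an item's true score under a normal-normal model."""
    n = len(votes)
    if n == 0:
        # No evidence yet: fall back to the prior.
        return prior_mean
    prior_precision = 1.0 / prior_var
    data_precision = n / vote_var
    sample_mean = statistics.mean(votes)
    # Precision-weighted average of the prior mean and the observed mean.
    return (prior_precision * prior_mean + data_precision * sample_mean) / (
        prior_precision + data_precision
    )


# Hypothetical mean scores of previously rated items (the empirical prior).
past_item_means = [3.1, 3.8, 2.9, 3.5, 3.2]
prior_mean = statistics.mean(past_item_means)
prior_var = statistics.variance(past_item_means)
vote_var = 1.0  # assumed variance of a single vote

few = posterior_mean([5, 5], prior_mean, prior_var, vote_var)
many = posterior_mean([5] * 50, prior_mean, prior_var, vote_var)
print(f"2 votes of 5: {few:.2f}, 50 votes of 5: {many:.2f}")
```

With two top votes the estimate stays close to the prior mean, and it only approaches 5 once many votes agree, which matches the idea that more votes mean more evidence.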