Community Ratings/Rankings for tournament performance

Ortheore

Emeritus
Something I've been thinking about a little lately is the idea of having a rating/ranking system for tournament performance. I think it's a really interesting way of visualising player skill, and it just seems like it'd be a neat feature to have, although obviously not a very important one. However, there are a few questions that must be asked:
  • Is this something people would even want?
  • How would the rating system work?
  • What distinction would be made for current vs all-time ratings?
  • What are some issues with such systems, and how can we deal with them?
I can't answer the first one since I can't speak for everyone, but I can toss ideas around for all of the others.

What are some issues with such systems?
I figured I'd throw this out there before discussing different types of systems. Aside from the usual parameters of accuracy and responsiveness, there are some other issues.
  • First is the issue of ratings inflation and deflation. This is especially prevalent when things like rating floors are implemented. One of the key features any rating system would need is perfect symmetry: a battle should reward the victor exactly as much as it penalises the loser. Either way, inflation is something that should be tracked.
  • Players with extremely high ratings face an extremely skewed risk-reward balance, with losses causing massive drops while wins offer limited gains. Gonna suggest right now that current ratings should only be based on results from the past X time period, which would mean that players who stop playing to protect a high rating will eventually see their rating removed.
  • Players with low ratings get discouraged. I think the solution for this one is simple: maintain low ratings but don't publish them (I don't like rating floors because they cause inflation).
  • Depending on the rating system, severe imbalances in rating can lead to situations where one player has literally nothing to gain even if they win (and vice versa). I think the best solution is to exclude those results from the data set if such a matchup actually occurs. Also, there are plenty of ways to tweak a given system so that these situations are highly improbable.
  • Alts and different communities. Imo we shouldn't exclude other communities: it makes no sense to ignore something like SPL just because it's on Smogon. It would have to be up to players to register alts with whoever was maintaining the rating system. Alerting the administrator to new tour results would also be worthwhile, since I don't think it's reasonable to expect one individual to monitor here/Smogon/PO for all tour results.
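To make a few of these points concrete, here is a rough Python sketch of three of the bookkeeping ideas: keeping only results from the past X days, checking that rating changes sum to zero (no inflation or deflation), and hiding low ratings without deleting them. The function names, the record layout, the 90-day window, the -50 cutoff and the sample dates are all made up for illustration; none of it comes from an existing system.

```python
from datetime import date, timedelta

def recent_results(results, today, window_days):
    """Keep only results played within the last `window_days` days."""
    cutoff = today - timedelta(days=window_days)
    return [r for r in results if r["played"] >= cutoff]

def net_change(deltas):
    """Total of all rating changes; 0 means no inflation or deflation."""
    return sum(deltas.values())

def published(ratings, hide_below):
    """Ratings shown publicly; lower ones are still stored, just not listed."""
    return {p: r for p, r in ratings.items() if r >= hide_below}

# Hypothetical sample data.
results = [
    {"winner": "A", "loser": "B", "played": date(2015, 1, 10)},
    {"winner": "B", "loser": "C", "played": date(2015, 6, 1)},
]
current = recent_results(results, today=date(2015, 6, 30), window_days=90)
visible = published({"A": 40, "B": -5, "C": -60}, hide_below=-50)
```

With this data only the June result counts towards current ratings, and player C keeps a rating internally but doesn't appear in the published list.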
All-Time rankings
Imo these should be cumulative, where players earn points based on how far they went in a tour and the quality of the opposing players. Such a system does tend to overlook poor performances, which I think is fine. A current rating system, by contrast, would balance recent wins against losses.

How would the system work?
There are multiple systems available, but afaik Elo is the most popular. I'd rather have the Elo starting point at 0 so everything balances out. This blog post explains it pretty well. PS also has a couple of other systems that are interesting and worth discussing.
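For anyone who hasn't seen it, the standard Elo update can be sketched in a few lines of Python. The K-factor of 32 and the 400-point scale are just the conventional textbook values, assumed here rather than proposed for this system, and the function names are my own.

```python
def expected_score(rating, opponent_rating):
    """Predicted score (between 0 and 1) for a player against an opponent."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400))

def elo_update(winner, loser, k=32):
    """Return the new (winner, loser) ratings after one decisive game.

    The winner gains exactly what the loser drops, so the total is unchanged.
    """
    gain = k * (1.0 - expected_score(winner, loser))
    return winner + gain, loser - gain

# Two new players, both starting at 0.
print(elo_update(0, 0))  # (16.0, -16.0)
```

Because both players use the same K, every game is zero-sum, and with a starting point of 0 the ratings of all players keep summing to 0. It also shows the skewed risk-reward problem: a player rated 400 above their opponent gains under 3 points for winning but loses about 29 for losing.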

I tried thinking of my own system, which probably isn't that great but I'll explain it anyway:

Rating = i + a*nw + b*rw - (a*nl + b*rl)
i is an arbitrary initial value (I like 0)
a and b are arbitrary constants
nw and nl are the total numbers of wins and losses respectively in the given dataset
rw is the mean rating of opponents you beat
rl is the mean negative rating of opponents you lost to
When applying this system to a completely new data set you'd probably assume all players to have a rating of i, and then repeatedly calculate the ratings, as they would change each time. Anyway, the reason I decided to post this even though there are probably gaping flaws that I'm missing is that the a and b values allow us to control how much we value quantity vs quality. I felt this was important because you need both to truly be considered good.
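Here is a rough Python sketch of the repeated calculation. It reads rl as the mean of the negated ratings of the opponents you lost to, which is one interpretation of the definition above, and it treats rw or rl as 0 when a player has no wins or no losses. The function name, the fixed number of passes and the default values of a and b are assumptions for illustration only.

```python
def rate(results, i=0.0, a=1.0, b=0.25, passes=20):
    """Repeatedly recompute ratings from a list of (winner, loser) pairs."""
    players = {p for pair in results for p in pair}
    ratings = {p: i for p in players}  # everyone starts at i
    for _ in range(passes):
        updated = {}
        for p in players:
            # Current ratings of beaten opponents.
            beaten = [ratings[l] for w, l in results if w == p]
            # Negated current ratings of opponents lost to (assumed reading of rl).
            lost_to = [-ratings[w] for w, l in results if l == p]
            nw, nl = len(beaten), len(lost_to)
            rw = sum(beaten) / nw if nw else 0.0
            rl = sum(lost_to) / nl if nl else 0.0
            updated[p] = i + a * nw + b * rw - (a * nl + b * rl)
        ratings = updated  # each pass uses only the previous pass's ratings
    return ratings

ratings = rate([("A", "B"), ("A", "C"), ("B", "C")])
```

In this three-player example A ends up on top, C at the bottom, and B (one win, one loss) stays at 0. Nothing here guarantees the passes settle down for every choice of b; larger values may need a check that the ratings have stopped changing.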
 
This looks like a great idea.
You could use different arbitrary constants depending on the kind of tournament. I mean, in SPL you're likely to face players with a similar ranking, so you're not going to lose a bazillion points even if the cap is high.
I see a hole in using the mean rating of opponents, because some of them from outside PP won't be tracked. You could set their rating to the average for that tournament, I guess.
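A quick Python sketch of that fallback, just to show what I mean: an untracked opponent gets the mean rating of the tracked players in the same tournament. The function name, the argument layout and the default of 0 when nobody in the field is tracked are assumptions.

```python
def opponent_rating(name, ratings, field, default=0.0):
    """Rating for `name`, or the tournament average if they aren't tracked."""
    if name in ratings:
        return ratings[name]
    tracked = [ratings[p] for p in field if p in ratings]
    return sum(tracked) / len(tracked) if tracked else default

ratings = {"A": 10.0, "B": -4.0}
field = ["A", "B", "outsider"]
print(opponent_rating("outsider", ratings, field))  # 3.0
```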
 