Glicko-2: chess rating system with uncertainty
Definition
Glicko-2 is a player rating system created by statistician Mark Glickman as an improvement over the Elo system. It models both a player’s estimated strength and the uncertainty in that estimate. Each player has three key numbers: a rating (r), a rating deviation (RD) that measures confidence/uncertainty, and a volatility (σ) that measures how much a player’s strength tends to fluctuate over time.
How it works
- Rating (r): The skill estimate you usually see displayed (e.g., 1820). Many sites show only this number.
- Rating Deviation (RD): A measure of uncertainty. Large RD means the system isn’t very sure about your true strength; small RD means high confidence. RD grows when you’re inactive and shrinks as you play more games.
- Volatility (σ): Captures how “swingy” your performances are. If you produce wildly unexpected results, σ rises; if your results are steady, σ tends to fall. Higher σ allows ratings to adjust more quickly.
Glicko-2 was specified for discrete rating periods that each contain several games, although many online platforms apply the update after every game. The math uses a logistic expected score and weights each opponent by their RD: a result against an opponent with a small RD carries more information than one against an opponent whose rating is itself uncertain. Internally, Glicko-2 works on a transformed scale, with μ = (r - 1500) / 173.7178 and φ = RD / 173.7178, and converts back to the Elo-like scale for display.
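The scale conversion and the RD-weighted expected score can be sketched in a few lines. This is a minimal illustration, not any platform's implementation; the function names (to_internal, to_display, g, expected_score) are chosen here for readability, and the sample player and opponent are made up.

```python
import math

SCALE = 173.7178  # conversion factor between the displayed and internal scales

def to_internal(rating, rd):
    """Convert a displayed rating and RD to the internal (mu, phi) scale."""
    return (rating - 1500.0) / SCALE, rd / SCALE

def to_display(mu, phi):
    """Convert internal (mu, phi) back to the displayed rating and RD."""
    return mu * SCALE + 1500.0, phi * SCALE

def g(phi):
    """Weight in (0, 1]; it shrinks as the opponent's deviation phi grows."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)

def expected_score(mu, mu_opp, phi_opp):
    """Logistic expected score against one opponent, damped by g(phi_opp)."""
    return 1.0 / (1.0 + math.exp(-g(phi_opp) * (mu - mu_opp)))

# Hypothetical pairing: a 1500 player (RD 200) against a 1400 player (RD 30).
mu, phi = to_internal(1500, 200)
mu_opp, phi_opp = to_internal(1400, 30)
print(round(expected_score(mu, mu_opp, phi_opp), 3))
```

Because g is at most 1, a very uncertain opponent pulls the expected score toward 0.5, which is how the system discounts results against players it knows little about.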
Usage in chess
Glicko-2 is widely used on online chess servers because it adapts quickly to changing playing strength and handles new or returning players gracefully. Servers typically keep a separate rating for each time control and variant, so bullet, blitz, rapid, and classical pools are rated independently. Lichess uses Glicko-2; other platforms use Glicko-based approaches. Over-the-board federations such as FIDE primarily use Elo.
Strategic and practical significance
- Faster adaptation for improving players: If you’re under-rated and win a lot quickly, high RD and/or σ let your rating rise fast.
- Fairer matchmaking: RD increases during inactivity, so your first few games back move your rating more, quickly re-centering you.
- Stability for established players: With many recent games, RD and σ are small, so single results don’t swing your rating as much.
- Confidence intervals: Your “true” rating lies within about ±2×RD of the displayed rating with roughly 95% confidence. For example, 1820 with RD 45 corresponds to a range of about 1730 to 1910. Smaller RD means a tighter, more reliable estimate.
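The interval arithmetic is simple enough to show directly. The helper name rating_interval and the sample numbers are illustrative; the multiplier defaults to 2 to match the rule of thumb above, and 1.96 can be passed for the usual normal-approximation 95% interval.

```python
def rating_interval(rating, rd, z=2.0):
    """Approximate range for true strength: rating plus or minus z times RD."""
    return rating - z * rd, rating + z * rd

# Hypothetical player: displayed rating 1820 with RD 45.
low, high = rating_interval(1820, 45)
print(f"{low:.0f} to {high:.0f}")  # prints "1730 to 1910"
```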
Comparison with Elo (and Glicko-1)
- Versus Elo: Elo uses a fixed or ad hoc K-factor; Glicko-2 effectively “self-tunes” K through RD and σ, letting the system learn faster when it’s less certain and stabilize when it’s confident.
- Versus Glicko-1: Glicko-2 adds volatility (σ), letting the system react more nimbly to real changes in player strength.
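One way to see the “self-tuning K” idea is to compute how many points a single game is worth per unit of (score minus expected score), which is the role K plays in Elo. The sketch below is a simplification under stated assumptions: one game per update, volatility held fixed at 0.06, and an opponent RD of 50. The name effective_k is made up for this illustration and is not part of the Glicko-2 specification.

```python
import math

SCALE = 173.7178

def g(phi):
    """Opponent-uncertainty weight on the internal scale."""
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)

def effective_k(rd, opp_rd, rating_diff=0.0, sigma=0.06):
    """Displayed points gained per unit of (score - expected) in one game.

    Assumes sigma stays fixed, so this only approximates a full update.
    """
    phi, phi_opp = rd / SCALE, opp_rd / SCALE
    weight = g(phi_opp)
    e = 1.0 / (1.0 + math.exp(-weight * rating_diff / SCALE))
    v = 1.0 / (weight ** 2 * e * (1.0 - e))       # estimated variance from this game
    phi_star_sq = phi ** 2 + sigma ** 2            # pre-game deviation, inflated by sigma
    phi_new_sq = 1.0 / (1.0 / phi_star_sq + 1.0 / v)
    return SCALE * phi_new_sq * weight

# The multiplier shrinks as the player's own RD shrinks.
for rd in (350, 120, 40):
    print(rd, round(effective_k(rd, opp_rd=50), 1))
```

Under these assumptions the multiplier for a brand-new account is far larger than for an established player, which is the behavior a fixed K-factor cannot reproduce.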
Examples
- New account effect: Start at r ≈ 1500, RD ≈ 350, σ ≈ 0.06. After 10–20 games, RD can drop dramatically (e.g., to 90–120), so early results cause large rating moves. If you go 8/10 against 1500–1600 opposition, expect a big jump.
- Returning player: You were 1900 with RD ≈ 50, then took a long break. RD rises during inactivity (e.g., to 120+). Your first few games back can swing your rating by 50–100 points or more, quickly updating your estimate.
- Stable veteran: A 2100 with RD ≈ 40 and σ small might gain only 4–8 points for a single upset win over 2200, and lose a similar amount for an upset loss to 2000. The system believes it already “knows” their strength.
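The examples above can be checked against a sketch of one full rating-period update, including the iterative solve for the new volatility. It follows the steps in Glickman's published description of Glicko-2 as closely as a short example allows; the function name update, the system constant tau = 0.5, the tolerance, and the opponent RD of 50 in the veteran example are assumptions made here, and real servers may differ in details.

```python
import math

SCALE = 173.7178

def g(phi):
    return 1.0 / math.sqrt(1.0 + 3.0 * phi ** 2 / math.pi ** 2)

def update(rating, rd, sigma, games, tau=0.5, eps=1e-6):
    """One rating-period update. games is a list of (opp_rating, opp_rd, score)."""
    mu, phi = (rating - 1500.0) / SCALE, rd / SCALE
    if not games:  # no games played: only the deviation grows
        return rating, math.sqrt(phi ** 2 + sigma ** 2) * SCALE, sigma

    inv_v, total = 0.0, 0.0
    for opp_rating, opp_rd, score in games:
        w = g(opp_rd / SCALE)
        e = 1.0 / (1.0 + math.exp(-w * (mu - (opp_rating - 1500.0) / SCALE)))
        inv_v += w ** 2 * e * (1.0 - e)
        total += w * (score - e)
    v = 1.0 / inv_v          # estimated variance of the rating from these games
    delta = v * total        # estimated improvement suggested by the results

    # Find the new volatility as the root of f, using a bracketing iteration.
    a = math.log(sigma ** 2)

    def f(x):
        ex = math.exp(x)
        num = ex * (delta ** 2 - phi ** 2 - v - ex)
        den = 2.0 * (phi ** 2 + v + ex) ** 2
        return num / den - (x - a) / tau ** 2

    A = a
    if delta ** 2 > phi ** 2 + v:
        B = math.log(delta ** 2 - phi ** 2 - v)
    else:
        k = 1
        while f(a - k * tau) < 0:
            k += 1
        B = a - k * tau
    fA, fB = f(A), f(B)
    while abs(B - A) > eps:
        C = A + (A - B) * fA / (fB - fA)
        fC = f(C)
        if fC * fB <= 0:
            A, fA = B, fB
        else:
            fA /= 2.0
        B, fB = C, fC
    new_sigma = math.exp(A / 2.0)

    phi_star = math.sqrt(phi ** 2 + new_sigma ** 2)
    new_phi = 1.0 / math.sqrt(1.0 / phi_star ** 2 + 1.0 / v)
    new_mu = mu + new_phi ** 2 * total
    return new_mu * SCALE + 1500.0, new_phi * SCALE, new_sigma

# Stable veteran: 2100 (RD 40) beats a 2200 whose RD is assumed to be 50.
new_rating, new_rd, new_sigma = update(2100, 40, 0.06, [(2200, 50, 1.0)])
print(round(new_rating - 2100, 1), round(new_rd, 1))
```

As a sanity check, feeding in the worked example from Glickman's description (a 1500 player with RD 200 and σ 0.06 who beats a 1400/RD 30 opponent and loses to 1550/RD 100 and 1700/RD 300) should give a rating near 1464 and an RD near 151.5.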
Historical notes and anecdotes
Mark Glickman introduced Glicko in the late 1990s and Glicko-2 shortly thereafter, refining how uncertainty and volatility are modeled. Online chess embraced Glicko-2 because rapid game throughput and varied time controls benefit from faster, more nuanced rating updates. A nerdy tidbit: the 173.7178 constant that converts between the internal Glicko-2 scale (μ, φ) and the familiar rating scale centered at 1500 is 400 / ln 10, the same scaling that underlies Elo's 400-point expected-score formula.
Common misconceptions
- “Ratings always move by the same amount.” In Glicko-2, your RD and σ govern how much you move; the same result can yield different point changes for different players.
- “Inactivity changes your rating.” Inactivity doesn’t change r directly; it increases RD, which can make subsequent changes larger once you play again.
- “Upsets always move ratings symmetrically.” The opponent’s RD, your RD, and σ matter. Beating a high-rated opponent with huge RD can be worth less than you’d expect; beating a high-rated, low-RD opponent is worth more.
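The inactivity point can be made concrete. In the Glicko-2 description, a rating period with no games leaves the rating and volatility alone and inflates the deviation to sqrt(φ² + σ²); repeating that for several idle periods gives the sketch below. The helper name rd_after_inactivity is illustrative, and how long a “rating period” lasts is a platform choice that this example does not assume.

```python
import math

SCALE = 173.7178

def rd_after_inactivity(rd, sigma, idle_periods):
    """RD after some rating periods with no games; the rating itself is unchanged."""
    phi = rd / SCALE
    # Each idle period adds sigma squared to the variance on the internal scale.
    return math.sqrt(phi ** 2 + idle_periods * sigma ** 2) * SCALE

# Hypothetical returning player: RD 50 and sigma 0.06 before the break.
for periods in (0, 10, 50, 100):
    print(periods, round(rd_after_inactivity(50, 0.06, periods), 1))
```

Only the uncertainty moves here; any change in the displayed rating comes from the games played after the break, which are then weighted more heavily because RD is larger.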
Tips for players
- To stabilize your rating, play regularly to keep RD low; avoid long breaks if you care about minimizing swings.
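- If you’re improving quickly, frequent play gives Glicko-2 more results to work with, so it tracks your rise sooner. A run of better-than-expected results also raises σ, which allows larger jumps even though regular play keeps RD low.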
- Treat your rating as an estimate with uncertainty. A small RD means your displayed number is a more reliable benchmark.
Related terms
- Elo: The classic rating system used by FIDE for OTB chess.
- Rating Deviation: The Glicko/Glicko-2 confidence measure for rating uncertainty.
- K-factor: Elo’s sensitivity parameter; Glicko-2 replaces this with RD and σ dynamics.
Optional visualization
[Chart not reproduced: a player’s blitz rating over several years, with larger swings early on and stabilization later.]