Changelog

Every change that affects ratings is recorded here. If you see a rating move without a tournament result, one of these changes was the cause. For methodology details, see How It Works.

2026-04-17 — v3.5

Overall: tightened prior RD from 250 to 150. The v3.4 correlation-shrinkage prior fixed the raw Overall rating pathology (elite Premier-only players now beat mediocre dual-format players on rating), but the leaderboard orders by conservative sort (rating − 2 × RD), and the 250 prior RD was pushing elite Premier-only players back below dual-format players with tighter RDs. Measured the empirical Limited SD among rated players (124.6) and derived the theoretically justified prior RD as the residual SD left after conditioning on the known format: 124.6 × √(1 − 0.65²) = 94.7. Landed on 150: slightly conservative relative to that theoretical floor, preserving a small "not yet proven" residual without the 3.5× rank drop v3.4 was producing. Elite Premier-only players (Dylan Sumner, JamesUgly, Spencer Freeman) moved up hundreds of Overall ranks; top-300 Premier-only players are now mostly in the Overall top 500 instead of 501–1000.
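
The 94.7 floor can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name residual_sd is hypothetical, and it assumes the usual bivariate-normal result that the spread remaining in one variable after conditioning on a correlated one is SD × √(1 − ρ²).

```python
import math


def residual_sd(sd: float, rho: float) -> float:
    """Spread left in one format's rating once a correlated format is known.

    Assumes a bivariate-normal relationship between the two ratings.
    """
    return sd * math.sqrt(1.0 - rho ** 2)


LIMITED_SD = 124.6   # empirical Limited SD among rated players (from the entry)
RHO = 0.65           # Premier/Limited correlation (from the v3.4 entry)

theoretical_floor = residual_sd(LIMITED_SD, RHO)
print(round(theoretical_floor, 1))
```

The shipped value of 150 sits above this floor by design, so a missing format still costs something under conservative sort.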

2026-04-17 — v3.4

Overall rating: correlation-shrinkage prior for missing formats. Community feedback during alpha: a player ranked #269 in Premier with no Limited history was sitting ~2,000 places lower on Overall, and mediocre dual-format players (1,400/1,400) were outranking elite Premier-only players (1,900). Root cause: filling the missing-format slot with the population mean (1,500) assumed zero correlation between Premier and Limited. That's empirically false: Pearson ρ between Premier and Limited ratings among dual-format players is 0.65 (0.67 for RD ≤ 80, 0.63 for RD ≤ 120). New formula: if only one format is known, the missing-format prior is 1,500 + 0.65 × (known_rating − 1,500) with RD 250. A 1,900 Premier player now gets a Limited prior of 1,760 and an Overall of (1,900 + 1,760) / 2 = 1,830 (was 1,700), and the top-50 Overall leaderboard is unchanged (all dual-format). Conservative sort still discounts unmeasured formats, just less harshly. Recomputed every Overall rating in one pass.
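
A minimal sketch of the formula, for readers who want to check their own numbers. The function names are hypothetical, and the sketch assumes Overall is still the equal-weighted average of the two formats introduced in v3.2; it covers the rating only, not the RD.

```python
from typing import Optional

POPULATION_MEAN = 1500.0  # Glicko-2 population mean
RHO = 0.65                # Premier/Limited correlation among dual-format players


def missing_format_prior(known_rating: float) -> float:
    """Shrink the known rating toward the population mean by the correlation."""
    return POPULATION_MEAN + RHO * (known_rating - POPULATION_MEAN)


def overall_rating(premier: Optional[float], limited: Optional[float]) -> float:
    """Equal-weighted average, filling a missing format with the shrinkage prior."""
    if premier is None and limited is None:
        # Assumption: a player with no rated format has no Overall to compute.
        raise ValueError("at least one format rating is required")
    if premier is None:
        premier = missing_format_prior(limited)
    if limited is None:
        limited = missing_format_prior(premier)
    return (premier + limited) / 2.0


print(overall_rating(1900.0, None))    # elite Premier-only player
print(overall_rating(1400.0, 1400.0))  # dual-format player, unaffected by the prior
```

Setting RHO to 0 recovers the old behavior, where the missing slot is always 1,500.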

2026-04-16 — v3.3

Fixed RD decay compounding bug. Before v3.3, every tournament processed after a player's last match re-applied (tournament_date − last_played_at) / 30 periods of decay to an already-decayed RD. Over dozens of tournaments this inflated the RDs of stale ratings far beyond their correct Glicko-2 values (example: a 56 RD inflated to 134 over 89 days when the correct value was ~58). Added a per-rating last_decayed_at anchor so each pass only applies newly accumulated periods. All 790 tournaments were re-rated from scratch against the corrected logic; every player's rating and RD is now a clean Glicko-2 result. Confidence tooltips on profile pages now explain RD inflation when it occurs ("RD 90 — inflated from 56 after 42 days of Premier inactivity").
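
The anchoring idea can be sketched as follows. This is not the production code: the class and function names are hypothetical, the volatility of 0.06 is an assumed value, and fractional periods are assumed to scale the standard Glicko-2 inactivity step linearly. Only the 30-day period and the last_decayed_at anchor come from the entry above.

```python
import math
from dataclasses import dataclass
from datetime import date

GLICKO2_SCALE = 173.7178  # standard Glicko-2 rating-scale constant
PERIOD_DAYS = 30          # period length taken from the entry above


@dataclass
class FormatRating:
    rd: float              # rating deviation on the displayed scale
    volatility: float      # Glicko-2 sigma (0.06 below is an assumed value)
    last_decayed_at: date  # anchor: decay has already been applied up to here


def inflate_rd(rd: float, volatility: float, periods: float) -> float:
    """Glicko-2 inactivity step: phi' = sqrt(phi^2 + periods * sigma^2)."""
    phi = rd / GLICKO2_SCALE
    phi = math.sqrt(phi * phi + periods * volatility ** 2)
    return phi * GLICKO2_SCALE


def apply_decay(r: FormatRating, tournament_date: date) -> None:
    """Apply only the periods accumulated since the anchor, then advance it."""
    periods = (tournament_date - r.last_decayed_at).days / PERIOD_DAYS
    if periods <= 0:
        return
    r.rd = inflate_rd(r.rd, r.volatility, periods)
    r.last_decayed_at = tournament_date


# Three later tournaments are processed while the player stays idle for 89 days.
r = FormatRating(rd=56.0, volatility=0.06, last_decayed_at=date(2026, 1, 1))
for d in (date(2026, 1, 31), date(2026, 3, 2), date(2026, 3, 31)):
    apply_decay(r, d)

one_shot = inflate_rd(56.0, 0.06, 89 / PERIOD_DAYS)
print(round(r.rd, 1), round(one_shot, 1))
```

Because the squared deviation grows linearly with elapsed periods, three anchored passes give the same RD as a single 89-day step (roughly 58.8 under the assumed volatility). Measuring each pass from last_played_at instead of the anchor would count the early days again on every pass, which is the compounding described above.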

2026-04-16 — v3.2

Overall redesigned as Galactic-seeding rating. Replaced the match-weighted composite with the equal-weighted average of Premier and Limited: (Premier + Limited) / 2. Missing format slots fill with the Glicko-2 population mean (1,500 rating, RD 350), so single-format players take an honest hit — a pure Premier grinder with no Limited experience sits roughly midway between their Premier rating and 1,500. Eternal is excluded. The new Overall is designed to be the canonical rating for seeding Galactic Championship (which requires both Premier and Limited).

2026-04-16 — v3.1

Single leaderboard + Overall composite. Replaced the Ranked/Provisional cohort split with one leaderboard for all players. The conservative sort (Rating − 2 × RD) already places provisional players naturally at the bottom until they build a record. Replaced the RD-band Status column with a Confidence %, a principled 0–100% mapping of RD within its valid range. Replaced the deprecated Overall-via-double-Glicko with a match-weighted composite of per-format ratings — no match is ever counted twice. Overall is now the default format everywhere (leaderboard + profile hero). Published the live clamp rate in System Parameters.
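
To illustrate why a single leaderboard works without a separate provisional cohort, here is a small sketch of the conservative sort. The player names and numbers are invented for illustration, and the Entry type is hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entry:
    name: str
    rating: float
    rd: float


def conservative_score(e: Entry) -> float:
    """Rating minus two rating deviations, as used for leaderboard order."""
    return e.rating - 2.0 * e.rd


def leaderboard(entries: List[Entry]) -> List[Entry]:
    """Highest conservative score first."""
    return sorted(entries, key=conservative_score, reverse=True)


# Invented example players.
players = [
    Entry("newcomer", 1500.0, 250.0),    # no record yet, wide uncertainty
    Entry("established", 1450.0, 60.0),  # long record, tight uncertainty
    Entry("developing", 1600.0, 120.0),
]

for e in leaderboard(players):
    print(e.name, conservative_score(e))
```

The newcomer has a higher raw rating than the established player but lands last, because the wide RD is subtracted twice.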

2026-04-16 — v3

Reliability pass. Added RD-based ranked/provisional gating at the leaderboard (since superseded in v3.1), credible-interval display, RD-band badges (Established / Developing / Provisional), multiplier-clamp tracking on snapshots, a match-level audit trail on profiles, and DB-level match integrity constraints. Deprecated Overall as the profile headline rating (later replaced in v3.1). No rating values changed as a result of this pass.

2026-04-16 — v2.1

Default RD on new player ratings changed from 350 to 250, allowing a player's first tournament to tighten uncertainty faster.

2026-04-16 — v2.0

Replaced opponent-inflation tier weighting with post-hoc K-factor multipliers. Added ±150 cap on the multiplier bonus. Added per-player RD decay using last_played_at per format. Fixed draw handling (draws now produce near-zero change regardless of tier).

2026-04-14 — v1

Initial release. Glicko-2 engine validated against Glickman's 2012 paper. Data pipeline built on Melee.gg match data and SWU Competitive Hub tournament metadata.