
The 'big six' label is statistically incoherent

The "big six" label has become the default framing for the Premier League's hierarchy. The six teams — Manchester City, Manchester United, Liverpool, Chelsea, Arsenal, and Tottenham — get grouped together in tactical previews, in fixture-difficulty analyses, in transfer-market reporting, and in fan discourse. The grouping treats these six clubs as a coherent tier above the rest of the league. The data does not support that grouping. The gap inside the big six is now wider than the gap between its bottom edge and the next group down. Treating the six as a single tier produces consistently bad tactical and predictive analysis, and the label survives more as a brand convention than as a statistical category.

What the label originally captured

The "big six" emerged as a useful shorthand in the early 2010s, when the six clubs were genuinely clustered together in points-per-season, expected-goals output, transfer spending, and European qualifications. Across the 2012-2017 window, the bottom of the big six finished around 65-70 points a season and the seventh-place team finished around 55-60 points. That ten-point gap was meaningful and consistent. The label described something real about the structure of the league. A pundit using the phrase in that era was naming a coherent group of clubs whose competitive profile was actually similar.

That structure has been gone for several years. The gap between the top three (or sometimes four) clubs and the bottom of the "big six" has widened to the point where the comparison no longer holds. Manchester City alone has averaged a points total in the high 80s across recent seasons. The bottom of the big six has, in multiple recent seasons, finished in the 55-65 range — below or at the same level as the teams immediately outside the label. The within-group variance has exceeded the between-group variance for years, and the label has not caught up.

The seventh-place test

The cleanest way to assess whether a grouping is statistically meaningful is to compare the gap inside the group to the gap between the group and the next tier. For the big six over the last five Premier League seasons, the average points difference between the top of the six and the bottom has been roughly 25 points. The average points difference between the bottom of the six and seventh place has been roughly 4 points. The internal gap is roughly six times the boundary gap. By that measure, the "big six" is not a tier. It is a list that contains at least two distinct tiers stapled together for marketing convenience.
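
The test is simple enough to sketch in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name seventh_place_test, the club letters, and the points totals are all hypothetical, chosen only to reproduce the rough 25-point and 4-point gaps quoted above.

```python
def seventh_place_test(points, group):
    """Return (internal_gap, boundary_gap) for a labelled group of clubs.

    internal_gap: best points total in the group minus the worst in the group.
    boundary_gap: worst points total in the group minus the best outside it.
    """
    inside = [pts for club, pts in points.items() if club in group]
    outside = [pts for club, pts in points.items() if club not in group]
    internal_gap = max(inside) - min(inside)
    boundary_gap = min(inside) - max(outside)
    return internal_gap, boundary_gap


# Illustrative numbers only; this is not a real season table.
season = {"A": 88, "B": 82, "C": 75, "D": 70, "E": 66, "F": 63, "G": 59, "H": 55}
label = {"A", "B", "C", "D", "E", "F"}

internal, boundary = seventh_place_test(season, label)
print(internal, boundary, internal / boundary)
```

On these made-up totals the internal gap is 25 and the boundary gap is 4, a ratio of 6.25. The boundary gap goes negative whenever a club outside the label finishes above the bottom of the six, which is the strongest possible sign that the label is not describing a tier.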

The same conclusion holds for xG-based metrics. The top tier of Premier League clubs typically generates around 70-80 expected goals per season, with expected goals against in the 30-35 range. The bottom of the "big six" frequently sits at 55-60 xG scored and 50-55 xG against. Seventh place looks structurally similar to the bottom of the six on the underlying numbers. Treating them as different tiers obscures the actual competitive structure of the league.

The fixture-difficulty distortion

The most consequential effect of the label is on fixture-difficulty analysis. Tactical previews and scheduling-based articles routinely treat games against any of the big six as "tough fixtures" worth a competitive adjustment. For matches against the top three clubs, that adjustment is correct. The xG and points expectations against those teams are unambiguously different from matches against the rest of the league. For matches against the bottom of the big six, the warranted adjustment is much smaller than the label implies. A mid-table team's expected points per match against the seventh-place club and against the bottom of the big six are usually within 0.1 points of each other.

The result is that fixture-difficulty analyses systematically miscalibrate the run-in projections of mid-table teams. Their schedules look harder than they are because the bottom-of-big-six matches get weighted as if they were top-three fixtures. The teams the analyses keep predicting will fade in May tend to outperform expectations, because the matches that were supposed to be hardest are not actually that hard. The label is doing the miscalibration, not any genuine error in the analyst's framework.
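
A rough sketch shows how the distortion arises. It assumes a deliberately crude difficulty measure (the mean points-per-game rating of the opponents in a run of fixtures) and models the label as scoring every big-six opponent at the top-three average. The function name run_in_difficulty, the club letters, and the ratings are hypothetical, not taken from any real model or season.

```python
def run_in_difficulty(opponents, rating, labelled=None, label_rating=None):
    """Mean opponent rating over a run of fixtures.

    If `labelled` is given, every opponent in that set is scored at
    `label_rating` instead of its own rating, mimicking a flat
    'big six' adjustment.
    """
    total = 0.0
    for club in opponents:
        if labelled is not None and club in labelled:
            total += label_rating
        else:
            total += rating[club]
    return total / len(opponents)


# Hypothetical points-per-game ratings; illustrative only.
ppg = {"A": 2.3, "B": 2.1, "C": 2.0, "D": 1.8, "E": 1.7, "F": 1.6, "G": 1.6, "H": 1.4}
label = {"A", "B", "C", "D", "E", "F"}
top_three_avg = (ppg["A"] + ppg["B"] + ppg["C"]) / 3

# A run-in that includes the two weakest labelled clubs.
run_in = ["E", "F", "G", "H"]
by_rating = run_in_difficulty(run_in, ppg)
by_label = run_in_difficulty(run_in, ppg, label, top_three_avg)
print(round(by_rating, 3), round(by_label, 3))
```

With these numbers the label-based score comes out noticeably higher than the rating-based one, even though clubs F and G carry the same rating. The schedule has not changed; only the weighting has.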

The transfer-market self-fulfillment

The big-six label also distorts the transfer market in ways that compound the underlying competitive divergence. A player linked with a "big six" move attracts more reporting, higher fee expectations, and stronger leverage in negotiations than a player linked with a comparable club outside the label. The premium is paid even when the receiving club is at the bottom of the six and structurally similar to a seventh-place club that would have offered similar money. The label produces a price premium that the actual competitive position of the receiving club no longer justifies.

The premium is small per transaction but cumulative across a decade. The clubs at the bottom of the six have, over time, spent more on incoming transfers than their points totals would predict, partly because the label gave their offers a recognition advantage. The teams immediately outside the label have spent less than their actual competitive positions would predict for the same reason. Some of those outside teams have, in multiple recent seasons, outperformed the bottom of the big six on the pitch. The market keeps treating them as second-tier clubs because the label keeps privileging the six over the seventh-, eighth-, or ninth-place teams that have closed the gap.

Why the label persists

The label survives despite its incoherence for the same reasons most broadcasting conventions outlive their usefulness. It is convenient. It is recognizable. It is built into the commercial structure of how Premier League content is sold to international audiences, where the "big six" appears in broadcast packaging, fixture marketing, and merchandise distribution. The clubs inside the label have a commercial interest in keeping it alive even when the on-pitch evidence has stopped supporting it. The clubs outside the label have no organized way to push back against it. The result is a category that no longer matches the data but keeps appearing in every analysis as if it did.

Habits formed around Champions League qualification also sit behind it. For the better part of two decades, the big six were the realistic pool from which top-four finishers emerged. The structural assumption was reasonable in that window. Across the last five seasons, the top-four pool has regularly expanded to include teams outside the six. The label hasn't expanded to match. It still lists the same six clubs even when one of them has missed the European places for multiple seasons running. The category has stopped tracking the underlying European-qualification reality it was originally meant to describe.

What replaces it

The honest version of the modern Premier League hierarchy is a tier of two or three clubs at the top with genuine title contention, a competitive group of five to seven clubs below them competing for European places, and a clear gap before the relegation contenders. That structure does not map onto the big-six framing. The top tier is too small to be called the big six. The European-places tier is too large to be the big six. The label combines the top of one tier with the middle of another in a way that obscures the actual shape of the competition.
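
One simple way to let the table draw its own tier boundaries is to sort clubs by points and cut at the largest gaps between neighbours. The sketch below does that. It is one possible heuristic rather than an established method, and the function name split_into_tiers, the club letters, and the points totals are all hypothetical.

```python
def split_into_tiers(points, n_tiers):
    """Split clubs into n_tiers by cutting at the largest points gaps."""
    ranked = sorted(points.items(), key=lambda kv: kv[1], reverse=True)
    # Gap between each club and the one directly below it, with its position.
    gaps = [(ranked[i][1] - ranked[i + 1][1], i) for i in range(len(ranked) - 1)]
    # Keep the positions of the (n_tiers - 1) largest gaps, in table order.
    cuts = sorted(i for _, i in sorted(gaps, reverse=True)[: n_tiers - 1])

    tiers, start = [], 0
    for cut in cuts:
        tiers.append([club for club, _ in ranked[start : cut + 1]])
        start = cut + 1
    tiers.append([club for club, _ in ranked[start:]])
    return tiers


# Illustrative points totals only; this is not a real table.
table = {"A": 89, "B": 86, "C": 74, "D": 71, "E": 68,
         "F": 66, "G": 63, "H": 60, "I": 47, "J": 44}

for tier in split_into_tiers(table, 3):
    print(tier)
```

On this made-up table the cuts fall after the second and eighth clubs, giving a top tier of two, a chasing group of six, and a clear drop to the rest. No boundary lands at sixth place, because nothing in the numbers puts one there.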

Tactical and predictive analyses that have abandoned the label in favor of a cleaner tier structure (using season-to-date xG, points-per-game rolling averages, or published power rankings) have consistently outperformed analyses that lean on the big-six shorthand. The shorthand is doing real harm to projection accuracy, and the analytical layer has been moving away from it for years. The broadcast and editorial layer has not. The label will probably stay alive for a while longer because the people whose jobs depend on it keep using it, and the people whose jobs would be helped by retiring it have no institutional incentive to fight it.

How to read a Premier League table

A practical habit when reading any Premier League analysis is to translate "the big six" into the underlying tier structure the writer probably means. Most of the time the writer means either "the top two or three" or "the European places contenders," and the analysis improves when the translation is done. Treating any matchup involving a big-six club as equivalent to any other big-six matchup is the specific failure mode the label produces, and most of the worst-aging tactical predictions in modern Premier League coverage have shared that failure mode at their root.

The label is a convenient piece of furniture in a room whose layout has changed. The furniture hasn't been moved because nobody wants to move it. The walls are in different places than they were a decade ago, and pretending otherwise has been quietly costing the league's analysis layer accuracy for years.