Methodology

How the role models are built

Zauberpass ranks players from prepared model outputs built in the pipeline, not from live browser-side calculation. The public site reads role-specific ranking files that already combine event data, identity context, injury context, archetype outputs, and similarity inputs into season-by-season scoring models for midfielders, full-backs, and center-backs.

Sources

Data stack

WhoScored-derived event data is the foundation of the model. It captures what the player actually did on the pitch: ball progression, territory gain, chance creation, defensive interventions, and direct attacking output.

Transfermarkt supplies the context layer. It contributes player identity, role label, age, nationality, market value, and season-level injury information. That is what lets the model separate attacking midfielders, central midfielders, defensive midfielders, full-backs, and center-backs instead of forcing all profiles into one undifferentiated pool.

Understat sits alongside that as a supporting source of attacking context where available. It adds extra output and possession-value context to the wider feature layer without replacing the event model itself.

The public app is therefore reading prepared ranking outputs, not rebuilding the pipeline on demand. That keeps the site fast while ensuring the visible scores remain tied to the full underlying data model.

Role logic

Separated pools

The model does not rank every role together. It builds separate comparison pools for attacking midfielders, central midfielders, defensive midfielders, full-backs, and center-backs.

That distinction matters because the job description changes by role. An attacking midfielder should not be judged by the same expectations as a holding midfielder, and a full-back should not be judged by the same blend as a central midfielder.

Both ranking and similarity are role-specific. Every visible score on the site is intended to compare the player only against others working in broadly the same lane.

The ranking model, archetype model, and similarity model all start from those role pools. That separation is one of the main guardrails in the product.
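As a minimal sketch of that guardrail, the snippet below groups player records into the five role pools before any scoring happens. The record shape (`name`, `role`) and the role key strings are assumptions made for illustration, not the pipeline's actual schema.

```python
from collections import defaultdict

# The five role pools named in the text. The key strings are illustrative.
ROLE_POOLS = (
    "attacking_midfielder",
    "central_midfielder",
    "defensive_midfielder",
    "full_back",
    "center_back",
)


def build_role_pools(players):
    """Group player records by role so later steps never mix roles."""
    pools = defaultdict(list)
    for player in players:
        role = player["role"]
        if role not in ROLE_POOLS:
            continue  # roles outside the published pools are skipped in this sketch
        pools[role].append(player)
    return dict(pools)


players = [
    {"name": "Player A", "role": "central_midfielder"},
    {"name": "Player B", "role": "full_back"},
    {"name": "Player C", "role": "central_midfielder"},
    {"name": "Player D", "role": "goalkeeper"},
]
pools = build_role_pools(players)
```

Ranking, archetype assignment, and similarity would then each run once per pool, so a full-back is never compared against a central midfielder.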

Pillars

Role models

The ranking model is built from role-specific pillars. Each one is formed from underlying football actions, standardized within the appropriate league-role environment, and then rescaled to a 0-100 score before the overall mark is calculated.

Midfielder model: Passing, Progressiveness, Threat, Game State, G/A, Defensive, and Availability.

Full-back model: Passing, Progressiveness, Dribbling, Threat, Game State, Defensive, and Availability.

Center-back model: Defense, Passing, Progression, Aerial Ability, and Availability.

Passing measures secure and useful ball circulation. For midfielders it blends passes attempted per 90, passes completed per 90, long balls completed per 90, long-ball completion percentage, and overall pass completion percentage. For full-backs, crossing volume and crossing completion are folded into the same pillar so delivery quality from wide areas is captured inside passing rather than split off as a separate score.
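The per-90 and percentage inputs named above can be sketched as two small helpers. The function names and the zero-handling choices are assumptions for illustration; the source does not state how the pipeline treats players with very few minutes or attempts.

```python
def per_90(count, minutes):
    """Scale a season total to a per-90-minutes rate."""
    if minutes <= 0:
        return 0.0  # assumption: no minutes played means no rate
    return count * 90 / minutes


def completion_pct(completed, attempted):
    """Share of attempts completed, as a percentage."""
    if attempted <= 0:
        return 0.0  # assumption: no attempts means 0%, not undefined
    return 100 * completed / attempted


passes_attempted_p90 = per_90(540, 900)
pass_completion = completion_pct(459, 540)
```

Rates capture volume and percentages capture security, which is why the pillar is described as blending both kinds of input.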

Progressiveness measures how reliably a player moves possession up the pitch. It blends progressive passes, progressive carries, final-third entries, box entries, and deep completions.

Dribbling is currently a full-back-specific live pillar. It blends take-ons attempted per 90, take-ons completed per 90, take-on success rate, final-third-entry carries, and box-entry carries to reward flank progression at the feet.

Threat measures how much danger the player creates once the ball reaches him. It combines expected-threat value from passes and carries with key-pass output.

Game State measures value when the score still needs changing. Half of this pillar comes from xT in non-winning states, and the other half comes from direct attacking output in those same states, so territorial threat and decisive contribution under pressure are treated together.
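A minimal sketch of that half-and-half blend follows. It assumes each event carries a `score_diff` field (the player's team score minus the opponent's at the time of the event), and that both components are standardized before blending. Both the field names and that ordering are assumptions.

```python
def non_winning_total(events, key):
    """Sum a value over events recorded while the team was level or behind."""
    return sum(e[key] for e in events if e["score_diff"] <= 0)


def game_state_raw(xt_component, output_component):
    """Equal-weight blend of the two halves of the Game State pillar."""
    return 0.5 * xt_component + 0.5 * output_component


events = [
    {"score_diff": 0, "xt": 0.04, "goal_or_assist": 0},
    {"score_diff": -1, "xt": 0.10, "goal_or_assist": 1},
    {"score_diff": 1, "xt": 0.30, "goal_or_assist": 1},  # winning state: ignored
]
xt_non_winning = non_winning_total(events, "xt")
output_non_winning = non_winning_total(events, "goal_or_assist")

# Illustrative standardized values for the two halves.
blended = game_state_raw(1.0, 0.5)
```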

G/A is currently a midfielder pillar only. It measures direct attacking output through goals per 90, true assists per 90, and combined G/A per 90. In the live model, an assist is treated conservatively: it must be a pass marked as an intentional assist that is immediately followed by a same-team goal in the same half.
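The conservative assist rule can be sketched as a pass over an ordered event list. The event fields (`type`, `player`, `team`, `period`, `intentional_assist`) are hypothetical names, and "immediately followed" is read here as "the very next event in the sequence", which is an assumption about the live rule.

```python
def count_true_assists(events, player):
    """Count flagged assist passes whose next event is a same-team goal in the same half."""
    assists = 0
    for current, following in zip(events, events[1:]):
        if (
            current["type"] == "pass"
            and current["player"] == player
            and current.get("intentional_assist", False)
            and following["type"] == "goal"
            and following["team"] == current["team"]
            and following["period"] == current["period"]
        ):
            assists += 1
    return assists


events = [
    {"type": "pass", "player": "Player A", "team": "Home", "period": 1, "intentional_assist": True},
    {"type": "goal", "player": "Player B", "team": "Home", "period": 1},
    {"type": "pass", "player": "Player A", "team": "Home", "period": 1, "intentional_assist": True},
    {"type": "tackle", "player": "Player C", "team": "Away", "period": 1},  # breaks the chain
    {"type": "goal", "player": "Player B", "team": "Home", "period": 2},
]
true_assists = count_true_assists(events, "Player A")
```

In this example only the first pass counts. The second is followed by an opposition tackle, so the later goal is not credited as an assist.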

Defensive measures disruption and regain work. It blends tackles, interceptions, recoveries, overall defensive action volume, high regains, and defensive activity adjusted for opponent possession.

Availability measures both trust and durability. In the live model, 80% of this pillar comes from usage (minutes, appearances, and starts) and the remaining 20% comes from injury burden (injury days missed, injury games missed, and injury episodes).
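The 80/20 split can be sketched as follows. The sketch assumes every input has already been scaled to the range 0 to 1, that the components inside each group are averaged equally, and that injury burden is inverted so fewer injuries score higher. Only the 80/20 split itself comes from the text.

```python
def availability_raw(usage_parts, injury_parts):
    """Blend usage (80%) with inverted injury burden (20%).

    usage_parts: scaled minutes, appearances, starts (higher is better).
    injury_parts: scaled days missed, games missed, episodes (higher is worse).
    """
    usage = sum(usage_parts) / len(usage_parts)
    injury_burden = sum(injury_parts) / len(injury_parts)
    return 0.8 * usage + 0.2 * (1.0 - injury_burden)


ever_present = availability_raw((1.0, 1.0, 1.0), (0.0, 0.0, 0.0))
rotation_player = availability_raw((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
```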

Aerial Ability is currently a center-back-specific live pillar. It blends aerials attempted per 90, aerials won per 90, and aerial win percentage so both duel load and duel quality are represented.

Once those role pillars are formed, the model applies role-specific weights to create the final overall score. The result is a role-aware ranking, not a generic all-player average.

Normalization

League-relative, then cross-league

The live model uses a hybrid normalization approach. It does not compare every raw stat across all five leagues in one step, and it does not leave every league isolated either.

First, the underlying metrics are standardized within league inside the selected role pool. In practical terms, a Premier League central midfielder is first compared to other Premier League central midfielders, a La Liga central midfielder to other La Liga central midfielders, and so on.

Those league-relative standardized values are then blended into the raw pillar scores for the role (seven pillars for midfielders and full-backs, five for center-backs). After that, the finished pillar scores are rescaled onto one shared 0-100 ladder across the full Big 5 role pool.

That means the model is trying to preserve two things at once: how exceptional a player is inside his own league environment, and where that translated profile sits on the wider Big 5 leaderboard.
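The two-step approach can be sketched in a few lines. This sketch assumes z-scores for the within-league step, an equal-weight mean to blend metrics into a pillar, and min-max scaling for the shared 0-100 ladder. The source only says "standardized" and "rescaled", so all three choices are illustrative, as are the field and metric names.

```python
from collections import defaultdict
from statistics import mean, pstdev


def zscores_within_league(rows, metric):
    """Return {player: z} with mean and spread computed per league."""
    by_league = defaultdict(list)
    for row in rows:
        by_league[row["league"]].append(row)
    z = {}
    for league_rows in by_league.values():
        values = [r[metric] for r in league_rows]
        mu, sigma = mean(values), pstdev(values)
        for r in league_rows:
            z[r["player"]] = 0.0 if sigma == 0 else (r[metric] - mu) / sigma
    return z


def rescale_0_100(raw):
    """Min-max rescale raw pillar values across the whole role pool."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        return {k: 50.0 for k in raw}  # assumption: flat pools sit mid-ladder
    return {k: 100.0 * (v - lo) / (hi - lo) for k, v in raw.items()}


def pillar_scores(rows, metrics):
    """League-relative z-scores, blended per player, then one shared ladder."""
    per_metric = [zscores_within_league(rows, m) for m in metrics]
    raw = {r["player"]: mean(z[r["player"]] for z in per_metric) for r in rows}
    return rescale_0_100(raw)


rows = [
    {"player": "A", "league": "Premier League", "prog_passes": 8.0, "prog_carries": 3.0},
    {"player": "B", "league": "Premier League", "prog_passes": 4.0, "prog_carries": 1.0},
    {"player": "C", "league": "La Liga", "prog_passes": 6.0, "prog_carries": 4.0},
    {"player": "D", "league": "La Liga", "prog_passes": 2.0, "prog_carries": 2.0},
]
scores = pillar_scores(rows, ["prog_passes", "prog_carries"])
```

In this toy pool, players A and C land on the same score despite different raw numbers, because each is equally far above the average of his own league.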

This is the core reason the app can show one unified Big 5 table without pretending every league lives in exactly the same statistical environment.

Archetype

Style families

The archetype layer is not a second ranking. Its job is to describe the player's statistical style in football language.

To do that, the model looks across a broader standardized style profile built from passing, output, progression, threat, defensive work, and game-state contribution within the same role pool. Those inputs are standardized first so the model is reading profile shape rather than raw volume alone.

Players are then grouped into clusters of similar behaviour. Those clusters are interpreted back into football language and surfaced as archetype labels. In other words, the archetype is a summary of how the player plays, not where he ranks.
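The source does not name the clustering method, so the following is only a sketch of the final labelling step under an assumed approach: each player's standardized style profile is assigned to the nearest cluster centre. The three-dimensional profile (creation, progression, defending) and the centroid values are invented for illustration.

```python
import math

# Hypothetical cluster centres in standardized (creation, progression, defending) space.
CENTROIDS = {
    "Creator": (1.5, 0.2, -0.5),
    "Progressor": (0.2, 1.5, -0.2),
    "Ball Winner": (-0.5, 0.0, 1.5),
}


def assign_archetype(profile, centroids=CENTROIDS):
    """Return the label whose centre is closest to the profile (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


label_a = assign_archetype((1.2, 0.1, -0.3))
label_b = assign_archetype((-0.4, 0.2, 1.1))
```

Because the inputs are standardized, the assignment follows the shape of the profile rather than raw volume, which matches the intent described above.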

For midfielders, the live archetypes are Creator, Progressor, Ball Winner, Controller, Final-Third Threat, and Hybrid. For full-backs, the live archetypes are Progressor, Distributor, Carrier, Chance Creator, Ball Winner, and Hybrid. For center-backs, the live archetypes are Stopper, Distributor, Progressor, Aerial Dominator, Controller, and Hybrid.

The point of the archetype layer is interpretive clarity. Two players can rank highly for very different football reasons, and the archetype label is there to make that legible immediately.

Midfielder Archetypes

Creator

Chance-first attacking midfielders who tilt the model through threat, key passes, and final-third invention.

Progressor

Midfielders who win value by carrying or passing the ball up the pitch repeatedly and safely.

Ball Winner

Profiles driven by disruption, regains, and repeat defensive activity relative to opponent possession.

Final-Third Threat

Players whose model shape is pulled upward by goals, assists, box access, and danger near the goal.

Controller

Tempo setters who combine pass security, completed volume, controlled progression, and distribution range.

Hybrid

Mixed profiles that do not lean overwhelmingly into one lane, but blend multiple midfield jobs credibly.

Full-Back Archetypes

Progressor

Full-backs who drive the model through repeat upfield movement, especially progressive passes, carries, and territorial entries.

Distributor

Wide defenders whose profile is built on secure circulation, completed volume, long distribution, and delivery quality.

Carrier

Ball-carrying full-backs who progress through take-ons, dribble success, and repeated carries into advanced zones.

Chance Creator

Final-ball full-backs whose shape is driven by crossing quality, key passes, xA, and attacking delivery from wide areas.

Ball Winner

Full-backs who lean most strongly toward regains, disruption, defensive actions, and flank protection.

Hybrid

Balanced full-back profiles that contribute across multiple lanes without being dominated by one clear speciality.

Center-Back Archetypes

Stopper

Center-backs whose model shape is driven most strongly by defensive interruption, regain work, and repeat duel prevention.

Distributor

Back-line passers who lean on circulation volume, pass security, and long distribution from deep.

Progressor

Center-backs who move territory through line-breaking passes and calmer carry support from the first line.

Aerial Dominator

Profiles pulled most strongly by aerial load, aerial wins, and command of first contacts in defensive spaces.

Controller

Balanced on-ball defenders who blend pass security, circulation calm, and stable possession management from the back line.

Hybrid

Multi-lane center-backs who contribute across defending, progression, circulation, and aerial control without one dominant extreme.

Weight models

Role-specific weighting

Zauberpass does not use one universal weighting. Each role has its own default blend because the balance of responsibilities changes across the pitch.

Attacking midfielders: Passing 12, Progressiveness 18, Threat 22, Game State 15, G/A 15, Defensive 8, Availability 10.

Central midfielders: Passing 22, Progressiveness 18, Threat 12, Game State 10, G/A 8, Defensive 18, Availability 12.

Defensive midfielders: Passing 24, Progressiveness 14, Threat 6, Game State 10, G/A 4, Defensive 28, Availability 14.

Full-backs: Passing 16, Progressiveness 22, Dribbling 14, Threat 10, Game State 12, Defensive 16, Availability 10.

Center-backs: Defense 32, Passing 18, Progression 18, Aerial Ability 22, Availability 10.

These are not cosmetic slider defaults; they are the actual role profiles used by the live ranking model. In practical terms: attacking midfielders lean hardest toward threat and final-third value; central midfielders carry a much stronger passing load; defensive midfielders lean most heavily into defending and distribution; full-backs are pulled most strongly by progression, with crossing folded into passing; and center-backs are anchored by defending and aerial command without ignoring modern on-ball responsibility.
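To show how the weights act on pillar scores, here is a minimal sketch that treats the overall mark as a weighted average of the 0-100 pillar scores. The weights for three of the roles are copied from the lists above; the weighted-average form, the dictionary layout, and the example pillar values are assumptions.

```python
ROLE_WEIGHTS = {
    "attacking_midfielder": {
        "Passing": 12, "Progressiveness": 18, "Threat": 22, "Game State": 15,
        "G/A": 15, "Defensive": 8, "Availability": 10,
    },
    "defensive_midfielder": {
        "Passing": 24, "Progressiveness": 14, "Threat": 6, "Game State": 10,
        "G/A": 4, "Defensive": 28, "Availability": 14,
    },
    "center_back": {
        "Defense": 32, "Passing": 18, "Progression": 18,
        "Aerial Ability": 22, "Availability": 10,
    },
}


def overall_score(role, pillars):
    """Weighted average of 0-100 pillar scores using the role's weights."""
    weights = ROLE_WEIGHTS[role]
    total = sum(weights.values())  # each published set sums to 100
    return sum(weights[name] * pillars[name] for name in weights) / total


# One illustrative pillar profile, scored under two different role blends.
pillars = {
    "Passing": 80, "Progressiveness": 60, "Threat": 40, "Game State": 50,
    "G/A": 30, "Defensive": 90, "Availability": 70,
}
as_attacking_mid = overall_score("attacking_midfielder", pillars)
as_defensive_mid = overall_score("defensive_midfielder", pillars)
```

The same pillar profile comes out at about 55.4 under the attacking-midfielder blend and about 71.2 under the defensive-midfielder blend, which shows why the role assignment matters as much as the pillar scores themselves.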

Model stack

Ranking, archetype, and similarity

The public product is built from three linked model layers. The first is the ranking model, which turns event and context data into the role's visible pillars (seven for midfielders and full-backs, five for center-backs) and one overall score. The second is the archetype model, which turns a broader standardized feature set into a style family. The third is the similarity model, which compares players inside the same role pool.

The similarity view is a style-comparison tool, not a second ranking table. It works in two layers. The first layer compares the visible pillar profile. The second layer compares a wider latent statistical profile built from the underlying features that sit beneath those pillars.

Those two views are blended so that the nearest match is not simply the player with the closest headline scores. It is the player whose broader role profile most closely resembles the target. That makes the comparison more robust when two players arrive at similar outcomes through slightly different action mixes.
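A minimal sketch of that two-layer blend is shown below. It assumes Euclidean distance in each layer, an equal 50/50 blend, and that both vectors are already on comparable scales. None of these choices is stated in the source, and the record layout is hypothetical.

```python
import math


def blended_distance(target, candidate, pillar_weight=0.5):
    """Blend distance on the visible pillars with distance on the latent profile."""
    d_pillar = math.dist(target["pillars"], candidate["pillars"])
    d_latent = math.dist(target["latent"], candidate["latent"])
    return pillar_weight * d_pillar + (1.0 - pillar_weight) * d_latent


def most_similar(target, pool, k=1):
    """Return the k closest players from the same role pool, excluding the target."""
    others = [p for p in pool if p["name"] != target["name"]]
    return sorted(others, key=lambda p: blended_distance(target, p))[:k]


target = {"name": "T", "pillars": (0.7, 0.6), "latent": (1.0, 0.0)}
pool = [
    target,
    # X matches the headline pillars exactly but has a very different underlying mix.
    {"name": "X", "pillars": (0.7, 0.6), "latent": (-1.0, 2.0)},
    # Y differs slightly on the pillars but has a near-identical underlying mix.
    {"name": "Y", "pillars": (0.6, 0.6), "latent": (0.9, 0.1)},
]
nearest = most_similar(target, pool)[0]["name"]
```

Here player Y is returned ahead of player X even though X has identical headline scores, because the latent layer exposes how differently X arrives at them.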

When the app says two players are similar, it means they occupy a similar statistical lane inside the same role bucket. It does not imply identical quality, value, or career stage.