Using Clustering Analysis to Categorize NHL Defensemen

By Trey Elder Jan 8, 2025


For this project I used k-means clustering to classify NHL defensemen into eight groups based on their play at 5-on-5 so far in the 2024-2025 season. I used sixteen variables for the clustering, chosen with the goal of capturing the full picture of a defenseman's play, including statistics like on-ice goals against, shots on goal against, blocked shots, and takeaways. A full list of the variables can be found in the appendix.

To limit the scope of this exercise to defensemen who have been regular NHLers this year, I considered only those who had played at least 20 games by the beginning of the league’s Christmas break on December 24, 2024, which cut the sample down to 215 defensemen. All statistics likewise cover only games played before December 24. I also performed principal component analysis (PCA) on the data to determine which variables had the greatest influence on the clusters; those results can also be found in the appendix.

I chose eight clusters because, although the plot below suggests that the “optimal” number of clusters is likely around five, I felt that five clusters did not create groups that were distinct enough: the large number of players in each cluster made it very difficult to identify trends in the variables that differentiated them from each other. The eight clusters are listed below in order from worst to best (in my opinion), each with a list of its players, a short description, and a “surprise player,” someone I did not expect to be in their assigned cluster.
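As a rough illustration of the clustering step and of how the number of clusters can be compared, here is a minimal k-means sketch using only the Python standard library. It is not the code behind this report: the function names and toy data are made up, the z-score scaling is an assumed preprocessing step that the report does not state, and a deterministic farthest-point seeding is swapped in for the usual random start so the output is repeatable.

```python
def zscore_columns(rows):
    """Rescale each column to mean 0 and std 1 (assumed preprocessing)."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((v - m) ** 2 for v in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(rows, k):
    """Deterministic seeding: start at row 0, then keep adding the row
    farthest from every center chosen so far."""
    centers = [rows[0]]
    while len(centers) < k:
        centers.append(max(rows, key=lambda r: min(sq_dist(r, c) for c in centers)))
    return centers


def kmeans(rows, k, max_iter=100):
    """Lloyd's algorithm. Returns (labels, centers, inertia)."""
    centers = farthest_point_init(rows, k)
    for _ in range(max_iter):
        labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
        updated = []
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            # An empty cluster keeps its previous center.
            updated.append([sum(c) / len(members) for c in zip(*members)]
                           if members else centers[j])
        if updated == centers:
            break
        centers = updated
    labels = [min(range(k), key=lambda j: sq_dist(r, centers[j])) for r in rows]
    inertia = sum(sq_dist(r, centers[lab]) for r, lab in zip(rows, labels))
    return labels, centers, inertia


# Toy data: three obvious groups in two made-up variables.
toy = [[0, 0], [0, 1], [1, 0], [10, 0], [10, 1], [11, 0], [5, 10], [5, 11], [6, 10]]
scaled = zscore_columns(toy)
for k in range(1, 5):
    # Inertia (within-cluster sum of squares) per k is what an elbow plot shows.
    print(k, round(kmeans(scaled, k)[2], 3))
```

Plotting inertia against k and looking for the point where the curve flattens is the usual "elbow" heuristic; choosing more clusters than the elbow suggests, as this report does, trades a tidier statistical fit for groups that are easier to interpret.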

Note: For all variables that weren’t in the form of a percentage, I standardized them on a per-60-minute basis. Thus, if in the body of this report a sentence states that “Cluster A has 2.4 on-ice goals for,” this means that the average player in Cluster A is on the ice for 2.4 goals for their team per 60 minutes of icetime for the player.
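The per-60 conversion described in the note is simple arithmetic. The sketch below uses hypothetical function names and invented numbers purely for illustration, and assumes ice time is recorded in minutes.

```python
def per_60(count, icetime_minutes):
    """Convert a raw count into a rate per 60 minutes of the player's ice time."""
    if icetime_minutes <= 0:
        raise ValueError("ice time must be positive")
    return 60 * count / icetime_minutes


def goal_differential_per_60(goals_for, goals_against, icetime_minutes):
    """On-ice goals for per 60 minus on-ice goals against per 60."""
    return per_60(goals_for, icetime_minutes) - per_60(goals_against, icetime_minutes)


# Invented example: on the ice for 12 goals for and 10 against in 300 minutes.
print(per_60(12, 300))
print(goal_differential_per_60(12, 10, 300))
```

Working in rates rather than raw totals keeps a defenseman who plays 24 minutes a night comparable with one who plays 14.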

8. The Bruisers

Luke Schenn, Brenden Dillon, Arber Xhekaj, Brayden Pachal, Radko Gudas, Noah Juulsen, Simon Benoit, Nikita Zadorov, Jeremy Lauzon, Connor Clifton

When it comes to hitting opponents and taking penalties, these guys are in a league of their own. They hold the ten highest rates of hits per 60 minutes while also posting above-average penalty minutes. This group features a decent mix of guys who play on the top pair, like Brenden Dillon, Radko Gudas and Jeremy Lauzon, and depth defenders like Simon Benoit and Connor Clifton. They are, however, united in their offensive ineptitude: these ten players combined for only seven total goals before the Christmas break and sport the lowest average on-ice goals for of all eight clusters. One might assume that this is because they are relied upon to play a more defensive role, as the Bruisers average the second-highest defensive zone start percentage. But they aren’t particularly good at defense either, being on the ice for an average of 2.44 goals against per 60 minutes which, when subtracted from their goals for numbers, leaves them with the worst on-ice goal differential of the eight groups. This is where offense goes to die and defense fails to make up for it. But hey, they can hit.

Surprise Player: Brenden Dillon

Dillon, who plays on the Devils’ top pair, has actually posted pretty respectable numbers. His impressive 0.54 on-ice expected goals percentage is even an improvement on last year’s performance, which saw him in a second-pair role for one of the best defensive teams in the league in the Winnipeg Jets. And unlike the other players in this cluster, offense does seem to follow him, as he is on the ice for around 3.2 goals for per 60 minutes (89th percentile) and actually has a positive on-ice goal differential. The only reason he’s in this cluster is that his hits and penalty minutes are so high that the clustering algorithm can’t help but put him alongside the Bruisers.

7. The Turnstiles

Zac Jones, Cody Ceci, Ben Chiarot, Kaiden Guhle, Troy Stecher, Nicolas Hague, Henry Thrun, Jake Christiansen, Philip Broberg, Jamie Oleksiak, Kevin Bahl, Jan Rutta, Nolan Allan, Kris Letang, Morgan Rielly, Pavel Mintyukov, Jeff Petry, Braden Schneider, Henri Jokiharju, Jack Johnson, Matt Grzelcyk, Travis Hamonic, Timothy Liljegren, Logan Stanley, Brian Dumoulin, Marcus Pettersson, Rasmus Andersson, Alexandre Carrier, Owen Power, Ryan Lindgren, Cam Fowler, Pierre-Olivier Joseph, Jake Bean, Jacob Trouba, Mario Ferraro, Dylan DeMelo

The Turnstiles are a rag-tag bunch of youngsters (Pavel Mintyukov), fanbase punching bags (Jacob Trouba) and San Jose Sharks (Jan Rutta, Henry Thrun, Cody Ceci, Mario Ferraro). They are defenders in name only, as they offer very little resistance to opposing teams (hence the cluster name). Their specialty is allowing the opponent to fire an inordinate number of pucks on net, which causes them to give up 2.67 goals per 60 minutes, the worst mark of any group. If that weren’t bad enough, their on-ice goals for and on-ice goal differential are the second-worst of the clusters, behind only the Bruisers. These defenders’ advanced stats are equally brutal, as two-thirds of the Turnstiles find themselves in the bottom third of both Corsi and expected goals percentages. They are essentially a black hole on the ice and would be ranked last if not for the Bruisers.

Surprise Player: Nicolas Hague

As much as this list is about clustering individual players, team success does seem to be a major factor in the groupings. Of the 36 players in the Turnstile tier, only three play for teams with more wins than losses as of the Christmas break. I find the most surprising of this bunch to be Nic Hague, who not only plays for a winning team but plays for the Vegas Golden Knights, who lead the Pacific Division and were Stanley Cup champions just two years ago, a title run that Hague was a part of. He and partner Zach Whitecloud have been rock solid on the third pair for years now, so Hague’s poor on-ice analytics (0.43 Corsi and expected goals percentages) combined with opponents averaging 31 shots on goal per 60 minutes (7th percentile) when he’s on the ice are a bit unexpected.

6. The Robots

Ilya Lyubushkin, Jacob Bryson, Egor Zamula, T.J. Brodie, Isaiah George, Mason Lohrei, Jordan Harris, Olli Maatta, Jamie Drysdale, Ryan McDonagh, Declan Chisholm, Chris Tanev, David Savard, Bowen Byram, Justin Holl, Tyler Myers, Scott Perunovich, Alex Vlasic, Dennis Cholowski, Jon Merrill, Ryan Suter

The Robots are the most boring cluster of defensemen. They contribute very little offense, with their teams scoring the third-fewest goals when they’re on the ice, and aren’t very good on defense either, allowing the second-most goals. There isn’t one particular role that Robots seem to be tasked with; their defensive zone start percentages range from almost 40% (Chris Tanev) to just under 17% (Scott Perunovich). Their possession metrics are relatively mediocre, save for Jordan Harris’ comically bad 0.36 on-ice expected goals percentage. What brings them together is not just their inability to generate sustained offense, but their lack of desire to try, as they attempt an average of only 6.75 shots per 60 minutes, nearly 30% fewer than the second-lowest cluster. One positive thing about the Robots is that they are tough as nails, averaging the most blocked shots per 60 minutes, with just over five.

Surprise Player: Chris Tanev

There are a few names I could have listed here, as this cluster houses several players who actually have positive goal differentials and are performing well defensively, like Ryan McDonagh and Olli Maatta. But I decided to go with Tanev, whose on-ice goal differential of 0.44 and Corsi percentage of 0.53 are both in the top third of all defensemen. By both reputation (not many 34-year-old defensemen get handed six-year contracts in free agency) and the numbers, Tanev is a great defensive defenseman, much better than the clustering gives him credit for. Judging by the loadings of the first three principal components, which emphasize shotsongoalagainstper60 and shotattemptsper60, Tanev is being grouped as a Robot almost entirely because of his shot output, which on a per-60-minutes basis is the third-lowest out of 215 eligible defensemen. If Tanev were to attempt, say, 8 shots per 60 minutes, which equates to fewer than three per game, instead of his measly 5.74 per 60 minutes, he would likely find himself in the third-ranked cluster on this list, one that more accurately captures his defensive prowess.

5. The Unremarkables

Filip Hronek, Vincent Desharnais, Ian Cole, Rasmus Ristolainen, Martin Fehervary, Carson Soucy, Mattias Samuelsson, Brandon Carlo, Josh Manson, K’Andre Miller, Adam Larsson, Moritz Seider, Jayden Struble, Nick Seeler, Matt Roy, Erik Cernak, Haydn Fleury, Oliver Ekman-Larsson, Colin Miller, Ryker Evans, Calvin de Haan, Marc Del Gaizo, Connor Murphy, Emil Lilleberg, Zach Whitecloud, Wyatt Kaiser, MacKenzie Weegar, Justin Faulk, Joshua Mahura, Dmitry Kulikov, Alexander Romanov, Matthew Kessel, Jake McCabe, Brayden McNabb, Rasmus Sandin, Scott Mayfield, Zach Bogosian, Ryan Graves, Will Borgen, Ty Emberson, Tyler Kleven, Niko Mikkola, Charlie McAvoy, Joel Edmundson, Jake Middleton, Ryan Pulock, Colton Parayko, Andrew Peeke

As a whole, this group is pretty mediocre both offensively and defensively, ranking fifth in on-ice goals against and goal differential and fourth in on-ice goals for. Because there are so many members of this cluster, it’s hard to identify large trends amongst the cohort, other than the fact that this group has a little grit to it, with every single member of this cluster being above the 50th percentile in hits per 60 minutes. For most of the other variables, there is an equal distribution of players above and below the average. They are grouped together primarily because they attempt an average number of shots and allow an average number of shots on goal, and as we saw with the Robots, those two variables played a large role in how the clusters were formed.
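Statements like "ranking fifth in on-ice goals against" come from averaging each variable within a cluster and then ordering the clusters. A minimal sketch of that bookkeeping, with hypothetical function names and invented values, might look like this:

```python
from collections import defaultdict


def cluster_means(values, labels):
    """Average one variable within each cluster.

    values: one number per player; labels: that player's cluster name.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for value, label in zip(values, labels):
        sums[label] += value
        counts[label] += 1
    return {label: sums[label] / counts[label] for label in sums}


def rank_clusters(means, higher_is_better=True):
    """Map each cluster to its rank (1 = best) on one variable."""
    ordered = sorted(means, key=means.get, reverse=higher_is_better)
    return {label: position + 1 for position, label in enumerate(ordered)}


# Invented on-ice goals against per 60 for five players in three clusters.
goals_against = [2.0, 3.0, 1.0, 1.5, 4.0]
clusters = ["A", "A", "B", "B", "C"]
means = cluster_means(goals_against, clusters)
print(rank_clusters(means, higher_is_better=False))  # fewer goals against is better
```

The `higher_is_better` flag matters because several of the sixteen variables (goals against, shots on goal against, high-danger shots against) are better when lower.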

Surprise Player: MacKenzie Weegar

I would be justified in giving this spot to Filip Hronek, whose on-ice goal differential of 2.49 goals per 60 minutes leads all defensemen, but the fact that he has played fewer games than most due to injury and is usually stapled to reigning Norris Trophy winner Quinn Hughes when he does play makes me a bit more wary of arguing for him to be placed in a better cluster. Weegar, on the other hand, has mainly been playing with Daniil Miromanov who, before this season, had just 49 games of NHL experience. Yet the Flames are allowing only 22.77 shots on goal per 60 minutes when Weegar is on the ice, which is in the 92nd percentile for all defenders. What makes Weegar a true unicorn in today’s NHL is that he is the only defenseman in the league to be above the 75th percentile in shots on goal allowed, on-ice goals against, hits, and blocked shots, and he actually ranks above the 90th percentile in all four. Not only is he stellar defensively, he is also an incredibly gritty and hard-nosed defender, a combination that, as the data shows, is extremely tough to come by.
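The percentile figures quoted throughout can be reproduced with a small helper. The sketch below assumes one common definition (the share of players a given value beats, flipped for stats where lower is better); the exact definition used for this report is not stated, and the function names and numbers are illustrative only.

```python
def percentile_rank(values, x, higher_is_better=True):
    """Percentage of league values that x beats (assumed definition)."""
    if higher_is_better:
        beaten = sum(1 for v in values if v < x)
    else:
        beaten = sum(1 for v in values if v > x)
    return 100 * beaten / len(values)


def above_percentile_in_all(player, league, directions, threshold):
    """True if the player clears `threshold` in every listed stat.

    player: stat -> value; league: stat -> list of all players' values;
    directions: stat -> True when a higher value is better.
    """
    return all(
        percentile_rank(league[stat], player[stat], directions[stat]) > threshold
        for stat in directions
    )


# Invented ten-player league with two stats.
league = {
    "hits": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "shots_against": [20, 22, 24, 26, 28, 30, 32, 34, 36, 38],
}
directions = {"hits": True, "shots_against": False}
player = {"hits": 10, "shots_against": 20}
print(above_percentile_in_all(player, league, directions, threshold=75))
```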

His offense hasn’t been super effective this season, but he’s more than capable: last year Weegar became only the eleventh active defenseman to hit the 20-goal plateau in a single season. He has gone from being essentially the kicker piece in the Matthew Tkachuk-Jonathan Huberdeau trade from three years ago to a fixture on Calgary’s top pair and an effective PP1 quarterback, a transformation that, at this point, probably makes him more valuable to the Flames than Huberdeau. And with his position on the shot attempts for / shots on goal against scatterplot (top-left-most orange dot), he is right on the cusp of two much, much better clusters (ranked numbers 3 and 1 on this list), both of which he deserves to be in. Ironically, it's likely that his physicality is what's keeping him out of these "better" clusters, as Weegar has more hits per 60 minutes than anyone from Clusters 3 and 1, and hitsper60 is present as a loading in each of the first three principal components from the PCA, which means it has a large impact on the clustering.

4. The Two-Ways

Samuel Girard, Devon Toews, Josh Morrissey, Nick Jensen, Mikhail Sergachev, Erik Gustafsson, Nick Perbix, Daniil Miromanov, Adam Fox, Damon Severson, Noah Hanifin, Jake Sanderson, Alex Pietrangelo, Cam York, Brett Kulak, Jaccob Slavin, Simon Edvinsson, Ivan Provorov, Erik Brannstrom, Jackson LaCombe, Dylan Samberg, Juuso Valimaki, Shea Theodore, Dante Fabbro, Neal Pionk, Mike Matheson, Brock Faber, Lane Hutson, Trevor van Riemsdyk, Jared Spurgeon, Darren Raddysh, Noah Dobson, Esa Lindell, Travis Sanheim, Miro Heiskanen, Conor Timmins, J.J. Moser

As a whole, the Two-Ways are very similar to the Unremarkables, but with much more name-brand value. They attempt and allow almost exactly the same numbers of shots, yet the Two-Ways are on the ice for significantly more goals for their team and thus have a much better goal differential. Even though a two-way defenseman is theoretically supposed to be proficient at both offense and defense, the term is often used to refer to a player who is very good at offense and passable defensively, which is why it felt like an appropriate name for this cluster. While the Two-Ways leave something to be desired defensively, with the third-worst average on-ice goals against per 60 minutes, they are second in takeaways and third-best in shots on goal against, implying that they may just be getting a bit unlucky with their goaltending. Many of these players have been very solid so far this season, but a cut below the top tier of the league.

Surprise Player: Nick Jensen

After being dealt along with a third-round pick for Jakob Chychrun in the summer, Nick Jensen has had a resurgent season with the Ottawa Senators. Not only has his expected goals percentage improved from 0.44 last year to 0.53 this year, but he also sports the 6th-best on-ice goal differential in the entire league, at 1.297 goals per 60 minutes. His new role playing alongside Thomas Chabot, in which his defensive zone shift start percentage has gone down from almost 36% last year to 30% this year, has helped his offense too, as his 13 points in 34 games before the Christmas break nearly match the 14 points he had all of last season (in 78 games).

3. The Shutdowns

Jonas Siegenthaler, Jonas Brodin, Emil Andrae, Vladislav Gavrikov, Jonathan Kovacevic, Uvis Balinskis, Nate Schmidt, Jordan Spence, Mikey Anderson, Jacob Bernard-Docker, Brett Pesce, Luke Hughes

This crop of defensemen aggressively suppresses offense by allowing the opponent very few shots and scoring chances. They don’t generate much offense themselves (their average on-ice goals for is fifth out of eight clusters), but they allow so few shots and goals that their Corsi and expected goals percentages are sparkling, with eleven of the twelve players sporting percentages above 50% in both categories. They allow the fewest goals of any cluster and rank second in on-ice goal differential. The former can be partially attributed to their ability to limit prime scoring chances, namely high-danger shots and rebounds.

Surprise Player: Jonas Brodin

More than any other cluster, this group features players who work as defensive partners. This likely isn’t a coincidence, as it’s relatively common for one player’s strong performance to inflate the other’s. For example, given that Luke Hughes has played 96% of his minutes this season with Brett Pesce, it’d be safe to say that Hughes, whose on-ice expected goals percentage leads the entire NHL at 0.63 and is up from last year’s 0.52, is seeing his statistics benefit substantially from playing with the defensive-minded Pesce. Of the four players whose regular defensive partner isn’t a fellow Shutdown, the only one who doesn’t play on the bottom pair is Jonas Brodin. Not only that, he’s playing top-pair minutes for one of the top teams in the league in the Minnesota Wild, still going strong as one of the league’s best shutdown defensemen in his thirteenth season.

2. The Anchors

Erik Karlsson, Zach Werenski, Brandon Montour, Roman Josi, Brady Skjei, Nils Lundkvist, Jake Walman, Olen Zellweger, Darnell Nurse, Aaron Ekblad, Victor Hedman, Dougie Hamilton

Simply put, the Anchors shoot the puck a lot and give up a lot of shots on goal. Their defense and offense in terms of on-ice goals are actually both slightly worse than those of the Two-Ways, but their goal differential is better, which gives them the edge. They are sometimes guilty of sacrificing defense for offense, as they lead all clusters in shots attempted but also give up the second-most high-danger shots to the opposing team, behind only the Turnstiles. However, these defensemen are so talented that teams will usually put up with their occasional defensive blunders because of the offensive production they can bring (Erik Karlsson being a prime example). With a few exceptions, these are number one defensemen who can be the long-term centerpiece of a team’s blueline.

Surprise Player: Olen Zellweger

Nils Lundkvist is undoubtedly the most surprising name to show up in this cluster, but I chose Zellweger because his presence in the upper echelon feels more sustainable. Given that Lundkvist’s poor play got him benched during last year’s playoffs in favor of 32-year-old Alex Petrovic, who had been playing in the AHL for the past five seasons, there needs to be more consistency from Lundkvist before his favorable statistics can be viewed without skepticism.

What makes Zellweger’s statistics so outstanding is that he plays on one of the worst teams in the NHL in the Anaheim Ducks. Despite playing on a bottom-five team, Zellweger’s on-ice goal differential is in the 82nd percentile and his on-ice goals against is in the 92nd percentile, which is nothing short of remarkable. He is also the only player in this cluster, and one of only seven out of 215, to have his team not allow a single rebound goal when he’s been on the ice. While part of the reason he finds himself in this cluster and not the best one is that his team allows a lot of shots on goal when he’s on the ice, those shots haven’t resulted in many goals for the opposition. Zellweger’s offense hasn’t been eye-popping this year (his 0.6 points per 60 minutes are only in the 29th percentile league-wide despite his high shot attempts), but his defensive play has been so good that, although he didn’t end up in the Shutdown cluster, I’d be surprised if he doesn’t start developing a reputation as one of the more promising young defensemen in the entire league.

1. The Elites

Mattias Ekholm, Quinn Hughes, John Carlson, Thomas Harley, Sean Walker, Michael Kesselring, Brandt Clarke, Dmitry Orlov, Brent Burns, Jakob Chychrun, Cale Makar, Thomas Chabot, Evan Bouchard, Gustav Forsling, Sam Malinski, Shayne Gostisbehere, Jalen Chatfield, Rasmus Dahlin

This is the best of both worlds. The Elites score the most goals, allow the second-fewest, and have the best on-ice goal differential. They crush their minutes by driving offense like the Anchors but playing defense like the Shutdowns. Surprisingly, there is a solid mix of established superstars like Hughes and Makar and highly touted youngsters like Harley and Clarke. Given that a quick Google search turns up multiple videos of Evan Bouchard defensive blunders from this season alone, there is a bit of a “the best defense is a good offense” component to some of the players in this cluster, but they can still clamp down when they need to, as they lead all clusters in takeaways per 60 minutes. While they may not all be household names, according to the clusters, these players have been the cream of the crop to start the 2024-2025 season.

Surprise Player: Michael Kesselring

I could have gone with Jalen Chatfield, but seeing as the Hurricanes have five defensemen in this cluster, it feels like Chatfield’s inclusion says more about how the team plays than about his play as an individual. Instead, I want to highlight Michael Kesselring, a young defenseman for the Utah Hockey Club who is putting together an outstanding second full year in the NHL. Utah’s devastating injuries on the blue line have elevated Kesselring to top-four status, and he has responded by posting an expected goals percentage of 52%, a shot attempt rate in the 82nd percentile, and an on-ice goal differential of 1.176, the 12th-best out of 215 eligible defensemen. With just over 100 NHL games to his name, it’s definitely too early to call him a truly elite defenseman, but given that he is the only Utah blueliner to feature in any of the top three clusters, Kesselring is certainly off to a promising start. Not to mention he is the only member of the Elites playing on a team that, as of the Christmas break, had lost more games than it had won.






As can be seen from the PCA results (in the appendix), the variables that likely had the greatest impact on which cluster a defenseman was assigned to were shot attempts and shots on goal against. Both featured prominently among the loadings of each of the first three principal components, which together explain 91% of the variability in the dataset. (Hits were also a loading in each of the first three principal components, but the only clusters that were clearly defined by their hit statistics were The Bruisers and The Unremarkables.) The influence of these two variables is also pretty obvious in the full scatterplot of shotattemptsper60 (y axis) against shotsongoalagainstper60 (x axis), color coded by cluster.

While these clusters were by no means perfect, they actually did a pretty decent job of separating defensemen into tiers that are somewhat indicative of how good they are, as most of the clusters contain players of similar skill levels. However, there were also quite a few surprises that gave me a greater appreciation for certain players (Weegar and Zellweger come to mind). Improvements could surely be made, but overall this was an insightful and effective exercise.






Appendix

List of Variables Used in Clustering Analysis

I chose these variables subjectively, based on the factors that I thought would together help create a full picture of a defenseman's play.

  1. goalsper60 - Average number of goals scored by the player per 60 minutes of icetime for the player
  2. assistsper60 - Average number of assists by the player per 60 minutes of icetime for the player
  3. takeawaysper60 - Average number of takeaways by the player per 60 minutes of icetime for the player
  4. blocksper60 - Average number of shots blocked by the player per 60 minutes of icetime for the player
  5. hitsper60 - Average number of hits by the player per 60 minutes of icetime for the player
  6. penaltyMinutesper60 - Average number of penalty minutes by the player per 60 minutes of icetime for the player
  7. onIce_xGoalsPercentage - Percentage of the total expected goals belonging to the player’s team when the player is on the ice
  8. onIce_corsiPercentage - Percentage of total shot attempts by both teams belonging to the player’s team when the player is on the ice
  9. shotattemptsper60 - Average number of shot attempts by the player per 60 minutes of icetime for the player
  10. shotsongoalagainstper60 - Average number of shots on goal by the opposing team when the player is on the ice per 60 minutes of icetime for the player
  11. onIcegoalsForper60 - Average number of goals by the player’s team while the player is on the ice per 60 minutes of icetime for the player
  12. onIcegoalsAgainstper60 - Average number of goals by the opposing team while the player is on the ice per 60 minutes of icetime for the player
  13. oniceahighdangershotsper60 - Average number of high danger shots by the opposing team while the player is on the ice per 60 minutes of icetime for the player
  14. oniceareboundgoalsper60 - Average number of rebound goals by the opposing team while the player is on the ice per 60 minutes of icetime for the player
  15. dzoneshiftstartpct - Percentage of total shifts the player starts in his team’s defensive zone
  16. ozoneshiftstartpct - Percentage of total shifts the player starts in his team’s offensive zone
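The percentage-style variables in this list (items 7, 8, 15 and 16) are all shares of a total. As a minimal sketch with hypothetical function names and invented counts, they could be computed like this; the result is a fraction between 0 and 1, which is why the report quotes values such as 0.53 and 53% interchangeably.

```python
def share_for(events_for, events_against):
    """Fraction of on-ice events (shot attempts for Corsi, expected goals
    for xG) that belong to the player's team."""
    total = events_for + events_against
    if total == 0:
        raise ValueError("no on-ice events recorded")
    return events_for / total


def zone_start_share(zone_starts, total_shifts):
    """Fraction of all of a player's shifts that start in a given zone."""
    if total_shifts <= 0:
        raise ValueError("total shifts must be positive")
    return zone_starts / total_shifts


# Invented example: 530 shot attempts for and 470 against while on the ice.
print(share_for(530, 470))
# Invented example: 90 defensive zone starts out of 300 total shifts.
print(zone_start_share(90, 300))
```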

Principal Components Analysis Results
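For readers who want to see how loadings and the share of variability explained by each component are obtained, here is a minimal standard-library sketch. It is an illustration rather than the code used for this report: it finds components by power iteration with deflation instead of a library routine, all function names are hypothetical, and whether the original analysis scaled the variables first is not stated.

```python
def covariance_matrix(rows):
    """Sample covariance of the columns of `rows` (a list of equal-length lists)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def mat_vec(m, v):
    """Multiply matrix m by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_eigenpair(m, iters=1000):
    """Power iteration: dominant eigenvalue of a symmetric matrix and a unit eigenvector."""
    v = [float(j + 1) for j in range(len(m))]  # arbitrary non-uniform start
    norm = sum(x * x for x in v) ** 0.5
    v = [x / norm for x in v]
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:  # no variance left to explain
            break
        v = [x / norm for x in w]
    eigval = sum(a * b for a, b in zip(v, mat_vec(m, v)))
    return eigval, v


def pca(rows, n_components):
    """Return [(explained_variance_ratio, loadings), ...] for the leading components."""
    cov = covariance_matrix(rows)
    d = len(cov)
    total = sum(cov[i][i] for i in range(d))
    out = []
    for _ in range(n_components):
        eigval, vec = top_eigenpair(cov)
        out.append((eigval / total, vec))
        # Deflation: subtract the component just found before searching again.
        cov = [[cov[i][j] - eigval * vec[i] * vec[j] for j in range(d)]
               for i in range(d)]
    return out


def strongest_variables(loadings, names, top=2):
    """Names of the variables with the largest absolute loadings."""
    ranked = sorted(zip(names, loadings), key=lambda p: abs(p[1]), reverse=True)
    return [name for name, _ in ranked[:top]]
```

Summing the explained-variance ratios of the first three components gives the kind of cumulative figure quoted in the report, and `strongest_variables` is one simple way to see which variables dominate each component.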