Dead-Ball Defense

Poor defensive teams don’t just struggle on offense because bad teams are bad on both ends, but also because the start of an offensive possession is often affected by the conclusion of that same team’s previous defensive possession. If a team can get a stop, a rebound, and a push up the floor, the following offensive possession seems more likely to produce a positive result than if the ball has to be inbounded and walked up the court. I believe this happens because the offense can play faster after a live-ball and the defense is pressured to play in transition. These factors allow the offense to dictate the action. I wanted to confirm this notion as well as explore how the event before an offensive play affects its efficiency. Does it matter more whether it was a positive or negative defensive event, or does it matter more whether the ball is dead or live?

Method

Offensive plays from the 2013-14 NBA regular season were pulled from basketball-reference.com as well as the events prior to the start of those offensive plays. Offensive results were categorized as “made 2”, “made 3”, “missed fg”, “turnover”, “FT 2 of 2”, “FT 1 of 2”, “FT 0 of 2”, “FT 3 of 3”, “FT 2 of 3”, “FT 1 of 3”, “FT 0 of 3”, “made 2 and 1”, “made 2 and 0”, “made 3 and 1”, or “made 3 and 0” with point values of 0, 1, 2, 3, and 4 assigned accordingly. Prior events were categorized as “TOV ball stolen”, “TOV pass stolen”, “TOV 3 sec”, “TOV shot clock violation”, “TOV ball lost out of bounds”, “TOV pass out of bounds”, “TOV out of bounds”, “TOV offensive foul”, “TOV traveling”, “TOV other”, “player offensive rebound”, “player defensive rebound after ft”, “player defensive rebound after fg”, “team defensive rebound after ft”, ”team defensive rebound after fg”, “team offensive rebound”, “made shot”, “kicked ball”, “personal foul”, “made ft”, “technical ft”, “jump ball”, “loose ball foul”, “timeout”, or “start of quarter”.

Seven of these events (TOV ball stolen, TOV pass stolen, player offensive rebound, player defensive rebound after ft, player defensive rebound after fg, and jump ball) result in a live-ball and the others result in a dead-ball. Three are negative defensive events (made shot, made ft, defensive rebound after ft) and fifteen are positive defensive events (all TOVs, both player defensive rebounds, and both team defensive rebounds).
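The categorization above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the point values are my reading of "assigned accordingly" (for example, "made 2 and 1" counted as 3 points), the live-ball set contains just the events named in the text, and the function name and sample plays are made up.

```python
from collections import defaultdict

# Assumed point value for each offensive result named in the Method section.
RESULT_POINTS = {
    "made 2": 2, "made 3": 3, "missed fg": 0, "turnover": 0,
    "FT 2 of 2": 2, "FT 1 of 2": 1, "FT 0 of 2": 0,
    "FT 3 of 3": 3, "FT 2 of 3": 2, "FT 1 of 3": 1, "FT 0 of 3": 0,
    "made 2 and 1": 3, "made 2 and 0": 2,
    "made 3 and 1": 4, "made 3 and 0": 3,
}

# Live-ball prior events as named in the text; everything else is treated as dead-ball.
LIVE_BALL_EVENTS = {
    "TOV ball stolen", "TOV pass stolen", "player offensive rebound",
    "player defensive rebound after ft", "player defensive rebound after fg",
    "jump ball",
}

def points_per_play(plays):
    """plays: iterable of (prior_event, result) pairs. Returns PPP for 'live' and 'dead'."""
    totals = defaultdict(lambda: [0, 0])  # key -> [points, number of plays]
    for prior_event, result in plays:
        key = "live" if prior_event in LIVE_BALL_EVENTS else "dead"
        totals[key][0] += RESULT_POINTS[result]
        totals[key][1] += 1
    return {key: pts / n for key, (pts, n) in totals.items()}

# Made-up plays, only to show the shape of the input.
sample_plays = [
    ("TOV ball stolen", "made 2"),
    ("TOV ball stolen", "missed fg"),
    ("made shot", "made 3"),
    ("timeout", "turnover"),
    ("made ft", "FT 1 of 2"),
]
ppp = points_per_play(sample_plays)
```

The same grouping works for the positive versus negative split by swapping in a different set of events.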

deadball bar

 

Previous Event    Points Per Play
Dead Ball         0.883
Live Ball         0.994

Previous Event              Points Per Play
Positive Defensive Event    0.973
Negative Defensive Event    0.891

Points Per Play Allowed Following Event
             Positive Defensive Outcome    Negative Defensive Outcome
Dead Ball    0.891                         0.890
Live Ball    0.984                         0.890

Interpretation

As expected, plays following steals resulted in the highest offensive efficiencies since they’re what lead to fast-break opportunities. This is part of what makes steals valuable; they not only allow the opposing team zero points but they result in good offensive plays because the ball is still in play and the opposing team is usually playing defense way out of position. What also makes them significant is their marginal value; if a particular player hadn’t stolen a ball, it’s unlikely a teammate or replacement player would have. Compare this to a particular player getting a rebound or scoring a basket on a possession. It is much more likely that someone else would have done it anyway. That is not to say that going for steals is valuable, but actually getting a steal is valuable.

Plays following a live-ball result in more points per play on average than plays following dead-balls, and plays following a positive defensive outcome result in more points per play on average than plays following negative defensive outcomes. If we only look within dead-ball defensive outcomes, there isn’t a significant difference between positive and negative defensive outcomes. But looking within positive defensive outcomes, there seems to be a dead-ball effect. More specifically, live-ball TO’s result in 1.142 PPP compared to 0.882 after dead-ball TO’s, and player defensive rebounds after fg result in 0.950 PPP compared to 0.928 after team defensive rebounds after fg. Oddly, there isn’t a dead-ball effect within negative defensive outcomes. I’m guessing this could be either because there is an effect of having a negative defensive possession or because of the nature of the only possible live/negative event, which is a player rebound after a free-throw. The shooting team usually is already matched up in the backcourt and free-throw block-outs and the rebounding team almost never makes it a point to push the ball because of that. It seems that the higher efficiencies in offense have more to do with the ball staying in play and less to do with the psychological effect of just having done something good on defense.

Compared to other dead-ball turnovers, offensive 3-second calls might seem to produce plays that are too high in efficiency and double-dribble seem to produce plays that are too low. These two calls are the least frequent violations and account for only 0.16% and 0.02% of previous events so there might be some kind of sample size inadequacy going on. An alternative plot with confidence intervals can be seen below.

deadball scatter
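To see why rare events like those two violations give noisy estimates, here is a rough sketch of a confidence interval for mean points per play. It uses a plain normal approximation, which is my assumption and not necessarily how the intervals in the plot were computed, and the point sequences are invented.

```python
import math
from statistics import mean, stdev

def ppp_confidence_interval(points, z=1.96):
    """Approximate 95% interval for mean points per play (normal approximation)."""
    n = len(points)
    center = mean(points)
    half_width = z * stdev(points) / math.sqrt(n)
    return center - half_width, center + half_width

# Invented outcomes with the same average (1.0 PPP) but very different sample sizes.
pattern = [0, 2, 0, 3, 2, 0, 1, 0, 2, 0]
rare_event = pattern * 4       # 40 plays, like an infrequent violation
common_event = pattern * 400   # 4,000 plays, like a common prior event

rare_low, rare_high = ppp_confidence_interval(rare_event)
common_low, common_high = ppp_confidence_interval(common_event)
```

With identical averages, the interval for the 40-play event is far wider than the one for the 4,000-play event, which is why an unusually high or low PPP after a rare call should be read with caution.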

It is also interesting to note that plays following timeouts produce 0.865 points per play while plays following player defensive rebounds produce 0.950 points per play. You could also throw in plays following the start of a quarter, 0.859, because even though they aren’t timeouts, teams get a huddle. This suggests to me that either offenses are overall poor at executing ATO plays or maybe defenses become more prepared or energized. You could also argue that when down late, a coach is better off letting his team play after a defensive rebound rather than calling a timeout to draw up a play. However these are season-wide averages that aren’t specific to teams, time on clock, or point deficit.

Lastly, while these results suggest defense that leads to live-balls creates good offense, it can also be viewed from the opposite perspective. Creating an easier to defend situation relies on creating a dead-ball situation from the previous offensive possession. Therefore, made baskets are valuable to a team not only for the points they generate but because of the opportunities they give for teams to get back and get set. Again these are league and season-wide averages that do not apply to all situations. This is clear to anyone who’s ever seen Chandler Parsons sprint down the court for an easy lay-up after the Rockets allow a made basket.

Player Types: D-League Call-Ups


Many people’s impressions of the D-League are high pace and no defense. There’s definitely a lot of truth to that perception and I think some of it is due to what too many of the players are there trying to accomplish. There’s probably somewhat of a misunderstanding that they need to go out and gun for thirty every night to get to the NBA. In fact, Manny Harris earned himself a call-up after leading the D-League in scoring and about a third of players were their teams’ leading scorers at the time of their call-ups. While it’s important for potential NBA players to show that they can get buckets against inferior competition, they are most likely being offered contracts to be role players at the next level. For this reason I wanted to find some quantifiable evidence that different roles and contributions outside of high-use scoring can earn NBA contracts.

This task required both a method to define styles of play and a selection of statistics that would describe styles of play. I used k-means clustering to define styles since it’s a pretty straightforward way to categorize data points, and I chose Synergy %time offensive categories, since they would describe offensive style of play, along with some advanced box score statistics to describe other contributions. Statistics of 324 NBA players who played at least 41 regular season games were normalized relative to their level of competition and statistics of 33 D-League players who were called up were normalized relative to their competition. I used the first few iterations to remove outliers who were consistently far away from their cluster centroids, then defined 12 styles of play by clustering the remaining NBA data points. I then assigned the D-League players to 1 of the 12 well-defined clusters. D-League call-ups ended up being assigned to 7 of the 12 different clusters.
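The pipeline described above (normalize within each league, cluster the NBA players, then assign each call-up to the nearest NBA centroid) might look roughly like this. It is a toy sketch, not the actual analysis: z-scores are my assumed form of normalization, the two features and all numbers are invented, k is 2 instead of 12, and the outlier-removal pass is left out.

```python
import math
import random
from statistics import mean, pstdev

def zscore_columns(rows):
    """Normalize each column to mean 0 and std 1 within a single league."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # guard against zero variance
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in rows]

def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))

def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's algorithm; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iterations):
        labels = [nearest(p, centroids) for p in points]
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                centroids[i] = [mean(col) for col in zip(*members)]
    return centroids

# Invented rows: [share of offense from post-ups, assist percentage].
nba = [[0.30, 5.0], [0.28, 6.0], [0.32, 4.0], [0.02, 30.0], [0.03, 28.0], [0.01, 32.0]]
dleague = [[0.40, 8.0], [0.05, 35.0], [0.38, 9.0], [0.04, 33.0]]

# Cluster NBA players only, then drop each D-League player into the nearest NBA cluster.
centroids = kmeans(zscore_columns(nba), k=2)
assignments = [nearest(p, centroids) for p in zscore_columns(dleague)]
```

Normalizing each league separately is what lets a D-League post player land next to NBA post players even though his raw numbers come from a different level of competition.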

Final

Above is a scatter plot of all players plotted by 15 variables broken down into two principal components. A quick glance at the plot shows distinct separation of ball-handling, wing, and big players.

 post

Players were assigned to the above cluster most commonly. I would subjectively describe these nine players as bigs who primarily operate out of the post. They were statistically distinguished by deriving their offense from post-ups, put-backs, and cuts while gathering a high percentage of offensive and defensive rebounds. My eye says that in general they all have legitimate size for the NBA and were usually bigger players for their match-ups in the D-League. They were distinguished from another post cluster by having a much lower usage rate and being involved in fewer ball screens. That other cluster of “screening high-use post scorers” included players like Al Jefferson, Anthony Davis, Blake Griffin, DeMarcus Cousins, Dirk Nowitzki, Kevin Love, LaMarcus Aldridge, Marc Gasol, Nene, Pau Gasol, and Tim Duncan.

perimeterbig

Four players also got assigned to a cluster of perimeter-oriented bigs. They differed from the previous group by posting up less and spotting up and screening on the perimeter more. They also gathered a lower percentage of offensive and defensive rebounds than the two groups of bigs previously described.

utility

These six players were assigned to a cluster of NBA players that I would subjectively describe as versatile utility wings. They were statistically distinguished by being able to derive their offense off the ball from spot-ups as well as with the ball in isolations. They also gathered a much higher percentage of defensive rebounds than other wing clusters. This cluster looks like it’s on the bigger side for wing players and is composed of guys who are called power-forwards but have perimeter skills.

wing

There ended up being three clusters of shooters: one that mostly just spaced the floor and would catch and shoot, one that had plays run for them to get shots off screens, and this group. This cluster of shooters is a little more balanced, being able not only to spot up and come off screens but also to score a lot of points in transition. They also rebounded less and had a lower usage rate than the previous group of wings.

play

Four players were assigned to this cluster of playmakers. They are characterized as having the ball in their hands frequently by deriving their offense from ball-screens and isolations while having a high assist percentage. This is a little surprising to me since running an offense and making great decisions out of ball-screens seem like skills that would be difficult to project to the NBA. Kendall Marshall did it to some level of success this year but he’s also a lottery pick who only played 7 games in the D-League and has already spent some substantial time in the NBA. The other two still have to prove they can bring that same skill-set to the next level.

combo

Another cluster of playmakers showed up which seems to me like a group of combo-guards that can play with or off the ball. They were distinguished from the other group of ball-handlers by deriving more of their offense off the ball in spot-ups and hand-offs while also having a lower assist percentage. This is mostly an offensive description of these players so it matches my perceptions of DeAndre Liggins and Mustafa Shakur but Jorge Gutierrez actually has more of an identity as a perimeter defender. Perhaps next time I can find statistics that better reflect defensive contributions.

score

Four players ended up falling into a cluster of high-use perimeter scorers. They were statistically distinguished by deriving their offense from isolations, off screens, and hand-offs while having a lower assist percentage than the other two ball-handling clusters. While being a reliable go-to scorer in the D-League is a skill that teams believe can translate to the NBA, there are many other ways to get in. There were two clusters of high-use scoring players (one post and one perimeter) and only four call-up players ended up being assigned to either of them.

Though all of this seems a little obvious and it’s a fun exercise to analytically show it, it’s a big problem for coaches to deal with. Many players have been successful for most of their lives by playing a role that is glorified by fans, AAU, and the media but it is one that seldom works at the highest level. Many players are motivated to fill a box score while many coaches are trying to win games; both are hoping to make it to the next level. The D-League is still a great way to prepare for the NBA (the rules, court, and talent are more similar to the NBA than any other league) but the culture of conflicting interests could use some improvement.

Please leave your thoughts and criticisms.

2013 Team Projections v.2

I received a lot of feedback on my initial Team Projections, including some very constructive criticism. Let me take the time to acknowledge and thank my readers for taking the time to provide me with their thoughts, and address some specific points that were raised:

  • Team VORP was calculated by multiplying expected minutes per player with expected VORP of said player. This methodology required a little bit of guesstimating minutes allocations. These minute allocations take into account known injuries and historical durability.
  • Offensive and Defensive Rank is on a per-possession basis, rather than total PPG. Teams should not be inherently penalized or rewarded for the pace at which they play.
  • I used a simple logarithmic assumption based on draft spot to project rookie performance. Projecting rookie performance is something that I fully intend to address by the time the next draft rolls around; however, I simply didn’t have time to build and ground a model for this specific dimension. We’ll have to roll with these rookie assumptions for now.
  • Many readers keyed in on significant impacts to starting rosters, but ignored bench impacts. For example, Golden State’s acquisition of Andre Iguodala helps their starting lineup, but we must also take into account the departures of Jarrett Jack and Carl Landry, both exceptionally good bench players. The downgrade from Jack to Toney Douglas in particular should have a significant impact on the Warriors’ performance, given the almost 30 MPG he played.
  • A team’s number of wins does not always match up to their performance, which adds a margin of error to projecting wins. Even Pythagorean Wins, which are calculated retroactively, do not match 1:1 with observed wins. I would caution against losing the forest for the trees.
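Two of the points above, the minutes-weighted Team VORP and the gap between Pythagorean and observed wins, can be sketched as follows. The assumptions are mine: expected VORP is treated as a per-minute rate so that it can be multiplied by minutes, the exponent of 14 is one commonly used value for basketball rather than anything taken from this model, and every number is invented.

```python
def team_vorp(roster):
    """roster: list of (expected_minutes, expected_vorp_per_minute) pairs."""
    return sum(minutes * rate for minutes, rate in roster)

def pythagorean_wins(points_for, points_against, games=82, exponent=14.0):
    """Expected wins from scoring totals; the exponent is an assumed, commonly cited value."""
    pf = points_for ** exponent
    pa = points_against ** exponent
    return games * pf / (pf + pa)

# Invented roster: two players with guessed minutes and per-minute value.
example_vorp = team_vorp([(2000, 0.001), (1000, 0.002)])

# A team that scores exactly what it allows projects to a .500 record.
even_team_wins = pythagorean_wins(100.0, 100.0)
better_team_wins = pythagorean_wins(105.0, 100.0)
```

Because the Pythagorean figure is itself only an estimate of observed wins, any projection built on per-possession performance inherits that extra margin of error.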

I have also added in some new improvements of my own. Here is the changelog:


  • Updated rosters to reflect trades (e.g. Gortat/Okafor), player adds and drops
  • Changed rotations and minutes distributions to better account for player news (e.g. Nerlens Noel out for the year), team strategy (e.g. Sixers blatantly hardcore tanking), and other updates to injured player return timelines (e.g. Russell Westbrook out 4-6 weeks)
  • Adjusted rookie assumptions to be less strongly sloped
  • Added in an expanded role adjustment for players who are expected to see their minutes more than double. It is based on the observation that backup defensive big men who are thrust into larger roles see their performance plummet to account for superior observation. I did not apply this adjustment to Andrew Bogut, Kevin Love, or Amare Stoudemire on the basis that their minutes were limited due to injury, and thus “expanded role” probably doesn’t apply to them.
  • Added in a refined tanking factor.

Here are the revised 2013 win predictions:

2013 Season Predictions

These will be my final predictions before the season starts. Here’s looking to a great season!

BBB Exclusive Interview with Muthu Alagappan – Part 3


Hope you’ve been enjoying the interview content! Here’s Part 1 and Part 2.

Mike L:            Going in a slightly different direction, a lot of your work focuses on player similarities and team similarities and team composition. You say that the model is not used as a player skill set appraisal. For instance, Devin Ebanks is in the same category as Josh Smith but, Josh Smith, most people would consider a much more talented player. Have you considered rolling in a skill element whether it’s PER, VORP, or some other type of player analysis tool on top of this topological method you’ve created to dig a little deeper?

Muthu:            Yes, we have. The reason we often do style is that style allows us to find undervalued players. If it’s skill, it’s just going to tell you stuff you already know. By adding in skill level you just basically get a grouping of good players and bad players.  It’s going to clump all of the good players together and we’re going to say okay – LeBron James, Chris Paul – they’re all going to end up in the same place despite their different styles. With style, it allows us to separate types of good players and mix them in with players who have similar styles.  We just find more value in the style network.  We’ve tried it with skill and it’s good.  I just personally don’t see as much value in doing it that way but it definitely works.

Andrew:          Yes, that makes sense.  Let’s move on to the responses to criticism section of the interview.  What are the shortcomings of using per-minute stats (the scoring rate, rebounding rate)?  Do you plan to do it in a different way in the future analysis?  Also, what do you say about the confounding factor being pace, that people who play at a very fast pace wind up with a higher rebounding rate and higher scoring rate, et cetera?

Muthu:            That’s true.  On the confounding question, it’s definitely true that when you do things per-minute, team’s pace is going to influence the stats that their players put out, but at the same time, the alternative way to do it would be to do it per-possession, but per-possession doesn’t account for a player naturally playing at a faster pace.  If you use per-possession, you’re going to say these two guys both score 1.2 points per possession.  One guy might play the game at an inherently faster pace than the other and might be quicker at scoring than the other.  You don’t get a sense of that, you just get a sense for if you give them both one possession how much are they going to score.  You don’t get a sense for how many possessions that they’re going to create for you in a certain amount of time.

                    Basically, you win some and you lose some.  If you do it per-possession you lose a sense of how fast are the players going to play for you and if you do it per-minute then you’re going to lose a sense of how many possessions is the team naturally giving them which is causing their per-minute to be higher.  It’s variable.  We’ve actually done it with per possession.  So leading up to South by Southwest, we ran the same network with per-possession and it looked pretty much the same.  The differences really aren’t as big as people might think they would be.  They’re pretty comparable.
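The trade-off Muthu describes can be shown with a small sketch. The two players and all of their numbers are hypothetical: they score at the same rate per possession, but one plays for a faster team and so sees more possessions in the same minutes.

```python
def per_minute(points, minutes):
    """Scoring rate per minute played."""
    return points / minutes

def per_possession(points, possessions):
    """Scoring rate per possession while on the floor."""
    return points / possessions

# Hypothetical season totals; same minutes, different team pace.
fast_pace_player = {"points": 504, "minutes": 1000, "possessions": 2100}
slow_pace_player = {"points": 456, "minutes": 1000, "possessions": 1900}

fast_ppm = per_minute(fast_pace_player["points"], fast_pace_player["minutes"])
slow_ppm = per_minute(slow_pace_player["points"], slow_pace_player["minutes"])
fast_ppp = per_possession(fast_pace_player["points"], fast_pace_player["possessions"])
slow_ppp = per_possession(slow_pace_player["points"], slow_pace_player["possessions"])
```

Per possession the two look identical, while per minute the fast-pace player looks better. Per-minute rates alone cannot say whether that gap comes from the player or from his team's pace, which is the confound raised in the question.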

Andrew:          What about changing the per-minute stats into more commonly used per-game metrics?

Muthu:            Yes, you could do it like per 36 minutes or you can do it per game.  Per game, it counts for skill more than style so we tend to stay away from that.

Andrew:         The next question we had was Rob Mahoney of The New York Times called your model a novel execution of productive thought.  But he cited the one-of-a-kind catch-all category as a weak point.  You kind of answered this when you said you forced them into groups, but when the model forced players into the groups did you guys have to do it manually or is there a tool that you guys use to somehow fit them in?  What’s your response to that criticism?

Muthu:             There’s a tool within the software called gain. Topology gain is a sense of the amount of overlap between bins that are being clustered.  That kind of goes deeper into topology but essentially it’s a measure of how many connections you want to draw between nodes.  The lower the gain, the more distinct the groups are.  As you increase the gain, it kind of creates more overlaps between nodes.  What we do is, if we see a lot of one-of-a-kinds, we usually increase the gain and what that’s going to do is it’s going to cause the one-of-a-kinds to be connected with someone else.  Based on who they’re connected to we then use that as a sense for who they should be next to.

Andrew:         It’ll map it to, you said, another person or the whole cluster itself?

Muthu:           It’ll basically force that one-of-a-kind to be connected in with the larger network.

Andrew:         The next criticism was from TrueHoop’s Tim Calvan.  He said that the team configuration suggestions that you showed in your Sloan presentation didn’t describe how it created more wins.  What’s your thinking around team configuration leading to scoring differential or win prediction?  Do you have a response or was that something that you just entirely weren’t considering?  Have you looked into whether higher performing teams are more balanced, et cetera?

Muthu:            Yes, that’s a good question.  It kind of goes back to the question about what grouping of positions work best.  Again, it’s really tough because there is not one way of making a team successful, there’s multiple ways to be successful.  Some teams are really balanced, while teams like the Heat are very imbalanced as a team yet they are really good.  It’s hard to make categorical claims about team construction.  In the example I gave it was basically one hypothesis that maybe diversity on a team is related to a team’s success.  We haven’t delved too much deeper into that to prove that it’s true or false.  I think it’s kind of an unfair question for us to even want to answer because it is so tough to really think about.

Andrew:          That makes sense.

Mike L:             A question I have stemming off of that: do you believe that there’s any position that a team absolutely needs in order to be successful or do you think that a team can get around the lack of say a paint protector with sufficient talent at the other positions?

Muthu:            I think there’s some positions that are kind of stock positions that every team needs.  Every team needs at least one or two ball handlers.  There’s a bunch of different types of ball handlers but we need one of those.  I think it’d be hard to be successful without a paint protector on your roster.  The two-way All Star category is essential in that people are always saying you need a star to win.  No team has really ever won without a superstar so I think you kind of need two-way All Star mostly.  Mathematically, it’s hard to say exactly the formula that you need but I think there’s some that you definitely will need on your team.

Andrew:         Back to Ayasdi’s work:  the next question that I had was regarding applying the model to other sports.  Since you already gave the example of football, are you allowed to give out hypotheses on how to use the topological mapping tool on football or is that private information?

Muthu:            Yes, we probably can’t talk to you about specific football-related stuff but some of the stuff we do with basketball applies.  For example, the way we handle drafting in basketball pairs even better over to football. There’s more data for college football players because they stay in college longer.  The combine also gives them more data.  We can do drafting, I think, even better in football than we do in basketball.  We can do some injury prediction stuff even better in football, same reason.  There’s some limitations in football, everyone’s doing a different job.  Some people are blocking, others are just kicking, so on.  That makes it a little tougher.  But I would say a lot of our lessons for basketball do carry over to football.

Mike L:            When you say we are you referring to Ayasdi?

Muthu:            Broadly, yes.  It’s their tool, but on the sports side I work with one or two other people who help me with some of the projects, that’s kind of what I mean by we.

Mike L:            You formed a small team that focuses on sports analysis then.

Muthu:            Yes, it’s a small team of two or three people.

Mike L:            Sounds like you guys are doing some exciting work.

Muthu:            It’s been fun for sure.

Andrew:         Have you guys considered adding in on/off statistics?  I know there’s a lot of stuff around plus/minus as well as on/off.  What do you think about those statistics?

Muthu:            They tend to work pretty well.  When we’re trying to do a lot of columns and not just seven or 10, if we’re doing 20 or 30 or more we throw in stuff like plus/minus or adjusted plus/minus.  When we’re doing smaller stuff we don’t just because plus/minus is confounded by teammates and it tends to throw off the analysis in smaller columns.

Andrew:         Right, it would favor good teams.

Muthu:            Yes, exactly.

Andrew:         What about mapping players within preset age ranges to predict people’s career arc. For example looking at historical groups of sub age 24 players and looking at where they are at and how they figure within people’s development cycle?  Is that something you’ve looked into or have you not really tried to weight people by an age curve type thing?

Muthu:            We kind of do that with drafting.  We do it by age group.  We want to be able to predict career trajectories based on similarities by age points.  Yes, it’s something we definitely do.

Andrew:          Have you noticed any positional changes by people’s tendencies?  Like someone who’s really high rebounding rate is more likely to become X position?  Basically, the question is centered around positional changes.  Have you noticed anything around that?

Muthu:            Yes, we have, especially, again, in drafting we’ll see that college players who have a lot of assists tend to be really good corner three point shooters in the NBA, for example.  Something like that, that’s not intuitive.  You wouldn’t think that assists translates to corner threes.  I’m just using that as an example and I’m not saying it does but things like that we have found.  So certain college statistics translate into certain NBA statistics that you wouldn’t intuitively suspect.

Mike L:            How does that apply across college basketball? There’s a wide range of colleges and a wide range of talent, right? So playing in the Ivy League is not necessarily the same thing as playing in the ACC.  Just the pool of talent that you’re facing is different.  What sort of adjustments do you do in order to account for the difference in talent and the skill discrepancy?

Muthu:            We don’t do a lot of adjusting actually because I think when you adjust you start using subjective modeling and hypotheses that make the adjustments.  By doing so you’re skewing the data.  What we do is we just put in the raw data and it turns out that players who play in the Ivy League or smaller conferences tend to have statistics that just look different.  Their stat lines just look different from the guys from ACC because you tend to have one guy that leads the team in everything and some guys who don’t do anything.  It’s just a different looking stat line and because of that, it’s kind of cool, you naturally find players in smaller conference grouped together.  You’ll see Stephen Curry, Damian Lillard, and C.J. McCollum and all these guys in the same place.  You’ll see the small conference guys together even though you didn’t make any adjustments for that.  That’s cool because then you can map apples to apples; you can map small conference guys to small conference guys.  It’s pretty cool.

Mike L:            Then continuing off this question, let’s say you’re Jeremy Lin and you’re playing in the Ivy League and you’re posting extremely good stats, wouldn’t your stat line be higher overall when you compare it to someone playing in the ACC as a freshman who isn’t necessarily generating the same sort of volume just due to the talent he has on his own team?

Muthu:            It’s true but it’s the same reason that Josh Smith and Devin Ebanks are together because, again, our network is more style based.  Even though Jeremy Lin might have higher volume across everything, we’re looking at more of the distribution of the stats across rather than just the magnitude of them.  In that sense, Jeremy Lin is not only grouped next to other guys in high volume stats, he’s grouped next to guys who have a similar distribution across the entire arc.  Again, I think in a lot of ways it’s better not to make new adjustments.  If Jeremy Lin does show up by himself because no one else is putting these kind of numbers up, that’s fine, I’d rather treat him as an anomaly than try to force him to be like Kyrie Irving or someone else just after my adjustments.
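One simple way to picture "distribution rather than magnitude" is cosine similarity, which ignores how big a stat line is and only compares its shape. This is purely an illustration and not the topological method used in the Ayasdi software; the three stat lines are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """1.0 means the two stat lines have the same shape, regardless of volume."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical stat lines: [points, rebounds, assists, blocks].
high_volume_guard = [24.0, 4.0, 8.0, 0.4]
low_volume_guard = [12.0, 2.0, 4.0, 0.2]   # same shape at half the volume
high_volume_big = [24.0, 14.0, 2.0, 3.0]

# Style view: compare shapes only.
guard_shape_match = cosine_similarity(high_volume_guard, low_volume_guard)
big_shape_match = cosine_similarity(high_volume_guard, high_volume_big)

# Magnitude view: raw Euclidean distance between the stat lines.
guard_raw_distance = math.dist(high_volume_guard, low_volume_guard)
big_raw_distance = math.dist(high_volume_guard, high_volume_big)
```

By raw distance the high-volume guard sits closer to the high-volume big than to the low-volume guard, but by shape the two guards match. That is the sense in which a style-based grouping can put a high-volume and a low-volume player side by side.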

Mike L:            The question tying off of that then, how do you determine which players are two-way All Stars because I don’t really have great insight into categorization.  I would’ve assumed that it would be just players that have really high volume in certain stat categories.  How did you actually come up with that category?

Muthu:            The same way I talked about how we came up with every other position, we used the statistical tables and K-S scores to come up with how each position is different from each other position.  With two-way All Stars we just saw that in terms of every single column they were just better on both offense and defense than every other position.  Maybe except blocks or rebounding.  For the most part, they were just way above average.

Andrew:         You mentioned there’s a way to avoid or plan for injuries.  Are you allowed to share any insights on how you approached that?  I wouldn’t even know what columns to consider maybe outside of minutes played or something like that.

Muthu:            Yes, unfortunately, I would love to but I can’t go into details about that.

Andrew:         I had some miscellaneous questions.  You mentioned in your Sloan presentation that Jeremy Lin got compared to DeMarcus Cousins, which I found surprising because he’s a post player.  How did that grouping work out?  Do you remember specifically?  That’s a specific question I had.

Muthu:            We used a lot more statistics in college to group the players.  I believe DeMarcus played one year in college so his data was pretty sparse, it wasn’t very full.  Jeremy Lin played a lot more.  Again, in college the hard part is that the talent is varied and so huge so if you’re good you’re often grouped next to other good players even if you guys are totally different.  You will have post players next to wings just because they probably do good in certain categories.  Since that network we’ve gotten a lot better at doing college networks and we don’t have post players next to guards anymore.  But back then that’s just the way things clicked and the way we rationalized it was just that Cousins didn’t have a lot of data because he played for such a short time and they both were putting up pretty big numbers across the board.

Andrew:         Where do you see basketball analytics going in terms of tools that can be applied outside of topological mapping or was this one you just happen to hit gold working for Ayasdi.  What do you think?

Muthu:            I got pretty lucky there.  There’s a lot of tools.  Not just in medicine but in finance, in energy, in weather prediction.  Every industry is dealing with data analysis now.  It’s always cool to see how other industries handle data and to try to carry over some of the same lessons to sports and vice versa.  I don’t think this is the last tool we’ll see cross over from one area to another.  I think there’s going to be many, many more.  It’s a matter of the size of your data and what other industries are doing.  In terms of basketball analysis, I think the biggest challenge is still going to be moving from analytics to action because there’s still a huge wall between the front office and analytics and then getting to the court and influencing how an NBA player is going to play in the heat of the moment.  When his emotions are high, when he’s playing a game he’s played for 23 years, why is he going to do something differently because of what some guy with a calculator told him?  It’s still pretty uncertain so I think that’s really where the challenge is now – how do you get those analytics to turn into actual results.

Mike L:            I think we have time for a couple of not so serious questions now.  Do you have any predictions for this upcoming season?

Muthu:            I tend to stay away from predictions, I don’t know.

Andrew:         How many wins do the Rockets get, 55?  I’m hoping for 55.

Muthu:            I think the Rockets are going to do well.  I’m pretty high on them.

Andrew:         I’m pretty bullish too.

Michael N:     Did you ever use draft measurements, like height, arm span, standing reach, stuff like that?  I know that you try to stay away from pigeonholing someone just because their measurements don’t match tradition, but I think there’s a level of validity to saying that if you’re not this tall you probably shouldn’t be playing a certain style.

Muthu:            We have access to all the pre-draft data – bench press, hand size, and so on.  We’re working on it now again on the private side with different partners.  Yes, we’re looking at that stuff for sure.

Andrew:         Have you guys been using historical data more?  If you don’t normalize for the pre-1980s, the pace of play then was much faster than it is today.  Have you been using historical data, and has that come into play more in your work for teams or is it more within today’s world?

Muthu:            The historical stuff we do is more for fun than it is for teams.  It’s more for media insight and entertainment.  We do run into an adjustment problem, a pace problem, but we kind of live with it.

Andrew:         How much of a time commitment is your work with Ayasdi?

Muthu:            It fluctuates.  Over the summer, it’s over 10 hours a week.  Then this winter it’s probably a little under 10 hours a week.  We have other people working on the same stuff now, like I said, on a team.  We’re able to get stuff done still.

Andrew:         Thanks for your time.  I appreciate you taking time out of your busy schedule.  I know Medical School is pretty busy.  What’s your med school life like?

Muthu:            It’s not bad.  Monday through Friday good amount of class but it definitely doesn’t kill you.

Andrew:         What are your plans for the future in terms of sports work or are you going to be a full time doctor?

Muthu:            I don’t know, it’s still to be determined.  So far, I’ve been able to do both because of proximity and Ayasdi’s down the street from Stanford Med and the time commitment is okay on both sides.  It’s been fine so far.  At some point, I will have to figure it out but for now I’m just enjoying it.

Andrew:         Is it mostly people your age at Ayasdi?  Are the founders Stanford alums?

Muthu:            The CEO is maybe 30-something.  He’s a former PhD graduate at Stanford and the other two co-founders are math professors that have been teaching math for decades so they’re on the older side.

Andrew:         Thanks for joining us.

Muthu:            Thank you guys.  Good luck with everything you are doing.

2013 Team Projections based on VORP

Some must fall so that others may rise

Time to toss my hat into the win projection arena.

Before I jump into the predictions, let me lay out the big assumptions I used in forecasting:

ASSUMPTIONS

  • Calculations were based off of BBB’s VORP. For those of you who need a refresher on VORP
  • Players were largely assumed to have the same VORP as last season
  • Rookies were assigned an OVORP/DVORP based solely on a logarithmic function of their draft position (e.g. #1 pick Anthony Bennett has a 500% OVORP and a 200% DVORP, #2 Victor Oladipo has a 425% OVORP and a 175% DVORP etc.)
  • Trades (obviously) are not predicted or accounted for
  • Injured superstars got manual adjustments to their most recent seasons’ VORPs. Obviously, this is subject to lots of judgment

Let me state off the bat that some of these assumptions are, quite frankly, bad. Some of these assumptions will also have a significant impact on the predictive power. Unfortunately, we are forced to use these assumptions given personal time constraints. We will address these assumptions as time permits in the future, which should improve the model’s predictive power.

With that said, let’s get right to it:

2013 Predictions

Team Name               Wins  Losses  Off. Rank  Def. Rank
Atlanta Hawks             46      36         10         16
Boston Celtics            27      55         29         15
Brooklyn Nets             55      27          3         13
Charlotte Bobcats         27      55         22         27
Chicago Bulls             45      37         19          4
Cleveland Cavaliers       41      41         12         19
Dallas Mavericks          48      34          7         14
Denver Nuggets            50      32          6         11
Detroit Pistons           46      36         18          5
Golden State Warriors     44      38         16          9
Houston Rockets           46      36          8         17
Indiana Pacers            49      33         17          2
LA Clippers               50      32          2         24
LA Lakers                 26      56         26         26
Memphis Grizzlies         43      39         24          3
Miami Heat                62      20          1          6
Milwaukee Bucks           37      45         23         12
Minnesota Timberwolves    37      45         15         21
New Orleans Pelicans      34      48         11         28
New York Knicks           51      31          4         25
Oklahoma City Thunder     55      27          9          7
Orlando Magic             30      52         25         22
Philadelphia 76ers        25      57         30         10
Phoenix Suns              27      55         28         18
Portland Trail Blazers    32      50         14         29
Sacramento Kings          30      52         13         30
San Antonio Spurs         58      24          5          1
Toronto Raptors           35      47         20         23
Utah Jazz                 32      50         27         20
Washington Wizards        42      40         21          8

Some Quick Notes

  • Overall, greater parity is predicted for this upcoming season than the previous season
  • Teams expected to be significantly better: Cleveland, Detroit, Washington, Orlando
  • Teams expected to be significantly worse: LA Lakers, Boston, Memphis, Utah
  • Despite the addition of Dwight Howard, Houston is not projected to be significantly better
  • The Lakers are projected to be absolutely awful… and that’s assuming Kobe will be able to play approx. 20% of the available SG minutes

METHODOLOGY (More In-Depth)

Creating the 2013 projection was broken down into a few steps: determining team roster and minutes, projecting or adjusting for players without prior data, making adjustments, collating into a singular team OVORP/DVORP, adjusting for model fit, doing the baseline projection, and finally adjusting for team strategy.

Determining team roster was largely straightforward – I pulled the 10 players per team I thought most likely to see significant playing time (generally 2/position). Minutes were assigned by my best judgment, taking into account talent, historical minutes, injury proneness, and team direction. I then assigned each player’s 2012 VORP to the respective team.

Some players did not have a 2012 VORP, or had a skewed VORP due to limited minutes. For rookies, I assigned a VORP based on a logarithmic function of their draft spot: the higher the draft spot, the higher their offensive and defensive VORP. For players who missed 2012 due to injury (Rose, Bynum), I assigned them their 2011 VORP with a flat percentage reduction according to their reported rehab progress (e.g. Bynum’s reduction is more severe than Rose’s). For players who played very limited minutes (e.g. Danny Granger), I applied some judgment in determining their 2013 production.
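For anyone who wants to play with the rookie assumption, here is a rough sketch of what a logarithmic draft-position curve could look like. The form a - b*ln(pick) and the helper name log_curve_through are my own illustration, not the exact function used for these projections; the coefficients are simply fitted so that the curve passes through the two examples quoted in the assumptions (#1 pick: 500% OVORP / 200% DVORP, #2 pick: 425% / 175%).

```python
import math

def log_curve_through(value_at_pick_1, value_at_pick_2):
    """Build f(pick) = a - b*ln(pick) that passes through the two given points.

    Assumed functional form for illustration only; the real curve behind
    the projections is not spelled out in the post.
    """
    a = value_at_pick_1                                   # ln(1) = 0, so f(1) = a
    b = (value_at_pick_1 - value_at_pick_2) / math.log(2)  # forces f(2) to match
    return lambda pick: a - b * math.log(pick)

# Anchored on the two rookies quoted in the assumptions list.
rookie_ovorp = log_curve_through(500.0, 425.0)
rookie_dvorp = log_curve_through(200.0, 175.0)

for pick in (1, 2, 5, 14, 30):
    print(f"pick {pick:>2}: OVORP {rookie_ovorp(pick):6.1f}%  DVORP {rookie_dvorp(pick):6.1f}%")
```

Whatever the actual coefficients, the shape is the point: the drop from pick to pick is steep at the top of the draft and flattens out toward the end of the first round.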

Upon determining all teams’ rosters, minutes, and player VORPs, I could then create a minute-weighted total VORP.
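A minimal sketch of the minute-weighting step follows. The exact weighting scheme is not spelled out above, so this assumes each player’s VORP is scaled by the share of a 48-minute game he is expected to play and then summed. The Player class, the team_vorp function, and the two roster entries are hypothetical and only there to make the arithmetic concrete.

```python
from dataclasses import dataclass

GAME_MINUTES = 48.0  # regulation NBA game length

@dataclass
class Player:
    name: str
    minutes_per_game: float
    ovorp: float  # offensive VORP, in percent
    dvorp: float  # defensive VORP, in percent

def team_vorp(roster):
    """Sum each player's VORP weighted by his share of a 48-minute game.

    Assumed weighting for illustration; not necessarily the scheme
    used for the projections in this post.
    """
    total_o = sum(p.ovorp * p.minutes_per_game / GAME_MINUTES for p in roster)
    total_d = sum(p.dvorp * p.minutes_per_game / GAME_MINUTES for p in roster)
    return total_o, total_d

# Made-up two-man "roster" purely to show the calculation.
roster = [
    Player("Starter A", 36.0, 400.0, 100.0),
    Player("Reserve B", 12.0, 200.0, 300.0),
]
team_o, team_d = team_vorp(roster)
print(team_o, team_d)
```

Weighting by minutes is what lets a manual minutes assignment matter: shifting playing time from a low-VORP player to a high-VORP one moves the team total even though no individual VORP changes.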

Because I had run validation on the 2011 and 2012 seasons, this initial VORP could then be adjusted for model fit. By dividing the summed total VORP by the total number of wins, we could determine the VORP% required to generate a single win (approx. 860% = 1 win). Dividing a given team’s VORP by this figure gave a preliminary win forecast. This preliminary forecast was then adjusted for the probability and expected severity of tanking.
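The conversion from VORP to wins can be sketched roughly as below. The 860% figure is the one quoted above; everything else is an assumption for illustration. In particular, the tanking adjustment is modeled here as probability times expected wins lost, which is one plausible reading of "probability and expected severity" rather than the actual adjustment used, and the team total in the example is invented.

```python
VORP_PCT_PER_WIN = 860.0  # approximate figure quoted in the text

def calibrate_vorp_per_win(summed_vorp_pct, total_wins):
    """VORP% needed per win, estimated from past seasons' league totals."""
    return summed_vorp_pct / total_wins

def preliminary_wins(team_vorp_pct, vorp_per_win=VORP_PCT_PER_WIN):
    """Convert a team's total VORP% into a raw win forecast."""
    return team_vorp_pct / vorp_per_win

def adjust_for_tanking(wins, tank_probability, wins_lost_if_tanking):
    """Subtract the expected wins lost to tanking (assumed form)."""
    return wins - tank_probability * wins_lost_if_tanking

# Invented example: a team with a total VORP of 34,400%.
raw = preliminary_wins(34400.0)
adjusted = adjust_for_tanking(raw, tank_probability=0.5, wins_lost_if_tanking=6.0)
print(raw, adjusted)
```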

Determining O-Rank and D-Rank was as simple as ranking the adjusted total team OVORP and DVORP.
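In code, the ranking step is a one-liner. The sketch below assumes that a higher OVORP or DVORP is better, so the team with the largest value gets rank 1; the function name rank_desc and the three-team input are hypothetical.

```python
def rank_desc(values_by_team):
    """Rank teams from highest value (rank 1) to lowest."""
    ordered = sorted(values_by_team, key=values_by_team.get, reverse=True)
    return {team: rank for rank, team in enumerate(ordered, start=1)}

# Invented adjusted team OVORP totals, in percent.
ovorp_totals = {"Team A": 300.0, "Team B": 500.0, "Team C": 100.0}
off_rank = rank_desc(ovorp_totals)
print(off_rank)
```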

Anyways, those are my predictions. Comment away!