Steven Clinton, also known to BeerLife Sports fans as “The Professor”, is an expert quantitative modeler and former college football researcher at his alma mater of Northwestern, where he broke down film for the Wildcats.
In 1976, a British statistician named George Box wrote that “All models are wrong, but some are useful.” This quote may seem a strange way to start off learning about Sports Betting Prediction Models, but I believe it’s important to understand that when someone says they’re running a model to come up with picks, it doesn’t mean any one thing, and it certainly doesn’t mean that their numbers are “right.”
What could they mean? On a simple level, running a model to come up with picks indicates that the individual is using a computer program to create game projections based on past data.
There are a wide variety of “machine learning” techniques that employ different statistical methods to create these projections. If you’re familiar with the equation of a line, y = mx + b, then you have some general familiarity with the most basic machine learning model, which is termed linear regression.
If you took one class of students who had taken a midterm and final exam, you could draw a “best-fit” line through the scores that “minimizes error.” It wouldn’t be perfect, but it would provide an estimate, based on one sample, of the expected final exam score based on the midterm score, and that estimate could be applied to the next class after they take their midterm to predict their final exam scores. In this example, the first class would be “the training group” and the second class the “test group.” You can get into using what is called a “validation group”, but that’s beyond the scope of this discussion.
In this case, we would call the midterm score the “predictor” variable and the final exam score the “response” variable, which are terms I will use as we move forward into a discussion of my NFL model.
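To make that concrete, here’s a minimal sketch of the exam example in Python. The scores are invented for illustration, and the closed-form least-squares fit stands in for what a statistics library would do:

```python
def fit_line(xs, ys):
    """Fit y = m*x + b by minimizing squared error (simple linear regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    m = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    b = mean_y - m * mean_x
    return m, b

# "Training group": last year's class, as (midterm, final) score pairs.
midterms = [62, 70, 75, 81, 88, 94]
finals = [60, 73, 74, 85, 86, 95]

m, b = fit_line(midterms, finals)

# "Test group": predict a new student's final exam score from their midterm.
predicted_final = m * 80 + b
```

The best-fit line won’t be perfect, but it gives you an estimate for any student in the next class the moment their midterm score comes in — the predictor produces the response.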
Machine learning gets much more complicated as you go beyond linear regression, but that example should be enough to get us started. I can’t speak to the details of how others have built their models, but with our example in hand, I’m going to walk you through the general method behind my NFL model, and call out where I have considered doing things differently.
Step 1: Pace of game metrics
I use projected turnover and 3rd down conversion rates as predictor variables to project the number of drives each team will have in a game. For turnovers, I look at a team’s pre-recovery fumble rate over the past few seasons and multiply it by 0.5 to get rid of “recovery luck.”
For interceptions, I look at a quarterback’s historic interception rate and a projected number of pass attempts per game for their team. For quarterbacks with no or limited data, I rely on the early-career results of a player with a similar scouting profile (the “comp”). I’ve played with many other predictor variables, but this model doesn’t produce a huge range of outcomes, and I’m of the belief that simpler is better unless there are notable returns from getting complicated.
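As a rough illustration of those turnover inputs, here’s a hypothetical sketch. Every rate and attempt count below is invented, and the real model feeds these numbers into a drive-count projection rather than stopping here:

```python
def adjusted_fumble_rate(fumbles_per_game: float) -> float:
    """Halve the raw (pre-recovery) fumble rate to strip out "recovery luck":
    roughly half of all fumbles are recovered by the fumbling team."""
    return fumbles_per_game * 0.5

def projected_interceptions(int_rate: float, pass_attempts: float) -> float:
    """Historic interception rate times projected pass attempts per game."""
    return int_rate * pass_attempts

# Invented example inputs for one team:
fumbles_lost = adjusted_fumble_rate(1.6)       # 1.6 fumbles/game -> 0.8 lost
picks = projected_interceptions(0.025, 34)     # 2.5% rate on 34 attempts
projected_turnovers = fumbles_lost + picks     # one predictor for drive count
```

The point of the 0.5 multiplier is that fumble recoveries are close to a coin flip, so a team’s raw fumbles-lost number mixes skill (putting the ball on the ground) with luck (who fell on it).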
Step 2: Drive predictions
I create relevant training sets of drive-by-drive data for each offense that are primarily based on the offensive coordinator/quarterback combination. These drives have ratings for the offense and defense attached, and in the test set, I assign a current rating to each offense and defense in the NFL.
I ultimately rely on my own rating systems based on my film study, but also use some publicly available rating systems produced by others to monitor my own performance and consider whether or not I’m overrating or underrating a particular unit.
Once the training set is trimmed, and the current ratings are assigned, I use six different machine learning methods to produce predictions of the expected points per drive for a given offense against a given defense. All of these methods produce different results, and while I currently only use two in my final model, I like to monitor results from the others, which I built in during experimental phases, as they don’t take a significant amount of time to run. No one modeling technique works optimally for every machine learning problem, which is why I have experimented with so many.
I combine the expected points per drive with the projected number of drives to produce score projections. In addition, I use the drive-by-drive information to produce expected passes per drive and runs per drive, which feed into carry and target shares.
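The combination step itself is simple. Here’s a toy version, with invented numbers, of what the text describes: average the points-per-drive predictions from the two methods kept in the final model, then multiply by the projected drive count:

```python
def project_score(ppd_predictions: list[float], projected_drives: float) -> float:
    """Combine per-drive point predictions (one per modeling method)
    into a single projected score for one offense vs. one defense."""
    avg_ppd = sum(ppd_predictions) / len(ppd_predictions)
    return avg_ppd * projected_drives

# Invented example: two methods say this offense averages 2.1 and 2.3
# points per drive against this defense, over a projected 11 drives.
score = project_score([2.1, 2.3], 11)  # roughly 24.2 points
```

A simple average is only one way to blend methods; weighting each method by its historical accuracy would be a natural refinement.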
Step 3: Carry and Target Shares
I work with a five-player carry share that includes the quarterback, and a ten-player reception share. As I do in turnover projections for quarterbacks with a limited NFL sample, I use player comps for running backs, wide receivers, and tight ends who haven’t accumulated significant NFL data.
These shares rely on two things to determine who will get the ball: past carry/target shares and a ranking of “who eats first.” In many cases, I’ll use a “comp” with a larger or smaller carry or target share for a player who I project will have similar efficiency at a different volume, then rank that player lower in the “food chain” to scale the volume accordingly. As I alluded to, this process also produces efficiency projections: yards per carry for runners, and target conversion rates and yards per reception for receivers.
All of the receiving numbers are aggregated to produce the quarterback’s passing statistics, and I have a number of scripts to print out these projections in different formats.
Again, this is only one way to go about modeling NFL outcomes. Some models will run thousands of simulations that incorporate the yard line for each drive and play out many iterations of the same game. Others get down to play by play granularity. I’ve played around with both those methods, but ultimately determined that my “average drive” method was more efficient and accomplished my purposes.
If I had more computing power on my machine, I might reconsider, but I’ve been pleased with the results I’ve gotten over the past few years. That said, while I use the average method, I have no criticism of anyone running more simulations or getting down to play-by-play granularity; such methods are perfectly valid.
Of course, the inevitable injuries in the NFL complicate this whole process, particularly when doing full-season projections before the season kicks off.
I’ve played with limiting the number of games for injury-prone players in the past, but I currently drop such players down in the target share to limit their game-by-game production. The downside to this method is that on a per-game basis, the player’s estimates come in low; the downside to guessing which games they might miss is that I have no clue when, or if, the player will actually get injured.
To compensate, I note which players are injury-prone in a “Risk” column for Fantasy Boards, which lets my readers know which players should produce above their projection if they are healthy. However, in the case of suspensions, I will take a player out of the rotation for the relevant weeks. For instance, Will Fuller of the Dolphins is suspended for Week 1 of 2021, so he comes out, and Preston Williams and Jakeem Grant move up the food chain, and Lynn Bowden Jr. gets a seat at the table in target share. As an aside, the Dolphins have a lot of capable players vying for targets in 2021.
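The mechanics of pulling a suspended player out of the rotation can be sketched as a simple remove-and-rescale. The share values below are invented, and a player newly entering the rotation (as Bowden does) would first be inserted into the food-chain ranking; this sketch shows only the renormalization:

```python
def redistribute(shares: dict[str, float], out: str) -> dict[str, float]:
    """Drop one player from the target share and rescale the remaining
    shares to sum to 1, so everyone below him in the "food chain"
    moves up proportionally."""
    remaining = {p: s for p, s in shares.items() if p != out}
    total = sum(remaining.values())
    return {p: s / total for p, s in remaining.items()}

# Invented Week 1 target shares with the suspended player removed:
week1 = redistribute(
    {"Fuller": 0.22, "Williams": 0.20, "Grant": 0.12, "Others": 0.46},
    out="Fuller",
)
# Williams's share rises from 0.20 to 0.20 / 0.78, about 0.256.
```

Proportional rescaling is the simplest choice; in practice you might push more of the freed-up volume to players with a similar role to the one who left.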
My least favorite scenario is what is currently happening in Green Bay with Aaron Rodgers. I’m currently splitting the difference by starting Jordan Love for a number of games and constantly debate how many in my mind. I’m actually on the verge of yanking Rodgers entirely, but it would certainly make my life easier if the situation gets resolved soon.
Then again, I’m going through the same exercise with the Drew Lock vs. Teddy Bridgewater, Jimmy Garoppolo vs. Trey Lance, Andy Dalton vs. Justin Fields, and Cam Newton vs. Mac Jones competitions, but because I know I won’t have a clear answer to any of them before the season, they don’t bother me as much.
The machine learning methods are worthless if the training sets aren’t relevant. I spend the offseason projecting what will happen when Matt Ryan joins up with Arthur Smith in Atlanta, or what Kyle Shanahan’s offense will look like if/when Trey Lance takes over for Jimmy Garoppolo at quarterback. There are no perfect solutions here, but if I simply ran my model off the last “X” years of drive data without accounting for context, I would get abysmal results.
That’s where subject matter expertise becomes so important. When I was getting my Master’s Degree, I worked on machine learning problems in a variety of areas, and while I understood modeling well enough to produce decent results, any expert in a specific area would have run circles around me.
The same was true when I worked for an automotive research and development company; my modeling knowledge took me part of the way, but it was my collaboration with our engineers and their willingness to help me build my subject matter knowledge that had the bigger impact on whether my models were useful.
At its core, any type of machine learning or artificial intelligence uses past results to project future outcomes, but simply saying you “use analytics” is like saying you “do art” or “exercise”. Such statements give a general idea, but the specifics can vary wildly. If you’re going to rely on someone’s model, I’d recommend getting an idea of the modeler’s football expertise, which most fans are capable of judging. Their modeling expertise is harder for a fan to evaluate, but you can certainly assess the model’s output to get an idea of whether the results make sense.
As George Box said, the model will certainly be wrong, but it may be useful.