How does this all work?
We take polling, a bunch of other key indicators (like money raised from individual contributors) and throw them all into a machine to create estimates.
Haha. Very funny, but really, how does this work?
Without getting too nerdy too fast, there are a few key elements you should know about the forecast overall.
Whenever we make a forecast, it will be presented as the margin between the two leading candidates or parties.
The estimates will change at least a little bit daily (and sometimes more) based on new information, so check back often!
There will be a margin of error (a 95% confidence interval) around each forecast. We size that margin based on how far off our models were in the past. Many different errors can accumulate, and they can be correlated with each other, so we try to control for as many as we can. Even so, a result is going to fall outside of our confidence interval about 1 out of 20 times.
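To make the margin of error concrete, here is a minimal sketch in Python of one way past misses could be turned into a 95% interval. The function name, the made-up error values and the assumption that errors are roughly normal and centered on zero are all hypothetical choices for illustration, not a description of the actual model.

```python
def interval_95(forecast_margin, past_errors):
    """Return a (low, high) 95% interval around a forecast margin.

    past_errors: forecast margin minus actual margin in earlier
    elections, in percentage points. Assumes (for illustration only)
    that errors are roughly normal and centered on zero.
    """
    # Root-mean-square error: the typical size of a past miss.
    rmse = (sum(e * e for e in past_errors) / len(past_errors)) ** 0.5
    # About 95% of a normal distribution lies within 1.96 standard deviations.
    half_width = 1.96 * rmse
    return forecast_margin - half_width, forecast_margin + half_width


# Hypothetical numbers: a 4-point lead and five made-up past misses.
low, high = interval_95(4.0, [3.0, -5.0, 2.0, -4.0, 6.0])
print(f"{low:.1f} to {high:.1f}")
```

When an interval like this one straddles zero, either side could plausibly win, even though one side is ahead in the forecast.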
Want more specifics? … Read on …
The House forecast is mostly based on data since 2006 and has four main elements.
The fundamentals: How much money each candidate has raised from individuals, whether the incumbent is running, the ideology of the incumbent, whether the incumbent has suffered a scandal, whether the incumbent is a freshman and how the district has voted in past federal, state and local elections. (These final pieces of data were provided by TargetSmart.)
Race ratings: These are how CNN, the Cook Political Report, the Crystal Ball and Inside Elections “rate” each race (e.g. strongly Democratic, toss-up, strongly Republican, etc.). Historically, these organizations have done an excellent job of assessing races.
National polling: We adjust the fundamentals and race ratings in each district by the generic congressional ballot. Because it is so important in our forecast, we base the confidence intervals around this measure on midterm polling dating all the way back to 1942.
District polling: We use polls from both nonpartisan and partisan organizations, but control for whether a poll was conducted for a partisan or nonpartisan organization. District polls are far from perfect, but they can be leading indicators and catch onto late trends.
Both the House and Senate models give more weight to higher-quality pollsters, and the most recent polls receive the most weight.
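As a rough illustration of weighting polls by quality and recency, here is a small Python sketch. The 0-to-1 quality score, the 14-day half-life and the function name are assumptions made up for this example. The actual weights used in the models are not spelled out here.

```python
def weighted_poll_average(polls, half_life_days=14.0):
    """Average poll margins, favoring better pollsters and newer polls.

    polls: list of (margin, quality, days_old) tuples. `quality` is a
    hypothetical score between 0 and 1; `half_life_days` is a made-up
    setting that halves a poll's weight every 14 days.
    """
    total = 0.0
    weight_sum = 0.0
    for margin, quality, days_old in polls:
        # Recency weight decays exponentially with the poll's age.
        recency = 0.5 ** (days_old / half_life_days)
        weight = quality * recency
        total += weight * margin
        weight_sum += weight
    if weight_sum == 0:
        raise ValueError("no usable polls")
    return total / weight_sum


# Hypothetical polls: (margin in points, quality score, age in days).
polls = [(6.0, 0.9, 2), (1.0, 0.5, 30), (4.0, 0.7, 10)]
average = weighted_poll_average(polls)
print(round(average, 2))
```

In this made-up example the fresh, high-quality poll pulls the average well above the simple mean of the three margins.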
The Senate model is simpler than the House model. It is mostly based on data since 1992 and has three main elements.
Statewide polling: This is simple enough, but keep in mind that no partisan polls are used in the Senate model.
Fundamentals: How much money is raised from individuals, how the state has voted in past federal, state and local elections and the quality of the candidates (senators and governors are rated highest, while those with no elected experience are rated lowest). The generic congressional ballot adjusts all of these measures to account for the national political environment.
Combining the statewide polling and fundamentals: Polling is pretty good at this point, but the fundamentals will always carry some weight. In states with more polling, the fundamentals receive less weight. In states with less or no polling, the fundamentals receive considerably more weight in our forecast.
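One simple way to picture the trade-off between polling and fundamentals is a weighted blend in which the polling weight grows with the number of polls. The Python sketch below is only an illustration under that assumption. The function name and the constant `k` are hypothetical, and the real model may combine the two quite differently.

```python
def blend(poll_average, fundamentals, n_polls, k=3.0):
    """Blend a polling average with a fundamentals estimate.

    n_polls: how many polls the state has.
    k: hypothetical tuning constant; with k polls, the two inputs
       count equally. Not a value taken from the actual model.
    """
    if n_polls == 0 or poll_average is None:
        # No polling at all: rely entirely on the fundamentals.
        return fundamentals
    poll_weight = n_polls / (n_polls + k)
    return poll_weight * poll_average + (1 - poll_weight) * fundamentals


# Hypothetical state: polls say +6, fundamentals say +2.
print(blend(6.0, 2.0, n_polls=3))   # lightly polled
print(blend(6.0, 2.0, n_polls=27))  # heavily polled
print(blend(None, 2.0, n_polls=0))  # no polls
```

Note that the polling weight in this sketch never reaches 1, which matches the idea that the fundamentals always carry some weight.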
One final forecasting note: a party may be forecast to win more individual races than our topline forecast suggests it will. On the Senate side, for example, our forecast has Democrats winning a lot of close races. Although the results from one state to another and one district to another are correlated to some degree, the correlation is far from perfect. On average, we expect the leading party to lose a few of the seats that are very close.
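The arithmetic behind that note can be sketched in a few lines of Python. The win probabilities below are invented for illustration and are not taken from the forecast. The point is only that counting the races a party leads gives a bigger number than adding up its chances of winning each one.

```python
def races_led(win_probs):
    """Count the races where the party is favored (above 50%)."""
    return sum(1 for p in win_probs if p > 0.5)


def expected_seats(win_probs):
    """Expected number of wins: the sum of the win probabilities.

    This holds whether or not the races are correlated; correlation
    changes the spread around the total, not the average itself.
    """
    return sum(win_probs)


# Hypothetical party: narrowly favored in four races, safe in one,
# a long shot in another.
probs = [0.55, 0.6, 0.58, 0.52, 0.9, 0.1]
print(races_led(probs))                 # 5
print(round(expected_seats(probs), 2))  # 3.25
```

Here the party leads in five races but should expect to win only a little more than three of them on average.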
Now keep in mind this is the first time anyone at CNN has gotten into forecasting. This is an experimental product and not an official CNN forecast. We don’t expect to get every race right.
When we forecast, we’re aiming not just to tell you about the here and now. We’re trying to figure out the future and how certain we are about our future estimate.
You’ll note I keep saying we, and that’s because this is a team effort.
First off, my partner, CNN’s Sam Petulla, helped to oversee this product from start to finish. Had he not been on board, these forecasts would never have made it past the planning stages.
Indeed, an entire website was built in record time by Vijith Assar, Matt Conlen, Will Mullery, Brad Oyler and Sam. They worked around the clock, and even if the forecast ends up being wrong, this website will perform superbly and look great.
The forecasts, too, were a team effort. The House forecast, for instance, is the product of work by Ohio State professor Brice Acree, data scientist Parker Quinn (who wrote his master’s thesis on predicting House elections) and me. They, in fact, did the majority of the House work.
If you’ve made it this far, you may have wanted more. We’ll talk more about the models in particular in a later post.