December 5, 2013

I think most people who have watched "Breaking Bad" can agree it is really good. But how good exactly? That's what I wanted to know, so I turned to statistics, and more specifically to R. With this open source programme, you can load packages that handle all sorts of cool analyses. For this piece, I was inspired by this post on the demise of "How I Met Your Mother," whose author, inkhorn82, was in turn inspired by this post on "The Simpsons" by DiffusePrioR.

For the first part of this article on Breaking Bad, I more or less copied inkhorn82's work, so I will not bore you with the syntax; just head over to his site! First, I retrieved the ratings for each episode of Breaking Bad from IMDb and loaded them into R. Then, using the packages changepoint, ggplot2, and zoo, I created a plot to show whether there was a distinct moment during Breaking Bad's run when the ratings shifted either up or down.
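For readers who want a feel for what a changepoint search does, here is a minimal base R sketch. It is not the changepoint package and not inkhorn82's code: it simply tries every possible split and keeps the one where the two segment means fit best. The ratings are made-up numbers for illustration, and `find_mean_shift` is a hypothetical helper name.

```r
# Illustrative ratings only (made-up numbers, NOT the IMDb data):
# 40 episodes hovering around 8.6, then 22 episodes around 9.3.
set.seed(1)
ratings <- c(rnorm(40, mean = 8.6, sd = 0.2),
             rnorm(22, mean = 9.3, sd = 0.2))

# Hypothetical helper: find the single split that minimises the total
# squared deviation of each segment from its own mean.
find_mean_shift <- function(x) {
  n <- length(x)
  splits <- 2:(n - 2)              # keep at least two points per segment
  cost <- sapply(splits, function(k) {
    left  <- x[1:k]
    right <- x[(k + 1):n]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  })
  splits[which.min(cost)]          # index of the last episode before the shift
}

k <- find_mean_shift(ratings)
k                                  # estimated position of the step
```

On real data, `cpt.mean()` from the changepoint package would be the more rigorous choice, since it also decides whether a step is there at all.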

So, people seem to be right! The ratings do go up in a distinct step near the end of the show's run. To be more specific, the jump happens during Season 4, at episode 8 ("Hermanos"). This episode centers on Gus, his fingerprint, and Hank's gut feelings. It was one of the first episodes to receive a 9.0 rating. However, just like inkhorn82, I wanted to know a bit more. It seems as though the average ratings go up even more towards the final episodes. Therefore, I created a plot showing the proportion of episodes (in 10-episode intervals) with a rating above 9.
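The proportion per 10-episode interval can be computed in a couple of lines of base R. This is a sketch under my own assumptions, with made-up ratings rather than the IMDb data: each episode is assigned to a block of ten, and the share of ratings above 9 is taken within each block.

```r
# Illustrative ratings only (made-up numbers, NOT the IMDb data).
ratings <- c(8.9, 8.7, 8.8, 9.1, 8.6, 8.9, 9.2, 8.8, 8.7, 8.9,   # episodes 1-10
             8.5, 8.6, 8.8, 8.4, 8.7, 8.9, 8.6, 8.8, 8.5, 8.7,   # episodes 11-20
             9.3, 9.1, 8.9, 9.6)                                  # episodes 21-24

# Block 1 = episodes 1-10, block 2 = episodes 11-20, and so on.
block <- ceiling(seq_along(ratings) / 10)

# The mean of a logical vector is the proportion of TRUE values.
prop_above_9 <- tapply(ratings > 9, block, mean)
prop_above_9
```

Note that the last block can contain fewer than ten episodes, so its proportion rests on a smaller sample than the others.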

As you can see, the proportion of ratings above 9 skyrocketed to over 0.80 (80%) for the last 4 episodes. Apparently, people really loved the ending. Also, we see a little dip from episode 11 to 20 (the beginning of Season 2). This is not too surprising, as viewers often have high expectations of a show's second season and are quickly disappointed when the show doesn't deliver what it has promised. We must remember that this dip only indicates that a small proportion of ratings was above 9. If we were to look at the proportion of ratings above 8, all episodes would qualify!

So far, so good. This is where inkhorn82 stopped his research. I still wanted to know a little more. Perhaps I could predict the rating of an episode from some variables that are easily available on IMDb? Instead of starting an extensive qualitative text analysis of all the reviews, I decided to go with two simpler options: episode number and number of votes. To look at the effects of both, I ran a simple multiple regression (one outcome, several explanatory variables). The outcomes are shown below:

lm(formula = brb$grade ~ brb$epnum + brb$vote)

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 8.410e+00  8.403e-02 100.076  < 2e-16 ***
brb$epnum   1.001e-02  2.604e-03   3.845 0.000298 ***
brb$vote    3.278e-05  7.860e-06   4.170 0.000101 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.3267 on 59 degrees of freedom
Multiple R-squared:  0.4999, Adjusted R-squared:  0.483
F-statistic: 29.49 on 2 and 59 DF,  p-value: 1.323e-09
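A model of this form can be fitted roughly as follows. This is a sketch on made-up data, not the IMDb numbers; only the column names (grade, epnum, vote) are taken from the output above. It passes the data frame through the `data` argument instead of writing `brb$` inside the formula, which keeps the coefficient names short and lets `predict()` work on new data.

```r
# Illustrative data only (made-up numbers, NOT the IMDb data).
# Column names follow the regression output: grade, epnum, vote.
set.seed(42)
brb <- data.frame(epnum = 1:62)
brb$vote  <- round(3000 + 100 * brb$epnum + rnorm(62, sd = 500))
brb$grade <- 8.4 + 0.01 * brb$epnum + 3e-05 * brb$vote + rnorm(62, sd = 0.3)

# Multiple regression: one outcome, two explanatory variables.
fit <- lm(grade ~ epnum + vote, data = brb)

coef(fit)                     # intercept and the two slopes
summary(fit)$adj.r.squared    # adjusted R-squared
df.residual(fit)              # 62 episodes minus 3 coefficients = 59
```

Calling `summary(fit)` prints a full table in the same layout as the one shown above.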

These results show that both episode number (i.e. how long the show had been running) and the number of votes were positively associated with the average rating of an episode. More specifically, each extra episode added 0.01 point to the average rating, while 1,000 extra votes were good for 0.03 extra rating points. Not a large effect, but an effect nonetheless! Especially if we factor in that all ratings were between 8 and 10, which leaves little wiggle room. Together, these two variables explain about 48% of the variance in episode ratings (adjusted R-squared of 0.483), which is not bad for two variables that have nothing to do with the content of the episodes. To give you a bit of an idea of what this looks like, I made the following plot:
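The effect sizes and the adjusted R-squared can be checked by hand from the regression output. The short sketch below uses only the reported numbers; the sample size of 62 is inferred from the 59 residual degrees of freedom plus the 3 estimated coefficients.

```r
# Slopes as reported in the regression output.
b_epnum <- 1.001e-02
b_vote  <- 3.278e-05

per_10_episodes <- 10 * b_epnum      # about 0.10 rating points
per_1000_votes  <- 1000 * b_vote     # about 0.03 rating points

# Adjusted R-squared from the multiple R-squared.
r2 <- 0.4999
n  <- 59 + 3     # residual degrees of freedom + estimated coefficients
p  <- 2          # number of explanatory variables
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

round(adj_r2, 3)                     # 0.483
```

The adjustment is small here because there are only two explanatory variables for 62 episodes.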

In this plot, you can see the observed ratings (the red line) versus the model-predicted ratings (the green line). As you can see, the model cannot really account for unexpectedly low ratings, but it seems quite accurate for the last (very high) ratings. So even though the regression results looked promising, the model would still make a lot of mistakes when predicting individual ratings. Finally, I made a plot showing the relationship between the episode number and the number of votes:

This plot shows that most episodes had a similar number of votes. Near the end of the show, however, episodes started getting many more votes. Episode 5.14 ("Ozymandias") got a record high of 41,413 votes and was also the highest rated episode (10.0). The last episode (5.16, "Felina") received 28,753 votes and was the second highest rated episode (9.9).
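To see what the regression says about these two episodes, the reported coefficients can be applied to their vote counts. This is a rough sketch: `predict_rating` is a hypothetical helper, and I am assuming that the episode number in the model is a running count across seasons (7 + 13 + 13 + 13 = 46 episodes before Season 5), which makes 5.14 episode 60 and 5.16 episode 62.

```r
# Coefficients as reported in the regression output.
b0      <- 8.410
b_epnum <- 1.001e-02
b_vote  <- 3.278e-05

# Hypothetical helper: rating predicted by the fitted linear model.
predict_rating <- function(epnum, votes) {
  b0 + b_epnum * epnum + b_vote * votes
}

# Assumption: epnum is a running episode count, so 5.14 = 60 and 5.16 = 62.
ozymandias <- predict_rating(60, 41413)
felina     <- predict_rating(62, 28753)

round(c(Ozymandias = ozymandias, Felina = felina), 2)
```

Under these assumptions the model predicts about 10.37 for "Ozymandias" and about 9.97 for "Felina", against observed ratings of 10.0 and 9.9. A linear model knows nothing about the 10-point ceiling, so very high vote counts can push its predictions past the top of the scale.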

I believe it is not so much that Breaking Bad got higher ratings because the show was on the air longer. It is more that Breaking Bad got higher ratings because the show became more popular. More viewers started expressing their (positive) opinion, perhaps pushing other viewers to express an even more positive opinion to still add to the conversation. More research on this important topic is definitely needed! ;)

What do you think? Did you appreciate Breaking Bad more towards the end? Or do you have an explanation for these findings? Do you expect similar patterns for other shows?

Copyright Winter Statistics 2014