Data Feature: Game of Thrones Edition

A lot of us are watching Game of Thrones these days. Its fourth season is in full swing and it is, after all, quite a fitting show for political scientists: it has intrigues, power struggles and violent conflict. In fact, quite a few authors have explored the world of the show and the books through the lens of political science. So, while I was idly procrastinating on the web, searching for poli-sci (or other) reactions to the latest episode, I came across this fascinating forum entry. (Naturally, spoilers follow.)

It turns out that GoT fan forum user “Sellsword” has in fact collected data on each character’s screen time on the show. He or she sat in front of the TV with a stopwatch and measured the time each character appeared on screen in seasons one to three. The author of this list adhered to relatively strict coding rules, which is to some extent similar to how political science data is collected:

I’ve given the characters time when they appear on screen as opposed to in the scene, for example the Hound could be in a 6 min long scene but only be visible on screen for 55 seconds thus he only gets 55 seconds.

Game of Thrones being the gory work of fiction it is, this led to some interesting coding specifications:

I also gave characters screentime for “deadtime” when the whole body is still intact. Khal Drogo’s corpse got screentime, Ned’s head dit [sic] not.

Note that these coding rules aren’t the only ones that one could apply. Others have used different rules to assess GoT screen time (yes, there are other people who have attempted this. Apparently it’s a thing. Yes, I’m baffled, too. But then, I’m writing a blog post about how to quantify this data in a statistical software package, so I’m every bit as nerdy as they are).

Thankfully, “Sellsword” gives us his or her results in a neat list. The list is even formatted consistently and looks something like this:

  1. Tyrion Lannister – 166:15 (1,2,3; 28)

  2. Daenerys Targaryen – 127:35 (1,2,3; 25)

  3. Jon Snow - 126:41 (1,2,3; 24)

  4. Arya Stark - 100:11 (1,2,3; 27)

Since I am a GoT fan and use R in my day job (jeez, if you looked up “nerd” in the dictionary, this is what you’d find), I was curious whether this list could be easily translated into an R data frame.

I copied and pasted the list from the forum into Notepad++ and had to tweak the text formatting a bit (replacing “-” with “=” for text encoding reasons) to produce this .txt. I read the file into R with the readLines command and split each line into separate variables. I then generated a count variable of each character’s screen time in seconds. Plus, I coded each character’s sex manually, judging from their name and my knowledge of the character from the books. I put the R code to generate the data on this Gist, but you can simply download the .RData file here.
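To give a rough idea of what this parsing step can look like, here is a minimal sketch. It is not the code from the Gist: the regular expression and the object names `lines`, `pattern` and `got` are assumptions made for illustration, and the input is just the four sample lines quoted above, with the dash already replaced by “=”. With the real file, `lines` would come from `readLines()` instead.

```r
# Four sample lines in the list's format (dash already replaced by "=").
# With the real file this would be something like: lines <- readLines("<your file>.txt")
lines <- c(
  "1. Tyrion Lannister = 166:15 (1,2,3; 28)",
  "2. Daenerys Targaryen = 127:35 (1,2,3; 25)",
  "3. Jon Snow = 126:41 (1,2,3; 24)",
  "4. Arya Stark = 100:11 (1,2,3; 27)"
)

# Illustrative pattern: rank, name, "=", minutes:seconds, (seasons; episode count)
pattern <- "^\\s*\\d+\\.\\s*(.+?)\\s*=\\s*(\\d+):(\\d+)\\s*\\(([^;]*);\\s*(\\d+)\\)\\s*$"

# regexec() returns the full match plus one entry per capture group
m <- regmatches(lines, regexec(pattern, lines, perl = TRUE))

# Stop early if any line does not fit the expected format
stopifnot(all(lengths(m) == 6))

# One row per line; column 1 is the full match, columns 2 to 6 are the groups
parts <- do.call(rbind, m)

got <- data.frame(
  name          = parts[, 2],
  minutes       = as.integer(parts[, 3]),
  seconds       = as.integer(parts[, 4]),
  seasons       = parts[, 5],
  episode_count = as.integer(parts[, 6]),
  stringsAsFactors = FALSE
)

# Total screen time in seconds
got$screentime_seconds <- got$minutes * 60L + got$seconds

print(got)
```

The explicit `stopifnot()` is there because a hand-copied forum list tends to contain the odd line that deviates from the format, and it is better to find out immediately than to lose rows silently.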

The data set includes the following variables:

name: Name of the character
minutes: Minutes of screen time
seconds: Seconds of screen time (in addition to minutes)
seasons: Character string of the seasons in which the character appeared
episode_count: Number of episodes in which the character appeared
screentime_seconds: Count of screen time seconds (minutes * 60 + seconds)
family: Family name, extracted from the “name” variable (if a character has only one name and no family name, e.g. “Varys”, that name is used)
sl_dummy: Categorical variable indicating whether the character is a Lannister, a Stark or from another family
sex: The character’s sex
season_count: Number of seasons in which the character appears
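The derived variables in this list can be sketched in a few lines. Again, this is an illustration under assumptions rather than the code from the Gist: the rule “family is the last word of the name”, the level labels of `sl_dummy` and the example `seasons` strings are all made up for the purpose of the example.

```r
# A few character names mentioned in this post
name <- c("Tyrion Lannister", "Arya Stark", "Jon Snow", "Varys")

# family: drop everything up to the last whitespace;
# a single name such as "Varys" has no whitespace and is kept as-is
family <- sub("^.*\\s", "", name)

# sl_dummy: Lannister, Stark, or everyone else (level labels are assumed)
sl_dummy <- factor(
  ifelse(family %in% c("Lannister", "Stark"), family, "Other"),
  levels = c("Lannister", "Stark", "Other")
)

# season_count: number of comma-separated entries in the seasons string
# (illustrative strings, not taken from the data)
seasons <- c("1,2,3", "1", "2,3")
season_count <- lengths(strsplit(seasons, ",", fixed = TRUE))

print(data.frame(name, family, sl_dummy))
print(season_count)
```

Note that the last-word rule treats “Snow” as a family name, which is one of those places where knowing the story matters more than the string handling.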

The first couple of rows look something like this:


I invite you to play around with the data. It’s a pretty simple dataset, but it allows for some interesting graphs and basic analysis. I’ve put together some initial thoughts on this, but mostly this post is supposed to provide the data and not to go too deep into analysis. Let me know in the comments if you’re doing anything fun with it! (An obvious extension would be to read in “Sellsword’s” breakdown of screen time for each of seasons one to three, which is also in the forum entry. This could easily be done by adapting the code provided above.)

First, let’s have a look at the face validity of the data. We would expect a character’s screen time to increase with the number of episodes in which he or she appears. And that’s exactly what we find.


There’s clearly a conspicuous outlier at nine episodes. For anybody who’s watched the show, this isn’t a big surprise. This is Eddard Stark, who didn’t survive the ninth episode of the first season, but clocked a lot of screen time in the short time he was around.
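A check of this kind can be sketched as follows. The data frame below holds only the four rows quoted from the list earlier, so it is far too small to reproduce the pattern in the full data; the object names and the choice of a Spearman rank correlation are assumptions for illustration. The last step, average seconds on screen per episode, is one simple way to make characters like Ned Stark stand out: a lot of screen time packed into few episodes.

```r
# The four rows quoted from the list, screen time converted to seconds
got <- data.frame(
  name = c("Tyrion Lannister", "Daenerys Targaryen", "Jon Snow", "Arya Stark"),
  episode_count = c(28L, 25L, 24L, 27L),
  screentime_seconds = c(166L * 60L + 15L, 127L * 60L + 35L,
                         126L * 60L + 41L, 100L * 60L + 11L),
  stringsAsFactors = FALSE
)

# Scatter plot: episodes on the x-axis, screen time in minutes on the y-axis
plot(got$episode_count, got$screentime_seconds / 60,
     xlab = "Episodes appeared in", ylab = "Screen time (minutes)")

# Rank correlation between number of episodes and screen time
rho <- cor(got$episode_count, got$screentime_seconds, method = "spearman")
print(rho)

# Average seconds on screen per episode, largest first
got$seconds_per_episode <- got$screentime_seconds / got$episode_count
print(got[order(-got$seconds_per_episode), c("name", "seconds_per_episode")])
```

With the full data set loaded from the .RData file, the same three steps apply unchanged to the complete data frame.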

Ned Stark’s bulge of screen time does not necessarily mean that this family consistently produces characters that get famous on TV (measured in screen time). No, this honor goes to the Lannister family (a finding which is mostly driven by Tyrion Lannister being the character with the most screen time). Also, it doesn’t help that the Starks are consistently being killed throughout the show.


Note that we have an outlier in the Stark camp who hasn’t been on screen much. This is Benjen Stark, a brother of Ned’s. He’s around for only a couple of episodes (three, to be precise).

Another question we can answer with the data at hand is whether a character’s sex makes a difference to the amount of screen time he or she gets. Well, apparently it does. Female characters have a higher average and median screen time than male characters, although there are obviously exceptions to the rule, with Tyrion Lannister being the one character with the most screen time.
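Group comparisons like this one (and the family comparison above) boil down to a mean and a median per group. Here is a minimal sketch, assuming the sex variable is coded as “male” and “female”; the actual coding in the data set may differ. It uses only the four characters quoted from the top of the list, so the numbers it prints say nothing about the full data and should not be read as a result.

```r
# Four characters from the top of the list; the sex labels are an assumed coding
got <- data.frame(
  name = c("Tyrion Lannister", "Daenerys Targaryen", "Jon Snow", "Arya Stark"),
  sex = factor(c("male", "female", "male", "female")),
  screentime_seconds = c(9975L, 7655L, 7601L, 6011L)
)

# Mean and median screen time per group, plus the group sizes
means   <- tapply(got$screentime_seconds, got$sex, mean)
medians <- tapply(got$screentime_seconds, got$sex, median)
counts  <- table(got$sex)

print(rbind(mean = means, median = medians, n = counts))

# Box plot of screen time in minutes by group
got$screentime_minutes <- got$screentime_seconds / 60
boxplot(screentime_minutes ~ sex, data = got,
        ylab = "Screen time (minutes)")
```

Swapping `got$sex` for the `sl_dummy` variable gives the corresponding comparison between Lannisters, Starks and everyone else.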


Now, the amount of screen time a character gets does not say anything about the quality of this time. The show has received its fair share of criticism regarding its negative treatment of female characters, a criticism which has itself provoked replies, because the show arguably features many strong female characters. The data here won’t settle this debate, since it says very little about the actual content of a character’s screen time.

Nevertheless, since fiction is always a representation of ourselves, it is fascinating to see how much attention different characters get. But the data also shows that context (i.e. knowing the show or the books) is necessary to understand and interpret what it tells us. In that, it is no different from any other data set that describes real-world phenomena.