Last week TfL released some data about crowding on Tube trains. There’s a good write-up by Diamond Geezer.
Basically, TfL give each 15-minute slot on each line in each direction a rating from 1 (quiet) to 6 (stuffed). I thought I’d have a go at turning that data into some heatmaps, so you can pick out the busiest times more easily.
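For anyone who fancies trying the same thing, here is a rough sketch of the general idea with Pandas and Seaborn. It is not the code behind the plots below. The column names (`link`, `slot`, `rating`), the helper names and the sample ratings are all made up for illustration; the real TfL file may be laid out quite differently.

```python
import pandas as pd


def crowding_grid(records: pd.DataFrame) -> pd.DataFrame:
    """Pivot long-format ratings into a station-pair by time-slot grid.

    Assumes hypothetical columns: 'link' (station pair), 'slot' (15-minute
    slot as a string) and 'rating' (1 = quiet, 6 = stuffed).
    """
    return records.pivot(index="link", columns="slot", values="rating")


def plot_crowding(grid: pd.DataFrame, title: str):
    """Draw the grid as a heatmap on a fixed 1-to-6 colour scale."""
    # Imported here so the reshaping above works without a plotting backend.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(12, 4))
    # Fixing vmin/vmax keeps colours comparable between lines.
    sns.heatmap(grid, vmin=1, vmax=6, cmap="YlOrRd", ax=ax)
    ax.set_title(title)
    return fig


# Invented ratings, purely to show the shape of the data.
records = pd.DataFrame({
    "link": ["Waterloo → Bank"] * 3 + ["Bank → Waterloo"] * 3,
    "slot": ["08:00", "08:15", "08:30"] * 2,
    "rating": [5, 6, 6, 2, 1, 1],
})
grid = crowding_grid(records)
```

Calling `plot_crowding(grid, "Waterloo & City")` and then `fig.savefig("wc.svg")` on the result would write out an SVG. Note that `pivot` sorts the rows alphabetically, so a real line would need its station pairs put back into route order.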
Now, this was my first time trying to make heatmaps, my first time using Seaborn and my first time using Pandas. The plots are not great. I basically got to the end of my tether trying to sort the code out and ended up spending a chunk of time polishing the resulting images. Even then…
Before we go any further: the labels appear quite small, so every plot links to its SVG file if you want a closer look. Also, I’m only going to describe the oddities in the plots, not draw conclusions from them; others can do that if they’re interested.
Let’s start with the Waterloo & City line because it’s simplest:
Hopefully you can pick out the labels on the left showing the source and destination stations, the times (5am to 2am) and the change in colour showing how crowded the trains are. It’s quite easy to pick out the peaks here: Waterloo to Bank around 8 and 9am, Bank to Waterloo around 5 and 6pm.
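One wrinkle with a time axis that runs from 5am to 2am is that it crosses midnight, so a plain sort puts the after-midnight slots first. A small sketch of one way round that, assuming the slots are "HH:MM" strings (an assumption, as is the function name):

```python
def service_day_minutes(slot, day_start_hour=5):
    """Minutes since the start of the service day for an 'HH:MM' slot."""
    hours, minutes = map(int, slot.split(":"))
    # Anything before the start hour belongs to the tail end of the
    # previous service day, so push it past midnight.
    if hours < day_start_hour:
        hours += 24
    return (hours - day_start_hour) * 60 + minutes


slots = ["00:15", "05:00", "23:45", "08:30", "01:45"]
ordered = sorted(slots, key=service_day_minutes)
print(ordered)  # ['05:00', '08:30', '23:45', '00:15', '01:45']
```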
Now, in alphabetical order, let’s go through the rest. Bakerloo:
Note here something that applies to the rest: not all pairs of stations are listed as I was having trouble making them legible. I tend to leave out every other pair so, above, it’s implied that after Harrow & Wealdstone → Kenton and before South Kenton → North Wembley there is Kenton → South Kenton. But, as we’ll see, this isn’t always the case.
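Leaving out every other label can be sketched like this. The helper name is hypothetical; the idea is to blank the skipped labels rather than drop them, so the rows still line up with the data.

```python
def thin_labels(labels, step=2):
    """Keep every `step`-th label and blank the rest, so rows stay aligned."""
    return [label if i % step == 0 else "" for i, label in enumerate(labels)]


pairs = [
    "Harrow & Wealdstone → Kenton",
    "Kenton → South Kenton",
    "South Kenton → North Wembley",
]
print(thin_labels(pairs))
# ['Harrow & Wealdstone → Kenton', '', 'South Kenton → North Wembley']
```

The resulting list could be passed to Seaborn's `heatmap` as `yticklabels`. That parameter also accepts an integer, in which case only every nth label is drawn.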
Looking at the early morning services in Kenton and Wembley it is pretty obvious, as Diamond Geezer pointed out, that someone has been mucking about with the data.
This was one of the first plots I “tidied”, which is why the scale is missing. Note here the apparent disappearance of lots of travellers in the top-left and bottom-right of each plot. This is because the line branches. I’ve tried to keep the first station of each branch in each direction, but I didn’t always manage it. You may want to read these alongside the Tube map.
Note this is only the “Outer Rail” and “Inner Rail” (clockwise and anti-clockwise) sections of the Circle line, starting at Edgware Road. I left off the “jug handle” between Paddington and Hammersmith, which was pretty quiet anyway. In fact the whole line is fairly quiet, something DG pointed out, so the data is a bit suspect.
All those cut-off bits are because of the five branches in west London. It’s a weird line for historical reasons. Again I’m sceptical about the data: the District is the busiest sub-surface line, but the plots suggest that title belongs to the Hammersmith & City:
No branches. Hurrah. Next.
I love how Canary Wharf slices across these charts.
Let’s sod off to Metro-land:
The Metropolitan line appears to be weirdly spacious in the evenings given how rammed it is in the morning. I’ve no idea if either is true to life.
More branching fun with the Northern line:
I know, it’s weird. I should have done separate charts or something, but TfL need to get on with it and split the line up.
Strange pattern there, but it’s another strange line. Thankfully the last one is straightforward, the Victoria:
Doesn’t look too bad, that one.
Sorry the charts were so poor. Do let me know what I should’ve done differently, or just whinge. I’m just glad this is over.