Python’s zip function, which knits together two iterables, is indispensable for me. It works like this:

list_one = [1, 2, 3]
list_two = ["a", "b", "c"]
for pair in zip(list_one, list_two):
    print(pair)
# (1, 'a')
# (2, 'b')
# (3, 'c')

If the two iterables differ in length, zip halts after the shortest is exhausted. If we add an additional element to one of the lists above, we get the same results:

list_one = [1, 2, 3]
list_two = ["a", "b", "c", "d"]  # Note extra item, "d"
for pair in zip(list_one, list_two):
    print(pair)
# (1, 'a')
# (2, 'b')
# (3, 'c')

But the actual mechanics of this surprised me. Today I was working on the “chunked” problem from Python Morsels (which is great and you should totally try out if you write Python), and was left scratching my head after elements of my iterable started disappearing.

The basic problem for chunked is this: given some iterable, return its elements in count-length lists. Trey likes you to think in terms of “any iterable” so you can’t depend on list-like behaviour, such as being able to index into the iterable or check its length without consuming it.

It’s safer to assume you get one traversal. So, my solution starts like this, creating an iterator from the iterable.

def chunked(iterable, count):
    iterator = iter(iterable)

Then (eliding the scaffolding) I build up a new count-length chunk using zip in a comprehension:

temp = [item for item, _ in zip(iterator, range(count))]

Here I use the “earliest finish” behaviour of zip paired with range: the number of items in the range (count of them) determines how many items I fetch from the iterator.

Let’s give this a try, using your imagination to flesh out the rest of chunked:

for chunk in chunked(iterable=range(10), count=4):
    print(chunk)
# [0, 1, 2, 3]
# [5, 6, 7, 8]

Er, hm. Not what I was expecting, which was:

# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]

Somehow, the program is consuming an extra item from iterator each time I create a chunk. But that list comprehension is the only place where I touch iterator. What gives?

Well, how does zip know when to terminate? If you take a look in the documentation, you’ll see a handy code sample that is “equivalent to” the implementation of zip. There we see that zip builds up a list of results by taking an item from each of the given iterables, but if any of those iterables are finished, it just returns — and discards the result list!

So what happens with zip(longer, shorter) is that it takes from longer, stashes the item, discovers shorter is exhausted, and discards the item from longer. And that’s what happens to the missing numbers in the example above.
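To see the discard in action, here’s a minimal sketch modelled on the “equivalent to” sample in the documentation. The name simple_zip is my own, and it’s a simplification rather than the real implementation, but it takes and drops items in the same order.

```python
def simple_zip(*iterables):
    """Simplified stand-in for the built-in zip (hypothetical name)."""
    sentinel = object()
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            elem = next(it, sentinel)
            if elem is sentinel:
                # One iterable is finished: stop, throwing away anything
                # already collected in result during this round.
                return
            result.append(elem)
        yield tuple(result)


longer = iter(range(10))
pairs = list(simple_zip(longer, range(4)))
print(pairs)         # [(0, 0), (1, 1), (2, 2), (3, 3)]
print(next(longer))  # 5, because 4 was taken and then discarded
```

Swapping simple_zip for the built-in zip gives the same two lines of output.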

This situation arises because I’m zipping the same iterable repeatedly, until it’s empty, and because the iterator is the first argument to zip. This small change works fine:

# Old, broken
temp = [item for item, _ in zip(iterator, range(count))]
# New, fixed
temp = [item for _, item in zip(range(count), iterator)]

In the new version, zip discovers that the iterator over the range is exhausted first, before it takes an item from iterator, so no items are ever discarded.
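Here’s a small sketch that runs both argument orders against a shared iterator, three chunks each. The helper names are made up for the comparison and aren’t part of my solution.

```python
def take_first_arg(iterator, count):
    # Old, broken: iterator is asked for an item before range is checked.
    return [item for item, _ in zip(iterator, range(count))]


def take_second_arg(iterator, count):
    # New, fixed: range runs out first, so iterator isn't touched again.
    return [item for _, item in zip(range(count), iterator)]


broken = iter(range(10))
print([take_first_arg(broken, 4) for _ in range(3)])
# [[0, 1, 2, 3], [5, 6, 7, 8], []]

fixed = iter(range(10))
print([take_second_arg(fixed, 4) for _ in range(3)])
# [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```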

So, is this OK? Really, really not! This is super-fragile. It’s not obvious that switching the arguments will break the code. And really it just looks wrong, because surely the ignored tuple element (assigned to the underscore) should come after the item that we care about?

Thankfully, the itertools module has what we need (as always!). The reason I originally used the list comprehension-zip-range combo is that you can’t slice every iterable. For example:

(x**2 for x in range(10))[:4]
# ---------------------------------------------------------------------------
# TypeError                                 Traceback (most recent call last)
# <ipython-input-2-17f2a627cc7c> in <module>
# ----> 1 (x**2 for x in range(10))[:4]
# TypeError: 'generator' object is not subscriptable

But you can with islice:

from itertools import islice

list(islice((x**2 for x in range(10)), 4))
# [0, 1, 4, 9]

And this works great with iterators where you care about the current state:

to_10_sq = (x**2 for x in range(10))
list(islice(to_10_sq, 4))
# [0, 1, 4, 9]
list(islice(to_10_sq, 4))
# [16, 25, 36, 49]
list(islice(to_10_sq, 4))
# [64, 81]

Which leads us to the most straightforward way of building up those chunks.

chunk = list(islice(iterator, count))

(The chunks have to be “concrete” sequences as the problem requires some length-checking for one of the bonus parts, hence the list call.)
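Putting the pieces together, here’s a minimal sketch of how the whole function might look. The scaffolding around the islice line (a generator that stops at the first empty chunk) is an assumption for illustration, not necessarily the exact solution to the exercise.

```python
from itertools import islice


def chunked(iterable, count):
    """Yield count-length lists from iterable; the last may be shorter."""
    iterator = iter(iterable)
    while True:
        chunk = list(islice(iterator, count))
        if not chunk:
            # islice returned nothing, so the iterator is exhausted.
            return
        yield chunk


for chunk in chunked(iterable=range(10), count=4):
    print(chunk)
# [0, 1, 2, 3]
# [4, 5, 6, 7]
# [8, 9]
```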

Thanks for reading. If I have some key messages, they’re these:

  • Python is lovely, but it’s not magic!
  • itertools might have solved your iteration problem already.
  • Check out Python Morsels. The problems are short, fun, and a nice way to improve your Python skills.

It looks like my 2011 iMac might be on the way out. I’ve been having odd graphical problems today and yesterday, and I think that it might be the graphics card overheating. Running Apple Diagnostics (Hardware Test as was) reports an error with the hard drive (the fan specifically) which I’ve seen before. My working theory at this point is that a failed or obstructed fan coupled with dust build-up and the fairly hot room has led to this point.

Photo of an iMac where part of the right-hand section of the screen image is displayed physically on the left

I’m going to get some compressed air and see if that helps matters at all, and then see if I can get it serviced. Unfortunately Apple now lists the 2011-model iMac as obsolete (or “vintage”) so we’ll see how that goes.

Funnily enough, this is not the original graphics card but a replacement installed by Apple when something similar (but not quite the same) happened several years ago, sorted out just before their replacement period ended (it was an acknowledged, somewhat widespread problem with the cards).

Honestly it’s happened at a bit of a naff time. The machine is otherwise fine, and it still feels incredibly fast (which I put down to the SSD). I was hoping it would last as long as many of the machines we have at work, almost all of which are from 2008 (despite what other press reports claim), though they do feel sluggish — particularly the couple that I (foolishly) upgraded past OS X 10.6. I was certainly not planning to replace it yet.

It would be an expense that I could do without too — having just bought a real chair and weird keyboard in anticipation of having to do much more work at home from October when I start an evening Masters in Computer Science at Birkbeck.

Which brings us neatly to the real annoyance about this: I was planning to buy a laptop to use on the course. I’ve bought a couple of bottom-end MacBook Airs for reporters at work, and they seem like decent enough machines. This idea was on the assumption that it would not be my main computer, so it could be less capable as I would be using it for focused tasks and leaving everything else — including much of the academic work of the course — to be done on my giant iMac at home.

But if the iMac isn’t in the picture anymore, what should I do? As I see it, I have two feasible options:

  • Buy a more capable laptop as my main computer.
  • Buy a basic laptop and a new iMac.

I spent a lot of money on the 27" iMac in October 2011, buying more than I needed really (partly because I was still playing computer games then). I wouldn’t replace it with something as high-end now as I know that I just don’t need the power, and I want to minimise the hit to my savings.

Part of me thinks that I should buy a MacBook Air now, according to plan but sooner than planned. Then I have something to tide me over other than a six-year-old iPad (running iOS 9!) and a five-year-old iPhone 5S, and then I can try to get the iMac fixed or replace it at a later date. (I’m also a bit anxious to buy the Air sooner rather than later, even though it’s still basically an old design, as I don’t really want to risk having to buy a laptop with a dodgy keyboard and only a couple of those odd USB-C sockets.)

Then the other option is to shelve completely the idea of buying another desktop and just buy a more powerful laptop. This appeals to me less, because I do want a bigger screen and I do want more storage.

I’ve been throwing the storage matter around in my head today and I can’t decide on a position. Backblaze tells me I have 530GB backed up, most of which is my iTunes library, so I don’t know if it’s a bit of a distraction — if I only had a laptop I’d have to keep it on an external drive at home, and is that so different from keeping it only in an iMac on my desk?

It’s difficult to know what to do. My gut feeling is to rush out and buy a basic Air immediately so as to not interrupt my life too much (particularly, I need to change jobs before the course starts as the hours don’t fit; this is not a secret to my boss or colleagues).

But my head says that I should see if I can crack on using my iPad for now, enquire about getting the iMac serviced as soon as possible, and then make a more considered decision at a later time.

It’s not a comfortable position for me, since this machine has been a fixture in my life for over six-and-a-half years. When the graphics card went last time, it was calming to know that it was a problem that Apple had committed to take care of (even if I almost missed the cutoff and had to drag the heavy thing all the way to Chafford Hundred on the train). No such luck this time.

(On a, er, positive note, I guess this will finally force me to work out a way of blogging from my iPad that isn’t as painful as I’m sure it will be to get this post up. [Later: It took some faffing about and too much manual work, but having (an old version of) Coda and shell access to my Linode server did the trick fine.])

My post detailing a Keyboard Maestro macro to open Jupyter notebooks had a dumb bug in the second shell pipeline, which fetches the URL of the desired notebook.

You’d hit it if:

  • You have more than one notebook server running.
  • The working directory of one is beneath another.
  • The subdirectory server was started more recently.
  • You tried to open the parent server with the macro.

The shorter path of the parent would match part of the child’s path.

The original grep pattern was:

grep "$KMVAR_dir"

And is now:

grep ":: $KMVAR_dir$"

So that it only matches the exact directory chosen in the list prompt, and not one of its children.
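To illustrate the difference, here’s a rough sketch that mimics the two patterns with plain string checks in Python rather than calling grep. The function names are made up, and the sample lines reuse the shape of the jupyter-notebook list output with the subdirectory server listed first.

```python
# Sample lines in the shape printed by `jupyter-notebook list`.
listing = [
    "http://localhost:8888/?token=…hex… :: /Users/robjwells/jupyter-notebooks",
    "http://localhost:8889/?token=…hex… :: /Users/robjwells",
]


def old_match(lines, directory):
    # Like the original pattern: the directory may appear anywhere in the line.
    return [line for line in lines if directory in line]


def new_match(lines, directory):
    # Like the fixed pattern: separator before, end of line straight after.
    return [line for line in lines if line.endswith(":: " + directory)]


parent = "/Users/robjwells"
print(len(old_match(listing, parent)))  # 2, the child line matches as well
print(new_match(listing, parent))       # only the line for the parent directory
```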

I’ve updated the Keyboard Maestro macro file too.

When I use images here, I tend to give ones without any transparency a border, which is done using CSS and applied to img tags unless they have a no-border class.

Like a good web citizen, I also specify image dimensions in HTML:

“The image’s rendered size is given in the width and height attributes, which allows the user agent to allocate space for the image before it is downloaded.”

In fact my BBEdit image snippet makes it a doddle:

<p <#* class="full-width"#>>
        alt="<#alt text#>"
        <#* class="no-border"#>

But this causes a problem, which I’ve spotted in a couple of my recent posts.

If you specify the image dimensions, and use a CSS border, and have your CSS box-sizing set to border-box, then the CSS border shrinks the amount of space available to the image to its specified dimensions − 2 × the border width.

So if you specify your img dimensions to match the dimensions of the file, then the image itself will be shrunk within the element.
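As a quick worked example of that arithmetic, here’s a tiny sketch. The 600 × 400 image and 1px border are hypothetical numbers, and it assumes there’s no padding on the img element.

```python
def rendered_image_size(width, height, border_width):
    """Space left for the image under box-sizing: border-box, assuming no padding."""
    return (width - 2 * border_width, height - 2 * border_width)


# A 600 x 400 file with matching HTML dimensions and a 1px border
# all round is squeezed into a slightly smaller area.
print(rendered_image_size(600, 400, 1))  # (598, 398)
```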

This animation shows this situation, and what happens when you toggle the CSS border. Watch what happens to the image itself.

An animation showing an image being squeezed within the space it has been allocated, causing distortion.

(It’s got a slight offset from the text because it’s a screenshot of this blog and includes some of the background on each side.)

In contrast, this animation shows what happens when the dimensions are not specified, and so the image is free to grow when the border is applied:

An animation showing an image growing when a CSS border is applied, with no distortion to the image itself.

Really the culprit here is box-sizing: border-box, forcing the border to remain within the size of the img element itself. This is a behaviour you actually want, as it solves the old CSS problem of juggling widths, borders and padding within a parent element. Check out MDN’s box-sizing page to see what I mean.

What are my options, then?

  • Change box-sizing.

    I’m not touching this because the potential sizing headaches are not worth it, even just for img elements.

  • Apply a border to the image files themselves.

    No, because if I change my mind about the CSS, previously posted images are stuck with the old style forever. CSS borders should also work correctly across high-density displays, whereas a 1px border in the file may not.

  • Don’t specify dimensions in the HTML.

    I don’t like the idea of making pages of this site slower to render, but I think this is the least bad option, particularly given that this site is already pretty fast.

It’s not ideal, but that BBEdit snippet is now just:

<p <#* class="full-width"#>>
        alt="<#alt text#>"
        <#* class="no-border"#>

Hey, at least it makes images quicker to include in posts!

I have a startup item that launches a Jupyter notebook so that the server is always running in the background. It’s an attempt to reduce the friction of using the notebooks.

By default, Jupyter starts the server on port 8888 on localhost, but expects a token (a long hexadecimal string) before it’ll let you in. If you list the currently running servers in the terminal you can see the token and also the server’s working directory.

% jupyter-notebook list
Currently running servers:
http://localhost:8889/?token=…hex… :: /Users/robjwells
http://localhost:8888/?token=…hex… :: /Users/robjwells/jupyter-notebooks

We can use this to make finding and opening the particular notebook server you want a bit easier, using Keyboard Maestro.

A screenshot showing the (minimised) Keyboard Maestro steps

The macro uses the jupyter-notebook command, so that’ll need to be in your $PATH as Keyboard Maestro sees it.

The first and third steps both execute jupyter-notebook list and use Unix tools to extract parts from it.

In between, if there’s more than one notebook server running, the macro prompts the user to choose one from a list of their working directories.

A Keyboard Maestro list selection dialogue

Here’s the first step, where we fetch the list of working directories.

jupyter-notebook list | tail -n +2 | awk '{print $3}'

Our +2 argument to tail gets the output from the second line, chopping off the “Currently running servers:” bit. Then awk prints the third field, which contains the directory. (The first is the URL, the second the double-colon separator.)
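If it helps to see the same steps outside the shell, here’s a rough Python equivalent run against the sample output shown earlier. The function name is made up for illustration, and like the awk version it splits on whitespace, so a directory containing spaces would be cut short.

```python
SAMPLE_OUTPUT = """\
Currently running servers:
http://localhost:8889/?token=…hex… :: /Users/robjwells
http://localhost:8888/?token=…hex… :: /Users/robjwells/jupyter-notebooks
"""


def working_directories(output):
    # Skip the first line, as `tail -n +2` does.
    lines = output.splitlines()[1:]
    # Take the third whitespace-separated field, as awk's $3 does.
    return [line.split()[2] for line in lines if line.strip()]


print(working_directories(SAMPLE_OUTPUT))
# ['/Users/robjwells', '/Users/robjwells/jupyter-notebooks']
```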

The third step fetches the corresponding URL for a directory:

jupyter-notebook list | grep ":: $KMVAR_dir$" | awk '{ print $1 }'

Since the user has specified a directory already, we use grep with the Keyboard Maestro variable to find just that one line, and use awk again to extract the URL field.


There was a bug in the original version of this snippet of shell script, where a parent path could match a child path (as it was only looking for the path itself without an anchor on either side). It was only luck that had me miss this with my example, with the more recently started home directory notebook server being listed ahead of one in a subdirectory, which grep would have also matched. The code above and the macro file have been fixed.

Obviously, this won’t work if you have more than one notebook server running from the same directory. (But you wouldn’t do that, right?)

Here’s the macro file if you’d like to try it out.