I’ve been participating regularly in Advent of Code for the past couple of years. It’s one of the highlights of my holiday season. The puzzles are fun, the stories are appropriately ridiculous, and it’s a neat way for me to keep the cobwebs brushed off some of the things I learned years ago that I don’t regularly use. Every year there are puzzles that take me a couple of minutes to solve, and puzzles that take me hours: I will forever curse the Intcode puzzles from 2019.

Last night’s puzzle was something new. The problem itself was pretty straight-forward (finding values that are common in multiple collections), but it resulted in a 45-minute debugging session that culminated in finding a bug in Swift’s implementation of Set.intersection(_:).

The nature of last night’s problem was that, when I had a bunch of inputs, I could expect that there was only a single common element between all of them. After getting lucky and solving the problem correctly, I started golfing my code to make it terser. That’s when I started noticing something odd.

My initial version of the code looked something like this:

let firstGroup: String = ...
let secondGroup: String = ...
let thirdGroup: String = ...

let uniqueLettersInFirstGroup = Set(firstGroup)
let uniqueLettersInSecondGroup = Set(secondGroup)
let uniqueLettersInThirdGroup = Set(thirdGroup)

let commonLetters = uniqueLettersInFirstGroup.intersection(uniqueLettersInSecondGroup).intersection(uniqueLettersInThirdGroup)
let commonLetter = commonLetters.first! // safe to unwrap, because if this crashes the input is bad
// ... do processing with the common letter

This worked great, but it’s also a lot of code. So I started combining things:

let firstGroup: String = ...
let secondGroup: String = ...
let thirdGroup: String = ...

let commonLetters = Set(firstGroup).intersection(secondGroup).intersection(thirdGroup)
let commonLetter = commonLetters.first!

Eventually I decided to make an extension, since this sort of algorithm (“find what’s in common between these groups of things”) is a pretty normal thing to encounter in Advent of Code:

extension Collection where Element: Collection, Element.Element: Hashable {

    var commonElements: Set<Element.Element> {
        // intersect all the elements
        // return the final intersection
    }

}

let firstGroup: String = ...
let secondGroup: String = ...
let thirdGroup: String = ...

let commonLetters = [firstGroup, secondGroup, thirdGroup].commonElements
let commonLetter = commonLetters.first!

It was about this point that I started noticing something weird: every time I ran my code, I’d get a different answer.

This is not how Advent of Code works. It is very deterministic: for each input, there is a single correct output. And as luck would have it, I’d already found the correct output. What I was getting now as everything except correct.

Questioning The Nature Of My Existence

After putting in a hefty number of log statements, I realized: my set of “common elements” would sometimes have more than one element in it.

Starting with gnmCjzwnmCPTPhBwPjzBgqPjllJJSWlhfhQDSrpJRhDSlfJl
Intersecting rLHNHrLHVNbVHMMctZFHsbcsDSDWpSDSGfSRsRWSRllfGSSG
Intersecting NNtdMVrLNdZNvLvLZrzCndqBgwwPmwgjggBn
ERROR
Common Elements??? - ["j", "r", "B", "m", "z", "n", "q", "w", "g", "C", "P"]

As a general principle that’s fine. ["apple", "pear", "papaya"].commonElements would have multiple things in it: ["a", "p"]. But in this case it was not what I wanted, because I knew that I should only be getting a single element in common (due to the nature of this particular Advent of Code puzzle).

Why I was getting a different answer every time the code ran?

This problem of "getting many things back when I was only expecting one" also fully explains why I kept getting different results every time I ran the code. The Set of common elements was returning many things, but I was asking for the .first element in the Set. Sets, by their very nature, do not have a specific ordering, so when I ran the code over and over again, the "first" thing would be different each time. That meant that, each time, I'd get a different final result.

After staring at my code for a good 10 minutes, I did the sane thing: I asked for help.

Screenshot of a slack post where I state "man i’ve got something really weird going on with my code. i’m getting a proper chunk of 3 strings, but when i try to do the part 2 logic on them, it’s saying there are 11 elements in common between them. It’s consistently ignoring the second value"

Some of the others who were also up doing Advent of Code in this particular forum graciously popped into the thread and started doing what I was doing: digging apart every single line of code, trying to identify assumptions or gaps in logic. The questions started simple: “do you have typos?”. Then they started digging in to my commonElements implementation. “What’s this extension? What if you use dropFirst()? Are there assumptions that fail because this will be a SubSequence instead of an Array? Are you using the Swift Algorithms version of .chunks(ofCount:) or your own?”. Each question and it’s corresponding answer confirmed that, by all accounts, the code looked correct.

Then on a whim I asked:

is it possible there’s a bug in Set? 😛

Now, the likelihood of this actually being the case is very very very very small. So many people are using Set, a fundamental collection type, every single day that idea of there being a bug in its implementation–especially in something as important as set intersection–seemed laughably absurd.

WTF. Did Set intersection break and we’re just noticing?

We tried different versions of Xcode. We even considered manually downloading Swift:

Try downloading a recent swift toolchain

And yet, the code was still broken. And it continued to be broken when pulled out into a Swift playground, removing all of my fancy extensions:

let a = "gnmCjzwnmCPTPhBwPjzBgqPjllJJSWlhfhQDSrpJRhDSlfJl"
let b = "rLHNHrLHVNbVHMMctZFHsbcsDSDWpSDSGfSRsRWSRllfGSSG"
let c = "NNtdMVrLNdZNvLvLZrzCndqBgwwPmwgjggBn"

var s = Set(a)
s.formIntersection(b)
s.formIntersection(c)

print(s) // ["C", "q", "m", "r", "j", "n", "z", "g", "w", "P", "B"]

At this point, we had to conclude that something really strange was going on. Then one poster hit on an idea:

Yikes, try converting a b and c to sets before intersecting

This change was simple. Instead of s.formIntersection(a), I changed it to set.formIntersection(Set(a)). The change here is that instead of using the generic “find things in common with this other sequence of characters”, I was now using the method “find things in common with this other Set. These have different implementations because, due to the nature of how sets work, the second can be done much more “cheaply” than the first.

I think you found a (known) bug

So I tried it, and suddenly my code started working. The one who suggested this then dug up this pull request on the Swift repo: PR #59422: Fix handling of duplicate items in generic Set.intersection. It turns out, there was a bug in Set.intersection(_:), but it had only been discovered this past June, and the fix only applies to macOS Ventura and later (my machine is running Monterey still). The scope of the bug is fairly limited: it only showed up if you were using the general intersection method, and the sequence had “exactly as many duplicate items as items missing from self”. As it turned out, Advent of Code happened to provide me with exactly the right input to hit this multiple times.

In the end, my workaround was simple: I could simply make sure I pass in Set(item) instead of just item. But the adventure of digging this deep into my code, questioning assumption after assumption, and coming up with ways to test those assumptions was quite exhilarating.

And it proves that maybe… just maybe… it really is a bug in the standard library.


Edit: An earlier version of this post claimed the bug was not fixed yet. This is incorrect. The bug is fixed in Swift 5.7, but my computer is running macOS Monterey (12.6) and thus using an earlier version of Swift. I have since confirmed that the code works as expected on macOS Ventura.