Exploiting String Interpolation For Fun And For Profit

A while ago I was playing around with Swift’s string interpolation functionality and came up with something cool I thought I’d share with you.

What’s String Interpolation?

In Swift, string interpolation is the functionality that lets us do substitutions into strings using the \(...) syntax:

let name = "world"
let greeting = "Hello, \(name)!"

This works because of a couple of underlying pieces provided by the standard library and the compiler:

  1. ExpressibleByStringInterpolation
  2. Some nifty syntactic transformations

The ExpressibleByStringInterpolation protocol follows the builder pattern, which describes constructing a value by first creating an intermediate “builder”, tossing some configuration calls at it, and then asking it to “build” the final output value.

Thus, the definition of ExpressibleByStringInterpolation is straightforward:

public protocol ExpressibleByStringInterpolation: ExpressibleByStringLiteral {
    associatedtype StringInterpolation: StringInterpolationProtocol
    init(stringInterpolation: Self.StringInterpolation)
}

This protocol expresses three requirements for adopting types:

  1. You need to provide a type called StringInterpolation that itself conforms to the StringInterpolationProtocol protocol. This is the “builder” used to construct stuff
  2. You need to have an initializer for your type that can take one of these builder instances
  3. You also need to conform to the ExpressibleByStringLiteral protocol

The last requirement is a bit odd at first, but it makes sense. If you allow your types to be constructed like this…

// build an instance of MyType using ExpressibleByStringInterpolation
let myType: MyType = "Hello, \(name)!"

…then presumably you should also allow your types to be constructed like this:

// build an instance of MyType using ExpressibleByStringLiteral
let myType: MyType = "Hello, world!"
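
To make the three requirements concrete, here is a minimal sketch of a conforming type. The Greeting type is made up for illustration, and it assumes you're happy reusing the standard library's DefaultStringInterpolation as the builder rather than writing your own:

```swift
// Hypothetical wrapper type, used only for illustration.
struct Greeting: ExpressibleByStringInterpolation {
    let text: String

    // Requirement inherited from ExpressibleByStringLiteral.
    init(stringLiteral value: String) {
        text = value
    }

    // Requirement from ExpressibleByStringInterpolation. The builder here is
    // the standard library's DefaultStringInterpolation.
    init(stringInterpolation: DefaultStringInterpolation) {
        text = String(stringInterpolation: stringInterpolation)
    }
}

let name = "world"
let interpolated: Greeting = "Hello, \(name)!"   // goes through init(stringInterpolation:)
let literal: Greeting = "Hello, world!"          // goes through init(stringLiteral:)
```

Both values end up holding the same text; they just arrive through different initializers.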

The StringInterpolationProtocol is where things start getting weird, because it can’t be fully expressed in Swift syntax. The declaration of the protocol is this:

public protocol StringInterpolationProtocol {
    associatedtype StringLiteralType: _ExpressibleByBuiltinStringLiteral
    init(literalCapacity: Int, interpolationCount: Int)
    mutating func appendLiteral(_ literal: Self.StringLiteralType)
}

That’s a StringLiteralType (basically, always use String for this unless you have a really bizarre situation where that’s not right), an initializer (to construct the builder itself), and an appendLiteral method.

But where’s all the stuff about appending the interpolated values?

The answer is that they’re there, but Swift doesn’t have a way to describe a protocol like this. The gist is that you need a method called appendInterpolation, but there aren’t any requirements on what the parameters for this method are.

To understand this a bit better, let’s detour and take a look at the transformation that happens when you use string interpolation in code.

Syntactic Transformation

When we write an interpolated string, the compiler transforms it to use the builder. If we write something like this:

let value: MyType = "Hello, \(name)!"

Then at compile-time it gets turned into this:

var builder = MyType.StringInterpolation(literalCapacity: 8, interpolationCount: 1)
builder.appendLiteral("Hello, ")
builder.appendInterpolation(name)
builder.appendLiteral("!")
let value = MyType(stringInterpolation: builder)

The literalCapacity parameter is the approximate combined size of the literal portions of the string (it’s meant as a capacity hint, so it may not exactly equal the character count), and the interpolationCount indicates how many substitutions there are.
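
You can check this transformation by hand. The sketch below writes out the builder calls manually against the standard library's DefaultStringInterpolation (the builder that String itself uses) and compares the result to the ordinary interpolated literal:

```swift
let name = "world"

// What we normally write.
let sugared = "Hello, \(name)!"

// Roughly what the compiler produces for us, written out by hand.
var builder = DefaultStringInterpolation(literalCapacity: 8, interpolationCount: 1)
builder.appendLiteral("Hello, ")
builder.appendInterpolation(name)
builder.appendLiteral("!")
let manual = String(stringInterpolation: builder)
```

Both sugared and manual hold the same string.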

What’s interesting here is that appendInterpolation call. Basically, anything that we put inside the \(…) part of the interpolation becomes the arguments to the appendInterpolation method. So \(foo: bar) becomes ….appendInterpolation(foo: bar), \(age, formatter: someNumberFormatter) becomes ….appendInterpolation(age, formatter: someNumberFormatter), and so on.
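
As a small sketch of the labeled-argument case, here is an extra appendInterpolation added to String's own builder. The paddedTo: label and the zero-padding behavior are made up for this example, and it only pads non-negative values sensibly:

```swift
extension String.StringInterpolation {
    // Hypothetical overload: "\(value, paddedTo: width)" left-pads with zeros.
    mutating func appendInterpolation(_ value: Int, paddedTo width: Int) {
        let digits = String(value)
        let padding = String(repeating: "0", count: max(0, width - digits.count))
        appendLiteral(padding + digits)
    }
}

// The compiler turns this into builder.appendInterpolation(42, paddedTo: 5)
let order = "Order #\(42, paddedTo: 5)"
// order == "Order #00042"
```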

Thus, we can create string interpolators that limit their accepted substitutions based on what kinds of appendInterpolation methods we write. You could, for example, create an interpolator that only accepts interpolated integers by only providing an appendInterpolation(_ int: Int) method.
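
Here is a minimal sketch of that integers-only idea. The IntOnlyText name and its output property are invented for the example:

```swift
// Hypothetical type whose interpolations only accept Int values.
struct IntOnlyText: ExpressibleByStringInterpolation {
    struct StringInterpolation: StringInterpolationProtocol {
        var output = ""

        init(literalCapacity: Int, interpolationCount: Int) {
            output.reserveCapacity(literalCapacity)
        }

        mutating func appendLiteral(_ literal: String) {
            output.append(literal)
        }

        // The only appendInterpolation available, so only Int can be interpolated.
        mutating func appendInterpolation(_ int: Int) {
            output.append(String(int))
        }
    }

    let text: String

    init(stringLiteral value: String) {
        text = value
    }

    init(stringInterpolation: StringInterpolation) {
        text = stringInterpolation.output
    }
}

let count = 3
let message: IntOnlyText = "You have \(count) new messages"
// message.text == "You have 3 new messages"

// This would fail to compile, because there is no appendInterpolation(_: String):
// let bad: IntOnlyText = "Hello, \("world")"
```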

SwiftUI’s LocalizedStringKey

String interpolation is how SwiftUI is able to build up NSLocalizedString keys to look up translations in your .strings files. When we write Text("Hello, \(name)!") in SwiftUI, we’re not passing a String instance to the Text initializer; we’re passing a LocalizedStringKey instance, and that type happens to be ExpressibleByStringInterpolation.

And it’s also not a standard interpolation implementation like the one we find on String. In addition to building up the “fallback” string to use in the case where it can’t find a proper translation, it builds up the key itself. You can imagine how this might work:

// An imagined sketch, not SwiftUI's actual implementation
extension LocalizedStringKey {
    struct StringInterpolation: StringInterpolationProtocol {
        var key = ""
        var fallback = ""

        init(literalCapacity: Int, interpolationCount: Int) { }

        mutating func appendLiteral(_ literal: String) {
            key.append(literal)
            fallback.append(literal)
        }

        mutating func appendInterpolation<T>(_ value: T) {
            key.append("%@")
            fallback.append("\(value)")
        }
    }
}

With this, you can build up both the fallback value ("Hello, world!") and the key to use to look up the translation in your strings file ("Hello, %@!").
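
If you want to try the idea without SwiftUI, here is a self-contained sketch using a made-up TranslationKey type. The extra Int overload that emits %lld is my own assumption about how a type-specific format specifier could be chosen; it is not a claim about what LocalizedStringKey does internally:

```swift
// Hypothetical stand-in for a localized key type; not SwiftUI's implementation.
struct TranslationKey: ExpressibleByStringInterpolation {
    struct StringInterpolation: StringInterpolationProtocol {
        var key = ""
        var fallback = ""

        init(literalCapacity: Int, interpolationCount: Int) { }

        mutating func appendLiteral(_ literal: String) {
            key.append(literal)
            fallback.append(literal)
        }

        // Assumed convention: integers get an integer format specifier.
        mutating func appendInterpolation(_ value: Int) {
            key.append("%lld")
            fallback.append(String(value))
        }

        // Everything else falls back to the object specifier.
        mutating func appendInterpolation<T>(_ value: T) {
            key.append("%@")
            fallback.append(String(describing: value))
        }
    }

    let key: String
    let fallback: String

    init(stringLiteral value: String) {
        key = value
        fallback = value
    }

    init(stringInterpolation: StringInterpolation) {
        key = stringInterpolation.key
        fallback = stringInterpolation.fallback
    }
}

let name = "world"
let count = 3
let greeting: TranslationKey = "Hello, \(name)! You have \(count) items."
// greeting.key      == "Hello, %@! You have %lld items."
// greeting.fallback == "Hello, world! You have 3 items."
```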

A Solution in Search of a Problem

The fact that interpolation is a syntactic transformation means we can get creative. Let’s take a closer look at the appendInterpolation(…) bit.

When you have a line of code like this, there are technically three different things that could be happening:

foo.doSomething(bar)

The obvious first case is that you have a func doSomething(_ bar: Bar) method on the type in question.

The next slightly-less-obvious-but-still-kind-of-common case is when you have a var doSomething: (Bar) -> Void property on the type. In this case, .doSomething(bar) is retrieving the closure and executing it, all on the same line.

The least-obvious case is that you’ve got a var doSomething: OtherType property, and that OtherType is @dynamicCallable. @dynamicCallable is almost never used in Swift, because it’s kind of weird and was added to make it easier to bridge in libraries from other languages. It allows you to have a value and then directly “execute” that value by simply throwing (…) on the end.
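
Here are the three cases side by side, as a small sketch. All of the names (Foo, Callable, viaMethod and friends) are invented for the example:

```swift
@dynamicCallable
struct Callable {
    // Called whenever a Callable value is "executed" with labeled arguments.
    func dynamicallyCall(withKeywordArguments args: KeyValuePairs<String, Int>) -> String {
        let parts = args.map { "\($0.key)=\($0.value)" }
        return "callable " + parts.joined(separator: ", ")
    }
}

struct Foo {
    // Case 1: a plain method.
    func viaMethod(_ bar: Int) -> String { "method \(bar)" }

    // Case 2: a closure-typed property, retrieved and then called.
    var viaClosure: (Int) -> String = { bar in "closure \(bar)" }

    // Case 3: a property whose type is @dynamicCallable.
    var viaCallable: Callable { Callable() }
}

let foo = Foo()
let first = foo.viaMethod(1)             // "method 1"
let second = foo.viaClosure(2)           // "closure 2"
let third = foo.viaCallable(count: 3)    // "callable count=3"
```

All three call sites look the same, but only the third lets the caller pick an arbitrary argument label.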

We can exploit @dynamicCallable to get a whole lot more information from string interpolation.

Inventing a Problem

Let’s imagine that we wanted to make it easy to declare some sort of “route” functionality for an app, and that we’d want to support dealing with incoming paths and automatically parse out certain things. For example, a path with /person/1234 might result in a ["person": PersonID(1234)] value. Or a /person/1234/items/42/delete might result in: ["person": PersonID(1234), "item": ItemID(42), "action": ItemAction.delete].

We can imagine how we might express this with string interpolation:

let route: Route = "/person/\(person: PersonID.self)/items/\(item: ItemID.self)/\(action: ItemAction.self)"

With @dynamicCallable and string interpolation, we can build this.

It starts by having a Route type that is ExpressibleByStringInterpolation:

struct Route: ExpressibleByStringInterpolation {
    typealias StringInterpolation = RouteMatcher
    
    init(stringInterpolation: RouteMatcher) {  }
}

Our RouteMatcher will build up a regular expression to do the parsing.

// yes, this is a class. That will be explained shortly.
class RouteMatcher: StringInterpolationProtocol {
    internal var pattern = ""
    required init(literalCapacity: Int, interpolationCount: Int) { }

    private func isSpecialRegexCharacter(_ char: Character) -> Bool { … }

    func appendLiteral(_ literal: String) {
        for character in literal {
            if isSpecialRegexCharacter(character) { pattern.append("\\") }
            pattern.append(character)
        }
    }
}
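
If you would rather not maintain your own list of special characters, Foundation's NSRegularExpression.escapedPattern(for:) can do the escaping for a whole literal segment. This is an alternative to the character-by-character loop above, not what the matcher in this post uses; the path strings are made up:

```swift
import Foundation

// A literal segment containing a regex metacharacter (the dot).
let literalSegment = "/v1.0/items"

// Let Foundation add whatever backslashes are needed.
let escapedSegment = NSRegularExpression.escapedPattern(for: literalSegment)
let literalRegex = try! NSRegularExpression(pattern: "^" + escapedSegment + "$", options: [])

func matchesLiteral(_ path: String) -> Bool {
    let range = NSRange(location: 0, length: path.utf16.count)
    return literalRegex.firstMatch(in: path, options: [], range: range) != nil
}

let exact = matchesLiteral("/v1.0/items")     // the literal itself matches
let lookalike = matchesLiteral("/v1x0/items") // an unescaped "." would have matched this
```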

So far, so good. But now things are going to get hairy. We need to allow an interpolation segment (\(person: PersonID.self)) where we have a named parameter and accept a type as the value. We can’t just have a whole bunch of appendInterpolation(…) methods on our RouteMatcher, because we don’t know what name the user will want to type in. This is where @dynamicCallable comes in.

So, we define a property called appendInterpolation that returns our @dynamicCallable type:

class RouteMatcher: StringInterpolationProtocol {
    var appendInterpolation: Capture { Capture(matcher: self) }
}

@dynamicCallable
struct Capture {
    let matcher: RouteMatcher
}

Next, we implement the dynamicallyCall() method that defines the “keyword arguments” syntax. This will allow us to execute the Capture type using named argument parameters:

@dynamicCallable
struct Capture {
    // since RouteMatcher is a reference type, modifying it here is modifying the "right" value
    let matcher: RouteMatcher

    func dynamicallyCall(withKeywordArguments args: KeyValuePairs<String, Any.Type>) {
        // args is a collection of (String, Any.Type) tuples
        // TODO: relay the information back to the RouteMatcher
    }
}
// example:
// let c: Capture = …
// c(foo: bar)
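
To see the "relay the information back" part in isolation, here is a sketch that swaps the matcher for a tiny made-up Recorder class. Because Recorder is a reference type, the Capture struct can mutate it even though dynamicallyCall is not a mutating method:

```swift
// Hypothetical stand-in for the matcher: just remembers what it was given.
final class Recorder {
    var captures: [(name: String, type: Any.Type)] = []
}

@dynamicCallable
struct Capture {
    let recorder: Recorder

    func dynamicallyCall(withKeywordArguments args: KeyValuePairs<String, Any.Type>) {
        // Each labeled argument arrives as a (key, value) pair, in call order.
        for (name, type) in args {
            recorder.captures.append((name: name, type: type))
        }
    }
}

let recorder = Recorder()
let capture = Capture(recorder: recorder)

// The labels are arbitrary; nothing declares "person" or "item" ahead of time.
capture(person: Int.self, item: String.self)

let recordedNames = recorder.captures.map { $0.name }
// recordedNames == ["person", "item"]
```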

At this point, our matcher is almost complete. If you build this code, you’ll find that Xcode complains about missing protocol requirements:

error: type conforming to 'StringInterpolationProtocol' does not implement a valid 'appendInterpolation' method

The compiler is still looking for an actual method, even though we’re not going to actually be using it. Still, we need to make the compiler happy, so we add this to our RouteMatcher:

class RouteMatcher: StringInterpolationProtocol {
    
    @_disfavoredOverload
    func appendInterpolation(_ willThisExecute: Never) { }
    
}

This satisfies the compiler (it sees a method called “appendInterpolation”), but there are two things that make sure it never actually gets used:

  1. The @_disfavoredOverload annotation tells the compiler that “if you have to choose between this thing and another thing, prefer the other thing”. Yes, the underscore technically means it’s “private” and therefore “use it at your own risk”. But, it works and is used all over in SwiftUI. 🤷‍♂️

  2. The use of Never as the parameter type means that even if the compiler messes up and picks this method, it won’t work because (gasp) you can’t ever create an instance of Never to pass to the method.

Solving the Imaginary Problem

With this, the compiler now happily rewrites the string interpolation code, and our implementation means that we can capture named arguments and the type values provided to them. If you were to follow this through, you might end up with code that looks approximately like this:

import Foundation

protocol PathExtractible {
    init?(pathValue: String)
}

struct Route: ExpressibleByStringInterpolation {
    private let matcher: (String) -> Dictionary<String, Any>?
    
    init(stringLiteral value: String) {
        let m = Matcher(literalCapacity: 0, interpolationCount: 0)
        m.appendLiteral(value)
        self.matcher = m.build()
    }
    
    init(stringInterpolation: Matcher) {
        self.matcher = stringInterpolation.build()
    }
    
    func match(_ path: String) -> Dictionary<String, Any>? {
        return matcher(path)
    }
}

class Matcher: StringInterpolationProtocol {
    // regex metacharacters that need a backslash, including the closing bracket and braces
    private let escaped = CharacterSet(charactersIn: #"[]{}\^$.|?*+()"#)
    fileprivate var pattern = Array<Unicode.Scalar>()
    fileprivate var extractions = Array<(String, PathExtractible.Type)>()
    
    var appendInterpolation: Capture { Capture(matcher: self) }

    public required init(literalCapacity: Int, interpolationCount: Int) { }
    
    func appendLiteral(_ literal: String) {
        for char in literal.unicodeScalars {
            if escaped.contains(char) { pattern.append("\\") }
            pattern.append(char)
        }
    }
    
    @_disfavoredOverload
    func appendInterpolation(_ willThisExecute: Never) { }
    
    func build() -> (String) -> Dictionary<String, Any>? {
        let f = "^" + String(String.UnicodeScalarView(pattern)) + "$"
        let regex = try! NSRegularExpression(pattern: f, options: [])
        let extractions = self.extractions
        
        return { path in
            guard let match = regex.firstMatch(in: path, options: [], range: NSRange(location: 0, length: path.utf16.count)) else {
                return nil
            }
            guard match.numberOfRanges == extractions.count + 1 else {
                return nil
            }
            
            var extracted = Dictionary<String, Any>()

            // capture groups are 1-indexed
            for captureGroup in 1 ..< match.numberOfRanges {
                let substring = (path as NSString).substring(with: match.range(at: captureGroup))
                let (name, type) = extractions[captureGroup - 1]
                guard let value = type.init(pathValue: substring) else {
                    return nil
                }
                extracted[name] = value
            }
            return extracted
        }
    }
    
    @dynamicCallable
    struct Capture {
        fileprivate let matcher: Matcher
        
        func dynamicallyCall(withKeywordArguments args: KeyValuePairs<String, PathExtractible.Type>) {
            for (name, type) in args {
                matcher.pattern.append(contentsOf: "(.+?)".unicodeScalars)
                matcher.extractions.append((name, type))
            }
        }
    }
    
}
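
The regular expression side of build() can be tried on its own. The sketch below hard-codes the kind of pattern the matcher would generate for a two-capture route (the pattern and paths are made up for the example) and pulls out the capture groups, using Range(_:in:) instead of the NSString bridge:

```swift
import Foundation

// Roughly what the matcher would build for "/api/profile/<capture>/lists/<capture>".
let routePattern = "^/api/profile/(.+?)/lists/(.+?)$"
let routeRegex = try! NSRegularExpression(pattern: routePattern, options: [])

func captures(in path: String) -> [String]? {
    let whole = NSRange(location: 0, length: path.utf16.count)
    guard let match = routeRegex.firstMatch(in: path, options: [], range: whole) else {
        return nil
    }
    var values = [String]()
    // Group 0 is the whole match; the capture groups start at 1.
    for group in 1 ..< match.numberOfRanges {
        guard let range = Range(match.range(at: group), in: path) else { return nil }
        values.append(String(path[range]))
    }
    return values
}

let found = captures(in: "/api/profile/1234/lists/5678")
// found == ["1234", "5678"]
let missing = captures(in: "/api/feed")
// missing == nil
```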

This uses a “PathExtractible” protocol to know how to construct values extracted from the path. You could do a bit of convenience work to automatically provide an implementation for things that can already be created from strings, like RawRepresentable or LosslessStringConvertible types.
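
That convenience work might look something like the sketch below. The protocol is repeated so the snippet stands alone, and ItemAction with its two cases is invented for the example. Note that a type that is both LosslessStringConvertible and String-backed RawRepresentable would see two candidate defaults and would need its own init?(pathValue:):

```swift
protocol PathExtractible {
    init?(pathValue: String)
}

// Anything that can already be parsed from a string gets the requirement for free.
extension PathExtractible where Self: LosslessStringConvertible {
    init?(pathValue: String) {
        self.init(pathValue)
    }
}

// Same idea for string-backed enums and other RawRepresentable types.
extension PathExtractible where Self: RawRepresentable, RawValue == String {
    init?(pathValue: String) {
        self.init(rawValue: pathValue)
    }
}

extension Int: PathExtractible { }
extension String: PathExtractible { }

// Hypothetical enum for the "action" part of a route.
enum ItemAction: String, PathExtractible {
    case delete
    case edit
}

let memberID = Int(pathValue: "1234")            // Optional(1234)
let notANumber = Int(pathValue: "abc")           // nil
let action = ItemAction(pathValue: "delete")     // Optional(.delete)
```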

Using this code would look something like this:

let route: Route = "/api/profile/\(memberID: Int.self)/lists/\(listID: String.self)"
let validArguments = route.match("/api/profile/1234/lists/5678")
// validArguments == ["memberID": 1234, "listID": "5678"]

let invalidArguments = route.match("/api/feed")
// invalidArguments == nil

There are a couple of downsides to this approach, which is why I’ve yet to come up with a situation where this code would actually be useful:

  1. There’s no way (that I’ve found) to encode the generic types being captured in the Route itself. This means that every path that comes in has to be naïvely tried on every possible Route in order to find a match
  2. The lack of specifiable generics also means that the matched values come out as an (unfortunate) Dictionary<String, Any>. So even though you know that the types are the right kind, you’ll still have to do some force-casting in order to get them in a useful form. Perhaps there’s something more that could be done here using a custom Decoder type.
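
Until something better comes along, a small helper can at least swap the force-casts for conditional ones. This is only a sketch: the typedValue function is made up, and the dictionary is written out by hand to stand in for what match(_:) would return:

```swift
// Stand-in for the result of route.match(...)
let matched: [String: Any] = ["memberID": 1234, "listID": "5678"]

// Hypothetical helper: look up a key and conditionally cast it to the expected type.
func typedValue<T>(for key: String, in values: [String: Any], as type: T.Type = T.self) -> T? {
    values[key] as? T
}

let memberID: Int? = typedValue(for: "memberID", in: matched)
let listID = typedValue(for: "listID", in: matched, as: String.self)

// Asking for the wrong type yields nil instead of crashing.
let wrongType = typedValue(for: "memberID", in: matched, as: String.self)
```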

I share this with you because string interpolation is neat, the @dynamicCallable stuff is cool, and hopefully you’ll come up with something that uses this that we all can benefit from.

