HTTP in Swift, Part 17: Brain Dump

Part 17 in a series on building a Swift HTTP framework.

I was planning on having a few more posts on different loaders you can build using this architecture, but for the sake of “finishing” this series up, I’ve decided to forgo a post-per-loader and instead highlight the main points of a few of them.

OpenID

We’ve already taken a look at how to implement a loader that authorizes requests via an OAuth 2 flow, but there’s an abstraction that exists on top of that, called OpenID. With the OAuth loader, we needed to specify things like the login URL, the URL for refreshing tokens, and so on. OpenID allows for identity providers to abstract that away by shipping down a manifest that contains all of these URLs (and other idiosyncrasies of the protocol).

If we wanted to implement OpenID ourselves, we’d need a preliminary state in our state machine to first fetch this manifest, and then use it as the basis for subsequent state logic. Alternatively, we could wrap an existing implementation of OpenID (such as the official implementation) in a custom HTTPLoader subclass, and allow that library to perform the complex logic. In this case, our HTTPLoader subclass would serve as an adapter between the API provided by the library, and the API wanted by the HTTP loader chain.
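To make the “fetch the manifest first” step a bit more concrete, here is a minimal sketch of decoding such a manifest. It assumes the provider follows OpenID Connect Discovery, where the document is served at /.well-known/openid-configuration under the issuer and uses field names like authorization_endpoint and token_endpoint. The type name OpenIDConfiguration and the helper discoveryURL(forIssuer:) are made up for this example, and only a handful of the manifest’s fields are modeled.

```swift
import Foundation

// A hypothetical model of the few manifest fields an OAuth-style loader
// would need. A real discovery document contains many more fields.
struct OpenIDConfiguration: Decodable {
    let issuer: String
    let authorizationEndpoint: URL
    let tokenEndpoint: URL?   // may be absent for implicit-flow-only providers
    let jwksURI: URL

    enum CodingKeys: String, CodingKey {
        case issuer
        case authorizationEndpoint = "authorization_endpoint"
        case tokenEndpoint = "token_endpoint"
        case jwksURI = "jwks_uri"
    }
}

// Builds the location of the manifest for a given issuer, assuming the
// provider serves it at the conventional well-known path.
func discoveryURL(forIssuer issuer: URL) -> URL {
    return issuer.appendingPathComponent(".well-known/openid-configuration")
}

// Decodes the manifest from the raw bytes of a response body.
func decodeConfiguration(from body: Data) throws -> OpenIDConfiguration {
    return try JSONDecoder().decode(OpenIDConfiguration.self, from: body)
}
```

The preliminary state in the state machine would send a request to the discovery URL, hand the response body to something like decodeConfiguration(from:), and then use the decoded endpoints wherever the OAuth loader previously used hard-coded URLs.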

Caching

Conceptually, a caching loader should be relatively straightforward to understand. When a request enters this loader, it examines the request (and perhaps an HTTPRequestOption to indicate if caching is allowed) and sees if it matches any responses that have been persisted (in memory, on disk, etc). If such a response exists, then the loader returns that response instead of sending the request further down the chain.

If a response doesn’t exist, it continues with typical request execution, but also inserts another completion handler so that it can capture the response and (if the right conditions are met), persist it to use for future requests.
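Here is a rough sketch of that flow. To keep it self-contained, HTTPRequest, HTTPResponse, and HTTPLoader below are heavily simplified stand-ins for the types from earlier posts, not their real definitions, and the allowsCaching property stands in for an HTTPRequestOption. The cache is an in-memory dictionary and the “right conditions” are assumed to be “a GET that came back with a 200”.

```swift
// Simplified stand-ins for the series' types (assumptions for this sketch).
struct HTTPRequest: Hashable {
    var method = "GET"
    var url: String
    var allowsCaching = true   // stands in for an HTTPRequestOption
}

struct HTTPResponse: Equatable {
    var status: Int
    var body: String
}

struct NoNextLoaderError: Error {}
typealias HTTPResult = Result<HTTPResponse, Error>

class HTTPLoader {
    var nextLoader: HTTPLoader?

    // Default behavior: pass the request to the next loader in the chain.
    func load(request: HTTPRequest, completion: @escaping (HTTPResult) -> Void) {
        if let next = nextLoader {
            next.load(request: request, completion: completion)
        } else {
            completion(.failure(NoNextLoaderError()))
        }
    }
}

// Not thread-safe; a real loader would need to synchronize access to `cache`.
final class CachingLoader: HTTPLoader {
    private var cache: [HTTPRequest: HTTPResponse] = [:]

    override func load(request: HTTPRequest, completion: @escaping (HTTPResult) -> Void) {
        // Requests that opt out of caching go straight down the chain.
        guard request.allowsCaching, request.method == "GET" else {
            super.load(request: request, completion: completion)
            return
        }

        // Cache hit: answer immediately without touching the rest of the chain.
        if let cached = cache[request] {
            completion(.success(cached))
            return
        }

        // Cache miss: continue as usual, but intercept the response on its way back.
        super.load(request: request) { result in
            if case .success(let response) = result, response.status == 200 {
                self.cache[request] = response
            }
            completion(result)
        }
    }
}
```

A real implementation would also need to decide on expiration and invalidation, which this sketch ignores entirely.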

Deduplication

Deduplication is similar to caching, in that when a request comes in, the loader sees if it’s similar to an already in-progress request. If it is, then the new request is set aside, and when the original request gets a response, that response is duplicated to the second request.
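As a sketch of the idea, the loader below keeps a list of waiting completion handlers per in-flight request. The HTTPRequest, HTTPResponse, and HTTPLoader types are simplified stand-ins rather than the real ones from earlier posts, and “similar” is assumed to mean “equal according to Hashable”.

```swift
// Simplified stand-ins for the series' types (assumptions for this sketch).
struct HTTPRequest: Hashable {
    var method = "GET"
    var url: String
}

struct HTTPResponse: Equatable {
    var status: Int
    var body: String
}

struct NoNextLoaderError: Error {}
typealias HTTPResult = Result<HTTPResponse, Error>

class HTTPLoader {
    var nextLoader: HTTPLoader?

    func load(request: HTTPRequest, completion: @escaping (HTTPResult) -> Void) {
        if let next = nextLoader {
            next.load(request: request, completion: completion)
        } else {
            completion(.failure(NoNextLoaderError()))
        }
    }
}

// Not thread-safe; a real loader would need to synchronize access to `waiting`.
final class DeduplicatingLoader: HTTPLoader {
    // Completion handlers waiting on each request that is currently in flight.
    private var waiting: [HTTPRequest: [(HTTPResult) -> Void]] = [:]

    override func load(request: HTTPRequest, completion: @escaping (HTTPResult) -> Void) {
        // An equal request is already in progress: set this one aside.
        if waiting[request] != nil {
            waiting[request]?.append(completion)
            return
        }

        // First request of its kind: remember it, then send it down the chain.
        waiting[request] = [completion]
        super.load(request: request) { result in
            // Hand the single result to every handler that was waiting on it.
            let handlers = self.waiting.removeValue(forKey: request) ?? []
            handlers.forEach { $0(result) }
        }
    }
}
```

Unlike the caching loader, nothing is kept once the response arrives: a request that comes in after the original has finished goes down the chain again.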

Redirection

There are a couple of ways to handle redirected requests.

By default, URLSession will follow redirects, unless you specifically implement the urlSession(_:task:willPerformHTTPRedirection:newRequest:completionHandler:) delegate method on URLSessionTaskDelegate. So, you could do that and then conditionally allow redirection on requests based on a particular HTTPRequestOption you’ve created.

Alternatively, you could unconditionally deny redirections at the URLSession level, and then have a separate RedirectionFollowingLoader that takes incoming requests, duplicates them, and sends the duplicates down the chain. When the duplicate comes back, the loader examines the response and sees if it’s a redirection response. If it is, then it constructs a new request for the redirect, and sends that back down.

Once the loader gets back a non-redirection response, it uses that response as the response for the original request and sends it back out. You would need some logic to detect redirection loops and break out of them, but the key idea here is to send down a copy of a request, so that you get a chance to examine the response before deciding what to do about it.
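A minimal sketch of such a loader might look like the following. As before, the request, response, and loader types are simplified stand-ins rather than the real ones from the series, and the location property stands in for the Location response header. Loop detection here is just “have we visited this URL already, or followed too many hops”, and details such as relative URLs or changing the method on a 303 are left out.

```swift
// Simplified stand-ins for the series' types (assumptions for this sketch).
struct HTTPRequest {
    var method = "GET"
    var url: String
}

struct HTTPResponse {
    var status: Int
    var location: String? = nil   // stands in for the Location header
}

struct NoNextLoaderError: Error {}
struct RedirectLoopError: Error {}
typealias HTTPResult = Result<HTTPResponse, Error>

class HTTPLoader {
    var nextLoader: HTTPLoader?

    func load(request: HTTPRequest, completion: @escaping (HTTPResult) -> Void) {
        if let next = nextLoader {
            next.load(request: request, completion: completion)
        } else {
            completion(.failure(NoNextLoaderError()))
        }
    }
}

final class RedirectionFollowingLoader: HTTPLoader {
    let maximumRedirects: Int
    private let redirectStatuses: Set<Int> = [301, 302, 303, 307, 308]

    init(maximumRedirects: Int = 5) {
        self.maximumRedirects = maximumRedirects
    }

    override func load(request: HTTPRequest, completion: @escaping (HTTPResult) -> Void) {
        follow(request, visited: [request.url], completion: completion)
    }

    // Sends a copy of the request down the chain and inspects what comes back.
    private func follow(_ copy: HTTPRequest, visited: Set<String>,
                        completion: @escaping (HTTPResult) -> Void) {
        super.load(request: copy) { result in
            // Anything that is not a redirect becomes the response for the original request.
            guard case .success(let response) = result,
                  self.redirectStatuses.contains(response.status),
                  let location = response.location else {
                completion(result)
                return
            }

            // Break out of loops and overly long redirect chains.
            if visited.contains(location) || visited.count > self.maximumRedirects {
                completion(.failure(RedirectLoopError()))
                return
            }

            var next = copy
            next.url = location
            self.follow(next, visited: visited.union([location]), completion: completion)
        }
    }
}
```

The caller only ever sees one completion call per original request, regardless of how many hops happened in between.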

Certificate Pinning

In principle, certificate pinning should look like any other HTTPLoader: a request comes in, and before it gets sent to the next one, the certificate for the target server is validated against a certificate attached to the request as an HTTPRequestOption.

In practice, this is a little bit more difficult, because certificates are only available as the connection to the remote server is being negotiated down in the URLSessionLoader. Because of this, the course of action here is to not have a separate CertificatePinningLoader, but instead to provide a CertificateValidator value to an HTTPRequest that can be used if a loader needs to do some certificate validation (similar to Alamofire’s ServerTrustEvaluating protocol).

Then our URLSessionLoader needs to be updated to use a delegate that implements the method for handling a URLAuthenticationChallenge. When it receives the server trust challenge (NSURLAuthenticationMethodServerTrust), it consults that option for the request.
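To illustrate the shape of that option without tying the example to Apple’s Security framework, here is a platform-neutral sketch. The CertificateValidator protocol, the PinnedCertificateValidator type, and the decision(for:chain:host:) function are all hypothetical, and certificates are modeled as raw DER bytes. In a real URLSessionLoader the chain would come from the challenge’s server trust, and the decision would be translated into the disposition passed to the challenge’s completion handler.

```swift
// A hypothetical validator that a request can carry as an option.
protocol CertificateValidator {
    /// `chain` holds DER-encoded certificates, leaf certificate first.
    func validate(chain: [[UInt8]], host: String) -> Bool
}

// Accepts the connection only if the server's leaf certificate is one we pinned.
// This sketch does not replace the system's normal trust evaluation.
struct PinnedCertificateValidator: CertificateValidator {
    let pinnedCertificates: [[UInt8]]

    func validate(chain: [[UInt8]], host: String) -> Bool {
        guard let leaf = chain.first else { return false }
        return pinnedCertificates.contains(leaf)
    }
}

// Simplified stand-in for the series' request type (an assumption for this sketch).
struct HTTPRequest {
    var url: String
    var certificateValidator: (any CertificateValidator)? = nil
}

// What the loader's delegate could decide when it sees a server trust challenge.
enum ChallengeDecision {
    case proceed                 // validator accepted the chain
    case performDefaultHandling  // request carries no validator
    case cancel                  // validator rejected the chain
}

func decision(for request: HTTPRequest, chain: [[UInt8]], host: String) -> ChallengeDecision {
    guard let validator = request.certificateValidator else {
        return .performDefaultHandling
    }
    return validator.validate(chain: chain, host: host) ? .proceed : .cancel
}
```

The point of the design is that the request carries the policy, while the loader that actually sees the connection is the one that applies it.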

Peer-to-peer

A peer-to-peer loader is interesting, because it stems from the realization that the contract for the HTTPLoader says nothing about which device the response comes from. We’ve already seen examples of loaders that will return fake responses (for mocking) or re-use responses (caching and deduplication). A P2P loader is one that can decide to ship a request off to another device, and allow that device to provide a response.

This could be done via a myriad of technologies, ranging from something like MultipeerConnectivity to Bluetooth or direct socket connections. The possibilities here are pretty vast.

The astute observer will also realize that the URLSessionLoader we created early on fits in this sort of category. That’s a loader that “offloads” the responsibility of producing a response to another device. It happens to be a device that is also an HTTP server, but our loading stack doesn’t directly have to know that.

Streaming Responses

One area where this framework does not work terribly well is with streaming responses. This is pretty apparent: we’ve built everything around the expectation that a discrete and finite request has a discrete and finite response. Streaming kind of breaks that expectation. We can do a streamed body in an upload, because the process of sending that stream is part of sending our single discrete request.

There are some kinds of streamed bodies we could handle, such as file downloads. For these, we’d want to provide an OutputStream (or similar) to say “put any bytes you get back here”; by default this could be a stream to an in-memory Data value. This would allow us to stream a response directly to a file, instead of going through an in-memory Data value.
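As a small sketch of the “put any bytes you get back here” idea, the functions below push chunks of a response body into whatever OutputStream the caller supplies. The names writeAll(_:to:) and receive(chunks:into:) are made up for this example, and the array of chunks stands in for the pieces of data a session delegate would hand over as they arrive.

```swift
import Foundation

// Writes every byte of `bytes` to `stream`. A single write(_:maxLength:) call
// may accept only part of the buffer, so we loop until everything is written.
func writeAll(_ bytes: [UInt8], to stream: OutputStream) -> Bool {
    var offset = 0
    while offset < bytes.count {
        let written = bytes[offset...].withUnsafeBufferPointer { buffer -> Int in
            guard let base = buffer.baseAddress else { return -1 }
            return stream.write(base, maxLength: buffer.count)
        }
        if written <= 0 { return false }   // the stream failed or is full
        offset += written
    }
    return true
}

// Forwards each chunk of a response body to the sink as it "arrives".
// Returns false if any chunk could not be written completely.
func receive(chunks: [[UInt8]], into sink: OutputStream) -> Bool {
    sink.open()
    defer { sink.close() }
    for chunk in chunks {
        if !writeAll(chunk, to: sink) { return false }
    }
    return true
}
```

Passing an OutputStream(toFileAtPath:append:) sends the body straight to disk, while OutputStream.toMemory() would give the in-memory default, with the accumulated bytes available through the stream’s dataWrittenToMemoryStreamKey property.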

For a live video stream, we could provide an OutputStream that pipes the data into an AVSession. However, we’d be explicitly forgoing some of the semantics of “single request, single response” in order to make this work. We would also need to be very careful about how we implement request duplication (such as would be needed by a redirecting loader).

Conclusion

There is a lot we can do with this framework. A few things are somewhat complicated and require working around/with specific implementation details of system APIs (such as certificate pinning, streamed responses, etc). On the whole, though, this approach of modeling networking as “send a request, and eventually get a response” allows us to build extremely flexible, composable, and customizable networking stacks.

In the next (and likely final) post, we’ll be zooming back out to look at the high-level overview of the framework we’ve created, and see how it fits in with other Swift technologies.