HTTP in Swift, Part 14: OAuth Setup

Part 14 in a series on building a Swift HTTP framework:

  1. HTTP in Swift, Part 1: An Intro to HTTP
  2. HTTP in Swift, Part 2: Basic Structures
  3. HTTP in Swift, Part 3: Request Bodies
  4. HTTP in Swift, Part 4: Loading Requests
  5. HTTP in Swift, Part 5: Testing and Mocking
  6. HTTP in Swift, Part 6: Chaining Loaders
  7. HTTP in Swift, Part 7: Dynamically Modifying Requests
  8. HTTP in Swift, Part 8: Request Options
  9. HTTP in Swift, Part 9: Resetting
  10. HTTP in Swift, Part 10: Cancellation
  11. HTTP in Swift, Part 11: Throttling
  12. HTTP in Swift, Part 12: Retrying
  13. HTTP in Swift, Part 13: Basic Authentication
  14. HTTP in Swift, Part 14: OAuth Setup
  15. HTTP in Swift, Part 15: OAuth
  16. HTTP in Swift, Part 16: Composite Loaders
  17. HTTP in Swift, Part 17: Brain Dump
  18. HTTP in Swift, Part 18: Wrapping Up

While Basic Access authentication works for “basic” cases, it is far more common these days to see some form of OAuth used instead. There are some interesting advantages that OAuth has over Basic authentication, such as:

  • the app never has access to the user’s username and password
  • the user can change their username or password without affecting the app’s access
  • the user can remotely revoke access for an app

These advantages come at the cost of added complexity, and it is that complexity that we’ll be exploring in this post.

The OAuth Flow

A basic (non-erroring) OAuth flow looks something like this:

[Figure: A basic OAuth flow]

When we decide to start authentication, we first check to see if we have any saved tokens. There are three possible outcomes here:

  1. We have credentials, and they have not yet expired
  2. We have credentials, but they have expired
  3. We do not have credentials

The first case is easy. If we have un-expired credentials, then we don’t need to do anything. The second case is also pretty easy. If we have expired credentials, we can send them off to the authentication server to get “fresh” versions of the tokens (assuming the user has not revoked access), and then we’re done.

If we do not have any credentials, then we need to ask the user to log in. This is done by constructing a URL to a login webpage, and then displaying that page to the user. The user logs in to the webpage, and the server showing that page validates the correctness of the credentials. Assuming the user’s name and password are correct, the webpage redirects to a new URL, which the browser intercepts and redirects to the app.

This redirection URL has a special code, generated by the server, that the app can use to get access credentials. So with the code in-hand, the app now turns around and asks the server: “given this code, I need authorization tokens”. The server responds with the tokens, and the process completes.

The OAuth flow is very well-defined, and is a great example of a state machine. There are a few specific things to do, and the order in which they’re done is strictly defined, and only that order is allowed.

If we were to implement this flow in Swift, a common approach might be to have some sort of State enum like we saw in the Basic Authentication post. However, given the number of states and the very explicit flow allowed, I think a more formal approach is warranted.

The OAuth State Machine

First, we’ll define a “State Machine” class that represents the overall process:

class OAuthStateMachine {
    func run() { }
}

Next, we’ll define an OAuthState class that represents “a circle” in the diagram above, and give the state machine a state. This state will have an enter() method, which is called when we “enter” that state and is where the state starts executing its logic:

class OAuthState {
    func enter() { }
}

class OAuthStateMachine {
    private var currentState: OAuthState!
}

A state will need a way to tell the machine when it’s ready to move on, so it’ll need a reference to the machine, and a way to move states:

class OAuthState {
    unowned var machine: OAuthStateMachine!
    func enter() { }
}

class OAuthStateMachine {
    private var currentState: OAuthState!

    func move(to newState: OAuthState) {
        currentState?.machine = nil
        newState.machine = self
        currentState = newState
        currentState.enter()
    }
}
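
To see how move(to:) drives the flow, here is a minimal, self-contained sketch with toy states. ToyMachine, ToyState, First, Second, Finished and the visited log are all made up for illustration, and the sketch uses a weak reference instead of the unowned one above so that it is safe to run in isolation:

```swift
// Toy versions of the machine and state, for illustration only.
class ToyState {
    weak var machine: ToyMachine?
    func enter() { }
}

class ToyMachine {
    private var currentState: ToyState?
    // Records the type name of every state we enter, in order.
    private(set) var visited: [String] = []

    func move(to newState: ToyState) {
        currentState?.machine = nil
        newState.machine = self
        currentState = newState
        visited.append(String(describing: type(of: newState)))
        newState.enter()
    }
}

// Each toy state immediately hands off to the next one.
final class First: ToyState {
    override func enter() { machine?.move(to: Second()) }
}

final class Second: ToyState {
    override func enter() { machine?.move(to: Finished()) }
}

// The last state does nothing, so the chain stops here.
final class Finished: ToyState { }

let toyMachine = ToyMachine()
toyMachine.move(to: First())
print(toyMachine.visited)
```

Each state decides where to go next, and the machine only keeps track of which state is current. The real OAuth states work the same way, except that they usually wait on a delegate or the network before moving on.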

Now we can define the states that correspond to our diagram:

class GetSavedCredentials: OAuthState { }
class LogIn: OAuthState { }
class GetTokens: OAuthState { }
class RefreshTokens: OAuthState { }
class Done: OAuthState { }

For each one of these, we’ll need to implement their enter() method. Let’s look at each one.

Retrieving Credentials

The GetSavedCredentials state is when we need to turn around and ask the app if it has any credentials saved for us in the Keychain (or other secure storage location). In the Basic Access post, we did this via a delegate. We’ll assume a similar approach here.

class GetSavedCredentials: OAuthState {
    override func enter() {
        // we need to ask someone if there are any saved credentials
        // let's assume the state machine itself has a delegate we can ask

        let delegate = machine.delegate
        DispatchQueue.main.async {
            // it's always polite to invoke delegate methods on the main thread
            delegate.stateMachine(self.machine, wantsPersistedCredentials: { credentials in
                // this closure will be called with either the credentials that were saved, or "nil"
                self.processCredentials(credentials)
            })
        }
    }

    private func processCredentials(_ credentials: OAuthCredentials?) {
        let nextState: OAuthState
        if let credentials = credentials, credentials.expired == false {
            // we got credentials and they're not expired
            nextState = Done(credentials: credentials)
        } else if let credentials = credentials {
            // we got credentials but they are expired
            nextState = RefreshTokens(credentials: credentials)
        } else {
            // we did not get credentials
            nextState = LogIn()
        }
        machine.move(to: nextState)
    }
}

That’s really all there is to it. When we “enter” the state, we ask the delegate if it has any saved credentials. At some point it’ll get back to us, whereupon we examine the result and decide where to go next.
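
The code above relies on an OAuthCredentials type with an expired property, which this post never spells out. One possible shape, with the three-way decision pulled out into a pure function so it can be checked on its own, might look like this. The stored properties, the NextStep enum and the nextStep(for:) function are assumptions made for this sketch:

```swift
import Foundation

// A hypothetical shape for the credentials; the real type may differ.
struct OAuthCredentials {
    let accessToken: String
    let refreshToken: String
    let expiresAt: Date

    // Expired once the expiration date is no longer in the future.
    var expired: Bool { expiresAt <= Date() }
}

// The three outcomes described above, as plain values.
enum NextStep: Equatable {
    case done
    case refresh
    case logIn
}

func nextStep(for credentials: OAuthCredentials?) -> NextStep {
    guard let credentials = credentials else {
        // we did not get credentials
        return .logIn
    }
    // expired credentials get refreshed; valid ones mean we're finished
    return credentials.expired ? .refresh : .done
}

let fresh = OAuthCredentials(accessToken: "a", refreshToken: "r", expiresAt: .distantFuture)
let stale = OAuthCredentials(accessToken: "a", refreshToken: "r", expiresAt: .distantPast)
```

Keeping the decision free of side effects like this makes it easy to test without a delegate or a running state machine.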

Refreshing Tokens

Tokens that we get from the authentication server come with two parts: the “refresh” token and the “access” token. The access token is what we use to authenticate each request and tends to have a short lifetime. It can be valid anywhere from a couple of minutes to several days.

At some point, it will “expire” (this expiration date is included in the data we get as part of the tokens). When this happens, we use the other token (the “refresh” token) to ask the server for a new access token. This is one of the ways OAuth tries to give as much control to the user as possible. When a user revokes access to an app, it not only invalidates the access token, but also the refresh token. This means that an app can’t just get new tokens and still maintain access, but instead loses access entirely.

The RefreshTokens state uses this “refresh token” to get new credentials. Let’s assume the OAuthStateMachine has an HTTPLoader we can use to request new credentials (for example, this could likely be the overall OAuth loader’s .nextLoader).

class RefreshTokens: OAuthState {
    let credentials: OAuthCredentials

    init(credentials: OAuthCredentials) {
        self.credentials = credentials
        super.init()
    }

    override func enter() {
        var request = HTTPRequest()
        // TODO: construct the request to point to our OAuth server
        request.body = FormBody([
            URLQueryItem(name: "client_id", value: "my_apps_client_id"),
            URLQueryItem(name: "client_secret", value: "my_apps_client_secret"),
            URLQueryItem(name: "grant_type", value: "refresh_token"),
            URLQueryItem(name: "refresh_token", value: credentials.refreshToken)
        ])

        machine.loader.load(request: request, completion: { result in
            self.processResult(result)
        })
    }

    private func processResult(_ result: HTTPResult) {
        let nextState: OAuthState
        switch result {
            case .failure(let error):
                // TODO: do we give up here? Or maybe we could ask the user to log in?
                nextState = Done(credentials: nil)

            case .success(let response):
                // this could be any response, including a "401 Unauthorized" response

                if let credentials = OAuthCredentials(response: response) {
                    // TODO: notify the delegate that we have new credentials to save
                    nextState = Done(credentials: credentials)
                } else {
                    // TODO: do we give up here? Or maybe we could ask the user to log in?
                    nextState = Done(credentials: nil)
                }
        }

        machine.move(to: nextState)
    }
}

Given our existing OAuthCredentials, we use its .refreshToken to ask the server for a new access token. If we get it, we can tell the delegate to save it and move to the “Done” state. If something goes wrong, then we can either give up (move to “Done” with no credentials) or perhaps head on over to the “LogIn” state and ask the user to sign in again. That particular choice is ours, and making that change is a matter of instantiating a LogIn instance instead of a Done instance.
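
The OAuthCredentials(response:) initializer used above isn’t shown in this post. As a rough sketch of what it might do with the response body, here is one way to decode a token payload. The JSON keys are the ones defined by the OAuth 2.0 specification (RFC 6749); the TokenResponse type and the two helper functions are hypothetical names:

```swift
import Foundation

// Hypothetical model of the JSON body a token endpoint returns.
// The keys follow RFC 6749; the type itself is an assumption.
struct TokenResponse: Decodable {
    let accessToken: String
    let refreshToken: String?   // a refresh response may omit this
    let expiresIn: TimeInterval?

    enum CodingKeys: String, CodingKey {
        case accessToken = "access_token"
        case refreshToken = "refresh_token"
        case expiresIn = "expires_in"
    }
}

// Returns nil for anything that isn't a usable token payload,
// such as the error body sent along with a 400 or 401 status.
func parseTokenResponse(_ data: Data) -> TokenResponse? {
    return try? JSONDecoder().decode(TokenResponse.self, from: data)
}

// Turns the relative "expires_in" into an absolute expiration date.
func expirationDate(for response: TokenResponse, receivedAt: Date) -> Date? {
    return response.expiresIn.map { receivedAt.addingTimeInterval($0) }
}

let good = Data(#"{"access_token":"abc","token_type":"Bearer","expires_in":3600,"refresh_token":"def"}"#.utf8)
let bad = Data(#"{"error":"invalid_grant"}"#.utf8)
let parsed = parseTokenResponse(good)
let rejected = parseTokenResponse(bad)
```

Because the server is allowed to leave out refresh_token when refreshing, a real implementation would probably fall back to the refresh token it already has when the new payload doesn’t include one.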

Logging In

If we failed to get a valid access token (whether we didn’t have one saved or couldn’t understand the response), we’ll probably need to ask the user to log in. This state is going to be just as simple as the GetSavedCredentials state:

class LogIn: OAuthState {
    let state = UUID()

    override func enter() {
        var loginURL = URLComponents()
        // construct a URL according to the specification for the server. This will likely be something like this:
        loginURL.scheme = "https"
        loginURL.host = "example.com"
        loginURL.path = "/oauth/login"
        loginURL.queryItems = [
            URLQueryItem(name: "client_id", value: "my_apps_client_id"),
            URLQueryItem(name: "response_type", value: "code"),
            URLQueryItem(name: "scope", value: "space separated list of permissions I want"),
            URLQueryItem(name: "state", value: state.uuidString)
        ]

        let url = loginURL.url! // this should always succeed
        // TODO: what if in some alternate reality this fails. Then what?

        DispatchQueue.main.async {
            let delegate = self.machine.delegate
            delegate.stateMachine(self.machine, displayLoginURL: url, completion: { callbackURL in
                self.processCallbackURL(callbackURL)
            })
        }
    }

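    private func processCallbackURL(_ url: URL?) {
        let nextState: OAuthState
        let components = url.flatMap { URLComponents(url: $0, resolvingAgainstBaseURL: false) }
        let queryItems = components?.queryItems ?? []
        let returnedState = queryItems.first(where: { $0.name == "state" })?.value
        let code = queryItems.first(where: { $0.name == "code" })?.value

        if url == nil {
            // the user cancelled the login process
            nextState = Done(credentials: nil)
        } else if returnedState != state.uuidString {
            // the "state" query item doesn't match self.state, so the app called the wrong callback
            nextState = Done(credentials: nil)
        } else if let code = code {
            // we got the code we need to ask for tokens
            nextState = GetTokens(code: code)
        } else {
            // the state matched but there's no code, so the login did not succeed
            nextState = Done(credentials: nil)
        }
        machine.move(to: nextState)
    }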
}

This is conceptually a very simple state. Like with GetSavedCredentials, this is a state where we “wait for the app to do something, and when it’s done it’ll tell us”.

That part where “the app does something” can be a couple of different things. All this state is doing is giving a URL to the app, which the app needs to use somehow to get the user to log in. That could be via a WKWebView (not recommended), popping out to Safari (potentially disruptive), or using an ASWebAuthenticationSession (probably the best experience).

Displaying a WKWebView is easy, but it exposes the user to risk, because it’s possible for the app to see what the user’s typing into such a web view. Therefore, you should not use these for sensitive scenarios, like logging in. If you choose to use one of these (which you shouldn’t), you would use the web view’s WKNavigationDelegate to see when the user is done and the server is attempting to redirect the flow back to the app. You’d intercept the URL of the redirection, and invoke the callback provided to the state machine delegate method with that URL. Of course, if the user decides to cancel, you’d invoke the callback with nil. But, don’t use this approach.

If you decide to pop the user out to Safari, you’d use UIApplication (or NSWorkspace) to open the URL. The app would also need to save the callback somewhere. In Safari, the server will issue the redirect to a URL that the app is registered to handle (via its Info.plist), at which point the app will activate again, your application(_:open:options:) delegate method will be invoked, and you pass the URL back to the callback. Of course with this approach, you have no good way of knowing if the user has cancelled.

The best approach is to use ASWebAuthenticationSession to show a secure browser in your app, but also provide a callback for when the session is finished. For example, if you have a UIWindowSceneDelegate in your app, you could use it as the presenting context:

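import AuthenticationServices

// assumes MySceneDelegate declares a stored property: var authSession: ASWebAuthenticationSession?
extension MySceneDelegate: ASWebAuthenticationPresentationContextProviding {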

    // this is the method that would get called by way of the state machine delegate method
    func displayLoginSession(_ url: URL, completion: @escaping (URL?) -> Void) {
        self.authSession = ASWebAuthenticationSession(url: url, callbackURLScheme: Bundle.main.bundleIdentifier!, completionHandler: { [weak self] url, error in
            self?.authSession = nil
            completion(url)
        })
        self.authSession?.prefersEphemeralWebBrowserSession = true
        self.authSession?.presentationContextProvider = self
        self.authSession?.start()
    }
    
    func presentationAnchor(for session: ASWebAuthenticationSession) -> ASPresentationAnchor {
        // which window are we presenting the session in? the scene's window!
        return window!
    }
}

Since we have decoupled “the UI needed to log in” from “the overall process of authentication”, we actually end up with a decent amount of flexibility around the entire experience and can allow app authors to choose the best experience for their app, without us (as library authors) having to make that decision for them.
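
The two delegate calls used by GetSavedCredentials and LogIn could be gathered into a protocol along these lines. This is only a sketch: the protocol name, the stand-in types and the canned delegate are all hypothetical, and only the method labels are taken from the call sites earlier in this post. A canned delegate like this one can also be handy when testing the states without any UI:

```swift
import Foundation

// Stand-ins so the sketch compiles on its own; in the library these
// would be OAuthStateMachine and OAuthCredentials.
final class MachineStandIn { }
struct CredentialsStandIn { let accessToken: String }

// Hypothetical delegate protocol matching the calls made by the states.
protocol StateMachineDelegateSketch: AnyObject {
    func stateMachine(_ machine: MachineStandIn,
                      wantsPersistedCredentials completion: @escaping (CredentialsStandIn?) -> Void)
    func stateMachine(_ machine: MachineStandIn,
                      displayLoginURL url: URL,
                      completion: @escaping (URL?) -> Void)
}

// A delegate that answers immediately with preset values.
final class CannedDelegate: StateMachineDelegateSketch {
    var savedCredentials: CredentialsStandIn?
    var callbackURL: URL?

    func stateMachine(_ machine: MachineStandIn,
                      wantsPersistedCredentials completion: @escaping (CredentialsStandIn?) -> Void) {
        completion(savedCredentials)
    }

    func stateMachine(_ machine: MachineStandIn,
                      displayLoginURL url: URL,
                      completion: @escaping (URL?) -> Void) {
        // A real delegate would show UI here; we just pretend the user finished.
        completion(callbackURL)
    }
}

let cannedDelegate = CannedDelegate()
cannedDelegate.callbackURL = URL(string: "myapp://callback?code=abc123&state=xyz")

var receivedCredentials: CredentialsStandIn? = CredentialsStandIn(accessToken: "placeholder")
var receivedCallback: URL?
cannedDelegate.stateMachine(MachineStandIn(), wantsPersistedCredentials: { receivedCredentials = $0 })
cannedDelegate.stateMachine(MachineStandIn(),
                            displayLoginURL: URL(string: "https://example.com/oauth/login")!,
                            completion: { receivedCallback = $0 })
```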

Getting the Tokens

I won’t list out the full state here, but this is conceptually identical to the RefreshTokens state. In this state, we take the code we received as part of the Log In callback URL and send it up to the server with other needed bits like our client ID and client secret. Assuming it all checks out, we’ll get back a nice new refresh token and valid access token, which we can ask the app to save for us, and then move to our final “Done” state.
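
As a rough illustration of the request this state would send, here is how the form body might be assembled. The grant_type value of authorization_code comes from the OAuth 2.0 specification; the tokenExchangeBody function is a hypothetical helper, and URLComponents stands in for the FormBody type from earlier posts so that the sketch runs on its own:

```swift
import Foundation

// Builds the form-encoded body for exchanging a login code for tokens.
func tokenExchangeBody(code: String, clientID: String, clientSecret: String) -> String {
    var components = URLComponents()
    components.queryItems = [
        URLQueryItem(name: "client_id", value: clientID),
        URLQueryItem(name: "client_secret", value: clientSecret),
        URLQueryItem(name: "grant_type", value: "authorization_code"),
        URLQueryItem(name: "code", value: code)
    ]
    // Note: URLComponents leaves "+" unescaped, so values that contain "+"
    // would need extra handling in a real form body.
    return components.percentEncodedQuery ?? ""
}

let body = tokenExchangeBody(code: "abc123",
                             clientID: "my_apps_client_id",
                             clientSecret: "my_apps_client_secret")
print(body)
```

Some servers also expect the redirect URL to be repeated in this request, so check the server’s documentation for the exact set of parameters.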

Done

The Done state, as we’ve seen thus far, can be invoked with either a valid set of credentials or nil (meaning something went wrong). When we enter this state, it’ll need to signal somehow to the machine that the overall process has completed (another new method on OAuthStateMachine, likely) and the machine can take whatever value the Done state got and return it out to the part of the library that invoked the state machine.
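
One way that hand-off might look is sketched below. Everything here is an assumption made for illustration: the Sketch-prefixed types, the finish(with:) method and the completion handler passed to run are not part of the code shown so far:

```swift
struct SketchCredentials { let accessToken: String }

class SketchState {
    weak var machine: SketchMachine?
    func enter() { }
}

class SketchMachine {
    private var currentState: SketchState?
    private var completion: ((SketchCredentials?) -> Void)?

    // Starts the machine and remembers who to tell when it finishes.
    func run(startingAt initialState: SketchState,
             completion: @escaping (SketchCredentials?) -> Void) {
        self.completion = completion
        move(to: initialState)
    }

    func move(to newState: SketchState) {
        currentState?.machine = nil
        newState.machine = self
        currentState = newState
        newState.enter()
    }

    // Called by the Done state; reports the result exactly once.
    func finish(with credentials: SketchCredentials?) {
        let handler = completion
        completion = nil
        currentState?.machine = nil
        currentState = nil
        handler?(credentials)
    }
}

final class SketchDone: SketchState {
    let credentials: SketchCredentials?

    init(credentials: SketchCredentials?) {
        self.credentials = credentials
        super.init()
    }

    override func enter() {
        machine?.finish(with: credentials)
    }
}

var received: SketchCredentials?
var completionCalls = 0
let sketchMachine = SketchMachine()
sketchMachine.run(startingAt: SketchDone(credentials: SketchCredentials(accessToken: "abc"))) { credentials in
    received = credentials
    completionCalls += 1
}
```

Clearing the completion handler before calling it means a stray second call to finish(with:) would not report a result twice.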

Two Big Caveats

There are two glaring omissions in this state machine.

The first thing I’ve left out, like I’ve left out in all implementations, is the notion of thread safety. I leave this out because it’s a decent amount of boilerplate code, and I’m aiming for readability in these posts, and not “exact correctness”.

The other thing is prompted by the question: What happens if we’re in the middle of this state machine and we’re asked to reset() the entire loading chain? In order to accommodate this, each state will likely need to also have a reset() method it can use to perform any cleanup (such as cancelling network requests) and then immediately move to a new LogOut state. The LogOut state would be responsible for telling the delegate to save a new set of credentials (nil, meaning “delete what you have”), potentially displaying a log out web page, expiring credentials with the server, and so on. I’ve left it out for brevity, but if you end up implementing OAuth, you’d be wise to consider this scenario.
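
A very rough sketch of that idea follows. Here the machine asks the current state to clean up and then moves to the log out state itself, which is one of several ways to arrange it. The type names, the cancelPendingWork() hook and the cancelled flag are all invented for this example:

```swift
class ResettableState {
    weak var machine: ResettableMachine?
    func enter() { }
    // States override this to cancel network requests and similar work.
    func cancelPendingWork() { }
}

class ResettableMachine {
    private(set) var currentState: ResettableState?

    func move(to newState: ResettableState) {
        currentState?.machine = nil
        newState.machine = self
        currentState = newState
        newState.enter()
    }

    // Let the current state clean up, then head straight to logging out.
    func reset() {
        currentState?.cancelPendingWork()
        move(to: LogOutSketch())
    }
}

// Pretends to be a state that is waiting on the network.
final class WaitingOnNetwork: ResettableState {
    private(set) var cancelled = false
    override func cancelPendingWork() { cancelled = true }
}

final class LogOutSketch: ResettableState {
    override func enter() {
        // A real implementation would tell the delegate to save "nil"
        // credentials here, and possibly notify the server.
    }
}

let resettableMachine = ResettableMachine()
let waiting = WaitingOnNetwork()
resettableMachine.move(to: waiting)
resettableMachine.reset()
```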

Wrapping Up

This is the overall setup we’re going to need to implement an OAuth loader for our library. In the next post, we’ll be using this OAuthStateMachine to automatically get and refresh tokens to use for authenticated requests.

