HTTP in Swift, Part 14: OAuth Setup
Part 14 in a series on building a Swift HTTP framework:
- HTTP in Swift, Part 1: An Intro to HTTP
- HTTP in Swift, Part 2: Basic Structures
- HTTP in Swift, Part 3: Request Bodies
- HTTP in Swift, Part 4: Loading Requests
- HTTP in Swift, Part 5: Testing and Mocking
- HTTP in Swift, Part 6: Chaining Loaders
- HTTP in Swift, Part 7: Dynamically Modifying Requests
- HTTP in Swift, Part 8: Request Options
- HTTP in Swift, Part 9: Resetting
- HTTP in Swift, Part 10: Cancellation
- HTTP in Swift, Part 11: Throttling
- HTTP in Swift, Part 12: Retrying
- HTTP in Swift, Part 13: Basic Authentication
- HTTP in Swift, Part 14: OAuth Setup
- HTTP in Swift, Part 15: OAuth
- HTTP in Swift, Part 16: Composite Loaders
- HTTP in Swift, Part 17: Brain Dump
- HTTP in Swift, Part 18: Wrapping Up
While Basic Access authentication works for “basic” cases, it is far more common these days to see some form of OAuth used instead. There are some interesting advantages that OAuth has over Basic authentication, such as:
- the app never has access to the user’s username and password
- the user can change their username or password without affecting the app’s access
- the user can remotely revoke access for an app
These advantages come at the cost of added complexity, and it is that complexity that we’ll be exploring in this post.
The OAuth Flow
A basic (non-erroring) OAuth flow looks something like this:
When we decide to start authentication, we first check to see if we have any saved tokens. There are three possible outcomes here:
- We have credentials, and they have not yet expired
- We have credentials, but they have expired
- We do not have credentials
The first case is easy. If we have un-expired credentials, then we don’t need to do anything. The second case is also pretty easy. If we have expired credentials, we can send them off to the authentication server to get “fresh” versions of the tokens (assuming the user has not revoked access), and then we’re done.
If we do not have any credentials, then we need to ask the user to log in. This is done by constructing a URL to a login webpage, and then displaying that page to the user. The user logs in to the webpage, and the server showing that page validates the correctness of the credentials. Assuming the user’s name and password are correct, the webpage redirects to a new URL, which the browser intercepts and redirects to the app.
This redirection URL has a special code, generated by the server, that the app can use to get access credentials. So with the code in-hand, the app now turns around and asks the server: “given this code, I need authorization tokens”. The server responds with the tokens, and the process completes.
The OAuth flow is very well-defined, and is a great example of a state machine. There are a few specific things to do, and the order in which they’re done is strictly defined, and only that order is allowed.
If we were to implement this flow in Swift, a common approach might be to have some sort of State enum like we saw in the Basic Authentication post. However, given the number of states and the very explicit flow allowed, I think a more formal approach is warranted.
The OAuth State Machine
First, we’ll define a “State Machine” class that represents the overall process:
class OAuthStateMachine {
func run() { }
}
Next, we’ll define an OAuthState class that represents “a circle” in the diagram above, and give the state machine a current state. Each state will have an enter() method, which is called when we “enter” that state and should start executing the state’s logic:
class OAuthState {
func enter() { }
}
class OAuthStateMachine {
private var currentState: OAuthState!
}
A state will need a way to tell the machine when it’s ready to move on, so it’ll need a reference to the machine, and a way to move states:
class OAuthState {
unowned var machine: OAuthStateMachine!
func enter() { }
}
class OAuthStateMachine {
private var currentState: OAuthState!
func move(to newState: OAuthState) {
currentState?.machine = nil
newState.machine = self
currentState = newState
currentState.enter()
}
}
Now we can define the states that correspond to our diagram:
class GetSavedCredentials: OAuthState { }
class LogIn: OAuthState { }
class GetTokens: OAuthState { }
class RefreshTokens: OAuthState { }
class Done: OAuthState { }
For each one of these, we’ll need to implement its enter() method. Let’s look at each one.
Retrieving Credentials
The GetSavedCredentials state is where we turn around and ask the app if it has any credentials saved for us in the Keychain (or other secure storage location). In the Basic Access post, we did this via a delegate. We’ll assume a similar approach here.
class GetSavedCredentials: OAuthState {
override func enter() {
// we need to ask someone if there are any saved credentials
// let's assume the state machine itself has a delegate we can ask
let delegate = machine.delegate
DispatchQueue.main.async {
// it's always polite to invoke delegate methods on the main thread
delegate.stateMachine(self.machine, wantsPersistedCredentials: { credentials in
// this closure will be called with either the credentials that were saved, or "nil"
self.processCredentials(credentials)
})
}
}
private func processCredentials(_ credentials: OAuthCredentials?) {
let nextState: OAuthState
if let credentials = credentials, credentials.expired == false {
// we got credentials and they're not expired
nextState = Done(credentials: credentials)
} else if let credentials = credentials {
// we got credentials but they are expired
nextState = RefreshTokens(credentials: credentials)
} else {
// we did not get credentials
nextState = LogIn()
}
machine.move(to: nextState)
}
}
That’s really all there is to it. When we “enter” the state, we ask the delegate if it has any saved credentials. At some point it’ll get back to us, whereupon we examine the result and decide where to go next.
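The three-way decision in processCredentials can be pulled out into a small pure function, which makes it easy to test without a delegate or a state machine. The sketch below is only an illustration: the fields on OAuthCredentials (accessToken, refreshToken, expiresAt), the isExpired(asOf:) helper, and the SavedCredentialsOutcome enum are all assumptions, since the post does not spell out what the credentials type contains.

```swift
import Foundation

// Hypothetical shape for the credentials; the field names are assumptions.
struct OAuthCredentials {
    let accessToken: String
    let refreshToken: String
    let expiresAt: Date

    // `now` is injectable so the expiration check can be tested deterministically.
    func isExpired(asOf now: Date = Date()) -> Bool {
        return now >= expiresAt
    }

    // Matches the `credentials.expired` property used by the state above.
    var expired: Bool { return isExpired() }
}

// The three possible outcomes of asking for saved credentials (hypothetical type).
enum SavedCredentialsOutcome: Equatable {
    case useAsIs       // would lead to the Done state
    case needsRefresh  // would lead to the RefreshTokens state
    case needsLogin    // would lead to the LogIn state
}

func outcome(for credentials: OAuthCredentials?, now: Date = Date()) -> SavedCredentialsOutcome {
    // no saved credentials at all: the user has to log in
    guard let credentials = credentials else { return .needsLogin }
    // saved credentials exist: refresh them only if they have expired
    return credentials.isExpired(asOf: now) ? .needsRefresh : .useAsIs
}
```

Each outcome maps directly onto one of the three branches in processCredentials, so the state itself only has to turn the outcome into the next OAuthState.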
Refreshing Tokens
Tokens that we get from the authentication server come with two parts: the “refresh” token and the “access” token. The access token is what we use to authenticate each request and tends to have a short lifetime. It can be valid anywhere from a couple of minutes to several days.
At some point, it will “expire” (the expiration date is included in the data we get as part of the tokens). When this happens, we use the other token (the “refresh” token) to ask the server for a new access token. This is one of the ways OAuth tries to give as much control to the user as possible. When a user revokes an app’s access, that invalidates not only the access token but also the refresh token. This means the app can’t just get new tokens and maintain access; it loses access entirely.
The RefreshTokens state uses this “refresh token” to get new credentials. Let’s assume the OAuthStateMachine has an HTTPLoader we can use to request new credentials (for example, this could likely be the overall OAuth loader’s .nextLoader).
class RefreshTokens: OAuthState {
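let credentials: OAuthCredentials
// a stored "let" property needs an initializer, or the class will not compile
init(credentials: OAuthCredentials) { self.credentials = credentials }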
override func enter() {
var request = HTTPRequest()
// TODO: construct the request to point to our OAuth server
request.body = FormBody([
URLQueryItem(name: "client_id", value: "my_apps_client_id"),
URLQueryItem(name: "client_secret", value: "my_apps_client_secret"),
URLQueryItem(name: "grant_type", value: "refresh_token"),
URLQueryItem(name: "refresh_token", value: credentials.refreshToken)
])
machine.loader.load(request: request, completion: { result in
self.processResult(result)
})
}
private func processResult(_ result: HTTPResult) {
let nextState: OAuthState
switch result {
case .failure(let error):
// TODO: do we give up here? Or maybe we could ask the user to log in?
nextState = Done(credentials: nil)
case .success(let response):
// this could be any response, including a "401 Unauthorized" response
if let credentials = OAuthCredentials(response: response) {
// TODO: notify the delegate that we have new credentials to save
nextState = Done(credentials: credentials)
} else {
// TODO: do we give up here? Or maybe we could ask the user to log in?
nextState = Done(credentials: nil)
}
}
machine.move(to: nextState)
}
}
Given our existing OAuthCredentials, we use the .refreshToken inside of it to ask the server for a new access token. If we get it, we can tell the delegate to save it and move to the “Done” state. If something goes wrong, then we can either give up (move to “Done” with no credentials) or perhaps head on over to the “LogIn” state and ask the user to sign in again. That particular choice is ours, and making that change is a matter of instantiating a LogIn instance instead of a Done instance.
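The OAuthCredentials(response:) initializer is not shown in this post, so here is one way the parsing step might look. This is a sketch under assumptions: it takes a status code and raw body instead of the series’ HTTPResponse type, the TokenResponse and ParsedCredentials types and the parseCredentials function are hypothetical names, and it assumes the server answers with the JSON token response described in RFC 6749 (access_token, plus optional refresh_token and expires_in). Your server’s response may differ.

```swift
import Foundation

// Field names follow the successful token response in RFC 6749, section 5.1.
struct TokenResponse: Decodable {
    let accessToken: String
    let refreshToken: String?     // a refresh response may omit this
    let expiresIn: TimeInterval?  // lifetime in seconds; recommended but optional

    enum CodingKeys: String, CodingKey {
        case accessToken = "access_token"
        case refreshToken = "refresh_token"
        case expiresIn = "expires_in"
    }
}

// Hypothetical credentials type, standing in for the post's OAuthCredentials.
struct ParsedCredentials {
    let accessToken: String
    let refreshToken: String
    let expiresAt: Date?
}

func parseCredentials(status: Int,
                      body: Data,
                      previousRefreshToken: String,
                      now: Date = Date()) -> ParsedCredentials? {
    // any non-2xx response (such as "401 Unauthorized") means we got no tokens
    guard (200..<300).contains(status) else { return nil }
    // a body we cannot decode is treated the same way
    guard let decoded = try? JSONDecoder().decode(TokenResponse.self, from: body) else { return nil }

    return ParsedCredentials(
        accessToken: decoded.accessToken,
        // if the server did not issue a new refresh token, keep using the old one
        refreshToken: decoded.refreshToken ?? previousRefreshToken,
        // convert the relative lifetime into an absolute expiration date
        expiresAt: decoded.expiresIn.map { now.addingTimeInterval($0) }
    )
}
```

Returning nil for both failure modes mirrors the two “give up or ask the user to log in” branches in processResult; a fuller version would probably distinguish them.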
Logging In
If we failed to get a valid access token (whether we didn’t have one saved or couldn’t understand the response), we’ll probably need to ask the user to log in. This state is going to be just as simple as the GetSavedCredentials state:
class LogIn: OAuthState {
let state = UUID()
override func enter() {
var loginURL = URLComponents()
// construct a URL according to the specification for the server. This will likely be something like this:
loginURL.scheme = "https"
loginURL.host = "example.com"
loginURL.path = "/oauth/login"
loginURL.queryItems = [
URLQueryItem(name: "client_id", value: "my_apps_client_id"),
URLQueryItem(name: "response_type", value: "code"),
URLQueryItem(name: "scope", value: "space separated list of permissions I want"),
URLQueryItem(name: "state", value: state.uuidString)
]
let url = loginURL.url! // this should always succeed
// TODO: what if in some alternate reality this fails. Then what?
DispatchQueue.main.async {
let delegate = self.machine.delegate
delegate.stateMachine(self.machine, displayLoginURL: url, completion: { callbackURL in
self.processCallbackURL(callbackURL)
})
}
}
private func processCallbackURL(_ url: URL?) {
let nextState: OAuthState
// TODO: if url is nil, then the user cancelled the login process → Done(credentials: nil)
// TODO: if we got a url but its "state" query item doesn't match self.state, the app called the wrong callback → Done(credentials: nil)
// TODO: if we see a "code" query item in the URL → GetTokens(code: code)
machine.move(to: nextState)
}
}
This is conceptually a very simple state. Like with GetSavedCredentials, this is a state where we “wait for the app to do something, and when it’s done it’ll tell us”.
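The three TODOs in processCallbackURL boil down to inspecting the query items of the callback URL. A minimal sketch of that inspection, assuming the server returns the code and state as query items, might look like the following. The LoginCallbackResult enum, the interpretCallback function, and the myapp:// URL used in the usage note are all hypothetical.

```swift
import Foundation

// The possible readings of a callback URL (hypothetical type).
enum LoginCallbackResult: Equatable {
    case cancelled      // no URL at all: the user backed out
    case stateMismatch  // the "state" item does not match what we sent
    case missingCode    // the state matched, but there is no usable "code"
    case code(String)   // success: this code can be exchanged for tokens
}

func interpretCallback(_ url: URL?, expectedState: String) -> LoginCallbackResult {
    guard let url = url else { return .cancelled }

    let items = URLComponents(url: url, resolvingAgainstBaseURL: false)?.queryItems ?? []
    // look up the first query item with the given name
    func value(_ name: String) -> String? {
        return items.first(where: { $0.name == name })?.value
    }

    // check "state" first, so that a code from an unrelated callback is never used
    guard value("state") == expectedState else { return .stateMismatch }
    guard let code = value("code"), code.isEmpty == false else { return .missingCode }
    return .code(code)
}
```

In the LogIn state, the expected value would be state.uuidString, so interpretCallback(URL(string: "myapp://callback?code=abc123&state=XYZ"), expectedState: "XYZ") produces .code("abc123"), which would lead to GetTokens(code:), while every other case would lead to Done(credentials: nil).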
That part where “the app does something” can be a couple of different things. All this state is doing is giving a URL to the app, which the app needs to use somehow to get the user to log in. That could be via a WKWebView (not recommended), popping out to Safari (potentially disruptive), or using an ASWebAuthenticationSession (probably the best experience).
Displaying a WKWebView is easy, but it exposes the user to risk, because it’s possible for the app to see what the user is typing into such a web view. Therefore, you should not use one for sensitive scenarios, like logging in. If you choose to use one (which you shouldn’t), you would use the web view’s WKNavigationDelegate to see when the user is done and the server is attempting to redirect the flow back to the app. You’d intercept the URL of the redirection, and invoke the callback provided to the state machine delegate method with that URL. Of course, if the user decides to cancel, you’d invoke the callback with nil. But don’t use this approach.
If you decide to pop the user out to Safari, you’d use UIApplication (or NSWorkspace) to open the URL. The app would also need to save the callback somewhere. In Safari, the server will issue the redirect to a URL that the app is registered to handle (via its Info.plist), at which point the app will activate again, your application(_:open:options:) delegate method will be invoked, and you pass the URL back to the callback. Of course, with this approach you have no good way of knowing if the user has cancelled.
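For the Safari approach, “saving the callback somewhere” could be as small as an object that holds on to the completion handler until the app is re-opened. The PendingLoginCallback class below is a hypothetical helper, not something from the series, and it deliberately ignores thread safety.

```swift
import Foundation

// Holds the completion handler while the user is away in the browser (hypothetical helper).
final class PendingLoginCallback {
    private var completion: ((URL?) -> Void)?

    // Called when the login URL is handed off to the browser.
    func store(_ completion: @escaping (URL?) -> Void) {
        // a newer login attempt supersedes an older one, which is reported as cancelled
        let previous = self.completion
        self.completion = completion
        previous?(nil)
    }

    // Called when the app is re-opened with a URL.
    // Returns false if nobody was waiting, so the app can route the URL elsewhere.
    @discardableResult
    func resume(with url: URL?) -> Bool {
        guard let completion = completion else { return false }
        // clear the handler first so that it can only ever be invoked once
        self.completion = nil
        completion(url)
        return true
    }
}
```

The app delegate would call store(_:) from the state machine delegate method, and resume(with:) from application(_:open:options:). Since there is no cancellation signal from Safari, a real implementation might also call resume(with: nil) after a timeout or when the user starts a different flow.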
The best approach is to use ASWebAuthenticationSession to show a secure browser in your app, but also provide a callback for when the session is finished. For example, if you have a UIWindowSceneDelegate in your app, you could use it as the presenting context:
extension MySceneDelegate: ASWebAuthenticationPresentationContextProviding {
// this is the method that would get called by way of the state machine delegate method
func displayLoginSession(_ url: URL, completion: @escaping (URL?) -> Void) {
self.authSession = ASWebAuthenticationSession(url: url, callbackURLScheme: Bundle.main.bundleIdentifier!, completionHandler: { [weak self] url, error in
self?.authSession = nil
completion(url)
})
self.authSession?.prefersEphemeralWebBrowserSession = true
self.authSession?.presentationContextProvider = self
self.authSession?.start()
}
func presentationAnchor(for session: ASWebAuthenticationSession) -> ASPresentationAnchor {
// which window are we presenting the session in? the scene's window!
return window!
}
}
Since we have decoupled “the UI needed to log in” from “the overall process of authentication”, we actually end up with a decent amount of flexibility around the entire experience and can allow app authors to choose the best experience for their app, without us (as library authors) having to make that decision for them.
Getting the Tokens
I won’t list out the full state here, but this is conceptually identical to the RefreshTokens state. In this state, we take the code we received as part of the Log In callback URL and send it up to the server with other needed bits like our client ID and client secret. Assuming it all checks out, we’ll get back a nice new refresh token and valid access token, which we can ask the app to save for us, and then move to our final “Done” state.
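To make the similarity to RefreshTokens concrete, here is a sketch of the form items such a request might carry. The parameter names (grant_type, code, redirect_uri, client_id, client_secret) come from the authorization code grant in RFC 6749; the tokenExchangeItems function itself is hypothetical, and your server may require different or additional values.

```swift
import Foundation

// Builds the form items for exchanging an authorization code for tokens (hypothetical helper).
func tokenExchangeItems(code: String,
                        clientID: String,
                        clientSecret: String,
                        redirectURI: String? = nil) -> [URLQueryItem] {
    var items = [
        URLQueryItem(name: "grant_type", value: "authorization_code"),
        URLQueryItem(name: "code", value: code)
    ]
    // RFC 6749 requires redirect_uri here only if it was part of the login URL
    if let redirectURI = redirectURI {
        items.append(URLQueryItem(name: "redirect_uri", value: redirectURI))
    }
    items.append(URLQueryItem(name: "client_id", value: clientID))
    items.append(URLQueryItem(name: "client_secret", value: clientSecret))
    return items
}
```

Inside a GetTokens state, these items would be wrapped in a FormBody and loaded through machine.loader, exactly as in RefreshTokens, with the response handled the same way.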
Done
The Done state, as we’ve seen thus far, can be invoked with either a valid set of credentials or nil (meaning something went wrong). When we enter this state, it’ll need to signal somehow to the machine that the overall process has completed (another new method on OAuthStateMachine, likely) and the machine can take whatever value the Done state got and return it out to the part of the library that invoked the state machine.
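Here is one way that “signal the machine” step could be wired up, as a condensed and self-contained version of the classes from earlier in this post. Several things are assumptions made for the sketch: the finish(with:) method, the run(startingAt:completion:) signature, the single-field OAuthCredentials, and the use of a weak optional reference to the machine in place of the unowned one.

```swift
import Foundation

// Minimal stand-in for the real credentials type.
struct OAuthCredentials: Equatable {
    let accessToken: String
}

class OAuthState {
    // weak rather than unowned, so a state that outlives its machine does not crash
    weak var machine: OAuthStateMachine?
    func enter() { }
}

final class OAuthStateMachine {
    private var currentState: OAuthState?
    private var completion: ((OAuthCredentials?) -> Void)?

    // Starts the machine at the given state; the handler receives the final result.
    func run(startingAt initial: OAuthState, completion: @escaping (OAuthCredentials?) -> Void) {
        self.completion = completion
        move(to: initial)
    }

    func move(to newState: OAuthState) {
        currentState?.machine = nil
        newState.machine = self
        currentState = newState
        newState.enter()
    }

    // Hypothetical "we are finished" hook, called by the Done state.
    func finish(with credentials: OAuthCredentials?) {
        // clear everything first so the handler is invoked at most once per run
        let handler = completion
        completion = nil
        currentState?.machine = nil
        currentState = nil
        handler?(credentials)
    }
}

final class Done: OAuthState {
    let credentials: OAuthCredentials?
    init(credentials: OAuthCredentials?) { self.credentials = credentials }

    override func enter() {
        // nothing left to do except hand the result back to the machine
        machine?.finish(with: credentials)
    }
}
```

With this in place, any state can end the process by moving to Done(credentials:), and the caller of run only ever sees a single optional OAuthCredentials value.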
Two Big Caveats
There are two glaring omissions in this state machine.
The first thing I’ve left out, like I’ve left out in all implementations, is the notion of thread safety. I leave this out because it’s a decent amount of boilerplate code, and I’m aiming for readability in these posts, and not “exact correctness”.
The other thing is prompted by the question: What happens if we’re in the middle of this state machine and we’re asked to reset() the entire loading chain? To accommodate this, each state will likely also need a reset() method that performs any cleanup (such as cancelling network requests) and then immediately moves to a new LogOut state. The LogOut state would be responsible for telling the delegate to save a new set of credentials (nil, meaning “delete what you have”), potentially displaying a log out web page, expiring credentials with the server, and so on. I’ve left it out for brevity, but if you end up implementing OAuth, you’d be wise to consider this scenario.
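As a rough illustration of the reset idea, the sketch below gives every state a cleanUp() hook and a shared reset() that always ends in a log out state. All of the names here (ResettableState, ResettableMachine, WaitingState, LogOutState, saveCredentials) are hypothetical, and the closure property stands in for the delegate call that would ask the app to persist credentials.

```swift
import Foundation

class ResettableState {
    weak var machine: ResettableMachine?
    func enter() { }

    // Subclasses override this to cancel whatever they have in flight.
    func cleanUp() { }

    // Shared by every state: clean up, then head straight to the log out state.
    final func reset() {
        cleanUp()
        machine?.move(to: LogOutState())
    }
}

final class ResettableMachine {
    private(set) var currentState: ResettableState?
    // stand-in for "tell the delegate to save these credentials"; nil means delete
    var saveCredentials: (String?) -> Void = { _ in }

    func move(to newState: ResettableState) {
        currentState?.machine = nil
        newState.machine = self
        currentState = newState
        newState.enter()
    }

    // Forwards the reset to whichever state is currently active.
    func reset() {
        currentState?.reset()
    }
}

// Stand-in for a state that is waiting on a network request or on the user.
final class WaitingState: ResettableState {
    private(set) var cancelled = false
    override func cleanUp() { cancelled = true }
}

final class LogOutState: ResettableState {
    override func enter() {
        // ask the app to delete whatever credentials it has saved
        machine?.saveCredentials(nil)
    }
}
```

A real LogOut state would likely do more (expire the tokens with the server, perhaps show a log out page) and then move on to a Done-like state so that the machine can report that the reset has finished.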
Wrapping Up
This is the overall setup we’re going to need to implement an OAuth loader for our library. In the next post, we’ll be using this OAuthStateMachine to automatically get and refresh tokens to use for authenticated requests.