Mobile App OAuth and OpenID Connect

Social sign-in is easy and secure to use, but hard to implement correctly from scratch.
I did not find a write-up that covered exactly what I needed, so I am writing this down for my future self.

I am building an online forum application with the following requirements:

A mobile and a web frontend.
A shared backend exposing a RESTful API (GraphQL upcoming).
Support for third-party login (Google/Apple).

Application Architecture

We organize backend services by their responsibilities. Services communicate with each other via RPC (either HTTP, Thrift, or FlatBuffers).

For every request coming from the front end, the load balancer forwards it to a designated host machine. When the load grows too high, sticky sessions need to be enabled at the load balancer level.

The Auth Service is responsible for authentication (authN) and authorization (authZ). It has a database that stores all user-related data. Since this data needs reliability, a SQL database (Postgres) is used.

The Post Service is responsible for post-related CRUD operations. It can use NoSQL or SQL, whichever is more convenient.

The Logging Service is responsible for monitoring the whole application. Since this is a write-heavy service, we may end up using NoSQL.

The Session Service is responsible for managing all user sessions (create, read, update, delete). It is a shared service, and because sessions are accessed so often, an in-memory database (like Redis) is used.
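To make the four session operations concrete, here is a minimal sketch of what the Session Service could expose. It is an assumption on my part, not the real service: a plain Python dict stands in for Redis, and the class name SessionStore, its method names, and the ttl_seconds parameter are all hypothetical.

```python
import secrets
import time
from typing import Dict, Optional


class SessionStore:
    """In-memory stand-in for a Redis-backed session store (hypothetical API)."""

    def __init__(self, ttl_seconds: int = 3600) -> None:
        self._ttl = ttl_seconds
        self._sessions: Dict[str, dict] = {}

    def create(self, user_id: str) -> str:
        # An unguessable random identifier; the client only ever sees this value.
        session_id = secrets.token_urlsafe(32)
        self._sessions[session_id] = {
            "user_id": user_id,
            "expires_at": time.time() + self._ttl,
        }
        return session_id

    def read(self, session_id: str) -> Optional[dict]:
        session = self._sessions.get(session_id)
        if session is None:
            return None
        if session["expires_at"] <= time.time():
            # Expired sessions are dropped lazily on access.
            del self._sessions[session_id]
            return None
        return session

    def update(self, session_id: str, **fields) -> bool:
        session = self.read(session_id)
        if session is None:
            return False
        session.update(fields)
        return True

    def delete(self, session_id: str) -> bool:
        # Logging out is just removing the session object.
        return self._sessions.pop(session_id, None) is not None
```

With Redis the expiry would more likely be delegated to the key TTL rather than checked by hand; the manual check here only keeps the sketch self-contained.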

I plan to add a Newsfeed Service responsible for ranking user posts.

To offer a stateful experience (e.g. serving personalized content) on top of a stateless protocol (HTTP), our backend needs to persist user state. This state is called a session and is persisted across requests in either the backend’s memory or a database.

In this context, when we say a user is logged in, we really mean there is a session object for that user in our system. When the user logs out, the session object is removed. Session management is essentially user state management.

Stateful or stateless sessions

It is worth noting that using a JWT does not rule out a stateful session; the two can co-exist. A stateful session with a JWT simply means the JWT carries a session_id field instead of holding all of the session data itself. See more here: Stop using JWT for sessions management
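As a rough illustration of a JWT that only carries a session_id, the sketch below signs and verifies an HS256 token using nothing but the standard library. The function names and the shared secret are assumptions made for this example; a real service would more likely use a maintained JWT library and also set an expiry claim.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use base64url without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(session_id: str, secret: bytes) -> str:
    """Build an HS256 JWT whose only claim is the session_id."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = _b64url(json.dumps({"session_id": session_id}).encode("utf-8"))
    signing_input = f"{header}.{payload}".encode("ascii")
    signature = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    return f"{header}.{payload}.{signature}"


def verify_token(token: str, secret: bytes) -> Optional[str]:
    """Return the session_id if the signature is valid, otherwise None."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header, payload, signature = parts
    # The algorithm is pinned to HMAC-SHA256 here; the header's alg is never trusted.
    signing_input = f"{header}.{payload}".encode("utf-8")
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None
    return json.loads(_b64url_decode(payload)).get("session_id")
```

The returned session_id still has to be looked up in the session store on every request, which is exactly what keeps the session stateful: deleting the server-side session logs the user out even though the token itself remains validly signed.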

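If the product needs to support login from multiple devices, a purely stateless token-based solution handles this poorly: without server-side session state there is no straightforward way to list a user’s active devices or to revoke one of them before its token expires.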

Scaling stateful sessions

How to scale stateful sessions depends on the total number of users (from joepie91):

Once you run multiple backend processes on a server: a Redis daemon (on that server) for session storage.
Once you run on multiple servers: a dedicated server running Redis just for session storage.
Once you run on multiple servers, in multiple clusters: sticky sessions.

For security, keep the session_id in an HttpOnly cookie so that malicious client-side JavaScript (XSS) cannot read it. Generate the session_id as a UUID, and store it on the client in either the KeyStore (mobile) or cookies (web).
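As a small sketch of the cookie side, the snippet below generates a random UUID as the session_id and builds the value of a Set-Cookie header with the HttpOnly flag. The helper names, the cookie name session_id, and the extra Secure, SameSite, and Max-Age attributes are my assumptions rather than requirements from the design above.

```python
import uuid
from http.cookies import SimpleCookie


def new_session_id() -> str:
    # uuid4 is generated from random bytes, so the id is not derived from user data.
    return str(uuid.uuid4())


def session_cookie_header(session_id: str, max_age: int = 3600) -> str:
    """Return the value for a Set-Cookie response header."""
    cookie = SimpleCookie()
    cookie["session_id"] = session_id
    cookie["session_id"]["httponly"] = True   # not readable from client-side JavaScript
    cookie["session_id"]["secure"] = True     # only sent over HTTPS
    cookie["session_id"]["samesite"] = "Lax"  # limits cross-site sending
    cookie["session_id"]["path"] = "/"
    cookie["session_id"]["max-age"] = max_age
    return cookie["session_id"].OutputString()
```

On mobile there is no cookie jar by default, so the same session_id would instead be kept in the platform’s secure storage and attached to each request by the app.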

Supporting sign-in with Google means creating a session and assigning a session_id for a user who owns a Google account.

To achieve this, I rely on the OpenID Connect (OIDC) protocol. The high-level idea is to have Google act as a third party trusted by both the user and my application.

OIDC protocol

Overall the OpenID Connect protocol, in abstract, follows the following steps.

1. The RP (Client) sends a request to the OpenID Provider (OP).
2. The OP authenticates the End-User and obtains authorization.
3. The OP responds with an ID Token and usually an Access Token.
4. The RP can send a request with the Access Token to the UserInfo Endpoint.
5. The UserInfo Endpoint returns Claims about the End-User.

These steps are illustrated in the following diagram:

+--------+                                   +--------+
|        |                                   |        |
|        |---------(1) AuthN Request-------->|        |
|        |                                   |        |
|        |  +--------+                       |        |
|        |  |        |                       |        |
|        |  |  End-  |<--(2) AuthN & AuthZ-->|        |
|        |  |  User  |                       |        |
|   RP   |  |        |                       |   OP   |
|        |  +--------+                       |        |
|        |                                   |        |
|        |<--------(3) AuthN Response--------|        |
|        |                                   |        |
|        |---------(4) UserInfo Request----->|        |
|        |                                   |        |
|        |<--------(5) UserInfo Response-----|        |
|        |                                   |        |
+--------+                                   +--------+

First (steps 1 to 3), the user interacts with Google’s server to acquire a cryptographic proof (the auth code), created by Google’s auth service, that shows two things:

One, the user has the correct credentials to log into the Google account he/she owns.
Two, my application can access only a specific list of information on Google’s server (the scope).

After that (steps 4 to 5), the user hands this proof (the auth code) to my backend. My backend then interacts with Google again, exchanging the auth code for the user info listed in the scope (openid) and optionally a refresh token. Google’s server can verify that everything is in order from the fact that my application possesses the one-time auth code.

Let’s review two diagrams for the same process.

PKCE Sequence Diagram

This is the first diagram, showing PKCE-enabled OIDC.

The client.com here corresponds to my backend service, and as.com corresponds to Google’s auth service.

Sequence Diagram For My Application

This is the second diagram, showing the Auth Code Grant Flow of Google’s OIDC process for my application:

A.1 and A.2 are specific to PKCE. This part generates a verifier on the backend side and sends the code challenge = sha256(verifier) (base64url-encoded when the S256 method is used) back to the mobile device’s browser.
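The PKCE step can be sketched as follows, assuming the S256 method from RFC 7636, where the challenge is the unpadded base64url encoding of the SHA-256 digest of the verifier. The function names are hypothetical.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    # 32 random bytes encode to 43 base64url characters,
    # the minimum verifier length allowed by RFC 7636.
    raw = secrets.token_bytes(32)
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def make_code_challenge(verifier: str) -> str:
    # S256: BASE64URL(SHA256(ASCII(code_verifier))), without '=' padding.
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
```

The backend keeps the verifier private (for example alongside the pending login state) and only the challenge travels through the browser, so an intercepted auth code alone is not enough to complete the exchange.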

Then (A.3 to A.6) the mobile browser is redirected to make a GET request to Google’s auth service. Google presents the user with a login prompt, and the user can check that the requested scope (profile name, email, etc.) is correct. Finally, control returns to the mobile browser with a redirect URL and an auth_code.

All of the A.* steps are the user interacting with the trusted third party (Google in our case). Starting from B.*, the process is about how my backend interacts with Google to acquire user information.
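The GET request in this step is just a URL with query parameters. Below is a sketch of how it could be assembled for Google’s authorization endpoint; the function name, the scope choice, and the state parameter (an anti-CSRF value that the flow above does not mention) are assumptions, and the client_id and redirect_uri values would come from the project’s own Google configuration.

```python
from urllib.parse import urlencode

# Google's OAuth 2.0 authorization endpoint.
GOOGLE_AUTH_ENDPOINT = "https://accounts.google.com/o/oauth2/v2/auth"


def build_authorization_url(
    client_id: str, redirect_uri: str, code_challenge: str, state: str
) -> str:
    """Build the URL the mobile browser is sent to for the login prompt."""
    params = {
        "client_id": client_id,
        "redirect_uri": redirect_uri,        # must match the URL registered with Google
        "response_type": "code",             # ask for an auth code, not tokens
        "scope": "openid email profile",     # what the user is asked to approve
        "state": state,                      # assumption: random value echoed back on redirect
        "code_challenge": code_challenge,    # from the PKCE step
        "code_challenge_method": "S256",
    }
    return GOOGLE_AUTH_ENDPOINT + "?" + urlencode(params)
```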

Starting from B.1, control is redirected from the mobile browser to my backend (because this is the URL I registered with Google as the redirect URL to use after Google’s auth service issues the auth code to my mobile web browser).
Along with this handoff, the auth_code is sent to my backend. At this point the backend has both the auth_code and the verifier (from the PKCE process in A.1 and A.2), which is enough information to make a POST request to Google’s auth service and exchange them for user info.
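The POST request could look roughly like the sketch below, which only builds the request against Google’s token endpoint and does not send it. The function name is hypothetical, and including client_secret assumes a confidential (server-side) client registration.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Google's OAuth 2.0 token endpoint.
GOOGLE_TOKEN_ENDPOINT = "https://oauth2.googleapis.com/token"


def build_token_request(
    auth_code: str,
    code_verifier: str,
    client_id: str,
    client_secret: str,
    redirect_uri: str,
) -> Request:
    """Build (but do not send) the auth-code-for-tokens exchange request."""
    form = {
        "grant_type": "authorization_code",
        "code": auth_code,               # one-time code received on the redirect
        "code_verifier": code_verifier,  # proves this backend started the flow
        "client_id": client_id,
        "client_secret": client_secret,  # assumption: confidential client
        "redirect_uri": redirect_uri,    # same value used in the authorization request
    }
    body = urlencode(form).encode("ascii")
    request = Request(GOOGLE_TOKEN_ENDPOINT, data=body, method="POST")
    request.add_header("Content-Type", "application/x-www-form-urlencoded")
    return request
```

Sending it would be a call such as urllib.request.urlopen(request), followed by parsing the JSON response body.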

Google’s server validates both the auth_code and the code challenge = sha256(verifier), then returns access_token, open_id (the OIDC ID token), and refresh_token. (B.3)

Control then goes back to my backend, which should set up a session for this user. If open_id tells me that this user already exists, I create a session object; otherwise I either auto-register the user or render a new-user registration flow in my mobile app.
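The decision at the end of the flow can be sketched as below, using the auto-register branch. It assumes the ID token has already been verified and decoded into a claims dict, and that users are keyed by the sub claim (the stable subject identifier in OIDC ID tokens). The function name and the two dicts standing in for the user table and the session store are hypothetical.

```python
import secrets
from typing import Dict, Tuple


def login_with_claims(
    claims: dict, users: Dict[str, dict], sessions: Dict[str, str]
) -> Tuple[str, bool]:
    """Create a session for the user in `claims`, auto-registering if needed.

    Returns (session_id, is_new_user). `claims` must come from an ID token
    whose signature, issuer, and audience were already verified.
    """
    subject = claims["sub"]
    is_new_user = subject not in users
    if is_new_user:
        # Auto-registration; alternatively the app could show a sign-up flow here.
        users[subject] = {"email": claims.get("email"), "name": claims.get("name")}
    session_id = secrets.token_urlsafe(32)
    sessions[session_id] = subject
    return session_id, is_new_user
```

For example, calling it twice with the same claims creates the user once and yields two distinct sessions, which is what a second device logging in would look like.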

The Youtube Video (slides)

Summary

This article discusses the high-level design for supporting sign-in with Google. The key is to treat Google as a third party trusted by both the user and my website.

Once the OIDC flow completes and Google’s auth service reports successful validation, I create a session for either my mobile or my web frontend.
