
How Authorization Tokens Work

1. Implementation Types

How access tokens are implemented is up to each authorization server.

Even so, as the following description in “1.4. Access Token” of RFC 6749 implies,

The token may denote an identifier used to retrieve the authorization information or may self-contain the authorization information in a verifiable manner (i.e., a token string consisting of some data and a signature).

implementations fall into two major patterns: “identifier type” and “self-contained type”. There is also a “hybrid type” that combines the two.

1.1. Identifier Type

In an identifier-type implementation, information associated with access tokens is stored in a database of the authorization server. Then, unique identifiers that can identify each database record are used as access tokens.

In some implementations, hash values of access tokens are stored in the database instead of the access tokens themselves for better security.

1.2. Self-Contained Type

In a self-contained-type implementation, information associated with an access token is embedded in the access token itself. The authorization server does not have to manage information associated with access tokens in its database.

1.3. Hybrid Type

In a hybrid-type implementation, self-contained-type access tokens are generated and, at the same time, database records corresponding to the access tokens are stored in a database of the authorization server.

2. How To Get Access Token Information

2.1. How To Get Information about Identifier-Type Access Token

If access tokens are identifier type, resource servers must query the authorization server for access token information unless the resource servers and the authorization server directly share the same database. The API that the authorization server provides for these queries is called the “introspection endpoint”.

RFC 7662 (OAuth 2.0 Token Introspection) is the standard specification for the introspection endpoint.

The standard introspection endpoint accepts POST requests with a mandatory token request parameter and an optional token_type_hint request parameter, and returns access token information in JSON format. RFC 7662 itself gives a request example and a response example.
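As a rough sketch, a resource server might build such a request as shown below. The endpoint URL, the bearer credential the resource server uses to authenticate itself, and the function names are hypothetical. Only the token and token_type_hint parameters and the active response member are taken from RFC 7662.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; a real deployment would use its own URL.
INTROSPECTION_ENDPOINT = "https://as.example.com/introspect"


def build_introspection_request(token, credential, token_type_hint=None):
    """Build (but do not send) a token introspection request."""
    params = {"token": token}  # mandatory parameter
    if token_type_hint is not None:
        params["token_type_hint"] = token_type_hint  # optional parameter
    body = urllib.parse.urlencode(params).encode("ascii")
    return urllib.request.Request(
        INTROSPECTION_ENDPOINT,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
            # How the caller authenticates is deployment-specific (assumed).
            "Authorization": "Bearer " + credential,
        },
    )


def is_active(response_body):
    """Interpret the JSON response body returned by the endpoint."""
    info = json.loads(response_body)
    return info.get("active") is True
```

Passing the returned object to urllib.request.urlopen would send the request. The sketch stops short of that so it can be read without a live server.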

The one concern frequently raised about identifier-type access tokens is performance, because querying an introspection endpoint involves network communication. In practice, however, concerns about introspection latency are largely mitigated by good caching. Of course, it is important to delete cache entries when the corresponding access tokens are revoked.
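The caching idea can be sketched as follows. This is a minimal in-memory illustration, not a production design: the class name, the 60-second default, and the injected introspect callable are all assumptions. How a resource server learns that a token was revoked is left out.

```python
import time


class IntrospectionCache:
    """Caches introspection results for a short time (hypothetical helper)."""

    def __init__(self, introspect, ttl_seconds=60, clock=time.time):
        self._introspect = introspect  # callable: token -> dict of token info
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # token -> (cached_until, info)

    def lookup(self, token):
        now = self._clock()
        entry = self._entries.get(token)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no network communication
        info = self._introspect(token)
        cached_until = now + self._ttl
        # Never keep an entry beyond the token's own expiration time.
        if "exp" in info:
            cached_until = min(cached_until, info["exp"])
        self._entries[token] = (cached_until, info)
        return info

    def evict(self, token):
        """Delete the cache entry, e.g. when the token is revoked."""
        self._entries.pop(token, None)
```

A short time-to-live bounds how long a revoked token can still be accepted if an eviction is missed.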

2.2. How To Get Information about Self-Contained-Type Access Token

If access tokens are self-contained type, access token information (such as expiration date) can be obtained by reading the content of the access tokens. Resource servers do not have to make inquiries to the introspection endpoint of the authorization server.
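For example, if the token happens to be an unencrypted JWT in compact form, the expiration time can be read locally as sketched below. The function name and the sample token are made up for illustration. The sketch does not check the signature, so its result must not be trusted on its own.

```python
import base64
import json


def read_unverified_claims(jwt):
    """Decode the payload of a compact JWT without checking its signature."""
    header_b64, payload_b64, signature_b64 = jwt.split(".")
    padding = "=" * (-len(payload_b64) % 4)  # base64url omits the padding
    return json.loads(base64.urlsafe_b64decode(payload_b64 + padding))


def b64url(data):
    """Helper used only to build the sample token below."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# A made-up token with a placeholder where the signature would be.
sample = ".".join([
    b64url(b'{"alg":"HS256"}'),
    b64url(b'{"sub":"user1","exp":1700000000}'),
    "signature-placeholder",
])

print(read_unverified_claims(sample)["exp"])  # prints 1700000000
```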

3. Verification of Self-Contained-Type Access Token

The format of self-contained-type access tokens is publicly known unless they are encrypted. Therefore, access tokens can be easily counterfeited if there is no mechanism to prevent it.

Before trusting information embedded in an access token, resource servers must verify in some way that the access token is not a forgery. This is why the sentence quoted above from “1.4. Access Token” of RFC 6749 says “in a verifiable manner”. A common practice for detecting forgery is to attach a signature to data and verify the signature when the data is used. In the case of self-contained-type access tokens, (1) an authorization server generates a signature-attached access token and then (2) a resource server verifies the signature.
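The two steps can be sketched with an HMAC-SHA256 signature over a JWT-style token. This is only an illustration under several assumptions: the function names are hypothetical, both sides already share the key, the algorithm is fixed to HS256 instead of being read from the header, and claims such as exp are not checked.

```python
import base64
import hashlib
import hmac
import json


def b64url_encode(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text):
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(claims, key):
    """Step (1): the authorization server attaches a signature."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = b64url_encode(json.dumps(claims).encode())
    signing_input = (header + "." + payload).encode("ascii")
    signature = hmac.new(key, signing_input, hashlib.sha256).digest()
    return header + "." + payload + "." + b64url_encode(signature)


def verify_token(token, key):
    """Step (2): the resource server verifies, then reads the claims.

    Returns the claims on success and None for any malformed or forged token.
    """
    try:
        header, payload, signature = token.split(".")
        signing_input = (header + "." + payload).encode("ascii")
        expected = hmac.new(key, signing_input, hashlib.sha256).digest()
        # Constant-time comparison avoids leaking how many bytes matched.
        if not hmac.compare_digest(expected, b64url_decode(signature)):
            return None
        return json.loads(b64url_decode(payload))
    except ValueError:
        return None
```

Changing any byte of the header or payload changes the expected signature, so a token edited by anyone without the key is rejected.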

Because JWT (JSON Web Token), defined in RFC 7519, is handy as a generic format for signature-attached data, it is often adopted as the format of self-contained-type access tokens. In fact, there is a specification that assumes it (“JSON Web Token (JWT) Profile for OAuth 2.0 Access Tokens”).

4. Consideration Points for JWT-based Access Token

4.1. Signature Algorithm

The available choices for the signature algorithm are those listed in “3.1. “alg” (Algorithm) Header Parameter Values for JWS” of RFC 7518 (JSON Web Algorithms). Among them, none, which means “no signature”, is of course meaningless for JWT-based access tokens.

(Figure: Signature algorithms listed in RFC 7518)
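Since none means “no signature”, a resource server can refuse it, along with any other algorithm it was not configured for, before attempting verification. The following is a minimal sketch of such a check. The function name and the allow-list contents are assumptions chosen for illustration.

```python
import base64
import json

# Algorithms this (hypothetical) resource server is configured to accept.
ALLOWED_ALGS = frozenset({"ES256", "PS256"})


def header_alg_is_acceptable(jwt, allowed=ALLOWED_ALGS):
    """Return True only if the JWS header names an explicitly allowed alg."""
    try:
        header_b64 = jwt.split(".")[0]
        padded = header_b64 + "=" * (-len(header_b64) % 4)
        header = json.loads(base64.urlsafe_b64decode(padded))
    except ValueError:
        return False  # not valid base64url or not valid JSON
    if not isinstance(header, dict):
        return False
    alg = header.get("alg")
    # "none" is rejected simply because it is never in the allow-list.
    return isinstance(alg, str) and alg in allowed
```

An allow-list is used instead of a deny-list so that an unknown or misspelled algorithm name is rejected by default.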

4.1.1. Symmetric Signature Algorithm

Because HS256, HS384, and HS512 are symmetric algorithms, an authorization server (which generates JWT-based access tokens) and a resource server (which interprets JWT-based access tokens) must share the same key. At the time of this writing, there is no specification that defines a rule to determine the shared key.

“10.1. Signing” of OpenID Connect Core 1.0 states that the shared key is “the octets of the UTF-8 representation of the client_secret value” when a symmetric algorithm is used for signing. However, this rule applies only between an authorization server and a client application. It cannot be applied to a symmetric key shared between an authorization server and a resource server.

(Figure: No rule for shared key between authorization server and resource server)

Therefore, implementers have to decide their own rules as to how to determine a shared key if they want to use symmetric algorithms for signing access tokens.

Some authorization server implementations issue pairs of client ID and client secret to resource servers. By treating resource servers as clients, the existing rules and infrastructure for keys can be reused. It may work, but I’m not so sure that mixing different concepts won’t cause inconsistencies somewhere unexpected in the future.

4.1.2. Asymmetric Signature Algorithm

The other algorithms (those starting with RS, ES, and PS) are asymmetric.

An authorization server signs an access token with a private key, and a resource server verifies the signature using a public key exposed by the authorization server. The resource server therefore has to obtain the public key of the authorization server before performing signature verification.

If the authorization server provides an endpoint that exposes its JWK Set document (RFC 7517) and the document includes a public key for verifying the signature of access tokens, resource servers can download the public key from the endpoint.

If the authorization server supports OpenID Connect Discovery 1.0, resource servers can find the URL of the JWK Set document in a response from the discovery endpoint of the authorization server ({issuer-identifier}/.well-known/openid-configuration). A discovery endpoint returns information about the server’s configuration in JSON format. The value of the jwks_uri parameter in the JSON represents the URL of the JWK Set document. Google’s discovery endpoint is a live example.
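The lookup chain from issuer identifier to public key can be sketched as below. The sketch only parses documents that have already been downloaded. The function names are hypothetical, and selecting a key by its kid (key ID) member is an assumption about how the authorization server labels its keys, not something the text above requires.

```python
import json


def discovery_url(issuer):
    """Build the discovery endpoint URL from an issuer identifier."""
    return issuer.rstrip("/") + "/.well-known/openid-configuration"


def jwks_uri_from_discovery(document):
    """Extract jwks_uri from a discovery response body (a JSON string)."""
    metadata = json.loads(document)
    return metadata["jwks_uri"]


def find_key(jwk_set_document, kid):
    """Pick the JWK whose "kid" matches the one named in the token header."""
    for jwk in json.loads(jwk_set_document).get("keys", []):
        if jwk.get("kid") == kid:
            return jwk
    return None  # unknown key: the token cannot be verified
```

A resource server would typically cache the downloaded JWK Set and fetch it again only when a token refers to a key it does not know yet.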

RS256 is “Recommended” in RFC 7518, but it is better not to use asymmetric algorithms that start with RS. For security reasons, “8.6. JWS algorithm considerations” in “Financial-grade API — Part 2: Read and Write API Security Profile” says that RS algorithms should not be used. In addition, from a viewpoint of key size and performance, other algorithms are preferable.

4.2. Encryption

JWT-based access tokens can be encrypted by using RFC 7516 (JSON Web Encryption).

4.2.1. Symmetric Encryption Algorithm

As with a symmetric signature algorithm, implementers have to decide how to determine a shared key between an authorization server and a resource server, because there is no standard specification for this purpose.

4.2.2. Asymmetric Encryption Algorithm

If an asymmetric algorithm is used for encryption, an authorization server uses a public key of a resource server in encrypting access tokens. However, there is no standard specification that defines how to get a public key of a resource server (Note1). Therefore, implementers have to decide how to pass a public key of a resource server to an authorization server.

(Note1: A specification (OAuth 2.0 Protected Resource Metadata) defining metadata of a resource server was proposed in the past and it included jwks_uri, but the last update of the draft was more than 2 years ago (Jan. 19, 2017) and it has already expired.)

If an authorization server encrypts access tokens with an asymmetric algorithm using a public key of a resource server, the implementation of the introspection endpoint of the authorization server needs a private key of the resource server that corresponds to the public key in order to decrypt the encrypted access tokens. If implementers think it is a bad practice to share a private key between an authorization server and a resource server, acceptable choices will be either (1) that the introspection endpoint returns an error when it receives an encrypted access token or (2) that the authorization server gives up providing an introspection endpoint.

4.3. Information Hidden from Client

It is easy to read the payload part of unencrypted JWT-based access tokens. Therefore, information that should not be visible to a client must not be included in an unencrypted JWT-based access token.

To associate information that should be hidden from a client with an access token, possible choices will be the following.

  1. encrypt access tokens
  2. adopt identifier-type access tokens
  3. adopt hybrid-type access tokens and keep secret information only in the database on the server-side and not embed the information in access tokens

Regarding access token encryption, it should be noted that access tokens are sent over the network on every API call, which raises another issue when encryption keys are compromised, especially if the encryption algorithm lacks PFS (perfect forward secrecy).

4.4. Access Token Revocation

It is difficult, if not impossible, to revoke self-contained-type access tokens. Because the structure is the same as that of a PKI certificate, a mechanism equivalent to PKI’s CRL (Certificate Revocation List) or OCSP (Online Certificate Status Protocol) must be in operation in order to revoke self-contained-type access tokens before their expiration.

To build a mechanism equivalent to CRL or OCSP, each access token must be uniquely identifiable. This can be achieved by utilizing the jti claim. Then, an authorization server registers the unique identifier of an access token into its “access token revocation list” when the access token is revoked. The unique identifier must be kept in the list until the original expiration date of the access token is reached. If the unique identifier were removed earlier, the revoked access token would be resurrected.

When a resource server receives an access token, it must check the revocation status of the access token. If a CRL-like mechanism has been adopted, the resource server will download the list of revoked access tokens from somewhere and check whether the unique identifier of the access token is included in the list or not. On the other hand, if an OCSP-like mechanism is in operation, the resource server will pass the unique identifier of the access token to an API equivalent to “OCSP Responder” and get the revocation status of the access token in return.

(Figure: Inquiry about revocation status of access token)
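The bookkeeping behind such a revocation list can be sketched as follows. This is a minimal in-memory illustration with hypothetical names. A real authorization server would persist the list and expose it either as a downloadable list (CRL-like) or through a status API (OCSP-like).

```python
import time


class RevocationList:
    """In-memory list of revoked token identifiers (jti values)."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._revoked = {}  # jti -> original expiration time (exp)

    def revoke(self, jti, exp):
        """Register a revoked token together with its original expiration."""
        self._revoked[jti] = exp

    def is_revoked(self, jti):
        """The check a resource server performs for each received token."""
        return jti in self._revoked

    def purge_expired(self):
        """Drop entries only once the token would have expired anyway.

        Removing an entry earlier would resurrect the revoked token.
        """
        now = self._clock()
        for jti in [j for j, exp in self._revoked.items() if exp <= now]:
            del self._revoked[jti]
```

Because entries are dropped at the original expiration time, the list never grows beyond the number of tokens revoked within one token lifetime.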

However, asking an authorization server about revocation status involves network communication, just as calling an introspection endpoint for an identifier-type access token does. If so, the biggest advantage of self-contained-type access tokens vanishes. Given that the identifier type has more merits, it is hard to find convincing reasons to actively adopt the self-contained type. If implementers still had to choose the self-contained type, it would be only when an authorization server and a resource server cannot communicate over the network at runtime for some reason.

Therefore, when adopting the self-contained type for access tokens, implementers often settle for a compromise: “make the lifetime of access tokens as short as possible and give up revocation”.