Which information is encoded in Signal's user QR codes?

Written on 2024-03-08 in 1198 words ✍️.

Update 2024-03-09: the two digits can be overwritten by the user - thanks for pointing it out, Stefan!

Introduction

My favorite messaging app Signal recently launched a major feature that removes phone numbers as identifiers as far as possible. The recent blog post “Keep your phone number private with Signal usernames” describes it. Instead of phone numbers, Signal now uses usernames like “meisterluk.42”, which the user can choose. Note that these usernames are meant as a temporary way to find a user right away: the identifier can easily be given up, and there is no directory of usernames. The feature also includes QR code generation specific to your username:

Figure: two smartphone screens showing the QR code as part of the user profile and its decoded value, which is a URL.

You can also generate a QR code or link that directs people to your username, letting them quickly connect with you on Signal.

— Signal blog post

We wondered whether this QR code URL encodes only the username. If so, someone scanning an old QR code might reach a different person, because the original owner gave up the username and another person now uses it. Can this happen? If the URL carries additional information (e.g. when the username was assigned), this case can be prevented.

FOSS (free and open-source software) has the beautiful property that anyone can check this technical aspect.

Source code analysis

Heads-up: the app is written in Kotlin, an obvious choice for an Android application.

  1. To get started, we need to identify where the QR code is shown. In QrCodeBadge, we can see that the screen is divided into rows and columns to build the user interface. This gives us a starting point for identifying which information is used for QR code generation.

  2. So we generate a QrCode instance to be represented. Does it really reveal which information is encoded? No, because we only deal with pixels and bits. The code does not indicate how the bits have been created.

  3. So far we understand that the UI component QRCode is generated with data.data as its information source, that is, the data member of QrCodeState, which is of type QrCodeData.

  4. Once more, QrCodeData only considers the bits and not the semantic elements of the stored information. At this point, we need to go back to the first step. We know how the QR code is generated, but who fills the data member?

  5. Most likely UsernameLinkShareScreen does it. There are several calls of QrCodeBadge (the function we looked at in the first step), but I found this one most plausible.

  6. If so, the information is stored in qrCodeState of an object of class UsernameLinkSettingsState.

  7. And which value does qrCodeState have? Well, this really depends on the touch user interface state (loading screen versus ready). But likely the relevant information is stored in _state.qrCodeState.

  8. Where is qrCodeState set? This happens based on a callback once the data is available.

  9. Based on the callback response data, the link data must be generated. toLink() does it.

  10. Where is toLink() implemented? In UsernameRepository.

  11. entropy and serverId are the components. This corresponds to the documentation of UsernameLinkComponents:

    “[…​] for passing around the two components of a username link: the entropy and serverId, which when combined together form the full link handle.”

  12. But let us take one step back to UsernameRepository. Its comprehensive documentation reveals the solution to our question:

  • We want usernames to be assigned a random numerical discriminator to avoid land grabs

  • We don’t want to store plaintext usernames on the service

  • We don’t want plaintext usernames in username links

  • We want username links to be revocable and rotatable without changing your username

  • We want users to be able to turn a link into a displayable username in the app

— requirements for username links acc. to UsernameRepository documentation

There’s three main components to username links:

  • An encrypted username blob

  • A serverId (which is a UUID)

  • "entropy" (some random bytes used to encrypt the username)

The service basically stores a map of (serverId → encrypted username blob). We can ask the service for the encrypted username blob for a given serverId, and then decrypt it with the entropy. Simple enough.

How are those pieces shared? Well, the link looks like this: https://signal.me/#eu/<32 bytes of entropy><16 bytes of serverId uuid>

So, when we get a link, we parse out the entropy and serverId. We then use the serverId to get the encrypted username, and then decrypt it with the entropy.

This gives us everything we want:

  • We can rotate our link without changing our username by just picking new (serverId, entropy) and storing a new blob on the service.

  • When the user decrypts the username, they see it displayed exactly how the user uploaded it.

  • The service has no idea what links correspond to what usernames — it’s just storing encrypted blobs.

— UsernameRepository documentation
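The link format from the quoted documentation can be sketched in a few lines of Python. This is only an illustration, not Signal's implementation: the names LinkComponents, to_link and parse_link are hypothetical, and the sketch assumes that the 48 bytes are carried as unpadded URL-safe Base64 in the URL fragment, which the quoted documentation does not state.

```python
import base64
import uuid
from typing import NamedTuple

PREFIX = "https://signal.me/#eu/"  # taken from the quoted documentation
ENTROPY_LEN = 32                   # bytes of entropy, per the documentation
SERVER_ID_LEN = 16                 # bytes of the serverId UUID


class LinkComponents(NamedTuple):
    """Hypothetical container for the two components of a username link."""
    entropy: bytes
    server_id: uuid.UUID


def to_link(components: LinkComponents) -> str:
    """Concatenate entropy and serverId and append them to the prefix."""
    raw = components.entropy + components.server_id.bytes
    # Assumption: unpadded URL-safe Base64 as the text encoding.
    encoded = base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")
    return PREFIX + encoded


def parse_link(url: str) -> LinkComponents:
    """Split a link into entropy (first 32 bytes) and serverId (last 16)."""
    if not url.startswith(PREFIX):
        raise ValueError("not a username link")
    payload = url[len(PREFIX):]
    # Restore the padding that was stripped when the link was built.
    padded = payload + "=" * (-len(payload) % 4)
    raw = base64.urlsafe_b64decode(padded)
    if len(raw) != ENTROPY_LEN + SERVER_ID_LEN:
        raise ValueError("unexpected payload length")
    return LinkComponents(
        entropy=raw[:ENTROPY_LEN],
        server_id=uuid.UUID(bytes=raw[ENTROPY_LEN:]),
    )
```

Note that nothing in this payload depends on the username itself: both components are random values, so the link alone reveals nothing about the name it points to.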

Summary

So the URL is decoded as follows:

  1. The serverId is extracted from the URL (final 16 bytes).

  2. A Signal server responds with the encrypted username blob associated with this serverId.

  3. We use the entropy of the URL (first 32 bytes) to decrypt the encrypted username blob. We get a username blob.

  4. By requirement #5, this is decoded into the displayable username together with the discriminator (the two digits as part of the username).
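The steps above can be played through with a toy model. Everything in this sketch is an assumption made for illustration: FakeService is a hypothetical in-memory stand-in for the Signal server, and toy_cipher is a simple XOR keystream that only mimics "encrypt with the entropy". It is not the encryption Signal actually uses and it is not secure. Decoding the username blob is reduced to plain UTF-8 here.

```python
import hashlib
import os
import uuid
from typing import Dict, Optional, Tuple


def toy_cipher(entropy: bytes, data: bytes) -> bytes:
    """Stand-in cipher (NOT Signal's): XOR with a SHAKE-256 keystream.

    Applying it twice with the same entropy returns the original data.
    """
    keystream = hashlib.shake_256(entropy).digest(len(data))
    return bytes(a ^ b for a, b in zip(data, keystream))


class FakeService:
    """Hypothetical server: stores only serverId -> encrypted blob."""

    def __init__(self) -> None:
        self.blobs: Dict[uuid.UUID, bytes] = {}

    def put(self, blob: bytes) -> uuid.UUID:
        server_id = uuid.uuid4()
        self.blobs[server_id] = blob
        return server_id

    def get(self, server_id: uuid.UUID) -> Optional[bytes]:
        return self.blobs.get(server_id)

    def revoke(self, server_id: uuid.UUID) -> None:
        self.blobs.pop(server_id, None)


def create_link(service: FakeService, username: str) -> Tuple[bytes, uuid.UUID]:
    """Pick fresh entropy, upload the encrypted username, return both parts."""
    entropy = os.urandom(32)
    server_id = service.put(toy_cipher(entropy, username.encode("utf-8")))
    return entropy, server_id


def resolve_link(service: FakeService, entropy: bytes,
                 server_id: uuid.UUID) -> Optional[str]:
    """Steps 1 to 3: ask for the blob by serverId, decrypt with the entropy."""
    blob = service.get(server_id)
    if blob is None:
        return None  # link was revoked or never existed
    return toy_cipher(entropy, blob).decode("utf-8")
```

Calling create_link twice for "meisterluk.42" yields two different (entropy, serverId) pairs that both resolve to the same username, and revoking one serverId invalidates only that link. The service object never holds the plaintext username, only the blobs.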

But is it really just (username, discriminator)?

The source code reveals it is the tuple (username, ACI). What is ACI?

Let’s keep it simple here: as far as I can tell, the ACI is the account identifier that belongs to the username. The client looks it up by making an API call with the hashed username, which also verifies that the username exists.

Conclusion

The QR code only resolves to the displayable username with the two digits. The URL itself carries the entropy and the serverId rather than the username, so the Signal server never sees the plaintext username, and many different QR codes can be generated for the same username.

And I like FOSS as well as the Signal source code.