Disambiguating Data Wallets
Data wallets are becoming popular, and as this happens the term is becoming increasingly overloaded and confused. This post is a digestible disambiguation 🍜 of the standards 📇, regulation ✒️, implementors 🪛, and other players 🙋 in the space.
The Standards
As an engineer, I like to start by talking about the technology and standards behind data wallets. From there, we can point to which standards regulators do - or don't - choose.
Don't worry - the tech is not hard to understand. As you'll soon see, the confusion comes from having many players who want their solution to win out.
The goal
Credit: dock.io
The common goal of data wallets is to allow you to prove that someone said something - for instance that the University of Oxford says that you earned a DPhil in Computer Science, or that Ticketmaster says they issued you with a valid ticket for tonight's Taylor Swift concert.
To do this, the someone (which we will now call an issuer) gives you - or more specifically an application on your device like Google Wallet (which we will call the holder) - a digital Verifiable Credential. Examples include the UK digital driver's license 💳 and Digital Student Certificates.
This digital credential can then be forwarded to someone else (which we call a verifier) - such as an employer who wants to confirm that you have a valid doctorate. Et ✨voilà✨ - you now have that dream job growing cherry tomatoes¹.
The tech
Now I said the tech wasn't that hard - so let's take a look at what is going on under the hood of these credentials.
Here is an example using one of the credential standards, specifically W3C Verifiable Credentials - we'll come to the rest later. In this example, I have been issued with a "DPhil in Computer Science" from the University of Oxford.
```json
{
  "@context": "https://www.w3.org/ns/credentials/v2",
  "id": "http://www.ox.ac.uk/credentials/58473",
  "type": ["VerifiableCredential", "DPhilAwardCredential"],
  "issuer": "http://www.ox.ac.uk/",
  "credentialSubject": {
    "id": "https://www.jeswr.org/#me",
    "awarded": {
      "id": "http://www.cs.ox.ac.uk/awards/DPhil",
      "name": "DPhil in Computer Science"
    }
  }
}
```
A JSON-LD representation of a W3C Verifiable Credential for a DPhil Award
But there is one problem: I just made this up. So how is this supposed to be useful in my job application to become a farm hand²?
Well, what would help is for Oxford to digitally sign this credential.
The concept of digital signatures has existed for decades and - whether you're aware of it or not - is already in many parts of your digital life, including being the backbone of HTTPS security. More recently, if you've found yourself using passkeys to log into websites, then you've been using digital signatures to sign a message saying "I own this account, please let me in!"
Signatures
So how do these signatures actually work?
First, a hash of the digital credential document is created - this acts as a unique fingerprint 🐾 for the document. For the DPhil Award Credential, this is what the hash looks like:
4sSXDN7iEw2niW96vPWNPJVeiwWe6VR77jl+wRnA6bk=
You can try generating it for yourself here. It is not important for this article to understand how this hash is generated, but if you're curious - look here.
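If you'd like to see the mechanics in code, here is a minimal sketch using Node's built-in crypto module. Note that real credential suites canonicalize the document before hashing (so that equivalent documents always produce the same fingerprint), which is why this sketch won't reproduce the hash above byte-for-byte:

```ts
import { createHash } from "node:crypto";

// The credential document, serialized to a string of bytes
const credential = JSON.stringify({
  "@context": "https://www.w3.org/ns/credentials/v2",
  id: "http://www.ox.ac.uk/credentials/58473",
  type: ["VerifiableCredential", "DPhilAwardCredential"],
  issuer: "http://www.ox.ac.uk/",
});

// Base64-encoded SHA-256 digest - the document's unique fingerprint
console.log(createHash("sha256").update(credential).digest("base64"));
```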
The issuer (i.e. Oxford) then signs this hash using something called public-private key cryptography. The way this works is that the issuer uses some mathemagic to generate a pair of files - one called a public key, and the other called a private key. Below are real examples of these files:
-----BEGIN RSA PRIVATE KEY-----
MIIBOgIBAAJBAKj34GkxFhD90vcNLYLInFEX6Ppy1tPf9Cnzj4p4WGeKLs1Pt8Qu
KUpRKfFLfRYC9AIKjbJTWit+CqvjWYzvQwECAwEAAQJAIJLixBy2qpFoS4DSmoEm
o3qGy0t6z09AIJtH+5OeRV1be+N4cDYJKffGzDa88vQENZiRm0GRq6a+HPGQMd2k
TQIhAKMSvzIBnni7ot/OSie2TmJLY4SwTQAevXysE2RbFDYdAiEBCUEaRQnMnbp7
9mxDXDf6AU0cN/RPBjb9qSHDcWZHGzUCIG2Es59z8ugGrDY+pxLQnwfotadxd+Uy
v/Ow5T0q5gIJAiEAyS4RaI9YG8EWx/2w0T67ZUVAw8eOMB6BIUg0Xcu+3okCIBOs
/5OiPgoTdSy7bcF9IGpSE8ZgGKzgYQVZeN97YE00
-----END RSA PRIVATE KEY-----
-----BEGIN RSA PUBLIC KEY-----
MEgCQQCo9+BpMRYQ/dL3DS2CyJxRF+j6ctbT3/Qp84+KeFhnii7NT7fELilKUSnx
S30WAvQCCo2yU1orfgqr41mM70MBAgMBAAE=
-----END RSA PUBLIC KEY-----
What is special about the letters and numbers in these two files is that a mathematical function can be used to combine the private key and the hash to generate a signature like this one:
z58DAdFfa9SkqZMVPxAQp...jQCrfFPP2oumHKtz
The issuer keeps the private key a secret so that no-one can forge the signature. The issuer (Oxford) also tells everyone about their public key, for instance, by putting it on their website. Putting this all together, we add the following information to the DPhil Award Credential:
```json
{
  ...
  "proof": {
    ...
    "verificationMethod": "http://www.ox.ac.uk/pubkey",
    "proofValue": "z58DAdFfa9SkqZMVPxAQp...jQCrfFPP2oumHKtz"
  }
}
```
How does this help the verifier (my prospective farming employer) confirm that my DPhil Award Credential was actually stated by the issuer (Oxford)?
The verifier can use a different mathematical function to convert the signature and public key back into a hash. If that hash is the same as the hash of my DPhil Award Credential, then they know that the award must have been signed by the private key that the issuer (Oxford) created.
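For the curious, here is a hedged sketch of that whole round trip using Node's built-in crypto module - illustrative of the principle, rather than of any particular Verifiable Credential proof suite:

```ts
import { generateKeyPairSync, sign, verify } from "node:crypto";

// Issuer side (done once): generate the public/private key pair
const { publicKey, privateKey } = generateKeyPairSync("rsa", { modulusLength: 2048 });

// Issuer side: sign the credential bytes - Node hashes the input with
// SHA-256 internally before applying the private key
const credential = Buffer.from('{"type": ["VerifiableCredential", "DPhilAwardCredential"]}');
const signature = sign("sha256", credential, privateKey);

// Verifier side: check the signature using only the PUBLIC key - if the
// document changes by even one byte, verification fails
console.log(verify("sha256", credential, publicKey, signature)); // true
```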
Selective disclosure
Many headlines surrounding digital credentials - such as this UK press release - promise the ability to "prove your age without revealing any other information."
To enable this, some Verifiable Credentials are built with the capacity to perform Selective Disclosure. In short, this allows you to take a Verifiable Credential containing lots of information, such as this Resident Card credential, and forward only part of the information - such as your birthDate - to the verifier, whilst enabling the verifier to confirm that the date of birth was contained in a validly signed Verifiable Credential.
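Under the hood, one common approach (used by SD-JWT, which we'll meet later) is for the issuer to sign salted hashes of each claim rather than the claims themselves. A simplified sketch, assuming Node's crypto module:

```ts
import { createHash, randomBytes } from "node:crypto";

// Issuer side: each claim is salted and hashed; only the digests go inside
// the signed credential, so the signature covers the claim without exposing it
const salt = randomBytes(16).toString("base64url");
const disclosure = JSON.stringify([salt, "birthDate", "2000-01-01"]);
const digest = createHash("sha256").update(disclosure).digest("base64url");

// Holder side: forward the signed credential plus ONLY the disclosures you
// choose to reveal. Verifier side: re-hash each revealed disclosure and
// check that the digest appears in the signed credential
const check = createHash("sha256").update(disclosure).digest("base64url");
console.log(check === digest); // true - birthDate was attested by the issuer
```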
Standards Wars
Well that all makes sense … so what on earth is there to dispute? Quite a bit, as it turns out! Broadly speaking, the debate is around:
- What the format of the information inside the digital credential should be,
- What mathematical function should be used for creating the signature,
- How the hash of the digital credential should be created, and
- How to specify what attributes are described within a credential.
These are the kinds of battles that we have seen played out many times historically.
Past format wars include VHS vs. BetaMax, Blu-Ray vs. HD DVD, and, if we dare venture back to the 1800s - wars over the size of the rail gauge and the type of electrical current we should use.
So - what different formats are there? Who is backing them? How do they compare?
There are three key players in the space: the World Wide Web Consortium (W3C), the International Organization for Standardization (ISO), and the Internet Engineering Task Force (IETF).
W3C Specifications
The W3C were first to work on many standards around Digital Credentials, following the formation of a Credentials Community Group in 2014. By 2017, this group had published their Verifiable Claims Data Model and Representations 1.0, which defined how to express signed credentials similar to the one shown in our earlier discussion of the tech. This specification was prescriptive about core functionality, such as how to sign credentials and how to describe core "metadata" such as who issued the credential, when it was issued, and who it is about. The specification intentionally left the task of defining the data structures of domain-specific credentials - such as a diploma credential or digital driver's license - out of scope, instead allowing arbitrary credential types to be listed.
Even within this W3C specification there is a tension over the format that should be used to describe the content of credentials. The specification described how to express credentials using both JSON and JSON-LD. The Linked Data community advocated for the use of the RDF data model - for its semantic richness, extensibility, and interoperability, aligning credentials with the broader Semantic Web vision - and compromised on JSON-LD as the encoding for this data model.
This data model is what backs Enterprise Knowledge Graphs such as the Google Knowledge Graph. A key feature of this data model, is that it supports contextual understanding. Suppose I have the following credential:
```json
{
  ...
  "type": ["VerifiableCredential", "CustomExt12"],
  "referenceNumber": 83294847,
  ...
}
```
This reference number could refer to any number of things: a customer support ticket, a product identifier, or a transaction receipt. So in order to understand how to use this information, I need a pre-defined understanding of what a CustomExt12 credential describes. This also makes it difficult to integrate information from multiple credentials, as we need to keep track of the contextual information of which credential they were extracted from.
The use of RDF - encoded as JSON-LD - within Verifiable Credential standards provides another option. Here, all of this context is attached to the referenceNumber term - in JSON-LD this can be done as follows:
```json
{
  "@context": [
    "https://www.w3.org/ns/credentials/v2",
    "https://www.w3.org/ns/credentials/examples/v2",
    "https://extension.example/my-contexts/v1"
  ],
  ...
  "type": ["VerifiableCredential", "CustomExt12"],
  "referenceNumber": 83294847,
  ...
}
```
The effect of adding this @context is to establish a URI defining referenceNumber, e.g. http://example.org/schema/receipts/tesco/referenceNumber. This URL can be dereferenced (looked up) to discover a contextual description, e.g.:
```turtle
@prefix tesco: <http://example.org/schema/receipts/tesco/> .
@prefix receipts: <http://example.org/schema/receipts/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

tesco:referenceNumber rdfs:subPropertyOf receipts:referenceNumber ;
  rdfs:label "Tesco Receipt Reference Number" ;
  rdfs:comment "A unique reference number for the receipt of purchasing a set of products at Tesco" ;
  rdfs:domain tesco:Purchase ;
  rdfs:range xsd:integer .
```
This built-in contextual information is especially useful when, for instance, we want to integrate data from many credentials that each may use the term referenceNumber to describe different concepts (e.g. reference numbers from different types of purchases, shops, etc.).
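You can see this in action with the jsonld.js library. A minimal sketch, using an inline context and the made-up example.org URI from above:

```ts
import jsonld from "jsonld";

// An inline @context mapping the ambiguous term to a full URI
const doc = {
  "@context": {
    referenceNumber: "http://example.org/schema/receipts/tesco/referenceNumber",
  },
  referenceNumber: 83294847,
};

// Expansion rewrites every term to its globally unique URI, so data from
// different credentials can be merged without terms clashing
const expanded = await jsonld.expand(doc);
console.log(JSON.stringify(expanded));
// [{"http://example.org/schema/receipts/tesco/referenceNumber":[{"@value":83294847}]}]
```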
Conversely, the Security and Cryptography community pushed for plain JSON with JWT (JSON Web Tokens), to reduce implementation complexity and ease security analyses.
GPT-4.5 does a decent job of providing a slightly longer presentation of this history - which you can find here.
By 2019, this work had evolved into a W3C-endorsed working group, which produced the Verifiable Credentials Data Model 1.0. This specification struck a new compromise on the data model - in particular, requiring all credentials to be JSON-LD with a particular framing so that the document could be parsed either as RDF or as plain JSON. This approach comes with its own set of challenges.
ISO Specifications
Instead of defining generic credential formats, ISO has taken the approach of defining credentials for specific domains.
The first Verifiable Credential specification published by ISO was the Mobile driving licence (mDL) specification - published in 2021. This specification defines a fixed schema of approximately 30 attributes for digital driver's licenses - such as the driver's name, address, date of birth, and the expiry date of the license. The specification expects attributes to be serialized using JSON or CBOR, and thus lacks the out-of-the-box interoperability that comes with linked-data formats.
As we shall discuss in later sections, this digital driver's license specification also defines a series of bespoke transmission and query mechanisms.
This means that it is very well-defined how to build an infrastructure specifically for mDL licenses. The trade-off is that implementors need to build custom transmission flows and query engines to support the specification. This both increases implementation burden and hinders interoperability with non-mDL credentials.
ISO is also working on several other Verifiable Credential standards - including Cards and security devices for personal identification, designed to standardize core features for electronic identity documents including driver's licenses, passports, residency permits, and building passes. The underlying goal of the standard is to support interoperability between electronic identity (eID) systems. This standard also defines a range of attributes that may be required in different eID systems - extending those found in the mDL specification with attributes such as Business Name, Profession, and Academic Title to support workplace passes, as well as other attributes such as telephone number and email address. This specification also targets JSON and CBOR formats for encoding data in credentials - meaning that there are still interoperability challenges for systems that need to define attributes not covered by this document.
IETF Specifications
The IETF is also producing a JSON-based Verifiable Credential format called SD-JWT-based Verifiable Credentials.
JSON Web Tokens (JWTs) are commonly used on the Web today for a range of tasks requiring signed data - for instance, they are often used to prove to a website that you are logged in and allowed to access private information.
Selective Disclosure (SD), which we have already discussed above, is a mechanism for proving that a subset of information within a credential is true - without revealing the whole credential to a verifier.
As the Internet Engineering Task Force (IETF) is responsible for producing a number of Internet Standards, SD-JWT-based Verifiable Credentials have been produced with the goal of allowing digital credentials to be easily integrated into existing internet systems - such as OAuth authentication flows, which are commonly used for single sign-on.
| Feature | W3C Verifiable Credentials | ISO mDL (18013-5) / ISO 23220 | IETF SD-JWT VC |
| --- | --- | --- | --- |
| Data Format | JSON-LD with specified framing (can be parsed as both RDF and plain JSON) | JSON or CBOR serialization with fixed schema | JSON with JWT (JSON Web Token) structure |
| Signature Formats | Multiple supported (Data Integrity, JSON Web Signatures) | ISO-specific cryptographic protocols | JWT signatures (RS256, ES256, etc.) |
| Hashing Mechanisms | Various supported (SHA-256, etc.) depending on proof type | Defined within ISO specifications | SHA-256 and other algorithms supported by JWT |
| Attribute Specification | Semantic, extensible via RDF/JSON-LD context definitions | Fixed schema with predefined attributes | JSON claims with selective disclosure support |
| Selective Disclosure | Supported through various methods (BBS+, etc.) | Limited to predefined attributes (e.g., age verification) | Native support through SD-JWT mechanisms |
This table was generated with the assistance of claude-3.7-sonnet-thinking
A push for alignment
The Open Wallet Foundation, hosted by the Linux Foundation, has a mission to facilitate global interoperability of verifiable credentials.
To this end, the Open Wallet Foundation has been chartered to:
- develop and maintain open source code for wallets to enable and ensure wallet interoperability,
- advocate for the adoption of the interoperable digital wallet technology, and
- collaborate with Standards Development Organizations (SDOs) in the development and proliferation of open standards related to digital wallets.

The OWF will not publish a publicly available wallet (including into any application stores).
The OWF has also taken on around two dozen open source codebases in support of this mission.
A number of other alignment/harmonization efforts are also under way within standards organisations. The draft ISO/IEC 23220-2 specification, for instance, defines a CDDL (Concise Data Definition Language) data model to support mapping ISO-defined credentials to the W3C Verifiable Credentials format.
Regulation Driving Data Wallets
European Digital Identity (EUDI) Regulation
After three years in the making, the European Digital Identity (EUDI) framework officially came into force on May 20, 2024 - through the eIDAS 2 (Electronic Identification, Authentication, and Trust Services) regulation. EUDI promises to make "EU Digital Identity […] available to EU citizens, residents, and businesses who want to identify themselves or provide confirmation of certain personal information." By 2026, every EU Member State will be required to make at least one Digital Identity Wallet available to all citizens and residents.
There are three core types of credentials that are to be made available under eIDAS 2 regulation:
- Electronic Attestation of Attributes (EAA) - which can be issued by any organisation that wants to make statements about a particular entity (e.g., they have a concert ticket, gym membership, or student card)
- Qualified Electronic Attestation of Attributes (QEAA) - which can be issued only by Qualified Trust Service Providers to create legally binding credentials such as professional qualifications, birth certificates, marriage licenses, property deeds and business operating licenses.
- Personal Identification Data (PID) - which can be issued only by government authorities and serve as a proof of identity.
The European Union has produced an Architecture and Reference Framework, details of which are available here. This reference architecture specifies:
- How to issue PID data - both the ISO mDL specification and the IETF SD-JWT specification can be used to format the data - and how verifiers can request data using OID4VP.
- That (Q)EAAs MUST be issued in accordance with either the ISO mDL data model or the W3C Verifiable Credentials Data Model.
Data (Use and Access) Bill
The Data (Use and Access) Bill is proposed legislation currently at committee stage in the House of Commons. One mandate of the bill is to create a Digital Verification Services (DVS) trust framework - driven by the Secretary of State maintaining a register of service providers accredited to provide "digital verification services" in the UK.
The Digital Identity and Attributes Trust Framework (DIATF) has been created by the Department for Science, Innovation and Technology (DSIT) in the UK, as a framework defining the services that different providers can implement in order to become registered as a DVS provider.
The DVS may be seen as the UK's equivalent to the eIDAS regulation, whilst the DIATF may be seen as equivalent to the EU's Architecture and Reference Framework.
Notably, the DIATF is less prescriptive about which standards must be used - and places more focus on the roles of different service providers. In the latest iteration of this framework, five types of service provider were defined:
- Identity Service Providers
- Attribute Service Providers
- Holder Service Providers
- Orchestration Service Providers, and
- Component Service Providers
Source: GOV.UK: What the data bill means for digital identity
UK Digital Driver's License
Source: GOV.UK: Digital driving license coming this year
In January, the UK announced a Digital Driver's License that will be made available through a new GOV.UK App - planned to launch in the summer of 2025. Further, it is expected that there will be digital forms of all UK documents made available by 2027.
The core infrastructure backing this will be the ISO mobile driving licence (mDL) standard.
Whatever is happening in Australia
Meanwhile, Australia has just … gotten on with the job. In most states you can download a digital driver's licence today - in Queensland the licence was being piloted back in 2020 and has been available statewide since November 2023. South Australia - the first in the country to launch a digital driver's licence - has had one since 2017!
Why Solid as a Holder Service should be taken seriously
What is Solid
Solid is a standard for data storage on the Web - primarily created to allow individuals to store their personal data separately from websites. This enables re-use of data across platforms, and better control over consent management. Solid is now becoming an official W3C Standard under the Linked Web Storage Working Group.
Solid has three key features: Solid-OIDC, enabling Single Sign-On similar to the way we "Sign in with Google"; a standard HTTP interface for applications to read and write data to a Personal Online Datastore (Pod); and access controls so users can manage who can read and write data to their Pod.
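As a flavour of that HTTP interface, here is a minimal sketch using Inrupt's open-source client libraries - the Pod URL and profile shape are made up for illustration:

```ts
import { getSolidDataset, getThing, getStringNoLocale } from "@inrupt/solid-client";
import { fetch } from "@inrupt/solid-client-authn-browser";

// After a Solid-OIDC login, fetch a dataset from the user's Pod over
// the standard HTTP interface
const dataset = await getSolidDataset("https://alice.example.org/profile/card", { fetch });
const profile = getThing(dataset, "https://alice.example.org/profile/card#me");

// Read a single property (here, the user's name) from the profile document
const name = profile && getStringNoLocale(profile, "http://xmlns.com/foaf/0.1/name");
console.log(name);
```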
The Disclaimer
Now let me be upfront about the bias here. I work with Solid - a lot.
I lead work on Solid at the Open Data Institute, which stewards all open-source work on the Solid Project; am a doctoral student in the Ethical Web and Data Architectures (EWADA) group at the University of Oxford; independently contribute to open-source projects for Solid technologies; and formerly worked as an Enterprise Software Engineer at Inrupt - a commercial implementor of Solid.
Solid as a Credential Holder
I am of the view that Solid Pods are ideal for use as holder services in the verifiable credential ecosystem. It is certainly possible: a Solid Wallet has already been donated to the Open Wallet Foundation, demonstrating how this can be implemented.
The advantages of using Solid as a Holder service are as follows:
Portability of credentials
Socially Aware Cloud Storage, Design Issues, Tim Berners-Lee
We've become accustomed to living in a world of data silos - so much so that we barely notice it anymore. We find ourselves entering and re-entering the same basic mobile, email and date-of-birth details on every website that we visit; and we find ourselves reconstructing the same set of contacts across Instagram, Facebook, Twitter, LinkedIn, WhatsApp - the list goes on …
Solid was created to solve this problem, providing a standard way of reading and writing data to personal cloud storage. The way it works is simple: when you log in to a Solid-compatible website with Single Sign-On, all the personal data that you create gets saved to the store - and is made accessible to any other Solid-compatible application that you use the second you hit "consent for data usage." Much easier!
The existing Apple Wallet gives us a pretty good sense of the current trajectory for digital wallets and credentials, which is:
- I buy a GWR train ticket on my GWR App,
- I click "Add to Apple Wallet"
Easy! But what if:
- I bought the ticket on a PC, and saved it to Google Wallet instead? Or,
- My phone dies, and I want to access the ticket from a friend's phone?
Then life is going to be a lot more difficult, because companies such as Apple want to keep these tickets closed within their ecosystem - just as they don't want your contacts or photos to leave their ecosystem.
The good news 🎉 is that the Solid specification can be used here too - so we have a chance to intervene before this even becomes a problem.
Standard Web interface for transferring credentials
There are a lot of ways that credentials can be transferred. The ISO Mobile driving licence (mDL) standard alone defines the following credential exchange mechanisms within its standards document:
- QR Code
- Near-field communication (NFC)
- Bluetooth Low Energy (BLE)
- Wi-Fi Aware
- OpenID Connect (OIDC), or
- WebAPI - an HTTP interface defined within the mDL specification itself, specifically describing how mDLs can be transported.
Additionally, the OpenID Foundation has defined flows for credential issuance (OID4VCI) - which supports issuers sending data to holders - and presentation (OID4VP) - which defines how verifiers can request credentials from holders. These OpenID flows are designed to support the transfer of any form of W3C or ISO Verifiable Credential.
The OID4VP specification even supports the Digital Credentials Query Language (DCQL), allowing the verifier to query for and filter the contents of credentials - producing a particular presentation that conforms to the verifier's query.
… that's a lot of standards!
Whilst this has been happening, W3C groups have also been busy defining better ways for browsers to operate with your confidential data - including digital credentials. Credential Management Level 1 (creative naming!) "describes an imperative API enabling a website to request a user's credentials from a user agent, and to help the user agent correctly store user credentials for future use". The credentials in scope for this specification include passwords, one-time passcodes, and digital credentials such as Verifiable Credentials.
On top of this, API specifications such as the Credential Handler API (CHAPI) are being developed.
Given this plethora of transfer standards for digital credentials, one has to ask: why propose Solid as yet another set of interfaces for transferring data to and from holder services?
Well, there are a few reasons:
- The Solid APIs are symmetric with the file system. This means that you can view your credentials through a file-system-like interface - whilst they live in cloud storage that you manage.
- Whilst credentials can be made available through the Credential Handler API in the browser, they can also be accessed directly from personal cloud storage by service providers who have been granted consent to access the credential.
- Solid provides a means to completely decouple consent-management interfaces from the holder of credentials, thanks to its standardised access control mechanisms (sketched below).
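To illustrate that last point, here is a hedged sketch using the universalAccess helpers from Inrupt's solid-client library - the URLs are illustrative:

```ts
import { universalAccess } from "@inrupt/solid-client";
import { fetch } from "@inrupt/solid-client-authn-browser";

// Grant a verifier (identified by their WebID) read access to a credential
// stored in the Pod - no wallet vendor needs to be involved, and the grant
// can be revoked later by setting { read: false }
await universalAccess.setAgentAccess(
  "https://alice.example.org/credentials/dphil-award", // resource in the Pod
  "https://verifier.example.org/profile#agent",        // the verifier's WebID
  { read: true },
  { fetch },
);
```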
But most importantly, Solid is about far more than just credentials. In his recent article in the Financial Times, Sir Tim Berners-Lee discussed how Solid is really about taking back control of all of your data - including your social media contacts, your financial data, and your health records - so that you can be empowered with this data. This empowerment can be as simple as revoking access to your data from platforms you no longer trust, or porting your Facebook contacts over to Reddit, through to the use of trusted AI agents that support your wellbeing.
Queryability of Verifiable Credentials
Let me again present my biases upfront. The last five years of my work and research have revolved around Semantic Web technologies - my current research is on the very topic of Queryable Credentials, and I recently gave a talk on this topic at FOSDEM (video below).
So when I heard that there was a Digital Credentials Query Language (DCQL) as part of the OID4VP specification I was thrilled - but sadly that feeling was short-lived. This is because the expressivity of DCQL is largely restricted to filtering operations (sketched after the following list) to determine:
- Which Verifiable Credentials to include as part of a Verifiable Presentation
- Which subset of attributes from those Credentials to include in the Verifiable Presentation
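To give a flavour, here is roughly the shape of a DCQL query requesting only an over-18 attestation from an ISO mDL - the field names follow a draft of OID4VP and may well change as the specification evolves:

```ts
// A sketch of a DCQL query (based on a draft of OID4VP - names may change):
// ask for a single mDL credential, disclosing only the age_over_18 element
const dcqlQuery = {
  credentials: [
    {
      id: "my_licence",
      format: "mso_mdoc",
      meta: { doctype_value: "org.iso.18013.5.1.mDL" },
      claims: [{ path: ["org.iso.18013.5.1", "age_over_18"] }],
    },
  ],
};
```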
Ok, but surely there must be a little more capability in the current specifications than this - after all, it is promised that you can prove your age without revealing your date of birth when using digital driver's licenses - so there must be some way of querying for age …
… right?
Wrong. Taking a deeper look at the ISO Mobile driving licence (mDL) specification reveals that the issuer (e.g. the DVLA) has to explicitly sign statements about your age; so my digital driver's license might look something like this:
```json
{
  ...
  "age": "00-00-2000",
  "is_over_18": true,
  "is_over_21": true,
  "is_over_65": false,
  ...
}
```
This means that:
- I have to tell the issuer (DVLA) that I want to prove I'm over 18 - when this isn't something they need to know.
- I am reliant on the issuer (DVLA) to issue these statements - so if my driving authority doesn't want to issue is_over_21 statements, I may be forced to reveal my age. Whilst this is less problematic - and less likely - in the case of age, it is an issue when trying to prove any non-standard derivation: for example, proving non-Caucasian ethnicity without revealing the minority population that you belong to.
- I cannot tell the verifier (e.g. my future employer at the tomato farm) about information that can be derived from multiple credentials. Want to prove to a car hire agency that you can drive in the UK without giving them details from your license, visa and passport? Then you're out of luck!
This is a far cry from the kind of derivations that can be performed using semantic reasoning and query engines. The good news is that it is technically feasible for the holder (you) to do derivations such as this one and then hand them to the verifier - that's what the below talk is about.
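To make the idea concrete, here is a hypothetical sketch of such a holder-side derivation - all names and shapes are illustrative, not drawn from any standard:

```ts
// Hypothetical holder-side derivation: combine claims from two verified
// credentials into a single derived statement for the verifier
interface VerifiedClaims {
  issuer: string;
  claims: Record<string, unknown>;
}

const licence: VerifiedClaims = {
  issuer: "https://dvla.example/",
  claims: { canDriveCategoryB: true },
};
const visa: VerifiedClaims = {
  issuer: "https://homeoffice.example/",
  claims: { validUntil: "2027-06-01" },
};

// The derived statement - in a production system this would be wrapped in a
// cryptographic proof (e.g. a zero-knowledge proof) rather than asserted in
// the clear, so the underlying attributes stay hidden
const canDriveInUK =
  licence.claims.canDriveCategoryB === true &&
  new Date(visa.claims.validUntil as string) > new Date();
console.log(canDriveInUK);
```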
The bigger challenge is now to get the technology production ready and standardized.
Does it even make sense to be talking about credentials?
As I discuss further here, whilst it is sensible to talk about credentials to end-users of applications, it is my position that credentials are the wrong object to be working with at a standards level. Instead, we should be talking about datasets of facts, with metadata such as signatures and proofs that attest to their integrity.
Further Reading
In producing this article I came across a number of useful materials; here is my top selection for further reading: