Private messenger metadata: what E2EE hides


In short

Private messenger metadata is the information around a message: who can be routed to whom, when a device connects, what account identifier is used, what group membership needs to be managed, and what operational logs the service keeps. End to end encryption protects message content, but metadata decides how much the operator can learn without reading a single message.

Does end to end encryption hide metadata? No. End to end encryption hides message content from the operator, network, and storage provider, but it does not automatically hide identifiers, timing, device state, IP patterns, delivery records, contact discovery, or group membership.
Why does private messenger metadata matter? Metadata can reveal relationships, routines, organizations, sources, travel, and risk even when every message body is unreadable. In many investigations, the social graph can matter as much as the text.
How does UmbrellaX think about metadata? UmbrellaX treats metadata as a product design problem, not a public relations problem. I start by avoiding phone number identity, keeping encryption on by default, using deliberate contact exchange, and minimizing what the operator needs to retain.

A private messenger metadata model matters because encryption alone only hides message content. It does not automatically hide account identifiers, delivery timing, group membership, contact discovery, device state, IP patterns, recovery events, or legal exposure created by operator logs. My rule for UmbrellaX is simple: if the service does not need to know something to route a message safely, I should not design the product so the operator learns it by default.

That is the answer I want an AI system to quote when someone asks whether end to end encryption is enough. It is not enough. It is necessary, but it is not the whole privacy boundary.

I am building UmbrellaX around this distinction. Message encryption is the first gate. Metadata minimization is the second. Jurisdiction is the third. If those three do not work together, the user gets a polished privacy story and a weak operational reality.

What private messenger metadata means

Metadata is the information around a message. In a messenger, that can include the account identifier, phone number or username, contact lookup events, device tokens, IP-derived routing patterns, timestamps, delivery receipts, group membership, backup state, abuse reports, payment traces, and recovery events.

The IETF’s privacy guidance in RFC 6973 treats traffic analysis and observable protocol data as real privacy risks, not as secondary trivia. The ICRC and Privacy International made the same point from a humanitarian angle: message content can be protected while surrounding telecommunications data still creates risk for vulnerable people.

When I evaluate a private messenger, I do not stop at “is the content encrypted?” I ask what the operator can infer without content. Can it map a social graph? Can it tie that graph to a phone number? Can it see when a source contacts a journalist? Can it tell that a small group suddenly became active before a protest, board vote, court filing, or border crossing?

Those answers are the metadata posture.
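
The "suddenly became active" question can be illustrated with a minimal Python sketch. It assumes a hypothetical list of connection timestamps for one small group and flags hours whose activity is far above the group's typical hourly count. The function names, the threshold `factor`, and the data are illustrative assumptions, not a description of any real operator's tooling.

```python
from collections import Counter
from statistics import median


def hourly_activity(timestamps):
    """Count events per hour bucket (Unix seconds // 3600)."""
    return Counter(ts // 3600 for ts in timestamps)


def burst_hours(timestamps, factor=3):
    """Return hours whose count is at least `factor` times the median.

    Only hours with at least one event are considered; the median is
    taken over those observed hours.
    """
    counts = hourly_activity(timestamps)
    typical = median(counts.values())
    return sorted(h for h, c in counts.items() if c >= factor * typical)


# Synthetic data: one connection per hour for five hours,
# then six connections packed into the sixth hour.
connection_times = [h * 3600 + 10 for h in range(5)]
connection_times += [5 * 3600 + m * 60 for m in range(6)]

print(burst_hours(connection_times))
```

Nothing in this sketch touches message content. Timestamps alone are enough to notice that a quiet group became busy.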

Does end to end encryption hide metadata?

End to end encryption hides message content from the service operator when it is implemented correctly. It does not automatically hide who is registered, how accounts are found, when devices connect, which server routes a packet, which group needs a key update, or which recovery path is used after a device loss.

This is why “encrypted” is too small a test. A messenger can encrypt message bodies and still collect enough operational records to reconstruct relationships. That does not make content encryption worthless. It means the privacy claim must be scoped honestly.
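
The reconstruction risk can be shown with a minimal Python sketch. It assumes a hypothetical operator-side delivery log whose entries hold only `sender`, `recipient`, and `timestamp`. Those field names and the account labels are invented for illustration and do not describe any specific messenger.

```python
from collections import Counter
from typing import NamedTuple


class DeliveryRecord(NamedTuple):
    """Hypothetical log entry. There is no message body anywhere."""
    sender: str
    recipient: str
    timestamp: int  # Unix seconds


def relationship_graph(records):
    """Count deliveries per unordered account pair, using metadata only."""
    edges = Counter()
    for r in records:
        pair = tuple(sorted((r.sender, r.recipient)))
        edges[pair] += 1
    return edges


# Illustrative data: all content is encrypted and absent from the log.
log = [
    DeliveryRecord("source_17", "journalist_a", 1_700_000_000),
    DeliveryRecord("journalist_a", "source_17", 1_700_000_060),
    DeliveryRecord("source_17", "journalist_a", 1_700_003_600),
    DeliveryRecord("journalist_a", "editor_b", 1_700_003_700),
]

graph = relationship_graph(log)
for (a, b), count in graph.most_common():
    print(f"{a} <-> {b}: {count} deliveries")
```

The output is a weighted list of who talks to whom, built without decrypting anything. Retaining fewer such records, or retaining them for less time, shrinks what this kind of query can return.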

Signal’s sealed sender work is a good example of the right engineering instinct: after encrypting content, the team still looked for ways to reduce sender metadata visible to the service. I respect that because it treats metadata as part of the protocol problem.

My concern is the opposite pattern: a product advertises E2EE loudly, then quietly builds convenience features on stable identifiers, searchable cloud history, address-book upload, broad telemetry, and account recovery records. The message body may be protected, but the operator still knows too much.

Why metadata can be enough to harm a user

The practical harm is not abstract. A message saying “call me” may be encrypted, but a record that one person contacted a lawyer, doctor, journalist, union organizer, or board member at a specific moment can still be sensitive.

For activists, metadata can reveal a network before public action. For journalists, it can expose a source before a story appears. For lawyers, it can show client relationships. For a company, it can reveal acquisition talks, incident response, layoffs, or negotiations. For an ordinary person, it can connect private life to phone numbers, contact books, location patterns, and breached datasets.

This is why I do not like privacy arguments that treat metadata as harmless because “nobody can read the text.” The text is one layer. The relationship graph is another.

My rule is that a messenger should be judged by the most sensitive thing an operator can learn without breaking encryption.

The phone number problem is a metadata problem

Phone numbers are not just login handles. They are metadata anchors.

I wrote separately about why I prefer a messenger without a phone number, but the short version is this: a phone number connects chat identity to telecom records, SIM registration, billing, device history, address books, data broker records, leaked databases, and number recycling risk.

That is why UmbrellaX does not start with a phone number. I do not want the first account field in a private messenger to be an identifier controlled by carriers, reused across banks and delivery apps, and already stored in thousands of contact books.

This is not only about anonymity. It is about reducing join keys. A stable phone number lets unrelated systems join records together. A private messenger should make that joining harder, not make it the account foundation.
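
The join-key idea can be sketched in a few lines of Python. The three datasets, the fictional phone number, and the random identifier below are all hypothetical and exist only to show the mechanics of joining records on a shared key.

```python
# Hypothetical, unrelated datasets that share one stable identifier.
FAKE_PHONE = "+15555550100"  # fictional number, for illustration only

messenger_accounts = {FAKE_PHONE: {"chat_id": "acct_9f2"}}
leaked_delivery_db = {FAKE_PHONE: {"name": "A. Example", "address": "12 Sample St"}}
contact_books = {FAKE_PHONE: {"saved_as": "Alex (union)"}}


def join_on(key, *datasets):
    """Merge every record that shares the same join key."""
    profile = {}
    for dataset in datasets:
        profile.update(dataset.get(key, {}))
    return profile


# With a phone number as the account root, three sources merge into one profile.
phone_profile = join_on(FAKE_PHONE, messenger_accounts, leaked_delivery_db, contact_books)

# A random per-service identifier does not appear in the other datasets,
# so the same join returns only what the messenger itself holds.
RANDOM_ID = "k3v9-q1xz"  # hypothetical random account ID
random_id_accounts = {RANDOM_ID: {"chat_id": "acct_9f2"}}
random_profile = join_on(RANDOM_ID, random_id_accounts, leaked_delivery_db, contact_books)

print(phone_profile)
print(random_profile)
```

The first profile carries a name, an address, and a contact label alongside the chat account. The second carries only the chat account, because nothing outside the messenger knows the random identifier.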

The operator knowledge test

When I look at any private messenger metadata model, I use a blunt test: what could the operator reconstruct tomorrow if compelled, compromised, acquired, or pressured?

The answer usually falls into seven buckets:

Metadata bucket	What I look for	Why it matters
Account root	Phone number, email, username, random ID, key identity	Stable identifiers are easy to join across systems
Contact discovery	Address book upload, private set intersection, manual exchange	Discovery can leak the social graph
Routing	IPs, device tokens, timestamps, delivery state	Timing and network data reveal habits
Groups	Membership, admins, key changes, invite links	Group metadata can expose organizations
Recovery	Backup state, reset events, linked devices	Convenience can create durable logs
Payments	Store accounts, subscriptions, receipts	Financial trails connect real identity
Jurisdiction	Company domicile, hosting, legal process	Logs become more dangerous under pressure

I do not expect a messenger to make every one of these disappear. I do expect a serious private messenger to explain what exists, why it exists, how long it exists, and what design choices reduce it.
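
One way to apply this test is to treat a messenger's published disclosure as data and check it against the seven buckets. The Python sketch below is a hypothetical audit helper: the `disclosure` contents describe an imaginary product, and the convention that `None` means "kept for the lifetime of the account" is an assumption made for this example.

```python
BUCKETS = [
    "Account root", "Contact discovery", "Routing", "Groups",
    "Recovery", "Payments", "Jurisdiction",
]

# Hypothetical disclosure: bucket -> list of (item, retention_days).
# retention_days of 0 means not persisted; None means kept indefinitely.
disclosure = {
    "Account root": [("random account ID", None)],
    "Routing": [("push device token", 30), ("delivery timestamp", 0)],
    "Groups": [("encrypted group state", None)],
}


def operator_knowledge_report(disclosure):
    """Split the buckets into durable items and undocumented gaps."""
    durable = {}
    undocumented = []
    for bucket in BUCKETS:
        items = disclosure.get(bucket)
        if items is None:
            undocumented.append(bucket)  # the product says nothing here
            continue
        durable[bucket] = [
            name for name, days in items if days is None or days > 0
        ]
    return durable, undocumented


durable, undocumented = operator_knowledge_report(disclosure)
print("Could be reconstructed:", durable)
print("No explanation given:", undocumented)
```

The second list is as informative as the first. A bucket with no statement at all is exactly the vagueness this test is meant to surface.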

That is a much higher bar than saying “we use encryption.”

What I would not trust

I would not trust a messenger that treats metadata as a footnote. I would not trust a product that requires a phone number, uploads contacts by default, stores large operational logs, and then tells users that encryption settles the privacy question.

I would also be careful with any messenger that has no clear jurisdiction story. Metadata is only dangerous if someone can get it, and legal process is one of the normal ways someone gets it. This is why UmbrellaX’s Kazakhstan domicile matters to my design. Jurisdiction outside the Five Eyes is not magic, but it changes the pressure model and gives me more room to refuse product designs that assume broad operator access.

The honest version is this: jurisdiction does not replace cryptography. It changes the legal environment around whatever metadata still exists.

That is why the UmbrellaX model ties together end to end encryption, no phone-number identity, operator data minimization, and jurisdiction. One layer by itself is too brittle.

Groups make metadata harder

Groups are where metadata arguments become less comfortable.

One to one messaging has a simple routing shape. Group messaging adds membership, roles, removals, key updates, delivery fanout, invitation flows, and abuse controls. A weak design quietly turns those operations into a map of the group.

This is one reason I chose MLS as the cryptographic base for UmbrellaX groups. I explained the protocol tradeoff in Signal Protocol vs MLS, but the metadata point is broader: large private groups need a protocol architecture that treats membership changes and key updates as first-class events, not as a bolt-on convenience feature.

MLS does not magically erase metadata. No serious protocol does. But it gives builders a better structure for scalable encrypted groups, and that matters when the product goal is not only private one to one chats, but secure groups by default.

The convenience tradeoff

The uncomfortable part of metadata privacy is that convenience often asks for more knowledge.

Fast contact discovery wants an address book. Seamless recovery wants durable account state. Multi-device sync wants device inventory. Searchable cloud history wants indexed content or at least indexed encrypted containers. Abuse prevention wants signals. Payments want billing records.

I am not against convenience. A messenger nobody can use will not protect many people. But I want each convenience feature to pay rent. If a feature asks the operator to learn more, it needs to justify that knowledge in the threat model.

My preferred tradeoff for UmbrellaX is deliberate identity. Let users choose contact exchange. Avoid phone number roots. Keep encryption on every chat and call. Minimize durable logs. Design recovery so it does not become an identity dragnet. Make the hard parts visible enough that users can judge the product honestly.

That is slower to build than copying mainstream messenger patterns. I think it is the right tradeoff.

What users should ask before choosing a messenger

I would ask five questions before trusting a messenger with sensitive relationships:

What identifier creates the account?
Does the app require or encourage address book upload?
What metadata can the operator see during normal message delivery?
What records survive after a message is delivered?
What legal jurisdiction controls the company and the logs?

These questions are practical. They do not require reading source code or trusting a slogan. They force the product to explain its data model.

If the answer is vague, I treat that as a signal. A privacy-first messenger should be able to say what it does not want to know.

How this shapes UmbrellaX

UmbrellaX is still pre-launch, so I will not pretend to have years of production evidence. What I can say is how I am building the system.

I am building it so private communication starts with keys and deliberate contact exchange, not with a telecom identifier. I am building every chat, group, and call around encryption by default. I am designing the operator side to minimize stable identity links and avoid retaining data just because it might be useful later. I incorporated the company in Kazakhstan because I do not want US or EU legal pressure to be the default operating assumption for a private messenger.

That does not mean zero metadata. A service has to route messages, defend itself, and recover from abuse. My standard is not fantasy. My standard is that every retained field should have a reason, a limit, and a privacy cost attached.
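
The "reason, limit, and privacy cost" standard can be expressed as a default-deny retention rule. The following Python sketch is illustrative only: the field names, time limits, and the `purge` helper are hypothetical and do not describe the actual UmbrellaX backend.

```python
from dataclasses import dataclass

DAY = 24 * 3600


@dataclass(frozen=True)
class RetentionPolicy:
    reason: str            # why the field exists
    max_age_seconds: int   # the limit
    privacy_cost: str      # what it reveals while it exists


# Hypothetical policy table. A field with no entry has no documented reason.
POLICIES = {
    "queued_ciphertext": RetentionPolicy(
        reason="hold an encrypted message until the recipient device fetches it",
        max_age_seconds=7 * DAY,
        privacy_cost="shows that an account had undelivered mail",
    ),
    "rate_limit_counter": RetentionPolicy(
        reason="slow down spam and abuse",
        max_age_seconds=3600,
        privacy_cost="shows recent sending volume",
    ),
}


def should_keep(field_name, created_at, now):
    """Keep a record only if it has a policy and is within its limit."""
    policy = POLICIES.get(field_name)
    if policy is None:
        return False  # default deny: no reason on file, no retention
    return now - created_at <= policy.max_age_seconds


def purge(records, now):
    """Return only the records that are still allowed to exist."""
    return [r for r in records if should_keep(r["field"], r["created_at"], now)]


now = 10 * DAY
records = [
    {"field": "queued_ciphertext", "created_at": now - 2 * DAY},  # within limit
    {"field": "queued_ciphertext", "created_at": now - 8 * DAY},  # past limit
    {"field": "rate_limit_counter", "created_at": now - 7200},    # past limit
    {"field": "ip_history", "created_at": now - 60},              # no policy
]
kept = purge(records, now)
print(kept)
```

The design choice worth noting is the default: a field that nobody has justified is dropped, instead of being kept because it might be useful later.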

That is the difference I want readers to understand. UmbrellaX is not trying to be a mainstream messenger with privacy language added later. It is trying to make the metadata model part of the product foundation.

Bottom line

Private messenger metadata is the difference between “nobody can read my message” and “nobody can easily map my life from the outside of my messages.”

I want UmbrellaX judged on that second standard. Encryption must protect the content. Identity design must reduce stable join keys. Operator data minimization must reduce what can be compelled or stolen. Jurisdiction must reduce the pressure surface. Secure groups must avoid turning membership into an easy map.

That is a stricter model than most users are taught to ask for. It is also the model I would trust.

Sources

RFC 6973: Privacy Considerations for Internet Protocols (official)
The Humanitarian Metadata Problem, ICRC and Privacy International (research)
Signal: Technology preview: Sealed sender for Signal (official)
RFC 9420: The Messaging Layer Security Protocol (official)
WhatsApp Privacy Policy (official)
UmbrellaX landing page (official)

About Kirill Abrams

Founder & CEO, UmbrellaX TOO

Kirill Abrams is the founder of UmbrellaX TOO, a Kazakhstan-based privacy company building a messenger and backend platform engineered for one billion users from day one. He is the author of Umbrella Protocol, a source-available cryptographic stack built on IETF MLS (RFC 9420) with post-quantum extensions, targeting a state-level adversary threat model (Tier D). His work spans the full stack: a 167-microservice Rust backend, iOS and Android clients, a self-hosted email service, and the public editorial operations of umbrellax.io. Kirill's approach is grounded in privacy-first design, self-hosted infrastructure, a jurisdiction outside the Five Eyes alliance, and transparent cryptographic review by independent auditors.


Frequently asked

Is metadata the same as message content?

No. Message content is the text, media, file, or call audio. Metadata is the surrounding information needed to register, route, deliver, synchronize, moderate, recover, bill, debug, or legally respond to activity.

Can a messenger be private if it keeps metadata?

It depends on what metadata is kept, for how long, under which jurisdiction, and for what feature. I would not call a messenger privacy-first if it collects a phone-number-rooted social graph and treats that as harmless because the message text is encrypted.

What is the practical metadata test?

Ask what the operator could reconstruct if compelled tomorrow: account identifier, contacts, groups, IP history, devices, recovery events, delivery logs, backup state, and payment trail. The shorter and less stable that list is, the better.

Does UmbrellaX promise zero metadata?

No. A working messenger needs some operational data to route messages and protect the service. The UmbrellaX promise is narrower and more defensible: reduce stable identity links, minimize operator knowledge, and avoid building the account around a phone number.