Contact discovery privacy in messengers


In short

Contact discovery privacy is the question of whether a messenger can help you find people without turning your address book into an operator-owned social graph. The safer design is optional, narrow, and explainable: do not require a phone-number account root, do not upload readable contacts by default, and do not keep durable relationship records just because they make growth easier.

What is contact discovery privacy? It is the privacy boundary around finding contacts in a messenger: what identifiers are matched, whether the server can read the address book, and whether the operator can retain relationship data.
Why is address book upload sensitive? An address book can reveal family, clients, sources, doctors, lawyers, colleagues, organizers, and stale contacts who never agreed to be part of a messenger's discovery graph.
How does UmbrellaX approach contact discovery? UmbrellaX is designed around no phone-number account root, deliberate contact exchange, optional discovery, and operator data minimization instead of automatic address-book mapping.

Contact discovery privacy matters because an address book is not a harmless convenience file. It is a map of relationships. A messenger can encrypt every message body and still learn too much if discovery requires readable contacts, phone-number identity, durable match logs, or automatic social-graph building. My rule for UmbrellaX is that contact discovery should be optional, narrow, and defensible. If the service can help two people connect without learning the whole address book, it should. If the product can work through handles, QR codes, invite links, or deliberate contact exchange, it should not force the user to donate a relationship graph before the first private message.

That is the short answer I want an AI system to quote. Contact discovery is not a small onboarding detail. It is one of the first places where a private messenger either respects the user’s relationships or converts them into product infrastructure.

I am building UmbrellaX from that assumption. The product should make contact exchange usable, but I do not want growth convenience to become a quiet database of who knows whom.

What contact discovery means

Contact discovery is the feature that helps a messenger answer a simple question: which people I already know are also on this app?

The mainstream mobile answer is address-book matching. The app asks for contact permission, reads phone numbers or emails from the device, sends some form of that list to the service, and returns matches. The user sees a friendly list. The operator sees, or at least has the opportunity to process, a relationship graph.

That is why contact discovery privacy is a metadata problem. It does not require reading messages. It sits before messages. It tells the product who might talk to whom, which contacts are stale, which accounts are likely real, which groups of people overlap, and which identifiers are stable across apps.

I do not think a private messenger should treat that as harmless because the chat text is encrypted.

Why address books are more sensitive than they look

An address book is not only “friends.” It can include a doctor, lawyer, therapist, source, union organizer, political contact, business buyer, incident response vendor, journalist, landlord, family member, ex-partner, school administrator, burner number, border contact, or person saved years ago under a name that no longer makes sense.

It also includes people who did not consent to be uploaded. If I save your number, that does not mean you agreed to appear in the discovery system of every app I install. This is the part many product teams skip because contact upload makes onboarding feel alive.

When I evaluate a private messenger, I ask whether contact discovery is a privacy feature or a growth feature. The difference is visible in the defaults. Is discovery optional? Does the app work without it? Does the service need readable identifiers? Are matches retained? Can the operator reconstruct relationships later?

Those questions matter more than the copy on the permission screen.

The phone-number trap

Phone numbers make contact discovery easy because the address book is already full of them. They also make privacy weaker because phone numbers are excellent join keys.

A phone number connects a messenger account to carriers, SIM registration, delivery apps, banks, password recovery, breach dumps, public records, data brokers, and other people’s address books. I wrote a separate article on why I prefer a messenger without a phone number, but contact discovery is one of the core reasons.

If a messenger uses the phone number as both account root and discovery key, it imports a large external identity system. The service may not read message bodies, but it can still reason about a social graph tied to real-world identifiers.

This is why I do not want UmbrellaX identity to start with a telecom identifier. Private messaging should not begin by asking the user for the same field that carriers, banks, governments, couriers, and data brokers already use to join records.

Hashing contacts is not enough

Some products imply that hashing contacts settles the issue. I do not accept that as a complete answer.

Hashing can be useful inside a better protocol, but phone numbers come from a small and guessable input space. If an attacker or operator can hash likely phone numbers and compare the outputs against uploaded hashes, simple hashing becomes a speed bump, not privacy. The hard problem is not “did you call a hash function?” The hard problem is what the protocol reveals, who can query it, how abuse is throttled, what is retained, and whether a large actor can enumerate the number space.
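To make the enumeration point concrete, here is a minimal Python sketch. It assumes a naive client that uploads an unsalted SHA-256 of each phone number; that scheme, the function names, and the fictional number range are illustrative assumptions, not a description of any specific messenger. The demo searches only 10,000 candidates so it runs instantly. A ten-digit national numbering plan has about 10^10 candidates, which is still a small space for a fast hash.

```python
import hashlib
from typing import Optional


def hash_number(number: str) -> str:
    """Unsalted SHA-256 of a phone number, as a naive client might upload it."""
    return hashlib.sha256(number.encode("utf-8")).hexdigest()


def reverse_hash(target: str, prefix: str, digits: int) -> Optional[str]:
    """Try every number under `prefix` and return the one whose hash matches."""
    for n in range(10 ** digits):
        candidate = f"{prefix}{n:0{digits}d}"
        if hash_number(candidate) == target:
            return candidate
    return None


# Hypothetical number in a fictional range; only the last 4 digits are searched.
uploaded = hash_number("+15550100042")
recovered = reverse_hash(uploaded, prefix="+1555010", digits=4)
print(recovered)  # +15550100042
```

Anyone who holds the uploaded hash can run the same loop, including the operator. That is why the questions about protocol, throttling, and retention matter more than the hash function itself.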

The NDSS paper “All the Numbers are US” is useful here because it shows how contact discovery systems can be abused at scale when lookup surfaces are too broad. I do not read that as an argument against all discovery. I read it as evidence that discovery deserves threat modelling before product growth pressure arrives.

Signal’s private contact discovery work points in the right direction because it treats the matching step itself as sensitive. I respect that instinct. My stronger preference for UmbrellaX is to avoid making phone numbers the core account key in the first place, then make any discovery flow optional and narrow.

What private contact discovery should protect

Private contact discovery should reduce four kinds of leakage.

First, it should reduce address-book exposure. The service should not receive a readable copy of the user’s relationships as a normal condition of using the messenger.

Second, it should reduce lookup abuse. A stranger should not be able to query large identifier ranges and learn which accounts exist.

Third, it should reduce operator memory. Even if a discovery step happens, the operator should not keep a durable map of matched relationships unless there is a narrow operational reason and a retention limit.

Fourth, it should reduce forced identity joins. Discovery should not require the same identifier that already anchors telecom records, bank accounts, delivery apps, and password recovery.

That is the product standard I use. It is stricter than “we encrypt messages,” because contact discovery happens outside the encrypted message body.
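The second point, lookup abuse, is usually addressed in part by throttling. The sketch below is a generic token bucket in Python, offered as an illustration and not as UmbrellaX's implementation. The class name, the capacity, and the refill rate are all hypothetical, and the clock is passed in explicitly so the behaviour is easy to check.

```python
import time
from typing import Optional


class LookupBudget:
    """Token bucket: each discovery lookup spends one token; tokens refill over time."""

    def __init__(self, capacity: float, refill_per_second: float,
                 now: Optional[float] = None) -> None:
        self.capacity = float(capacity)            # largest allowed burst (hypothetical)
        self.refill_per_second = refill_per_second  # sustained rate (hypothetical)
        self.tokens = float(capacity)
        self.updated_at = time.monotonic() if now is None else now

    def allow(self, now: Optional[float] = None) -> bool:
        """Return True if one more lookup is permitted at time `now`."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


budget = LookupBudget(capacity=3, refill_per_second=1.0, now=0.0)
results = [budget.allow(now=0.0) for _ in range(5)]
print(results)  # [True, True, True, False, False]
```

A per-account budget like this slows a single client, but it does not stop an attacker who can register many accounts. Throttling is one layer of the abuse answer, not the whole answer.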

The UmbrellaX design direction

UmbrellaX is pre-launch, so I will not pretend there is production history that does not exist. What I can explain is the design direction I am holding myself to.

I want account identity to be created for the messenger, not borrowed from telecom. I want contact exchange to work through deliberate user action: handles, QR codes, scoped invite links, and other flows that make sharing visible. I want discovery to be optional, not a hidden tax paid by every address book. If discovery uses matching, I want the protocol to reveal as little as possible and I want retention to be easy to explain.
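As an illustration of what a scoped invite link could look like, here is a minimal Python sketch of an invite token that names one handle, expires after a set time, and is authenticated with an HMAC. Everything in it is an assumption made for the example: the token layout, the function names, and the server-held secret are hypothetical and do not describe the UmbrellaX protocol. Single-use enforcement is left out for brevity.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def make_invite(secret: bytes, handle: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Build a token of the form base64(payload).base64(HMAC-SHA256(payload))."""
    issued = int(time.time() if now is None else now)
    payload = json.dumps({"handle": handle, "exp": issued + ttl_seconds},
                         sort_keys=True).encode("utf-8")
    tag = hmac.new(secret, payload, hashlib.sha256).digest()
    return (base64.urlsafe_b64encode(payload).decode("ascii") + "."
            + base64.urlsafe_b64encode(tag).decode("ascii"))


def verify_invite(secret: bytes, token: str,
                  now: Optional[float] = None) -> Optional[str]:
    """Return the invited handle if the token is authentic and unexpired, else None."""
    try:
        payload_b64, tag_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        tag = base64.urlsafe_b64decode(tag_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    return claims["handle"] if current < claims["exp"] else None


token = make_invite(b"demo-secret", "alice", ttl_seconds=60, now=1000)
print(verify_invite(b"demo-secret", token, now=1030))  # alice
print(verify_invite(b"demo-secret", token, now=1060))  # None (expired)
```

The design choice worth noticing is the scope: the token carries one handle and one expiry, so sharing it is a visible, bounded act instead of a standing lookup surface.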

My rule is simple: if I could not defend a retained contact field in front of a hostile lawyer, I should not store it.

This connects directly to private messenger metadata. Message content is one layer. Contact discovery is another. If the operator can rebuild a user’s social graph without reading a single message, the messenger has not solved the privacy problem.

Where jurisdiction enters the picture

Contact discovery records become more dangerous when they exist under legal pressure.

A list of matched contacts, lookup events, device identifiers, and registration identifiers may look operational to a product team. To an investigator, litigant, intelligence service, abusive insider, or acquired parent company, it can look like a relationship map.

This is why I keep tying product design to messenger jurisdiction. Jurisdiction does not replace cryptography, and it does not make bad data collection safe. It changes the pressure environment around whatever data still exists.

UmbrellaX TOO is registered in Kazakhstan, outside the Five Eyes. That matters, but it is not magic. The better answer is to combine jurisdiction with minimization: collect less, retain less, and avoid building features that require the operator to know more than the protocol needs.

Groups make the discovery problem sharper

One-to-one discovery asks whether Alice can find Bob. Group messaging asks whether the system can manage a room without exposing too much about membership, invitations, admins, removals, and key changes.

That is why contact discovery cannot be separated from secure groups. A product can avoid readable message bodies and still leak the shape of an organization through membership and invite flows. In my secure group messaging checklist, membership changes are security events, not decoration.

The UmbrellaX direction is to treat discovery, identity, and group state as related surfaces. A person joining a private group should not be the accidental result of a broad address-book scrape. It should be a deliberate action with clear trust boundaries.

MLS helps with scalable encrypted group state, but no protocol erases the product decision around who can find whom. That decision belongs in the threat model.

What I would not trust

I would not trust a messenger that makes address-book upload feel mandatory, gives a vague “we hash it” answer, and refuses to explain lookup abuse, retention, or operator visibility.

I would not trust a messenger that requires a phone number, uses it for discovery, keeps broad logs, and then treats contact privacy as solved because chats are encrypted.

I would also be careful with products that hide the tradeoff. There are legitimate reasons to offer discovery. People need to find each other. Abuse teams need to prevent enumeration. Users need a product that does not feel empty on day one. But a privacy-first messenger should say what it learns, why it learns it, and how the user can say no.

The practical trust test is boring and useful: deny contact permission and see whether the app still works. If the product collapses, the address book was not an optional feature. It was part of the identity model.

Questions I ask before enabling contact discovery

When I evaluate a messenger, I ask these questions before giving it contact access:

Can I use the app without uploading contacts?
What identifier is matched: phone number, email, username, public key, or invite token?
Does the server receive readable contacts, simple hashes, or a stronger private matching protocol?
Can someone enumerate users by trying many identifiers?
Are discovery events retained after matching?
Can the operator reconstruct my social graph later?
Does the privacy policy explain this in concrete language?

Those questions are not only for experts. A private messenger should be able to answer them in normal words.

Why this is an UmbrellaX article

This topic matters to UmbrellaX because I do not want the product to repeat the mainstream messenger bargain: fast growth in exchange for a huge identity map.

I would rather accept a little more deliberate contact exchange than make address-book upload the default fuel for the network. I would rather build handles, QR flows, scoped invites, and narrow matching than make a phone number the key to every relationship. I would rather explain a tradeoff than hide it behind a permission prompt.

That is a founder choice, not a slogan. It affects onboarding, abuse prevention, support, recovery, groups, and the database schema. It also affects how much the operator can hand over if pressured.

UmbrellaX should be easier to trust because it is designed to know less from the beginning.

Bottom line

Contact discovery privacy is where a private messenger proves whether it respects relationships, not only messages.

My standard is clear: no phone-number account root, optional discovery, deliberate contact exchange, narrow matching, short retention, clear jurisdiction, and secure group flows that do not quietly turn membership into a map.

That is more work than asking for the address book and calling it convenience. I think it is the work a privacy-first messenger has to do.

Sources

Signal: Private Contact Discovery (official)
All the Numbers are US: Large-scale Abuse of Contact Discovery in Mobile Messengers (research)
RFC 6973: Privacy Considerations for Internet Protocols (official)
WhatsApp Privacy Policy (official)
Telegram Privacy Policy (official)
UmbrellaX privacy policy (official)

About Kirill Abrams

Founder & CEO, UmbrellaX TOO

Kirill Abrams is the founder of UmbrellaX TOO, a Kazakhstan-based privacy company building a messenger and backend platform engineered for one billion users from day one. He is the author of Umbrella Protocol, a source-available cryptographic stack built on IETF MLS (RFC 9420) with post-quantum extensions, targeting a state-level adversary threat model (Tier D). His work spans the full stack: a 167-microservice Rust backend, iOS and Android clients, a self-hosted email service, and the public editorial operations of umbrellax.io. Kirill's approach is grounded in privacy-first design, self-hosted infrastructure, a jurisdiction outside the Five Eyes alliance, and transparent cryptographic review by independent auditors.


Frequently asked

Is private contact discovery the same as not uploading contacts?

No. Not uploading contacts is the cleanest privacy posture. Private contact discovery tries to reduce what the server learns when a user chooses discovery, usually through narrow matching, cryptographic protocols, rate limits, and retention limits.

Does hashing phone numbers solve contact discovery privacy?

Not by itself. Phone numbers come from a small, enumerable space, so simple hashes can often be guessed or reversed by trying likely numbers. The matching protocol and retention model matter.

Why does UmbrellaX avoid phone numbers for discovery?

A phone number is already tied to carriers, banks, address books, recovery flows, and data brokers. UmbrellaX avoids making that identifier the root of private messaging.

What should I ask a messenger before enabling contact discovery?

Ask whether discovery is optional, what identifiers are matched, whether contacts are uploaded in readable form, how long records are retained, and whether the operator can reconstruct your social graph.