
Contact deduplication

Contact deduplication detects and merges duplicate address-book entries by matching normalized emails and phone numbers, plus fuzzy name comparison.

Duplicates are the natural state of an address book that has lived through a few phones. Every CSV import, every account you sync (Google, iCloud, Exchange, WhatsApp), and every business-card scan can mint a second copy of someone, because the systems involved share no common identifier. Deduplication is the corrective pass: find records that describe the same human, decide which one survives, and merge the rest into it without losing data.

The detection half runs on two kinds of matching. Exact-key matching normalizes a field first (lowercase and trim the email, convert the phone number to E.164, e.g. +4915123456789, so that "0151 234 56 789" and "+49 151 2345 6789" collide) and then compares. Fuzzy matching handles records that share no key: edit-distance or Jaro–Winkler similarity on names, usually restricted to pairs within the same block (for example, names sharing an initial letter) so the pass stays fast. Fuzzy matching is where false positives live: a father and son who are both called "Thomas Müller" match perfectly.
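The two kinds of matching described above can be sketched in a few lines of Python. This is a minimal illustration, not Endearist's implementation: the function names, the dictionary layout of a contact, the default country code "49", and the 0.85 threshold are all assumptions made for the example. The phone normalizer only handles the simple cases shown in the text (a production tool would more likely rely on a dedicated library such as `phonenumbers`), and the standard library's `difflib.SequenceMatcher` stands in for edit distance or Jaro–Winkler.

```python
import re
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def normalize_email(raw):
    """Lowercase and trim, as the text describes."""
    return raw.strip().lower()


def normalize_phone(raw, default_country_code="49"):
    """Simplified E.164 conversion (assumption: national numbers use a single
    leading trunk '0' and belong to default_country_code)."""
    text = raw.strip()
    digits = re.sub(r"\D", "", text)  # drop spaces, dashes, brackets
    if text.startswith("+"):
        return "+" + digits
    if digits.startswith("00"):       # international prefix written as 00
        return "+" + digits[2:]
    if digits.startswith("0"):        # national format: strip the trunk 0
        return "+" + default_country_code + digits[1:]
    return "+" + default_country_code + digits


def exact_key_groups(contacts):
    """Map each normalized key to the contact ids sharing it (groups of 2+ only)."""
    index = defaultdict(set)
    for contact in contacts:
        for email in contact.get("emails", []):
            index[("email", normalize_email(email))].add(contact["id"])
        for phone in contact.get("phones", []):
            index[("phone", normalize_phone(phone))].add(contact["id"])
    return {key: ids for key, ids in index.items() if len(ids) > 1}


def fuzzy_name_pairs(contacts, threshold=0.85):
    """Compare names only within blocks that share an initial letter."""
    blocks = defaultdict(list)
    for contact in contacts:
        name = contact.get("name", "").strip().casefold()
        if name:
            blocks[name[0]].append((contact["id"], name))
    pairs = []
    for members in blocks.values():
        for (id_a, name_a), (id_b, name_b) in combinations(members, 2):
            score = SequenceMatcher(None, name_a, name_b).ratio()
            if score >= threshold:
                pairs.append((id_a, id_b, score))
    return pairs
```

Note that `fuzzy_name_pairs` will happily report two distinct people with identical names as a perfect match, which is exactly the false-positive risk the paragraph warns about.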

The merge half is policy. Union the fields where possible (a person can have three phone numbers), prefer the most recently edited value where fields conflict, and always keep an undo path. An aggressive auto-merge that guesses wrong does more lasting damage than the duplicates ever did: duplicates are clutter, while a wrong merge mixes two people's data.
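A merge policy along these lines might look like the following sketch. The `Contact` dataclass, its field names, and the `merge` and `union` helpers are hypothetical, and the example assumes a single record-level `edited_at` timestamp; a tool that tracks edit times per field could resolve conflicts more precisely.

```python
import copy
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Contact:
    id: str
    name: str
    edited_at: datetime  # assumption: one timestamp per record
    phones: list = field(default_factory=list)
    emails: list = field(default_factory=list)
    notes: list = field(default_factory=list)


def union(first, second):
    """Combine two lists, dropping repeats while keeping first-seen order."""
    return list(dict.fromkeys(first + second))


def merge(survivor, duplicate):
    """Return (merged contact, undo snapshot of both originals)."""
    # Undo path: keep untouched copies so the merge can be reversed later.
    undo = (copy.deepcopy(survivor), copy.deepcopy(duplicate))
    if duplicate.edited_at > survivor.edited_at:
        newer, older = duplicate, survivor
    else:
        newer, older = survivor, duplicate
    merged = Contact(
        id=survivor.id,
        # Conflicting single-value field: the more recently edited record wins.
        name=newer.name or older.name,
        edited_at=newer.edited_at,
        # Multi-value fields: union, so nothing is lost.
        phones=union(survivor.phones, duplicate.phones),
        emails=union(survivor.emails, duplicate.emails),
        notes=union(survivor.notes, duplicate.notes),
    )
    return merged, undo
```

Returning the snapshot alongside the merged record keeps the undo decision with the caller, who can store it for as long as a reversal should remain possible.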

How duplicates get minted in the first place

Three factories produce most duplicates. Multi-source sync: your phone merges Google, iCloud, and SIM contacts into one view, but each source keeps its own record, so disabling and re-enabling an account can split the "merged" person in two. Imports without identifiers: vCards carry a UID property precisely to prevent this, but many exporters generate a fresh UID per export, so re-importing the same file duplicates everything. Human entry: you save "Sarah Yoga" at class, then "Sarah Lindqvist" appears via WhatsApp a year later. Because the causes are structural, deduplication is recurring maintenance, not a one-time cleanup.

Exact keys, fuzzy names, and the false-positive trap

A sane matcher works in tiers of confidence. Tier one: identical normalized email or E.164 phone — safe enough to propose automatically, with one caveat: shared family landlines and joint household emails do exist. Tier two: very high name similarity plus a corroborating field (same birthday, same organization). Tier three: name similarity alone — never auto-merge here. Nicknames (Bill/William, Sepp/Josef), transliteration variants (Müller/Mueller), married names, and Jr./Sr. pairs all defeat naive string distance. The cost asymmetry decides the design: a missed duplicate is clutter, a wrong merge is silent data loss in two people's records.
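The tiers can be expressed as a small classifier over precomputed signals for a pair of records. Everything here is an illustrative assumption rather than a known implementation: the `PairSignals` fields, the `Tier` names, and especially the 0.92 and 0.85 thresholds, which would need tuning against real address books.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    EXACT_KEY = 1                 # identical normalized email or E.164 phone
    NAME_PLUS_CORROBORATION = 2   # very similar name plus a second field
    NAME_ONLY = 3                 # name similarity alone
    NO_MATCH = 4


@dataclass
class PairSignals:
    shared_email: bool
    shared_phone: bool
    name_similarity: float        # 0.0 to 1.0, from any string metric
    same_birthday: bool = False
    same_organization: bool = False


def classify(signals, very_high=0.92, high=0.85):
    """Assign a candidate pair to a confidence tier (thresholds are assumptions)."""
    if signals.shared_email or signals.shared_phone:
        return Tier.EXACT_KEY
    corroborated = signals.same_birthday or signals.same_organization
    if signals.name_similarity >= very_high and corroborated:
        return Tier.NAME_PLUS_CORROBORATION
    if signals.name_similarity >= high:
        return Tier.NAME_ONLY
    return Tier.NO_MATCH


def may_auto_propose(tier):
    """Only tier one is proposed automatically; lower tiers wait for a human."""
    return tier is Tier.EXACT_KEY
```

A pair with identical names but no corroborating field lands in `Tier.NAME_ONLY`, so the namesake case is surfaced for review instead of being merged.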

Running a dedup pass with Endearist

Endearist ships a contact-dedup tool that applies exactly this tiered logic: it normalizes emails and phone numbers, scores name similarity, and presents merge candidates grouped by confidence so you confirm each union instead of trusting a black box. Because Endearist is local-first, the entire comparison runs on your device — your address book is never uploaded to a server to be matched. Field unions keep every phone number and note from both records, and the import path runs the same matching, so pulling in an old vCard or CSV export proposes merges instead of silently doubling your contacts.

Frequently asked questions

Why does my phone keep showing duplicate contacts?

Almost always multi-account sync. Your phone displays a merged view of contacts from Google, iCloud, Exchange, WhatsApp, and the SIM, but each account stores its own copy. When linking heuristics fail (a nickname in one source, a formal name in another), you see two entries. Fix it at the source: pick one primary account for contacts, export the rest, import into the primary, and dedupe there.

Should duplicates be merged automatically?

Only at the highest confidence tier (identical normalized email or phone), and even then with an undo, because shared keys do occur: colleagues on one office line, married couples on one email. Anything based on name similarity needs human confirmation, because namesakes such as a parent and child with the same name are common. A bad merge mixes two people's histories in a way that is hard to untangle. Good tools propose in bulk but let you confirm per pair.

Which fields are safest to match duplicates on?

In descending order of reliability: a shared stable identifier like a vCard UID (rare in practice), normalized email address, phone number in E.164 form, then name plus a corroborating field such as birthday or employer. Names alone are the weakest signal: they are riddled with nicknames, spelling variants, and genuinely distinct people who share them. Physical addresses are nearly useless: households share them and formatting varies wildly.

Last updated: 2026-06-10
