How to Verify an Email Address Without Sending a Single Message
You can learn a surprising amount about an email address before ever sending to it: syntax, MX records, SMTP acceptance, disposability. You can also fool yourself badly. Here's what each check really tells you.
EvilMail Team · June 9, 2026 · 12 min read
Every signup form that says "please enter a valid email" is quietly making a bet. It wants to reject [email protected] and typos, catch fake addresses, and avoid mailing a list that's half garbage — all without actually sending anything, because sending to bad addresses is how you wreck your sender reputation. The catch is that "is this email real?" has no clean yes/no answer. What you can do is stack a series of increasingly expensive, increasingly unreliable checks, each of which narrows the uncertainty a little.
I've built this pipeline more than once, and the most valuable thing I learned is where each layer lies to you. Syntax checks reject valid addresses. SMTP probes get greylisted and blocked. Catch-all domains accept everything. Below is the honest tour: what each verification step actually proves, where it fails, and how to combine them into a deliverability score instead of a false binary — plus the ethics, because probing strangers' mail servers is not consequence-free.
Layer 1: Syntax — Cheap, Local, and Trickier Than You Think
The first check costs nothing and touches no network: is the string even shaped like an email address? Most developers reach for a regex, and most of those regexes are wrong. The full grammar in RFC 5322 permits things that look absurd — quoted local parts with spaces, comments in parentheses, and more. The fully-compliant regex is a monstrous, multi-thousand-character pattern that nobody should paste into production.
The pragmatic move is a *deliberately looser* check that accepts anything plausibly routable and rejects only the obviously broken:
```python
import re

# Pragmatic, not RFC-exhaustive. Rejects the obviously broken,
# accepts anything a real mail server would plausibly route.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")

def syntax_ok(addr: str) -> bool:
    if len(addr) > 254:              # RFC 5321 length ceiling
        return False
    if addr.count('@') != 1:
        return False
    local, _, domain = addr.partition('@')
    if len(local) > 64:              # local part limit
        return False
    return bool(EMAIL_RE.match(addr))
```
Two philosophies collide here. A strict validator protects downstream systems but *will* reject valid addresses — internationalized addresses with Unicode, plus-addressing edge cases, unusual TLDs. A loose validator lets more through and defers the real judgment to later layers. My rule: syntax validation should only reject strings that *cannot possibly* be delivered. Everything ambiguous passes to the network checks. Rejecting a real customer because your regex disliked their + tag is a worse outcome than accepting one bad string you'll catch later.
One genuinely useful syntax-adjacent check is typo detection on the domain. gmial.com, hotnail.com, yaho.com — a Levenshtein-distance comparison against a list of the top 50 mailbox providers catches the most common real-world mistakes and lets you prompt "did you mean gmail.com?" instead of silently failing.
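A minimal sketch of that typo check, using only the standard library. The provider list here is a short illustrative subset, and both `suggest_domain` and the distance threshold of 2 are assumptions for the example rather than tuned values:

```python
from typing import Optional

# Illustrative subset; a real list would cover the largest mailbox providers.
KNOWN_DOMAINS = ("gmail.com", "yahoo.com", "hotmail.com", "outlook.com", "icloud.com")

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))          # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]                          # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,            # delete ca
                            curr[j - 1] + 1,        # insert cb
                            prev[j - 1] + cost))    # substitute (or match)
        prev = curr
    return prev[-1]

def suggest_domain(domain: str, max_distance: int = 2) -> Optional[str]:
    """Return a likely intended domain, or None if no suggestion applies."""
    domain = domain.lower()
    if domain in KNOWN_DOMAINS:
        return None                         # exact match: nothing to correct
    distance, best = min((levenshtein(domain, known), known) for known in KNOWN_DOMAINS)
    return best if distance <= max_distance else None

print(suggest_domain("gmial.com"))    # gmail.com
print(suggest_domain("hotnail.com"))  # hotmail.com
```

Note the exact-match guard: a legitimate provider that sits one edit away from a bigger one (mail.com versus gmail.com, say) has to be in the known list itself, or it will be "corrected" by mistake. That is one reason to treat the result as a prompt to the user, not an automatic rewrite.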
Layer 2: Does the Domain Even Accept Mail? (MX Lookup)
Syntax passing tells you the string is well-formed. The next question is whether the domain has any way to receive mail at all. This is a DNS lookup for the domain's MX (Mail Exchanger) records — cheap, fast, and far more informative than syntax.
From the command line:
```bash
# Ask for the mail servers that handle a domain
dig +short MX gmail.com
# 5 gmail-smtp-in.l.google.com.
# 10 alt1.gmail-smtp-in.l.google.com.
# 20 alt2.gmail-smtp-in.l.google.com.

# Same question with nslookup, if dig isn't around
nslookup -type=MX gmail.com
```
If a domain returns MX records, some server has volunteered to accept its mail. If it returns *none*, check for a fallback: under the "implicit MX" rule in RFC 5321, a domain with an A or AAAA record but no MX can still receive mail at that address-record host. So the real test is:
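```python
import dns.resolver

def domain_accepts_mail(domain: str) -> bool:
    # Timeouts and resolver failures (dns.exception.Timeout,
    # dns.resolver.NoNameservers) are deliberately not caught here:
    # the caller should treat them as "unknown", not as a reject.
    try:
        mx = dns.resolver.resolve(domain, 'MX')
        return len(mx) > 0
    except dns.resolver.NXDOMAIN:
        return False  # domain doesn't exist at all
    except dns.resolver.NoAnswer:
        pass  # domain exists but has no MX: try the implicit-MX fallback

    # Implicit MX: an A or AAAA record is enough to receive mail
    for record_type in ('A', 'AAAA'):
        try:
            dns.resolver.resolve(domain, record_type)
            return True
        except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
            continue  # no record of this type; try the next one
    return False
```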
An NXDOMAIN result is one of the few *certain* answers in this whole business: the domain does not exist, so the address cannot be real. That's a hard reject you can trust. Everything past this point gets murky.
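The preference numbers in the `dig` output above matter for the next layer: a sender tries the lowest preference value first. As a small sketch, here is one way to turn `dig +short MX` output into an ordered host list. The function names are hypothetical, and the "null MX" handling (a single record pointing at `.`, defined in RFC 7505 as "this domain accepts no mail") goes slightly beyond what the text above covers:

```python
from typing import List, Tuple

def parse_mx_lines(dig_output: str) -> List[Tuple[int, str]]:
    """Parse `dig +short MX` lines ("<preference> <host>") into sorted pairs."""
    records = []
    for line in dig_output.splitlines():
        parts = line.split()
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # skip blank lines and anything not shaped like an MX answer
        host = parts[1].rstrip(".").lower()   # drop the trailing root dot
        records.append((int(parts[0]), host))
    return sorted(records)                    # lowest preference value first

def usable_mx_hosts(records: List[Tuple[int, str]]) -> List[str]:
    """Return hosts in the order a sender would try them."""
    # Null MX: the only record targets ".", which is "" after stripping the dot.
    if len(records) == 1 and records[0][1] == "":
        return []
    return [host for _, host in records]

sample = """10 alt1.gmail-smtp-in.l.google.com.
5 gmail-smtp-in.l.google.com.
20 alt2.gmail-smtp-in.l.google.com.
"""
print(usable_mx_hosts(parse_mx_lines(sample))[0])  # gmail-smtp-in.l.google.com
```

In a real pipeline you would more likely read `preference` and `exchange` straight from the resolver's answer objects; parsing text keeps this sketch free of network calls.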
Layer 3: SMTP RCPT Probing — and Why It Lies
Here's the technique everyone wants and everyone overtrusts. SMTP, the protocol that delivers mail, has a conversation structure. You can open that conversation, get as far as naming the recipient, and read the server's reaction — *without* ever issuing the DATA command that would actually send a message. If the server accepts the RCPT TO, the mailbox probably exists. If it rejects with a 550, it probably doesn't.
Walking the conversation by hand shows exactly what's happening:
```text
$ telnet gmail-smtp-in.l.google.com 25
220 mx.google.com ESMTP ready
HELO verifier.example.com
250 mx.google.com at your service
MAIL FROM:<[email protected]>
250 2.1.0 OK
RCPT TO:<[email protected]>
250 2.1.5 OK        <- mailbox accepted (probably exists)
QUIT                <- we stop here; no DATA, no message sent
221 closing connection
```
A 250 on RCPT TO suggests the mailbox exists. A 550 5.1.1 No such user suggests it doesn't. Clean, right? It is not clean. This is the least reliable widely-used check in the entire pipeline, for reasons that compound:
Catch-all domains accept everything. A domain with a catch-all configuration answers 250 to *every* recipient, real or not. [email protected] gets the same green light as the CEO's real address. On these domains, SMTP probing tells you literally nothing.
Greylisting. Many servers deliberately return a temporary 450/451 to strangers on first contact, expecting a legitimate sender to retry later. A one-shot probe reads that as failure when the address is perfectly valid.
Anti-verification defenses. Big providers know this trick. Yahoo, Outlook, and others frequently accept *all* RCPT TO commands regardless of validity, specifically to defeat probing, and defer the real bounce until after DATA. Google increasingly rate-limits and blocks probing IPs.
Your IP gets blocked. Repeatedly connecting on port 25 and disconnecting after RCPT without sending is a textbook spammer/harvester signature. Do it at volume and your IP lands on blocklists, poisoning your *real* mail delivery. Many residential and cloud providers block outbound port 25 entirely, so the probe won't even connect.
Treat an SMTP probe as one weak signal, never a verdict. A 550 is moderately trustworthy (few servers falsely reject). A 250 means "maybe," and on a catch-all it means nothing at all — which is why the next layer exists.
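For reference, the same conversation can be scripted with Python's standard `smtplib`. This is a sketch under assumptions, not a production prober: `rcpt_probe` and `classify_rcpt` are hypothetical names, the HELO name and envelope sender must be values you actually control, and the labels simply mirror the weak-signal reading described above.

```python
import smtplib
from typing import Optional

def classify_rcpt(code: Optional[int]) -> str:
    """Map an SMTP reply code to a deliberately non-committal label."""
    if code is None:
        return "unreachable"        # blocked port 25, timeout, dropped connection
    if code == 250:
        return "maybe"              # accepted, but could be catch-all or anti-probing
    if code == 550:
        return "probably_missing"   # the one moderately trustworthy answer
    if 400 <= code < 500:
        return "temporary"          # greylisting (450/451), 421 throttling, etc.
    return "inconclusive"

def rcpt_probe(mx_host: str, address: str, helo_name: str,
               mail_from: str, timeout: float = 10.0) -> Optional[int]:
    """Walk HELO / MAIL FROM / RCPT TO, then QUIT. DATA is never sent."""
    smtp = smtplib.SMTP(timeout=timeout)     # no host given, so no connection yet
    try:
        code, _ = smtp.connect(mx_host, 25)
        if code != 220:
            return code                      # server refused the session up front
        smtp.helo(helo_name)
        code, _ = smtp.mail(mail_from)
        if code != 250:
            return code                      # sender rejected; RCPT would tell us nothing
        code, _ = smtp.rcpt(address)
        return code
    except (smtplib.SMTPException, OSError):
        return None                          # could not complete the conversation
    finally:
        try:
            smtp.quit()                      # polite QUIT when a session is open
        except (smtplib.SMTPException, OSError):
            smtp.close()

print(classify_rcpt(451))  # temporary
```

Keeping the classifier separate from the network call makes the policy easy to test and easy to loosen later, for example if you decide that some other 5xx codes deserve the same treatment as 550.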
Layer 4: Catch-All and Disposable Detection
Two special cases deserve their own logic because they break the layers above.
Detecting a catch-all is done by probing a definitely-fake address. Send RCPT TO for a random string that could not possibly exist — [email protected]. If the server *accepts* that, it accepts everything, so it's a catch-all, and any positive result for the real address is worthless. Flag it as "accept-all, unverifiable" rather than "valid."
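As a rough sketch, the fake-address test fits in a few lines. The `probe` argument is an assumed callable that takes an address and returns the RCPT reply code (or `None` if the conversation failed); it is injected so the logic can be exercised without touching a real server. The three-valued result is a choice made for this example:

```python
import secrets
from typing import Callable, Optional

def is_catch_all(probe: Callable[[str], Optional[int]], domain: str) -> Optional[bool]:
    """True if the domain accepts a random mailbox, False if it rejects one,
    None if the probe was inconclusive (greylisted, blocked, timed out)."""
    # 16 random hex characters: effectively impossible to collide with a real mailbox
    fake_address = f"no-such-user-{secrets.token_hex(8)}@{domain}"
    code = probe(fake_address)
    if code == 250:
        return True      # accepted a mailbox that cannot exist
    if code == 550:
        return False     # the server does distinguish real from fake
    return None          # no basis for a conclusion either way

# Usage with a stand-in probe that accepts everything:
print(is_catch_all(lambda address: 250, "example.com"))  # True
```

Returning `None` for the inconclusive case matters: a greylisted fake address says nothing about whether the domain is a catch-all, and collapsing it to `False` would let a later 250 bank confidence it has not earned.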
Detecting disposable/temporary addresses is a different goal entirely. Services like EvilMail and its many cousins provide throwaway domains — technically valid, fully deliverable, but signaling that the user doesn't want a durable relationship with you. Detection is list-based: maintain (or subscribe to) a set of known disposable domains and match the address's domain against it. There's no protocol trick; it's a lookup. Keep in mind the list is a moving target — new disposable domains appear constantly, and blanket-blocking them is a product decision, not a validity one. A disposable address is *real*; whether you want it is your call.
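The lookup itself is short. In the sketch below the domains in `DISPOSABLE_DOMAINS` are made-up placeholders; in practice the set would be loaded from whatever list you maintain or subscribe to. Matching parent domains as well as the exact domain is an assumption of this example, on the theory that a listed domain's subdomains are usually disposable too:

```python
from typing import FrozenSet

# Placeholder entries for illustration only; load a maintained list in practice.
DISPOSABLE_DOMAINS: FrozenSet[str] = frozenset({"throwaway.example", "tempbox.example"})

def is_disposable(domain: str, blocklist: FrozenSet[str] = DISPOSABLE_DOMAINS) -> bool:
    """Check the domain and each of its parents against the blocklist."""
    labels = domain.lower().rstrip(".").split(".")
    # "mx1.throwaway.example" is checked as itself, then as "throwaway.example".
    # The bare top-level label is never checked on its own.
    return any(".".join(labels[i:]) in blocklist for i in range(len(labels) - 1))

print(is_disposable("mx1.throwaway.example"))  # True
print(is_disposable("gmail.com"))              # False
```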
Putting It Together: Score, Don't Judge
Because no single layer is authoritative, the mature output isn't valid: true/false. It's a deliverability score with an explanation, so the calling code can decide the threshold. Here's the flow as pseudo-code:
```text
function verify(address):
    notes = []                       # note(x) appends x to this list

    # 1. Syntax: hard gate
    if not syntax_ok(address):
        return { score: 0, status: "invalid_syntax" }

    local  = local_part_of(address)
    domain = domain_of(address)

    # 2. Domain existence & MX: hard gate
    if not domain_accepts_mail(domain):
        return { score: 0, status: "no_mx" }

    score = 50   # syntax + MX both pass: baseline plausible

    # 3. Reputation signals (list lookups, no network to target)
    if is_disposable(domain):   score -= 20; note("disposable")
    if is_role_account(local):  score -= 10; note("role: info@/admin@")
    if is_typo_domain(domain):  score -= 30; note("likely typo")

    # 4. SMTP probe: soft signal, wrapped in caveats
    mx = best_mx(domain)
    if is_catch_all(mx, domain):
        note("catch-all: mailbox unverifiable")
        # don't add confidence we can't earn
    else:
        code = smtp_rcpt_check(mx, address)
        if code == 250: score += 30
        if code == 550: return { score: 5, status: "mailbox_not_found" }
        if code in (450, 451): note("greylisted, inconclusive")

    return { score: clamp(score, 0, 100), notes: notes }
```
The important design choices: hard gates (bad syntax, no MX, NXDOMAIN, confirmed 550) collapse the score immediately because they're the trustworthy signals. Everything else *adjusts* a score and attaches a note. A role account like info@ isn't invalid — it's just likelier to be a shared inbox that mangles your onboarding, so you dock a few points and let the caller decide. Catch-all detection prevents you from banking confidence you never actually earned.
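Here is one way the flow might look as runnable Python. It is a sketch, not a reference implementation: every network-touching check is passed in as a callable (the `Checks` container, the `Verdict` type, the `"scored"` status, and the short `ROLE_LOCALS` set are all assumptions made for this example), which keeps the scoring policy testable on its own. The weights are the same ones used in the pseudo-code.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Verdict:
    score: int
    status: str
    notes: List[str] = field(default_factory=list)

@dataclass
class Checks:
    """Injected checks; each would wrap one of the layers described earlier."""
    syntax_ok: Callable[[str], bool]
    domain_accepts_mail: Callable[[str], bool]
    is_disposable: Callable[[str], bool]
    is_typo_domain: Callable[[str], bool]
    is_catch_all: Callable[[str], bool]
    rcpt_code: Callable[[str], Optional[int]]

ROLE_LOCALS = {"info", "admin", "support", "sales"}  # illustrative, not exhaustive

def verify(address: str, checks: Checks) -> Verdict:
    # Hard gates: the trustworthy signals end the evaluation immediately.
    if not checks.syntax_ok(address):
        return Verdict(0, "invalid_syntax")
    local, _, domain = address.rpartition("@")
    if not checks.domain_accepts_mail(domain):
        return Verdict(0, "no_mx")

    score = 50                      # syntax + MX both pass: baseline plausible
    notes: List[str] = []

    # Reputation signals: adjust the score and explain why.
    if checks.is_disposable(domain):
        score -= 20
        notes.append("disposable")
    if local.lower() in ROLE_LOCALS:
        score -= 10
        notes.append("role account")
    if checks.is_typo_domain(domain):
        score -= 30
        notes.append("likely typo")

    # SMTP probe: soft signal, skipped entirely on catch-all domains.
    if checks.is_catch_all(domain):
        notes.append("catch-all: mailbox unverifiable")
    else:
        code = checks.rcpt_code(address)
        if code == 550:
            return Verdict(5, "mailbox_not_found", notes)
        if code == 250:
            score += 30
        elif code in (450, 451):
            notes.append("greylisted, inconclusive")

    return Verdict(max(0, min(100, score)), "scored", notes)
```

With stand-in checks that all pass and a probe that returns 250, an address like `jane@example.com` scores 80 (50 baseline plus 30 for the accepted RCPT), while `info@example.com` scores 70 after the role-account deduction.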
Ethics, Rate Limits, and Not Being a Menace
Every SMTP probe is an unsolicited connection to someone else's infrastructure. A handful is invisible; a verification service hammering millions of addresses is a load and a nuisance, and it looks exactly like address harvesting for spam. Some practical and ethical guardrails:
Cache aggressively. MX records and catch-all status change rarely. Cache them for hours or days. Don't probe the same address more than once a day.
Rate-limit per target domain. Don't open fifty connections to one provider in a second. Space them out; respect any 421 Too many connections by backing off exponentially.
Use a real HELO hostname with valid reverse DNS. Probing from an IP with no PTR record is both suspicious and likely to be rejected. If you can't set proper rDNS, you probably shouldn't be probing.
Have a valid `MAIL FROM` and honor bounces. Don't spoof.
Consider not probing at all. For most signup flows, syntax + MX + disposable-list + typo-detection catches the overwhelming majority of bad addresses with zero SMTP traffic. The RCPT probe adds marginal accuracy at real reputational cost. Reserve it for high-value list cleaning, and even then, a reputable third-party verification API that pools reputation across many callers is often the more responsible choice than rolling your own prober.
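The caching and back-off guardrails above need very little code. The following is a minimal in-memory sketch; the class and function names, the 30-second base delay, and the one-hour cap are illustrative assumptions, and a real service would likely use a shared store instead of a per-process dictionary.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple

class TTLCache:
    """Tiny time-based cache for MX answers, catch-all flags, or probe results."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock                       # injectable so tests can fake time
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        item = self._items.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self.clock() >= expires_at:
            del self._items[key]                 # expired: forget it and report a miss
            return None
        return value

    def put(self, key: str, value: Any) -> None:
        self._items[key] = (self.clock() + self.ttl, value)

def backoff_delay(attempt: int, base: float = 30.0, cap: float = 3600.0) -> float:
    """Seconds to wait before retry number `attempt` (0-based), doubling each time."""
    return min(cap, base * (2 ** attempt))

# One cache entry per address with a 24-hour TTL enforces "once a day" probing.
probe_results = TTLCache(ttl_seconds=24 * 3600)
probe_results.put("jane@example.com", 250)
print(probe_results.get("jane@example.com"))          # 250
print([backoff_delay(n) for n in range(4)])           # [30.0, 60.0, 120.0, 240.0]
```

Because `get` returns `None` for a miss, this sketch cannot cache a `None` value; store an explicit sentinel or a small result object if "probe failed" is something you also want to remember.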
The uncomfortable truth is that the only way to *know* an address accepts and reads mail is to send it something and watch what happens — a bounce, an open, a reply. Everything short of that is inference. Done well, the inference is good enough to keep your lists clean and your reputation intact. Done carelessly, you become the exact kind of traffic that made mail servers defensive in the first place. Verify like someone who has to share the network, because you do.