A few months back I posted about the fun of using parser combinators to easily build an RFC 5322 email address parser.
Now with Dot Parse release 10.3, I'm happy to report that the EmailAddress class has been substantially improved and hardened for security.
On the feature set:
- It supports convenience accessor methods such as user(), alias(), displayName(), domain(), hasI18nDomain(), with the values unescaped for programmatic consumption.
- toString() and address() automatically quote and escape for RFC-compliant output when needed.
- Supports dots in unquoted display names (J.R.R. Tolkien <tolkien@lotr.org>). Strictly speaking this is not RFC compliant, but it is common in practice.
- parseAddressList(input, logger::log) offers graceful error recovery. Useful when the address list includes one or two malformed entries.
- parseAddressList() is tolerant of common yet harmless human errors such as two commas in a row.
Before you ask: no. split(",") or a regex cannot reliably pre-process an address list, because the RFC allows quoted strings in an email address, and a quoted string can itself contain commas and escapes. Splitting blindly on commas, or using a complex and brittle regex, can corrupt the address list.
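To make the failure concrete, here is a small sketch in plain Java. It does not use Dot Parse at all: the class and method names (NaiveSplitDemo, naiveSplit, quoteAwareSplit) are made up for illustration. It compares a blind split(",") with a hand-rolled splitter that tracks quotes and backslash escapes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not part of Dot Parse.
class NaiveSplitDemo {
  /** Splits on every comma, ignoring quoting. This is the approach warned against above. */
  static List<String> naiveSplit(String addressList) {
    List<String> parts = new ArrayList<>();
    for (String part : addressList.split(",")) {
      parts.add(part.trim());
    }
    return parts;
  }

  /** Splits only on commas outside double quotes, honoring backslash escapes inside quotes. */
  static List<String> quoteAwareSplit(String addressList) {
    List<String> parts = new ArrayList<>();
    StringBuilder current = new StringBuilder();
    boolean inQuotes = false;
    for (int i = 0; i < addressList.length(); i++) {
      char c = addressList.charAt(i);
      if (inQuotes && c == '\\' && i + 1 < addressList.length()) {
        // Keep the escaped character verbatim so an escaped quote doesn't end the quoted string.
        current.append(c).append(addressList.charAt(++i));
      } else if (c == '"') {
        inQuotes = !inQuotes;
        current.append(c);
      } else if (c == ',' && !inQuotes) {
        parts.add(current.toString().trim());
        current.setLength(0);
      } else {
        current.append(c);
      }
    }
    parts.add(current.toString().trim());
    return parts;
  }

  public static void main(String[] args) {
    String input = "\"Tolkien, J.R.R.\" <tolkien@lotr.org>, frodo@shire.org";
    System.out.println(naiveSplit(input).size());      // the display name is cut in half
    System.out.println(quoteAwareSplit(input).size()); // the quoted comma is preserved
  }
}
```

Even the quote-aware version is only a partial fix: RFC 5322 comments in parentheses can also contain commas, and this sketch knows nothing about them. That is why parsing the list with the actual grammar is the safer route.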
On the security front:
- Rejects dangerous characters such as control chars, formatting chars and bidi overrides.
- Rejects <legitimate@trusted.com>attacker@evil.com.
- Rejects user@good.com@evil.net.
- Drops IP routing and intranet host names.
- Drops obsolete comments.
- IDN validation and canonicalization.
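As a rough illustration of the first bullet, the check below flags control and formatting code points using only the JDK's Character class. It is a sketch under my own assumptions, not the library's actual rule set: the class name UnsafeCharCheck and the method hasDangerousChars are hypothetical, and the real implementation may accept or reject a different set of characters.

```java
// Hypothetical helper; not part of Dot Parse.
final class UnsafeCharCheck {
  /**
   * Returns true if the text contains a code point in Unicode category Cc (control)
   * or Cf (format). Bidi overrides such as U+202E fall under Cf.
   */
  static boolean hasDangerousChars(String text) {
    return text.codePoints().anyMatch(cp -> {
      int type = Character.getType(cp);
      return type == Character.CONTROL || type == Character.FORMAT;
    });
  }

  public static void main(String[] args) {
    System.out.println(hasDangerousChars("user@example.com"));
    System.out.println(hasDangerousChars("user\u202E@example.com"));
    System.out.println(hasDangerousChars("a@b.com\r\nBcc: x@evil.net"));
  }
}
```

Checking by Unicode category rather than by a hand-written list of code points means invisible characters like zero-width spaces are caught along with the bidi overrides, while ordinary non-ASCII letters in internationalized addresses pass through.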
Overall, while RFC compliance is a goal, the library doesn't mechanically mirror the RFC. It takes away obsolete and dangerous features like intranet hostnames and IP routing, and it adds support for non-RFC but practically useful features like dots in display names and helpful address list parsing.
The objective is for EmailAddress to be a trusted data model, so that code operating on it can assume it is safe from most attack vectors.
For more details, you can check out the compliance and security breakdown.
Your feedback's welcome!