Thursday, February 22, 2007

Parsing lists of e-mail addresses

I came across a situation where I had to parse a list of e-mail addresses. E-mail clients these days take e-mail addresses in two forms: one showing the name of the individual as well as their e-mail address, and one with only the e-mail address.

When multiple e-mail addresses are listed, they are separated by commas, whether they're of the full form or of the simple form.

When I first had to extract the list of e-mail addresses, I assumed I could simply split the input on commas. That works for a list such as the following.

"Joshua Go" <>,

It would capture two e-mail addresses: "Joshua Go" <> and

A problem arose when I came across a form of the full e-mail address that threw off my simple parsing technique: addresses such as "Go, Joshua" <>, where the recipient's name itself contains a comma. Splitting on commas cuts such an address in two.
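To make the failure concrete, here is a minimal sketch of the comma-splitting approach described above. The class name NaiveSplit and the addresses jgo@example.com and other@example.com are hypothetical placeholders, not the ones from my actual input.

```java
import java.util.ArrayList;
import java.util.List;

class NaiveSplit {
    // Splits on every comma, with no awareness of quoted recipient names.
    static List<String> split(String input) {
        List<String> parts = new ArrayList<>();
        for (String piece : input.split(",")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) {
                parts.add(trimmed);
            }
        }
        return parts;
    }
}
```

Given "Joshua Go" <jgo@example.com>, other@example.com this returns two entries, as intended. Given "Go, Joshua" <jgo@example.com>, other@example.com it returns three, because the comma inside the quoted name is treated as a separator.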

Since I am no master of regular expressions, and working with regular expressions in Java has been somewhat painful for me, I decided to review my EBNF and write a small parser instead.

The following is the EBNF syntax, from what I know.
EmailAddressList    = GeneralEmailAddress [ ',' EmailAddressList ] ;
GeneralEmailAddress = [ RecipientName ] '<' EmailAddressOnly '>'
                    | EmailAddressOnly ;
EmailAddressOnly    = Username '@' Domain ;

My co-worker, Wilson, pointed out that I defined none of RecipientName, Username, or Domain. For that, I cite the practical demands of industry as my explanation for not adhering to strict academic formality. I also omit them for clarity. Basically, assume that they'll just be alphanumeric (letters and numbers).
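For illustration, the grammar maps fairly directly onto a recursive-descent parser. The following is only a rough sketch under stated assumptions, not the parser from our product: the class name EmailListParser is hypothetical, a RecipientName is assumed to be either a double-quoted string (which may contain commas) or a run of letters, digits, and spaces, and Username and Domain are assumed to be letters and digits plus '.', '-', and '_'. It returns only the bare addresses and discards the names.

```java
import java.util.ArrayList;
import java.util.List;

class EmailListParser {
    private final String src;
    private int pos = 0;

    private EmailListParser(String src) {
        this.src = src;
    }

    /** Parses a comma-separated list and returns the bare addresses in order. */
    static List<String> parse(String input) {
        EmailListParser p = new EmailListParser(input);
        List<String> out = new ArrayList<>();
        p.emailAddressList(out);
        if (p.pos != p.src.length()) {
            throw p.error("unexpected trailing input");
        }
        return out;
    }

    // EmailAddressList = GeneralEmailAddress [ ',' EmailAddressList ] ;
    // The right recursion is written as a loop.
    private void emailAddressList(List<String> out) {
        out.add(generalEmailAddress());
        skipSpaces();
        while (peek() == ',') {
            pos++;
            out.add(generalEmailAddress());
            skipSpaces();
        }
    }

    // GeneralEmailAddress = [ RecipientName ] '<' EmailAddressOnly '>'
    //                     | EmailAddressOnly ;
    private String generalEmailAddress() {
        skipSpaces();
        int start = pos;
        if (peek() == '"') {
            // Assumption: a quoted name runs to the next double quote.
            pos++;
            while (pos < src.length() && src.charAt(pos) != '"') pos++;
            expect('"');
        } else {
            // Assumption: an unquoted name is letters, digits, and spaces.
            while (Character.isLetterOrDigit(peek()) || peek() == ' ') pos++;
        }
        skipSpaces();
        if (peek() == '<') {
            pos++;
            String address = emailAddressOnly();
            expect('>');
            return address;
        }
        // No '<' followed, so backtrack and try the simple form.
        pos = start;
        return emailAddressOnly();
    }

    // EmailAddressOnly = Username '@' Domain ;
    private String emailAddressOnly() {
        String user = word();
        expect('@');
        String domain = word();
        return user + "@" + domain;
    }

    private String word() {
        int start = pos;
        while (isWordChar(peek())) pos++;
        if (pos == start) throw error("expected a username or domain");
        return src.substring(start, pos);
    }

    private static boolean isWordChar(char c) {
        return Character.isLetterOrDigit(c) || c == '.' || c == '-' || c == '_';
    }

    // Returns the current character, or '\0' at end of input.
    private char peek() {
        return pos < src.length() ? src.charAt(pos) : '\0';
    }

    private void expect(char c) {
        if (peek() != c) throw error("expected '" + c + "'");
        pos++;
    }

    private void skipSpaces() {
        while (peek() == ' ') pos++;
    }

    private IllegalArgumentException error(String message) {
        return new IllegalArgumentException(message + " at position " + pos);
    }
}
```

Calling EmailListParser.parse on "Go, Joshua" <jgo@example.com>, other@example.com (hypothetical addresses) yields the two addresses jgo@example.com and other@example.com, because the comma inside the quotes is consumed as part of the name rather than treated as a separator.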

Perhaps in a later post, I'll put up the source code to the parser. As it stands, I've yet to move it over from being a test program to being integrated with the rest of our product.

