Quotes strings suitably for use as an RFC 2822 local part
Version 0.1
use Convert::YText qw(encode_ytext decode_ytext);
$encoded=encode_ytext($string); $decoded=decode_ytext($encoded);
($decoded eq $string) || die "this should never happen!";
Convert::YText converts strings to and from "YText", a format inspired by xtext (defined in RFC1894) and by the MIME base64 and quoted-printable types (RFC 2045). The main goal is to encode a UTF-8 string into something safe for use as the local part in an internet email address (RFC2822).
By default spaces are replaced with "+", "/" with "~", the characters "A-Za-z0-9_.-" encode as themselves, and everything else is written "=USTR=", where USTR is the base64 encoding (using "A-Za-z0-9_." as digits) of the Unicode character code. The encoding is configurable (see below).
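The default rules above can be sketched in plain Perl. This is an illustration only, not the module's implementation: the helper names (to_digits, from_digits, sketch_encode, sketch_decode) are hypothetical, and both the digit order (A-Z, a-z, 0-9, "_", ".") and the most-significant-digit-first layout are assumptions, so the escapes it produces may differ from what Convert::YText emits.

```perl
use strict;
use warnings;

# Assumed digit order; the text only says the digits come from A-Za-z0-9_.
my @DIGITS = ('A'..'Z', 'a'..'z', '0'..'9', '_', '.');
my %VALUE;
@VALUE{@DIGITS} = (0 .. $#DIGITS);

# Write a character code in base 64, most significant digit first (assumed).
sub to_digits {
    my ($n) = @_;
    my $out = '';
    do {
        $out = $DIGITS[$n % 64] . $out;
        $n   = int($n / 64);
    } while ($n > 0);
    return $out;
}

# Inverse of to_digits.
sub from_digits {
    my ($str) = @_;
    my $n = 0;
    $n = $n * 64 + $VALUE{$_} for split //, $str;
    return $n;
}

# Apply the default rules character by character.
sub sketch_encode {
    my ($text) = @_;
    my $out = '';
    for my $ch (split //, $text) {
        if    ($ch eq ' ')                   { $out .= '+' }
        elsif ($ch eq '/')                   { $out .= '~' }
        elsif ($ch =~ /\A[A-Za-z0-9_.\-]\z/) { $out .= $ch }
        else  { $out .= '=' . to_digits(ord $ch) . '=' }
    }
    return $out;
}

# Undo the rules: "=USTR=" escapes, then "+" and "~".
sub sketch_decode {
    my ($enc) = @_;
    $enc =~ s{=([A-Za-z0-9_.]+)=|([+~])}
             {defined $1 ? chr(from_digits($1)) : $2 eq '+' ? ' ' : '/'}ge;
    return $enc;
}

print sketch_encode("a b/c"), "\n";   # prints a+b~c
```

Because a literal "+", "~" or "=" in the input is not in the pass-through set, it is written as an escape, which is what keeps decoding unambiguous in this sketch.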
The module can export encode_ytext, which converts an arbitrary Unicode string into a "safe" form, and decode_ytext, which recovers the original text. validate_ytext is a heuristic which returns 0 for bad input.
For more control, you will need to use the OO interface.
Create a new encoding object.
Arguments
Arguments are by name (i.e. a hash).
Arguments
a string to encode.
Returns
encoded string
Arguments
a string to decode.
Returns
decoded string
Simple necessary but not sufficient test for validity.
According to RFC 2822, the following non-alphanumerics are OK for the local part of an address: "!#$%&'*+-/=?^_`{|}~". On the other hand, it seems common in practice to block addresses having "%!/|`#&?" in the local part. The idea is to restrict ourselves to basic ASCII alphanumerics, plus a small set of printable ASCII, namely "=_+-~.".
The characters '+' and '-' are pretty widely used to attach suffixes (although usually only one works on a given mail host). It seems OK to use both '+' and '-', since the first occurrence marks the beginning of a suffix and after that the character is treated as a regular one. The character '.' also seems mostly permissible.
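The restricted alphabet described above can be checked with a single character class. The sketch below is an assumption-laden illustration, not the module's own validity test: uses_safe_alphabet is a hypothetical helper, and like any alphabet check it is necessary but not sufficient (it does not verify that "=...=" escapes are well formed).

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $local contains only ASCII
# alphanumerics plus the characters = _ + - ~ . and 0 otherwise.
sub uses_safe_alphabet {
    my ($local) = @_;
    return $local =~ /\A[A-Za-z0-9=_+\-~.]+\z/ ? 1 : 0;
}

print uses_safe_alphabet('hello+world') ? "ok\n" : "rejected\n";
print uses_safe_alphabet('a%b')         ? "ok\n" : "rejected\n";
```

The second call is rejected because "%" is one of the characters commonly blocked in practice.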
David Bremner, <[email protected]>
Copyright (C) 2011 David Bremner. All Rights Reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
MIME::Base64, MIME::Decoder::Base64, MIME::Decoder::QuotedPrint.