Mostly RFC822-compliant parser of Received headers
    use Mail::Field;

    my $received = Mail::Field->new('Received', $header);
    my $results = $received->parse_tree();
    my $parsed_ok = $received->parsed_ok();
    my $diagnostics = $received->diagnostics();
Don't use this class directly! Instead ask Mail::Field for new instances based on the field name!
Mail::Field::Received provides subroutines for parsing Received headers from e-mails. It mostly complies with RFC822, but deviates to accommodate a number of broken MTAs which are in common use. It also attempts to extract useful information which MTAs often embed within the (comments).
It is a subclass derived from the Mail::Field and Mail::Field::Generic classes.
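To illustrate the comment handling mentioned above, here is a minimal sketch of how a host name and IP address might be pulled out of a parenthesised comment such as "(mediacons.tecc.co.uk [193.128.6.132])". The subroutine name extract_host_and_address and its regular expression are assumptions made for this example. They are not the module's actual implementation, which has to cope with many more MTA-specific formats.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Field::Received: pull a host name
# and a bracketed IPv4 address out of one RFC822 comment, if both are present.
sub extract_host_and_address {
    my ($comment) = @_;

    # Matches e.g. "(mediacons.tecc.co.uk [193.128.6.132])".
    if ($comment =~ /^\(\s*([A-Za-z0-9.-]+)\s+\[(\d{1,3}(?:\.\d{1,3}){3})\]\s*\)$/) {
        return { domain => $1, address => $2 };
    }

    # Anything else (e.g. an MTA version string) is left uninterpreted.
    return;
}

for my $comment ('(mediacons.tecc.co.uk [193.128.6.132])', '(8.9.3/8.9.3)') {
    my $info = extract_host_and_address($comment);
    if ($info) {
        print "$comment: domain=$info->{domain} address=$info->{address}\n";
    }
    else {
        print "$comment: no host information recognised\n";
    }
}
```

The pattern is deliberately narrow: it accepts only the "host [address]" form and returns nothing for comments such as version strings, so callers can tell an uninterpreted comment from a recognised one.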
debug

Returns the current debugging level for the output obtained via the diagnostics method. If a parameter is given, the debugging level is changed. The default level is 3.
diagnose

    $received->diagnose("foo", "\n");

Appends the given strings to the parser's diagnostics buffer.
diagnostics

    my $diagnostics = $received->diagnostics();

Returns the contents of the parser's diagnostics buffer.
parse

The actual parser. Returns the object (Mail::Field barfs otherwise).
parsed_ok

    if ($received->parsed_ok()) {
        ...
    }

Returns true if the parse succeeded, or if it failed but was permitted to fail for some reason, such as encountering evidence of a known broken (non-RFC822-compliant) format mid-parse.
parse_tree

    my $parse_tree = $received->parse_tree();

Returns the actual parse tree, which is where you get all the useful information. It is returned as a hashref whose keys are strings like 'from', 'by', 'with', 'id', 'via' etc., corresponding to the components of Received headers as defined by RFC822:

    received    =  "Received"    ":"            ; one per relay
                      ["from" domain]           ; sending host
                      ["by"   domain]           ; receiving host
                      ["via"  atom]             ; physical path
                     *("with" atom)             ; link/mail protocol
                      ["id"   msg-id]           ; receiver msg id
                      ["for"  addr-spec]        ; initial form
                       ";"    date-time         ; time received

The corresponding values are more hashrefs which are mini-parse-trees for these individual components. A typical parse tree looks something like:

    {
     'by' => {
              'domain' => 'host5.hostingcheck.com',
              'whole' => 'by host5.hostingcheck.com',
              'comments' => [
                             '(8.9.3/8.9.3)'
                            ],
             },
     'date_time' => {
                     'year' => 2000,
                     'week_day' => 'Tue',
                     'minute' => 57,
                     'day_of_year' => '1 Feb',
                     'month_day' => ' 1',
                     'zone' => '-0500',
                     'second' => 18,
                     'hms' => '21:57:18',
                     'date_time' => 'Tue, 1 Feb 2000 21:57:18 -0500',
                     'hour' => 21,
                     'month' => 'Feb',
                     'rest' => '2000 21:57:18 -0500',
                     'whole' => 'Tue, 1 Feb 2000 21:57:18 -0500'
                    },
     'with' => {
                'with' => 'ESMTP',
                'whole' => 'with ESMTP'
               },
     'from' => {
                'domain' => 'mediacons.tecc.co.uk',
                'HELO' => 'tr909.mediaconsult.com',
                'from' => 'tr909.mediaconsult.com',
                'address' => '193.128.6.132',
                'comments' => [
                               '(mediacons.tecc.co.uk [193.128.6.132])',
                              ],
                'whole' => 'from tr909.mediaconsult.com (mediacons.tecc.co.uk [193.128.6.132]) '
               },
     'id' => {
              'id' => 'VAA24164',
              'whole' => 'id VAA24164'
             },
     'comments' => [
                    '(mediacons.tecc.co.uk [193.128.6.132])',
                    '(8.9.3/8.9.3)'
                   ],
     'for' => {
               'for' => '<[email protected]>',
               'whole' => 'for <[email protected]>'
              },
     'whole' => 'from tr909.mediaconsult.com (mediacons.tecc.co.uk [193.128.6.132]) by host5.hostingcheck.com (8.9.3/8.9.3) with ESMTP id VAA24164 for <[email protected]>; Tue, 1 Feb 2000 21:57:18 -0500'
    }
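The methods above are typically combined: check parsed_ok first, then read the components of interest out of the parse tree. The following is a minimal sketch of that pattern. The helper summarize_received and the stand-in class My::StubReceived are hypothetical names invented for this example; the stub exists only so that the sketch runs without the module installed, and its tree is a trimmed copy of the example above.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Mail::Field::Received. It accepts any
# object offering the parsed_ok(), parse_tree() and diagnostics() methods
# documented above and returns a flat summary of one relay hop.
sub summarize_received {
    my ($received) = @_;

    unless ($received->parsed_ok()) {
        warn "Received header did not parse:\n", $received->diagnostics();
        return;
    }

    my $tree = $received->parse_tree();

    # Components may be absent from a given header, so each one is tested
    # before it is dereferenced.
    my %summary;
    if (my $from = $tree->{from}) {
        $summary{helo}    = $from->{HELO};
        $summary{address} = $from->{address};
    }
    $summary{by}   = $tree->{by}{domain}           if $tree->{by};
    $summary{with} = $tree->{with}{with}           if $tree->{with};
    $summary{id}   = $tree->{id}{id}               if $tree->{id};
    $summary{date} = $tree->{date_time}{date_time} if $tree->{date_time};

    return \%summary;
}

# Stand-in object for demonstration only (an assumption of this sketch).
{
    package My::StubReceived;
    sub new         { my ($class, %args) = @_; return bless {%args}, $class }
    sub parsed_ok   { $_[0]{ok} }
    sub parse_tree  { $_[0]{tree} }
    sub diagnostics { defined $_[0]{diagnostics} ? $_[0]{diagnostics} : '' }
}

my $stub = My::StubReceived->new(
    ok   => 1,
    tree => {
        from      => { HELO => 'tr909.mediaconsult.com', address => '193.128.6.132' },
        by        => { domain => 'host5.hostingcheck.com' },
        with      => { with => 'ESMTP' },
        id        => { id => 'VAA24164' },
        date_time => { date_time => 'Tue, 1 Feb 2000 21:57:18 -0500' },
    },
);

my $summary = summarize_received($stub);
print "$_: $summary->{$_}\n" for sort keys %$summary;
```

With the module installed, the same helper could presumably be handed a real object created as in the synopsis, i.e. summarize_received(Mail::Field->new('Received', $header)).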
Doesn't use Parse::RecDescent, which it maybe should.
Doesn't offer a 'strict RFC822' parsing mode. To implement that would be a royal pain in the arse, unless we move to Parse::RecDescent.
Mail::Field, Mail::Header
Adam Spiers <[email protected]>
All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.