Replicates SAX events to several SAX event handlers
    $saxt = new XML::Filter::SAXT ( { Handler => $out1 },
                                    { DocumentHandler => $out2 },
                                    { DTDHandler => $out3,
                                      Handler => $out4
                                    }
                                  );

    $perlsax = new XML::Parser::PerlSAX ( Handler => $saxt );
    $perlsax->parse ( [OPTIONS] );
SAXT is like the Unix 'tee' command in that it multiplexes the input stream to several output streams. In this case, the input stream is a PerlSAX event producer (like XML::Parser::PerlSAX) and the output streams are PerlSAX handlers or filters.
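The tee idea can be sketched in a few lines of plain Perl. This is an illustration only, not the SAXT implementation: the Recorder and TinyTee packages are hypothetical names invented for this example, and only two PerlSAX-style events are forwarded.

```perl
use strict;
use warnings;

# Hypothetical handler: records the name of every event it receives.
package Recorder;
sub new { my ($class) = @_; return bless { events => [] }, $class }
sub start_element { my ($self, $el) = @_; push @{ $self->{events} }, "start:$el->{Name}" }
sub end_element   { my ($self, $el) = @_; push @{ $self->{events} }, "end:$el->{Name}" }

# Hypothetical tee: forwards each event to every handler that implements it.
package TinyTee;
sub new { my ($class, @handlers) = @_; return bless { handlers => [@handlers] }, $class }
sub _forward {
    my ($self, $event, @args) = @_;
    for my $h (@{ $self->{handlers} }) {
        $h->$event(@args) if $h->can($event);
    }
}
sub start_element { my $self = shift; $self->_forward('start_element', @_) }
sub end_element   { my $self = shift; $self->_forward('end_element', @_) }

package main;

my ($first, $second) = (Recorder->new, Recorder->new);
my $tee = TinyTee->new($first, $second);

# Simulate an event producer firing two events at the tee.
$tee->start_element({ Name => 'doc' });
$tee->end_element({ Name => 'doc' });

print join(',', @{ $second->{events} }), "\n";
```

Both recorders end up with the same event list, which is the property the 'tee' comparison describes.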
The SAXT constructor takes a list of hash references. Each hash specifies an output handler. The hash keys can be: DocumentHandler, DTDHandler, EntityResolver or Handler, where Handler is a combination of the previous three and acts as the default handler. For example, if DocumentHandler is not specified, SAXT will try to use Handler instead.
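The fallback rule described above can be pictured with a small helper. This is a sketch under assumptions: resolve_handler is a hypothetical function written for this example, not part of the SAXT interface, and plain strings stand in for handler objects.

```perl
use strict;
use warnings;

# Hypothetical helper: pick the handler for one event category from a single
# constructor hash, falling back to the generic Handler key.
sub resolve_handler {
    my ($spec, $category) = @_;   # DocumentHandler, DTDHandler or EntityResolver
    return exists $spec->{$category} ? $spec->{$category} : $spec->{Handler};
}

# Same shape as a hash holding both a DTDHandler and a Handler key.
my %spec = (DTDHandler => 'out3', Handler => 'out4');

my $dtd  = resolve_handler(\%spec, 'DTDHandler');       # specific key wins
my $doc  = resolve_handler(\%spec, 'DocumentHandler');  # falls back to Handler
my $none = resolve_handler({ DocumentHandler => 'out2' }, 'EntityResolver');

print "DTD events      -> $dtd\n";
print "document events -> $doc\n";
```

In the last call neither EntityResolver nor Handler is present, so the helper returns undef. How SAXT itself treats that case is not stated here.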
In this example we use XML::Parser::PerlSAX to parse an XML file and to invoke the PerlSAX callbacks of our SAXT object. The SAXT object then forwards the callbacks to XML::Checker, which will 'die' if it encounters an error, and to XML::Handler::BuildDOM, which will store the XML in an XML::DOM::Document.
    use XML::Parser::PerlSAX;
    use XML::Filter::SAXT;
    use XML::Handler::BuildDOM;
    use XML::Checker;

    my $checker = new XML::Checker;
    my $builder = new XML::Handler::BuildDOM (KeepCDATA => 1);
    my $tee = new XML::Filter::SAXT ( { Handler => $checker },
                                      { Handler => $builder } );

    my $parser = new XML::Parser::PerlSAX (Handler => $tee);
    eval
    {
        # This is how you set the error handler for XML::Checker
        local $XML::Checker::FAIL = \&my_fail;

        my $dom_document = $parser->parsefile ("file.xml");
        ... your code here ...
    };
    if ($@)
    {
        # Either XML::Parser::PerlSAX threw an exception (bad XML)
        # or XML::Checker found an error and my_fail died.
        ... your error handling code here ...
    }

    # XML::Checker error handler
    sub my_fail
    {
        my $code = shift;
        die XML::Checker::error_string ($code, @_)
            if $code < 200;   # warnings and info messages are >= 200
    }
This is still alpha software. Package names and interfaces are subject to change.
Enno Derksen is the original author.
Send bug reports, hints, tips, suggestions to T.J. Mather at <[email protected]>.