SYNOPSIS

        use Parse::DebControl;

        $parser = Parse::DebControl->new;

        $data = $parser->parse_mem($control_data, $options);
        $data = $parser->parse_file('./debian/control', $options);
        $data = $parser->parse_web($url, $options);

        $writer = Parse::DebControl->new;

        $string = $writer->write_mem($singlestanza);
        $string = $writer->write_mem([$stanza1, $stanza2]);

        $writer->write_file($filename, $singlestanza, $options);
        $writer->write_file($filename, [$stanza1, $stanza2], $options);

        $writer->write_file($handle, $singlestanza, $options);
        $writer->write_file($handle, [$stanza1, $stanza2], $options);

        $parser->DEBUG();

DESCRIPTION

Parse::DebControl is an easy OO way to parse Debian control files and other colon-separated key-value pairs. It is specifically designed to handle the format used in Debian control files, template files, and the cache files used by dpkg.

For basic format information see: http://www.debian.org/doc/debian-policy/ch-controlfields.html#s-controlsyntax

This module does not interpret the file content (because there are a lot of files in this format); it merely handles the format. It handles simple control files and files hundreds of lines long efficiently and easily.

Class Methods

  • new()

  • new($debug)

    Returns a new Parse::DebControl object. If a true parameter $debug is passed in, it turns on debugging, similar to a call to DEBUG() (see below).

  • parse_file($control_filename, $options)

    Takes a filename as a scalar and an optional hashref of options (see below). Will parse as much as it can, warning (if DEBUGing is turned on) on parsing errors.

    Returns an array of hashrefs containing the data in the control file, split up by stanza. Stanzas are delimited by blank lines, and multi-line fields are expressed as such post-parsing. Single periods are treated as special extra newline delimiters, per convention. Whitespace is also stripped off of lines, to make it harder to make mistakes with hand-written conf files.

    The options hashref can take the parameters below. Setting an option to a true value enables it.

    useTieIxHash - Instead of an array of regular hashrefs, uses Tie::IxHash-based hashrefs

    discardCase - Convert all keys to lowercase so that lookups are case-insensitive (values are left untouched)

    stripComments - Remove all commented lines in standard #comment format. Literal #'s are represented by ##. For instance:

        Hello there #this is a comment
        Hello there, I like ##CCCCCC as a grey.

    The first is a comment, the second is a literal "#".

    verbMultiLine - Keep the description AS IS, and do not collapse leading spaces or treat dots as newlines. This also keeps whitespace from being stripped off the end of lines.

    tryGzip - Attempt to expand the data chunk with gzip first. If the text is already expanded (i.e. plain text), parsing will continue normally. This could optionally be turned on for all items in the future, but it is off by default so that, for performance reasons, we don't have to scan all of the text.

    singleBlock - Only parse the first block of data and return it. This is useful when you have possible "junk" data after the metadata.

    strict - Tries to parse obeying the strict rules for real Debian control files. This will force comment stripping for debian/control (the comment marker must start the line) and, for other files, will check whether a field may span multiple lines.

    allowUnknownFields - In strict mode, allow unknown fields.

    type - If the strict option is chosen, this parameter defines which format is being parsed. Available formats are:

      - debian/control
      - DEBIAN/control
      - .dsc
      - .changes

  • parse_mem($control_data, $options)

    Similar to parse_file, except it takes data as a scalar. Returns the same array of hashrefs as parse_file. The options hashref is the same as for parse_file as well; see above.

  • parse_web($url, $options)

    Similar to the other parse_* functions, this pulls down a control file from the web and attempts to parse it. For options and return values, see parse_file, above.
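As a rough illustration of the parse_* methods, the sketch below parses two stanzas from memory with the discardCase option. The control data is made up for the example, and the code assumes that the "array of hashrefs" is handed back as an array reference and that discardCase yields lowercase keys; treat both as assumptions rather than guarantees.

```perl
use strict;
use warnings;
use Parse::DebControl;

# Made-up control data: two stanzas separated by a blank line.
my $control_data = <<'END';
Package: hello
Version: 1.0
Description: an example package
 A longer description that
 spans several lines.

Package: world
Version: 2.0
END

my $parser = Parse::DebControl->new;

# Assumption: the result is an array reference holding one hashref per stanza.
my $stanzas = $parser->parse_mem($control_data, { discardCase => 1 });

for my $stanza (@$stanzas) {
    # Assumption: with discardCase the keys come back in lowercase.
    print "$stanza->{package} $stanza->{version}\n";
}
```

The same options hashref could be passed to parse_file or parse_web in place of parse_mem.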

  • write_file($filename, $data, $options)

  • write_file($handle, $data)

  • write_file($filename, [$data1, $data2, $data3], $options)

  • write_file($handle, [$data, $data2, $data3])

    This function takes a filename or a handle and writes the data out. The data can be given as a single hashref or as an arrayref of hashrefs. It will then write it out in a format that it can parse. The order is dependent on your hash sorting order; if you care, use Tie::IxHash. When reading the data back in, the module does not care about order.

    The $options hashref can contain any of the following items:

    addNewline - At the end of the last stanza, add an additional newline.

    appendFile - (default) Write to the end of the file.

    clobberFile - Overwrite the file given.

    gzip - Compress the data with gzip before writing.

    Since you determine the mode of your filehandle, passing an options hashref along with a handle won't do anything; the hashref is ignored.

    The addNewline option solves a situation where, if you are writing stanzas to a file in a loop (such as logging with this module), the data would otherwise be streamed together and would not parse back in correctly. It is possible that this is the behavior that you want (if you wanted to write one key at a time), so it is optional.

    This function returns the number of bytes written to the file, undef otherwise.
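The logging situation that addNewline addresses might look like the following sketch. The field names and the scratch file (created with the core File::Temp module) are hypothetical, and the code assumes parse_file returns an array reference.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use Parse::DebControl;

# Hypothetical log entries; the field names are made up for illustration.
my @entries = (
    { Event => 'start', Status => 'ok' },
    { Event => 'stop',  Status => 'ok' },
);

# Scratch file to append to; UNLINK asks File::Temp to remove it at exit.
my ($fh, $filename) = tempfile(UNLINK => 1);
close $fh;

my $writer = Parse::DebControl->new;
for my $entry (@entries) {
    # appendFile is the documented default, so each call adds to the file.
    # addNewline ends each stanza with an extra newline so that the
    # stanzas stay separated when the file is parsed again.
    my $bytes = $writer->write_file($filename, $entry, { addNewline => 1 });
    defined $bytes or die "write_file failed for $filename\n";
}

# Assumption: parse_file returns an array reference of hashrefs.
my $stanzas = $writer->parse_file($filename);
print scalar(@$stanzas), " stanzas read back\n";
```

Dropping addNewline from the options would be expected to run the entries together in the file, which is the streaming problem described above.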

  • write_mem($data)

  • write_mem([$data1, $data2, $data3])

    This function works similarly to the write_file method, except it returns the control structure as a scalar instead of writing it to a file. There is no %options for this method (yet).
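Because the writer emits a format the parser can read, a round trip through write_mem and parse_mem is a quick sanity check. This is a minimal sketch with a hypothetical stanza; it assumes parse_mem returns an array reference.

```perl
use strict;
use warnings;
use Parse::DebControl;

my $pdc = Parse::DebControl->new;

# Hypothetical stanza: a plain hashref mapping field names to strings.
my $stanza = { Package => 'hello', Version => '1.0' };

# Serialize to a scalar; key order follows Perl's hash ordering.
my $string = $pdc->write_mem($stanza);
print $string;

# Parse the text back. Assumption: the result is an array reference.
my $parsed = $pdc->parse_mem($string);
print "round trip ok\n"
    if $parsed->[0]{Package} eq $stanza->{Package};
```

If field order matters in the output, a Tie::IxHash-based hashref can be passed instead of a plain one.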

  • DEBUG()

    Turns on debugging. Calling it with no parameter or a true parameter turns on verbose warn()ings. Calling it with a false parameter turns it off. It is useful for nailing down any format or internal problems.

CHANGES

Version 2.005 - January 13th, 2004

  • More generic test suite fix for earlier versions of Test::More

  • Updated copyright statement

Version 2.004 - January 12th, 2004

  • More documentation formatting and typo fixes

  • CHANGES file now generated automatically

  • Fixes for potential test suite failure in Pod::Coverage run

  • Adds the "addNewline" option to write_file to solve the streaming stanza problem.

  • Adds tests for the addNewline option

Version 2.003 - January 6th, 2004

  • Added optional Test::Pod test

  • Skips potential Win32 test failure in the module where it wants to write to /tmp.

  • Added optional Pod::Coverage test

Version 2.002 - October 7th, 2003

  • No code changes. Fixes to test suite

Version 2.001 - September 11th, 2003

  • Cleaned up more POD errors

  • Added tests for file writing

  • Fixed bug where write_file ignored the gzip parameter

Version 2.0 - September 5th, 2003

  • Version increase.

  • Added gzip support (with the tryGzip option), so that compressed control files can be parsed on the fly

  • Added gzip support for writing of control files

  • Added parse_web to snag files right off the web. Useful for things such as apt's Sources.gz and Packages.gz

Version 1.10b - September 2nd, 2003

  • Documentation fix for ## vs # in stripComments

Version 1.10 - September 2nd, 2003

  • Documentation fixes, as pointed out by pudge

  • Adds a feature to stripComments where ## will get interpolated as a literal pound sign, as suggested by pudge.

Version 1.9 - July 24th, 2003

  • Fix for warning for edge case (uninitialized value in chomp)

  • Tests for CRLF

Version 1.8 - July 11th, 2003

  • By default, we now strip off whitespace unless verbMultiLine is in place. This makes sense for things like conf files where trailing whitespace has no meaning. Thanks to pudge for reporting this.

Version 1.7 - June 25th, 2003

  • POD documentation error noticed again by Frank Lichtenheld

  • Also by Frank, applied a patch to add a "verbMultiLine" option so that we can hand multiline fields back unparsed.

  • Slightly expanded test suite to cover new features

Version 1.6.1 - June 9th, 2003

  • POD cleanups noticed by Frank Lichtenheld. Thank you, Frank.

Version 1.6 - June 2nd, 2003

  • Cleaned up some warnings when you pass in empty hashrefs or arrayrefs

  • Added stripComments setting

  • Cleaned up POD errors

Version 1.5 - May 8th, 2003

  • Added a line to quash errors with undef hashkeys and writing

  • Fixed the Makefile.PL to straighten up DebControl.pm being in the wrong dir

Version 1.4 - April 30th, 2003

  • Removed exports as they were unnecessary. Many thanks to pudge, who pointed this out.

Version 1.3 - April 28th, 2003

  • Fixed a bug where writing blank stanzas would throw a warning. Fix found and supplied by Nate Oostendorp.

Version 1.2b - April 25th, 2003

Fixed:

  • A bug in the test suite where IxHash was not disabled in 40write.t. Thanks to Jeroen Latour from cpan-testers for the report.

Version 1.2 - April 24th, 2003

Fixed:

  • A bug in IxHash support where multiple stanzas might be out of order

Version 1.1 - April 23rd, 2003

Added:

  • Writing support

  • Tie::IxHash support

  • Case insensitive reading support

Version 1.0 - April 23rd, 2003

  • This is the initial public release for CPAN, so everything is new.

BUGS

The module will let you parse otherwise illegal key-value pairs and pairs with spaces. Badly formed stanzas will do things like overwrite duplicate keys, etc. This is your problem.

As of 1.10, the module uses advanced regular expressions to detect comments. If the tests fail, stripComments won't work on your earlier Perl version (it should be fine on 5.6.0+).

TODO

Change the name over to the Debian:: namespace, probably as Debian::ControlFormat. This will happen as soon as the project that uses this module reaches stability, and we can do some minor tweaks.

COPYRIGHT

Parse::DebControl is copyright 2003,2004 Jay Bonci <[email protected]>. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.