XML::CSV: Perl extension for converting CSV files to XML

SYNOPSIS

  use XML::CSV;
  $csv_obj = XML::CSV->new();
  $csv_obj = XML::CSV->new(\%attr);

  $status = $csv_obj->parse_doc(file_name);
  $status = $csv_obj->parse_doc(file_name, \%attr);

  $csv_obj->declare_xml(\%attr);
  $csv_obj->declare_doctype(\%attr);

  $csv_obj->print_xml(file_name, \%attr);

DESCRIPTION

XML::CSV is a new module that will be upgraded as often as my time permits. For the time being, it uses the default values of a CSV_XS module object to parse the (*.csv) document and then creates a Perl data structure holding the XML tag names and data. At this point it does not offer a write-as-you-parse interface, but that is the first upgrade planned for the next release. I will also allow more access to the data structures and provide more documentation. I will also add more support for XML, since the module currently allows only a simple XML structure. Currently you can modify the tag structure to allow for attributes. No DTD support is currently available, but it will be implemented in an upcoming release. Because the module will provide both object and event interfaces, which one to use will depend on individual needs, system resources, and required performance. Of course, the DOM implementation takes up more resources and, in some instances, more time, but it is the easiest to use.

ATTRIBUTES new()

error_out - Turns on error handling, which dies on all errors and assigns the error message to $XML::CSV::csvxml_error.

column_headings - Specifies the column headings to use. Passed as an array reference. Can be used as an alternative to taking the XML tag names from the first row of the file. Since XML::CSV does not require you to parse a CSV file, you can provide your own data structure to parse.

column_data - Specifies the CSV data as a two-dimensional array. Passed as an array reference.

csv_xs - Specifies the CSV_XS object to use. Use this to supply a custom CSV_XS object that overrides the default one created by XML::CSV.

ATTRIBUTES parse_doc()

headings - Specifies the number of rows to use as tag names. Defaults to 0. Ex. {headings => 1} (This uses the first row of data as the XML tags.)

sub_char - Specifies the character with which illegal tag characters will be replaced. Defaults to undef, meaning no substitution is done. To eliminate the characters use "" (empty string); to replace them with another character, see below. Ex. {sub_char => "_"} or {sub_char => ""}
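The effect of sub_char can be pictured with a small stand-alone sketch. The helper name sanitize_tag and the ASCII-only rule for what counts as a legal tag character are assumptions made for illustration; this is not XML::CSV's own code, and the module's actual character rules may differ.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::CSV: replace every character that is
# not in a simplified set of legal tag characters with $sub_char.
# An undef $sub_char leaves the name untouched, mirroring the documented default.
sub sanitize_tag {
    my ($name, $sub_char) = @_;
    return $name unless defined $sub_char;
    # Simplified assumption: keep ASCII letters, digits, '_', '.' and '-'.
    (my $clean = $name) =~ s/[^A-Za-z0-9_.-]/$sub_char/g;
    return $clean;
}

print sanitize_tag("first name", "_"), "\n";    # replace with underscore
print sanitize_tag("e-mail (work)", ""), "\n";  # eliminate illegal characters
print sanitize_tag("first name", undef), "\n";  # no substitution
```

This sketch does not handle names that begin with a digit, which are also illegal as XML element names.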

ATTRIBUTES declare_xml()

version - Specifies the xml version. Ex. {version => '1.0'}

encoding - Specifies the type of encoding. The XML standard defaults the encoding to 'UTF-8' if it is not specifically set. Ex. {encoding => 'ISO-8859-1'}

standalone - Specifies whether the document is standalone (yes|no). If the document does not rely on an external DTD, the DTD is internal, or the external DTD does not affect the contents of the document, the standalone attribute should be set to 'yes'; otherwise 'no' should be used. For more information, see the XML declaration documentation. Ex. {standalone => 'yes'}
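As a rough illustration of what these three attributes describe, the sketch below assembles a standard XML declaration from a hash with the same keys. The helper name xml_declaration and the fallback to version '1.0' are assumptions for this example; it does not show how declare_xml() is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::CSV: build an XML declaration string
# from the same keys that declare_xml() is documented to accept.
sub xml_declaration {
    my ($attr) = @_;
    # Assumption: fall back to version 1.0 when none is given.
    my $decl = '<?xml version="' . ($attr->{version} // '1.0') . '"';
    $decl .= qq( encoding="$attr->{encoding}")     if defined $attr->{encoding};
    $decl .= qq( standalone="$attr->{standalone}") if defined $attr->{standalone};
    return $decl . '?>';
}

print xml_declaration({
    version    => '1.0',
    encoding   => 'ISO-8859-1',
    standalone => 'yes',
}), "\n";
```

The order version, encoding, standalone is the one the XML specification requires in a declaration.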

ATTRIBUTES declare_doctype()

source - Specifies the source of the DTD (SYSTEM|PUBLIC). Ex. {source => 'SYSTEM'}

location1 - URI to the DTD file. A public ID may be used if source is PUBLIC. Ex. {location1 => 'http://www.xmlproj.com/dtd/index_dtd.dtd'} or {location1 => '-//Netscape Communications//DTD RSS 0.90//EN'}

location2 - Optional second URI. Usually used if the location1 public ID is not found by the validating parser. Ex. {location2 => 'http://www.xmlproj.com/file.dtd'}

subset - Any other information that follows the DTD declaration. Usually includes the internal DTD, if any. Ex. {subset => "ELEMENT first_name (#PCDATA)>\n<!ELEMENT last_name (#PCDATA)>"} You can even interpolate the string with $obj->{column_headings} to dynamically build the DTD. Ex. {subset => "ELEMENT $obj->{column_headings}[0] (#PCDATA)>"}
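Building the subset string for every heading at once might look like the following sketch. The helper build_subset is hypothetical, and it assumes the format shown in the subset example above, where the first declaration omits its leading "<!" and each later declaration starts on a new line.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of XML::CSV. Assumes the subset format from
# the documented example: the first declaration has no leading "<!".
sub build_subset {
    my (@headings) = @_;
    return join("\n<!", map { sprintf 'ELEMENT %s (#PCDATA)>', $_ } @headings);
}

# Stand-in for $obj->{column_headings}
my @headings = ('first_name', 'last_name');
print build_subset(@headings), "\n";
```

The resulting string could then be passed as {subset => build_subset(@headings)}, assuming the headings are already legal element names.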

ATTRIBUTES print_xml()

file_tag - Specifies the file parent tag. Defaults to "records". Ex. {file_tag => "file_data"} (Do not use < and > when specifying.)

parent_tag - Specifies the record parent tag. Defaults to "record". Ex. {parent_tag => "record_data"} (Do not use < and > when specifying.)

format - Specifies the character(s) to use to indent nodes. Defaults to "\t" (tab). Ex. {format => " "} or {format => "\t\t"}
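To make the roles of file_tag, parent_tag, and format concrete, here is a minimal stand-alone sketch that renders headings and rows into nested tags. The function render_xml and the exact output layout are assumptions for illustration; only the three attribute names and their defaults come from the descriptions above, and the real output of print_xml() may differ.

```perl
use strict;
use warnings;

# Hypothetical renderer, not XML::CSV's code: shows how file_tag, parent_tag
# and format could shape the output. Values are assumed to be XML-safe already
# (no escaping of &, < or > is done here).
sub render_xml {
    my ($headings, $rows, $attr) = @_;
    my $file_tag   = $attr->{file_tag}   // 'records';
    my $parent_tag = $attr->{parent_tag} // 'record';
    my $indent     = $attr->{format}     // "\t";

    my $xml = "<$file_tag>\n";
    for my $row (@$rows) {
        $xml .= "$indent<$parent_tag>\n";
        for my $i (0 .. $#$headings) {
            my $tag   = $headings->[$i];
            my $value = $row->[$i] // '';
            # Field tags sit two indent levels deep
            $xml .= ($indent x 2) . "<$tag>$value</$tag>\n";
        }
        $xml .= "$indent</$parent_tag>\n";
    }
    return $xml . "</$file_tag>\n";
}

print render_xml(
    ['name', 'age'],
    [['Ann', 30], ['Bob', 41]],
    {file_tag => 'file_data', format => ' '},
);
```

With format set to a single space, each record is indented by one space and each field by two; leaving it out falls back to tabs.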

PUBLIC VARIABLES

  $csv_obj->{column_headings}
  $csv_obj->{column_data}

EXAMPLES

Example #1:

This is a simple implementation that uses the defaults.

  use XML::CSV;
  $csv_obj = XML::CSV->new();
  $csv_obj->parse_doc("in_file.csv", {headings => 1});

  $csv_obj->print_xml("out.xml");

Example #2:

This example passes in an array reference of headings, which is used together with the parsed data.

  use XML::CSV;
  $csv_obj = XML::CSV->new();

  $csv_obj->{column_headings} = \@arr_of_headings;

  $csv_obj->parse_doc("in_file.csv");
  $csv_obj->print_xml("out.xml", {format => " ", file_tag => "xml_file", parent_tag => "record"});

Example #3:

This example first passes a reference to an array of column headings and then a reference to a two-dimensional array of data, where the first index is the row number and the second is the column number. It also passes a custom Text::CSV_XS object to override the default object. This is useful for setting your own CSV_XS object's arguments before using the parse_doc() method. See 'perldoc Text::CSV_XS' for the different new() attributes.

  use XML::CSV;
  use Text::CSV_XS;

  # Custom Text::CSV_XS object that replaces the default one
  $default_obj_xs = Text::CSV_XS->new({quote_char => '"'});
  $csv_obj = XML::CSV->new({csv_xs => $default_obj_xs});
  $csv_obj->{column_headings} = \@arr_of_headings;

  $csv_obj->{column_data} = \@arr_of_data;