use Text::Diff;
diff \@a, $b { STYLE => "Table" };
This is a plugin output formatter for Text::Diff that generates \*(L"table\*(R" style diffs:
+--+----------------------------------+--+------------------------------+ | |../Test-Differences-0.2/MANIFEST | |../Test-Differences/MANIFEST | | |Thu Dec 13 15:38:49 2001 | |Sat Dec 15 02:09:44 2001 | +--+----------------------------------+--+------------------------------+ | | * 1|Changes * | 1|Differences.pm | 2|Differences.pm | | 2|MANIFEST | 3|MANIFEST | | | * 4|MANIFEST.SKIP * | 3|Makefile.PL | 5|Makefile.PL | | | * 6|t/00escape.t * | 4|t/00flatten.t | 7|t/00flatten.t | | 5|t/01text_vs_data.t | 8|t/01text_vs_data.t | | 6|t/10test.t | 9|t/10test.t | +--+----------------------------------+--+------------------------------+
This format also goes to some pains to highlight \*(L"invisible\*(R" characters on differing elements by selectively escaping whitespace. Each element is split in to three segments (leading whitespace, body, trailing whitespace). If whitespace differs in a segement, that segment is whitespace escaped.
Here is an example of the selective whitespace.
+--+--------------------------+--------------------------+ | |demo_ws_A.txt |demo_ws_B.txt | | |Fri Dec 21 08:36:32 2001 |Fri Dec 21 08:36:50 2001 | +--+--------------------------+--------------------------+ | 1|identical |identical | * 2| spaced in | also spaced in * * 3|embedded space |embedded tab * | 4|identical |identical | * 5| spaced in |\ttabbed in * * 6|trailing spaces\s\s\n |trailing tabs\t\t\n * | 7|identical |identical | * 8|lf line\n |crlf line\r\n * * 9|embedded ws |embedded\tws * +--+--------------------------+--------------------------+
Here's why the lines do or do not have whitespace escaped:
Whether or not line 3 should have that tab character escaped is a judgement call; so far I'm choosing not to.
To output the raw unicode chracters consult the documentation of Text::Diff::Config. You can set the \*(C`DIFF_OUTPUT_UNICODE\*(C' environment variable to 1 to output it from the command line. For more information, consult this bug: <https://rt.cpan.org/Ticket/Display.html?id=54214> .
Table formatting requires buffering the entire diff in memory in order to calculate column widths. This format should only be used for smaller diffs.
Assumes tab stops every 8 characters, as $DIETY intended.
Assumes all character codes >= 127 need to be escaped as hex codes, ie that the user's terminal is \s-1ASCII\s0, and not even \*(L"high bit \s-1ASCII\s0\*(R", capable. This can be made an option when the need arises.
Assumes that control codes (character codes 0..31) that don't have slash-letter escapes (\*(L"\n\*(R", \*(L"\r\*(R", etc) in Perl are best presented as hex escapes (\*(L"\x01\*(R") instead of octal (\*(L"\001\*(R") or control-code (\*(L"\cA\*(R") escapes.
Barrie Slaymaker <[email protected]>
Copyright 2001 Barrie Slaymaker, All Rights Reserved.
You may use this software under the terms of the \s-1GNU\s0 public license, any version, or the Artistic license.