Pod::Abstract::Path: Search for pod nodes matching a path within a document tree.

SYNOPSIS

 /head1(1)/head2          # All head2 elements under
                          # the 2nd head1 element
 //item                   # All items anywhere
 //item[@label =~ {^\*$}] # All items with '*' labels.
 //head2[/hilight]        # All head2 elements containing
                          # "hilight" elements

 # Top level head1s containing head2s that have headings matching
 # "NAME", and also have at least one list somewhere in their
 # contents.
 /head1[/head2[@heading =~ {NAME}]][//over]

 # Top level headings having the same title as the following heading.
 /head1[@heading = >>@heading]

 # Top level headings containing at least one subheading with the same
 # name.
 /head1[@heading = ./head2@heading]

DESCRIPTION

Pod::Abstract::Path is a path selection syntax that allows fast and easy traversal of Pod::Abstract documents. While it has a simple syntax, there is significant complexity in the queries that you can create.

Not all of the designed features have yet been implemented, but it is currently quite useful, and all of the filters in paf make use of Pod Paths.

SYMBOLS:

/

Selects children of the left hand side.

//

Selects all descendants of the left hand side.

.

Selects the current node - this is a NOP that can be used in expressions.

..

Selects the parent node. If there are multiple nodes selected, all of their parents will be included.

^

Selects the root node of the tree for the current node. This allows you to escape from a nested expression. Note that this is the ROOT node, not the node that you started from. If you want to evaluate an expression from a node as though it were the root node, the easiest ways are to detach or dup it - otherwise the root operator will find the original root node.

name, #cut, :text, :verbatim, :paragraph

Any element name, or symbolic type name, will restrict the selection to only elements matching that type. e.g. "//:paragraph" will select all descendants, anywhere, but then restrict that set to only :paragraph type nodes. Several names separated by spaces will match any of those names - e.g. "//head1 over" will match all lists and all head1s.

&, | (union and intersection)

Union will take expressions on either side, and return all nodes that are members of either set. Intersection returns nodes that are members of BOTH sets. These can be used to extend expressions, and within [ expressions ] where a path is supported (left side of a match, left or right side of an = sign). These are NOT logical and/or, though a similar effect can be induced through these operators.

@attrname

The named attribute of the nodes on the left hand side. Current attributes are @heading for head1 through head4, and @label for list items.

[ expression ]

Select only the left hand elements that match the expression in the brackets. The expression will be evaluated from the point of view of each node in the current result set. Expressions can be:

simple: [/head2]

Any regular path will be true if there are any nodes matched. The above example will be true if there are any head2 nodes as direct children of the selected node.

regex match: [@heading =~ {FOO}]

A regex match will be true if the left hand expression has nodes that match the regular expression between the braces on the right hand side. The above example will match anything with a heading containing "FOO". Optionally, the right hand closing brace may have the i modifier to cause case-insensitive matching. i.e. "[@heading =~ {foo}i]" will match "foo" or "fOO".

complement: [! /head2]

Reverses the remainder of the expression. The above example will match anything without a child head2 node.

compare operators

Matches nodes where the operator is satisfied for at least one pair of nodes. The right hand expression can be a constant string (single quoted: 'string'), or a second expression. If two expressions are used, they are matched combinationally - i.e. all result nodes on the left are matched against all result nodes on the right. Both sides may contain nested expressions. The following Perl compatible operators are supported:

 String:  eq gt lt le ge ne
 Numeric: == < > <= >= !=
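The "at least one pair" rule for compare operators can be illustrated in plain Perl. This is only a sketch of the semantics described above, not the module's own code: any_pair_matches and %OPS are hypothetical names, and the two sides are plain lists of strings or numbers standing in for the values of the matched nodes.

```perl
use strict;
use warnings;

# Hypothetical operator table covering the operators listed above.
my %OPS = (
    'eq' => sub { $_[0] eq $_[1] },
    'ne' => sub { $_[0] ne $_[1] },
    'lt' => sub { $_[0] lt $_[1] },
    'gt' => sub { $_[0] gt $_[1] },
    'le' => sub { $_[0] le $_[1] },
    'ge' => sub { $_[0] ge $_[1] },
    '==' => sub { $_[0] == $_[1] },
    '!=' => sub { $_[0] != $_[1] },
    '<'  => sub { $_[0] <  $_[1] },
    '>'  => sub { $_[0] >  $_[1] },
    '<=' => sub { $_[0] <= $_[1] },
    '>=' => sub { $_[0] >= $_[1] },
);

# Hypothetical helper: true if the operator holds for at least one
# (left, right) pair. Every left value is tried against every right value.
sub any_pair_matches {
    my ($op, $left, $right) = @_;
    my $cmp = $OPS{$op} or die "Unsupported operator: $op\n";
    for my $l (@$left) {
        for my $r (@$right) {
            return 1 if $cmp->($l, $r);
        }
    }
    return 0;
}

# Two sets of heading strings that share one value.
print "match\n"
    if any_pair_matches('eq', [ 'NAME', 'SYNOPSIS' ], [ 'DESCRIPTION', 'NAME' ]);
```

Because every left value is compared with every right value, the cost grows with the product of the two result set sizes, which is worth keeping in mind when both sides are broad paths.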

PERFORMANCE

Pod::Abstract::Path is not designed to be fast. It is designed to be expressive and useful, but it involves successive expand/de-duplicate/linear search operations, and doing this with large documents containing many nodes is not suitable for high performance systems.

Simple expressions can be fast enough, but there is nothing to stop you from writing "//[<condition>]" and linear-searching all 10,000 nodes of your Pod document. Use with caution in interactive systems.

INTERFACE

It is recommended you use the Pod::Abstract::Node->select method to evaluate Path expressions.

If you wish to generate paths for use in other modules, use parse_path to generate a parse tree, pass that as an argument to new, then use process to evaluate the expression against a list of nodes. You can re-use the same parse tree to process multiple lists of nodes in this fashion.
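A minimal sketch of the recommended approach, assuming Pod::Abstract is installed and that Pod::Abstract->load_string returns the root node of the parsed tree. The sample document is made up for illustration, and the path expressions are taken from the SYNOPSIS.

```perl
use strict;
use warnings;
use Pod::Abstract;

# A small, made-up POD document, built line by line.
my $pod = join "\n",
    '=head1 NAME', '',
    'Example - a tiny document', '',
    '=head1 METHODS', '',
    '=head2 new', '',
    '=over', '',
    '=item *', '',
    'First point', '',
    '=item *', '',
    'Second point', '',
    '=back', '',
    '=cut', '';

# Parse the string into a node tree; $root is the document root.
my $root = Pod::Abstract->load_string($pod);

# select evaluates a path expression relative to the node it is called on.
my @items     = $root->select('//item');           # items anywhere
my @second_h2 = $root->select('/head1(1)/head2');  # head2s under the 2nd head1

printf "%d item(s), %d head2 node(s)\n", scalar @items, scalar @second_h2;

# Each result is a node, so it can be printed or queried again.
print $_->pod for @second_h2;
```

Since every result is itself a node, select can be called again on any of them to continue the search from that point.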

METHODS

filter_unique

It is possible during processing - especially using ^ or .. operators - to generate many duplicate matches of the same nodes. Each pass around the loop, we filter to unique nodes so that duplicates cannot inflate more than one time.

This effectively means that "//^" (however awful that is) will match one node only - just really inefficiently.
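The idea of filtering a result list down to unique nodes can be sketched in plain Perl as follows. This is an illustration only and not necessarily how filter_unique is implemented: unique_by_identity is a hypothetical name, and the sketch assumes that two entries are duplicates when they are the same reference.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# Hypothetical helper: keep the first occurrence of each reference and
# preserve the original order of the list.
sub unique_by_identity {
    my %seen;
    return grep { !$seen{ refaddr($_) }++ } @_;
}

# Stand-ins for nodes; any references will do for the illustration.
my $root  = { name => 'root' };
my $child = { name => 'child' };

# Applying a root-style step to many nodes yields the same node repeatedly.
my @matches = ($root, $child, $root, $root, $child);
my @unique  = unique_by_identity(@matches);

printf "%d matches reduced to %d unique nodes\n",
    scalar @matches, scalar @unique;
```

Running such a filter after every step keeps the working set no larger than the number of distinct nodes in the tree, even when an operator maps many nodes to the same target.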

parse_path

Parse a list of lexemes and generate a driver tree for the process method. This is a simple recursive descent parser with one element of lookahead.

AUTHOR

Ben Lilburne

COPYRIGHT AND LICENSE

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
