Task diff
This task was introduced in PageSeeder version 5.9605.
When the export task is used with the attribute compareto="[version]", it generates PSML with a <compare> element for each fragment that has been edited since the specified version. Each <compare> element has a <content> element that contains the content of the fragment as it was at that version.
This task adds a <diff> element under each <compare> element which is the DiffX comparison between the version being exported and the compareto version. For example:
<diff>
  <fragment xmlns:dfx="http://www.topologi.com/2005/Diff-X"
            xmlns:del="http://www.topologi.com/2005/Diff-X/Delete"
            xmlns:ins="http://www.topologi.com/2005/Diff-X/Insert"
            id="2">
    <para>The <dfx:del>quick</dfx:del><dfx:ins>slow</dfx:ins></para>
    <para indent="1" ins:indent="true">brown</para>
    <para del:indent="1">fox</para>
    <para dfx:insert="true"><dfx:ins>jumps</dfx:ins></para>
  </fragment>
</diff>
- This task only works on PSML with level="portable".
- Only files with <compare> elements are output to the destination folder.
- If the diff fails for some reason, then no <diff> element is added for that fragment.
Definition
Minimal definition:
<ps:diff src="[source]" dest="[destination]"/>
Full definition:
<ps:diff src="[source]"
         dest="[destination]"
         maxevents="[maximum diff events]"
         granularity="[character|word|space_word|text]"
         whitespace="[compare|preserve|ignore]">
  <files>
    <include name="[name]" />
    <exclude name="[name]" />
  </files>
</ps:diff>
Attributes
| Attribute | Description | Required | Default |
|---|---|---|---|
| src | The source folder on the file system of the universal portable format input. | Yes | |
| dest | The destination folder on the file system for the output files. | Yes | |
| maxevents | The maximum number of diff events allowed (see following note). | No | 4000000 |
| granularity | Defines the granularity of the text comparison used by DiffX. Allowed values are: character, word, space_word, text. | No | space_word |
| whitespace | Defines how whitespace is processed by DiffX. Allowed values are: compare, preserve, ignore. | No | preserve |
The number of diff events is the count of elements/attributes/text in each fragment, multiplied by each other. When maxevents is reached, the diff switches to the coarsest granularity (text) and tries again. If the number of events is still larger than maxevents, the <diff> element is not generated. For reasonable performance, a maximum of 4,000,000 is recommended.
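As a rough illustration of this note, assuming the event count is the product of the counts in the two versions of a fragment (one reading of the note, not confirmed here): two versions with about 2,000 events each give 2,000 × 2,000 = 4,000,000 events, which is right at the default limit. The sketch below shows a task that sets a lower limit and non-default comparison options, using only the attributes from the full definition. The paths and the values chosen are illustrative assumptions, not recommendations.

```xml
<!-- Illustrative values only: a lower event limit than the default,
     word granularity instead of space_word, whitespace set to ignore. -->
<ps:diff src="c:\working\export"
         dest="c:\working\diff"
         maxevents="1000000"
         granularity="word"
         whitespace="ignore"/>
```

With a lower maxevents, large fragments would be expected to fall back to text granularity sooner, or to get no <diff> element at all, so it is worth checking the output for fragments that have a <compare> element but no <diff>.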
Elements
Element <files>
Used to diff only certain files in the src folder.
This element can contain multiple nested <include name="" /> or <exclude name="" /> elements as defined in the following sections.
Element <include>
A pattern matching documents/folders to include. If not present, then all documents/folders are included.
| Attribute | Description | Required |
|---|---|---|
| name | The pattern to match. The format is similar to the file selection in other Ant tasks. Example: ????/* | Yes |
Element <exclude>
A pattern matching documents/folders to exclude. If not present, then no documents/folders are excluded.
| Attribute | Description | Required |
|---|---|---|
| name | The pattern to match. The format is similar to the file selection in other Ant tasks. | Yes |
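A minimal sketch of combining both elements, assuming the patterns behave like Ant's own include/exclude patterns (the text above only says the format is similar). The folder names articles and articles/archive are hypothetical.

```xml
<!-- Hypothetical folders: diff everything under articles,
     except anything under articles/archive. -->
<ps:diff src="c:\working\export" dest="c:\working\diff">
  <files>
    <include name="articles/**" />
    <exclude name="articles/archive/**" />
  </files>
</ps:diff>
```

In standard Ant file selection an exclude takes precedence over an include when both match; whether this task follows the same rule should be confirmed by testing.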
Examples
Diff and prepare for processing
The following example adds diff elements to exported documents and then copies them back to the original folder so they can be processed.
<ps:diff src="c:\working\export" dest="c:\working\diff"/>
<copy todir="c:\working\export" overwrite="true">
  <fileset dir="c:\working\diff" />
</copy>
Only diff certain folders
The following example only diffs PSML files under the top level articles folder.
<ps:diff src="c:\working\export" dest="c:\working\diff">
  <files>
    <include name="articles/**" />
  </files>
</ps:diff>