Validating documents

Overview

PageSeeder has built-in document validation technology based on Schematron. Schematron is an ISO standard (ISO/IEC 19757-3) for rule-based validation of XML documents. The following screenshot shows a PSML document validated using the Best Practice schema that comes with PageSeeder.

This schema demonstrates a small sample of Schematron capabilities and can be customized to suit more specific needs. When authoring documents, there are several advantages to Schematron when compared to conventional XML schemas (W3C XSD files), including:

  • Much better error and warning messages – instead of cryptic parser messages designed for programmers, Schematron messages are written for specific circumstances (for example, do not embed a photo without including a credit and copyright statement). This improves productivity and reduces training and author frustration.
  • Ability to prioritize information – not all information is equal. While all documents created in PageSeeder are always valid against the PSML schema, Schematron makes it straightforward to add extra checks in areas of specific importance, such as complex references, metadata, or detailed content such as addresses or part numbers.

Although it differs from the approach used by conventional XML editing tools, Schematron lets developers validate more constructs than W3C XML schemas can express, and it takes less time to implement.
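
To illustrate the kind of author-friendly message described above, here is a minimal sketch of a Schematron pattern. It assumes that PSML images are marked up as an image element with src and alt attributes; the rule and its wording are hypothetical and are not taken from the Best Practice schema.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron" id="image-text">

  <!-- Match every embedded image (assumed PSML element name) -->
  <sch:rule context="image">

    <!-- Fail when the alt attribute is missing or contains only whitespace -->
    <sch:assert test="normalize-space(@alt) != ''">Image
      <sch:value-of select="@src"/> has no alternative text.
      Add a short description so that readers know what the image shows.
    </sch:assert>

  </sch:rule>

</sch:pattern>
```

The message names the offending image and tells the author what to do next, which a generic parser error cannot do.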

Usage

Single file validation

PageSeeder provides a validation icon in the right margin of every PageSeeder document. It uses the Schematron schema specific to the document, if any.

Document validation panel

Batch validation

A more powerful feature of PageSeeder is the ability to validate an entire collection of documents. This can be done by selecting specific documents from search results or by validating a folder.

Batch validation is very useful for QA or when the structure or semantics of documents needs to evolve.

For example, batch validation can be used for:

  • Checking that assets match specific constraints (for example, their dimension or resolution) before being published.
  • Ensuring all xrefs are resolved.
  • Diagnosing any structural issue that might not be supported in a publish process.
  • Enforcing domain-specific semantics.
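
As a sketch of one of these checks, the pattern below reports cross-references that have not been resolved. It assumes that PSML marks such references with an unresolved="true" attribute; confirm this against the PSML produced by your PageSeeder version before relying on it.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron" id="resolved-xrefs">

  <!-- Match cross references -->
  <sch:rule context="xref|blockxref">

    <!-- Assumption: an unresolved reference carries unresolved="true" -->
    <sch:assert test="not(@unresolved = 'true')">Cross-reference to
      <sch:value-of select="@href"/> is unresolved.
    </sch:assert>

  </sch:rule>

</sch:pattern>
```

Run as a batch validation over a folder, a rule like this would list every document that contains a broken reference.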

Configuring a Schematron schema

The easiest way to configure a Schematron schema for a specific document type is to use the Template configuration page.
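
As a rough sketch of what such a schema file might contain, the following is a minimal but complete Schematron schema. The xslt2 query binding, the use of document as the root element of a PSML document, and the rule itself are assumptions made for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron"
            queryBinding="xslt2">

  <sch:title>Example rules for a hypothetical document type</sch:title>

  <sch:pattern id="structure">

    <!-- Assumption: document is the root element of a PSML document -->
    <sch:rule context="/document">

      <!-- Check the document contains at least one heading1 -->
      <sch:assert test=".//heading[@level='1']">Document has no heading1.
      </sch:assert>

    </sch:rule>

  </sch:pattern>

</sch:schema>
```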

Special URLs

When running Schematron validations in PageSeeder, the document() function can be used with the following special URLs to get additional data in PageSeeder:

[URI path or URL]

A URI path for a document (for example /ps/acme/specs/mydoc.psml) or a full URL returns the PSML for that document or URL. In PageSeeder 5.9811 or higher, non-PSML documents and URLs return the Universal Metadata Format. Example:

<!-- Match cross references -->
<sch:rule context="xref|blockxref">

  <!-- Check referenced document contains a heading2 -->
  <sch:assert test="document(@href)//heading[@level='2']">Document
    <sch:value-of select="@href"/> has no heading2.
  </sch:assert>

</sch:rule>

ps:search

This returns search results from the group where the validation was launched and can use any parameters from the Group search service. Example:

<sch:let name="definitions"
         value="document(
                'ps:search?filters=psdocumenttype:definition')" />

If the parameter project=myproject is used, then all groups the user can view under that project are searched. To restrict the groups, use the groups parameter with a comma-separated list of group names. To maximize performance, the results of up to 30 searches are cached and reused for the validation of each single folder or batch search. Requires PageSeeder 5.9800 or higher.
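
A sketch of a project-wide search restricted to two groups might look like the following. The group names myproject-specs and myproject-glossary are hypothetical, and the let is placed inside a pattern only so that the fragment is well-formed on its own. Note that the ampersands between parameters must be escaped as &amp;amp; inside the attribute value.

```xml
<sch:pattern xmlns:sch="http://purl.oclc.org/dsdl/schematron" id="definitions-lookup">

  <!-- Search two (hypothetical) groups under the project for definition documents -->
  <sch:let name="definitions"
           value="document(concat(
                  'ps:search?project=myproject',
                  '&amp;groups=myproject-specs,myproject-glossary',
                  '&amp;filters=psdocumenttype:definition'))"/>

</sch:pattern>
```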

ps:source-metadata

This uses the same parameters and returns the same XML as the Get external URI source metadata for URL service. Requires PageSeeder 5.9807 or higher. Example:

<sch:let name="url-metadata"
         value="document(concat(
         'ps:source-metadata?method=head&amp;url=',
         encode-for-uri(@href)))"/>

As of PageSeeder 5.9900, if no url parameter is supplied, the URL for the URI being validated is used. This is recommended when validating URLs as it also throttles the validation so a host isn’t sent too many requests. Example:

<sch:let name="url-metadata"
         value="document('ps:source-metadata?method=head')"/>

ps:self

This returns the same XML as the Get self service. Requires PageSeeder 5.9811 or higher. Example:

<sch:let name="self"
         value="document('ps:self')"/>
<sch:let name="memberid"
         value="$self/member/@id" />

ps:workflow

This returns the same XML as the Get URI workflow service. Requires PageSeeder 5.9811 or higher. Example:

<sch:let name="workflow"
         value="document('ps:workflow')" />

Sample code

There are several examples of Schematron rules in the Schematron code samples.
