How to validate documents using Schematron

Skills required: Schematron
Time required (minutes): 30
Intended audience: Developer
Difficulty: Easy
Category: Document

Objective

By the end of this tutorial you will be able to use Schematron to validate film documents. This includes learning how to define validation rules and implementing them on PageSeeder.

Prerequisites

This tutorial assumes that:

Tutorial

All the files for this tutorial can be found on GitHub.

The film document includes four sections which are:

  • title;
  • properties list;
  • film summary;
  • film poster.
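For orientation, a film document with these four sections might look roughly like the PSML below. This is a sketch, not a copy of the files in films.zip: the section ids 'summary' and 'image' and the property names are the ones the validation rules on this page select on, while the 'properties' section id, the fragment ids, the titles, the property values and the image path are hypothetical placeholders.

```xml
<document type="film" level="portable">
  <!-- Title -->
  <section id="title">
    <fragment id="1">
      <heading level="1">Example Film</heading>
    </fragment>
  </section>
  <!-- Properties list (section id and fragment id are assumptions) -->
  <section id="properties">
    <properties-fragment id="2">
      <property name="year" title="Year" value="2010"/>
      <property name="classification" title="Classification" value="PG"/>
      <property name="country" title="Country" value="Australia"/>
      <!-- Multi-valued properties carry @count and one <value> per entry -->
      <property name="genre" title="Genre" count="n">
        <value>Drama</value>
        <value>Comedy</value>
      </property>
      <!-- director, producer, writer and actor are assumed to use the same multi-valued shape -->
    </properties-fragment>
  </section>
  <!-- Film summary: needs at least one <para> -->
  <section id="summary">
    <fragment id="3">
      <para>A short summary of the film.</para>
    </fragment>
  </section>
  <!-- Film poster: the image must exist and resolve -->
  <section id="image">
    <fragment id="4">
      <image src="images/example_poster.jpg"/>
    </fragment>
  </section>
</document>
```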

Validation rules

The validation rules are listed below. The Schematron code that implements them is in the Code example section at the end of this page.

Properties

Property name  | Rule
---------------|-----------------------------------------------------------
Year           | Must be a number greater than 1900 and less than next year
Classification | Must be one of "M", "PG", "G", "R18+", ...
Country        | Must not be empty
Director       | Must not be empty
Genre          | Must not be empty. Each value must also be in the value set.
Producer       | Must not be empty
Writer         | Must not be empty
Actor          | Must not be empty

Fragments

Fragment name | Rule
--------------|-----------------------------------------------------
Film summary  | Must contain at least one paragraph (<para>) element
Film poster   | Must not be empty and the link must be resolved

Upload film documents

Download films.zip from GitHub;

Go to your group on the PageSeeder server and click the Upload a document icon;

Create a new folder under 'documents' called 'films' and make sure it is selected;

Drag and drop your .zip file into the page or click Browse files to select it;

Click the Unzip icon next to the file in PageSeeder;

Click Continue and then Continue again to confirm.

Create a Schematron

In the PageSeeder group, change to developer perspective (hover over the cube icon at the top left of the screen);

Select Document types from the developer tab;

Once on the document type configuration page, click Create document type to create a new document type called 'film';

Then in the row for 'film' type, click on create in the 'Schematron' column.

create_schematron.png

Edit the Schematron

Edit the Schematron configuration file and click on Save. See the code examples below for code to enter.

edit_schematron_for_validation.png
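The code examples further down this page show individual patterns only. As a rough sketch of how they fit together in one file, the skeleton below assumes the ISO Schematron namespace and an XSLT 2.0 query binding (the rules use XPath 2.0 functions such as current-date()). The title text is hypothetical, and if the file PageSeeder creates already has a root element, keep that one and only add the patterns.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: namespace and queryBinding are assumptions, not taken from the PageSeeder template -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

  <sch:title>Rules for film documents</sch:title>

  <!-- Rules for the properties list -->
  <sch:pattern name="Properties">
    <!-- sch:rule elements for year, classification, genre, ... -->
  </sch:pattern>

  <!-- Rules for the film summary -->
  <sch:pattern name="summary">
    <!-- sch:rule for section[@id='summary'] -->
  </sch:pattern>

  <!-- Rules for the film poster -->
  <sch:pattern name="Image">
    <!-- sch:rule elements for section[@id='image'] and image -->
  </sch:pattern>

</sch:schema>
```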

Run the Schematron

Go back to the 'films' folder and validate all the documents.

document_collection_validation.png

Alternatively, open one document and validate this document by clicking on the blue navigation bar.

Single_document_validation.png

Make content invalid

Edit the content fragment and make the content invalid to test validation. For example, change Year to 'abc' and save it.

edit_properties.png
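In PSML terms, the edit described above amounts to something like the change below. The title attribute and the original value are hypothetical; only the property name and the 'abc' value come from this tutorial.

```xml
<!-- Before: a plausible valid value (assumed), which passes the year rule -->
<property name="year" title="Year" value="2010"/>

<!-- After: number('abc') is NaN, so both comparisons in the year rule are false -->
<property name="year" title="Year" value="abc"/>
```

With the second version saved, the year assertion fails and its message reports that year 'abc' is not valid.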

View the validation error

After validating the document, you will then see the validation error message. Clicking on details... will allow you to see the rule.

error_message_for_single_doc_with_detail.png

When validating an entire folder, you will see a pop-up window displaying the error message. For more details, click on Validation Report. This will display the summary.

error_message_for_folder_validation.png

error_message_for_folder_validation_with_detail.png

Code example

Validate properties

  <!--
    Set of rules applying to the document properties
  -->
  <sch:pattern name="Properties">

    <!-- Classification: must equal one of the allowed codes exactly.
         (contains() would also accept substrings such as 'P' or an empty value.) -->
    <sch:rule context="property[@name='classification']">
      <sch:assert test="@value = ('G', 'PG', 'M', 'R18+')">
        Fragment '<sch:value-of select="../@id"/>' - classification '<sch:value-of select="@value"/>' must be G, PG, M or R18+.
      </sch:assert>
    </sch:rule>

    <!-- Year: after 1900 and no later than the current year -->
    <sch:rule context="property[@name='year']">
      <sch:assert test="number(@value) le number(year-from-date(current-date())) and number(@value) gt 1900">
        Fragment '<sch:value-of select="../@id"/>' - year '<sch:value-of select="@value"/>' is not valid.
      </sch:assert>
    </sch:rule>

    <!-- Genre: multi-valued property with at least one value -->
    <sch:rule context="property[@name='genre']">
      <sch:assert test="@count and count(value)">
        Fragment '<sch:value-of select="../@id"/>' - at least one genre must be specified.
      </sch:assert>
    </sch:rule>

    <!-- Director -->
    <sch:rule context="property[@name='director']">
      <sch:assert test="@count and count(value)">
        Fragment '<sch:value-of select="../@id"/>' - at least one director must be specified.
      </sch:assert>
    </sch:rule>

    <!-- Writer -->
    <sch:rule context="property[@name='writer']">
      <sch:assert test="@count and count(value)">
        Fragment '<sch:value-of select="../@id"/>' - at least one writer must be specified.
      </sch:assert>
    </sch:rule>

    <!-- Producer -->
    <sch:rule context="property[@name='producer']">
      <sch:assert test="@count and count(value)">
        Fragment '<sch:value-of select="../@id"/>' - at least one producer must be specified.
      </sch:assert>
    </sch:rule>

    <!-- Actor -->
    <sch:rule context="property[@name='actor']">
      <sch:assert test="@count and count(value)">
        Fragment '<sch:value-of select="../@id"/>' - at least one actor must be specified.
      </sch:assert>
    </sch:rule>

    <!-- Country: the value must be present and not blank -->
    <sch:rule context="property[@name='country']">
      <sch:assert test="normalize-space(@value) != ''">
        Fragment '<sch:value-of select="../@id"/>' - country must not be empty.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

You can also check each 'genre' value against an external value set. New values can then be added to the value set later without changing the validation code. See Using external code list below for details.

Validate summary

  <sch:pattern name="summary">
    <sch:rule context="section[@id='summary']">
      <sch:assert test="count(descendant::para)">
        Document '<sch:value-of select="../@id"/>' - film summary is missing.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

Validate image (film poster)

  <sch:pattern name="Image">
 
    <!-- Image exists -->
    <sch:rule context="section[@id='image']">      
       <sch:assert test="count(descendant::image) and descendant::image/@src">
         Document '<sch:value-of select="../@id"/>' - film poster is missing.
       </sch:assert>
    </sch:rule>
    
    <!-- Image resolved -->
    <sch:rule context="image">      
       <sch:assert test="not(@unresolved) and @uriid">
         Fragment '<sch:value-of select="../@id"/>' -  image at <sch:value-of select="@src"/> is unresolved.
       </sch:assert>
    </sch:rule>
  </sch:pattern>
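To make the two image rules concrete, the PSML fragments below sketch a poster that would pass and one that would fail. The attribute names src, uriid and unresolved are the ones the rules test; the fragment id, file paths and the uriid value are hypothetical.

```xml
<!-- Resolved: has @uriid and no @unresolved, so both image rules pass -->
<section id="image">
  <fragment id="4">
    <image src="images/example_poster.jpg" uriid="12345"/>
  </fragment>
</section>

<!-- Unresolved: fails the test "not(@unresolved) and @uriid" -->
<section id="image">
  <fragment id="4">
    <image src="images/missing_poster.jpg" unresolved="true"/>
  </fragment>
</section>
```

A section with no image element at all fails the first rule instead, which is why the pattern needs both.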

Using external code list

Define code list file

Create a PSML file (e.g. film_codes.psml) as below and upload it to the PageSeeder group.

<document level="portable">
<section id="title">
  <fragment id="1">
    <heading level="1">film codes</heading>
  </fragment>
</section>
<section id="rules">
  <fragment id="classification-codes">
    <heading level="2">classification-codes</heading>
    <list>
      <item>G</item>
      <item>PG</item>
      <item>M</item>
      <item>R18+</item>
    </list>
  </fragment>
  <fragment id="genre-codes">
    <heading level="2">genre-codes</heading>
    <list>
      <item>Action</item>
      <item>Drama</item>
      <item>Thriller</item>
      <item>Romance</item>
      <item>Comedy</item>
    </list>
  </fragment>
</section>
</document>

Declare the reference to the external code list file in Schematron

<sch:pattern name="Properties">
    <sch:let name="URI" value="'/ps/films/tutorial/documents/film_codes.psml'"/>
    <sch:let name="code-list-document" value="document($URI)" />
...
</sch:pattern>

Note

Replace the sample path above with the URI of your own code list file. The folder path is shown when you expand the details at the top of the document view page.
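A wrong URI is easy to miss, because rules that compare against an empty code list simply report every value as invalid. One possible safeguard, sketched below, is an extra rule inside the same pattern that checks the code list actually contains genre codes. It assumes the validated root element is document and that the processor returns an empty result for an unresolvable URI rather than stopping with an error; neither is confirmed by this tutorial.

```xml
<!-- Hypothetical guard: place inside the "Properties" pattern, after the two sch:let declarations -->
<sch:rule context="/document">
  <sch:assert test="exists($code-list-document//fragment[@id='genre-codes']//item)">
    Code list '<sch:value-of select="$URI"/>' could not be loaded or contains no genre codes.
  </sch:assert>
</sch:rule>
```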

Use the value in code list to check document content

Property contains single value:

 <!-- Classification -->
    <sch:rule context="property[@name='classification']"> 
      <sch:let name="classification-list" value="$code-list-document//fragment[@id='classification-codes']"/>
      <sch:assert test="$classification-list//item = @value">
      Fragment '<sch:value-of select="../@id"/>' - classification '<sch:value-of select="@value"/>' is not valid. Matching values are <sch:value-of select="$classification-list/list"/>
      </sch:assert>
    </sch:rule>

Property contains multiple values:

    <!-- Genre: at least one value must be present -->
    <sch:rule context="property[@name='genre']">
      <sch:assert test="@count and count(value)">
        Fragment '<sch:value-of select="../@id"/>' - at least one genre must be specified.
      </sch:assert>
    </sch:rule>
    <!-- Genre: each value must appear in the code list.
         The context is the value element, so the fragment is two levels up. -->
    <sch:rule context="property[@name='genre']/value">
      <sch:let name="genre-list" value="$code-list-document//fragment[@id='genre-codes']"/>
      <sch:assert test="$genre-list//item = current()">
        Fragment '<sch:value-of select="../../@id"/>' - genre '<sch:value-of select="current()"/>' is not valid.
      </sch:assert>
    </sch:rule>

Reference

schematron sample code 

Schematron official website 

Validating Code Lists with Schematron 
