How to validate documents using Schematron
Skills required | Schematron |
---|---|
Time required (minutes) | 30 |
Intended audience | Developer |
Difficulty | Easy |
Category | Document |
Objective
By the end of this tutorial you will be able to use Schematron to validate film documents. This includes defining validation rules and implementing them on PageSeeder.
Prerequisites
This tutorial assumes that:
- You have administrator access to a PageSeeder server with at least one group.
Tutorial
All the files for this tutorial can be found on GitHub.
The film document includes four sections which are:
- title;
- properties list;
- film summary;
- film poster.
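As a rough orientation, a film document with these four sections might look like the PSML sketch below. The section IDs 'summary' and 'image' and the property names come from the validation rules in this tutorial. Everything else (the 'properties' section ID, the fragment IDs, the titles, the values and the image path) is a hypothetical placeholder, so compare it with the files in films.zip rather than treating it as the actual source.

```xml
<!-- Hypothetical film document: IDs, titles and values are illustrative only -->
<document level="portable" type="film">
  <section id="title">
    <fragment id="1">
      <heading level="1">Example film</heading>
    </fragment>
  </section>
  <section id="properties">
    <properties-fragment id="2">
      <property name="year" title="Year" value="1999"/>
      <property name="classification" title="Classification" value="M"/>
      <property name="country" title="Country" value="Australia"/>
      <!-- Multi-value property: the rules test @count and the value children -->
      <property name="genre" title="Genre" count="n">
        <value>Drama</value>
        <value>Thriller</value>
      </property>
      <!-- director, producer, writer and actor are assumed to have the same shape as genre -->
    </properties-fragment>
  </section>
  <section id="summary">
    <fragment id="3">
      <para>A one-paragraph summary of the film.</para>
    </fragment>
  </section>
  <section id="image">
    <fragment id="4">
      <!-- The image rule expects a resolved image, which carries a uriid once uploaded -->
      <image src="images/poster.jpg"/>
    </fragment>
  </section>
</document>
```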
Validation rules
Here is a list of the validation rules. For the implementation, see the code examples at the bottom of the page.
Properties
Property name | Rule |
---|---|
Year | Must be a number greater than 1900 and less than next year |
Classification | Must be one of “M, PG, G, R18+, ... ” |
Country | Must not be empty |
Director | Must not be empty |
Genre | Must not be empty. The value is also checked against a value set. |
Producer | Must not be empty |
Writer | Must not be empty |
Actor | Must not be empty |
Fragments
Fragment name | Rule |
---|---|
Film summary | Must contain at least one paragraph (para element) |
Film poster | Must not be empty and the link must be resolved |
Upload film documents
Download films.zip from GitHub;
Go to your group on the PageSeeder server and click the Upload a document icon;
Create a new folder under 'documents' called 'films' and make sure it is selected;
Drag and drop your .zip file into the page or click Browse files to select it;
Click the Unzip icon next to the file in PageSeeder;
Click Continue and then Continue again to confirm.
Create a Schematron
In the PageSeeder group, enable developer view;
Go to the Document types page;
Once on the document type page, click Create document type... to create a new document type called 'film';
Then in the row for 'film' type, click create in the 'Schematron' column.
Edit the Schematron
Edit the Schematron configuration file and click Save. See the code examples below for code to enter.
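The code examples on this page show only the patterns, so the sketch below illustrates where they might sit inside a complete schema. It assumes the ISO Schematron namespace and the XSLT 2.0 query binding (the Year rule relies on XPath 2.0 functions such as current-date()). The title text is a placeholder. If the file PageSeeder created for you has a different root element or attributes, keep those and only add the patterns.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: keep the root element PageSeeder generated if it differs -->
<sch:schema xmlns:sch="http://purl.oclc.org/dsdl/schematron" queryBinding="xslt2">

  <!-- Placeholder title -->
  <sch:title>Rules for film documents</sch:title>

  <!-- One pattern per group of rules; paste the rules from the code examples here -->
  <sch:pattern name="Properties">
    <sch:rule context="property[@name='country']">
      <sch:assert test="normalize-space(@value) != ''">
        Fragment '<sch:value-of select="../@id"/>' - country must not be empty.
      </sch:assert>
    </sch:rule>
  </sch:pattern>

</sch:schema>
```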
Run the Schematron
Go back to the 'films' folder and validate all the documents.
Alternatively, open a single document and validate it from the blue navigation bar.
Make content invalid
Edit the content fragment and make the content invalid to test validation. For example, change Year to 'abc' and save it.
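In PSML terms, the change to Year might look like the fragment below. The title attribute and the original value 1999 are assumptions made for illustration. The point is that number('abc') evaluates to NaN, so both comparisons in the Year rule are false and the assertion fails.

```xml
<!-- Before: passes the Year rule (greater than 1900, not after the current year) -->
<property name="year" title="Year" value="1999"/>

<!-- After: number('abc') is NaN, so the Year assertion fails -->
<property name="year" title="Year" value="abc"/>
```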
View the validation error
After validating the document, you will see the validation error message. Click details... to see the rule that failed.
When validating an entire folder, you will see a pop-up window displaying the error message. For more details, click on Validation Report. This will display the summary.
Code example
Validate properties
<!-- Set of rules applying to the document properties -->
<sch:pattern name="Properties">

  <!-- Classification: exact match against the allowed values -->
  <sch:rule context="property[@name='classification']">
    <sch:assert test="@value = ('G', 'PG', 'M', 'R18+')">
      Fragment '<sch:value-of select="../@id"/>' - classification '<sch:value-of select="@value"/>' must be G, M, PG or R18+.
    </sch:assert>
  </sch:rule>

  <!-- Year: after 1900 and no later than the current year -->
  <sch:rule context="property[@name='year']">
    <sch:assert test="number(@value) le number(year-from-date(current-date())) and number(@value) gt 1900">
      Fragment '<sch:value-of select="../@id"/>' - year '<sch:value-of select="@value"/>' is not valid.
    </sch:assert>
  </sch:rule>

  <!-- Genre: multi-value property, needs at least one value -->
  <sch:rule context="property[@name='genre']">
    <sch:assert test="@count and count(value)">
      Fragment '<sch:value-of select="../@id"/>' - at least one genre must be specified.
    </sch:assert>
  </sch:rule>

  <!-- Director -->
  <sch:rule context="property[@name='director']">
    <sch:assert test="@count and count(value)">
      Fragment '<sch:value-of select="../@id"/>' - at least one director must be specified.
    </sch:assert>
  </sch:rule>

  <!-- Writer -->
  <sch:rule context="property[@name='writer']">
    <sch:assert test="@count and count(value)">
      Fragment '<sch:value-of select="../@id"/>' - at least one writer must be specified.
    </sch:assert>
  </sch:rule>

  <!-- Producer -->
  <sch:rule context="property[@name='producer']">
    <sch:assert test="@count and count(value)">
      Fragment '<sch:value-of select="../@id"/>' - at least one producer must be specified.
    </sch:assert>
  </sch:rule>

  <!-- Actor -->
  <sch:rule context="property[@name='actor']">
    <sch:assert test="@count and count(value)">
      Fragment '<sch:value-of select="../@id"/>' - at least one actor must be specified.
    </sch:assert>
  </sch:rule>

  <!-- Country: single-value property, must be present and not blank -->
  <sch:rule context="property[@name='country']">
    <sch:assert test="normalize-space(@value) != ''">
      Fragment '<sch:value-of select="../@id"/>' - country must not be empty.
    </sch:assert>
  </sch:rule>

</sch:pattern>
You can further check the value of 'genre' against an external value set. New values can then be added to the value set later without changing the validation code. For details, see Using external code list below.
Validate summary
<sch:pattern name="summary">
  <!-- The summary section must contain at least one paragraph -->
  <sch:rule context="section[@id='summary']">
    <sch:assert test="count(descendant::para)">
      Document '<sch:value-of select="../@id"/>' : film summary is missing.
    </sch:assert>
  </sch:rule>
</sch:pattern>
Validate image (film poster)
<sch:pattern name="Image">

  <!-- Image exists -->
  <sch:rule context="section[@id='image']">
    <sch:assert test="count(descendant::image) and descendant::image/@src">
      Document '<sch:value-of select="../@id"/>' : film poster is missing.
    </sch:assert>
  </sch:rule>

  <!-- Image resolved -->
  <sch:rule context="image">
    <sch:assert test="not(@unresolved) and @uriid">
      Fragment '<sch:value-of select="../@id"/>' - image at <sch:value-of select="@src"/> is unresolved.
    </sch:assert>
  </sch:rule>

</sch:pattern>
Using external code list
Define code list file
Create a PSML file (e.g. film_codes.psml) as below and upload it to the PageSeeder group.
<document level="portable">
  <section id="title">
    <fragment id="1">
      <heading level="1">film codes</heading>
    </fragment>
  </section>
  <section id="rules">
    <fragment id="classification-codes">
      <heading level="2">classification-codes</heading>
      <list>
        <item>G</item>
        <item>PG</item>
        <item>M</item>
        <item>R18+</item>
      </list>
    </fragment>
    <fragment id="genre-codes">
      <heading level="2">genre-codes</heading>
      <list>
        <item>Action</item>
        <item>Drama</item>
        <item>Thriller</item>
        <item>Romance</item>
        <item>Comedy</item>
      </list>
    </fragment>
  </section>
</document>
Declare the reference to the external code list file in Schematron
<sch:pattern name="Properties">
  <sch:let name="URI" value="'/ps/films/tutorial/documents/film_codes.psml'"/>
  <sch:let name="code-list-document" value="document($URI)"/>
  ...
</sch:pattern>
Replace the sample path above with the URI of your own code list file. The folder path is shown when you expand the details at the top of the document view page.
Use the value in code list to check document content
Property contains single value:
<!-- Classification -->
<sch:rule context="property[@name='classification']">
  <sch:let name="classification-list" value="$code-list-document//fragment[@id='classification-codes']"/>
  <sch:assert test="$classification-list//item = @value">
    Fragment '<sch:value-of select="../@id"/>' - classification '<sch:value-of select="@value"/>' is not valid. Valid values are: <sch:value-of select="string-join($classification-list//item, ', ')"/>
  </sch:assert>
</sch:rule>
Property contains multiple values:
<!-- Genre -->
<sch:rule context="property[@name='genre']">
  <sch:assert test="@count and count(value)">
    Fragment '<sch:value-of select="../@id"/>' - at least one genre must be specified.
  </sch:assert>
</sch:rule>
<!-- Each genre value must appear in the code list; the context is 'value', so the fragment is two levels up -->
<sch:rule context="property[@name='genre']/value">
  <sch:let name="genre-list" value="$code-list-document//fragment[@id='genre-codes']"/>
  <sch:assert test="$genre-list//item = current()">
    Fragment '<sch:value-of select="../../@id"/>' - genre '<sch:value-of select="current()"/>' is not valid.
  </sch:assert>
</sch:rule>
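To see how the per-value rule behaves, consider the hypothetical property below. The title attribute and the value 'Western' are made up for illustration. 'Drama' appears in the genre-codes list defined earlier, so it passes. 'Western' does not, so the assertion on that value element fails and is reported separately.

```xml
<property name="genre" title="Genre" count="n">
  <!-- Listed in the genre-codes fragment: passes -->
  <value>Drama</value>
  <!-- Not listed in the genre-codes fragment: the assertion fails for this value -->
  <value>Western</value>
</property>
```

Adding an item for 'Western' to the genre-codes list would make this document valid without any change to the Schematron.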
Reference
Validating Code Lists with Schematron