How to convert XML data to PSML documents

Skills required: XML, XSLT
Time required (minutes): 30
Intended audience: Developer
Difficulty: Easy
Category: Document

Objective

The content of PSML documents often comes from other systems, which can mean converting from a variety of XML structures. This tutorial demonstrates how to convert XML generated by Wikipedia to PSML documents using XSLT. The PSML is then uploaded to PageSeeder to check the results.

Prerequisites

Completing this tutorial requires:

  • Software to read and write zip files.
  • Access to the command prompt on the computer running the tutorial (on Windows, run cmd).
  • A decent text editor such as Notepad++ or Sublime, or anything other than Windows Notepad, which will have problems with the line endings.
  • Access to a PageSeeder server with at least a contributor role on the tutorial group.

All the necessary files for this tutorial are on GitHub.

Tutorial

Installing an XSLT processor

XSLT is a W3C standard programming language designed primarily to process XML content. It is particularly well suited to converting XML to another syntax, such as plain text, or to an alternate structure, such as HTML.
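
As a rough illustration of this kind of conversion, the sketch below turns a list of films into an HTML list. The input element names (films, film, title, year) are assumptions made for this example only and are not taken from wikipediafilms.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Assumed (hypothetical) input:
     <films><film><title>...</title><year>...</year></film>...</films> -->
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html" indent="yes"/>

  <!-- Match the root element and wrap the result in an HTML list -->
  <xsl:template match="/films">
    <ul>
      <xsl:apply-templates select="film"/>
    </ul>
  </xsl:template>

  <!-- Each film becomes one list item of the form "Title (Year)" -->
  <xsl:template match="film">
    <li>
      <xsl:value-of select="title"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="year"/>
      <xsl:text>)</xsl:text>
    </li>
  </xsl:template>

</xsl:stylesheet>
```

Each template matches one kind of source element and writes the corresponding target structure, so changing the output format mostly means changing the literal elements inside the templates.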

Interpreting and running XSLT code requires an XSLT processor. The steps below install the Saxon XSLT processor on Windows.

  • If it is not already present, install Java as follows:
  1. Go to https://www.oracle.com/java and choose Java for Developers / Java SE.
  2. Download the JDK or JRE .exe for Windows (x64 for 64-bit) and run it.

Run the XSLT code

  • Copy the following files from GitHub to a tutorial folder, for example c:\ps\films:
    • wikipediafilms.xml – the source XML data.
    • films.xsl – the XSLT code.
  • Open a command prompt in the folder with the files from the previous step and use Saxon to process the XML with the XSLT code. For example:
> java -jar c:\saxon\saxon9he.jar -s:wikipediafilms.xml -xsl:films.xsl -o:output.txt

This should create files in the current folder, named according to the following pattern:

 film-[n].psml

For example:

film-1.psml
film-2.psml
film-3.psml
etc.
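
One way a single stylesheet can write numbered files like these is the XSLT 2.0 xsl:result-document instruction. The following is a minimal sketch and not the contents of films.xsl: the input names (films, film, title, summary) are hypothetical, and the PSML shown is only a bare document, section and fragment skeleton that should be checked against the PSML reference.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed (hypothetical) input: /films/film with title and summary children -->
  <xsl:template match="/films">
    <xsl:for-each select="film">
      <!-- position() yields 1, 2, 3, ... so the files are
           film-1.psml, film-2.psml, film-3.psml, ... -->
      <xsl:result-document href="film-{position()}.psml"
                           method="xml" indent="yes">
        <document level="portable">
          <section id="content">
            <fragment id="1">
              <heading level="1"><xsl:value-of select="title"/></heading>
              <para><xsl:value-of select="summary"/></para>
            </fragment>
          </section>
        </document>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

With Saxon, a relative href in xsl:result-document is resolved against the location given by the -o option, which is consistent with the files appearing in the current folder when the command uses -o:output.txt.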

Before continuing, open some of the files in a text editor to check their content. For reference on the XSLT conversion code or the PSML markup, also review wikipediafilms.xml and films.xsl.

Include images

Adding images to the collection requires storing the image files in a path relative to the PSML documents. For example, use the following folders:

c:\ps\films\
c:\ps\films\images

Copy the film images from GitHub to the images folder.
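
To show how the relative path might appear in the generated PSML, here is a small sketch of a template that writes an image reference. It assumes, hypothetically, that each film element in the source has an image child holding a file name; the actual structure of wikipediafilms.xml and the markup produced by films.xsl may differ.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed (hypothetical) input: <film><image>poster.jpg</image>...</film> -->
  <xsl:template match="film/image">
    <!-- The src value is relative to the PSML file, so it points
         into the images subfolder next to the film-[n].psml files -->
    <image src="images/{normalize-space(.)}"/>
  </xsl:template>

</xsl:stylesheet>
```

Because the path is relative, the reference should keep working after upload as long as the images folder stays alongside the PSML files.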

Package and upload the PSML

The final step requires moving the data from a local file system into PageSeeder. Do this via the following steps:

  • Zip the folder containing the *.psml and image files into a single zip archive.
  • Upload the archive to the PageSeeder group and select the unzip icon (see image below).
  • After unzipping the archive, continue through the rest of the upload.

ps_upload-films.jpg

Next Steps

The film documents can now be viewed in PageSeeder and used in conjunction with other tutorials to further explore PageSeeder development.
