How to convert XML data to PSML documents
| Skills required | XML, XSLT |
| Time required (minutes) | 30 |
The content of PSML documents often comes from other systems, which can mean converting from a variety of XML structures. This tutorial demonstrates how to convert XML generated by Wikipedia to PSML documents using XSLT. The PSML is then uploaded to PageSeeder to check the results.
This tutorial requires:
- Software to read and write zip files.
- Access to the command prompt on the computer running the tutorial (on Windows run
- A capable text editor such as Notepad++ or Sublime Text. Avoid Windows Notepad, which will have problems with the line endings.
- Access to a PageSeeder server with at least a contributor role on the tutorial group.
All the necessary files for this tutorial are on GitHub.
Installing an XSLT processor
XSLT is a W3C standard programming language designed primarily to process XML content. It is particularly well suited to converting XML to other syntaxes such as plain text, or to alternate structures such as HTML.
Running XSLT code requires an XSLT processor. The steps below install the Saxon XSLT processor on Windows.
- If it is not already present, install Java as follows:
- Go to https://www.oracle.com/java and choose Java for Developers / Java SE.
- Download the JDK or JRE .exe for Windows (x64 for 64 bit) and run it.
- Go to http://sourceforge.net/projects/saxon/files/ , download the latest SaxonHE .zip file, unzip and install it.
Run the XSLT code
- Copy the following files from Github to a tutorial folder, for example
- wikipediafilms.xml – the source XML data.
- films.xsl – the XSLT code.
- Open a command prompt in the folder with the files from the previous step and use Saxon to process the XML with the XSLT code. For example:
> java -jar c:\saxon\saxon9he.jar -s:wikipediafilms.xml -xsl:films.xsl -o:output.txt
This should create files in the current folder according to the following naming pattern:
film-1.psml, film-2.psml, film-3.psml, etc.
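The generated files can be given a quick programmatic check before opening them by hand. The following is a minimal sketch in Python, which is not part of the tutorial files on GitHub. It assumes the conversion wrote its film-N.psml files into a single folder. It only confirms that each file is well-formed XML and reports the root element name; it does not validate the content against the PSML schema. The function names check_psml_files and print_report are hypothetical.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def check_psml_files(folder="."):
    """Parse every film-*.psml file in `folder`.

    Returns a list of (file name, root tag or None, error message or None),
    sorted by file name.
    """
    results = []
    for path in sorted(Path(folder).glob("film-*.psml")):
        try:
            root = ET.parse(str(path)).getroot()
            results.append((path.name, root.tag, None))
        except ET.ParseError as err:
            # The file exists but is not well-formed XML.
            results.append((path.name, None, str(err)))
    return results


def print_report(folder="."):
    """Print one status line per generated file."""
    for name, root_tag, error in check_psml_files(folder):
        if error is None:
            print(f"{name}: well-formed, root element <{root_tag}>")
        else:
            print(f"{name}: NOT well-formed ({error})")
```

Calling print_report() from the tutorial folder lists each output file with its status. A file reported as not well-formed usually points to a problem in the XSLT rather than in the source data.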
Before continuing, open some files in a text editor to check the content. Also as reference for XSLT conversion code or PSML markup, review
Adding images to the collection requires storing the image files at a path relative to the PSML documents. For example, the following:
Copy the film images from GitHub to this folder.
Package and upload the PSML
The final step moves the data from the local file system into PageSeeder. Do this via the following steps:
- Zip the folder with the *.psml and image files into a single zip archive.
- Upload the archive to the PageSeeder group and select the unzip icon (see image below).
- After unzipping the archive, simply continue through the upload.
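The zip step above can be done with any archive tool. As an alternative, here is a minimal sketch in Python using the standard zipfile module. It assumes the PSML files and the images all sit under one tutorial folder. The function name zip_folder and the folder and archive names in the usage note are hypothetical.

```python
import zipfile
from pathlib import Path


def zip_folder(folder, archive_path):
    """Write every file under `folder` into a zip archive at `archive_path`.

    Names inside the archive are stored relative to `folder`, so image files
    kept in subfolders stay at the same relative location that the PSML
    documents refer to. Returns the list of stored names.
    """
    folder = Path(folder)
    archive_path = Path(archive_path)
    stored = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and the archive itself if it lives in `folder`.
            if not path.is_file() or path.resolve() == archive_path.resolve():
                continue
            name = path.relative_to(folder).as_posix()
            archive.write(path, arcname=name)
            stored.append(name)
    return stored
```

For example, zip_folder("films", "films.zip") would package a folder named films into films.zip, ready to upload to the PageSeeder group.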
The film documents can now be viewed in PageSeeder and can be used in conjunction with other tutorials to further explore PageSeeder development. Recommended tutorials are:
- How to convert XML data to PSML documents with XRefs – learn how to link PSML documents with cross-references (XRefs).
- Upload and link spreadsheet data with documents – learn how to integrate Excel data with PageSeeder documents.
- Create a searchable index for an external website – use the PageSeeder eco-system to build search capabilities into an external website.