Universal Portable Format
For PageSeeder data, the Universal Portable Format is the metaphorical equivalent of a USB drive. It allows users to transport a collection of documents and images from one server to another. The format packages all contents in a single ZIP file and has the following features:
- A consistent, standalone, server-independent representation of a PageSeeder file set.
- Isomorphic conversion to/from PageSeeder (except for edit history and comments, which belong to the group, not the document), including:
  - All the necessary files, plus a manifest.
  - Links fully resolved using relative paths.
- Uses the widely supported ‘zip’ format.
Processing the universal portable format is standard for several PageSeeder interfaces, including:
- Upload (input).
- Export (output).
- Start and end points for a number of Apache Ant tasks.
Structure
The universal portable format zip package has the following folders and files:
- META-INF – a folder containing manifest.xml and a PSML metadata file for every folder and non-PSML file in the package. For example:
  - folder – myspec/images.psml
  - file – myspec/images/figure1.jpg.psml
  Any META-INF files are optional when uploading the package to PageSeeder.
- META-INF/_urls – a folder containing a representation of every URL in the package, using the convention [scheme]/[host]/[port]/[Unique ID].psml. For example:
  - scheme folder – https
  - host folder – en.wikipedia.org
  - port folder – 443
  - unique ID file – 210022.psml
  The unique ID can be any string that is unique for scheme/host/port. For an xref to reference a URL, the @href attribute must match the URL.
  When uploading, there is no need to reproduce the URL structure; the URL metadata files can sit directly under the META-INF/_urls path, for example:
  META-INF/_urls/210022.psml
- The actual files in the package, arranged relative to a specified context path, as described under Context path.
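The naming conventions above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and filling in a default port from the scheme (80 for http, 443 for https) is an assumption based on the en.wikipedia.org example rather than documented behaviour.

```python
from urllib.parse import urlsplit

# Assumption: a URL without an explicit port uses the scheme's default port.
DEFAULT_PORTS = {"http": 80, "https": 443}


def metadata_entry(package_path):
    """Metadata file for a folder or non-PSML file inside the package."""
    return f"META-INF/{package_path}.psml"


def url_metadata_entry(url, unique_id):
    """Metadata file for a URL: [scheme]/[host]/[port]/[Unique ID].psml."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS[parts.scheme]
    return f"META-INF/_urls/{parts.scheme}/{parts.hostname}/{port}/{unique_id}.psml"


folder_meta = metadata_entry("myspec/images")
file_meta = metadata_entry("myspec/images/figure1.jpg")
url_meta = url_metadata_entry("https://en.wikipedia.org/wiki/ZIP", "210022")
```

Because the full URL structure is optional on upload, a producer could also write the URL metadata straight to META-INF/_urls/210022.psml.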
Content
The manifest.xml lists all the documents in the export set using the following format:
    <uris>
      <uri id="123" scheme="http" host="acme.com" port="80"
           path="/ps/acme/specs/documents/my%20spec.psml"
           decodedpath="/ps/acme/specs/documents/my spec.psml"
           mediatype="application/vnd.pageseeder.psml+xml"
           documenttype="spec" />
      ...
    </uris>
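As a rough illustration of consuming the manifest, the sketch below reads the `<uri>` entries with Python's standard ElementTree parser. It assumes the manifest looks exactly like the excerpt above, with the elided entries omitted; `list_documents` is a hypothetical helper, not part of any PageSeeder API.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Sample manifest copied from the excerpt above.
MANIFEST = """<uris>
  <uri id="123" scheme="http" host="acme.com" port="80"
       path="/ps/acme/specs/documents/my%20spec.psml"
       decodedpath="/ps/acme/specs/documents/my spec.psml"
       mediatype="application/vnd.pageseeder.psml+xml"
       documenttype="spec" />
</uris>"""


def list_documents(manifest_xml):
    """Return the attributes of every <uri> element as plain dictionaries."""
    root = ET.fromstring(manifest_xml)
    return [dict(uri.attrib) for uri in root.iter("uri")]


documents = list_documents(MANIFEST)
first = documents[0]
# In this sample, decodedpath is the percent-decoded form of path.
paths_agree = unquote(first["path"]) == first["decodedpath"]
```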
Non-PSML document metadata is expressed as the <document> element with the attribute level="metadata". There are no required sub-elements. The elements <documentinfo>, <fragmentinfo>, <metadata> and <fragments> are optional.
    <document level="metadata">
      <documentinfo>
        <uri id="234" docid="fig2" scheme="http" host="acme.com" port="80"
             path="/ps/acme/products/images/figure%202.jpg"
             decodedpath="/ps/acme/products/images/figure 2.jpg"
             mediatype="image/jpg">
          <displaytitle>Figure 2</displaytitle>
          <description>Overall system diagram</description>
          <labels>Spec,System</labels>
        </uri>
      </documentinfo>
    </document>
All PageSeeder PSML files have the attribute level="portable" on the <document> element. The only required sub-element is <section>. The elements <documentinfo>, <fragmentinfo>, <metadata> and <toc> are optional.
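The two document levels can be told apart with a small check like the one below. It is a minimal sketch covering only the rules stated above (a `<document>` root, and a `<section>` required at the portable level). `check_document` is a hypothetical helper and does not validate against the PSML schema.

```python
import xml.etree.ElementTree as ET


def check_document(xml_text):
    """Return a list of problems found in a PSML document string."""
    root = ET.fromstring(xml_text)
    if root.tag != "document":
        return [f"root element is <{root.tag}>, expected <document>"]
    problems = []
    # Metadata-level documents have no required sub-elements;
    # portable-level documents need at least one <section>.
    if root.get("level") == "portable" and root.find("section") is None:
        problems.append("portable document has no <section>")
    return problems


metadata_ok = check_document('<document level="metadata"/>')
portable_bad = check_document('<document level="portable"/>')
portable_ok = check_document(
    '<document level="portable"><section id="content"/></document>'
)
```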
Context path
The locations of all files in the export set are defined by the context path.
The context path always starts with /ps (the site prefix), and:
- When exporting a document – the parent folder of the document is the default value.
- When exporting a folder – the default value is the folder itself.
For example, the file /ps/acme/specs/documents/spec.psml is represented as follows:
- with context path – /ps/acme/specs
- the package path – documents/spec.psml
Where files are in the same group, but outside the specified context, including them in the export requires using the _local folder. For example, exporting the file /ps/acme/specs/images/figure1.jpg from a context of /ps/acme/specs/documents would produce the package path _local/images/figure1.jpg.
A file that is not in the same group must use the _external folder. For example, /ps/acme/products/images/figure2.jpg breaks down as:
- site prefix – /ps
- project – /acme
- group – /products
- folder – /images
- file – /figure2.jpg
If the specified context, within the source group, is /ps/acme/specs/documents, the file figure2.jpg would have the package path:
_external/acme/products/images/figure2.jpg
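The three cases above (inside the context, same group but outside the context, and a different group) can be expressed as one small function. This is a sketch of the rules as described on this page, not PageSeeder's own implementation; `package_path` and its parameters are hypothetical names, and the group is passed as a full path such as /ps/acme/specs.

```python
from pathlib import PurePosixPath


def package_path(file_path, context_path, group_path, site_prefix="/ps"):
    """Map a server path to its path inside the package."""
    source = PurePosixPath(file_path)
    # Try the narrowest base first: context, then group, then site prefix.
    candidates = (
        (context_path, None),        # inside the context: plain relative path
        (group_path, "_local"),      # same group, outside the context
        (site_prefix, "_external"),  # a different group
    )
    for base, folder in candidates:
        try:
            relative = source.relative_to(base)
        except ValueError:
            continue  # not under this base, try the next one
        return str(PurePosixPath(folder) / relative) if folder else str(relative)
    raise ValueError(f"{file_path} is not under {site_prefix}")


in_context = package_path(
    "/ps/acme/specs/documents/spec.psml", "/ps/acme/specs", "/ps/acme/specs"
)
local = package_path(
    "/ps/acme/specs/images/figure1.jpg", "/ps/acme/specs/documents", "/ps/acme/specs"
)
external = package_path(
    "/ps/acme/products/images/figure2.jpg", "/ps/acme/specs/documents", "/ps/acme/specs"
)
```

Checking the context first means that a wider context such as /ps/acme turns a file from another group into an ordinary relative path instead of an _external one.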
For xref and image paths to resolve when uploading exported documents to another server, both the source and target documents must be in the context. For example, if xrefs in the group acme-specs were pointing to documents in the acme-products group, then the context would need to be /ps/acme.
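One way to pick a context that covers every source and target document is to take the deepest folder they all share. The sketch below uses Python's `posixpath.commonpath` for this; `common_context` is a hypothetical helper, and it assumes every path is an absolute server path starting with the site prefix.

```python
import posixpath


def common_context(paths):
    """Deepest folder that contains every given document path."""
    folders = [posixpath.dirname(path) for path in paths]
    return posixpath.commonpath(folders)


# An xref from the acme-specs group to a file in the acme-products group.
context = common_context([
    "/ps/acme/specs/documents/book.psml",
    "/ps/acme/products/images/figure2.jpg",
])

# A single document falls back to its parent folder.
single = common_context(["/ps/acme/specs/documents/book.psml"])
```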
Example
Source
Group | Path (in group) | References |
---|---|---|
acme-specs | documents/book.psml | graph.jpg, figure1.JPG, figure2.jpg |
acme-specs | documents/graph.jpg | |
acme-specs | images/figure1.JPG | |
acme-products | images/figure2.jpg | |
Specifications
Parameter | Value |
---|---|
Source path | /ps/acme/specs/documents/book.psml |
Context path | /ps/acme/specs/documents |
Destination path | {OUT}/ |
Output
    {OUT}/META-INF/manifest.xml
    {OUT}/META-INF/graph.jpg.psml
    {OUT}/META-INF/_local/images.psml
    {OUT}/META-INF/_local/images/figure1.JPG.psml
    {OUT}/META-INF/_external/acme/products.psml
    {OUT}/META-INF/_external/acme/products/images.psml
    {OUT}/META-INF/_external/acme/products/images/figure2.jpg.psml
    {OUT}/book.psml (References: graph.jpg, _local/images/figure1.JPG, _external/acme/products/images/figure2.jpg)
    {OUT}/graph.jpg
    {OUT}/_local/images/figure1.JPG
    {OUT}/_external/acme/products/images/figure2.jpg
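Since the package is an ordinary ZIP file, a listing like the one above can be inspected with any ZIP library. The following sketch builds a small in-memory archive with a few of the example's entries (empty placeholders, not real PSML) and separates the META-INF metadata from the content files. `split_entries` is a hypothetical helper.

```python
import io
import zipfile


def split_entries(package):
    """Return (metadata, content) entry names from an open ZipFile."""
    metadata, content = [], []
    for name in package.namelist():
        if name.endswith("/"):
            continue  # skip explicit directory entries
        target = metadata if name.startswith("META-INF/") else content
        target.append(name)
    return metadata, content


# Placeholder archive that mirrors part of the example output.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    for name in (
        "META-INF/manifest.xml",
        "META-INF/graph.jpg.psml",
        "book.psml",
        "graph.jpg",
        "_local/images/figure1.JPG",
        "_external/acme/products/images/figure2.jpg",
    ):
        package.writestr(name, b"")

with zipfile.ZipFile(buffer) as package:
    metadata, content = split_entries(package)
```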