How to import xrefs, captions and citations from DOCX

Skills required: XML
Time required (minutes): 15
Intended audience: Developer
Difficulty: Easy
Category: Document

Objective

This tutorial demonstrates how to import xrefs, captions and citations from a DOCX file as PageSeeder PSML documents.

It is designed to work with a DOCX file exported from PageSeeder using specific styles, but could be modified to use other styles.

Prerequisites

Completing this tutorial requires:

  • PageSeeder administrator access to a server with PageSeeder version 6.2 or later installed.

All the files for this tutorial are on GitHub.

Tutorial

Project setup and installing configuration bundle

  1. Log in to PageSeeder as an administrator.
  2. Go to the project where you would like to install these capabilities. Alternatively create a new project by clicking the System administration button at top right. Then click on New project under Projects & Groups and enter a project name, plus any description, then Submit.
  3. Click the Project administration button at the top.
  4. Select Template files from the left menu and click the Create button at the bottom of the page to open the Samples dialog.
  5. Install a configuration bundle – Click the Bundle button, then in the Bundle type field, select Document type from the dropdown.
    1. Click the Select button to the right of a bundle – for this example, select Advanced auto-numbered publication. This bundle is only required for exporting.
    2. Install the bundle under the root document type for your publication (e.g. references, shown in the Document type field on the right of the dialog). Click Install.
  6. Install the import bundle used in Step 9 – Click the Create button again, click the Bundle button, then in the Bundle type field, select General from the dropdown.
    1. Click the Select button to the right of Import auto-numbered publication DocX.
    2. Click Install.

Verify package content - project configuration files

On the Template files page, check the files as follows.

Import process

The document/docx folder contains the following files:

Filename | Description
upload/build.xml | Custom ANT script for importing the report docx as PSML.
upload/import-transform.xsl | XSLT to process xrefs, captions, and bibliography specific to the report on import.
word-import-config.xml | The configuration to import the report as PSML. The custom style mappings are at the end of the file.
psml-split-config.xml | The configuration to split the report PSML into the correct PSML documents and fragments on import.

Publish process – Import

Scroll to the bottom of the page to see the publish folder which contains the following file:

Filename | Description
publish-config.xml | The publish configuration for the import script above, including @override="true" on the <target> element, which means it replaces the default script in this project.
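
To confirm which targets in publish-config.xml carry the override flag, a small script can scan a downloaded copy of the file. The following is a minimal sketch in Python using only the standard library. It is not part of the tutorial files, the function name overriding_targets is hypothetical, and it assumes nothing about the file beyond the <target> element and override attribute described above.

```python
import xml.etree.ElementTree as ET


def overriding_targets(source):
    """Return the attributes of every <target> element with override="true".

    `source` is a path or a binary file-like object containing the XML.
    Elements are matched on their local name, so the check also works if
    the file happens to declare an XML namespace (an assumption, not a
    documented fact about this configuration).
    """
    root = ET.parse(source).getroot()
    matches = []
    for element in root.iter():
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name == "target" and element.get("override") == "true":
            matches.append(dict(element.attrib))
    return matches
```

An empty list would suggest that no target in the file replaces a default script.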

Publication types

Select Template configuration from the left menu, then Publication types tab. In the first column, it should show the report publication type, which includes these files:

Filename | Description
publication-config.xml | The configuration file that determines numbering in PageSeeder, including heading, paragraph, and figure / table numbering, see Publications and publication types. Only required after importing.
pdf-export-config.xml | The configuration file that determines the formatting of PSML as PDF, see Exporting to PDF. Not required for this tutorial.
word-export-config.xml | The configuration file that determines the formatting of PSML as DOCX, see Export Microsoft Word docx config usage. Only required for exporting.
word-export-template.docx | The default DOCX template which you can customize. Only required for exporting.

If you prefer a different name to report, select Template files from the left menu, scroll down to the report folder and select Rename from the right drop-down menu. Change the name and click Rename. Remember to substitute the new name whenever this tutorial mentions report.

Uploading a publication

Upload the prepared example publication as follows:

  1. In PageSeeder go to the project from Step 1 and create a new group by clicking the Project administration button at the top. Then select Create group from the left menu and enter a group name, plus any description, then Submit and Go to group.
  2. Click Group documents (folder icon) at the top, then click the documents folder.
  3. Select Upload document from the (+) menu top right and drag the documents-[date].zip file (downloaded from GitHub, see the link under Prerequisites) to the upload dialogue.
  4. Click the Unzip icon next to this zip file.
  5. Click the Continue button at the bottom of the dialogue.
  6. Click the Confirm and upload button and wait for the “Successfully uploaded” message, then click Close.
  7. Click the “System Report” folder, then click the “System Report” document.
  8. Click the Document information icon on the right, click Make this document a publication, ensure there is a Publication ID and for this example, select Publication type "report". Click Submit then Save and refresh the page in your browser.

Export as a Word document

  1. Go to any document in your publication.
  2. Click the Document export (rocket) icon in the left margin.
  3. Under the Choose an export action drop down, select Export publication as custom DOCX and click Run.
  4. Open the document in Microsoft Word and see how the xrefs, captions and citations are displayed.
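
Because the import depends on the styles in the exported file, it can help to check which styles the DOCX actually uses before adjusting the style mappings in word-import-config.xml. The following is a minimal Python sketch, not part of the tutorial files. It assumes only that a DOCX is a ZIP archive containing word/document.xml, and the function name count_styles is hypothetical. It reports style IDs (the w:val values), which can differ from the display names shown in Word.

```python
import zipfile
import xml.etree.ElementTree as ET
from collections import Counter

# WordprocessingML main namespace used by word/document.xml.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def count_styles(source):
    """Count paragraph and character style IDs used in a DOCX body.

    `source` is a path or a binary file-like object holding the DOCX.
    Returns a (paragraph_styles, run_styles) pair of Counters keyed by
    style ID. Headers, footers, and footnotes are not inspected.
    """
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    value_attribute = f"{{{W}}}val"
    paragraph_styles = Counter(
        element.get(value_attribute) for element in root.iter(f"{{{W}}}pStyle")
    )
    run_styles = Counter(
        element.get(value_attribute) for element in root.iter(f"{{{W}}}rStyle")
    )
    return paragraph_styles, run_styles
```

For example, calling count_styles("report.docx") (a hypothetical file name) returns two counters whose keys can be compared against the styles that the import configuration maps.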

Import a report Word document as PSML

Import the exported DOCX file into a different group to avoid confusion.

  1. Click on the Group administration (wrench icon), select your project from the left menu, then click Create group.
  2. Enter a group name, plus any description, then Submit.
  3. Click Go to group, then Group documents (folder icon) at the top, then click the documents folder.
  4. Select Upload document from the (+) menu top right and drag the .docx file (you exported) to the upload dialogue.
  5. Click the Import options icon (arrow on the right of the file), then click Import custom document as PageSeeder PSML.
  6. Click the Preview button and a preview of the import will appear. You can explore it by clicking links and using the back button at the top.
  7. Click the Back to upload button and click Save.
  8. Click the Continue button and click Confirm and upload when it appears.
  9. When the "Successfully uploaded ..." message appears click the Close button (ignore the warnings).
  10. Click the folder for your imported publication, then click the root document. You will see the imported report, but the table of contents is not numbered.
  11. To apply numbering, click the Document information icon on the right, then click Make this document a publication at the bottom of the panel.
  12. Ensure there is an ID and under Type select "report", then click Submit.
  13. Click the Save button at the bottom and then close the info panel by clicking the cross at top right.
  14. Refresh the page in your browser and the numbering is displayed.

This custom import does not create placeholders, split definitions into a shared library, label the appendix documents or link the bibliography to a sources library.
