Advanced topics

Default Word Import Processing

To control the docx import process, use the Default config.

Following is the default import handling for docx files:

References document

The references document type has been designed to represent a book structure, with front matter, a table of contents and then a collection of documents.

  • If the Word document has content in the Document Property Title field, that content becomes the Document title of the PSML References document.
  • The Word document filename becomes the filename of the PSML References document.
  • If there is no Title in the Word document Properties, the docx filename becomes both the Document title and the filename of the PSML References document.
  • Content in the Word document that comes before the first heading 1 imports into the first section (front matter) of the References document.
  • The content of each heading 1 and heading 2 in the Word document imports as the content of a cross-reference in the second section of the References document. These cross-references point to a collection of PSML component documents. This second section is equivalent to a Word Table of Contents.
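The title and filename rules above can be sketched in a few lines of Python. This is only an illustration of the rules as described, not the importer's actual code: the function name `reference_document_names` and its parameters are hypothetical, and the sketch returns the filename without an extension because the text above does not say which extension is applied.

```python
from pathlib import Path
from typing import Optional, Tuple


def reference_document_names(docx_path: str, title_property: Optional[str]) -> Tuple[str, str]:
    """Return (document title, filename stem) for the PSML References document.

    Hypothetical helper: models the rules described above, nothing more.
    """
    # The docx filename (without its extension) is always used as the filename.
    stem = Path(docx_path).stem
    # The Title property wins if it has content; otherwise fall back to the filename.
    has_title = title_property is not None and title_property.strip() != ""
    title = title_property.strip() if has_title else stem
    return title, stem


# A document with a Title property keeps that title.
with_title = reference_document_names("manual.docx", "User Manual")
# A document without a Title property uses the filename for both values.
without_title = reference_document_names("manual.docx", None)
```

Treating a whitespace-only Title as empty is an assumption of this sketch; the rules above only distinguish between a Title with content and no Title.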

Component documents

  • Each component PSML document has a PSML Document title and contains the content of its heading 1 (or heading 2) and any content following it in the Word document until the next heading 1 or 2. The text of the PSML Document title is generated from the content of the first paragraph of each referenced document, that is, the content of the heading 1 or heading 2.
    • For example, a Word document containing a heading 1 followed directly by a heading 2, then a paragraph, then a heading 3 and a list will split into two PSML component documents. The first contains just the text of the heading 1. The second contains the content of the heading 2, the following paragraph, the heading 3 and the list.
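The splitting rule can be illustrated with a short Python sketch. It assumes the Word document has already been reduced to a flat list of `(style, text)` pairs; that representation, the style labels (`"h1"`, `"p"`, `"list"` and so on) and the function name `split_into_components` are all hypothetical and are not part of the import configuration.

```python
from typing import Dict, List, Tuple

Block = Tuple[str, str]  # (style, text), e.g. ("h1", "Introduction")


def split_into_components(blocks: List[Block]) -> Tuple[List[Block], List[Dict]]:
    """Split a flat block list into front matter and component documents."""
    front_matter: List[Block] = []
    components: List[Dict] = []
    for style, text in blocks:
        if style in ("h1", "h2"):
            # Every heading 1 or heading 2 starts a new component document,
            # and its text becomes that document's title.
            components.append({"title": text, "blocks": [(style, text)]})
        elif components:
            # Anything else belongs to the most recently started component.
            components[-1]["blocks"].append((style, text))
        else:
            # Content before the first heading stays in the front matter.
            front_matter.append((style, text))
    return front_matter, components


# The example from the text: heading 1, heading 2, paragraph, heading 3, list.
example = [
    ("h1", "Chapter"),
    ("h2", "Section"),
    ("p", "Some text."),
    ("h3", "Detail"),
    ("list", "Item one"),
]
front, parts = split_into_components(example)
```

Here `parts` holds two components: the first contains only the heading 1, and the second contains the heading 2 and everything after it.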

Fragments within component documents

  • Upon upload, the content of a heading 1 (or heading 2) in the Word document will be in the first fragment of the component document.
  • Any content immediately following the heading will import into the second fragment, whether it is a heading 3 (or a lower-level heading) or any other content.
  • Content that is a heading 3 will be in a new fragment, and any content that follows the heading will be in the same fragment until another heading 3 or a heading 4 is encountered.
  • Content of a heading 4 and any content that follows the heading will be in a new fragment. Heading 5 and lower-level headings, and any content that follows them, will be in the same fragment as the parent heading 4.
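As a rough illustration, the fragment rules above can be expressed as a small Python function. It assumes one component document is given as a flat list of `(style, text)` pairs that starts with its heading 1 or heading 2; the representation and the name `split_into_fragments` are hypothetical, and the sketch reflects only the behaviour described in this section.

```python
from typing import List, Tuple

Block = Tuple[str, str]  # (style, text), e.g. ("h3", "Details")


def split_into_fragments(blocks: List[Block]) -> List[List[Block]]:
    """Group the blocks of one component document into fragments."""
    if not blocks:
        return []
    # The first fragment holds only the heading 1 (or heading 2).
    fragments: List[List[Block]] = [[blocks[0]]]
    current: List[Block] = []
    for style, text in blocks[1:]:
        # A heading 3 or heading 4 starts a new fragment, unless the current
        # fragment is still empty (content directly after the main heading
        # always opens the second fragment, whatever it is).
        if style in ("h3", "h4") and current:
            fragments.append(current)
            current = []
        # Heading 5 and lower, paragraphs and lists stay in the current fragment.
        current.append((style, text))
    if current:
        fragments.append(current)
    return fragments


component = [
    ("h2", "Section"),
    ("p", "Intro text."),
    ("h3", "Detail"),
    ("list", "Item one"),
    ("h4", "Sub-detail"),
    ("p", "More text."),
    ("h5", "Note"),
    ("p", "Note text."),
]
fragments = split_into_fragments(component)
```

For this input the sketch produces four fragments: the heading 2 alone, the introductory paragraph, the heading 3 with its list, and the heading 4 with everything that follows it.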

Hyperlinks and cross-references
