Default Word Import Processing
To control the docx import process, use the Default config.
The references document type is designed to represent a book structure: front matter, a table of contents, and then a collection of documents.
The default import handling for docx files is as follows:
- If the Word document has content in the Document Property Title field, that content becomes the Document title of the PSML Reference document.
- The Word document filename is the filename for the PSML Reference document.
- If there is no Title in the Word document Properties, the docx filename becomes both the Document title and the filename of the PSML Reference document.
- Content in the Word document that is before the first heading 1 imports into the first section (front matter) of the Reference document.
- The content of each heading 1 and heading 2 in the Word document imports as the content of a cross-reference in the second section of the Reference document. These cross-references point to a collection of PSML component documents. This second section is equivalent to a Word Table of Contents.
- Each component PSML document has a PSML Document title and contains the content of its heading 1 (or heading 2) and any content following it in the Word document until the next heading 1 or 2. The text of the PSML Document title is generated from the content of the first paragraph of each referenced document, that is, the content of the heading 1 or heading 2.
- For example, given a Word document containing a heading 1 followed directly by a heading 2 then a paragraph and then a heading 3 and a list, this splits into a first PSML component document containing only the text of the heading 1. A second PSML component document contains the content of the heading 2, the following paragraph, the heading 3 and the list.
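The splitting rules above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the importer's actual code: the `Block` tuples, the style names (`"h1"`, `"p"`, and so on), and the functions `document_title` and `split_components` are all hypothetical. The sketch also assumes that everything before the first heading 1, including any heading 2, goes to the front matter, and that the filename is used as given when no Title property is set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

# Hypothetical model of Word content: (style, text) pairs, where style is
# "h1".."h6" for headings and anything else ("p", "list") for other content.
Block = Tuple[str, str]


@dataclass
class Component:
    """One PSML component document: a title plus the blocks it contains."""
    title: str
    blocks: List[Block] = field(default_factory=list)


def document_title(property_title: Optional[str], filename: str) -> str:
    """Use the Word Title property if present, else fall back to the filename."""
    if property_title and property_title.strip():
        return property_title.strip()
    return filename


def split_components(blocks: List[Block]) -> Tuple[List[Block], List[Component]]:
    """Split blocks into front matter and heading 1/2 component documents."""
    front_matter: List[Block] = []
    components: List[Component] = []
    seen_h1 = False
    for style, text in blocks:
        if style == "h1":
            seen_h1 = True
        if not seen_h1:
            # Assumption: all content before the first heading 1 is front matter.
            front_matter.append((style, text))
        elif style in ("h1", "h2"):
            # Each heading 1 or 2 starts a new component titled by its text.
            components.append(Component(title=text, blocks=[(style, text)]))
        else:
            # Other content stays with the most recent heading 1 or 2.
            components[-1].blocks.append((style, text))
    return front_matter, components


# The example from the list above: h1, h2, paragraph, h3, list.
example = [
    ("h1", "Chapter"),
    ("h2", "Section"),
    ("p", "Intro text"),
    ("h3", "Detail"),
    ("list", "Items"),
]
front, parts = split_components(example)
```

With this input, `parts` holds two components: the first contains only the heading 1, and the second contains the heading 2, the paragraph, the heading 3 and the list. The titles of `parts` correspond to the cross-references in the second section of the Reference document.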
Fragments within component documents
- Upon upload, the content of a heading 1 (or heading 2) in the Word document is in the first fragment of the component document.
- Any content immediately following the heading imports into the second fragment, whether it is a heading 3 or lower, or any other kind of content.
- Content that is a heading 3 is in a new fragment and any content that follows the heading is in the same fragment until another heading 3 or a heading 4 is encountered.
- Content of a heading 4 and any content that follows the heading is in a new fragment. Heading 5 or lower headings and any content that follows them are in the same fragment as the parent heading 4.
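The fragment rules above can be illustrated with a short Python sketch. It is a reading of the rules as stated, not the importer's implementation: the `Block` tuples, the style names, and the function `split_fragments` are hypothetical, and the input is assumed to be the blocks of a single component document, starting with its heading 1 or heading 2.

```python
from typing import List, Tuple

# Hypothetical model: (style, text) pairs, with "h1".."h6" for headings
# and any other style name ("p", "list") for non-heading content.
Block = Tuple[str, str]


def split_fragments(blocks: List[Block]) -> List[List[Block]]:
    """Group the blocks of one component document into fragments."""
    if not blocks:
        return []
    # The heading 1 (or heading 2) sits alone in the first fragment.
    fragments: List[List[Block]] = [[blocks[0]]]
    for index, block in enumerate(blocks[1:], start=1):
        style, _ = block
        if index == 1 or style in ("h3", "h4"):
            # The block right after the title heading opens the second
            # fragment; every heading 3 or heading 4 opens a new one.
            fragments.append([block])
        else:
            # Heading 5 or lower and ordinary content stay in the
            # current fragment.
            fragments[-1].append(block)
    return fragments


component = [
    ("h2", "Section"),
    ("p", "Intro text"),
    ("h3", "Detail"),
    ("list", "Items"),
    ("h4", "Sub-detail"),
    ("p", "More text"),
    ("h5", "Note"),
    ("p", "Note text"),
]
fragments = split_fragments(component)
```

For this input the sketch yields four fragments: the heading 2 alone, the intro paragraph, the heading 3 with its list, and the heading 4 together with the heading 5 and the paragraphs around it.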
Hyperlinks and cross-references