Word import config schema

Version: 0.6.2

This file controls how Word docx files are converted into PSML. There are several ways to access it for editing. For users with at least contributor rights on the server, the best option is through the publish configurations pages in the developer perspective.

<xs:schema  elementFormDefault="unqualified"  version="0.6.2"
  xmlns:xs="http://www.w3.org/2001/XMLSchema"
  xmlns:xml="http://www.w3.org/XML/1998/namespace" />

<config>

The root element of the instance, <config>, is a container for the three key elements: <split>, <lists> and <styles>.

version provides a value that can be used for configuration management or technical support.

<xs:element  name="config">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="split"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="lists"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="styles"
               minOccurs="0"  maxOccurs="1" />
      </xs:sequence>
      <xs:attribute  name="version"
                     type="xs:string" />
    </xs:complexType>
  </xs:element>
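
A complete instance might be laid out as in the following sketch. The child elements must appear in the order given by the xs:sequence above; the version value and the attribute settings are illustrative assumptions, not recommended defaults.

<config version="0.6.2">
  <split>
    <document select="true" />
    <section select="true" />
  </split>
  <lists>
    <convert-to-list-roles select="false" />
  </lists>
  <styles>
    <default>
      <paragraphStyles value="para" />
      <characterStyles value="none" />
    </default>
  </styles>
</config>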

<split>

A container for the elements that determine how each imported document is stored. Options include:

  • Keep each Word file as a single PageSeeder document.
  • Split the source document into a linked collection of references and component documents.
  • Give the imported PSML documents a @type of "default".
  • Give the imported PSML documents a custom type.

<xs:element  name="split">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="main"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="document"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="section"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="mathml"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="footnotes"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="endnotes"
               minOccurs="0"  maxOccurs="1" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
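
As a sketch, a <split> element that uses every child would list them in the order required by the xs:sequence above. The attribute values here are illustrative assumptions only.

<split>
  <main>
    <type>references</type>
  </main>
  <document select="true" />
  <section select="true" />
  <mathml select="true" output="generate-files" convert-to-mml="true" />
  <footnotes select="true" output="generate-files" />
  <endnotes select="true" output="generate-files" />
</split>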

<main>

Determines the nature of the main PSML file generated by the import process.  

type a document type for the primary document: the reference document (default is references).

label whether labels are attached to the primary document: the reference document.

<xs:element  name="main">
  <xs:complexType>
    <xs:sequence>
     <xs:element  name="type"
             minOccurs="0"
             maxOccurs="1"
                  type="xs:string" />
     <xs:element  name="label"
             minOccurs="0"
             maxOccurs="1"
                  type="xs:string" />
    </xs:sequence>
  </xs:complexType>
</xs:element>

Example

<main>
  <type>legislation</type>
  <label>production,test</label>
</main>

<mathml>

Controls the processing of any MathML objects in the Word file.

select use a value of "true" or "false" to determine if MathML content will be processed or ignored.

convert-to-mml use a value of "true" or "false" to determine whether MathML objects will be converted to the original MathML (MML) syntax or left as Office Open XML syntax (always "true" for the "generate-fragments" option).

output use a value of "generate-files" or "generate-fragments" to determine whether each MathML object will be placed in a separate file under a mathml folder, or in a fragment inside its own document with the path mathml/mathml-[n].psml (requires pso-docx version 0.7.8 or later).

<xs:element  name="mathml">
    <xs:complexType>
      <xs:attribute  name="select"
                     type="xs:boolean" />
      <xs:attribute  name="convert-to-mml"
                     type="xs:boolean" />
      <xs:attribute  name="output">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="generate-files" />
            <xs:enumeration  value="generate-fragments" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<mathml select="true" 
        output="generate-files" 
        convert-to-mml="true"/>

See use of:

mathml-generate-files 
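
For comparison, a configuration for the "generate-fragments" option might look like the sketch below. It assumes pso-docx 0.7.8 or later; convert-to-mml is written out explicitly even though the description above states that it is always treated as "true" for this option.

<mathml select="true"
        output="generate-fragments"
        convert-to-mml="true"/>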

<footnotes>

Controls the processing of Word footnote markers.

select use a value of "true" or "false" to determine if Word footnotes will be processed or ignored.

output use a value of "generate-files" or "generate-fragments" to determine whether each footnote object will be placed in a separate file under a footnotes folder, or in a fragment in a footnotes/footnotes.psml file.

<xs:element  name="footnotes">
    <xs:complexType>
      <xs:attribute  name="select"  type="xs:boolean" />
      <xs:attribute  name="output">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="generate-files" />
            <xs:enumeration  value="generate-fragments" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<footnotes select="true" 
           output="generate-files"/>

<endnotes>

Controls the processing of Word endnote markers.

select use a value of "true" or "false" to determine if Word endnotes will be processed or ignored.

output use a value of "generate-files" or "generate-fragments" to determine whether each endnote object will be placed in a separate file under an endnotes folder, or in a fragment in an endnotes/endnotes.psml file.

<xs:element  name="endnotes">
    <xs:complexType>
      <xs:attribute  name="select"  type="xs:boolean" />
      <xs:attribute  name="output">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="generate-files" />
            <xs:enumeration  value="generate-fragments" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<endnotes select="true"
          output="generate-files"/>
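
To store notes as fragments instead, both elements accept the "generate-fragments" value. The following sketch pairs the two settings; per the descriptions above, the notes would then be placed in footnotes/footnotes.psml and endnotes/endnotes.psml.

<footnotes select="true"
           output="generate-fragments"/>
<endnotes select="true"
          output="generate-fragments"/>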

<document>

Determines which markup splits the Word document into component documents. The components are bound, in order, by a references document, which is treated as the <main> document of the conversion.

select use a value of "true" or "false" to determine whether or not the Word file is split into component documents.

use-real-titles use a value of "true" or "false" to determine whether the PSML filename of each split document is extracted from the Word content. If "false", the split document filename is simply the Word file name plus an incrementing number. If "true", the PSML filenames for the component (secondary) documents are generated from the text in the first paragraph of each split document.

<xs:element  name="document">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="sectionbreak"
               minOccurs="0"  maxOccurs="unbounded" />
        <xs:element  ref="outlinelevel"
               minOccurs="0"  maxOccurs="unbounded" />
        <xs:element  ref="wordstyle"
               minOccurs="0"  maxOccurs="unbounded" />
        <xs:element  ref="splitstyle"
               minOccurs="0"  maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute  name="select"
                     type="xs:boolean" />
      <xs:attribute  name="use-real-titles"
                     type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Examples

<split>
  <document select="false">
    <outlinelevel select="0" />
  </document>
</split>

See use of:

 document-split-document-false 

In the GitHub example, using the default <document select="false">, the Word document will not split into component documents even though there are child elements specified under the <document> element.

<document select="true" use-real-titles="true" />

Using select="true", the Word document will split into component documents at the <wordstyle> or <outlinelevel> values specified.

When importing with the default import config, or when no <wordstyle> or <outlinelevel> is specified, the Word document will split at the standard Word styles Heading 1 and Heading 2.

The default value of the use-real-titles attribute is "false": the PSML filename of each split document will simply be the Word file name plus an incrementing number.
If "true", the PSML filenames for the component (secondary) documents will be generated from the text in the first paragraph of each split document.

See use of:

document-split-outline-level 

The GitHub example uses select="true" and splits the Word document at all styles that have an outline level of zero. The standard Heading 1 style has an outline level of zero, Heading 2 has an outline level of 1, and so on. The value of use-real-titles is "true", so the PSML filenames of the component documents will be generated from the text in the first paragraph of each split document.

See use of: 

document-split-paragraph-style 

The GitHub example uses select="true" and splits the Word document at every paragraph that has the Word style Heading 1.

document-split-multiple-paragraph-style 

The GitHub example uses select="true" and splits the Word document at every paragraph that has the Word style Heading 1 or Heading 2.
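
A configuration along these lines might look like the following sketch. It assumes that the style IDs of the standard Heading 1 and Heading 2 styles are "Heading1" and "Heading2"; it is not copied from the GitHub example.

<split>
  <document select="true">
    <wordstyle select="Heading1" />
    <wordstyle select="Heading2" />
  </document>
</split>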

<split>
 <document select="true">
  <outlinelevel select="0" />
  <wordstyle  select="Heading2" />
 </document>
</split>

See use of:

document-split-multiple-split-values-1 

The GitHub example uses select="true" and specifies to split the Word document at all styles that have an outline level of zero and all styles that are Heading 1.

document-split-multiple-split-values-2 

The GitHub example uses select="true" and splits the Word document at all styles that have an outline level of zero, all styles that have an outline level of 1, and all styles that are Heading 1 or Heading 2.

<section>

Determines how the PSML component document is split into fragments.

select use a value of "true" or "false" to determine if the component document will be split into fragments.

use-real-titles use a value of "true" or "false".

<xs:element  name="section">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="sectionbreak"
               minOccurs="0"  maxOccurs="unbounded" />
        <xs:element  ref="outlinelevel"
               minOccurs="0"  maxOccurs="unbounded" />
        <xs:element  ref="wordstyle"
               minOccurs="0"  maxOccurs="unbounded" />
        <xs:element  ref="splitstyle"
               minOccurs="0"  maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute  name="select" 
                     type="xs:boolean" />
      <xs:attribute  name="use-real-titles"
                     type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Example

<section select="false" />

See use of:

section-split-document-false 

The GitHub example uses <section select="false">. This means that each component document will not be split into fragments, even though there are child elements contained in the <section> element. All content is placed in the second fragment of the component document; the first fragment contains the content of the element that the <document> was split at.

<section select="true" />

See use of:

section-split-outline-level 

The GitHub example uses section select="true". The component documents will be split into fragments only at all styles that have an outline level of zero. Standard Word style Heading 1 has an outline level of zero.

section-split-multiple-outline-level 

Using section select="true", the component documents will be split into fragments only at all styles that have an outline level of zero. Standard Word style Heading 1 has an outline level of zero.

section-split-paragraph-style 

Using section select="true", the component documents will be split into fragments only at all Word styles that have an outline level of zero and all styles that have an outline level of 1. Standard Word style Heading 1 has an outline level of zero, Heading 2 has an outline level of 1 etc.

section-split-multiple-paragraph-style 

Using section select="true", the component documents will be split into fragments only at Word styles that are Heading 1.

section-split-multiple-split-values-1 

Using section select="true", the component documents will be split into fragments only at Word styles that are Heading 1 or Heading 2.

section-split-multiple-split-values-2 

Using section select="true", the component documents will be split into fragments only at Word styles that have an outline level of zero or an outline level of 1, or that are Heading 1 or Heading 2.

section-split-splitstyle 

Using section select="true", the component documents will be split into fragments only at Word styles that are Heading 1.
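
As a sketch of how child elements are combined under <section>, the following would split each component document into fragments at outline levels 0 and 1. It is an illustration based on the schema above, not a copy of any of the GitHub examples.

<split>
  <document select="true" />
  <section select="true">
    <outlinelevel select="0" />
    <outlinelevel select="1" />
  </section>
</split>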

<sectionbreak>

Controls new sections in PSML.

select use a value of "evenPage", "oddPage", "continuous", "nextColumn" or "nextPage" to determine which type of Word section break will be used to create sections in PageSeeder. Section breaks of types that are not listed will be ignored.

<xs:element  name="sectionbreak">
    <xs:complexType>
      <xs:attribute  name="select">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="evenPage" />
            <xs:enumeration  value="oddPage" />
            <xs:enumeration  value="continuous" />
            <xs:enumeration  value="nextColumn" />
            <xs:enumeration  value="nextPage" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<split>
    <document select="true">
      <sectionbreak select="evenPage" />
      <sectionbreak select="oddPage" />
    </document>
</split>

<outlinelevel>

Processes the outline levels that are attached to styles or directly to paragraphs.

select use a single outline level from "0" to "8", or a value of the form "[0-8]-[0-8]" (for example "0-2"), to determine which Word outline levels will be used in the conversion.

<xs:element  name="outlinelevel">
    <xs:complexType>
      <xs:attribute  name="select">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:pattern  value="[0-8]" />
            <xs:pattern  value="[0-8]-[0-8]" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<split>
  <document select="true">
    <outlinelevel select="0" />
    <outlinelevel select="1" />
  </document>
</split>

See use of:

document-split-outline-level 

The GitHub example is splitting the Word document at all styles that have an outlinelevel="0". Standard Word style Heading 1 has an outline level of zero.

document-split-multiple-outline-level 

The Word document will split at all styles that have an outlinelevel="0" and all Word styles that have an outline level of 1. Standard Word style Heading 1 has an outline level of zero and Heading 2 has an outline level of 1.

section-split-outline-level 

The GitHub example is splitting the component document at all styles that have an outlinelevel="0". Standard Word style Heading 1 has an outline level of zero.

section-split-multiple-outline-level 

The component document will split at all styles that have an outlinelevel="0" and all Word styles that have an outline level of 1. Standard Word style Heading 1 has an outline level of zero and Heading 2 has an outline level of 1.
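
The schema also accepts a hyphenated value matching [0-8]-[0-8]. Assuming this form denotes an inclusive range of outline levels (the schema itself does not say so), a split at outline levels 0 and 1 could also be sketched as:

<split>
  <document select="true">
    <outlinelevel select="0-1" />
  </document>
</split>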

<splitstyle>

Processes a style name with an explicit purpose of splitting fragments.

select use the value of a Word paragraph style ID to determine if the Word document will be split into fragments at the Word paragraph style (note: the ID is different from Word paragraph style name).

<xs:element  name="splitstyle">
    <xs:complexType>
      <xs:attribute  name="select"  type="xs:string" />
    </xs:complexType>
  </xs:element>

Example

<split>
  <document select="true">
    <splitstyle select="splittingStyle1"/>
  </document>
</split>

Given a custom Word style "splittingStyle1", which has been created in a Word document, the Word document will be split into component documents at all instances of the custom Word style.

See use of:

document-split-splitstyle 

The GitHub example is splitting the Word document into component documents at all instances of Word style Heading 1.

<lists>

Determines how heading and list numbering is interpreted.

<xs:element  name="lists">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="add-numbering-to-document-titles"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="convert-to-list-roles"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="convert-to-numbered-paragraphs"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="convert-manual-numbering"
               minOccurs="0"  maxOccurs="1" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
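
A <lists> element that uses all four children might be sketched as follows, with the children in the order required by the xs:sequence above. The select and output values are illustrative assumptions.

<lists>
  <add-numbering-to-document-titles select="false" />
  <convert-to-list-roles select="false" />
  <convert-to-numbered-paragraphs select="true">
    <level value="1" output="numbering" />
  </convert-to-numbered-paragraphs>
  <convert-manual-numbering select="false" />
</lists>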

<add-numbering-to-document-titles>

Adds numbering to the titles in the references document.

select use a value of "true" or "false" to determine whether or not numbering is added to the link titles in the references document.

<xs:element  name="add-numbering-to-document-titles">
    <xs:complexType>
      <xs:attribute  name="select"  type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Example

<add-numbering-to-document-titles select="false"/>

See use of:

document-split-document-false-numbering-false 

<add-numbering-to-document-titles select="true"/>

See use of:

document-split-document-false-numbering-true 

document-split-document-true-numbering-true-outline-level 

document-split-document-true-numbering-true-multiple-outline-level 

document-split-document-true-numbering-true-paragraph-styles 

document-split-document-true-numbering-true-multiple-paragraph-styles 

<convert-to-list-roles>

Allows lists to contain a @role attribute set with the value of the Word paragraph style.

select use a value of "true" or "false" to determine whether PSML lists inherit the name of the Word list style as a @role or not. By default, this value is "false".

<xs:element  name="convert-to-list-roles">
    <xs:complexType>
      <xs:attribute  name="select"
                     type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Example

<convert-to-list-roles select="false" />

See use of:

lists-default-list-role-false 

lists-multilevel-list-role-false 

lists-numbered-paragraphs-role-false 

<convert-to-list-roles select="true" />

See use of:

lists-default-list-role-true 

lists-multilevel-list-role-true 

lists-numbered-paragraphs-role-true 

<convert-to-numbered-paragraphs>

Controls the conversion of numbered paragraph styles to numbered paragraphs or lists in PageSeeder. To convert to numbered paragraphs, the @select attribute must be set to "true". With any other value, they are converted to <list> or <nlist>, depending on the type of numbering.

select use a value of "true" or "false" to determine whether numbered Word paragraph styles are converted to numbered paragraphs or to lists.

<xs:element  name="convert-to-numbered-paragraphs">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="level"
               minOccurs="0"  maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute  name="select"
                     type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Example

<convert-to-numbered-paragraphs select="false" />

See use of:

numbered-paragraphs-false 

The GitHub example has select="false". Any Word styles that have numbering and are in a list will not be numbered on import.

<convert-to-numbered-paragraphs select="true" />

Using select="true", any Word styles that have numbering and are in a list will have a numbering attribute applied on import.

See use of:

Numbering output as an inline label in: 
numbered-paragraphs-true-output-inline 

The GitHub example has select="true". Only Word styles that have numbering, are in a list and are level 1 in that list will have numbering applied on import. The number will be placed inside an inline label with the label name "level1", as set by output="inline=level1".
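
Based on that description, the relevant part of such a configuration could be sketched as follows. This is a reconstruction for illustration, not the contents of the GitHub file.

<lists>
  <convert-to-numbered-paragraphs select="true">
    <level value="1" output="inline=level1" />
  </convert-to-numbered-paragraphs>
</lists>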

<convert-manual-numbering>

Converts the non-automated numbering values that can exist in a Word document.

select use a value of "true" or "false" to determine whether manual numbering in the Word document is converted or whether it is not.

<xs:element  name="convert-manual-numbering">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="value"
               minOccurs="0"  maxOccurs="unbounded" />
      </xs:sequence>
      <xs:attribute  name="select"  type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Example

<convert-manual-numbering select="false" />

See use of:

manual-numbering-false 

The GitHub example has select="false". Any manual numbering on Word styles that are in a list will be imported as plain text.

<convert-manual-numbering select="true" />

Using select="true", any manual numbering on Word styles will be imported.

See use of:

<autonumbering> in manual-numbering-true-autonumbering 

The GitHub example has select="true". Any manual numbering on Word styles that are in a list will be imported according to the "match" regular expressions specified in the <value> elements.
The first "match" searches for the word "Schedule" followed by a non-breaking space (written as the character reference &#160;), followed by one or more digits from 0 to 9. If found, the text is imported into a paragraph with a numbered attribute,
e.g. <para numbered="true">.

<inline> in manual-numbering-true-inline 

Using select="true", any manual numbering on Word styles that are in a list will be imported according to the "match" regular expressions specified in the <value> elements.
The third "match" searches for an optional single space, followed by one or more digits from 0 to 9, followed by a single space. If found, the text is imported into an inline label with the label name "numbering-subdivision" at the start of a paragraph,
e.g. <para><inline label="numbering-subdivision">1 </inline>

<prefix> in manual-numbering-true-prefix 

Using select="true", any manual numbering on Word styles that are in a list will be imported according to the "match" regular expressions specified in the <value> elements.
The first "match" searches for the word "Schedule" followed by a non-breaking space (written as the character reference &#160;), followed by one or more digits from 0 to 9. If found, the text is imported into a paragraph with the found text as the value of a prefix attribute,
e.g. <para prefix="Schedule 1">

<inline> and <prefix> in manual-numbering-true-prefix-inline 

Using select="true", any manual numbering on Word styles that are in a list will be imported according to the "match" regular expressions specified in the <value> elements.
For the first and second "match" searches, any text found is imported into a paragraph with the found text as the value of a prefix attribute,
e.g. <para prefix="Schedule 1">, <para prefix="Part 1">
For the third, fourth and fifth "match" searches, any text found is imported into an inline label at the start of a paragraph,
e.g.
<para><inline label="numbering-subdivision">1 </inline>,
<para><inline label="numbering-firstlevel"> (5A)</inline>,
<para><inline label="numbering-secondlevel"> (a)</inline>
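
The kind of <value> elements described above could be sketched as follows. The regular expressions are reconstructed from the prose descriptions (the "Part" pattern in particular is an assumption), so they may differ from those in the GitHub examples.

<lists>
  <convert-manual-numbering select="true">
    <!-- "Schedule", a non-breaking space, then one or more digits -->
    <value match="Schedule&#160;[0-9]+">
      <prefix />
    </value>
    <!-- Assumed pattern for headings such as "Part 1" -->
    <value match="Part&#160;[0-9]+">
      <prefix />
    </value>
    <!-- Optional whitespace character, one or more digits, then one whitespace character -->
    <value match="\s?[0-9]+\s">
      <inline label="numbering-subdivision" />
    </value>
  </convert-manual-numbering>
</lists>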

<level>

Controls the processing of each level of paragraph.

value use "1" to "6" to correspond to level of the paragraph.

output use "prefix" or "text" to convert the number from the Word paragraph to a PSML prefix or text, "numbering" to create numbered paragraphs in PSML, or "inline=[label]" to wrap the number in a PSML inline label. Label characters are restricted to [a-zA-Z0-9_\-].

<xs:element  name="level">
    <xs:complexType>
      <xs:attribute  name="value">
        <xs:simpleType>
          <xs:restriction  base="xs:integer">
            <xs:minInclusive  value="1" />
            <xs:maxInclusive  value="6" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute  name="output">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:pattern 
        value="numbering|prefix|inline=[a-zA-Z0-9_\-]+|text" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<lists>
  <convert-to-numbered-paragraphs select="true">
    <level value="1" output="prefix"/>
    <level value="2" output="text"/>
    <level value="3" output="prefix"/>
    <level value="4" output="numbering"/>
    <level value="5" output="numbering"/>
    <level value="6" output="inline=level6"/>
  </convert-to-numbered-paragraphs>
</lists>

See use of:

@output="inline=[label]" in: 
numbered-paragraphs-true-output-inline 

The GitHub example has <convert-to-numbered-paragraphs select="true">. Any Word styles that are in a list and have numbering will be imported.
The Word style that is the first level in the Word list, specified by level value="1", will be imported as a paragraph with the attribute indent="1" applied.
The number on the Word style will be imported into an inline label with the label name "level1", as specified by the value of the output attribute,
e.g. <para indent="1"><inline label="level1">1) </inline>List item 1</para>

numbered-paragraphs-true-multilevel-output-inline 

@output="prefix" in: 
numbered-paragraphs-true-output-prefix 
numbered-paragraphs-true-multilevel-output-prefix 

@output="numbering" in: 
numbered-paragraphs-true-output-numbering 
numbered-paragraphs-true-multilevel-output-numbering 

@output="text" in: 
numbered-paragraphs-true-output-text 
numbered-paragraphs-true-multilevel-output-text 

@output="prefix" and @output="numbering" in: 
numbered-paragraphs-true-multilevel-output-numbering-prefix 

@output="text" and @output="prefix" in: 
numbered-paragraphs-true-multilevel-output-text-prefix 

multiple @output in: 
numbered-paragraphs-true-multilevel-output-numbering-text-inline-prefix 

<value>

Determines the paragraph ordinal.

prefix use "prefix" to generate a prefix with the value of the current auto-numbering or manual numbering value for each of the Word numbered paragraphs.

autonumbering use <autonumbering> to import the matched text into a paragraph with a numbered attribute, e.g. <para numbered="true">.

match use a regular expression to mark up text in the Word document.  

<xs:element  name="value">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="inline"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  name="prefix" 
                minOccurs="0"  maxOccurs="1" />
        <xs:element  name="autonumbering"
                minOccurs="0"  maxOccurs="1" />
      </xs:sequence>
      <xs:attribute  name="match"  type="xs:string" />
    </xs:complexType>
  </xs:element>

<inline>

Used to mark up content within a block of text, like a character style in Word documents.

label use characters restricted to [a-zA-Z0-9_\-].

<xs:element  name="inline">
    <xs:complexType>
      <xs:attribute  name="label"  type="xs:string" />
    </xs:complexType>
  </xs:element>

Example

<convert-manual-numbering select="true">
  <value match="^[\(|\[|\{][a-z]+[\)|\]|\}]">
    <inline label="numbering-lowercase" />
  </value>
  <value match="^[\(|\[|\{][A-Z]+[\)|\]|\}]">
    <prefix />
  </value>
  <value match="^[\(|\[|\{][ivx]+[\)|\]|\}]">
    <list role="numbering-roman" />
  </value>
  <value match="Part&#160;[A-Z0-9]+">
    <prefix />
  </value>
  <value match="Note:\s*">
    <prefix />
  </value>
  <value match="\s*[0-9]+[A-Z]*$">
    <prefix />
  </value>
</convert-manual-numbering>

See use of:

inline-one-character-style 

inline-multi-character-style 

inline-one-paragraph-style 

<autonumbering>

Ancestor is <lists>.  Parent is <convert-manual-numbering select="true">.

Example

<config>
    <lists>
        <convert-manual-numbering select="true">
            <value match="Chapter&#160;[0-9]+">
                <autonumbering />
            </value>
        </convert-manual-numbering>
    </lists>
</config>

See use of:

manual-numbering-true-autonumbering 

<prefix>

Ancestor is <lists>. Parent is <convert-manual-numbering select="true">.

Example

<config>
    <lists>
        <convert-manual-numbering select="true">
            <value match="Chapter&#160;[0-9]+">
                <prefix />
            </value>
        </convert-manual-numbering>
    </lists>
</config>

See use of:

manual-numbering-true-prefix 

<styles>

Controls how styles from Word are translated to PSML.

<xs:element  name="styles">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="ignore"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="default"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="wordstyle"
               minOccurs="0"  maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
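
Put together, a <styles> element might be sketched as follows, with the children in the order required by the xs:sequence above. The style IDs and the <wordstyle> mapping are illustrative assumptions.

<styles>
  <ignore>
    <wordstyle value="TOC1" />
  </ignore>
  <default>
    <paragraphStyles value="para" />
    <characterStyles value="inline" />
  </default>
  <wordstyle name="BodyText" psmlelement="para" />
</styles>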

<ignore>

Determines which content should not be processed. For example, the Word Table of Contents paragraphs can often be discarded.

<xs:element  name="ignore">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="wordstyle"  maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>

Example

        <ignore>
            <wordstyle value="TOC1" />
            <wordstyle value="TOC2" />
            <wordstyle value="TOC3" />
            <wordstyle value="TOC4" />
        </ignore>

See use of:

ignore-styles-toc-paragraph-styles 

ignore-styles-body-text-paragraph-style 

ignore-styles-one-paragraph-style 

ignore-styles-multiple-paragraph-styles

 

<default>

Contains the settings for general transformations of docx to PSML.

<xs:element  name="default">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="paragraphStyles"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="characterStyles"
               minOccurs="0"  maxOccurs="1" />
        <xs:element  ref="smart-tag"
               minOccurs="0"  maxOccurs="unbounded" />
      </xs:sequence>
    </xs:complexType>
  </xs:element>
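
For example, a <default> element that uses all three children could be sketched as follows; the chosen values are illustrative assumptions.

<default>
  <paragraphStyles value="block" />
  <characterStyles value="inline" />
  <smart-tag keep="false" />
</default>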

<paragraphStyles>

Defines a mapping for a paragraph style not mapped by <wordstyle> or <lists>.

value "para" transforms all un-mapped Word paragraph styles to a PSML <para> element. "block" transforms all un-mapped Word paragraph styles to a PSML <block> element with a label equal to the Word paragraph style ID.

<xs:element  name="paragraphStyles">
    <xs:complexType>
      <xs:attribute  name="value">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="para" />
            <xs:enumeration  value="block" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<paragraphStyles value="block" />

See use of:

default-paragraph-style-block 

<paragraphStyles value="para" />

See use of:

default-paragraph-style-para 

<characterStyles>

Defines a general rule for any Word character style not mapped with <wordstyle>.

value "none" strips the markup for un-mapped Word character styles. "inline" transforms un-mapped Word character styles to a PSML <inline> element with a label equal to the Word character style ID.

<xs:element  name="characterStyles">
    <xs:complexType>
      <xs:attribute  name="value">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="inline" />
            <xs:enumeration  value="none" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<characterStyles value="none" />

See use of:

character-style-none 

<characterStyles value="inline" />

See use of:

default-character-style-inline 

<smart-tag>

Word smart tag information can be either discarded or captured in PageSeeder as an inline label with a value equal to that of the smart tag. To capture it, the @keep attribute must be set to "true". With any other value, the smart-tag markup is discarded.

keep use "true" to capture smart tags as inline labels, or "false" to discard them.

<xs:element  name="smart-tag">
    <xs:complexType>
      <xs:attribute  name="keep"  type="xs:boolean" />
    </xs:complexType>
  </xs:element>

Example

<smart-tag keep="false" />

See use of:

smart-tag-false 

<smart-tag keep="true" />

See use of:

smart-tag-true 
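
As a rough sketch, the three default rules described above could sit side by side as follows. The order matches the schema sequence (paragraphStyles, characterStyles, smart-tag). The enclosing parent element is omitted, and this particular combination of values is only an illustration.

```xml
<!-- Illustrative combination only; the enclosing parent element is omitted -->
<!-- Un-mapped paragraph styles become <block> elements labelled with the Word style ID -->
<paragraphStyles value="block" />
<!-- Markup for un-mapped character styles is stripped -->
<characterStyles value="none" />
<!-- Smart tags are captured as inline labels -->
<smart-tag keep="true" />
```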

<wordstyle>

These rules transform Word paragraph or character styles into PSML elements.

type a name for a fragment in a secondary document. Used where the ancestor is <split>.

select use the Word style ID for the Word paragraph style.  Used where the ancestor is <split>.

value use the Word style ID for the Word paragraph style. Used where the ancestor is <ignore>.

name use the Word style ID for the Word paragraph style.

table use where the ancestor is <styles>.
Use only when @psmlelement="caption"
Use value "default" when it applies to all tables
Or use the value of a specific table style ID. 

psmlelement use "title", "heading", "para", "block", "inline", "preformat", "caption" or "monospace".

<xs:element  name="wordstyle">
    <xs:complexType>
        <xs:all>
          <xs:element  ref="label"
                 minOccurs="0"  maxOccurs="1" />
          <xs:element  name="type"
                  minOccurs="0"  maxOccurs="1" type="xs:string" />
          <xs:element  ref="level"
                 minOccurs="0"  maxOccurs="1" />
          <xs:element  ref="numbering"
                 minOccurs="0"  maxOccurs="1" />
          <xs:element  ref="indent"
                 minOccurs="0"  maxOccurs="1" />
        </xs:all>
        <xs:attribute  name="select"  type="xs:string" />
        <xs:attribute  name="value"  type="xs:string" />
        <xs:attribute  name="name"  type="xs:string" />
        <xs:attribute  name="table"  type="xs:string" />
        <xs:attribute  name="psmlelement">
            <xs:simpleType>
              <xs:restriction  base="xs:string">
                <xs:enumeration  value="title" />
                <xs:enumeration  value="para" />
                <xs:enumeration  value="heading" />
                <xs:enumeration  value="block" />
                <xs:enumeration  value="inline" />
                <xs:enumeration  value="monospace" />
                <xs:enumeration  value="preformat" />
                <xs:enumeration  value="caption" />
              </xs:restriction>
            </xs:simpleType>
          </xs:attribute>
    </xs:complexType>
  </xs:element>

Examples

<wordstyle name="Heading1" psmlelement="heading"/>

See use of: 

headings-one-paragraph-style 

headings-multiple-paragraph-styles 

<wordstyle name="Heading1" psmlelement="para"/>

See use of: 

para-one-paragraph-style 

para-multiple-paragraph-styles 

para-one-paragraph-style-with-block 

para-one-paragraph-style-with-inline 

para-with-numbering-inline 

para-with-numbering-numbering 

para-with-numbering-prefix 

para-with-numbering-text 

para-multiple-with-numbering-inline 

para-multiple-with-numbering-prefix 

para-multiple-with-numbering-text 

para-multiple-with-numbering-numbering 

para-multiple-with-numbering-prefix-numbering-inline 

<wordstyle name="Heading1" psmlelement="block">
    <label value="heading1block"/>
</wordstyle>

See use of:

block-one-paragraph-style

 block-multiple-paragraph-styles 

Captions attached to Images

By default, the standard Word paragraph style "Caption", when attached to an image, is converted using psmlelement="block".

Example

<wordstyle name="Caption" psmlelement="block">
    <label value="Caption" />
</wordstyle>

See use of:

empty-configuration-(images) 

<wordstyle name="Emphasis" psmlelement="inline">
    <label value="emphasisinline"/>
</wordstyle>

See use of:

inline-multi-character-style 

<wordstyle name="Heading1" psmlelement="preformat" />

See use of:

preformat-one-paragraph-style 

preformat-multi-paragraph-style 

<wordstyle name="Emphasis" psmlelement="monospace" />

See use of:

monospace-one-character-style 

monospace-multi-character-style 

monospace-one-paragraph-style 

monospace-multi-paragraph-style 
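
The schema declaration for <wordstyle> allows several optional child elements in one rule. The following is a minimal sketch of such a rule, assuming a Word style ID of "Heading2" and a label name of "section_num", both hypothetical. It is valid against the declaration shown on this page, but how the children interact in the output is not confirmed here.

```xml
<!-- "Heading2" and "section_num" are assumed names for illustration -->
<wordstyle name="Heading2" psmlelement="heading">
  <!-- Surround the heading text with an inline label -->
  <label type="inline" value="section_num" />
  <!-- Place the Word numbering in the @prefix attribute -->
  <numbering select="true" value="prefix" />
</wordstyle>
```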

psmlelement="caption"

See use of:

<wordstyle name="Caption" psmlelement="caption" table="default"/>
  • When applied to a Word Default table style (w:styleId="TableGrid"):

[Insert example config for Word Default table style]

<wordstyle name="CaptionRedtable" psmlelement="caption" table="MediumGrid3-Accent2"/>
  • When applied to a Word Custom table style (eg w:styleId="CaptionRedtable")

[Insert example config for a Word custom table style]

<label>

When ancestor is <main> or <split>, there are no attributes.

type use a value of "block" or "inline" to determine whether text is surrounded by a block label or an inline label.

value use a value of "[valid label name]", restricted to these characters [a-zA-Z0-9_\-].

<xs:element  name="label">
    <xs:complexType  mixed="true">
      <xs:attribute  name="type">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="block" />
            <xs:enumeration  value="inline" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
      <xs:attribute  name="value">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:pattern  value="[a-zA-Z0-9_\-]+" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

When ancestor is <split>:

document-split-multiple-paragraph-style-with-one-label 

document-split-multiple-paragraph-style-with-multiple-labels 

When ancestor is <styles>:

<label type="block" value="chapter" />

headings-one-paragraph-style-with-block 

<label type="inline" value="part_num" />

headings-one-paragraph-style-with-inline 
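
To show how the @value pattern constrains label names, here is a small sketch that places <label> inside a <wordstyle> rule. The label names are assumptions chosen for illustration.

```xml
<!-- Accepted: letters, digits, underscore and hyphen only -->
<wordstyle name="Heading1" psmlelement="heading">
  <label type="block" value="chapter_1-intro" />
</wordstyle>

<!-- Rejected by the pattern: the value contains a space and a full stop -->
<!-- <label type="block" value="chapter 1.intro" /> -->
```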

<type>

Examples

When ancestor is <document>:

document-split-multiple-paragraph-style-with-one-type 

document-split-multiple-paragraph-style-with-multiple-types 

When ancestor is <main>:

 

<numbering>

Supports a range of options for numbering headings and paragraphs.

select use "true" or "false" to determine whether numbering is applied.

value use "numbering" to add @numbered="true" to the output element, "inline" to wrap the number in an inline label specified by a nested <label value="[valid label name]"> element, "text" to include the number in the paragraph text, or "prefix" to insert the number in the @prefix attribute.

<xs:element  name="numbering">
    <xs:complexType>
      <xs:sequence>
        <xs:element  ref="label"  minOccurs="0" />
      </xs:sequence>
      <xs:attribute  name="select"  type="xs:boolean" />
      <xs:attribute  name="value">
        <xs:simpleType>
          <xs:restriction  base="xs:string">
            <xs:enumeration  value="numbering" />
            <xs:enumeration  value="inline" />
            <xs:enumeration  value="text" />
            <xs:enumeration  value="prefix" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<numbering select="false" />

<numbering select="true" value="numbering" />

headings-with-numbering-numbering 

headings-multiple-with-numbering-numbering 

<numbering select="true" value="inline"/>

headings-with-numbering-inline 

headings-multiple-with-numbering-inline 

<numbering select="true" value="text"/>

headings-with-numbering-text 

headings-multiple-with-numbering-text 

<numbering select="true" value="prefix"/>

headings-with-numbering-prefix 

headings-multiple-with-numbering-prefix 

Use of multiple numbering @value:

headings-multiple-with-numbering-prefix-numbering-inline 
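
For the "inline" option, the description above calls for a nested <label> that names the inline label wrapping the number. A minimal sketch, assuming a hypothetical label name of "heading_num":

```xml
<!-- "heading_num" is an assumed label name -->
<wordstyle name="Heading1" psmlelement="heading">
  <numbering select="true" value="inline">
    <label value="heading_num" />
  </numbering>
</wordstyle>
```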

<indent>

Determines the position of the paragraph in the hierarchy.

value use an integer from "1" to "6".

<xs:element  name="indent">
    <xs:complexType>
      <xs:attribute  name="value">
        <xs:simpleType>
          <xs:restriction  base="xs:integer">
            <xs:minInclusive  value="1" />
            <xs:maxInclusive  value="6" />
          </xs:restriction>
        </xs:simpleType>
      </xs:attribute>
    </xs:complexType>
  </xs:element>

Example

<indent value="2" />
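
Since <indent> is one of the optional children of <wordstyle>, it would normally appear inside a style rule. A sketch, assuming a hypothetical Word style ID of "ListContinue2":

```xml
<!-- "ListContinue2" is an assumed Word style ID -->
<wordstyle name="ListContinue2" psmlelement="para">
  <!-- Must be an integer from 1 to 6; 0 or 7 would fail validation -->
  <indent value="2" />
</wordstyle>
```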
