Media types
Media types can be used to:
- Store metadata about any non-PSML file using all the features of PSML properties.
- Extract, and store as editable properties, metadata from the following binary file types:
  - Microsoft Office (.docx, .xlsx, .pptx).
  - PDF (.pdf).
  - Images (.jpg, .jpeg, .gif, .png).
- Attach editable properties to External URLs.
Extraction
PageSeeder can extract metadata when uploading files, or by reprocessing existing files through the option on the Document config page under the Dev tab in the Developer perspective.
To preview the raw metadata, click the “eye” icon that is visible beside the garbage can (delete) icon on the Upload documents dialogue when the user is in the Developer perspective. Also on this dialogue are Developer Options, where the metadata from newer versions of a file can overwrite existing metadata.
Alternatively, the metadata properties can be viewed anywhere the document properties are displayed.
Following are the default field names available for each file type.
Microsoft Office
- dc-title
- dc-description
- dc-creator
- dc-subject
- cp-keywords
- cp-category
- cp-revision
- cp-version
- dcterms-created
- dcterms-modified
PDF
- docinfo-title
- docinfo-subject
- docinfo-keywords
- docinfo-author
- docinfo-creator
- docinfo-producer
- docinfo-creationdate
- docinfo-moddate
- docinfo-[any custom property]
Images
- iptc-keywords
- exif-image-description
- exif-user-comment
- exif-artist
- exif-date-time
- exif-date-time-orginal
- exif-image-width
- exif-image-height
- exif-x-resolution (dpi)
- exif-y-resolution (dpi)
- exif-copyright
- exif-focal-length (mm)
- exif-f-number
- exif-iso-speed-ratings
- exif-gps-altitude (meters)
- exif-gps-altitude-ref
- exif-gps-latitude
- exif-gps-latitude-ref
- exif-gps-longitude
- exif-gps-longitude-ref
- exif-gps-dest-latitude
- exif-gps-dest-latitude-ref
- exif-gps-dest-longitude
- exif-gps-dest-longitude-ref
The following clean-up is done on metadata when they are used in document templates:
- cp-keywords, docinfo-keywords, iptc-keywords:
  - replace ';' by ','.
  - then remove any spaces after or before ','.
  - then replace other non-label chars by '_'.
  - then truncate to 250 chars at last comma.
- dc-title, docinfo-title, exif-image-description:
  - truncate to 250 chars.
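The clean-up steps above can be sketched in Python as follows. This is an illustration only, not PageSeeder's implementation: the function names are hypothetical, the set of "label characters" (ASCII letters, digits, '_' and '-') is an assumption, and so is the hard cut at 250 characters when a keyword string contains no comma to truncate at.

```python
import re

MAX_LEN = 250

# Assumption: label characters are ASCII letters, digits, '_' and '-'.
# The comma is kept because it separates individual labels.
NON_LABEL = re.compile(r"[^A-Za-z0-9_\-,]")


def clean_keywords(raw: str) -> str:
    """Apply the keyword clean-up steps in the documented order."""
    text = raw.replace(";", ",")            # 1. ';' becomes ','
    text = re.sub(r" *, *", ",", text)      # 2. drop spaces around ','
    text = NON_LABEL.sub("_", text)         # 3. other non-label chars become '_'
    if len(text) > MAX_LEN:                 # 4. truncate at the last comma
        cut = text.rfind(",", 0, MAX_LEN + 1)
        # Assumption: with no comma available, fall back to a hard cut.
        text = text[:cut] if cut != -1 else text[:MAX_LEN]
    return text


def clean_title(raw: str) -> str:
    """Titles and image descriptions are only truncated."""
    return raw[:MAX_LEN]
```

For example, `clean_keywords("alpha; beta , gamma delta")` returns `alpha,beta,gamma_delta`: the semicolon becomes a comma, spaces around commas are dropped, and the remaining space is replaced by an underscore.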
Configuration
The media-template.psml controls the processing of the metadata fields for each file type. It is available through the Document config page located under the Dev tab in the Developer perspective.
Following are the default media-template.psml files for different file extensions. They follow the same format as document templates except that @level on <document> must be metadata.
The metadata fields are inserted by using {$meta.[field name]} for attributes and <t:value name="meta.[field name]" /> for content.
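To make the two placeholder forms concrete, the following Python sketch fills a template string from a dictionary of extracted fields. It is an illustration rather than a description of how PageSeeder resolves templates: `fill_template` is a hypothetical helper, and both the XML escaping of values and the treatment of missing fields as empty strings are assumptions.

```python
import re
from xml.sax.saxutils import escape

# Attribute form: {$meta.field-name}
ATTR_REF = re.compile(r"\{\$meta\.([A-Za-z0-9_-]+)\}")
# Content form: <t:value name="meta.field-name" />
VALUE_REF = re.compile(r'<t:value\s+name="meta\.([A-Za-z0-9_-]+)"\s*/>')


def fill_template(template: str, meta: dict) -> str:
    """Substitute both placeholder forms with values from `meta`."""

    def attr(match):
        # Attribute values also need double quotes escaped.
        return escape(meta.get(match.group(1), ""), {'"': "&quot;"})

    def content(match):
        return escape(meta.get(match.group(1), ""))

    # Assumption: a field missing from `meta` yields an empty string.
    return VALUE_REF.sub(content, ATTR_REF.sub(attr, template))
```

Given a fragment such as `<uri title="{$meta.dc-title}">`, the attribute placeholder is replaced with the escaped value of `dc-title`, while a `<t:value name="meta.dc-description" />` element is replaced by the escaped text of `dc-description`.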
To change these, click Create media type to override an existing media template. After creating or modifying a media template, all relevant files can be updated using the reprocess link and choosing either:
- “Add new metadata properties only (preserve existing)”
or
- “Overwrite all metadata and document properties (title, docid, description, labels)”
.docx, .pptx, .xlsx
<document xmlns:t="http://pageseeder.com/psml/template" level="metadata">
  <documentinfo>
    <uri title="{$meta.dc-title}">
      <description>
        <t:value name="meta.dc-description" />
      </description>
      <labels>
        <t:value name="meta.cp-keywords" />
      </labels>
    </uri>
  </documentinfo>
  <metadata>
    <properties>
      <property name="author" title="Author" value="{$meta.dc-creator}" />
    </properties>
  </metadata>
</document>
.pdf
<document xmlns:t="http://pageseeder.com/psml/template" level="metadata">
  <documentinfo>
    <uri title="{$meta.docinfo-title}">
      <description>
        <t:value name="meta.docinfo-subject" />
      </description>
      <labels>
        <t:value name="meta.docinfo-keywords" />
      </labels>
    </uri>
  </documentinfo>
  <metadata>
    <properties>
      <property name="author" title="Author" value="{$meta.docinfo-author}" />
    </properties>
  </metadata>
</document>
.gif, .jpg, .png
<document xmlns:t="http://pageseeder.com/psml/template" level="metadata">
  <documentinfo>
    <uri title="{$meta.exif-image-description}">
      <description>
        <t:value name="meta.exif-user-comment" />
      </description>
      <labels>
        <t:value name="meta.iptc-keywords" />
      </labels>
    </uri>
  </documentinfo>
  <metadata>
    <properties>
      <property name="author" title="Author" value="{$meta.exif-artist}" />
      <property name="latitude" title="Latitude" value="{$meta.exif-gps-latitude} {$meta.exif-gps-latitude-ref}" />
      <property name="longitude" title="Longitude" value="{$meta.exif-gps-longitude} {$meta.exif-gps-longitude-ref}" />
    </properties>
  </metadata>
</document>
Metadata processing makes uploads slower. If there is no requirement for metadata, removing references to meta. fields from the media template makes uploads faster. To disable metadata processing altogether, add the following to the media-template.psml:
<document level="metadata"> </document>
Editing
Metadata for a file can be edited through the Properties dialogue available on the Document View or Document Browse pages. Alternatively, the Edit sheet, available on the Document Browse, Search or Images pages, supports bulk editing of properties in a grid view (see following example).
Edit sheet
The metadata editor behavior is configurable through the editor-config.xml for the editor name PSMLMetadata using the same options as the PSML properties editor.
To create this file, click Create media type on the Document config page. Following is an example editor-config.xml file.
<editor-configs>
  <editor-config name="PSMLMetadata">
    <field name="width" type="text" label="Width" pattern="[0-9]" />
    <field name="height" type="text" label="Height" pattern="[0-9]" />
    <field name="hi-res" type="xref" label="Hi-res">
      <xref-config>
        <target filters="pssubtype:image" />
      </xref-config>
    </field>
    <field name="action" type="select" label="Action">
      <value>None</value>
      <value>Zoom</value>
      <value>Fullscreen</value>
    </field>
  </editor-config>
</editor-configs>
Further information about PageSeeder support for metadata, and how to process it, is available through the following tutorial:
How to use metadata to substitute lo-res with hi-res images.