SCHEMA Blog (EN)

Corporate blog of SCHEMA GmbH

From the Word Import to the Document Import for FrameMaker Files

Longtime users of SCHEMA ST4 DocuManager may remember the “Word import” function
introduced in 2005. Back then, a VBA macro (Visual Basic for Applications) attempted
to change the structure and markup of any given Word document so that the result
was marked up according to DocuManager requirements. Technically, the macro
examined every character of the Word document and decided, on the basis of the
implemented algorithm, which markup the character, word or paragraph should
receive. The only nice thing about this implementation: because editing Word
documents via VBA is inefficient and computers were considerably slower at the
time, it was possible to watch the algorithm do its job. As if by magic, the original
document slowly transformed into a DocuManager document that could then be
imported. Admittedly, the suspense flattened noticeably after watching the
transformation of a few pages.

We accepted rather quickly that the “Word import” function needed a complete
revision with regard to efficiency, customizability and extensibility to other
document types. The result first became available in DocuManager 2.0.1 in 2007.
Along with the implementation, the name of the function changed as well: since then
it has been called “Document Import”. With ST4 DocuManager 2012, Adobe FrameMaker
documents can be imported for the first time via the same function that previously
handled Word documents. This also applies to unstructured FM files.

[Figure: FM document import]

The Document Import is technically based on XML structures: it expects a very simply
structured XML file as input. The interactive mapping dialogue in the
DocuManager loads this file and determines the paragraph and character formats in
use from the elements it contains. In doing so, it also evaluates context
information (e.g. a paragraph in a table or listing). The user can then assign these
paragraph and character formats to the formats configured in the DocuManager.
Finally, the Document Import uses the mapping information to transform the input
XML file into an XML representation that conforms to DocuManager.
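
The two steps described above, collecting the formats in use together with their
context and then applying a user-defined mapping, can be sketched roughly as
follows. This is only an illustration and not the actual DocuManager code: the
element names (para, char, table), the "format" attribute and both function names
are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical input file. The real intermediate XML format is not documented
# in this post, so element and attribute names here are assumptions.
SAMPLE = """<document>
  <para format="Heading1">Introduction</para>
  <para format="Body">Some <char format="Emphasis">important</char> text.</para>
  <table>
    <row><cell><para format="Body">Cell text</para></cell></row>
  </table>
</document>"""


def _walk(elem, context, visit):
    """Call visit(elem, context) for every paragraph and character element."""
    if elem.tag in ("para", "char"):
        visit(elem, context)
    # Everything below a <table> element is seen in "table" context.
    child_context = "table" if elem.tag == "table" else context
    for child in elem:
        _walk(child, child_context, visit)


def collect_formats(root):
    """Return the set of (kind, format name, context) triples used in the file."""
    found = set()
    _walk(root, "body", lambda e, ctx: found.add((e.tag, e.get("format"), ctx)))
    return found


def apply_mapping(root, mapping):
    """Rename source formats in place; unmapped formats are left unchanged."""
    def rename(elem, context):
        key = (elem.tag, elem.get("format"), context)
        if key in mapping:
            elem.set("format", mapping[key])

    _walk(root, "body", rename)
    return root


root = ET.fromstring(SAMPLE)
formats = collect_formats(root)

# The same source format "Body" can be mapped differently depending on context.
mapping = {
    ("para", "Body", "body"): "Paragraph",
    ("para", "Body", "table"): "TableText",
}
apply_mapping(root, mapping)
```

Keying the mapping on the context as well as the format name is what allows a
paragraph format used inside a table to receive a different target format than the
same format in running text.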

To create this XML input file, the DocuManager plug-in in Adobe FrameMaker
initially uses the built-in FrameMaker function “Save As XML”. Contrary to the way
the standard function is meant to be used, however, the plug-in does not use the
FrameMaker conversion table that assigns the paragraph and character formats to the
expected XML elements. There are two reasons for this: applying the conversion table
requires special FrameMaker know-how, and the completeness of the conversion table
would have to be checked anew for each document.

The standard function “Save As XML” has further disadvantages:

  • Tables lose some of their properties. For example, the information on table
    and column width is lost.
  • All graphics are converted to a format defined by FrameMaker (GIF by default),
    usually of lower quality than the original. This could only be changed by
    creating a “Structured Application” specifically for this purpose and
    integrating it into the XML export process.

The DocuManager plug-in avoids these problems with the following measures:

  • Before the XML export, the ST4 FrameMaker import plug-in collects the table
    properties and stores them in such a way that they are preserved during the
    XML import.
  • During the XML export, the image size and the path to the original graphic
    are stored in such a way that they are preserved during the XML import.
    Where possible, the graphic reference is reset to the path of the original
    file and the unintentionally converted graphic file is ignored.
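
The idea behind both measures can be illustrated with a small sketch: properties
collected before the export are written back onto the exported XML afterwards. How
the plug-in actually stores this information is not described here, so the element
names, the attribute names and the matching of tables and graphics by document
order are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def restore_properties(exported_xml, table_props, graphic_props):
    """Write previously collected properties back onto an exported XML string.

    table_props and graphic_props are hypothetical side tables, keyed by the
    position of the table or graphic in document order.
    """
    root = ET.fromstring(exported_xml)

    for index, table in enumerate(root.iter("table")):
        props = table_props.get(index)
        if props is None:
            continue
        table.set("width", props["width"])
        table.set("colwidths", " ".join(props["column_widths"]))

    for index, graphic in enumerate(root.iter("graphic")):
        props = graphic_props.get(index)
        if props is None:
            continue
        # The image size is always restored.
        graphic.set("width", props["width"])
        graphic.set("height", props["height"])
        # Where the original file is known, point back to it and ignore the
        # converted copy; otherwise keep the converted file as a fallback.
        if props.get("original_path"):
            graphic.set("file", props["original_path"])

    return root


exported = (
    "<document>"
    "<table><row/></table>"
    '<graphic file="converted/img1.gif"/>'
    '<graphic file="converted/img2.gif"/>'
    "</document>"
)
tables = {0: {"width": "160mm", "column_widths": ["40mm", "120mm"]}}
graphics = {
    0: {"width": "80mm", "height": "60mm", "original_path": "images/img1.eps"},
    1: {"width": "50mm", "height": "50mm", "original_path": None},
}
restored = restore_properties(exported, tables, graphics)
```

In this example the first graphic points back to its original file, while the
second keeps the converted GIF because no original path was recorded for it.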

As usual in ST4, the paragraph and character formats are mapped via the mapping
dialogue of the Document Import. If the source format changes, usually only minor
adaptations to the mapping are necessary. Saved mappings can be reused again and
again and changed via drag-and-drop.

Unfortunately, all this now happens so fast that even with larger documents the
algorithm can no longer be watched doing its job.
