Provided by: libmsoffice-word-template-perl_2.05-3_all

NAME

       MsOffice::Word::Template - generate Microsoft Word documents from Word templates

SYNOPSIS

         my $template = MsOffice::Word::Template->new($filename);
         my $new_doc  = $template->process(\%data);
         $new_doc->save_as($path_for_new_doc);

DESCRIPTION

   Purpose
       This module treats a Microsoft Word document as a template for generating other documents. The idea is
       similar to the "mail merge" functionality in Word, but with much richer possibilities. The whole power of
       a Perl templating engine can be exploited, for example for

       •   dealing with complex, nested data structures

       •   using control directives for loops, conditionals, subroutines, etc.

       •   defining custom data processing functions or macros

       Template authors just use basic highlighting in MsWord to mark the templating directives:

       •   fragments highlighted in yellow are interpreted as data directives, i.e. the template result will be
           inserted  at  that  point  in  the document, keeping the current formatting properties (bold, italic,
           font, etc.).

       •   fragments highlighted in green are interpreted as control directives that do  not  directly  generate
           content,  like  loops,  conditionals,  etc.  Paragraphs  or  table  rows  around  such directives are
           dismissed, in order to avoid empty paragraphs or empty rows in the resulting document.

       The syntax of data and control directives depends on the backend templating engine.  The  default  engine
       is  the  Perl Template Toolkit; other engines can be specified as subclasses -- see the "TEMPLATE ENGINE"
       section below.

   Status
       This distribution is  a  major  refactoring  of  the  first  version,  together  with  a  refactoring  of
       MsOffice::Word::Surgeon. New features include support for headers and footers, for metadata and for image
       insertion. The internal object-oriented structure has been redesigned.

       This module has been used successfully for a pilot project in my organization, generating quite complex
       documents from deeply nested data structures. However, it has not yet been used at large scale in
       production, so it is quite likely that some teething problems remain to be discovered. If you use this
       module, please keep me informed of your difficulties, tricks, suggestions, etc.

METHODS

   new
         my $template = MsOffice::Word::Template->new($docx);
         # or : my $template = MsOffice::Word::Template->new($surgeon);   # an instance of MsOffice::Word::Surgeon
         # or : my $template = MsOffice::Word::Template->new(docx => $docx, %options);

       In its simplest form, the constructor takes a single argument which is either a string (path  to  a  docx
       document),  or  an  instance  of MsOffice::Word::Surgeon. Otherwise the constructor takes a list of named
       parameters, which can be

       docx
           path  to  a  MsWord  document  in  docx  format.  This  will  automatically  create  an  instance  of
           MsOffice::Word::Surgeon and pass it to the constructor through the "surgeon" keyword.

       surgeon
           an  instance  of  MsOffice::Word::Surgeon. This is a mandatory parameter, either directly through the
           "surgeon" keyword, or indirectly through the "docx" keyword.

       data_color
           the Word highlight color for marking data directives (default : yellow)

       control_color
           the Word highlight color for marking control directives (default : green).   Such  directives  should
           produce no content. They are treated outside of the regular text flow.

       part_names
           an arrayref to the list of package parts to be processed as templates within the ".docx" ZIP archive.
           The  default  list is the main document ("document.xml"), together with all headers and footers found
           in the ZIP archive.

       property_files
           an arrayref to the list of property files (i.e. metadata) to be processed  as  templates  within  the
           ".docx" ZIP archive. For historical reasons, MsWord has three different XML files for storing
           document properties: "core.xml", "app.xml" and "custom.xml". The default list contains those three
           files. Supply an empty list if you don't want any document property to be processed.

       In  addition  to the attributes above, other attributes can be passed to the constructor for specifying a
       templating engine different from the default Perl Template  Toolkit.   These  are  described  in  section
       "TEMPLATE ENGINE" below.
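
       As a minimal sketch of the named-parameter form, the call below overrides the highlight colors and
       disables the processing of document properties. The file name is hypothetical, and the color names
       "cyan" and "magenta" are an assumption: they are standard Word highlight colors, but check that your
       version of the module accepts them in the same form as the defaults.

```perl
use strict;
use warnings;
use MsOffice::Word::Template;

my $template = MsOffice::Word::Template->new(
  docx           => 'report_template.docx',  # hypothetical path to the template
  data_color     => 'cyan',                  # assumption: used instead of the default yellow
  control_color  => 'magenta',               # assumption: used instead of the default green
  property_files => [],                      # empty list: no document property is processed
);
```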

   process
         my $new_doc = $template->process(\%data);
         $new_doc->save_as($path_for_new_doc);

       Processes  the  template  on  a  given data tree, and returns a new document (actually, a new instance of
       MsOffice::Word::Surgeon).  That document can then be saved  using "save_as" in MsOffice::Word::Surgeon.
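
       The sketch below runs the whole cycle on a nested data tree. The file names and the hash keys are
       invented for illustration; the keys must match whatever variable names are highlighted in your own
       template.

```perl
use strict;
use warnings;
use MsOffice::Word::Template;

# Hypothetical paths; replace them with your own template and output files.
my $template_path = 'invoice_template.docx';
my $output_path   = 'invoice_0042.docx';

# A nested data tree; with the default engine, a yellow zone containing
# "customer.name" or "items.0.label" would address these values.
my %data = (
  customer => { name => 'Jane Doe', city => 'Geneva' },
  items    => [
    { label => 'Widget', qty => 3, price => 9.50  },
    { label => 'Gadget', qty => 1, price => 24.00 },
  ],
);

my $template = MsOffice::Word::Template->new($template_path);
my $new_doc  = $template->process(\%data);  # an instance of MsOffice::Word::Surgeon
$new_doc->save_as($output_path);
```

       The same template object can presumably be processed several times with different data trees, since
       each call returns a new document.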

AUTHORING TEMPLATES

   Textual content
       A template is just a regular Word document, in  which  the  highlighted  fragments  represent  templating
       directives.

       The data directives, i.e. the "holes" to be filled, must be highlighted in yellow. Such zones must contain
       the names of the variables that fill the holes. If the template engine supports it, variable names can be
       paths into a complex data structure, with dots separating the levels, like "foo.3.bar.-1" -- see "GET" in
       Template::Manual::Directive and Template::Manual::Variables if you are using the Template Toolkit.

       Control directives like "IF", "FOREACH", etc. must be highlighted in green. When seeing a green zone, the
       system  will  remove  XML  markup  for  the  surrounding text and run nodes. If the directive is the only
       content of the paragraph, then the paragraph node is also removed. If this occurs within the  first  cell
       of  a  table  row,  the markup for that row is also removed. This mechanism ensures that the final result
       will not contain empty paragraphs or empty rows at places corresponding to control directives.

       In consequence of this distinction between yellow and green highlights, templating zones cannot mix  data
       directives  with  control directives : a data directive within a green zone would generate output outside
       of the regular XML flow (paragraph nodes, run nodes and text nodes), and therefore MsWord would  generate
       an  error  when  trying  to  open such content. There is a workaround, however : data directives within a
       green zone will work if they also generate the appropriate markup for paragraph nodes, run nodes and text
       nodes.

       To highlight using LibreOffice, set the Character Highlighting to Export As "Highlighting" instead of the
       default "Shading". See https://help.libreoffice.org/7.5/en-US/text/shared/optionen/01130200.html.

       See also MsOffice::Word::Template::Engine::TT2 for additional advice on authoring templates based on  the
       Template Toolkit.

   Images
       Insertion of generated images such as barcodes is done in two steps:

       •   the template must contain a placeholder image: this is an arbitrary image, positioned within the
           document through usual MsWord commands, including alignment instructions, border, etc. That image
           must be given an alternative text (see
           https://support.microsoft.com/en-us/office/add-alternative-text-to-a-shape-picture-chart-smartart-graphic-or-other-object-44989b2a-903c-4d9a-b742-6a75b451c669).
           That text will be used as a unique identifier for the image.

       •   somewhere in the document (it doesn't matter where), a directive must replace the  placeholder  image
           by a generated image.  For example for a barcode, the TT2 directive looks like :

             [[ PROCESS barcode type="QRCode" img="my_image_name" content="some value for the QR code" ]]

           See  "barcodes"  in MsOffice::Word::Template::Engine::TT2 for details. The source code can be used as
           an example of how to implement other image generating blocks.

   Metadata (also known as "document properties" in MsWord parlance)
       MsWord documents store metadata, also called "document properties". Each property has a name and a value.
       A number of property names are builtin, like 'author' or 'description'; other custom  properties  can  be
       defined.  Properties  are  edited from the MsWord "Backstage view" (the screen displayed after a click on
       the File tab).

       For feeding values into document properties, just use the regular syntax of the  templating  engine.  For
       example  with the default Template Toolkit engine, directives are enclosed in '[% ' and ' %]'; so you can
       write

         [% path.to.subject.data %]

       within the 'subject' property of the MsWord template, and the resulting document will  have  its  subject
       filled with the given data path.

       Obviously,  the  reason  for  this  different  mechanism  is  that MsWord has no support for highlighting
       contents in property values.

       Unfortunately, this mechanism only works for document properties of type 'string'. MsWord does not
       allow templating syntax within fields of type boolean, number or date.

TEMPLATE ENGINE

       This  module  invokes  a  backend templating engine for interpreting the template directives. The default
       engine is MsOffice::Word::Template::Engine::TT2,  built  on  top  of  Template  Toolkit.  Another  engine
       supplied  in  this  distribution is MsOffice::Word::Template::Engine::Mustache, mostly as an example.  To
       implement another engine, just subclass MsOffice::Word::Template::Engine.

       To use an engine different from the default, the following arguments must be supplied to the "new" method
       :

       engine_class
           The name of  the  engine  class.  If  the  class  sits  within  the  MsOffice::Word::Template::Engine
           namespace, just the suffix is sufficient; otherwise, specify the fully qualified class name.

       engine_args
           An optional list of parameters that may be used for initializing the engine
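
       For instance, selecting the Mustache engine shipped with this distribution could look like the sketch
       below. The file name is hypothetical; "engine_args" is left out because its exact content depends on
       the chosen engine.

```perl
use strict;
use warnings;
use MsOffice::Word::Template;

my $template = MsOffice::Word::Template->new(
  docx         => 'letter_template.docx',  # hypothetical path to the template
  engine_class => 'Mustache',              # suffix of MsOffice::Word::Template::Engine::Mustache
);
```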

       After  initialization  the  engine  will  receive  a  "compile_template" method call for each part in the
       ".docx" package. The default parts to be handled are the main document  body  ("document.xml"),  and  all
       headers  and footers. A different list of package parts can be supplied through the "part_names" argument
       to the constructor.

       In addition to the package parts, templates are  also  compiled  for  the  property  files  that  contain
       metadata  such as author name, subject, description, etc. The list of files can be controlled through the
       "property_files" argument to the constructor.

       When processing templates, the engine must make sure that ampersand characters and angle brackets are
       automatically replaced by the corresponding HTML entities (otherwise the resulting XML would be invalid
       and could not be opened by Microsoft Word). The Mustache engine does this automatically. The Template
       Toolkit engine would normally require an explicit "html" filter on each directive:

         [% foo.bar | html %]

       but thanks to the Template::AutoFilter module, this is performed automatically.
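
       An author of a custom engine has to provide an equivalent escaping step. The helper below is only an
       illustrative sketch of that step; "xml_escape" is a hypothetical name, not a function of this module.

```perl
use strict;
use warnings;

# Illustrative helper (not part of the module): escape the three characters
# that would otherwise corrupt the XML of the generated document.
sub xml_escape {
  my ($text) = @_;
  my %entity = ('&' => '&amp;', '<' => '&lt;', '>' => '&gt;');
  $text =~ s/([&<>])/$entity{$1}/g;   # single pass, so '&amp;' is never escaped twice
  return $text;
}

print xml_escape('R&D <draft>'), "\n";   # prints: R&amp;D &lt;draft&gt;
```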

TROUBLESHOOTING

       If a document generated by this module cannot be opened in Word, it is probably because the XML generated by
       your template is not balanced and therefore not well-formed. For example a template like this:

         This paragraph [[ IF condition ]]
            may have problems
         [[END]]

       is  likely  to  generate  incorrect XML, because the IF statement starts in the middle of a paragraph and
       closes at a different paragraph -- therefore when the condition evaluates  to  false,  the  XML  tag  for
       closing the initial paragraph will be missing.
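
       The effect can be reproduced on a simplified model. In the sketch below, the XML strings are a rough
       assumption of what the template produces in each case, and "tags_balanced" is a hypothetical helper
       written for this illustration, not part of the module.

```perl
use strict;
use warnings;

# Illustrative check (not part of the module): returns true when every
# opening tag in a simplified XML fragment has a matching closing tag.
sub tags_balanced {
  my ($xml) = @_;
  my @stack;
  while ($xml =~ m{<(/?)([\w:]+)[^>]*?(/?)>}g) {
    my ($closing, $name, $self_closing) = ($1, $2, $3);
    next if $self_closing;                 # e.g. <w:br/> opens nothing
    if ($closing) {
      return 0 unless @stack && pop(@stack) eq $name;
    }
    else {
      push @stack, $name;
    }
  }
  return !@stack;                          # leftover open tags mean unbalanced XML
}

# Simplified model: the IF directive sits inside the first paragraph, so the
# tag closing that paragraph belongs to the conditional part.
my $head   = '<w:p><w:r><w:t>This paragraph </w:t></w:r>';
my $inside = '</w:p><w:p><w:r><w:t>may have problems</w:t></w:r></w:p>';

print "condition true:  ", (tags_balanced($head . $inside) ? "balanced" : "unbalanced"), "\n";
print "condition false: ", (tags_balanced($head)           ? "balanced" : "unbalanced"), "\n";
```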

       Compound directives like IF .. END, FOREACH .. END, TRY .. CATCH .. END should therefore be
       balanced, either all within the same paragraph, or each directive in a separate paragraph. Examples
       like this should be successful:

         This paragraph [[ IF condition ]]has an optional part[[ ELSE ]]or an alternative[[ END ]].

         [[ SWITCH result ]]
         [[ CASE 123 ]]
            Not a big deal.
         [[ CASE 789 ]]
            You won the lottery.
         [[ END ]]

AUTHOR

       Laurent Dami, <dami AT cpan DOT org>

COPYRIGHT AND LICENSE

       Copyright 2020-2024 by Laurent Dami.

       This  program  is free software, you can redistribute it and/or modify it under the terms of the Artistic
       License version 2.0.

perl v5.40.0                                       2024-10-31                      MsOffice::Word::Template(3pm)