Provided by: tcllib_2.0+dfsg-4_all

NAME

       doctools::idx::import - Importing keyword indices

SYNOPSIS

       package require doctools::idx::import ?0.2.2?

       package require Tcl 8.5 9

       package require struct::map

       package require doctools::idx::structure

       package require snit

       package require pluginmgr

       ::doctools::idx::import objectName

       objectName method ?arg arg ...?

       objectName destroy

       objectName import text text ?format?

       objectName import file path ?format?

       objectName import object text object text ?format?

       objectName import object file object path ?format?

       objectName config names

       objectName config get

       objectName config set name ?value?

       objectName config unset pattern...

       objectName includes

       objectName include add path

       objectName include remove path

       objectName include clear

       IncludeFile currentfile path

       import text configuration

________________________________________________________________________________________________________________

DESCRIPTION

       This package provides a class to manage the plugins for the import of keyword indices from other formats,
       i.e. their conversion from, for example, docidx, json, etc.

       This  is  one  of  the  three  public pillars the management of keyword indices resides on. The other two
       pillars are

       [1]    Exporting keyword indices, and

       [2]    Holding keyword indices

       For information about the Concepts of keyword indices, and their parts, see the same-named section.   For
       information  about  the  data structure which is the major output of the manager objects provided by this
       package see the section Keyword index serialization format.

       The plugin system of our class is based on the package pluginmgr, and  configured  to  look  for  plugins
       using

       [1]    the environment variable DOCTOOLS_IDX_IMPORT_PLUGINS,

       [2]    the environment variable DOCTOOLS_IDX_PLUGINS,

       [3]    the environment variable DOCTOOLS_PLUGINS,

       [4]    the path "~/.doctools/idx/import/plugin"

       [5]    the path "~/.doctools/idx/plugin"

       [6]    the path "~/.doctools/plugin"

       [7]    the path "~/.doctools/idx/import/plugins"

       [8]    the path "~/.doctools/idx/plugins"

       [9]    the path "~/.doctools/plugins"

       [10]   the registry entry "HKEY_CURRENT_USER\SOFTWARE\DOCTOOLS\IDX\IMPORT\PLUGINS"

       [11]   the registry entry "HKEY_CURRENT_USER\SOFTWARE\DOCTOOLS\IDX\PLUGINS"

       [12]   the registry entry "HKEY_CURRENT_USER\SOFTWARE\DOCTOOLS\PLUGINS"

       The last three are used only when the package is run on a machine using the Windows(tm) operating
       system.

       The whole system is delivered with two predefined import plugins, namely

       docidx See docidx import plugin for details.

       json   See json import plugin for details.

       For readers wishing to write their own import plugin for some format, i.e. plugin writers, reading and
       understanding the section Import plugin API v2 reference is an absolute necessity, as it specifies the
       interaction between this package and its plugins in detail.

CONCEPTS

       [1]    A keyword index consists of a (possibly empty) set of keywords.

       [2]    Each keyword in the set is identified by its name.

       [3]    Each keyword has a (possibly empty) set of references.

       [4]    A reference can be associated with more than one keyword.

       [5]    A reference not associated with at least one keyword is not possible, however.

       [6]    Each  reference  is  identified  by  its  target, specified as either an url or symbolic filename,
              depending on the type of reference (url, or manpage).

       [7]    The type of a reference (url, or manpage) depends only  on  the  reference  itself,  and  not  the
              keywords it is associated with.

       [8]    In  addition  to a type each reference has a descriptive label as well. This label depends only on
              the reference itself, and not the keywords it is associated with.

       A few notes

       [1]    Manpage references are intended to be used for references to the documents the index is made  for.
              Their  target  is  a  symbolic  file name identifying the document, and export plugins may replace
              symbolic with actual file names, if specified.

       [2]    Url references, on the other hand, are intended to be used for links to anything else, like
              websites. Their target is an url.

       [3]    While  url  and  manpage  references  share  a  namespace for their identifiers, this should be no
              problem, given that manpage identifiers are symbolic filenames and as such they should never  look
              like urls, the identifiers for url references.
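
       The concepts above can be modelled with two plain Tcl dictionaries. This is only an illustrative
       sketch: the keyword names, targets, labels, and the helper orphanedReferences are made up for the
       example and are not part of this package.

```tcl
# Hypothetical in-memory model of a keyword index (not the package's API).
# Keyword name -> list of reference targets. A reference may be shared.
set keywords {
    import {idx_import http://example.com/}
    plugin {idx_import}
    empty  {}
}

# Reference target -> {type label}. Type and label belong to the reference
# itself, not to the keywords using it.
set references {
    idx_import          {manpage {Importing keyword indices}}
    http://example.com/ {url {An example site}}
}

# Return the targets of all references which no keyword points to.
# By rule [5] of the concepts this list has to be empty.
proc orphanedReferences {keywords references} {
    set used {}
    dict for {name targets} $keywords {
        foreach target $targets {
            dict set used $target 1
        }
    }
    set orphans {}
    foreach target [dict keys $references] {
        if {![dict exists $used $target]} {
            lappend orphans $target
        }
    }
    return $orphans
}

puts "orphans: [orphanedReferences $keywords $references]"
```

       Here idx_import is shared by two keywords, and the keyword empty has no references at all, both of
       which the concepts allow.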

API

   PACKAGE COMMANDS
       ::doctools::idx::import objectName
              This  command  creates  a  new  import manager object with an associated Tcl command whose name is
              objectName. This object command is explained in full detail in the  sections  Object  command  and
              Object  methods.  The object command will be created under the current namespace if the objectName
              is not fully qualified, and in the specified namespace otherwise.
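
       A minimal sketch of creating and destroying manager objects, assuming tcllib is installed. The
       object names im and ::app::im and the namespace ::app are made up for the example.

```tcl
package require doctools::idx::import

# Relative name: the object command is created in the current namespace,
# which is the global namespace in this script.
::doctools::idx::import im

# Fully qualified name: the object command is created in the named
# namespace, which is created first to be on the safe side.
namespace eval ::app {}
::doctools::idx::import ::app::im

puts [info commands ::im]
puts [info commands ::app::im]

# Release both objects again.
im destroy
::app::im destroy
```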

   OBJECT COMMAND
       All objects created by the ::doctools::idx::import command have the following general form:

       objectName method ?arg arg ...?
              The method method and its arguments determine the exact behavior of the command. See section
              Object methods for the detailed specifications.

   OBJECT METHODS
       objectName destroy
              This method destroys the object it is invoked for.

       objectName import text text ?format?
              This  method  takes  the  text  and  converts  it  from  the  specified  format  to  the canonical
              serialization of a keyword index using the import plugin for the format. An error is thrown if  no
              plugin  could  be  found for the format.  The serialization generated by the conversion process is
              returned as the result of this method.

              If no format is specified the method defaults to docidx.

              The specification of what a canonical serialization is can be found in the section  Keyword  index
              serialization format.

              The plugin has to conform to the interface specified in section Import plugin API v2 reference.

       objectName import file path ?format?
              This  method is a convenient wrapper around the import text method described by the previous item.
              It reads the contents of the specified file into memory, feeds the result  into  import  text  and
              returns the resulting serialization as its own result.

       objectName import object text object text ?format?
              This method is a convenient wrapper around the import text method described above. It expects
              that object is an object command supporting a deserialize method expecting the canonical
              serialization of a keyword index. It imports the text using import text and then feeds the
              resulting serialization into the object via deserialize. This method returns the empty string
              as its result.

       objectName import object file object path ?format?
              This method behaves like import object text, except that it reads the text  to  convert  from  the
              specified file instead of being given it as argument.

       objectName config names
              This  method returns a list containing the names of all configuration variables currently known to
              the object.

       objectName config get
              This method returns a dictionary containing the names and values of  all  configuration  variables
              currently known to the object.

       objectName config set name ?value?
              This  method sets the configuration variable name to the specified value and returns the new value
              of the variable.

              If no value is specified it simply returns the current value, without changing it.

              Note that while the user can set the predefined configuration variables user and format, doing
              so will have no effect; these values are internally overridden when invoking an import plugin.

       objectName config unset pattern...
              This method unsets all configuration variables matching the specified glob patterns. If no pattern
              is specified it will unset all currently defined configuration variables.

       objectName includes
              This  method  returns a list containing the currently specified paths to use to search for include
              files when processing input.  The order of paths in the list corresponds to  the  order  in  which
              they  are  used, from first to last, and also corresponds to the order in which they were added to
              the object.

       objectName include add path
              This method adds the specified path to the list of paths to use to search for include files when
              processing  input.  The  path is added to the end of the list, causing it to be searched after all
              previously added paths. The result of the command is the empty string.

              The method does nothing if the path is already known.

       objectName include remove path
              This method removes the specified path from the list of paths to use to search for include files
              when processing input. The result of the command is the empty string.

              The method does nothing if the path is not known.

       objectName include clear
              This method clears the list of paths to use to search for include files when processing input. The
              result of the command is the empty string.
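
       The object methods above might be used together as sketched below, assuming tcllib and its docidx
       import plugin are installed. The object name im, the docidx text, the configuration variable
       release, and the include paths are all made up for the example.

```tcl
package require doctools::idx::import

::doctools::idx::import im

# A small docidx document with one keyword and one manpage reference.
set text {[index_begin demo {Demo Index}]
[key import]
[manpage idx_import {Importing keyword indices}]
[index_end]}

# Convert from docidx, which is also the default when no format is given.
set serial [im import text $text docidx]
puts $serial

# Configuration variables handed to the plugin on each import.
im config set release 1.0
puts [im config names]
puts [im config get]
im config unset rel*

# Include search paths are used in the order they were added.
im include add /tmp/idx/includes
im include add /tmp/idx/shared
puts [im includes]
im include remove /tmp/idx/shared
im include clear

im destroy
```

       The value of serial is a keyword index serialization, so its contents can be inspected with the
       regular dict commands, starting at the key doctools::idx.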

IMPORT PLUGIN API V2 REFERENCE

       Plugins  are  what  this package uses to manage the support for any input format beyond the Keyword index
       serialization format. Here we specify the API the objects created by this package use  to  interact  with
       their plugins.

       A plugin for this package has to follow the rules listed below:

       [1]    A plugin is a package.

       [2]    The name of a plugin package has the form doctools::idx::import::FOO, where FOO is the name of the
              format the plugin will accept input in. This name is also the argument to provide to the various
              import methods of import manager objects to convert a string encoding a keyword index in that
              format.

       [3]    The plugin can expect that the package doctools::idx::export::plugin is present, as indicator that
              it was invoked from a genuine plugin manager.

       [4]    The plugin can expect that a command named IncludeFile is present, with the signature

              IncludeFile currentfile path
                     This  command  has  to be invoked by the plugin when it has to process an included file, if
                     the format has the concept of such. An example of such a format would be docidx.

                     The plugin has to supply the following arguments

                     string currentfile
                            The path of the file it is currently processing. This may be the empty string if  no
                            such is known.

                     string path
                            The path of the include file as specified in the include directive being processed.

                     The result of the command will be a 5-element list containing

                     [1]    A boolean flag indicating the success (True) or failure (False) of the operation.

                     [2]    In  case  of  success  the  contents  of  the  included  file,  and the empty string
                            otherwise.

                     [3]    The resolved, i.e. absolute path of the included file, if possible, or the unchanged
                            path argument. This is for display in an error message, or as the currentfile
                            argument of another call to IncludeFile should this file include further files.

                     [4]    In case of success an empty string, and for failure a code indicating the reason for
                            it, one of

                            notfound
                                   The specified file could not be found.

                            notread
                                   The specified file was found, but could not be read into memory.

                     [5]    An empty string in case of success or a notfound failure, and an additional error
                            message describing the reason for a notread error in more detail.

       [5]    A plugin has to provide one command, with the signature shown below.

              import text configuration
                     Whenever an import manager of doctools::idx has to parse input for an index it will  invoke
                     this command.

                     string text
                            This  argument will contain the text encoding the index per the format the plugin is
                            for.

                     dictionary configuration
                            This argument will contain the current configuration to apply to the parsing,  as  a
                            dictionary mapping from variable names to values.

                            The following configuration variables have a predefined meaning all plugins have to
                            obey, although they can ignore this information at their discretion. Any other
                            configuration variables recognized by a plugin will be described in the manpage for
                            that plugin.

                            user   This  variable is expected to contain the name of the user owning the process
                                   invoking the plugin.

                            format This variable is expected to contain the name of the format whose  plugin  is
                                   invoked.

       [6]    A single usage cycle of a plugin consists of one invocation of the command import. This call has
              to leave the plugin in a state where another usage cycle can be run without problems.
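
       The rules above can be sketched as a small plugin. Everything specific here is an assumption made
       for illustration: the format name lines, its line syntax, the helper ParseLines, and the
       configuration variables title and label. The sketch also assumes that the result of import is a
       keyword index serialization, which the rules above do not state explicitly, and it returns a
       regular rather than a canonical one. A real plugin may additionally check for the pseudo-package
       named in rule [3].

```tcl
# Hypothetical plugin for a made-up format "lines". Each input line is
#   keyword|type|target|label
# or an include directive of the form
#   @include path
package provide doctools::idx::import::lines 0.1

# Parse TEXT, read from FILE (empty string if unknown), and collect the
# results in the caller's dictionaries named by KWVAR and REFVAR.
proc ParseLines {text file kwvar refvar} {
    upvar 1 $kwvar keywords $refvar references
    foreach line [split $text \n] {
        set line [string trim $line]
        if {$line eq ""} continue
        if {[string match "@include *" $line]} {
            set path [string trim [string range $line 9 end]]
            # IncludeFile is supplied by the import manager (rule [4]).
            # Its result is the 5-element list described above.
            lassign [IncludeFile $file $path] ok data fullpath code msg
            if {!$ok} {
                return -code error "include of $fullpath failed: $code $msg"
            }
            ParseLines $data $fullpath keywords references
            continue
        }
        lassign [split $line |] key type target label
        if {$type ni {manpage url}} {
            return -code error "bad reference type \"$type\""
        }
        dict lappend keywords $key $target
        dict set references $target [list $type $label]
    }
    return
}

# The one command every plugin has to provide (rule [5]).
proc import {text configuration} {
    set keywords   {}
    set references {}
    ParseLines $text "" keywords references

    # Made-up configuration variables, defaulting to the empty string.
    set title ""
    set label ""
    if {[dict exists $configuration title]} { set title [dict get $configuration title] }
    if {[dict exists $configuration label]} { set label [dict get $configuration label] }

    # No state is kept between calls, so another cycle can follow (rule [6]).
    return [dict create doctools::idx [dict create \
        keywords   $keywords \
        label      $label \
        references $references \
        title      $title]]
}
```

       Keeping all parsing state in local variables is a simple way to satisfy rule [6], as nothing
       survives from one invocation of import to the next.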

KEYWORD INDEX SERIALIZATION FORMAT

       Here we specify the format used by the doctools v2 packages to serialize  keyword  indices  as  immutable
       values for transport, comparison, etc.

       We distinguish between regular and canonical serializations. While a keyword index may have more than one
       regular serialization, exactly one of them will be canonical.

       regular serialization

              [1]    An index serialization is a nested Tcl dictionary.

              [2]    This  dictionary  holds  a  single  key, doctools::idx, and its value. This value holds the
                     contents of the index.

              [3]    The contents of the index are a Tcl dictionary holding the title of the index, a label, and
                     the keywords and references. The relevant keys and their values are

                     title  The value is a string containing the title of the index.

                     label  The value is a string containing a label for the index.

                     keywords
                            The value is a Tcl dictionary, using the keywords known to the index  as  keys.  The
                            associated  values are lists containing the identifiers of the references associated
                            with that particular keyword.

                            Any reference identifier used in these lists has to exist as a key in the references
                            dictionary, see the next item for its definition.

                     references
                            The value is a Tcl dictionary, using the identifiers for the references known to the
                            index as keys. The associated values are 2-element lists  containing  the  type  and
                            label of the reference, in this order.

                            Any  key here has to be associated with at least one keyword, i.e. occur in at least
                            one of the reference lists which are the values  in  the  keywords  dictionary,  see
                            previous item for its definition.

              [4]    The type of a reference can be one of two values,

                     manpage
                            The  identifier  of the reference is interpreted as symbolic file name, referring to
                            one of the documents the index was made for.

                     url    The identifier of the reference is interpreted as an url, referring to some external
                            location, like a website, etc.

       canonical serialization
              The canonical serialization of a keyword index has the format as specified in the  previous  item,
              and then additionally satisfies the constraints below, which make it unique among all the possible
              serializations of the keyword index.

              [1]    The keys found in all the nested Tcl dictionaries are sorted in ascending dictionary order,
                     as generated by Tcl's builtin command lsort -increasing -dict.

              [2]    The  references  listed  for  each  keyword  of  the index, if any, are listed in ascending
                     dictionary order of their labels, as generated by Tcl's builtin command  lsort  -increasing
                     -dict.
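
       The difference between the two forms can be illustrated with a small helper which turns a regular
       serialization into the canonical one by applying the two constraints above. The proc canonicalize
       and the sample index are made up for this sketch; they are not part of this package.

```tcl
# Hypothetical helper: derive the canonical serialization from a regular one.
proc canonicalize {serial} {
    set index      [dict get $serial doctools::idx]
    set references [dict get $index references]

    # Keywords in ascending dictionary order (constraint [1]), each with
    # its references ordered by their labels (constraint [2]).
    set keywords {}
    foreach key [lsort -increasing -dict [dict keys [dict get $index keywords]]] {
        set pairs {}
        foreach id [dict get $index keywords $key] {
            lappend pairs [list [lindex [dict get $references $id] 1] $id]
        }
        set ids {}
        foreach pair [lsort -increasing -dict -index 0 $pairs] {
            lappend ids [lindex $pair 1]
        }
        dict set keywords $key $ids
    }

    # Reference identifiers in ascending dictionary order (constraint [1]).
    set sortedrefs {}
    foreach id [lsort -increasing -dict [dict keys $references]] {
        dict set sortedrefs $id [dict get $references $id]
    }

    # The four content keys, inserted in their sorted order.
    return [dict create doctools::idx [dict create \
        keywords   $keywords \
        label      [dict get $index label] \
        references $sortedrefs \
        title      [dict get $index title]]]
}

# A regular, non-canonical serialization with made-up contents.
set regular {doctools::idx {
    title {Demo Index}
    label demo
    references {
        idx_import          {manpage {Importing keyword indices}}
        http://example.com/ {url {An example site}}
    }
    keywords {
        plugin {idx_import}
        import {idx_import http://example.com/}
    }
}}

puts [canonicalize $regular]
```

       Because dictionaries keep their insertion order, inserting the keys in sorted order is enough to
       fix the order of the resulting string.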

BUGS, IDEAS, FEEDBACK

       This  document,  and  the package it describes, will undoubtedly contain bugs and other problems.  Please
       report such in the category  doctools  of  the  Tcllib  Trackers  [http://core.tcl.tk/tcllib/reportlist].
       Please also report any ideas for enhancements you may have for either package and/or documentation.

       When proposing code changes, please provide unified diffs, i.e. the output of diff -u.

       Note  further  that  attachments  are strongly preferred over inlined patches. Attachments can be made by
       going to the Edit form of the ticket immediately after its creation, and then using the left-most  button
       in the secondary navigation bar.

KEYWORDS

       conversion,  docidx, documentation, import, index, json, keyword index, manpage, markup, parsing, plugin,
       reference, url

CATEGORY

       Documentation tools

COPYRIGHT

       Copyright (c) 2009-2019 Andreas Kupries <andreas_kupries@users.sourceforge.net>

tcllib                                                0.2.2                          doctools::idx::import(3tcl)