Provided by: mmorph_2.3.4.2-17_amd64

NAME

       mmorph - MULTEXT morphology tool

SYNOPSIS


       information:
              mmorph [ -vh ]

       parse only:
              mmorph -y | -z [ -a addfile ]
              -m morphfile [ -d debug_map ] [ -l logfile ] [ infile [ outfile ]]

       generate:
              mmorph -c | -n [ -t trace_level ] [ -s trace_level ] [ -a addfile ]
              -m morphfile [ -d debug_map ] [ -l logfile ] [ infile [ outfile ]]

       simple lookup:
              mmorph [ -fi ] [ -b | -k ] [ -r rejectfile ]
              -m morphfile [ -d debug_map ] [ -l logfile ] [ infile [ outfile ]]

       record/field lookup:
              mmorph -C classes [ -fU ] [ -E | -O ] [ -b | [ -k ] [ -B class ]]
              -m morphfile [ -d debug_map ] [ -l logfile ] [ infile [ outfile ]]

       dump database:
              mmorph -p | -q
              -m morphfile [ -d debug_map ] [ -l logfile ] [ infile [ outfile ]]

DESCRIPTION

       In the simplest mode of operation, with just the -m morphfile option, mmorph operates in lookup mode: it
       will open an existing database called morphfile.db and look up all the string segments (usually
       corresponding to words) in the input.

       To create the database from the lexical entries specified in morphfile, use -c -m morphfile. The file
       morphfile.db should not already exist. When the database is complete, mmorph looks up the segments in the
       input. If used interactively (input and output are terminals), a prompt is printed when the program
       expects the user to type a segment string. No prompting occurs in record/field mode.

       To  test the rule applications on the lexical entries specified in morphfile, without creating a database
       and without looking up segments, use -n -m morphfile.  This automatically sets the trace level to 1 if it
       was not specified.

       In order to do the same operations as above, but on the alternate set of lexical entries in addfile,  use
       the  extra  option  -a  addfile.   The lexical entries in morphfile will be ignored.  This is useful when
       making additions to a standard morphological description.  Be aware that entries added  to  the  database
       morphfile.db do not replace existing ones.

   How to test a morphological description
       Use the -n option. In the Grammar section, specify goal rules that will match the desired results. In
       the Lexicon section, specify the lexical items you want to test. When mmorph runs, all rules are applied
       (recursively) to the lexical items; whenever an applied rule is a goal, the result of the application is
       printed on the output.

       Suggestion: Put the two parts mentioned above (goal rules and Lexicon  section)  in  separate  files  and
       reference these files with an #include directive where they should occur in the main input file.

       If you are using an existing description and want to test only new lexical entries, use the options -n -a
       addfile, and put the lexical entries in addfile.

OPTIONS

       -a addfile
              Ignore lexical entries in morphfile, take them from addfile instead.

       -B class
              Specifies  the  record  class  that  occurs before the beginning of a sentence.  Capitalized words
              occurring just after such records will also be looked up  with  all  their  letters  converted  to
              lowercase (according to LC_CTYPE, see below).

       -b     Fold case before lookup. Uppercase letters are converted to lowercase letters (according to
              LC_CTYPE, see below) before a word is looked up.

       -C classes
              Determines record/field mode. Specifies the record classes that should be looked up.  Class  names
              should be separated by comma ",", TAB, space, bar "|" or backslash "\".
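
              As an illustration of the separator list above, the following Python sketch splits a -C argument
              into class names. The class names are hypothetical, and treating a run of separators as a single
              separator is an assumption of this sketch, not documented mmorph behaviour.

```python
import re

# Separators listed for -C: comma, TAB, space, bar and backslash.
# Assumption: consecutive separators count as one.
_SEPARATORS = re.compile(r"[,\t |\\]+")


def split_classes(classes):
    """Return the class names contained in a -C argument string."""
    return [name for name in _SEPARATORS.split(classes) if name]


# Hypothetical class names, for illustration only.
print(split_classes("Tok|Abbrev,Punct"))
```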

       -c     Create a new database for lookup. The name of the created file is the name of morphfile (-m
              option) with suffix .db. It should not exist; if it exists the user should remove it manually
              before running mmorph -c (this is a minimal protection against accidentally overwriting a database
              that might have taken a long time to create).
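
              A wrapper script can make the same check before starting a long database build. This minimal
              Python sketch only builds the argument list and does not run mmorph; the helper name
              creation_argv is hypothetical.

```python
import os


def creation_argv(morphfile):
    """Build the argument list for 'mmorph -c -m morphfile'.

    Refuses to continue if morphfile.db already exists, mirroring the
    rule that the database file must be removed manually first.
    """
    db_path = morphfile + ".db"
    if os.path.exists(db_path):
        raise FileExistsError(db_path + " exists; remove it before running mmorph -c")
    return ["mmorph", "-c", "-m", morphfile]
```

              The returned list can be handed to subprocess.run() once the check has passed.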

       -d debug_map
              Specify which debug options are wanted. Each bit in debug_map corresponds to an option.
              bit        decimal  hexadecimal  purpose
              no bits          0  0x0          no debug option (default)
              1                1  0x1          debug initialisation
              2                2  0x2          debug yacc parsing
              3                4  0x4          debug rule combination
              4                8  0x8          debug spelling application
              5               16  0x10         print statistics with -p or -q options
              all bits        -1  0xffff       all debug options whatever they are
              To combine options add the decimal or hexadecimal values together. Example: -d 0x5 (0x1 + 0x4)
              specifies bits (options) 1 and 3.
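
              The arithmetic can be sketched in Python as follows, assuming that bit n of the table carries the
              value 2**(n - 1), as the decimal column shows. The short option names in the dictionary are
              labels invented for this sketch.

```python
# Bit numbers taken from the table; the keys are labels made up for this sketch.
DEBUG_BITS = {
    "initialisation": 1,
    "yacc parsing": 2,
    "rule combination": 3,
    "spelling application": 4,
    "statistics": 5,
}


def debug_map(*names):
    """Combine the named debug options into a value for -d."""
    value = 0
    for name in names:
        value |= 1 << (DEBUG_BITS[name] - 1)  # bit n has value 2**(n - 1)
    return value


print(hex(debug_map("initialisation", "rule combination")))  # prints 0x5
```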

       -E     In record/field mode, extends the morphology annotations if they already exist (the default is  to
              leave existing annotations as is).

       -O     In  record/field  mode, overwrite the morphology annotations if they already exist (the default is
              to leave existing annotations as is).

       -f     Flush the output after each segment lookup. This is useful only if input and output are piped from
              and to a program that needs to synchronize them.

       -h     Print help and exit.

       -i     Prepend the result of each lookup with the identifier of the  input  segment  it  corresponds  to.
              Currently  input  segments  are  identified  by their sequential number, starting at 0.  With this
              indication, the extra newline separating the solutions for different input segments is not printed
              because it is not needed.  If a lookup has no solutions, only the segment identifier is printed on
              the output. The segment identifier is also prepended to rejected segments.  A tab  always  follows
              the segment identifier.
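
              A program reading the -i output can regroup the lines by segment identifier. This Python sketch
              assumes only what is stated above (identifier, then a tab, then the solution, or nothing when
              there is no solution); the sample solutions are placeholders, not real mmorph output.

```python
def group_by_segment(lines):
    """Map each segment number to the list of its solutions (possibly empty)."""
    solutions = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        ident, _, rest = line.partition("\t")
        bucket = solutions.setdefault(int(ident), [])
        if rest:  # nothing after the tab means the lookup had no solution
            bucket.append(rest)
    return solutions


# Placeholder lines: segment 0 has two solutions, segment 1 has none.
sample = ["0\tsolution-a\n", "0\tsolution-b\n", "1\t\n", "2\tsolution-c\n"]
print(group_by_segment(sample))
```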

       -k     Fallback fold case. If a word lookup failed, then convert all uppercase letters to lowercase and
              try the lookup again. (Conversion is done according to LC_CTYPE, see below.)
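
              The difference between -b and -k can be illustrated with a toy lookup table. This Python sketch
              is not mmorph's implementation: the dictionary stands in for the database, its entries are
              invented, and str.lower() follows Unicode rules rather than LC_CTYPE.

```python
def lookup(word, database, fold=None):
    """Look up word; fold='before' mimics -b, fold='fallback' mimics -k."""
    if fold == "before":
        # -b: always fold case first, so the exact form is never tried.
        return database.get(word.lower(), [])
    found = database.get(word, [])
    if not found and fold == "fallback":
        # -k: fold case only when the exact form was not found.
        found = database.get(word.lower(), [])
    return found


# Invented entries standing in for a real database.
toy_db = {"Bath": ["proper noun"], "bath": ["common noun"]}

print(lookup("Bath", toy_db, fold="before"))    # ['common noun']
print(lookup("Bath", toy_db, fold="fallback"))  # ['proper noun']
print(lookup("BATH", toy_db, fold="fallback"))  # ['common noun']
```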

       -l logfile
              Specify the file for writing trace and error messages.  Defaults to standard error.

       -m morphfile
              Specify the file containing the morphology description.  See mmorph (5) for a description  of  the
              formalism's syntax.

       -n     No database creation or lookup (test mode).

       -p     Dump  the typed feature structure database to outfile (or standard output).  The count of distinct
              tfs is given in the logfile (or standard error) if bit 5 of debug option is set.

       -q     Dump the forms in the database to outfile (or standard output).  Some statistics are given in  the
              logfile (or standard error) if bit 5 of debug option is set.

       -r rejectfile
              In non record/field mode, specifies the file to which input segments that could not be looked up
              are written. Defaults to standard error.

       -s trace_level
              Trace spelling rule application:
              0  no tracing (default).
              1  trace valid surface forms.
              2  trace rules whose lexical part match.
              3  trace surface left context match (surface word construction).
              4  trace surface right context mismatch and rule blocking.
              5  trace rule non blocking.
              A trace_level implies all preceding ones.

       -t trace_level
              Specify the level of tracing for rule application:
              0  no tracing (default).
              1  trace goal rules that apply.
              2  trace all rules that apply, indentation indicates the recursion depth.
              10 trace also rules that were tried but did not apply
              A trace_level implies all preceding ones.

       -U     In record/field mode, unknown words (i.e. words whose lookup failed) are annotated with
              ??\??.

       -v     Print version and exit.

       -y     Parse only: do not process the description other than for syntax checking. While developing a
              morphology description you may use this option to catch syntax errors quickly after each
              modification before running it "for real".

       -z     implies -y. Parse and output the lexical descriptions in normalized form.

       infile file containing the segments to look up, one per line. Defaults to the standard input.

       outfile
              file  in  which  the  output  of  the  program  is  written.  One line per solution.  Solutions of
              different input segments are separated by an empty line.  Defaults to the standard output.

WORD GRAMMAR AND SPELLING RULES

       For a detailed account of the principles and mechanisms used in mmorph, please  refer  to  the  documents
       cited in the SEE ALSO section below.

       Briefly  sketched,  morphosyntactic descriptions written for mmorph describe how words are constructed by
       the concatenation of morphemes, and  how  this  concatenation  process  changes  the  spelling  of  these
       morphemes.   The  first part, the word structure grammar, is specified by restricted context free rewrite
       rules whose formalism is inspired by unification based systems (cf.  Shieber 1986).  The second part, the
       spelling changes, is specified by spelling rules  in  a  formalism  based  on  the  two  level  model  of
       morphology. This approach to morphology is described in Ritchie, Russell et al., 1992 and more
       concisely in Pulman and Hepple 1993.

ENVIRONMENT VARIABLES

       To decide which characters are displayable on the output, mmorph uses the language  specific  description
       that  setlocale(3) sets according to the environment variable LC_CTYPE.  For the languages that are dealt
       with in MULTEXT it is a good idea to have that variable set to iso_8859_1.

EXAMPLES

       Here is a summary of the common usage of mmorph options:

              mmorph -n -m morphfile
       Test mode: reads the whole of morphfile and prints results on standard error.  No database is created, no
       words are looked up.

              mmorph -c -m morphfile
       Database creation:  reads the whole of morphfile and stores the results  in  a  database  (morphfile.db).
       Typed  feature  structures  are collected in a separate file (morphfile.tfs).  Standard input is read for
       words to look up in the new database.

              mmorph -m morphfile
       Lookup mode: reads only the Alphabets, Attributes and Types sections of morphfile. Standard input is read
       for words to look up according to the existing database (morphfile.db and morphfile.tfs).

              mmorph -m morphfile -a addfile
       Addition mode: ignores the Lexicon section of morphfile, but addfile is consulted, and the results are
       added to the database. Standard input is read for words to look up according to the augmented database
       (morphfile.db and morphfile.tfs).

DIAGNOSTICS

       Error messages should be self explanatory.  Please refer to mmorph(5) for a  formal  description  of  the
       syntax.

FILES

       morphfile.db
              database file of forms generated for descriptions in file morphfile given as option -m.

       morphfile.tfs
              database file of typed feature structures associated to morphfile.db.

SEE ALSO

       mmorph(5), setlocale(3).

       G. Russell and D. Petitpierre, MMORPH - The Multext Morphology Program, Version 2.3, October 1995, MULTEXT
              deliverable report for task 2.3.1.

       Ritchie,  G.  D.,  G.J.  Russell,  A.W. Black and S.G. Pulman (1992), Computational Morphology: Practical
              Mechanisms for the English Lexicon, Cambridge Mass., MIT Press.

       Pulman, S.G. and M.R. Hepple, (1993) ``A feature-based formalism for two level phonology:  a  description
              and implementation'', Computer Speech and Language 7, pp.333-358.

       Shieber,  S.M.  (1986),  An  Introduction  to Unification-Based Approaches to Grammar, CSLI Lecture Notes
              Number 4, Stanford University.

AUTHOR

       Dominique Petitpierre, ISSCO, <petitp@divsun.unige.ch>

ACKNOWLEDGEMENTS

       The parser for the morphology description formalism was written using  yacc(1)  and  flex(1).   Flex  was
       written  by  Vern Paxson, <vern@ee.lbl.gov>, and is distributed in the framework of the GNU project under
       the condition of the GNU General Public License.

       The database module in the current version uses the db library package developed  at  the  University  of
       California, Berkeley by Margo Seltzer, Keith Bostic <bostic@cs.berkeley.edu> and Ozan Yigit.

       The crc procedures used for taking a signature of the typed feature structure declarations are taken from
       the fingerprint package by Daniel J. Bernstein and use code written by Gary S. Brown.

                                            Version 2.3, October 1995                                  MMORPH(1)