Provided by: unibetacode_2.3-5_amd64

NAME

       unibetaprep - Pre-process Beta Code files for beta2uni(1)

SYNOPSIS

       unibetaprep [-i input_file.pre] [-o output_file.beta]
       unibetaprep -v | --version

DESCRIPTION

       unibetaprep(1) reads a Beta Code document that may contain special character codes from the full Beta
       Code specification of the Thesaurus Linguae Graecae (TLG), and writes a Beta Code file in which those
       special characters are replaced by Unicode escape sequences. This departs from the traditional
       encoding of those special characters in favor of Unicode code point assignments.

       Beta  Code  is  an  ASCII-only encoding scheme most commonly used for digital representation of polytonic
       Greek.

       Beta Code has become a widely adopted standard for encoding classical Greek. It was developed by David
       Packard in the 1970s and adopted by the Thesaurus Linguae Graecae (TLG) Project at the University of
       California, Irvine shortly thereafter. In the 1980s the encoding was also adopted by the Perseus
       Project (originally at Harvard University, now at Tufts University), and it has since been adopted by
       many other collections of classical and Koine Greek. Today, the TLG corpus alone contains over 100
       million words from classical to Byzantine Greek.

       The TLG uses uppercase Latin letters; the Perseus Project uses  lowercase.   unibetaprep(1)  will  accept
       either.

       Many classicists who use Beta Code have been actively involved in the development of The Unicode
       Standard, and recommendations for mapping between Beta Code and Unicode continue to evolve.
       unibetaprep(1) provides GNU/Linux users with a means of converting Beta Code texts to Unicode.

       The most notable range of special characters in the TLG specification is the complete range of
       Byzantine Musical Symbols, which occupy the Unicode block U+1D000 through U+1D0FF, inclusive. The TLG
       special character encodings "#2000" through "#2245" correspond to characters in this block. If a
       character sequence in the TLG Beta Code specification corresponds to a Unicode glyph or glyph
       combination, unibetaprep should handle the translation correctly.
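
       The correspondence between the two ranges can be illustrated with a short Python sketch. It assumes,
       purely for illustration, that the TLG numbers map to code points by a fixed linear offset; the
       authoritative table is the one in unibetaprep.l. The function names are hypothetical and are not
       part of unibetacode. Under this assumption the 246 codes "#2000" through "#2245" end at U+1D0F5,
       inside the 256-slot block.

```python
# Hypothetical helpers for illustration only; not part of unibetacode.
BYZANTINE_FIRST_CODE = 2000      # TLG "#2000"
BYZANTINE_LAST_CODE = 2245       # TLG "#2245"
BYZANTINE_BLOCK_START = 0x1D000  # first code point of the Unicode block


def byzantine_code_point(tlg_number):
    """Map a TLG '#' number to a code point, ASSUMING a fixed linear offset."""
    if not BYZANTINE_FIRST_CODE <= tlg_number <= BYZANTINE_LAST_CODE:
        raise ValueError(f"#{tlg_number} is outside #2000..#2245")
    return BYZANTINE_BLOCK_START + (tlg_number - BYZANTINE_FIRST_CODE)


def format_code_point(code_point):
    """Render a code point in the usual U+XXXX notation."""
    return f"U+{code_point:04X}"
```

       For example, format_code_point(byzantine_code_point(2000)) gives "U+1D000". Whether every
       intermediate code follows the same offset should be checked against the program source.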

       Most  of these Beta Code sequences consist of a "#", "%", "<", ">", "[", or "]" character followed by one
       or more decimal digits.  Sequences corresponding to idiosyncratic Beta Code glyphs are not translated  to
       Unicode.   The  Beta  Code quotation mark sequences "1, "2, "4, and "5 are converted to represent Unicode
       code points U+201E, U+201C, U+201A, and U+201B, respectively.  For other special code sequences,  consult
       the TLG Beta Code Quick Reference Guide, or examine the flex program source in file unibetaprep.l.
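
       The shape of these sequences, and the four quotation mark mappings listed above, can be sketched in
       Python as follows. This is an illustration under stated assumptions, not the flex scanner in
       unibetaprep.l: the names are hypothetical, the pattern is a simplification, and the sketch emits
       literal Unicode characters rather than the escape sequences that unibetaprep writes for
       beta2uni(1).

```python
import re

# Quotation mark mappings taken from the paragraph above.
QUOTE_CODE_POINTS = {
    '"1': 0x201E,
    '"2': 0x201C,
    '"4': 0x201A,
    '"5': 0x201B,
}

# One marker character followed by one or more decimal digits.
# The double quote is included so the quotation mark sequences match too.
SPECIAL_SEQUENCE = re.compile(r'[#%<>\[\]"][0-9]+')


def find_special_sequences(text):
    """Return every marker-plus-digits sequence found in the text."""
    return SPECIAL_SEQUENCE.findall(text)


def replace_quotes(text):
    """Replace the four known quotation mark sequences; leave the rest as-is."""
    def substitute(match):
        code_point = QUOTE_CODE_POINTS.get(match.group(0))
        return chr(code_point) if code_point is not None else match.group(0)
    return SPECIAL_SEQUENCE.sub(substitute, text)
```

       Sequences that are not in the small table, such as "#2000", pass through replace_quotes()
       unchanged, which loosely mirrors how untranslated sequences are left alone.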

       The  output  of  unibetaprep  is  designed to provide the input to beta2uni(1), which then produces UTF-8
       Unicode output.

       Note: Thesaurus Linguae Graecae and TLG are registered trademarks of the University of California.

OPTIONS

       -i          Specify the input file. The default is STDIN.

       -o          Specify the output file. The default is STDOUT.

       -v          Print the program version and exit.

       --version   Print the program version and exit.

       Sample usage:

              unibetaprep -i my_input_file.pre -o my_output_file.beta

       The output file, my_output_file.beta, can then be used as input for beta2uni(1)  for  conversion  into  a
       UTF-8 Unicode document.
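
       When the conversion is driven from a script, the same invocation can be wrapped as in the following
       Python sketch. It uses only the -i and -o options documented above; the function names are
       hypothetical, and the sketch assumes that unibetaprep is installed and on the PATH.

```python
import subprocess


def unibetaprep_argv(input_path, output_path):
    """Build the argument list using the documented -i and -o options."""
    return ["unibetaprep", "-i", input_path, "-o", output_path]


def run_unibetaprep(input_path, output_path):
    """Run unibetaprep; raises CalledProcessError on a non-zero exit status."""
    subprocess.run(unibetaprep_argv(input_path, output_path), check=True)
```

       Calling run_unibetaprep("my_input_file.pre", "my_output_file.beta") is equivalent to the sample
       command line; the resulting file would then be passed to beta2uni(1) in a separate step.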

FILES

       ASCII text files using Beta Code to encode polytonic Greek.

SEE ALSO

       beta2uni(1), uni2beta(1), libunibetacode(3), unibetacode(5)

AUTHOR

       unibetaprep was written by Paul Hardy.

LICENSE

       unibetaprep is Copyright © 2018, 2019, 2020 Paul Hardy.

       This  program  is  free  software;  you  can  redistribute it and/or modify it under the terms of the GNU
       General Public License as published by the Free Software Foundation; either version 2 of the License,  or
       (at your option) any later version.

BUGS

       No known bugs exist.

                                                   2019 Jan 26                                    UNIBETAPREP(1)