Provided by: tabix_1.21+ds-1_amd64

NAME

       bgzip - Block compression/decompression utility

SYNOPSIS

       bgzip  [-cdfhikrt]  [-b  virtualOffset] [-I index_name] [-l compression_level] [-o outfile] [-s size] [-@
       threads] [file ...]

DESCRIPTION

       Bgzip compresses files in a similar manner to, and compatible with, gzip(1).  The file is compressed into
       a series of small (less than 64K) 'BGZF' blocks.  This allows indexes to be built against the  compressed
       file and used to retrieve portions of the data without having to decompress the entire file.
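
       The idea of index-driven random access can be illustrated with a minimal Python sketch.  It is not how
       bgzip itself is implemented: ordinary gzip members stand in for BGZF blocks, and the helper names
       compress_blocks and read_at, as well as the block size used, are assumptions made for the example.

```python
import bisect
import gzip


def compress_blocks(data, block_size):
    """Compress data as concatenated gzip members and record an index.

    Each index entry is (compressed_offset, uncompressed_offset) for the
    start of one block.
    """
    stream = b""
    index = []
    for start in range(0, len(data), block_size):
        index.append((len(stream), start))
        stream += gzip.compress(data[start:start + block_size])
    return stream, index


def read_at(stream, index, offset, size):
    """Return size bytes starting at uncompressed offset.

    Only the blocks that overlap the requested range are decompressed.
    """
    if not index:
        return b""
    # Find the last block whose uncompressed start is <= offset.
    i = bisect.bisect_right([u for _, u in index], offset) - 1
    out = b""
    while i < len(index) and len(out) < size:
        c_off, u_off = index[i]
        c_end = index[i + 1][0] if i + 1 < len(index) else len(stream)
        block = gzip.decompress(stream[c_off:c_end])
        # Skip leading bytes only in the first block that is read.
        out += block[max(offset - u_off, 0):]
        i += 1
    return out[:size]


data = bytes(range(256)) * 40
stream, index = compress_blocks(data, 1000)
chunk = read_at(stream, index, 2995, 10)
```

       Because the members are simply concatenated, the whole stream still decompresses as one gzip file,
       while read_at touches only the blocks it needs.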

       If no files are specified on the command line, bgzip compresses (or, with the -d option, decompresses)
       standard input to standard output.  If a file is specified, it is compressed (or decompressed with -d).
       If the -c option is used, the result is written to standard output.  Otherwise, when compressing, bgzip
       writes to a new file with a .gz suffix and removes the original.  When decompressing, the input file
       must have a .gz suffix, which is removed to make the output name; the input file is likewise removed
       once decompression completes.  When multiple files are given as input, the operation is performed on
       each of them.  The access and modification times of the input file are copied to the output file,
       although the system may later update the access time as it sees fit.

OPTIONS

       --binary  By default, bgzip attempts to ensure that BGZF blocks end on a newline when the input is a
                 text file.  The exception is a single line that is larger than a BGZF block (64 KB).  Ending
                 blocks on newlines can aid tools that use the index to perform random access on the
                 compressed stream, as the start of a block is likely to also be the start of a text record.

                 This option processes text files as if they were binary content, ignoring the location of
                 newlines.  It also restores the behaviour of bgzip version 1.15 and earlier for text files.

       -b, --offset INT
                 Decompress  to  standard  output  from  virtual  file  position  (0-based uncompressed offset).
                 Implies -c and -d.

       -c, --stdout
                 Write to standard output, keep original files unchanged.

       -d, --decompress
                 Decompress.

       -f, --force
                 Overwrite files without asking, or  decompress  files  that  don't  have  a  known  compression
                 filename extension (e.g., .gz) without asking.  Use --force twice to do both without asking.

       -g, --rebgzip
                 Try  to  use  an  existing  index to create a compressed file with matching block offsets.  The
                 index must be specified using the -I file.gzi option.  Note that this  assumes  that  the  same
                 compression library and level are in use as when making the original file.  Don't use it unless
                 you know what you're doing.

       -h, --help
                 Displays a help message.

       -i, --index
                 Create  a  BGZF index while compressing.  Unless the -I option is used, this will have the name
                 of the compressed file with .gzi appended to it.

       -I, --index-name FILE
                 Index file name.

       -k, --keep
                 Do not delete the input file.

       -l, --compress-level INT
                 Compression level to use when compressing.  From 0 to 9, or -1 for the default level set by the
                 compression library. [-1]

       -o, --output FILE
                 Write output to FILE.  Original files are kept unchanged; an existing FILE is overwritten.

       -r, --reindex
                 Rebuild the index on an existing compressed file.

       -s, --size INT
                 Decompress INT bytes (uncompressed size) to standard output.  Implies -c.

       -t, --test
                 Test the integrity of the compressed file.

       -@, --threads INT
                 Number of threads to use [1].
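
       The newline-aligned splitting described under --binary can be sketched in Python as follows.  This is
       an illustration of the idea only, not bgzip's actual code: the function name split_blocks, its
       parameters, and the default payload limit of 65280 bytes are assumptions made for the example.

```python
def split_blocks(data, max_block=65280, text=True):
    """Split data into chunks of at most max_block bytes.

    With text=True, a chunk is cut back to end just after the last newline
    it contains, unless it holds no newline at all (a line longer than one
    block).  With text=False, newlines are ignored, as with --binary.
    """
    blocks = []
    pos = 0
    while pos < len(data):
        end = min(pos + max_block, len(data))
        if text and end < len(data):
            newline = data.rfind(b"\n", pos, end)
            if newline != -1:
                end = newline + 1
        blocks.append(data[pos:end])
        pos = end
    return blocks


sample = b"aa\nbbb\ncccccccccc\ndd\n"
text_blocks = split_blocks(sample, max_block=5)
binary_blocks = split_blocks(sample, max_block=5, text=False)
```

       In text mode the first chunks end on line boundaries, and the ten-character line that does not fit in
       one chunk is split as in binary mode.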

BGZF FORMAT

       The BGZF  format  written  by  bgzip  is  described  in  the  SAM  format  specification  available  from
       http://samtools.github.io/hts-specs/SAMv1.pdf.

       It makes use of a gzip feature which allows compressed files to be concatenated.  The input data is
       divided into blocks which are no larger than 64 kilobytes both before and after compression (including
       compression headers).  Each block is compressed into its own gzip file, and these are concatenated.
       The gzip header of each block includes an extra sub-field with identifier 'BC' that records the total
       length of the compressed block, including all headers, minus one.
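
       As a rough illustration of this layout, the Python sketch below builds two BGZF-style blocks by hand
       and then walks the stream using only the 'BC' sub-field.  The field layout follows the SAM
       specification referenced above, which stores the block length minus one in that sub-field; the helper
       names make_bgzf_block and block_sizes are assumptions made for the example, and the sketch ignores
       details such as the end-of-file marker block.

```python
import gzip
import struct
import zlib


def make_bgzf_block(data):
    """Wrap data (under 64 KiB) in one gzip member carrying a 'BC' sub-field."""
    compressor = zlib.compressobj(6, zlib.DEFLATED, -15)  # raw deflate
    cdata = compressor.compress(data) + compressor.flush()
    # 12-byte fixed header + 6-byte extra field + payload + 8-byte trailer.
    total = 12 + 6 + len(cdata) + 8
    # ID1, ID2, CM=deflate, FLG=FEXTRA, MTIME, XFL, OS=unknown, XLEN
    header = struct.pack("<BBBBIBBH", 31, 139, 8, 4, 0, 0, 255, 6)
    # SI1='B', SI2='C', SLEN=2, then the block size minus one.
    extra = struct.pack("<BBHH", ord("B"), ord("C"), 2, total - 1)
    trailer = struct.pack("<II", zlib.crc32(data), len(data))
    return header + extra + cdata + trailer


def block_sizes(stream):
    """Return the length of each block, read from its 'BC' sub-field."""
    sizes = []
    pos = 0
    while pos < len(stream):
        (xlen,) = struct.unpack_from("<H", stream, pos + 10)
        extra = stream[pos + 12:pos + 12 + xlen]
        size = None
        off = 0
        while off + 4 <= len(extra):
            si1, si2, slen = struct.unpack_from("<BBH", extra, off)
            if (si1, si2) == (ord("B"), ord("C")) and slen == 2:
                size = struct.unpack_from("<H", extra, off + 4)[0] + 1
            off += 4 + slen
        if size is None:
            raise ValueError("no 'BC' sub-field found at offset %d" % pos)
        sizes.append(size)
        pos += size
    return sizes


stream = make_bgzf_block(b"hello\n") + make_bgzf_block(b"world\n")
sizes = block_sizes(stream)
plain = gzip.decompress(stream)
```

       The block boundaries are found without decompressing anything, and a standard gzip reader still
       recovers the original data from the concatenated blocks.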

GZI FORMAT

       The  index  format  is a binary file listing pairs of compressed and uncompressed offsets in a BGZF file.
       Each compressed offset points to the start of a BGZF block.  The uncompressed offset is the corresponding
       location in the uncompressed data stream.

       All values are stored as little-endian 64-bit unsigned integers.

       The file contents are:

           uint64_t number_entries

       followed by number_entries pairs of:

           uint64_t compressed_offset
           uint64_t uncompressed_offset
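
       A minimal Python sketch of reading this layout is shown below.  The function name parse_gzi and the
       sample offsets are made up for the example; real index contents depend on the file being indexed.

```python
import struct


def parse_gzi(raw):
    """Decode .gzi bytes into a list of (compressed, uncompressed) offsets."""
    if len(raw) < 8:
        raise ValueError("index is too short to hold number_entries")
    (number_entries,) = struct.unpack_from("<Q", raw, 0)
    if len(raw) < 8 + 16 * number_entries:
        raise ValueError("index is truncated")
    # Each entry is two little-endian 64-bit unsigned integers.
    return [
        struct.unpack_from("<QQ", raw, 8 + 16 * i)
        for i in range(number_entries)
    ]


# Hand-built sample index with two entries (values are illustrative only).
sample = (
    struct.pack("<Q", 2)
    + struct.pack("<QQ", 18000, 65280)
    + struct.pack("<QQ", 36100, 130560)
)
entries = parse_gzi(sample)
```

       For a real index, the bytes would be read from the .gzi file in binary mode and passed to the same
       function.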

EXAMPLES

           # Compress stdin to stdout
           bgzip < /usr/share/dict/words > /tmp/words.gz

           # Make a .gzi index
           bgzip -r /tmp/words.gz

           # Extract part of the data using the index
           bgzip -b 367635 -s 4 /tmp/words.gz

           # Uncompress the whole file, removing the compressed copy
           bgzip -d /tmp/words.gz

AUTHOR

       The BGZF library was originally implemented by Bob Handsaker and modified by  Heng  Li  for  remote  file
       access and in-memory caching.

SEE ALSO

       gzip(1), tabix(1)

htslib-1.21                                     12 September 2024                                       bgzip(1)