Provided by: biobambam2_2.0.185+ds-2_amd64

NAME

       bamsormadup  -  sort  name  collated SAM or BAM file by coordinate and mark duplicates or sort SAM or BAM
       file by query name

SYNOPSIS

       bamsormadup [options]

DESCRIPTION

       bamsormadup has two modes of operation depending on the value of the SO parameter. If SO=coordinate or if
       the SO key is not given, then bamsormadup reads a name collated BAM or SAM file from standard input, runs
       a fix mate process, sorts the contained alignments by coordinate, marks duplicate alignments  and  writes
       the  sorted  alignments  to  standard output in BAM format. An alignment file is name collated if all the
       alignments for one read name appear consecutively in the file. If SO=queryname then the program  reads  a
       BAM  or SAM file from standard input, sorts it by query name and writes the sorted file to standard output
       in BAM format.
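
       The name collation requirement can be illustrated with a short Python sketch. It is not part of
       bamsormadup; the function name is_name_collated is hypothetical, and it works on a plain sequence of
       read names rather than on a real BAM or SAM file.

```python
from typing import Iterable


def is_name_collated(read_names: Iterable[str]) -> bool:
    """Return True if every read name occupies one consecutive run."""
    finished = set()  # names whose run has already ended
    previous = None
    for name in read_names:
        if name != previous:
            if name in finished:
                # The name reappears after a different name: not collated.
                return False
            if previous is not None:
                finished.add(previous)
            previous = name
    return True
```

       For example, the name sequence r1, r1, r2, r2 is name collated, whereas r1, r2, r1 is not, because
       the alignments for r1 are interrupted by r2.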

       The following key=value pairs can be given:

       level=<-1|0|1|9|11>: set compression level of the output BAM file. Valid values are

       -1:    zlib/gzip default compression level

       0:     uncompressed

       1:     zlib/gzip level 1 (fast) compression

       9:     zlib/gzip level 9 (best) compression

       If libmaus has been compiled with support for igzip (see https://software.intel.com/en-us/articles/igzip-
       a-high-performance-deflate-compressor-with-optimizations-for-genomic-data) then an additional valid value
       is

       11:    igzip compression

       inputformat=<bam>:  set  the   input   file   format.    This   can   be   either   bam   or   sam   (see
       http://samtools.sourceforge.net/SAM1.pdf)

       threads=<[1]>: number of threads used

       M=<stderr>:  name of the metrics file for duplicate marking (metrics are written to standard error if not
       set)

       tmpfile=<bamsormadup_hostname_pid_starttime>: prefix for temporary files. By default the temporary  files
       are created in the current directory.  Set tmpfile=mem:tmp_ to store temporary files in RAM instead of on
       disk. Note that this may require very large amounts of RAM depending on the input.

       SO=<coordinate|queryname>: set the sort order. Valid values are

       coordinate
              sort alignments by coordinate. Input is assumed to be name collated.

       queryname
              sort alignments by query name. No assumption is made about the order of the input.

       reference=<>:  name of reference FastA file when writing CRAM. This file will be used for filling missing
       UR and M5 fields of SQ header lines. It may refer to a local file or a file stored  on  an  http  or  ftp
       server.  The  file is uncompressed on the fly if the file name ends in .gz. If the REF_CACHE environment
       variable is set to the name of an existing directory, then normalised cache files will be written to this
       directory for each reference sequence. The file names are constructed from the directory name and the MD5
       checksum of each reference sequence. A cache file is not written, however, if a matching file already
       exists in one of the read-only cache locations listed in the REF_PATH environment variable.
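
       As a rough illustration of how a cache file name could be derived from the directory name and the MD5
       checksum, consider the Python sketch below. The helper names are hypothetical. The normalisation
       (whitespace removed, upper case) follows the usual convention for the M5 header field, and the flat
       <directory>/<md5> layout is an assumption; the layout actually used by libmaus may differ.

```python
import hashlib
import os


def sequence_md5(sequence: str) -> str:
    """MD5 of a reference sequence, upper-cased with whitespace removed."""
    normalised = "".join(sequence.split()).upper()
    return hashlib.md5(normalised.encode("ascii")).hexdigest()


def cache_file_name(cache_dir: str, sequence: str) -> str:
    """Assumed layout: <cache_dir>/<md5>; the real layout may differ."""
    return os.path.join(cache_dir, sequence_md5(sequence))
```

       With this normalisation, line breaks and letter case in the FastA file do not change the checksum, so
       the same sequence always maps to the same cache file name.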

       optminpixeldif=<100>: pixel distance (in x and y, within the same tile) inside which reads are considered
       optical duplicates
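
       One possible reading of this threshold is sketched below in Python. The Cluster type and the function
       name are hypothetical, and treating the limit as inclusive is an assumption; the comparison actually
       used by bamsormadup may differ.

```python
from typing import NamedTuple


class Cluster(NamedTuple):
    """Position of a read on the flow cell (hypothetical helper type)."""
    tile: int
    x: int
    y: int


def within_optical_distance(a: Cluster, b: Cluster, optminpixeldif: int = 100) -> bool:
    """True if a and b share a tile and differ by at most the threshold in x and in y."""
    return (
        a.tile == b.tile
        and abs(a.x - b.x) <= optminpixeldif  # inclusive limit is an assumption
        and abs(a.y - b.y) <= optminpixeldif
    )
```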

       rcsupport=<0>:  if  1 then create rc aux field (unclipped coordinate) for mapped reads when sorting from
       query to coordinate order

       numerical=<>: store numerical index in the given file. By default no numerical index is stored.

       numericalindexmod=<1024>: use this block size for producing numerical index

       fragmergepar=<1>: number of threads used for merging fragment lists in duplicate marking. The run-time will
       generally benefit from an increased number here, but parallel merging requires a large number of simulta‐
       neously open files, which will cause problems on some systems.

       crammode=<>: CRAM encoding profile. See the documentation for the scramble program for possible options.
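
       The key=value convention can be shown with a small Python sketch that assembles a command line from
       the options documented above. The helper names and the file names (metrics.txt and the input and
       output paths) are hypothetical, and run_bamsormadup assumes that bamsormadup is installed and on the
       PATH; it is defined here but not called.

```python
import subprocess
from typing import Dict, List


def bamsormadup_argv(options: Dict[str, object]) -> List[str]:
    """Build an argument vector of key=value pairs for bamsormadup."""
    return ["bamsormadup"] + [f"{key}={value}" for key, value in options.items()]


def run_bamsormadup(in_path: str, out_path: str, options: Dict[str, object]) -> None:
    """Run bamsormadup, feeding in_path on stdin and writing stdout to out_path."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        subprocess.run(bamsormadup_argv(options), stdin=src, stdout=dst, check=True)


# Coordinate sort with duplicate marking, 4 threads, fast compression.
argv = bamsormadup_argv({"SO": "coordinate", "threads": 4, "level": 1, "M": "metrics.txt"})
print(" ".join(argv))
# prints: bamsormadup SO=coordinate threads=4 level=1 M=metrics.txt
```

       Because the program reads from standard input and writes to standard output, the input and output
       files are passed as redirected streams rather than as key=value pairs.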

AUTHOR

       Written by German Tischler.

REPORTING BUGS

       Report bugs to <germant@miltenyibiotec.de>

COPYRIGHT

       Copyright © 2009-2015 German Tischler, © 2011-2015 Genome Research Limited.  License GPLv3+: GNU GPL ver‐
       sion 3 <http://gnu.org/licenses/gpl.html>
       This is free software: you are free to change and redistribute it.  There is NO WARRANTY, to  the  extent
       permitted by law.

BIOBAMBAM                                          April 2015                                     BAMSORMADUP(1)