Provided by: freecontact_1.0.21-15_amd64

NAME

       freecontact - fast protein contact predictor

SYNOPSIS

       freecontact [OPTION]

       freecontact --parprof [evfold|psicov|psicov-sd] < alignment.aln > contacts.out

       /usr/share/freecontact/a2m2aln --query '^RASH_HUMAN/(\d+)' < alignment.fa | freecontact --parprof evfold
       > contacts.out

       freecontact --ali=ALIFILE --apply-gapth=BOOL --clustpc=NUM --density=NUM --cov20=BOOL
       --estimate-ivcov=BOOL --gapth=NUM --icme-timeout=NUM --input-format=[flat|xml] --mincontsep=NUM
       --output-format=[evfold|pfrmat_rr|bioxsd] --pseudocnt=NUM --pscount-weight=NUM --rho=NUM --threads=NUM
       --veczw=BOOL

       freecontact --help --debug --quiet --version

DESCRIPTION

       FreeContact is a protein residue contact predictor optimized for speed.  FreeContact can function as an
       accelerated drop-in replacement for the published contact predictors EVfold-mfDCA of D. S. Marks et al.
       (2011) [1] and PSICOV of D. Jones et al. (2011) [2].

       FreeContact is accelerated by a combination of vector instructions, multiple threads, and faster
       implementations of key parts.  Depending on the alignment, 8-fold or higher speedups are possible.

       A sufficiently large alignment is required for meaningful results.  As a minimum, an alignment with an
       effective (after-weighting) sequence count greater than the length of the query sequence should be used.
       Alignments with tens of thousands of (effective) sequences are considered good input.
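
       The size check above can be sketched in Python.  This is only an illustration: it assumes a common
       reweighting scheme in which each sequence is weighted by 1/n, where n is the number of rows at or above
       an identity threshold.  The function names and the 0.7 threshold are hypothetical, and the weighting
       FreeContact applies internally (see --clustpc) may differ.

```python
def effective_sequence_count(rows, identity_threshold=0.7):
    """Sum of per-row weights 1/n_i.

    n_i counts the rows (row i included) whose fractional identity to
    row i is >= identity_threshold. Gap columns count as matches when
    both rows have a gap; this is a simplification of the sketch.
    The threshold value is an assumption, not a FreeContact default.
    """
    if not rows:
        return 0.0
    length = len(rows[0])
    if any(len(row) != length for row in rows):
        raise ValueError("all alignment rows must be the same length")
    total = 0.0
    for a in rows:
        neighbours = sum(
            1
            for b in rows
            if sum(x == y for x, y in zip(a, b)) >= identity_threshold * length
        )
        total += 1.0 / neighbours
    return total


def is_large_enough(rows, identity_threshold=0.7):
    """Apply the stated minimum: effective count > query length."""
    return effective_sequence_count(rows, identity_threshold) > len(rows[0])
```

       The loop is quadratic in the number of rows, so it is suited to a quick sanity check on a small
       alignment rather than to production use.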

       jackhmmer(1) or hhblits(1) can be used to generate the alignments, for example.

       [1] PLoS One. 2011;6(12):e28766. doi: 10.1371/journal.pone.0028766. Epub 2011 Dec 7.  Protein 3D
       structure computed from evolutionary sequence variation.  Marks DS, Colwell LJ, Sheridan R, Hopf TA,
       Pagnani A, Zecchina R, Sander C.

       [2] Bioinformatics. 2012 Jan 15;28(2):184-90. Epub 2011 Nov 17.  PSICOV: precise structural contact
       prediction using sparse inverse covariance estimation on large multiple sequence alignments.  Jones DT,
       Buchan DW, Cozzetto D, Pontil M.

   Input
       The following formats are supported:

       flat
           The following simple input file format is used:

            # querystart=5
            # query=QUERYwithinsertionSEQUENCEWITHNOGAPSORINSERTIONS
            QUERYSEQUENCEWITHNOGAPSORINSERTIONS
            -ALIGNED---SEQUENCE--WITH-GAPS-----
            ANOTHER-ALIGNED------------SEQUENCE

           The '#' header lines are optional. Header lines are used to calculate contact residue numbers and to
           look up respective query residues for certain output formats.

           If no query is defined, the first sequence in the alignment is used as the query sequence.  The query
           sequence must not contain gaps in the alignment.

           All alignment rows must be the same length, and may contain only [ABCDEFGHIJKLMNOPQRSTUVWXYZ-].  [B]
           is mapped to [D], [Z] is mapped to [E], and [JOUX] are mapped to [X].  Throughout the program, [X]
           matches only itself.

           A2M input alignments can be converted to the above format using /usr/share/freecontact/a2m2aln.
           a2m2aln can be used to pipe the alignment directly into freecontact.

       xml XML document with one <fc:alignment xmlns:fc="http://rostlab.org/freecontact/xsd"/> element, defined
           in the FreeContact schema [4] derived from BioXSD [5].

           Example: /usr/share/doc/freecontact/examples/PF00071_v25_1000.xml.
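
       The rules of the flat format can be illustrated with a small Python reader.  This is a sketch, not
       FreeContact's own parser: the function name is hypothetical, blank lines are skipped as a leniency of
       the sketch, and the first alignment row is assumed to be the query row.

```python
import re

# Rows may contain only upper-case letters and the gap character.
ALLOWED = re.compile(r"^[A-Z-]+$")
# Letter mapping stated in the format description: B->D, Z->E, J/O/U->X.
MAPPING = str.maketrans("BZJOU", "DEXXX")


def read_flat_alignment(text):
    """Return (headers, rows) parsed from flat-format text.

    headers holds the optional '# key=value' lines (e.g. 'querystart');
    rows holds the alignment rows after the letter mapping is applied.
    """
    headers, rows = {}, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # leniency of this sketch, not a documented rule
        if line.startswith("#"):
            key, sep, value = line[1:].strip().partition("=")
            if sep:
                headers[key.strip()] = value.strip()
            continue
        if not ALLOWED.match(line):
            raise ValueError(f"illegal character in row: {line!r}")
        rows.append(line.translate(MAPPING))
    if not rows:
        raise ValueError("no alignment rows found")
    if any(len(row) != len(rows[0]) for row in rows):
        raise ValueError("alignment rows differ in length")
    if "-" in rows[0]:
        raise ValueError("query (first row) must not contain gaps")
    return headers, rows
```

       Running such a check before calling freecontact can surface malformed rows early, for example rows of
       unequal length left over from a manual edit.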

   Output
       The original EVfold-mfDCA or PSICOV output format is used by default when the respective parameter
       profile is selected.

       evfold (EVfold-mfDCA)
            5 K 6 L 0.332129 3.59798
            | | | | |        + corrected norm (CN) contact score
            | | | | + mutual information (MI) score
            | | | + contact amino acid residue code
            | | + contact residue number
            | + contact amino acid residue code
            + contact residue number

           Contacts are sorted on residue number.

       pfrmat_rr (PSICOV)
           CASP residue-residue separation prediction (PFRMAT RR) format [3]:

            55 67 0 8 10.840280
            |  |  | | + contact score
            |  |  +-+ range [Å] of Cb-Cb distance predicted for the residue pair
            |  |      (C-alpha for glycines)
            |  |      These two fields are invariant in the output.
            |  + contact residue number
            + contact residue number

           Contacts are sorted on score, descending.

           [3] <http://predictioncenter.org/casp10/index.cgi?page=format>

       bioxsd
           XML document with one <fc:contactMap xmlns:fc="http://rostlab.org/freecontact/xsd"/> element, defined
           in the FreeContact schema [4] derived from BioXSD [5].

           Example: /usr/share/doc/freecontact/examples/PF00071_v25_1000.evfold.50.xml.

           Note: as BioXSD is under active development in collaboration with FreeContact, the FreeContact schema
           may actually be derived from a version not yet available at [5].

           [4] <file:///usr/share/freecontact/freecontact.xsd>

           [5] <http://bioxsd.org>

       The output may not list all possible contacts.
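
       The two line-oriented formats above (evfold and pfrmat_rr) can be read back with a few lines of
       Python.  The sketch below assumes whitespace-separated fields in the order shown in the diagrams; the
       class and function names are hypothetical and are not part of FreeContact.

```python
from typing import NamedTuple


class EvfoldContact(NamedTuple):
    i: int        # contact residue number
    aa_i: str     # amino acid residue code of residue i
    j: int        # contact residue number
    aa_j: str     # amino acid residue code of residue j
    mi: float     # mutual information (MI) score
    cn: float     # corrected norm (CN) contact score


class RrContact(NamedTuple):
    i: int        # contact residue number
    j: int        # contact residue number
    d_min: float  # lower bound of the predicted distance range
    d_max: float  # upper bound of the predicted distance range
    score: float  # contact score


def parse_evfold_line(line):
    """Parse one line such as '5 K 6 L 0.332129 3.59798'."""
    i, aa_i, j, aa_j, mi, cn = line.split()
    return EvfoldContact(int(i), aa_i, int(j), aa_j, float(mi), float(cn))


def parse_rr_line(line):
    """Parse one line such as '55 67 0 8 10.840280'."""
    i, j, d_min, d_max, score = line.split()
    return RrContact(int(i), int(j), float(d_min), float(d_max), float(score))


def top_contacts(contacts, key, n):
    """Return the n highest-scoring contacts according to key."""
    return sorted(contacts, key=key, reverse=True)[:n]
```

       Because evfold output is sorted on residue number rather than score, a call such as
       top_contacts(contacts, key=lambda c: c.cn, n=50) is one way to rank predictions by CN score.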

REFERENCES

           Submitted.  FreeContact: fast and free direct residue-residue contact prediction.  Kaján L, Sustik
           MA, Marks DS, Hopf TA, Kalaš M, Rost B.

OPTIONS

       -a [ --threads ] arg
           Threads to use [0-). 0 means as many threads as there are cores.

       --apply-gapth arg
           When true, exclude residue columns and rows with a weighted gap frequency > --gapth from the
           covariance matrix [Boolean].

       -c [ --clustpc ] arg
           BLOSUM clustering percentage [0-100].

       --cov20 arg
           If true, leave one amino acid off the covariance matrix, making it non-overdetermined [Boolean].

       -d [ --density ] arg
           Target precision matrix density [0-1]. Set 0 to not control density.

       --debug
           Turn on debugging.

       --estimate-ivcov arg
           Use inverse covariance matrix estimation instead of matrix inversion [Boolean].

       -f [ --ali ] arg (=-)
           Alignment file [path]. If '-', standard input. Default: '-'.

       -g [ --gapth ] arg
           Weighted gap frequency threshold (0-1].

       -h [ --help ]
           Produce this help message.

       -i [ --input-format ] arg (=flat)
           Input format [flat|xml].

       --icme-timeout arg (=1800)
           Inverse covariance matrix estimation timeout in seconds [0-). Applied to each inversion call
           independently. If a timeout occurs, the program exits with status 2.

       --mincontsep arg
           Minimum sequence separation of contacting residue pairs, given in amino acids as (j-i>=arg) [1-).
           Use 1 to include adjacent residues.

       -o [ --output-format ] arg
           Output format [evfold|pfrmat_rr|bioxsd].

       --parprof arg (=default)
           Parameter profile (optional) [default|evfold|psicov|psicov-sd].  The default profile is evfold.

           Command line arguments can be used to override profile values.

           evfold
               Triggers EVfold-mfDCA [1] compatibility mode.

           psicov
               Triggers PSICOV [2] compatibility mode.

           psicov-sd
               Triggers PSICOV [2] sensible default mode: fixed default rho, no density control.

       -w [ --pscount-weight ] arg
           Pseudocount weight [0-1].

       -p [ --pseudocnt ] arg
           Pseudocount [0-).

       --pep
           Print effective parameters on standard error.  Use this option to see what parameters freecontact(1)
           is run with in detail.  This is especially useful when the --parprof option is used in combination
           with other options.

       --rho arg
           Initial value of Glasso regularization parameter [0-).  If negative, choose value automatically.

       --quiet arg (=0)
           Print nothing but error messages on standard error.  Does not affect --debug.

       --veczw arg
           Use vectorized sequence weighting when available [Boolean].

       --version
           Print version.

EXIT STATUS

       0   No error - success.

       1   Unspecified error.

       2   A timeout (see --icme-timeout) occurred.
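
       A calling script can branch on these statuses.  The following Python sketch wraps an arbitrary command
       with the standard subprocess module and maps the exit status to the meanings listed above; the
       function name and the wording of the descriptions are hypothetical.

```python
import subprocess
import sys

# Meanings taken from the EXIT STATUS list above.
EXIT_MEANINGS = {
    0: "success",
    1: "unspecified error",
    2: "timeout (see --icme-timeout)",
}


def run_and_classify(command, alignment=b""):
    """Run command with alignment on stdin.

    Returns (exit status, description, captured standard output).
    Statuses not listed above are reported as 'undocumented status'.
    """
    result = subprocess.run(command, input=alignment, capture_output=True)
    meaning = EXIT_MEANINGS.get(result.returncode, "undocumented status")
    if result.returncode != 0:
        # Relay the child's diagnostics so the failure can be investigated.
        sys.stderr.write(result.stderr.decode(errors="replace"))
    return result.returncode, meaning, result.stdout
```

       With freecontact installed, a call might look like run_and_classify(["freecontact", "--parprof",
       "psicov"], alignment=data), where data holds the bytes of a flat alignment.  A status of 2 could then
       prompt a retry with a larger --icme-timeout.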

EXAMPLES

        /usr/share/freecontact/a2m2aln --query '^RASH_HUMAN/(\d+)' < '/usr/share/doc/freecontact/examples/PF00071_v25_1000.fa' | \
         freecontact --parprof evfold > PF00071_v25_1000.evfold

        freecontact --parprof evfold -i xml -o bioxsd < '/usr/share/doc/freecontact/examples/PF00071_v25_1000.xml' > PF00071_v25_1000.evfold.xml

        freecontact --parprof psicov < /usr/share/doc/freecontact/examples/demo_1000.aln > demo_1000.psicov

NOTES

       For optimal performance, use the Automatically Tuned Linear Algebra Software (ATLAS) library compiled on
       the machine where freecontact is run.

AUTHOR

       László Kaján <lkajan@rostlab.org>

SEE ALSO

       jackhmmer(1), hhblits(1), blastpgp(1)

1.0.21                                             2025-05-06                                     FREECONTACT(1)