Provided by: percona-toolkit_3.2.1-1_all

NAME

       pt-stalk - Collect forensic data about MySQL when problems occur.

SYNOPSIS

       Usage: pt-stalk [OPTIONS]

       pt-stalk waits for a trigger condition to occur, then collects data to help diagnose problems.  The tool
       is designed to run as a daemon with root privileges, so that you can diagnose intermittent problems that
       you cannot observe directly.  You can also use it to execute a custom command, or to collect data on
       demand without waiting for the trigger to occur.

RISKS

       Percona Toolkit is mature, proven in the real world, and well tested, but all database tools can pose a
       risk to the system and the database server.  Before using this tool, please:

       •   Read the tool's documentation

       •   Review the tool's known "BUGS"

       •   Test the tool on a non-production server

       •   Back up your production server and verify the backups

DESCRIPTION

       Sometimes  a  problem  happens  infrequently and for a short time, giving you no chance to see the system
       when it happens. How do you solve intermittent MySQL problems when you can't observe them? That's why
       pt-stalk exists. In addition to using it when there's a known problem on your servers, it is a good idea to
       run  pt-stalk  all  the  time,  even  when  you  think nothing is wrong.  You will appreciate the data it
       collects when a problem occurs, because problems such as MySQL lockups or spikes  in  activity  typically
       leave no evidence to use in root cause analysis.

       pt-stalk  does  two  things: it watches a MySQL server and waits for a trigger condition to occur, and it
       collects diagnostic data when that trigger  occurs.   To  avoid  false  positives  caused  by  short-lived
       problems, the trigger condition must be true at least "--cycles" times before a "--collect" is triggered.

       To  use  pt-stalk  effectively, you need to define a good trigger.  A good trigger is sensitive enough to
       fire reliably when a problem occurs, so that you don't miss a chance to solve  problems.   On  the  other
       hand,  a  good trigger isn't prone to false positives, so you don't gather information when the server is
       functioning normally.

       The most reliable triggers for MySQL tend to be the number of connections to the server, and  the  number
       of   queries   running   concurrently.  These  are  available  in  the  SHOW  GLOBAL  STATUS  command  as
       Threads_connected and Threads_running.  Sometimes  Threads_connected  is  not  a  reliable  indicator  of
       trouble,  but  Threads_running  usually  is.   Your  job, as the tool's user, is to define an appropriate
       trigger condition for the tool.  Choose carefully, because the quality of your results will depend on the
       trigger you choose.

       You define the trigger with the "--function", "--variable", "--threshold", and "--cycles"  options.   The
       default  values  for  these  options define a reasonable trigger, but you should adjust or change them to
       suit your particular system and needs.

       By default, pt-stalk watches MySQL forever until the trigger occurs, then  it  collects  diagnostic
       data  for a while, and sleeps afterwards to avoid repeatedly collecting data if the trigger remains true.
       The general order of operations is:

          while true; do
             if --variable from --function > --threshold; then
                cycles_true++
                if cycles_true >= --cycles; then
                   --notify-by-email
                   if --collect; then
                      if --disk-bytes-free and --disk-pct-free ok; then
                         (--collect for --run-time seconds) &
                      fi
                      rm files in --dest older than --retention-time
                   fi
                   iter++
                   cycles_true=0
                fi
                if iter < --iterations; then
                   sleep --sleep seconds
                else
                   break
                fi
             else
                if iter < --iterations; then
                   sleep --interval seconds
                else
                   break
                fi
             fi
          done
          rm old --dest files older than --retention-time
          if --collect processes are still running; then
             wait up to --run-time * 3 seconds
             kill any remaining --collect processes
          fi
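
       The cycle-counting part of this loop can be replayed offline against a list of sampled values.  The
       following is a minimal Bash sketch that follows the pseudocode above literally; "count_collects" is a
       hypothetical helper, not part of pt-stalk, and it ignores sleeping, disk checks, and the collection
       itself.  Whether the real tool resets the counter after a non-matching check is not stated here, so the
       sketch does not reset it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replay sampled values through the trigger logic of
# the pseudocode and print how many collections would be started.
# Usage: count_collects THRESHOLD CYCLES VALUE...
count_collects() {
   local threshold=$1 cycles=$2
   shift 2
   local cycles_true=0 collects=0 value
   for value in "$@"; do
      if [ "$value" -gt "$threshold" ]; then
         cycles_true=$((cycles_true + 1))
         if [ "$cycles_true" -ge "$cycles" ]; then
            # Trigger condition met --cycles times: one collection starts.
            collects=$((collects + 1))
            cycles_true=0
         fi
      fi
   done
   echo "$collects"
}
```

       For example, "count_collects 25 5 30 30 30 30 30" prints 1, while the same call with only four samples
       above the threshold prints 0.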

       The diagnostic data is written to files whose names begin  with  a  timestamp,  so  you  can  distinguish
       samples  from  each other in case the tool collects data multiple times.  The pt-sift tool is designed to
       help you browse and analyze the resulting data samples.

       Although this sounds simple enough, in practice there are a number of subtleties, such as detecting  when
       the  disk  is  beginning  to  fill up so that the tool doesn't cause the server to run out of disk space.
       This tool handles these types of potential problems, so it's a good idea to  use  this  tool  instead  of
       writing  something  from  scratch  and possibly experiencing some of the hazards this tool is designed to
       avoid.

CONFIGURING

       You can use standard Percona Toolkit configuration files to set command line options.

       You will probably want to run the tool as a daemon and customize at least the  "--threshold".   Here's  a
       sample configuration file for triggering when there are more than 20 queries running at once:

         daemonize
         threshold=20

       If you don't run the tool as root, then you will need to specify several options, such as "--pid",
       "--log", and "--dest"; otherwise the tool will probably fail to start.
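
       As a rough sketch of a non-root setup, the helper below prints those three options pointing at a
       directory the current user can write to.  The function name, the directory layout, and the file names
       are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the options that point pt-stalk at files
# under a directory the current user can write to.
# Note: the output is meant to be word-split, so BASE must not contain spaces.
user_stalk_args() {
   local base=${1:-$HOME/pt-stalk}
   echo "--pid $base/pt-stalk.pid --log $base/pt-stalk.log --dest $base/samples"
}

# Example use (not run here):
#   mkdir -p "$HOME/pt-stalk/samples"
#   pt-stalk --daemonize $(user_stalk_args "$HOME/pt-stalk")
```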

OPTIONS

       --ask-pass
           Prompt for a password when connecting to MySQL.

       --collect
           default: yes; negatable: yes

           Collect diagnostic data when the trigger occurs.  Specify "--no-collect" to make the tool  watch  the
           system but not collect data.

           See also "--stalk".

       --collect-gdb
           Collect  GDB  stacktraces.  This is achieved by attaching to MySQL and printing stack traces from all
           threads. This will freeze the server for some period of time, ranging from a second  or  so  to  much
           longer on very busy systems with a lot of memory and many threads in the server.  For this reason, it
           is disabled by default. However, if you are trying to diagnose a server stall or lockup, freezing the
           server causes no additional harm, and the stack traces can be vital for diagnosis.

           In  addition  to  freezing  the  server, there is also some risk of the server crashing or performing
           badly after GDB detaches from it.

       --collect-oprofile
           Collect oprofile data.  This is achieved by starting an oprofile session,  letting  it  run  for  the
           collection  time,  and  then  stopping  and saving the resulting profile data in the system's default
           location.  Please read your system's oprofile documentation to learn more about this.

       --collect-strace
           Collect strace data. This is achieved by attaching strace to the server, which will make it run  very
           slowly  until strace detaches.  The same cautions apply as those listed in --collect-gdb.  You should
           not enable this option together with --collect-gdb, because GDB and strace can't attach to the server
           process simultaneously.

       --collect-tcpdump
           Collect tcpdump data. This option causes tcpdump to capture all traffic on  all  interfaces  for  the
           port on which MySQL is listening.  You can later use pt-query-digest to decode the MySQL protocol and
           extract a log of query traffic from it.

       --config
           type: string

           Read  this  comma-separated list of config files.  If specified, this must be the first option on the
           command line.

       --cycles
           type: int; default: 5

           How many times "--variable" must be greater than "--threshold" before triggering  "--collect".   This
           helps  prevent  false positives, and makes the trigger condition less likely to fire when the problem
           recovers quickly.

       --daemonize
           Daemonize the tool.  This causes the tool to fork into the background and log its output as specified
           in --log.

       --defaults-file
           short form: -F; type: string

           Only read mysql options from the given file.  You must give an absolute pathname.

       --dest
           type: string; default: /var/lib/pt-stalk

           Where to save diagnostic data from "--collect".  Each time the tool collects data, it writes to a new
           set of files, which are named with the current system timestamp.

       --disk-bytes-free
           type: size; default: 100M

           Do not "--collect" if the disk has less than this much free  space.   This  prevents  the  tool  from
           filling up the disk with diagnostic data.

           If  the  "--dest"  directory contains a previously captured sample of data, the tool will measure its
           size and use that as an estimate of how much data is likely to be gathered this time, too.   It  will
           then  be even more pessimistic, and will refuse to collect data unless the disk has enough free space
           to hold the sample and still have the desired amount of free space.  For example, if you'd like 100MB
           of free space and the previous diagnostic sample consumed 100MB, the  tool  won't  collect  any  data
           unless the disk has 200MB free.

           Valid size value suffixes are k, M, G, and T.
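
           The free-space rule described above amounts to a small piece of arithmetic.  The sketch below is
           not the tool's own code; "size_to_bytes" and "enough_disk_space" are hypothetical helpers, and the
           assumption that each suffix is a power of 1024 is not confirmed by this page.

```shell
#!/usr/bin/env bash
# Hypothetical helpers illustrating the free-space rule; not pt-stalk code.

# Convert a size such as 100M to bytes.
# Assumption: each suffix is a power of 1024.
size_to_bytes() {
   local size=$1
   local num=${size%[kMGT]}
   case $size in
      *k) echo $((num * 1024)) ;;
      *M) echo $((num * 1024 * 1024)) ;;
      *G) echo $((num * 1024 * 1024 * 1024)) ;;
      *T) echo $((num * 1024 * 1024 * 1024 * 1024)) ;;
      *)  echo "$num" ;;
   esac
}

# Succeed only if, after writing a sample as large as the previous one,
# at least the desired margin would still be free.  All values in bytes.
enough_disk_space() {
   local free=$1 margin=$2 previous_sample=${3:-0}
   [ $((free - previous_sample)) -ge "$margin" ]
}
```

           With a 100M margin and a previous sample of 100M, "enough_disk_space" succeeds only when at least
           200M is free, which matches the example in the text.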

       --disk-pct-free
           type: int; default: 5

           Do  not  "--collect"  if the disk has less than this percent free space.  This prevents the tool from
           filling up the disk with diagnostic data.

           This option works similarly to "--disk-bytes-free"  but  specifies  a  percentage  margin  of  safety
           instead  of  a  bytes  margin of safety.  The tool honors both options, and will not collect any data
           unless both margins are satisfied.

       --function
           type: string; default: status

           What to watch for the trigger.  The default value watches "SHOW GLOBAL  STATUS",  but  you  can  also
           watch  "SHOW  PROCESSLIST"  and specify a file with your own custom code.  This function supplies the
           value of "--variable", which is then compared against "--threshold" to see if the trigger
           condition is met.  Additional options may be required as well; see below. Possible values are:

           •   status

               Watch  "SHOW GLOBAL STATUS" for the trigger.  The value of "--variable" then defines which status
               counter is the trigger.

           •   processlist

               Watch "SHOW FULL PROCESSLIST" for the trigger.  The trigger value is the count of processes whose
               "--variable" column matches the "--match" option.  For example, to trigger "--collect" when  more
               than 10 processes are in the "statistics" state, specify:

                  --function processlist \
                  --variable State       \
                  --match statistics     \
                  --threshold 10

           In addition, you can specify a file that contains your custom trigger function, written in Unix shell
           script.  This can be a wrapper that executes anything you wish.  If the argument to "--function" is a
           file,  then  it  takes  precedence  over  built-in  functions,  so  if there is a file in the working
           directory named "status" or "processlist" then the tool will use that file even though those are
           valid built-in values.

           The  file works by providing a function called "trg_plugin", and the tool simply sources the file and
           executes the function.  For example, the file might contain:

              trg_plugin() {
                 mysql $EXT_ARGV -e "SHOW ENGINE INNODB STATUS" \
                   | grep -c "has waited at"
              }

           This snippet will count the number  of  mutex  waits  inside  InnoDB.   It  illustrates  the  general
           principle:  the function must output a number, which is then compared to "--threshold" as usual.  The
           $EXT_ARGV variable contains the MySQL options mentioned in the "SYNOPSIS" above.

           The file should not alter the tool's existing global  variables.   Prefix  any  file-specific  global
           variables with "PLUGIN_" or make them local.
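
           As a second, hedged illustration of the same principle, the sketch below triggers on replication
           lag.  It assumes a server that still supports "SHOW SLAVE STATUS" and its "Seconds_Behind_Master"
           field (newer MySQL releases rename both); the choice to report 0 when the value is missing or NULL
           is also an assumption, and it means a stopped replica would not fire this trigger.

```shell
#!/usr/bin/env bash
# Hypothetical trigger file, e.g. used as: --function /path/to/repl_lag.sh
# Prints the replication lag in seconds so it can be compared to --threshold.
trg_plugin() {
   local lag
   lag=$(mysql $EXT_ARGV -e "SHOW SLAVE STATUS\G" \
      | awk '/Seconds_Behind_Master:/ { print $2 }')
   case $lag in
      # Empty (not a replica) or non-numeric (NULL): report 0.
      ''|*[!0-9]*) echo 0 ;;
      *)           echo "$lag" ;;
   esac
}
```

           The variable "lag" is declared local, so the file does not touch the tool's global variables.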

       --help
           Print help and exit.

       --host
           short form: -h; type: string

           Host to connect to.

       --interval
           type: int; default: 1

           How often to check if the trigger is true, in seconds.

       --iterations
           type: int

           How  many  times to "--collect" diagnostic data.  By default, the tool runs forever and collects data
           every time the trigger occurs.  Specify "--iterations" to collect data a  limited  number  of  times.
           This option is also useful with "--no-stalk" to collect data once and exit, for example.

       --log
           type: string; default: /var/log/pt-stalk.log

           Print all output to this file when daemonized.

       --match
           type: string

           The pattern to use when watching SHOW PROCESSLIST.  See "--function" for details.

       --notify-by-email
           type: string

           Send an email to these addresses for every "--collect".

       --password
           short form: -p; type: string

           Password  to use when connecting.  If password contains commas they must be escaped with a backslash:
           "exam\,ple"

       --pid
           type: string; default: /var/run/pt-stalk.pid

           Create the given PID file.  The tool won't start if the PID  file  already  exists  and  the  PID  it
           contains  is different than the current PID.  However, if the PID file exists and the PID it contains
           is no longer running, the tool will overwrite the PID file with the current PID.   The  PID  file  is
           removed automatically when the tool exits.
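
           The PID file rules above can be sketched in a few lines of Bash.  This is only an illustration
           under stated assumptions: "claim_pid_file" is a hypothetical helper, and it uses "kill -0" to test
           whether a process is running, which also fails for processes owned by another user.

```shell
#!/usr/bin/env bash
# Hypothetical helper mirroring the PID file rules; not pt-stalk code.
# Returns 1 if another running process owns the PID file, otherwise
# writes the current shell's PID into the file.
claim_pid_file() {
   local file=$1 old_pid
   if [ -f "$file" ]; then
      old_pid=$(cat "$file")
      if [ -n "$old_pid" ] && [ "$old_pid" != "$$" ] \
         && kill -0 "$old_pid" 2>/dev/null; then
         echo "PID file $file exists and PID $old_pid is running" >&2
         return 1
      fi
   fi
   # No file, a stale PID, or our own PID: (over)write the file.
   echo "$$" > "$file"
}
```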

       --plugin
           type: string

           Load a plugin to hook into the tool and extend its functionality.  The specified file does not need to
           be executable, nor does its first line need to be a shebang line.  It only needs to define one or more
           of these Bash functions:

           before_stalk
               Called before stalking.

           before_collect
               Called when the trigger occurs, before running a "--collect" subprocess in the background.

           after_collect
               Called after running a collector process.  The PID of the collector  process  is  passed  as  the
               first argument.  This hook is called before "after_collect_sleep".

           after_collect_sleep
               Called after sleeping "--sleep" seconds for the collector process to finish.  This hook is called
               after "after_collect".

           after_interval_sleep
               Called after sleeping "--interval" seconds after each trigger check.

           after_stalk
               Called  after  stalking.   Since  pt-stalk stalks forever by default, this hook is only called if
               "--iterations" is specified.

           For example, a very simple plugin that touches a file when "--collect" is triggered:

              before_collect() {
                 touch /tmp/foo
              }

           Since the plugin is completely sourced (imported) into the tool's namespace, be careful not to define
           other functions or global variables that already exist in the tool.  You should  prefix  all  plugin-
           specific functions and global variables with "plugin_" or "PLUGIN_".

           Plugins  have  access  to all command line options but they should not modify them.  Each option is a
           global variable like $OPT_DEST which corresponds to "--dest".  Therefore,  the  global  variable  for
           each  command  line  option  is  "OPT_"  plus  the  option  name in all caps with hyphens replaced by
           underscores.
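
           The naming rule can be expressed as a one-line transformation.  "opt_var_name" below is a
           hypothetical helper written only to illustrate the rule; it is not provided by the tool.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an option name to the global variable name
# described above ("OPT_" + upper case, hyphens replaced by underscores).
opt_var_name() {
   local name=${1#--}   # accept "--dest" as well as "dest"
   echo "OPT_$(echo "$name" | tr 'a-z-' 'A-Z_')"
}
```

           For example, "opt_var_name --disk-bytes-free" prints OPT_DISK_BYTES_FREE.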

           Plugins can stop the tool by setting the global variable "OKTORUN" to 1.  In this  case,  the  global
           variable "EXIT_REASON" should also be set to indicate why the tool was stopped.

           Plugin  writers  should  keep  in  mind  that  the file destination prefix currently in use should be
           accessed through the $prefix variable, rather than $OPT_PREFIX.
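
           Putting these rules together, a slightly larger plugin might look like the sketch below.  It
           assumes that $OPT_DEST and $prefix are already set when the hooks run; the notes file name and the
           "plugin_notes_file" helper are invented for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical plugin file, loaded with: --plugin /path/to/notes-plugin.sh
# Assumption: $OPT_DEST and $prefix are set by pt-stalk when the hooks run.

# Name of a notes file stored next to the samples of the current collection.
plugin_notes_file() {
   echo "$OPT_DEST/$prefix-plugin-notes"
}

before_collect() {
   echo "trigger fired, collection starting" >> "$(plugin_notes_file)"
}

after_collect() {
   # The collector PID is passed as the first argument.
   local collector_pid=$1
   echo "collector process: $collector_pid" >> "$(plugin_notes_file)"
}
```

           The helper function carries the "plugin_" prefix, and the hooks only read the tool's variables
           without modifying them.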

       --mysql-only
           Trigger only MySQL-related captures, ignoring all others.  The only non-MySQL value collected is the
           disk space, because it is needed to calculate the free disk space available for writing the result
           files.  This option is useful for RDS instances.

       --port
           short form: -P; type: int

           Port number to use for connection.

       --prefix
           type: string

           The  filename  prefix  for diagnostic samples.  By default, all files created by the same "--collect"
           instance have a timestamp prefix based on the current local time, like  "2011_12_06_14_02_02",  which
           is December 6, 2011 at 14:02:02.
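
           A timestamp in the same shape as the example above can be produced with date(1).  This is a sketch
           of the format only; "sample_prefix" is a hypothetical helper and not necessarily how the tool
           builds its default prefix.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the current local time in the
# YYYY_MM_DD_HH_MM_SS shape shown in the example above.
sample_prefix() {
   date +%Y_%m_%d_%H_%M_%S
}
```

           Such a string could, for instance, be combined with a custom label and passed to "--prefix".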

       --retention-count
           type: int; default: 0

           Keep  the  data for the last N runs. If N > 0, the program will keep the data for the last N runs and
           will delete the older data.

       --retention-size
           type: int; default: 0

           Keep up to --retention-size MB of data.  The tool keeps at least 1 run even if that run is larger than
           the size specified in this parameter.

       --retention-time
           type: int; default: 30

           Number of days to retain collected samples.  Any samples that are older will be purged.

       --run-time
           type: int; default: 30

           How  long to "--collect" diagnostic data when the trigger occurs.  The value is in seconds and should
           not be longer than "--sleep".  It is usually not necessary to change this; if the default 30  seconds
           doesn't  collect enough data, running longer is not likely to help because the system or MySQL server
           is probably too busy to respond.  In fact, in many cases a shorter collection period is appropriate.

           This value is used two other times.  After collecting,  the  collect  subprocess  will  wait  another
           "--run-time"  seconds  for  its  commands  to finish.  Some commands can take a while if the system is
           running very slowly (which can likely be the case given that  a  collection  was  triggered).   Since
           empty  files  are  deleted,  the  extra wait gives commands time to finish and write their data.  The
           value is potentially used again just before the tool exits to wait again for any collect subprocesses
           to finish.  In most cases this won't happen because of the aforementioned extra wait.  If it happens,
           the tool will log "Waiting up to N seconds for subprocesses to finish..."  where  N  is  three  times
           "--run-time".  In both cases, after waiting, the tool kills all of its subprocesses.

       --sleep
           type: int; default: 300

           How  long  to  sleep  after  "--collect".  This prevents the tool from triggering continuously, which
           might be a problem if the collection process is intrusive.  It also prevents filling up the  disk  or
           gathering too much data to analyze reasonably.

       --sleep-collect
           type: int; default: 1

           How  long  to  sleep  between  collection  loop  cycles.  This is useful with "--no-stalk" to do long
           collections.  For example, to collect data every minute for an hour, specify: "--no-stalk  --run-time
           3600 --sleep-collect 60".

       --socket
           short form: -S; type: string

           Socket file to use for connection.

       --stalk
           default: yes; negatable: yes

           Watch  the server and wait for the trigger to occur.  Specify "--no-stalk" to collect diagnostic data
           immediately, that is, without waiting for the trigger to occur.  You probably also  want  to  specify
           values for "--interval", "--iterations", and "--sleep".  For example, to immediately collect data for
           1 minute then exit, specify:

              --no-stalk --run-time 60 --iterations 1

           "--cycles",  "--daemonize", "--log" and "--pid" have no effect with "--no-stalk".  Safeguard options,
           like "--disk-bytes-free" and "--disk-pct-free", are still respected.

           See also "--collect".

       --threshold
           type: int; default: 25

           The maximum  acceptable  value  for  "--variable".   "--collect"  is  triggered  when  the  value  of
           "--variable" is greater than "--threshold" on at least "--cycles" checks.  Currently, there is no way
           to define a lower threshold to check for a "--variable" value that is too low.

           See also "--function".

       --user
           short form: -u; type: string

           User for login if not current user.

       --variable
           type: string; default: Threads_running

           The variable to compare against "--threshold".  See also "--function".

       --verbose
           type: int; default: 2

           Print more or less information while running.  Since the  tool  is  designed  to  be  a  long-running
           daemon,  the default verbosity level only prints the most important information.  If you run the tool
           interactively, you may want to use a higher verbosity level.

             LEVEL PRINTS
             ===== =====================================
             0     Errors
             1     Warnings
             2     Matching triggers and collection info
             3     Non-matching triggers

       --version
           Print tool's version and exit.

ENVIRONMENT

       This tool does not require any environment variables for configuration, although it can be influenced to
       work differently through several variables.  Keep in mind that these are expert settings, and should
       not be used in most cases.

       Specifically, the variables that can be set are:

       CMD_GDB
       CMD_IOSTAT
       CMD_MPSTAT
       CMD_MYSQL
       CMD_MYSQLADMIN
       CMD_OPCONTROL
       CMD_OPREPORT
       CMD_PMAP
       CMD_STRACE
       CMD_SYSCTL
       CMD_TCPDUMP
       CMD_VMSTAT

       For example, during collection iostat is called with  a  -dx  argument,  but  because  you  have  an  NFS
       partition, you also need the -n flag there.  Instead of editing the source, you can call pt-stalk as

           CMD_IOSTAT="iostat -n" pt-stalk ...

       which will do exactly what you need.  Combined with the plugin hooks, this gives you fine-grained
       control of what the tool does.

       It is possible to enable "debug" mode in mysqladmin by specifying:

       "CMD_MYSQLADMIN='mysqladmin debug' pt-stalk params ..."

SYSTEM REQUIREMENTS

       This tool requires Bash v3 or newer.  Certain options require other programs:

       "--collect-gdb" requires "gdb"
       "--collect-oprofile" requires "opcontrol" and "opreport"
       "--collect-strace" requires "strace"
       "--collect-tcpdump" requires "tcpdump"

BUGS

       For a list of known bugs, see <http://www.percona.com/bugs/pt-stalk>.

       Please report bugs at <https://jira.percona.com/projects/PT>.  Include the following information in  your
       bug report:

       •   Complete command-line used to run the tool

       •   Tool "--version"

       •   MySQL version of all servers involved

       •   Output from the tool including STDERR

       •   Input files (log/dump/config files, etc.)

       If possible, include debugging output by running the tool with "PTDEBUG"; see "ENVIRONMENT".

DOWNLOADING

       Visit  <http://www.percona.com/software/percona-toolkit/>  to  download  the  latest  release  of Percona
       Toolkit.  Or, get the latest release from the command line:

          wget percona.com/get/percona-toolkit.tar.gz

          wget percona.com/get/percona-toolkit.rpm

          wget percona.com/get/percona-toolkit.deb

       You can also get individual tools from the latest release:

          wget percona.com/get/TOOL

       Replace "TOOL" with the name of any tool.

AUTHORS

       Baron Schwartz, Justin Swanhart, Fernando Ipar, Daniel Nichter, and Brian Fraser

ABOUT PERCONA TOOLKIT

       This tool is part of Percona Toolkit, a collection of advanced command-line tools for MySQL developed  by
       Percona.   Percona  Toolkit  was  forked  from  two  projects  in June, 2011: Maatkit and Aspersa.  Those
       projects were created by Baron Schwartz and  primarily  developed  by  him  and  Daniel  Nichter.   Visit
       <http://www.percona.com/software/> to learn about other free, open-source software from Percona.

COPYRIGHT, LICENSE, AND WARRANTY

       This program is copyright 2011-2018 Percona LLC and/or its affiliates, 2010-2011 Baron Schwartz.

       THIS  PROGRAM  IS  PROVIDED  "AS  IS"  AND  WITHOUT ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, WITHOUT
       LIMITATION, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.

       This program is free software; you can redistribute it and/or modify  it  under  the  terms  of  the  GNU
       General  Public  License  as  published  by the Free Software Foundation, version 2; OR the Perl Artistic
       License.  On UNIX and similar systems, you can issue `man perlgpl' or `man perlartistic'  to  read  these
       licenses.

       You  should have received a copy of the GNU General Public License along with this program; if not, write
       to the Free Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA.

VERSION

       pt-stalk 3.2.1

perl v5.30.3                                       2020-08-30                                       PT-STALK(1p)