Provided by: slurmd_24.11.5-4_amd64 bug

NAME

       slurmd - The compute node daemon for Slurm.

SYNOPSIS

       slurmd [OPTIONS...]

DESCRIPTION

       slurmd  is  the compute node daemon of Slurm. It monitors all tasks running on the compute node , accepts
       work (tasks), launches tasks, and kills running tasks upon request.

OPTIONS

       --authinfo
              Used with configless to set an alternate AuthInfo parameter to be used to establish  communication
              with  slurmctld  before  the configuration file has been retrieved. (E.g., to specify an alternate
              MUNGE socket location.)

       -b     Report node rebooted when daemon restarted. Used for testing purposes.

       -c     Clear system locks as needed. This may be required if slurmd terminated abnormally.

       -C     Print the actual hardware configuration (not the configuration from the slurm.conf file) and exit.
              The format of output is the same as used in slurm.conf to describe a node's configuration plus its
              uptime.

       --conf <node parameters>
              Used in conjunction with the -Z option. Used to override or  define  additional  parameters  of  a
              dynamic  node  using  the  same  syntax  and  parameters  used  to define nodes in the slurm.conf.
              Specifying any of CPUs, Boards, SocketsPerBoard, CoresPerSocket or  ThreadsPerCore  will  override
              the defaults defined by the -C option. NodeName and Port are not supported.

              For example if slurmd -C reports
              NodeName=node1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=31848

              the following --conf specifications will generate the corresponding node definitions:
              --conf "Gres=gpu:2"
              NodeName=node1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=31848 Gres=gpu:2

              --conf "RealMemory=30000"
              NodeName=node1 CPUs=16 Boards=1 SocketsPerBoard=1 CoresPerSocket=8 ThreadsPerCore=2 RealMemory=30000

              --conf "CPUs=16"
              NodeName=node1 CPUs=16 RealMemory=331848

              --conf "CPUs=16 RealMemory=30000 Gres=gpu:2"
              NodeName=node1 CPUs=16 RealMemory=30000 Gres=gpu:2"

       --conf-server <host|ip address>[:<port>]
              Comma-separated  list  of  controllers,  the  first  being  the  primary  slurmctld.  A  port  can
              (optionally) be specified for each controller. These hosts are where the  slurmd  will  fetch  the
              configuration  from  when running in "configless" mode.  NOTE: If specifying an IPv6 address, wrap
              the <ip address> in [] to distinguish the address from the port.  This is required even if no port
              is specified.

       -d <file>
              Specify the fully qualified pathname to the slurmstepd program to be used for shepherding user job
              steps. This can be useful for testing purposes.

       -D     Run slurmd in the foreground. Error and debug messages will be copied to stderr.

       --extra <arbitrary string>
              Set    "extra"    data    on    node    startup.    If    this    is    a    json    string    and
              SchedulerParameters=extra_constraints  is  set in slurm.conf, then jobs may use the --extra option
              to filter based on this "extra" data.

       -f <file>
              Read configuration from the specified file. See NOTES below.

       -F[feature]
              Start this node as a Dynamic Future node. It will try to match a node definition with a  state  of
              FUTURE, optionally using the specified feature to match the node definition.

       -G     Print  Generic  RESource  (GRES)  configuration  (based upon slurm.conf GRES merged with gres.conf
              contents for this node) and exit.

       -h     Help; print a brief summary of command options.

       --instance-id <cloud instance id>
              Set cloud instance ID on node startup.

       --instance-type <cloud instance type>
              Set cloud instance type on node startup.

       -L <file>
              Write log messages to the specified file.

       -M     Lock slurmd pages into system memory using mlockall (2) to disable paging of the  slurmd  process.
              This  may  help in cases where nodes are marked DOWN during periods of heavy swap activity. If the
              mlockall (2) system call is not available, an error will be printed to the  log  and  slurmd  will
              continue as normal.

              It is suggested to set LaunchParameters=slurmstepd_memlock in slurm.conf(5) when setting -M.

       -n <value>
              Set  the  daemon's  nice value to the specified value, typically a negative number.  Also note the
              PropagatePrioProcess configuration parameter.

       -N <nodename>
              Run the daemon with the given nodename. Used to emulate a larger system with more than one  slurmd
              daemon per node. Requires that Slurm be built using the --enable-multiple-slurmd configure option.

       -s     Change  working  directory  of  slurmd  to  SlurmdLogFile  path  if possible, or to SlurmdSpoolDir
              otherwise. If both of them fail it will fallback to /var/tmp.

       --systemd
              Use when starting the daemon with systemd. This will allow slurmd to notify systemd of the new PID
              when using 'scontrol reconfigure'.

       -v     Verbose operation. Multiple -v's increase verbosity.

       -V, --version
              Print version information and exit.

       -Z     Start this node as a Dynamic Normal node. If no --conf is specified, then the slurmd will register
              with the same hardware configuration as defined by the -C option.

ENVIRONMENT VARIABLES

       The following environment variables can be used to override settings compiled into slurmd.

       ABORT_ON_FATAL      When a fatal error is detected, use  abort()  instead  of  exit()  to  terminate  the
                           process. This allows backtraces to be captured without recompiling Slurm.

       SLURM_CONF          The location of the Slurm configuration file. This is overridden by explicitly naming
                           a configuration file on the command line.

       SLURM_DEBUG_FLAGS   Specify  debug  flags for slurmd to use. See DebugFlags in the slurm.conf(5) man page
                           for a full list of flags. The environment variable takes precedence over the  setting
                           in the slurm.conf.

SIGNALS

       SIGTERM SIGINT SIGQUIT
              slurmd will shutdown cleanly.

       SIGHUP Reloads the slurm configuration files, similar to 'scontrol reconfigure'.

       SIGUSR2
              Reread  the  log  level  from  the configs, and then reopen the log file. This should be used when
              setting up logrotate(8).

       SIGPIPE
              This signal is explicitly ignored.

CORE FILE LOCATION

       If slurmd is started with the -D option then the core  file  will  be  written  to  the  current  working
       directory.   Otherwise  if SlurmdLogFile is a fully qualified path name (starting with a slash), the core
       file will be written to the same directory as the log file. Otherwise the core file will  be  written  to
       the  SlurmdSpoolDir  directory,  or "/var/tmp/" as a last resort. If none of the above directories can be
       written, no core file will be produced.

NOTES

       It may be useful to experiment with different slurmd specific configuration parameters using  a  distinct
       configuration  file  (e.g.  timeouts).  However,  this special configuration file will not be used by the
       slurmctld daemon or the Slurm programs, unless you specifically tell each of  them  to  use  it.  If  you
       desire  changing communication ports, the location of the temporary file system, or other parameters used
       by other Slurm components, change the common configuration file, slurm.conf.

       If you are using configless mode with a login node that runs a lot of client commands, you  may  consider
       running  slurmd  on that machine so it can manage a cached version of the configuration files. Otherwise,
       each client command will use the  DNS  record  to  contact  the  controller  and  get  the  configuration
       information, which could place additional load on the controller.

COPYING

       Copyright  (C)  2002-2007  The Regents of the University of California.  Copyright (C) 2008-2010 Lawrence
       Livermore National Security.  Copyright (C)  2010-2022  SchedMD  LLC.   Produced  at  Lawrence  Livermore
       National Laboratory (cf, DISCLAIMER).

       This    file    is    part    of    Slurm,   a   resource   management   program.    For   details,   see
       <https://slurm.schedmd.com/>.

       Slurm is free software; you can redistribute it and/or modify it under  the  terms  of  the  GNU  General
       Public License as published by the Free Software Foundation; either version 2 of the License, or (at your
       option) any later version.

       Slurm  is  distributed  in  the  hope  that it will be useful, but WITHOUT ANY WARRANTY; without even the
       implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR  PURPOSE.  See  the  GNU  General  Public
       License for more details.

FILES

       /etc/slurm.conf

SEE ALSO

       slurm.conf(5), slurmctld(8)

April 2025                                        Slurm Daemon                                         slurmd(8)