Provided by: ceph-base_19.2.1-0ubuntu3_amd64 bug

NAME

       crushtool - CRUSH map manipulation tool

SYNOPSIS

       crushtool ( -d map | -c map.txt | --build --num_osds numosds
       layer1 ... | --test ) [ -o outfile ]

DESCRIPTION

       crushtool is a utility that lets you create, compile, decompile and test CRUSH map files.

       CRUSH  is  a  pseudo-random data distribution algorithm that efficiently maps input values (which, in the
       context of Ceph, correspond to Placement Groups) across a heterogeneous, hierarchically structured device
       map.  The algorithm was originally described in detail in the following paper (although  it  has  evolved
       some since then):

          http://www.ssrc.ucsc.edu/Papers/weil-sc06.pdf

       The tool has four modes of operation.

       --compile|-c map.txt
              will compile a plaintext map.txt into a binary map file.

       --decompile|-d map
              will take the compiled map and decompile it into a plaintext source file, suitable for editing.

       --build --num_osds {num-osds} layer1 ...
              will create map with the given layer structure. See below for a detailed explanation.

       --test will  perform  a dry run of a CRUSH mapping for a range of input values [--min-x,--max-x] (default
              [0,1023]) which can be thought of as simulated Placement Groups. See below  for  a  more  detailed
              explanation.

       Unlike other Ceph tools, crushtool does not accept generic options such as --debug-crush from the command
       line. They can, however, be provided via the CEPH_ARGS environment variable. For instance, to silence all
       output from the CRUSH subsystem:

          CEPH_ARGS="--debug-crush 0" crushtool ...

RUNNING TESTS WITH --TEST

       The  test  mode  will use the input crush map ( as specified with -i map ) and perform a dry run of CRUSH
       mapping or random placement (if --simulate is set ). On completion, two kinds of reports can be  created.
       1)  The  --show-...  option  outputs  human  readable  information on stderr.  2) The --output-csv option
       creates CSV files that are documented by the --help-output option.

       Note: Each Placement Group (PG) has an integer ID which can be obtained from ceph pg dump (for example PG
       2.2f means pool id 2, PG id 32).  The pool and PG IDs are combined by a function to get a value which  is
       given to CRUSH to map it to OSDs. crushtool does not know about PGs or pools; it only runs simulations by
       mapping values in the range [--min-x,--max-x].

       --show-statistics
              Displays a summary of the distribution. For instance:

                 rule 1 (metadata) num_rep 5 result size == 5:    1024/1024

              shows  that  rule  1  which  is named metadata successfully mapped 1024 values to result size == 5
              devices when trying to map them to num_rep 5 replicas. When  it  fails  to  provide  the  required
              mapping,  presumably because the number of tries must be increased, a breakdown of the failures is
              displayed. For instance:

                 rule 1 (metadata) num_rep 10 result size == 8:   4/1024
                 rule 1 (metadata) num_rep 10 result size == 9:   93/1024
                 rule 1 (metadata) num_rep 10 result size == 10:  927/1024

              shows that although num_rep 10 replicas were required, 4 out of 1024 values ( 4/1024 ) were mapped
              to result size == 8 devices only.

       --show-mappings
              Displays the mapping of each value in the range [--min-x,--max-x].  For instance:

                 CRUSH rule 1 x 24 [11,6]

              shows that value 24 is mapped to devices [11,6] by rule 1.

              One of the following is required when using the --show-mappings option:

                 a. --num-rep

                 b. both --min-rep and --max-rep

              --num-rep stands for "number of replicas, indicates the number of replicas in a pool, and is  used
              to specify an exact number of replicas (for example --num-rep 5). --min-rep and --max-rep are used
              together to specify a range of replicas (for example, --min-rep 1 --max-rep 10).

       --show-bad-mappings
              Displays which value failed to be mapped to the required number of devices. For instance:

                 bad mapping rule 1 x 781 num_rep 7 result [8,10,2,11,6,9]

              shows that when rule 1 was required to map 7 devices, it could map only six : [8,10,2,11,6,9].

       --show-utilization
              Displays  the  expected  and  actual utilization for each device, for each number of replicas. For
              instance:

                 device 0: stored : 951      expected : 853.333
                 device 1: stored : 963      expected : 853.333
                 ...

              shows that device 0 stored 951 values and was expected to store 853.  Implies --show-statistics.

       --show-utilization-all
              Displays the same as --show-utilization but does not suppress output when the weight of  a  device
              is zero.  Implies --show-statistics.

       --show-choose-tries
              Displays how many attempts were needed to find a device mapping.  For instance:

                 0:     95224
                 1:      3745
                 2:      2225
                 ..

              shows  that  95224  mappings succeeded without retries, 3745 mappings succeeded with one attempts,
              etc. There are as many rows as the value of the --set-choose-total-tries option.

       --output-csv
              Creates CSV files (in the current directory) containing information documented  by  --help-output.
              The  files are named after the rule used when collecting the statistics. For instance, if the rule
              : 'metadata' is used, the CSV files will be:

                 metadata-absolute_weights.csv
                 metadata-device_utilization.csv
                 ...

              The first line of the file shortly explains the column layout. For instance:

                 metadata-absolute_weights.csv
                 Device ID, Absolute Weight
                 0,1
                 ...

       --output-name NAME
              Prepend  NAME  to  the  file  names  generated  when  --output-csv  is  specified.  For   instance
              --output-name FOO will create files:

                 FOO-metadata-absolute_weights.csv
                 FOO-metadata-device_utilization.csv
                 ...

       The  --set-...  options can be used to modify the tunables of the input crush map. The input crush map is
       modified in memory. For example:

          $ crushtool -i mymap --test --show-bad-mappings
          bad mapping rule 1 x 781 num_rep 7 result [8,10,2,11,6,9]

       could be fixed by increasing the choose-total-tries as follows:

          $ crushtool -i mymap --test
                 --show-bad-mappings --set-choose-total-tries 500

BUILDING A MAP WITH --BUILD

       The build mode will generate hierarchical maps. The  first  argument  specifies  the  number  of  devices
       (leaves)  in  the CRUSH hierarchy. Each layer describes how the layer (or devices) preceding it should be
       grouped.

       Each layer consists of:

          bucket ( uniform | list | tree | straw | straw2 ) size

       The bucket is the type of the buckets in the layer (e.g. "rack"). Each  bucket  name  will  be  built  by
       appending a unique number to the bucket string (e.g. "rack0", "rack1"...).

       The second component is the type of bucket: straw should be used most of the time.

       The  third  component  is  the  maximum  size  of  the  bucket. A size of zero means a bucket of infinite
       capacity.

EXAMPLE

       Suppose we have two rows with two racks each and 20 nodes per rack. Suppose each node contains 4  storage
       devices  for Ceph OSD Daemons. This configuration allows us to deploy 320 Ceph OSD Daemons. Lets assume a
       42U rack with 2U nodes, leaving an extra 2U for a rack switch.

       To reflect our hierarchy of devices, nodes, racks and rows, we would execute the following:

          $ crushtool -o crushmap --build --num_osds 320 \
                 node straw 4 \
                 rack straw 20 \
                 row straw 2 \
                 root straw 0
          # id        weight  type name       reweight
          -87 320     root root
          -85 160             row row0
          -81 80                      rack rack0
          -1  4                               node node0
          0   1                                       osd.0   1
          1   1                                       osd.1   1
          2   1                                       osd.2   1
          3   1                                       osd.3   1
          -2  4                               node node1
          4   1                                       osd.4   1
          5   1                                       osd.5   1
          ...

       CRUSH rules are created so the generated crushmap can be tested. They are the  same  rules  as  the  ones
       created by default when creating a new Ceph cluster. They can be further edited with:

          # decompile
          crushtool -d crushmap -o map.txt

          # edit
          emacs map.txt

          # recompile
          crushtool -c map.txt -o crushmap

RECLASSIFY

       The reclassify function allows users to transition from older maps that maintain parallel hierarchies for
       OSDs  of  different  types  to  a  modern CRUSH map that makes use of the device class feature.  For more
       information,                                            see                                             ‐
       https://docs.ceph.com/en/latest/rados/operations/crush-map-edits/#migrating-from-a-legacy-ssd-rule-to-device-classes.

EXAMPLE OUTPUT FROM --TEST

       See  https://github.com/ceph/ceph/blob/master/src/test/cli/crushtool/set-choose.t  for  sample  crushtool
       --test commands and output produced thereby.

AVAILABILITY

       crushtool is part of Ceph, a massively scalable, open-source, distributed storage system. Please refer to
       the Ceph documentation at https://docs.ceph.com for more information.

SEE ALSO

       ceph(8), osdmaptool(8),

AUTHORS

       John Wilkins, Sage Weil, Loic Dachary

COPYRIGHT

       2010-2014, Inktank Storage, Inc. and contributors. Licensed  under  Creative  Commons  Attribution  Share
       Alike 3.0 (CC-BY-SA-3.0)

dev                                               May 22, 2025                                      CRUSHTOOL(8)