Provided by: reposurgeon_4.31-1_amd64

NAME

       repocutter - surgical and filtering operations on Subversion dump files

SYNOPSIS

       repocutter [-q] [-d n] [-i 'filename'] [-r 'selection'] [-t 'tag'] 'subcommand'

DESCRIPTION

       This program does surgical and filtering operations on Subversion dump files. While it is not as
       flexible as reposurgeon(1), it can perform Subversion-specific transformations that reposurgeon cannot,
       and can be useful for processing Subversion repositories into a form suitable for conversion. Also, it
       supports the version 3 dumpfile format, which reposurgeon does not.

       In all commands, the -r (or --range) option limits the selection of revisions over which an operation
       will be performed. Usually other revisions will be passed through unaltered, except in the select and
       deselect commands for which the option controls which revisions will be passed through. A selection
       consists of one or more comma-separated ranges. A range may consist of an integer revision number or the
       special name HEAD for the head revision. Or it may be a colon-separated pair of integers, or an integer
       followed by a colon followed by HEAD.
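
       The selection syntax just described can be modelled with a short sketch. This is only an
       illustration in Python, not repocutter's own code; the names parse_selection and selected are
       hypothetical, and the head revision number is passed in by hand because the sketch does not read a
       dump file.

```python
def parse_selection(spec, head):
    """Parse a selection such as '3,7:12,40:HEAD' into inclusive (low, high) pairs.

    `head` is the number of the head revision (an assumption of this sketch:
    the caller supplies it, since no dump file is read here).
    """
    def endpoint(token):
        # The special name HEAD stands for the head revision.
        return head if token == "HEAD" else int(token)

    ranges = []
    for part in spec.split(","):
        low, sep, high = part.partition(":")
        if sep:
            # Colon-separated pair: low:high or low:HEAD.
            ranges.append((endpoint(low), endpoint(high)))
        else:
            # A single revision is a range of length one.
            ranges.append((endpoint(low), endpoint(low)))
    return ranges


def selected(revision, ranges):
    """Return True if the revision number falls inside any range."""
    return any(low <= revision <= high for low, high in ranges)
```

       With a head revision of 50, the selection '3,7:12,40:HEAD' covers revision 3, revisions 7 through 12,
       and revisions 40 through 50.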

       (Older versions of this tool, before 4.30, treated -r as an implied selection filter rather than passing
       through unselected revisions unaltered. If you have old scripts using repocutter they may need
       modification.)

       Normally, each subcommand produces a progress spinner on standard error; each turn means another revision
       has been filtered. The -q (or --quiet) option suppresses this.

       The -d option enables debug messages on standard error. It takes an integer debug level. These messages
       are probably only of interest to repocutter developers.

       The -i option sets the input source to a specified filename. This is primarily useful when running the
       program under a debugger. When this option is not present the program expects to read a stream from
       standard input.

       Generally, if you need to use this program at all, you will find that you need to pipe your dump file
       through multiple instances of it doing one kind of operation each. This is not as expensive as it sounds;
       with the exception of the reduce subcommand, the working set of this program is bounded by the size of
       the largest single blob plus its metadata. It does not need to hold the entire repo metadata in
       memory.

       The -t option sets a tag to be included in error messages. This will be useful for determining which stage
       of a multistage repocutter pipeline failed.

       The following subcommands are available:

       select
           The 'select' subcommand selects a range and permits only revisions and nodes in that range to pass to
           standard output. A range beginning with 0 includes the dumpfile header.

       deselect
           The 'deselect' subcommand selects a range and permits only revisions and nodes NOT in that range to
           pass to standard output.

       see
           Render a very condensed report on the repository node structure, mainly useful for examining strange
           and pathological repositories. File content is ignored. You get one line per repository operation,
           reporting the revision, operation type, file path, and the copy source (if any). Directory paths are
           distinguished by a trailing slash. The 'copy' operation is really an 'add' with a directory source
           and target; the display name is changed to make them easier to see. This report can be restricted by
           a selection set.

       renumber
           Renumber all revisions, patching Node-copyfrom headers as required. Any selection option is ignored.
           Takes no arguments. The -b option can be used to set the base to renumber from, defaulting to 0.
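
           As a rough model of what renumbering involves, the Python sketch below renumbers a list of
           revisions consecutively and patches the copy sources to match. The data layout (dicts with 'rev'
           and 'copyfrom_revs' keys) and the function name are hypothetical, and the sketch assumes every
           copy source revision is still present in the stream.

```python
def renumber(revisions, base=0):
    """Renumber revisions consecutively from `base`, patching copy sources.

    Each revision is modelled as a dict holding its 'rev' number and a list
    'copyfrom_revs' with the source revisions of any copies it contains.
    Assumption: every copy source is itself in `revisions`.
    """
    # Map each old revision number to its new consecutive number.
    mapping = {r["rev"]: base + i for i, r in enumerate(revisions)}
    return [
        {
            "rev": mapping[r["rev"]],
            "copyfrom_revs": [mapping[c] for c in r["copyfrom_revs"]],
        }
        for r in revisions
    ]
```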

       log
           Generate a log report, same format as the output of svn log on a repository, to standard output.

       setlog
           Replace the log entries in the input dumpfile with the corresponding entries in the LOGFILE, which
           should be in the format of an svn log output. Replacements may be restricted to a specified range.

       propdel
           Delete the property PROPNAME. May be restricted by a revision selection. You may specify multiple
           properties to be deleted.

       proprename
           Rename the property OLDNAME to NEWNAME. May be restricted by a revision selection. You may specify
           multiple properties to be renamed.

       propset
           Set the property PROPNAME to PROPVAL. May be restricted by a revision selection. You may specify
           multiple property settings.

       propclean
           Every path with a suffix matching one of SUFFIXES gets a property turned off. The default property is
           svn:executable. Another property may be set with the -p option.

       expunge
           Delete all operations with Node-path or Node-copyfrom-path headers matching specified Golang regular
           expressions (opposite of 'sift'). Any revision left with no Node records after this filtering has its
           Revision record removed as well.

           Matches are constrained so that each match must be a path segment or a sequence of path segments;
           that is, the left end must be either at the start of path or immediately following a /, and the right
           end must precede a / or be at end of string. With a leading ^ the match is constrained to be a
           leading sequence of the pathname; with a trailing $, a trailing one.

           The -f/-fixed option disables regexp compilation of the patterns, treating them as fixed strings.

       sift
           Delete all operations with Node-path or Node-copyfrom-path headers not matching specified Golang
           regular expressions (opposite of 'expunge'). Any revision left with no Node records after this
           filtering has its Revision record removed as well. This transform can be restricted by a selection
           set.

           Matches are constrained so that each match must be a path segment or a sequence of path segments;
           that is, the left end must be either at the start of path or immediately following a /, and the right
           end must precede a / or be at end of string. With a leading ^ the match is constrained to be a
           leading sequence of the pathname; with a trailing $, a trailing one.

           The -f/-fixed option disables regexp compilation of the patterns, treating them as fixed strings.
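
           The segment constraint shared by 'expunge' and 'sift' can be pictured with the following Python
           sketch. It is a model only: repocutter itself uses Golang regular expressions, which have no
           lookaround, so its implementation necessarily differs. The helper names are hypothetical, and
           the sketch filters a plain list of paths rather than dump-file nodes.

```python
import re


def segment_regex(pattern):
    """Wrap `pattern` so that any match must cover whole path segments."""
    # Left end: start of path, or just after a '/'.
    # Right end: just before a '/', or end of string.
    return re.compile(r"(?:^|(?<=/))(?:" + pattern + r")(?=/|$)")


def matches(path, patterns):
    """Return True if any pattern matches the path on segment boundaries."""
    return any(segment_regex(p).search(path) for p in patterns)


def expunge(paths, patterns):
    """Keep only the paths that match none of the patterns."""
    return [p for p in paths if not matches(p, patterns)]


def sift(paths, patterns):
    """Keep only the paths that match at least one pattern."""
    return [p for p in paths if matches(p, patterns)]
```

           In this model the pattern 'foo' matches 'trunk/foo/a.c' but not 'trunk/foobar/b.c', because the
           match would end in the middle of a segment.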

       closure
           The 'closure' subcommand computes the transitive closure of a path set under the relation 'copies
           from' - that is, it extends the set with the smallest set of additional paths such that every
           copy-from source is in the set.
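
           A transitive closure of this kind can be sketched in Python as a simple worklist loop. This is an
           illustration under simplifying assumptions, not the tool's algorithm: copies are modelled as a
           hypothetical dict from copy target to copy source, each target has one source, and paths are
           compared for exact equality.

```python
def closure(paths, copies):
    """Return `paths` plus every path reachable through 'copies from'.

    `copies` maps a copy target path to its copy-from source path
    (a simplified, hypothetical representation of the dump's copy records).
    """
    result = set(paths)
    pending = list(paths)
    while pending:
        source = copies.get(pending.pop())
        # Add each newly discovered source once; this also guards against cycles.
        if source is not None and source not in result:
            result.add(source)
            pending.append(source)
    return result
```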

       pathlist
           List all distinct node-paths in the stream, once each, in the order first encountered.

       pathrename
           Modify Node-path headers, Node-copyfrom-path headers, and svn:mergeinfo properties matching the
           specified Golang regular expression FROM; replace with TO. TO may contain Golang-style
           backreferences (${1}, ${2} etc - curly brackets not optional) to parenthesized portions of FROM.

           Matches are constrained so that each match must be a path segment or a sequence of path segments;
           that is, the left end must be either at the start of path or immediately following a /, and the right
           end must precede a / or be at end of string. With a leading ^ the match is constrained to be a
           leading sequence of the pathname; with a trailing $, a trailing one.

           Multiple FROM/TO pairs may be specified and are applied in order. This transform can be restricted by
           a selection set.
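
           The effect of ordered FROM/TO pairs with backreferences can be approximated in Python as below.
           This is a sketch, not repocutter's code: Python writes backreferences as \g<1> rather than
           ${1}, so the hypothetical pathrename function translates the Golang form first, and it operates
           on a single path string rather than on dump-file headers.

```python
import re


def pathrename(path, pairs):
    """Apply each (FROM, TO) pair in order to one path string."""
    for from_pattern, to in pairs:
        # Escape literal backslashes, then turn Go-style ${1} into Python's \g<1>.
        py_to = re.sub(r"\$\{(\d+)\}", r"\\g<\1>", to.replace("\\", "\\\\"))
        # Constrain matches to whole path segments; the wrapper groups are
        # non-capturing so the numbering of the groups in FROM is unchanged.
        regex = re.compile(r"(?:^|(?<=/))(?:" + from_pattern + r")(?=/|$)")
        path = regex.sub(py_to, path)
    return path
```

           For instance, the pair ('^projects/([^/]+)/trunk', 'trunk/${1}') turns 'projects/alpha/trunk/x'
           into 'trunk/alpha/x' in this model.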

       pop
           Pop initial segment off each path. May be useful after a sift command to turn a dump from a
           subproject stripped from a dump for a multiple-project repository into the normal form with
           trunk/tags/branches at the top level. This transform can be restricted by a selection set.

       push
           Push an initial segment onto each path. Normally used to add a "trunk" prefix to every path in a
           flat repository. This transform can be restricted by a selection set.
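
           On a single path string, 'pop' and 'push' amount to the following. This Python sketch uses
           hypothetical helper names and does not model how the tool treats a path that has only one
           segment.

```python
def pop(path):
    """Drop the first segment of a slash-separated path."""
    _, _, rest = path.partition("/")
    return rest


def push(path, segment):
    """Prepend a segment to a slash-separated path."""
    return segment + "/" + path
```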

       skipcopy
           Replace the source revision and path of a copy at the upper end of the selection with the source
           revision and path of a copy at the lower end. Fails unless both revisions are copies. Used to remove
           an unwanted intermediate copy or copies - fails noisily if there is a change operating on the target
           path between these revisions.

       swap
           Swap the top two elements of each pathname in every revision in the selection set. Useful following a
           sift operation for straightening out a common form of multi-project repository. If a PATTERN argument
           is given, only paths matching the pattern are swapped. This transform can be restricted by a
           selection set.
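
           For one path, the swap can be sketched in Python as follows. The function name is hypothetical,
           and the sketch ignores the optional PATTERN argument and simply leaves single-segment paths
           alone.

```python
def swap(path):
    """Exchange the first two segments of a slash-separated path."""
    parts = path.split("/")
    if len(parts) >= 2:
        parts[0], parts[1] = parts[1], parts[0]
    return "/".join(parts)
```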

       swapsvn
           Like swap, but is aware of Subversion structure. Used for transforming multiproject repositories into
           a standard layout with trunk, tags, and branches at the top level.

           Fires when the second component of a matching path is "trunk", "branches", or "tags", or the path
           consists of a single segment that is a top-level project directory; passes through unaltered all
           paths for which this is not so.

           Top-level project directories with properties or comments make this command die (return status 1)
           with an error message on stderr; otherwise these directories are silently discarded.

           Otherwise, swaps "trunk" and the top-level (project) directory straight up. For tags and branches,
           the following two components are swapped to the top. Thus, "foo/branches/release23" becomes
           "branches/release23/foo", putting the project directory beneath the branch.

           Also fires when an entire project directory is copied; this is transformed into a copy of trunk and
           copies of each subbranch and tag that exists.
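
           The basic path rule for trunk, branches, and tags described above can be written as a Python
           sketch. It covers only the rewriting of a single path; the discarding of top-level project
           directories, the handling of whole-project copies, and copy coalescing are not modelled, and the
           function name is hypothetical.

```python
def swapsvn_path(path):
    """Rewrite project/trunk/... and project/{branches,tags}/NAME/... paths."""
    parts = path.split("/")
    if len(parts) >= 2 and parts[1] == "trunk":
        # project/trunk/rest -> trunk/project/rest
        return "/".join(["trunk", parts[0]] + parts[2:])
    if len(parts) >= 3 and parts[1] in ("branches", "tags"):
        # project/branches/NAME/rest -> branches/NAME/project/rest
        return "/".join([parts[1], parts[2], parts[0]] + parts[3:])
    # Anything else passes through unaltered in this sketch.
    return path
```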

           After the swap, there are attempts to recognize spans of copies into branch directories, and copies
           into tag subdirectories that are parallel in all top-level (project) directories. These are coalesced
           into single copies in the inverted structure. No attempt is made to coalesce deletes; the user must
           manually trim unneeded branches.

           Accordingly, copies with three-segment sources and three-segment targets are transformed; for tags/
           and branches/ paths the last segment (the subdirectory below the branch name) is dropped. Following
           copies are skipped.

           This has two minor negative consequences. One is that metadata belonging to all deletes or copies
           after the first one in a coalesced span is lost. The other is that branches and tags local to
           individual project directories are promoted to global branches and tags across the entire transformed
           repository; no content is lost this way.

           Parallel rename sequences are also coalesced.

           If a PATTERN argument is given, only paths matching the pattern are swapped.

           Note that the result of swapping does not have initial trunk/branches/tags directory creations and
           can thus not be fed directly to svnload. reposurgeon copes with this, but Subversion will not.

           This transform can be restricted by a selection set.

       replace
           Perform a regular expression search/replace on blob content. The first character of the argument
           (normally /) is treated as the end delimiter for the regular-expression and replacement parts. This
           transform can be restricted by a selection set.
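
           The delimiter convention can be illustrated with a small Python sketch. It is a model with
           hypothetical names, not the tool's parser: it works on text rather than raw blob bytes, assumes
           the delimiter does not occur inside either part, and treats the replacement as literal text.

```python
import re


def parse_replace_argument(argument):
    """Split '<d>REGEX<d>REPLACEMENT<d>' on its first character <d>."""
    delimiter = argument[0]
    parts = argument.split(delimiter)
    # A well-formed argument splits into: '', REGEX, REPLACEMENT, ''.
    if len(parts) != 4 or parts[3] != "":
        raise ValueError("expected <delim>REGEX<delim>REPLACEMENT<delim>")
    return parts[1], parts[2]


def replace(content, argument):
    """Apply the search/replace described by `argument` to `content`."""
    pattern, replacement = parse_replace_argument(argument)
    # A function replacement keeps the replacement text literal.
    return re.sub(pattern, lambda match: replacement, content)
```

           Choosing a delimiter other than / is convenient when the expression itself contains slashes, as
           in ',/,-,' which replaces every slash with a hyphen.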

       strip
           Replace content with unique generated cookies on all node paths matching the specified regular
           expressions; if no expressions are given, match all paths. Useful when you need to examine a
           particularly complex node structure. This transform can be restricted by a selection set.

       obscure
           Replace path segments and committer IDs with arbitrary but consistent names in order to obscure them.
           The replacement algorithm is tuned to make the replacements readily distinguishable by eyeball. This
           transform can be restricted by a selection set.

       reduce
           Strip revisions out of a dump so the only parts left are those likely to be relevant to a conversion
           problem. This is done by dropping every node that consists of a change on a file and has no property
           settings.

       testify
           Replace commit timestamps with a monotonically increasing clock tick starting at the Unix epoch and
           advancing by 10 seconds per commit. Replace all attributions with 'fred'. Discard the repository
           UUID. Use this to neutralize procedurally-generated streams so they can be compared. This transform
           can be restricted by a selection set.

       version
           Report major and minor repocutter version.

HISTORY

       Under the name "svncutter", an ancestor of this program traveled in the 'contrib/' directory of the
       Subversion distribution. It had functional overlap with reposurgeon(1) because it was directly ancestral
       to that code. It was moved to the reposurgeon(1) distribution in January 2016. This program was ported
       from Python to Go in August 2018, at which time the obsolete "squash" command was retired. The syntax of
       regular expressions in the pathrename command changed at that time.

       The reason for the partial functional overlap between repocutter and reposurgeon is that repocutter was
       first written earlier and became a testbed for some of the design concepts in reposurgeon. After
       reposurgeon was written, the author learned that it could not naturally support some useful operations
       very specific to Subversion, and enhanced repocutter to do those.

BUGS

       There is one regression since the Python version: repocutter no longer recognizes Macintosh-style line
       endings consisting of a carriage return only. This may be addressed in a future version.

SEE ALSO

       reposurgeon(1).

EXAMPLE

       Suppose you have a Subversion repository with the following semi-pathological structure:

           Directory1/ (with unrelated content)
           Directory2/ (with unrelated content)
           TheDirIWantToMigrate/
                           branches/
                                          crazy-feature/
                                                          UnrelatedApp1/
                                                          TheAppIWantToMigrate/
                           tags/
                                          v1.001/
                                                          UnrelatedApp1/
                                                          UnrelatedApp2/
                                                          TheAppIWantToMigrate/
                           trunk/
                                          UnrelatedApp1/
                                          UnrelatedApp2/
                                          TheAppIWantToMigrate/

       You want to transform the dump file so that TheAppIWantToMigrate can be subject to a regular branchy
       lift. A way to dissect out the code of interest would be with the following series of filters applied:

           repocutter expunge '^Directory1' '^Directory2'
           repocutter pathrename '^TheDirIWantToMigrate/' ''
           repocutter expunge '^branches/crazy-feature/UnrelatedApp1/'
           repocutter pathrename 'branches/crazy-feature/TheAppIWantToMigrate/' 'branches/crazy-feature/'
           repocutter expunge '^tags/v1.001/UnrelatedApp1/'
           repocutter expunge '^tags/v1.001/UnrelatedApp2/'
           repocutter pathrename '^tags/v1.001/TheAppIWantToMigrate/' 'tags/v1.001/'
           repocutter expunge '^trunk/UnrelatedApp1/'
           repocutter expunge '^trunk/UnrelatedApp2/'
           repocutter pathrename '^trunk/TheAppIWantToMigrate/' 'trunk/'

LIMITATIONS

       The sift and expunge operations can produce output dumps that are invalid. The problem is copyfrom
       operations (Subversion branch and tag creations). If an included revision includes a copyfrom reference
       to an excluded one, the reference target won’t be in the emitted dump; it won’t load correctly in
       Subversion, and while reposurgeon has fallback logic that backs down to the latest existing revision
       before the missing one, this expedient is fragile. The revision number in a copyfrom header pointing to a
       missing revision will be zero. Attempts to be clever about this won’t work; the problem is inherent in
       the data model of Subversion.

AUTHOR

       Eric S. Raymond <esr@thyrsus.com>. This tool is distributed with reposurgeon; see the project page
       <http://www.catb.org/~esr/reposurgeon>.

                                                   2022-01-12                                      REPOCUTTER(1)