Provided by: perl_5.40.1-5_amd64 bug

NAME

       zipdetails - display the internal structure of zip files

SYNOPSIS

           zipdetails [options] zipfile.zip

DESCRIPTION

       This program creates a detailed report on the internal structure of zip files. For each item of metadata
       within a zip file the program will output

       the offset into the zip file where the item is located.
       a textual representation for the item.
       an optional hex dump of the item.

       The  program assumes a prior understanding of the internal structure of Zip files. You should have a copy
       of the zip file definition, APPNOTE.TXT <https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT>, at
       hand to help understand the output from this program.

   Default Behaviour
       By default the program expects to be given a well-formed zip file.  It will  navigate  the  zip  file  by
       first  parsing  the zip "Central Directory" at the end of the file.  If the "Central Directory" is found,
       it will then walk sequentally through the zip records  starting  at  the  beginning  of  the  file.   See
       "Advanced Analysis" for other processing options.

       If  the  program  finds any structural or portability issues with the zip file it will print a message at
       the point it finds the issue and/or in a summary at the end of the  output  report.  Whilst  the  set  of
       issues  that  can  be  detected  it  exhaustive, don't assume that this program can find all the possible
       issues in a zip file - there are likely edge conditions that need to be addressed.

       If you have suggestions  for  use-cases  where  this  could  be  enhanced  please  consider  creating  an
       enhancement request (see "SUPPORT").

       Date & Time fields

       Date/time  fields found in zip files are displayed in local time. Use the "--utc" option to display these
       fields in Coordinated Universal Time (UTC).

       Filenames & Comments

       Filenames and comments are decoded/encoded  using  the  default  system  encoding  of  the  host  running
       "zipdetails". When the sytem encoding cannot be determined "cp437" will be used.

       The exceptions are

       •    when the "Language Encoding Flag" is set in the zip file, the filename/comment fields are assumed to
            be encoded in UTF-8.

       •    the definition for the metadata field implies UTF-8 charset encoding

       See "Filename Encoding Issues" and "Filename & Comment Encoding Options" for ways to control the encoding
       of filename/comment fields.

   OPTIONS
       General Options

       "-h", "--help"
            Display help

       "--redact"
            Obscure  filenames  and  payload  data  in  the  output.  Handy for the use case where the zip files
            contains sensitive data that cannot be shared.

       "--scan"
            Pessimistically scan the zip file loking for possible zip records.  Can  be  error-prone.  For  very
            large  zip  files  this  option  is  slow.  Consider  using the "--walk" option first. See "Advanced
            Analysis Options"

       "--utc"
            By default, date/time fields are displayed in local time. Use this option  to  display  them  in  in
            Coordinated Universal Time (UTC).

       "-v" Enable Verbose mode. See "Verbose Output".

       "--version"
            Display version number of the program and exit.

       "--walk"
            Optimistically walk the zip file looking for possible zip records.  See "Advanced Analysis Options"

       Filename & Comment Encoding Options

       See "Filename Encoding Issues"

       "--encoding name"
            Use encoding "name" when reading filenames/comments from the zip file.

            When this option is not specified the default the system encoding is used.

       " --no-encoding"
            Disable all filename & comment encoding/decoding. Filenames/comments are processed as byte streams.

            This option is not enabled by default.

       "--output-encoding name"
            Use  encoding  "name" when writing filename/comments to the display.  By default the system encoding
            will be used.

       "--language-encoding", "--no-language-encoding"
            Modern zip files set a metadata entry in zip files, called the "Language encoding flag",  when  they
            write filenames/comments encoded in UTF-8.

            Occasionally  some applications set the "Language Encoding Flag" but write data that is not UTF-8 in
            the filename/comment fields of the zip file. This will usually result in garbled text  being  output
            for the filenames/comments.

            To  deal  with  this  use-case,  set  the  "--no-language-encoding"  option  and, if needed, set the
            "--encoding name" option to encoding actually used.

            Default is "--language-encoding".

       "--debug-encoding"
            Display extra debugging info when a filename/comment encoding has changed.

       Message Control Options

       "--messages", "--no-messages"
            Enable/disable the output of all info/warning/error messages.

            Disabling messages means that no checks are carried out to check that the zip file is well-formed.

            Default is enabled.

       "--exit-bitmask", "--no-exit-bitmask"
            Enable/disable exit status bitmask for messages. Default disabled.  Bitmask values are: 1 for  info,
            2 for warning and 4 for error.

   Default Output
       By default "zipdetails" will output each metadata field from the zip file in three columns.

       1.   The offset, in hex, to the start of the field relative to the beginning of the file.

       2.   The name of the field.

       3.   Detailed information about the contents of the field. The format depends on the type of data:

            •    Numeric Values

                 If the field contains an 8-bit, 16-bit, 32-bit or 64-bit numeric value, it will be displayed in
                 both hex and decimal -- for example ""002A (42)"".

                 Note  that  Zip  files store most numeric values in little-endian encoding (there area few rare
                 instances where big-endian is used). The value read from the zip  file  will  have  the  endian
                 encoding removed before being displayed.

                 Next, is an optional description of what the numeric value means.

            •    String

                 If the field corresponds to a printable string, it will be output enclosed in single quotes.

            •    Binary Data

                 The  term  Binary Data is just a catch-all for all other metadata in the zip file. This data is
                 displayed as a series of ascii-hex byte values in the same order they are  stored  in  the  zip
                 file.

       For example, assuming you have a zip file, "test,zip", with one entry

           $ unzip -l  test.zip
           Archive:  test.zip
           Length      Date    Time    Name
           ---------  ---------- -----   ----
               446  2023-03-22 20:03   lorem.txt
           ---------                     -------
               446                     1 file

       Running "zipdetails" will gives this output

           $ zipdetails test.zip

           0000 LOCAL HEADER #1       04034B50 (67324752)
           0004 Extract Zip Spec      14 (20) '2.0'
           0005 Extract OS            00 (0) 'MS-DOS'
           0006 General Purpose Flag  0000 (0)
                [Bits 1-2]            0 'Normal Compression'
           0008 Compression Method    0008 (8) 'Deflated'
           000A Modification Time     5676A072 (1450614898) 'Wed Mar 22 20:03:36 2023'
           000E CRC                   F90EE7FF (4178503679)
           0012 Compressed Size       0000010E (270)
           0016 Uncompressed Size     000001BE (446)
           001A Filename Length       0009 (9)
           001C Extra Length          0000 (0)
           001E Filename              'lorem.txt'
           0027 PAYLOAD

           0135 CENTRAL HEADER #1     02014B50 (33639248)
           0139 Created Zip Spec      1E (30) '3.0'
           013A Created OS            03 (3) 'Unix'
           013B Extract Zip Spec      14 (20) '2.0'
           013C Extract OS            00 (0) 'MS-DOS'
           013D General Purpose Flag  0000 (0)
                [Bits 1-2]            0 'Normal Compression'
           013F Compression Method    0008 (8) 'Deflated'
           0141 Modification Time     5676A072 (1450614898) 'Wed Mar 22 20:03:36 2023'
           0145 CRC                   F90EE7FF (4178503679)
           0149 Compressed Size       0000010E (270)
           014D Uncompressed Size     000001BE (446)
           0151 Filename Length       0009 (9)
           0153 Extra Length          0000 (0)
           0155 Comment Length        0000 (0)
           0157 Disk Start            0000 (0)
           0159 Int File Attributes   0001 (1)
                [Bit 0]               1 'Text Data'
           015B Ext File Attributes   81ED0000 (2179792896)
                [Bits 16-24]          01ED (493) 'Unix attrib: rwxr-xr-x'
                [Bits 28-31]          08 (8) 'Regular File'
           015F Local Header Offset   00000000 (0)
           0163 Filename              'lorem.txt'

           016C END CENTRAL HEADER    06054B50 (101010256)
           0170 Number of this disk   0000 (0)
           0172 Central Dir Disk no   0000 (0)
           0174 Entries in this disk  0001 (1)
           0176 Total Entries         0001 (1)
           0178 Size of Central Dir   00000037 (55)
           017C Offset to Central Dir 00000135 (309)
           0180 Comment Length        0000 (0)
           #
           # Done

   Verbose Output
       If the "-v" option is present, the metadata output is split into the following columns:

       1.   The offset, in hex, to the start of the field relative to the beginning of the file.

       2.   The offset, in hex, to the end of the field relative to the beginning of the file.

       3.   The length, in hex, of the field.

       4.   A hex dump of the bytes in field in the order they are stored in the zip file.

       5.   A textual description of the field.

       6.   Information  about  the  contents of the field. See the description in the "Default Output" for more
            details.

       Here is the same zip file, "test.zip", dumped using the "zipdetails" "-v" option:

           $ zipdetails -v test.zip

           0000 0003 0004 50 4B 03 04 LOCAL HEADER #1       04034B50 (67324752)
           0004 0004 0001 14          Extract Zip Spec      14 (20) '2.0'
           0005 0005 0001 00          Extract OS            00 (0) 'MS-DOS'
           0006 0007 0002 00 00       General Purpose Flag  0000 (0)
                                      [Bits 1-2]            0 'Normal Compression'
           0008 0009 0002 08 00       Compression Method    0008 (8) 'Deflated'
           000A 000D 0004 72 A0 76 56 Modification Time     5676A072 (1450614898) 'Wed Mar 22 20:03:36 2023'
           000E 0011 0004 FF E7 0E F9 CRC                   F90EE7FF (4178503679)
           0012 0015 0004 0E 01 00 00 Compressed Size       0000010E (270)
           0016 0019 0004 BE 01 00 00 Uncompressed Size     000001BE (446)
           001A 001B 0002 09 00       Filename Length       0009 (9)
           001C 001D 0002 00 00       Extra Length          0000 (0)
           001E 0026 0009 6C 6F 72 65 Filename              'lorem.txt'
                          6D 2E 74 78
                          74
           0027 0134 010E ...         PAYLOAD

           0135 0138 0004 50 4B 01 02 CENTRAL HEADER #1     02014B50 (33639248)
           0139 0139 0001 1E          Created Zip Spec      1E (30) '3.0'
           013A 013A 0001 03          Created OS            03 (3) 'Unix'
           013B 013B 0001 14          Extract Zip Spec      14 (20) '2.0'
           013C 013C 0001 00          Extract OS            00 (0) 'MS-DOS'
           013D 013E 0002 00 00       General Purpose Flag  0000 (0)
                                      [Bits 1-2]            0 'Normal Compression'
           013F 0140 0002 08 00       Compression Method    0008 (8) 'Deflated'
           0141 0144 0004 72 A0 76 56 Modification Time     5676A072 (1450614898) 'Wed Mar 22 20:03:36 2023'
           0145 0148 0004 FF E7 0E F9 CRC                   F90EE7FF (4178503679)
           0149 014C 0004 0E 01 00 00 Compressed Size       0000010E (270)
           014D 0150 0004 BE 01 00 00 Uncompressed Size     000001BE (446)
           0151 0152 0002 09 00       Filename Length       0009 (9)
           0153 0154 0002 00 00       Extra Length          0000 (0)
           0155 0156 0002 00 00       Comment Length        0000 (0)
           0157 0158 0002 00 00       Disk Start            0000 (0)
           0159 015A 0002 01 00       Int File Attributes   0001 (1)
                                      [Bit 0]               1 'Text Data'
           015B 015E 0004 00 00 ED 81 Ext File Attributes   81ED0000 (2179792896)
                                      [Bits 16-24]          01ED (493) 'Unix attrib: rwxr-xr-x'
                                      [Bits 28-31]          08 (8) 'Regular File'
           015F 0162 0004 00 00 00 00 Local Header Offset   00000000 (0)
           0163 016B 0009 6C 6F 72 65 Filename              'lorem.txt'
                          6D 2E 74 78
                          74

           016C 016F 0004 50 4B 05 06 END CENTRAL HEADER    06054B50 (101010256)
           0170 0171 0002 00 00       Number of this disk   0000 (0)
           0172 0173 0002 00 00       Central Dir Disk no   0000 (0)
           0174 0175 0002 01 00       Entries in this disk  0001 (1)
           0176 0177 0002 01 00       Total Entries         0001 (1)
           0178 017B 0004 37 00 00 00 Size of Central Dir   00000037 (55)
           017C 017F 0004 35 01 00 00 Offset to Central Dir 00000135 (309)
           0180 0181 0002 00 00       Comment Length        0000 (0)
           #
           # Done

   Advanced Analysis
       If you have a corrupt or non-standard zip file, particulatly one where the "Central  Directory"  metadata
       at  the  end  of  the  file  is absent/incomplete, you can use either the "--walk" option or the "--scan"
       option to search for any zip metadata that is still present in the file.

       When either of these options is enabled, this program  will  bypass  the  initial  step  of  reading  the
       "Central  Directory"  at  the end of the file and simply scan the zip file sequentially from the start of
       the file looking for zip metedata records. Although this can be error prone, for the most  part  it  will
       find any zip file metadata that is still present in the file.

       The  difference between the two options is how aggressive the sequential scan is: "--walk" is optimistic,
       while "--scan" is pessimistic.

       To understand the difference in more detail you need to know  a  bit  about  how  zip  file  metadata  is
       structured.  Under the hood, a zip file uses a series of 4-byte signatures to flag the start of a each of
       the metadata records it uses. When the "--walk" or the "--scan" option is enabled both  work  identically
       by  scanning  the  file from the beginning looking for any the of these valid 4-byte metadata signatures.
       When a 4-byte signature is found both options will blindly assume that  it  has  found  a  vald  metadata
       record and display it.

       "--walk"

       The "--walk" option optimistically assumes that it has found a real zip metatada record and so starts the
       scan for the next record directly after the record it has just output.

       "--scan"

       The  "--scan"  option  is  pessimistic  and  assumes the 4-byte signature sequence may have been a false-
       positive, so before starting the scan for the next resord, it will rewind to the  location  in  the  file
       directly  after  the  4-byte  sequecce it just processed. This means it will rescan data that has already
       been processed.  For very lage zip files the "--scan" option can be really  realy  slow,  so  trying  the
       "--walk" option first.

       Important  Note: If the zip file being processed contains one or more nested zip files, and the outer zip
       file uses the "STORE" compression method, the "--scan" option will display the zip metadata for both  the
       outer & inner zip files.

   Filename Encoding Issues
       Sometimes  when  displaying  the contents of a zip file the filenames (or comments) appear to be garbled.
       This section walks through the reasons and mitigations that can be applied to work around these issues.

       Background

       When zip files were first created in the 1980's, there was no Unicode or UTF-8. Issues  around  character
       set encoding interoperability were not a major concern.

       Initially, the only official encoding supported in zip files was IBM Code Page 437 (AKA "CP437"). As time
       went  on  users  in  locales  where "CP437" wasn't appropriate stored filenames in the encoding native to
       their locale.  If you were running a system that matched the locale of the zip file,  all  was  well.  If
       not, you had to post-process the filenames after unzipping the zip file.

       Fast  forward  to  the introduction of Unicode and UTF-8 encoding. The approach now used by all major zip
       implementations is to set the "Language encoding flag" (also known as "EFS") in the zip file metadata  to
       signal that a filename/comment is encoded in UTF-8.

       To  ensure  maximum  interoperability when sharing zip files store 7-bit filenames as-is in the zip file.
       For anything else the "EFS" bit needs to be set and the filename is encoded in UTF-8. Although this  rule
       is kept to for the most part, there are exceptions out in the wild.

       Dealing with Encoding Errors

       The most common filename encoding issue is where the "EFS" bit is not set and the filename is stored in a
       character  set that doesnt't match the system encoding. This mostly impacts legacy zip files that predate
       the introduction of Unicode.

       To deal with this issue you first need to know what encoding was used in the zip file.  For  example,  if
       the filename is encoded in "ISO-8859-1" you can display the filenames using the "--encoding" option

           zipdetails --encoding ISO-8859-1 myfile.zip

       A  less  common  variation  of  this  is where the "EFS" bit is set, signalling that the filename will be
       encoded in UTF-8, but the filename is not encoded  in  UTF-8.  To  deal  with  this  scenarion,  use  the
       "--no-language-encoding" option along with the "--encoding" option.

LIMITATIONS

       The following zip file features are not supported by this program:

       •    Multi-part/Split/Spanned Zip Archives.

            This program cannot give an overall report on the combined parts of a multi-part zip file.

            The  best  you  can do is run with either the "--scan" or "--walk" options against individual parts.
            Some will contains zipfile metadata which will be detected and some  will  only  contain  compressed
            payload data.

       •    Encrypted Central Directory

            When  pkzip  Strong  Encryption  is  enabled  in a zip file this program can still parse most of the
            metadata in the zip file. The exception is when the "Central  Directory"  of  a  zip  file  is  also
            encrypted. This program cannot parse any metadata from an encrypted "Central Directory".

       •    Corrupt Zip files

            When "zipdetails" encounters a corrupt zip file, it will do one or more of the following

            •    Display details of the corruption and carry on

            •    Display details of the corruption and terminate

            •    Terminate with a generic message

            Which of the above is output is dependent in the severity of the corruption.

TODO

   JSON/YML Output
       Output some of the zip file metadata as a JSON or YML document.

   Corrupt Zip files
       Although  the  detection  and  reporting  of  most  of  the  common  corruption  use-cases  is present in
       "zipdetails", there are likely to be other edge cases that need to be supported.

       If you have a corrupt Zip file that isn't being processed properly, please report it (see  "SUPPORT").

SUPPORT

       General feedback/questions/bug reports should be sent to <https://github.com/pmqs/zipdetails/issues>.

SEE ALSO

       The        primary         reference         for         Zip         files         is         APPNOTE.TXT
       <https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT>.

       An     alternative     reference     is    the    Info-Zip    appnote.    This    is    available    from
       <ftp://ftp.info-zip.org/pub/infozip/doc/>

       For details of WinZip AES encryption see AES Encryption Information: Encryption  Specification  AE-1  and
       AE-2 <https://www.winzip.com/en/support/aes-encryption/>.

       The  "zipinfo"  program  that  comes with the info-zip distribution (<http://www.info-zip.org/>) can also
       display details of the structure of a zip file.

AUTHOR

       Paul Marquess pmqs@cpan.org.

COPYRIGHT

       Copyright (c) 2011-2024 Paul Marquess. All rights reserved.

       This program is free software; you can redistribute it and/or modify it under  the  same  terms  as  Perl
       itself.

perl v5.40.1                                       2025-07-03                                      ZIPDETAILS(1)