Warning message with large ZIP file: Format error reading ZIP file

Started by Longshou, August 21, 2018, 07:32:08 PM


Longshou

Hi,
Thank you very much for ExifTool! We use it to identify corrupted files, and it works well. However, with ZIP files larger than 4 GB, we see the warning "Format error reading ZIP file":

$ exiftool -api largefilesupport=1 -validate -warning -error -a Archive.zip
Warning                         : Format error reading ZIP file

$ exiftool -api largefilesupport=1 Archive.zip
ExifTool Version Number         : 10.15
File Name                       : Archive.zip
Directory                       : .
File Size                       : 5762 MB
File Modification Date/Time     : 2018:08:17 20:15:51-07:00
File Access Date/Time           : 2018:08:21 15:33:11-07:00
File Inode Change Date/Time     : 2018:08:17 20:15:51-07:00
File Permissions                : rwxr--r--
Warning                         : Format error reading ZIP file
File Type                       : ZIP
File Type Extension             : zip
MIME Type                       : application/zip
Zip Required Version            : 20
Zip Bit Flag                    : 0
Zip Compression                 : Deflated
Zip Modify Date                 : 2017:11:22 11:47:03
Zip CRC                         : 0xd34951ca
Zip Compressed Size             : 180413
Zip Uncompressed Size           : 836168
Zip File Name                   : IVT-CONNECT-750_Object_Counts.nc


It seems to fail to read through all of the files contained in a large ZIP archive. We've tested on Mac and Linux, and this happens with the latest version, 11.10, as well as with older versions. Is there something wrong with LargeFileSupport for the ZIP format? Please let me know if you need more information. Thanks.

Best Regards,
Longshou

Phil Harvey

Thanks.  I'll look into this and post back when I have something to report.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Phil Harvey

Unfortunately it seems that Archive::Zip doesn't support zip files larger than 4GB (this is after updating to Archive::Zip 1.62):

> exiftool test.zip -v3
  ExifToolVersion = 11.10
  FileName = test.zip
[...]
  --- using Archive::Zip ---
  Warning = Format error reading ZIP file
  -- processing as binary data --
  FileType = ZIP
  FileTypeExtension = ZIP
[...]


When Archive::Zip fails, ExifTool processes the file itself using a simplified algorithm which doesn't support decompression, decryption, or the 64-bit extensions.  I've looked briefly into the 64-bit extensions, and it would require a complete re-write of the algorithm to be able to support these.  So it looks like 64-bit support isn't going to happen soon.  :(
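For anyone who wants to check whether a given archive depends on the 64-bit (ZIP64) extensions mentioned above, here is a minimal Python sketch. It is not taken from ExifTool or Archive::Zip; the function name `uses_zip64` is made up for illustration. It assumes the standard ZIP layout, in which a ZIP64 archive places a 20-byte "ZIP64 end of central directory locator" directly before the ordinary end-of-central-directory record, or saturates that record's 16-bit and 32-bit fields.

```python
import struct

EOCD_SIG = b"PK\x05\x06"           # end of central directory record
ZIP64_LOCATOR_SIG = b"PK\x06\x07"  # ZIP64 end of central directory locator
EOCD_MIN_SIZE = 22                 # fixed part of the EOCD record
ZIP64_LOCATOR_SIZE = 20
MAX_COMMENT = 0xFFFF               # the archive comment is at most 65535 bytes


def uses_zip64(path):
    """Return True if the archive at `path` appears to need ZIP64 support."""
    with open(path, "rb") as f:
        f.seek(0, 2)
        file_size = f.tell()
        # Only the tail of the file can hold the EOCD record and the locator.
        tail_len = min(file_size, EOCD_MIN_SIZE + MAX_COMMENT + ZIP64_LOCATOR_SIZE)
        f.seek(file_size - tail_len)
        tail = f.read(tail_len)

    # Simplification: take the last occurrence of the signature. A comment that
    # happens to contain the same four bytes would confuse this sketch.
    eocd_pos = tail.rfind(EOCD_SIG)
    if eocd_pos < 0 or eocd_pos + EOCD_MIN_SIZE > len(tail):
        raise ValueError("end-of-central-directory record not found")

    fields = struct.unpack("<4sHHHHIIH", tail[eocd_pos:eocd_pos + EOCD_MIN_SIZE])
    total_entries, cd_size, cd_offset = fields[4], fields[5], fields[6]

    # Case 1: a ZIP64 locator sits immediately before the EOCD record.
    has_locator = (
        eocd_pos >= ZIP64_LOCATOR_SIZE
        and tail[eocd_pos - ZIP64_LOCATOR_SIZE:eocd_pos - ZIP64_LOCATOR_SIZE + 4]
        == ZIP64_LOCATOR_SIG
    )
    # Case 2: saturated fields mean the real values live in ZIP64 records.
    saturated = (
        total_entries == 0xFFFF
        or cd_size == 0xFFFFFFFF
        or cd_offset == 0xFFFFFFFF
    )
    return has_locator or saturated
```

Calling `uses_zip64("Archive.zip")` before running a reader without 64-bit support would flag archives that are likely to trigger this kind of warning. The check only looks at the end of the file, so it stays fast on multi-gigabyte archives.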

- Phil

Longshou

Hi Phil,
I am glad that you've found the underlying issue with it. Thank you very much for looking into it. I hope this can be fixed in the near future.
Longshou

StarGeek

Is using another module a possibility?  My random Google search seems to indicate this has been a problem with Archive::Zip since 2011.  One of the options mentioned there is to use IO::Compress::Zip and IO::Uncompress::Unzip, though I haven't checked whether those modules allow reading data about the file.
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

Phil Harvey

Yes, thanks.  I saw that, but I haven't looked at the interface to see how much would need changing.  I think I picked Archive::Zip because it was part of the standard Perl installation on Linux and Mac, and I don't think that IO::Uncompress::Unzip was, but things may have changed since 2007 when I added ZIP file support.

- Phil

StarGeek
