Hi Phil,
I have a PDF file created in Adobe Illustrator. When I read the metadata from the file:
exiftool -fast -j -c %+.6f 1KwRy3PXvWXNpsPCkwBM3.pdf > pdf-file.json
The pdf-file.json has the expected:
"FileType": "PDF",
"FileTypeExtension": "pdf",
"MIMEType": "application/pdf"
I then read the same file as a stream:
curl -s -S -f file:///Users/jonharvey/Downloads/exiftool-pdf-issue/1KwRy3PXvWXNpsPCkwBM3.pdf | exiftool -fast -j -c %+.6f - > pdf-stream.json
However, the pdf-stream.json has:
"FileType": "AI",
"FileTypeExtension": "ai",
"MIMEType": "application/vnd.adobe.illustrator"
I'd like the stream output to match the file output, so that FileType, FileTypeExtension and MIMEType reflect the original .pdf extension.
Is this possible with an exiftool option, or am I missing a specific curl option?
Thanks,
Jon
cURL from a local file? Interesting. Didn't realize that was something cURL could do.
Try removing the -fast option (https://exiftool.org/exiftool_pod.html#fast-NUM). I know that when I use that option with cURL on an online file, the whole file is not downloaded, so maybe some of the data needed to properly identify the file is getting cut off.
Hi,
I tried removing the -fast option for the stream input, i.e.:
curl -s -S -f file:///Users/jonharvey/Downloads/exiftool-pdf-issue/1KwRy3PXvWXNpsPCkwBM3.pdf | exiftool -j -c %+.6f - > pdf-stream2.json
But I still get:
"FileType": "AI",
"FileTypeExtension": "ai",
"MIMEType": "application/vnd.adobe.illustrator"
Phil will have to comment on this. I can't replicate the problem locally with the PDFs I tried.
Thanks for the quick reply and for taking a look, StarGeek.
I have had a chance to look at the source code. There is logic in lib/Image/ExifTool/PDF.pm, starting at line 364, which uses the file's FILE_EXT to decide whether to set the FileType to AI or leave it as PDF. Here it is:
364     Illustrator => {
365         # assume this is an illustrator file if it contains this directory
366         # and doesn't have a ".PDF" extension
367         Condition => q{
368             $self->OverrideFileType("AI") unless $$self{FILE_EXT} and $$self{FILE_EXT} eq 'PDF';
369             return 1;
370         },
371         SubDirectory => { TagTable => 'Image::ExifTool::PDF::Illustrator' },
372     },
So my guess is that when the input is a stream, FILE_EXT is not set, so the FileType becomes AI.
When the input is a file, FILE_EXT is set to PDF, so the FileType remains PDF.
But like I said that's just my guess...
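To make that guess concrete, here is a small Python model of the quoted Condition. It is only an illustration of the logic as I read it, not ExifTool code: the function name and the has_illustrator_dir parameter are made up for this sketch, and it assumes FILE_EXT holds the upper-cased extension (which the comparison against 'PDF' suggests).

```python
from typing import Optional


def detect_file_type(file_ext: Optional[str], has_illustrator_dir: bool) -> str:
    """Hypothetical model of the Condition quoted from PDF.pm (not ExifTool code)."""
    file_type = "PDF"  # starting point for a PDF-structured file
    # Perl: OverrideFileType("AI") unless $$self{FILE_EXT} and $$self{FILE_EXT} eq 'PDF'
    # The override only runs when the Illustrator directory is present.
    if has_illustrator_dir and not (file_ext and file_ext == "PDF"):
        file_type = "AI"
    return file_type


# File input: the extension is known (assumed to be stored upper-cased).
print(detect_file_type("PDF", True))
# Stream input ("-"): there is no file name, so no extension either.
print(detect_file_type(None, True))
```

Under these assumptions the first call gives PDF and the second gives AI, which matches the two JSON outputs above.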
If I could somehow pass in the input stream's file name as an option so that FILE_EXT is set, then PDF.pm would probably behave the same as it does with a file input.
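In the meantime, one possible workaround is to spool the stream to a temporary file that has a .pdf extension and run exiftool on that file instead of on "-". The sketch below is untested against the problem file and rests on the guess above being right; the helper names spool_to_temp and read_metadata are made up for this example, and only the -j and -c options already used in this thread are passed to exiftool.

```python
import shutil
import subprocess
import tempfile
from typing import BinaryIO


def spool_to_temp(stream: BinaryIO, suffix: str = ".pdf") -> str:
    """Copy a binary stream to a named temporary file and return its path."""
    # delete=False keeps the file on disk after the with-block closes it,
    # so exiftool can open it by name; the caller must remove it afterwards.
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        shutil.copyfileobj(stream, tmp)
        return tmp.name


def read_metadata(path: str) -> str:
    """Run exiftool on a file path and return its JSON output as text."""
    result = subprocess.run(
        ["exiftool", "-j", "-c", "%+.6f", path],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if exiftool exits non-zero
    )
    return result.stdout
```

Calling read_metadata(spool_to_temp(sys.stdin.buffer)) from a script fed by the curl pipe would then take the file-input path through PDF.pm. Note that SourceFile and FileName in the JSON would show the temporary name rather than the original one, and the temporary file has to be deleted by the caller.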
Thanks again,
Jon
Illustrator files are difficult to distinguish from regular PDF files so ExifTool uses the file extension as a clue. Normally I try to avoid this if possible.
- Phil