.docx thumbnail extraction

Started by TopSolid, April 27, 2011, 03:17:07 AM

TopSolid

Hi,

For the ResourceSpace application, I am trying to use exiftool to extract the embedded thumbnail of Office documents (.docx, .doc, etc.).
With this command line on Windows, it works for a JPG image:

exiftool.exe -b -previewimage 1.jpg > image_exif.jpg

But it doesn't work with .docx:
exiftool.exe -b -previewimage 1.docx > image_exif.jpg

However, with exiftoolGUI I can see the thumbnail of the .docx, so it should work.
I know it has something to do with the zip feature in this case, but I don't know how to use it.

Does anyone have an idea?
I am attaching a sample Word file for testing.

Thanks in advance!




TopSolid

I have installed the complete Perl version. Reading the source code, it looks like exiftool does not handle the WMF format, which is what the docx thumbnail uses once the docx is unzipped. It only handles JPEG and PNG bitmaps... Any ideas, guys?

Phil Harvey

ExifToolGUI uses windows routines to display the preview.  ExifTool only extracts embedded JPEG previews, and I don't think DOCX files have one.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

TopSolid

Hi Phil,

Thanks for your answer.
In the unzipped docx, the preview is saved in WMF format as docProps/thumbnail.wmf. Is there any plan to support this format in the next version of exiftool?
It would be fantastic.

Rgds

David


Phil Harvey

Hi David,

Thanks for the suggestion, I'll look into this when I get back after my vacation (in 2 weeks or so).

- Phil

TopSolid

Thanks! :)

Keep me posted and have a nice vacation.

Rgds

David

TopSolid

Hello Phil,

Any news about that wmf support by exiftool?

Thanks

David

Phil Harvey

I put that on my list but I haven't done anything about it yet.  Sometimes I need a little prompting.  Thanks.

Phil Harvey

Adding the ability to extract the WMF thumbnail looks easy but all of my DOCX samples use a JPEG thumbnail.  Could you email me one with a WMF thumbnail for testing?  My mail is philharvey66 at gmail.com

- Phil

TopSolid

Hi Phil,

I just sent you some docx files. Which Office version did you use (2003/2007/2010)? I have Office 2007 on Windows 7 x64. The thumbnail is always in WMF format for Word, and JPEG for pptx and xlsx.

Thanks!

TopSolid

There is also a docx attached to my first post...

Keep me posted!

Thanks

Phil Harvey

Great, thanks.  The more samples the better.  There are various flavours of WMF files, and I'd like to see if I can extract some information from them too.

I was using MS Office for the Mac.

- Phil

TopSolid

Hi Phil,

I have seen that the latest release, 8.59, now supports the WMF thumbnail in docx files, great news!  :)
So I tried to test it, but the resulting jpg file is always empty.

I have tried this:
exiftool.exe -b -previewimage 1.docx > 1.jpg

and this:
exiftool.exe -b -ThumbnailImage 1.docx > 1.jpg

I am attaching the docx file.

Any idea?

David

Phil Harvey


TopSolid

Phil,

Thanks, it works with this command line:
exiftool -b -previewwmf 1.docx > 1.wmf
The result is always the original WMF file, not a JPG file.
But I was expecting exiftool to convert the WMF file to JPG.

When you announced "Recognize WMF images", is conversion to JPG something exiftool could support without any graphics toolbox?

Thanks!
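
Since the redirected output can end up empty, or hold WMF bytes under a .jpg name, it may help to check the first bytes of the file before passing it on to ResourceSpace. Below is a rough sketch that guesses the format from well-known file signatures. The function name sniff_image_format is hypothetical, and the WMF check only covers the "placeable" header variant, so an "unknown" result does not prove the data is not WMF.

```python
def sniff_image_format(data: bytes) -> str:
    """Guess the image format from leading signature bytes."""
    if data[:3] == b"\xff\xd8\xff":
        return "jpeg"
    if data[:8] == b"\x89PNG\r\n\x1a\n":
        return "png"
    # Placeable WMF files start with the key 0x9AC6CDD7 (little-endian).
    # Assumption: WMF files without this optional header are not detected here.
    if data[:4] == b"\xd7\xcd\xc6\x9a":
        return "wmf"
    # EMF: first record type is 1, and the header carries " EMF" at offset 40.
    if data[:4] == b"\x01\x00\x00\x00" and data[40:44] == b" EMF":
        return "emf"
    # Empty output (for example, a tag that was not found) also lands here.
    return "unknown"


if __name__ == "__main__":
    print(sniff_image_format(b"\xff\xd8\xff\xe0"))
    print(sniff_image_format(b"\xd7\xcd\xc6\x9a" + b"\x00" * 18))
    print(sniff_image_format(b""))
```

If the result is "wmf", turning it into a JPG needs a separate rendering step with a graphics tool, which is outside what this sketch covers.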