Hi, I tried to find an example in the forum of modifying metadata such as Author, Title and Subject in a PDF file using the Windows command-line version, with no luck. I know it should be simple, but I can't find one. Any example would be greatly appreciated. I need to create a batch file (.bat) for 5,000 PDF files, with specific metadata for each file. Adobe Acrobat is a waste of time :o
Thanks,
Andy
At its most basic, you would do something like
exiftool -Author="Charles Dickens" -Title="A Tale of Two Cities" -Subject="A historical novel set during the French Revolution" /path/to/files/
But you need to be aware that there are other tags with those names, and there are many tags that other programs might call by a different name; see FAQ #3 (https://exiftool.org/faq.html#Q3) for details. The above command used on a PDF will write to both the PDF tags and the XMP tags, which should cover all the bases.
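For a batch of files with different metadata each, the command above can be generated once per file. Here is a minimal Python sketch of that idea. The records, paths and helper names (quote_for_bat, build_command) are hypothetical, and the quoting is deliberately simplified, so treat it as a starting point rather than a finished generator.

```python
# Sketch: build one exiftool command line per PDF from per-file metadata.
# The records below are made-up sample data, not taken from the thread.

def quote_for_bat(value):
    """Wrap a value in double quotes for use inside a .bat file (simplified)."""
    if '"' in value or "\n" in value or "\r" in value:
        # Embedded quotes and line breaks need more care than this sketch gives.
        raise ValueError("unsupported character in value: %r" % value)
    # A literal percent sign must be doubled inside a batch file.
    return '"' + value.replace("%", "%%") + '"'

def build_command(record):
    """Return an exiftool command that sets Author, Title and Subject."""
    parts = ["exiftool"]
    for tag in ("Author", "Title", "Subject"):
        parts.append("-%s=%s" % (tag, quote_for_bat(record[tag])))
    parts.append(quote_for_bat(record["path"]))
    return " ".join(parts)

records = [
    {
        "path": "c:/pdfs/book1.pdf",
        "Author": "Charles Dickens",
        "Title": "A Tale of Two Cities",
        "Subject": "A historical novel",
    },
    {
        "path": "c:/pdfs/book2.pdf",
        "Author": "Jane Austen",
        "Title": "Emma",
        "Subject": "A novel of manners",
    },
]

commands = [build_command(r) for r in records]
print("\n".join(commands))
```

Each printed line can be pasted into a .bat file as is. Values containing double quotes are rejected rather than escaped, since escaping rules on the Windows command line are easy to get wrong.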
Thanks, it got me going... and did what I needed! Great stuff!
A few things...
- There is a problem with the Windows command line if you use accented characters like éèâ, e.g. -author="André âme français". It doesn't send the proper codes to ExifTool, which gives: Warning: Malformed UTF-8 character(s) - c:/exiftool/icfo3281.pdf, and the éèâ are replaced with a ? sign. It must be a problem with the Windows command line. For now I just replaced them with unaccented characters.
- If the metadata is empty when I write to it, I get: Warning: [minor] Ignored empty rdf:Bag list for dc:creator - c:/exiftool/icfo3281.pdf. But everything is OK in the file.
- This is a strange one. When I write to -Subject, the value is also copied into Keywords, and if I write to both -Subject and -Keywords, I still have the Subject at the beginning of Keywords.
I must say, it's warp speed compared to Adobe Acrobat, and it works better than JavaScript in Acrobat! ;D
Quote from: Andy62 on September 22, 2020, 10:46:40 PM
A problem with windows command line if you use UTF-8 character like éèâ: -author="André âme français". It won't send the proper code to ExifTool. So I just replaced them with non-accent. It gives: Warning: Malformed UTF-8 character(s) - c:/exiftool/icfo3281.pdf. It replaces the éèâ with ? sign. Must be a problem with the command line of windows.
The Windows command line doesn't do well with non-ASCII characters. See FAQ #10 (https://sno.phy.queensu.ca/~phil/exiftool/faq.html#Q10) and/or FAQ #18 (https://exiftool.org/faq.html#Q18). For the characters you mention, adding the -L (latin) option (https://exiftool.org/exiftool_pod.html#L--latin) should help.
Quote
This is a strange one. When I write into -Subject it is also copied in Keywords and if I write in -Subject and -keywords, I still have the Subject at the beginning of Keywords.
This is one of the same-named tags I mentioned. By default for Subject, exiftool writes to both PDF:Subject (see PDF tags (https://exiftool.org/TagNames/PDF.html)) and XMP:Subject (see XMP Dublin Core tags (https://exiftool.org/TagNames/XMP.html#dc)). The trouble is that the two tags have completely different uses. The former is a simple string. I believe it's the description, though I haven't checked the specs. The latter, on the other hand, is a list-type tag which is the XMP version of keywords.
You can narrow down the tags written by using PDF:Subject or XMP:Subject to limit which tag is written.
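As a rough illustration of the group prefixes, here is a small Python sketch that builds group-qualified arguments. It assumes you want the description in PDF:Subject and the keywords in XMP:Subject, which may or may not match your workflow, and the helper name subject_args is made up for this example. Repeating a list-type tag assignment once per value is how exiftool takes several list items on one command line.

```python
# Sketch: build group-qualified exiftool arguments so the description
# and the keyword list end up in separate tags.
# subject_args is a hypothetical helper name for this example.

def subject_args(description, keywords):
    """Return exiftool arguments for one description plus a keyword list."""
    # PDF:Subject is a simple string, so it gets a single assignment.
    args = ["-PDF:Subject=" + description]
    # XMP:Subject is a list-type tag, so each keyword gets its own assignment.
    args += ["-XMP:Subject=" + kw for kw in keywords]
    return args

args = subject_args(
    "A historical novel set during the French Revolution",
    ["novel", "history", "France"],
)
for arg in args:
    print(arg)
```

The list is unquoted on purpose: it is meant to be passed as separate arguments (for example to subprocess.run), and would need quoting before being pasted into a .bat file.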
This solves the problem, super! But... there is always a but in programming :-\
I build the command lines in PHP because all my info comes from a MySQL database. This causes problems between encoding types.
The -L option works well if I type the command directly at the Windows command line. But if I paste the same command into a .bat file and run it, it doesn't work, so the .bat file I generate is not in the proper encoding.
Still using -L, this fixes it in PHP: $pdfScript = mb_convert_encoding($pdfScript, "CP850"); before saving the file. Finally, it's a 2 MB .bat file and it works great!
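For anyone generating the batch file from another language, the same encoding step can be sketched in Python. This assumes, as in the post above, that the console uses code page 850. The function name write_bat and the output file name are hypothetical.

```python
# Sketch: save generated commands as a .bat file encoded in CP850,
# mirroring the mb_convert_encoding(..., "CP850") step described above.
import tempfile
from pathlib import Path

def write_bat(path, commands, encoding="cp850"):
    """Write the commands to a .bat file using the given code page."""
    # Batch files conventionally use CRLF line endings.
    script = "\r\n".join(commands) + "\r\n"
    # errors="strict" raises UnicodeEncodeError if a character has no
    # CP850 equivalent, instead of silently writing a "?".
    Path(path).write_bytes(script.encode(encoding, errors="strict"))

commands = [
    'exiftool -L -Author="André âme français" "c:/exiftool/icfo3281.pdf"',
]
out = Path(tempfile.gettempdir()) / "set_metadata_demo.bat"
write_bat(out, commands)
print("wrote", out)
```

Encoding strictly means a character outside CP850 stops the run with an error, which is usually easier to diagnose than a question mark appearing in the metadata later.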
Thank you for your help, much appreciated.