Newbie question: Example of writing metadata to a PDF from the Windows command line?

Started by Andy62, September 22, 2020, 03:57:55 PM


Andy62

Hi, I tried to find an example in the forum of modifying metadata such as Author, Title and Subject in a PDF file using the Windows command-line version, with no luck. I know it should be simple, but I can't find any. Any example would be greatly appreciated. I need to create a batch file (.bat) for 5,000 PDF files with specific metadata for each file. Adobe Acrobat is a waste of time  :o

Thanks,
Andy

StarGeek

At its most basic, you would do something like
exiftool -Author="Charles Dickens" -Title="A Tale of Two Cities" -Subject="A historical novel set during the French Revolution" /path/to/files/

But you need to be aware that there are other tags with those names, and many tags that other programs might call by a different name; see FAQ #3 for details.  The above command used on a PDF will write to both the PDF tags and the XMP tags, which should cover all the bases.
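
For a batch job that needs one such command per file, the lines can be generated from a table of values. Below is a minimal Python sketch, assuming a hypothetical list of records (the file path and the field names are made up for illustration). It only builds the text of each line; it does not run exiftool, and it refuses values containing double quotes rather than trying to escape them.

```python
# Sketch: build one exiftool command line per PDF for a .bat file.
# The record layout (file/author/title/subject) is a hypothetical example.

def bat_quote(value):
    """Quote a value for a cmd.exe batch line (simplified)."""
    if '"' in value:
        # Embedded double quotes need cmd-specific escaping; not handled here.
        raise ValueError("double quotes are not supported in this sketch")
    # In a .bat file a literal percent sign must be written as %%.
    return '"' + value.replace("%", "%%") + '"'


def bat_line(rec):
    """Return the exiftool command that writes Author, Title and Subject."""
    return "exiftool -Author={} -Title={} -Subject={} {}".format(
        bat_quote(rec["author"]),
        bat_quote(rec["title"]),
        bat_quote(rec["subject"]),
        bat_quote(rec["file"]),
    )


records = [
    {
        "file": "c:/pdfs/tale.pdf",  # hypothetical path
        "author": "Charles Dickens",
        "title": "A Tale of Two Cities",
        "subject": "A historical novel set during the French Revolution",
    },
]

lines = [bat_line(rec) for rec in records]
print("\r\n".join(lines))  # batch files conventionally use CRLF line endings
```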
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

Andy62

Thanks, it got me going... and did what I needed! Great stuff!

A few things...


  • A problem with the Windows command line if you use accented characters like éèâ: -author="André âme français". It doesn't send the proper codes to ExifTool, which gives: Warning: Malformed UTF-8 character(s) - c:/exiftool/icfo3281.pdf, and replaces the éèâ with a ? sign. So I just replaced them with unaccented characters. Must be a problem with the Windows command line.

  • If the metadata is empty when I write to it, it gives Warning: [minor] Ignored empty rdf:Bag list for dc:creator - c:/exiftool/icfo3281.pdf. But everything is OK in the file.

  • This is a strange one. When I write to -Subject, it is also copied into Keywords, and if I write to both -Subject and -keywords, I still have the Subject at the beginning of Keywords.

Must say, it's warp speed compared to Adobe Acrobat, and it works better than JavaScript in Acrobat!  ;D

StarGeek

Quote from: Andy62 on September 22, 2020, 10:46:40 PM
A problem with the Windows command line if you use accented characters like éèâ: -author="André âme français". It doesn't send the proper codes to ExifTool, which gives: Warning: Malformed UTF-8 character(s) - c:/exiftool/icfo3281.pdf, and replaces the éèâ with a ? sign. So I just replaced them with unaccented characters. Must be a problem with the Windows command line.

The Windows command line doesn't do well with non-ASCII characters.  See FAQ #10 and/or FAQ #18.  For the characters you mention, adding the -L (Latin) option should help.

Quote
This is a strange one. When I write into -Subject it is also copied in Keywords and if I write in -Subject and -keywords, I still have the Subject at the beginning of Keywords.

This is one of the same-named tags I mentioned.  By default for Subject, exiftool writes to both PDF:Subject (see PDF tags) and XMP:Subject (see XMP Dublin Core tags).  The trouble is that the two tags have completely different uses.  The former is a simple string; I believe it's the description, though I haven't checked the specs.  The latter, on the other hand, is a list-type tag that is the XMP version of keywords.

You can limit which tag is written by specifying PDF:Subject or XMP:Subject explicitly.
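
To illustrate the split, here is a small Python sketch that builds the two kinds of argument separately: a single PDF:Subject string and one XMP:Subject assignment per keyword, since a list-type tag takes one item per assignment. The function name and the sample values are hypothetical, and treating PDF:Subject as the description follows the guess above rather than the spec.

```python
# Sketch: keep the description and the keyword list in separate tags.
# subject_args and the sample values are illustrative assumptions.

def subject_args(description, keywords):
    """Build exiftool arguments that address each Subject tag by group."""
    # PDF:Subject is a simple string.
    args = ["-PDF:Subject=" + description]
    # XMP:Subject is a list-type tag: one assignment per list item.
    args.extend("-XMP:Subject=" + kw for kw in keywords)
    return args


args = subject_args("A historical novel", ["novel", "French Revolution"])
print(args)
```

The resulting list could be appended to a command such as `["exiftool", *args, "file.pdf"]` and passed to `subprocess.run`, which avoids shell quoting altogether.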

Andy62

This solves those issues, super! But... there is always a but in programming  :-\

I generate the command lines in PHP because all my info comes from a MySQL database. This causes mismatches between encodings.

The -L option works well if I type the command directly at the Windows command prompt. But if I paste the same command into a .bat file and run it, it doesn't work. So the .bat file that I generate is not in the proper encoding.

Still using -L, this fixes it in PHP: $pdfScript = mb_convert_encoding($pdfScript, "CP850"); before saving the file. In the end, it's a 2 MB .bat file, and it works great!
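
For anyone generating the batch file from something other than PHP, the same conversion can be sketched in Python. This assumes, as in the fix above, that the console uses code page 850; other systems may use a different OEM code page, so the `codepage` parameter is only a guess at a sensible default. The helper names and the output filename are hypothetical.

```python
# Sketch: save a generated batch script in the console's OEM code page.
# CP850 matches the PHP fix above; your console's code page may differ.

def encode_bat(script, codepage="cp850"):
    """Encode the script text; raises UnicodeEncodeError if a character
    has no equivalent in the chosen code page."""
    return script.encode(codepage)


def write_bat(path, script, codepage="cp850"):
    """Write the encoded script as bytes so no newline translation occurs."""
    with open(path, "wb") as handle:
        handle.write(encode_bat(script, codepage))


script = 'exiftool -L -Author="André âme français" c:/exiftool/icfo3281.pdf\r\n'
data = encode_bat(script)
print(len(script), len(data))  # CP850 uses one byte per character
```

Calling `write_bat("metadata.bat", script)` would then produce the file. Because the encoding is strict, a character outside CP850 stops the run with an error instead of silently turning into a ? sign.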

Thank you for your help, much appreciated.