Newbie question: Example of writing metadata to PDF from the Windows command line?

Started by Andy62, September 22, 2020, 03:57:55 PM

Andy62

Hi, I tried to find an example in the forum of modifying metadata such as Author, Title and Subject in a PDF file using the Windows command-line version, with no luck. I know it should be simple, but I can't find any. Any example would be greatly appreciated. I need to create a batch file (.bat) for 5,000 PDF files with specific metadata for each file. Adobe Acrobat is a waste of time  :o

Thanks,
Andy

StarGeek

At the most basic, you would do something like
exiftool -Author="Charles Dickens" -Title="A Tale of Two Cities" -Subject="A historical novel set during the French Revolution" /path/to/files/

But you need to be aware that there are other tags with those names, and there are many tags that other programs might call by a different name; see FAQ #3 for details.  The above command used on a PDF will write to both the PDF tags and the XMP tags, which should cover all the bases.
"It didn't work" isn't helpful. What was the exact command used and the output.
Read FAQ #3 and use that cmd
Please use the Code button for exiftool output

Please include your OS/Exiftool version/filetype

Andy62

Thanks, it got me going... and did what I needed! Great stuff!

a few things...


  • A problem with the Windows command line when you use accented characters like éèâ: -author="André âme français". It doesn't pass the proper encoding to ExifTool, which gives: Warning: Malformed UTF-8 character(s) - c:/exiftool/icfo3281.pdf, and the éèâ are replaced with a ? sign. So I just replaced them with unaccented characters. It must be a problem with the Windows command line.

  • If the metadata is empty when I write to it, it gives Warning: [minor] Ignored empty rdf:Bag list for dc:creator - c:/exiftool/icfo3281.pdf. But everything is OK in the file.

  • This is a strange one. When I write to -Subject, the value is also copied into Keywords, and if I write to both -Subject and -Keywords, I still get the Subject at the beginning of Keywords.

I must say, it's warp speed compared to Adobe Acrobat, and it works better than JavaScript in Acrobat!  ;D

StarGeek

Quote from: Andy62 on September 22, 2020, 10:46:40 PM
A problem with windows command line if you use UTF-8 character like éèâ: -author="André âme français". It won't send the proper code to ExifTool. So I just replaced them with non-accent. It gives: Warning: Malformed UTF-8 character(s) - c:/exiftool/icfo3281.pdf. It replaces the éèâ with ? sign. Must be a problem with the command line of windows.

The Windows command line doesn't do well with non-ASCII characters.  See FAQ #10 and/or FAQ #18.  For the characters you mention, adding the -L (Latin) option should help.
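To see where the "Malformed UTF-8" warning can come from, here is a small Python sketch. It assumes the argument reaches the program as bytes in a legacy single-byte code page (cp1252 is used here purely as an assumption; the actual code page depends on the system) and shows that those bytes are not valid UTF-8, which is what triggers that kind of warning. The helper name is_valid_utf8 is made up for this illustration.

```python
# Sketch: why accented characters sent in a legacy code page fail as UTF-8.
# Assumption: the argument arrives encoded as cp1252 (Windows Latin1).

def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

author = "André âme français"

latin_bytes = author.encode("cp1252")  # what a Latin1 console might send
utf8_bytes = author.encode("utf-8")    # the same text as real UTF-8

print(is_valid_utf8(latin_bytes))                # False
print(is_valid_utf8(utf8_bytes))                 # True
print(latin_bytes.decode("cp1252") == author)    # True
```

The last line is the idea behind telling the reader which Latin encoding to expect: the bytes are fine as long as they are interpreted with the code page they were written in.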

Quote
This is a strange one. When I write into -Subject it is also copied in Keywords and if I write in -Subject and -keywords, I still have the Subject at the beginning of Keywords.

This is one of the same-name tags I mentioned.  By default, for Subject, exiftool writes to both PDF:Subject (see PDF tags) and XMP:Subject (see XMP Dublin Core tags).  The trouble is that the two tags have completely different uses.  The former is a simple string.  I believe it's the description, though I haven't checked the specs.  The latter, on the other hand, is a list-type tag which is the XMP version of keywords.

You can limit which tag is written by specifying PDF:Subject or XMP:Subject instead of plain Subject.
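As a rough illustration of group-qualified tags, the Python sketch below builds an exiftool argument list that writes the description to PDF:Subject and each keyword to XMP:Subject as a separate list item. The helper build_args and the sample values are hypothetical; only the PDF:Subject and XMP:Subject tag names come from the discussion above.

```python
# Sketch: build an exiftool argument list that targets specific tag groups.
# build_args is a hypothetical helper, not part of exiftool.

def build_args(pdf_path, description, keywords):
    # PDF:Subject is a simple string, so it gets a single value
    args = ["exiftool", f"-PDF:Subject={description}"]
    # XMP:Subject is a list-type tag: one -TAG=VALUE argument per item
    args += [f"-XMP:Subject={kw}" for kw in keywords]
    args.append(pdf_path)
    return args

args = build_args(
    "c:/exiftool/icfo3281.pdf",
    "A historical novel set during the French Revolution",
    ["Dickens", "France"],
)
for arg in args:
    print(arg)

# To actually run it (assuming exiftool is on the PATH):
# import subprocess
# subprocess.run(args, check=True)
```

Passing the arguments as a list rather than one long string sidesteps most quoting problems when a value contains spaces.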

Andy62

This solves both issues, super! But... there is always a but in programming  :-\

I build the command lines in PHP because all my info comes from a MySQL database. This causes problems with mismatched encodings.

The -L option works well if I type the command directly at the Windows command prompt. But if I paste the same command into a .bat file and run it, it doesn't work. So the .bat file I generate is not in the proper encoding.

Still using -L, this fixes it in PHP: $pdfScript = mb_convert_encoding($pdfScript, "CP850"); before saving the file. Finally, it's a 2 MB .bat file, and it works great!

Thank you for your help, much appreciated.
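For anyone who wants to try the same approach outside PHP, here is a minimal Python sketch of the CP850 fix described above: it writes one exiftool command per PDF into a .bat file encoded as CP850. The rows are made-up sample data standing in for the MySQL results, and the helper names and output file name are assumptions for illustration. Whether CP850 is the right code page depends on the console configuration of the machine that runs the batch file.

```python
# Sketch: generate a .bat file of exiftool commands, saved as CP850.
# The rows are invented sample data; helper names are hypothetical.
import os
import tempfile

def make_batch_lines(rows):
    """Build one exiftool command line per row (-L as used in the thread)."""
    template = 'exiftool -L -Author="{author}" -Title="{title}" "{path}"'
    return [template.format(**row) for row in rows]

def write_batch(path, lines):
    # encoding="cp850" mirrors mb_convert_encoding($pdfScript, "CP850");
    # newline="\r\n" turns each "\n" into a Windows line ending.
    with open(path, "w", encoding="cp850", newline="\r\n") as fh:
        for line in lines:
            fh.write(line + "\n")

rows = [
    {"author": "André âme français", "title": "Exemple",
     "path": "c:/exiftool/icfo3281.pdf"},
]
lines = make_batch_lines(rows)

out_path = os.path.join(tempfile.mkdtemp(), "write_metadata.bat")
write_batch(out_path, lines)

with open(out_path, "rb") as fh:
    data = fh.read()
print(data)
```

Two caveats with this sketch: values containing a double quote or a percent sign would need escaping for cmd.exe, and any character that does not exist in CP850 raises a UnicodeEncodeError when the file is written.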