Dear all,
I am sure I have missed something here, but I just cannot figure out what I did wrong. Sorry if this is something super obvious, and thank you in advance for any help you can provide!
I have been trying to embed an XMP packet into PDFs. I read the "adding xmp data" thread (https://exiftool.org/forum/index.php?topic=2922.0). Unfortunately, I cannot use the 'exiftool -tagsfromfile xmp.xml "-all>xmp:all" FILE' option, because my XMP input contains some unusual tags and I need the complete XMP embedded in the XMP box.
So... here is the command I tried
exiftool "-xmp<=pdf-xmp.xml" page-pdf.pdf
And it returns
Warning: Invalid XMP data for XMP:XMP
0 image files updated
1 image files unchanged
I've tried validating my XMP with both the W3C validator (https://www.w3.org/RDF/Validator/rdfval) and the PDFlib XMP Validator (https://www.pdflib.com/knowledge-base/xmp-metadata/free-xmp-validator/).
Here is the XMP source
---
<?xml version="1.0"?>

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="en">Coronado Tent City Daily Program (Coronado, CA) 1903-07-30 [p ].</rdf:li>
</rdf:Alt>
</dc:title>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="en">Page from Coronado Tent City Daily Program (newspaper). [See LCCN: sn94051565 for catalog record.]. Prepared on behalf of Coronado Public Library.</rdf:li>
</rdf:Alt>
</dc:description>
<dc:date>
<rdf:Seq>
<rdf:li xml:lang="x-default">1903-07-30</rdf:li>
</rdf:Seq>
</dc:date>
<dc:type>
<rdf:Bag>
<rdf:li xml:lang="en">text</rdf:li>
<rdf:li xml:lang="en">newspaper</rdf:li>
</rdf:Bag>
</dc:type>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:xapMM="http://ns.adobe.com/xap/1.0/mm/">
<xapMM:InstanceID>uuid:6e360eb7-2c72-4534-9f4a-205b8f590321</xapMM:InstanceID>
<xapMM:DocumentID>uuid:969f4ad8-a768-4a84-b555-f86e30214b1d</xapMM:DocumentID>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:xap="http://ns.adobe.com/xap/1.0/">
<xap:CreateDate>2011-07-09T00:54:36-05:00</xap:CreateDate>
<xap:ModifyDate>2011-07-09T08:14:42-05:00</xap:ModifyDate>
<xap:MetadataDate>2011-07-09T08:14:42-05:00</xap:MetadataDate>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
<pdf:Producer/>
</rdf:Description>
</rdf:RDF>
I guess my questions are:
1. Why does exiftool think the XMP is invalid?
2. Is there a way to embed the complete XMP into the PDF regardless of its validity?
Here is an example of a PDF that has my XMP embedded
---
exiftool -D -b -xmp page-pdf.pdf
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 4.0-c321 44.398116, Tue Aug 04 2009 14:24:39">
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about=""
xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:format>application/pdf</dc:format>
<dc:title>
<rdf:Alt>
<rdf:li xml:lang="en">Coronado Tent City Daily Program (Coronado, CA) 1903-07-30 [p ].</rdf:li>
</rdf:Alt>
</dc:title>
<dc:description>
<rdf:Alt>
<rdf:li xml:lang="en">Page from Coronado Tent City Daily Program (newspaper). [See LCCN: sn94051565 for catalog record.]. Prepared on behalf of Coronado Public Library.</rdf:li>
</rdf:Alt>
</dc:description>
<dc:date>
<rdf:Seq>
<rdf:li xml:lang="x-default">1903-07-30</rdf:li>
</rdf:Seq>
</dc:date>
<dc:type>
<rdf:Bag>
<rdf:li xml:lang="en">text</rdf:li>
<rdf:li xml:lang="en">newspaper</rdf:li>
</rdf:Bag>
</dc:type>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:xapMM="http://ns.adobe.com/xap/1.0/mm/">
<xapMM:InstanceID>uuid:6e360eb7-2c72-4534-9f4a-205b8f590321</xapMM:InstanceID>
<xapMM:DocumentID>uuid:969f4ad8-a768-4a84-b555-f86e30214b1d</xapMM:DocumentID>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:xap="http://ns.adobe.com/xap/1.0/">
<xap:CreateDate>2011-07-09T00:54:36-05:00</xap:CreateDate>
<xap:ModifyDate>2011-07-09T08:14:42-05:00</xap:ModifyDate>
<xap:MetadataDate>2011-07-09T08:14:42-05:00</xap:MetadataDate>
</rdf:Description>
<rdf:Description rdf:about=""
xmlns:pdf="http://ns.adobe.com/pdf/1.3/">
<pdf:Producer/>
</rdf:Description>
</rdf:RDF>
</x:xmpmeta>
Again, thank you very much for shedding some light on this!!
Best regards,
Jeffrey
I have spent some more time on this and I think I have found a workaround.
After further investigation, I found that I can accomplish most of what I need by setting each XMP tag individually instead of feeding in the XMP XML.
This almost worked, with one exception: my validator expects an rdf:li element within the dc:identifier field.
The solution is to override the tag definition through the -config option and add Writable => 'lang-alt'.
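# ExifTool config file: override the built-in XMP-dc:Identifier definition
%Image::ExifTool::UserDefined = (
    'Image::ExifTool::XMP::dc' => {
        # 'lang-alt' makes ExifTool write the value as rdf:Alt/rdf:li
        identifier => { Groups => { 2 => 'Image' }, Writable => 'lang-alt' },
    },
);
1;  # end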
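For anyone following along, here is a rough sketch of what the per-tag approach might look like with that config loaded. The config file name (xmp-identifier.config) and the identifier value are assumptions for illustration only; neither appears in this thread. The other values are copied from the XMP source above. Note that -config has to be the first option on the exiftool command line. The script only prints the command unless exiftool, the config file, and the PDF are all present.

```shell
#!/bin/sh
# Sketch only: CONFIG and the identifier value are hypothetical.
CONFIG="xmp-identifier.config"   # assumed file holding the %UserDefined block
PDF="page-pdf.pdf"

# Build the argument list once; -config must precede every other option.
# Assigning the same list tag twice (XMP-dc:Type) writes two list items.
set -- -config "$CONFIG" \
    "-XMP-dc:Format=application/pdf" \
    "-XMP-dc:Type=text" \
    "-XMP-dc:Type=newspaper" \
    "-XMP-dc:Identifier=sn94051565" \
    "$PDF"

if command -v exiftool >/dev/null 2>&1 && [ -f "$CONFIG" ] && [ -f "$PDF" ]; then
    # Everything is in place: actually write the tags.
    exiftool "$@"
else
    # Dry run: show the command that would be executed.
    printf 'Would run: exiftool'
    printf ' %s' "$@"
    printf '\n'
fi
```

Running "exiftool -b -xmp page-pdf.pdf" afterwards should show whether dc:identifier now comes out wrapped in rdf:Alt/rdf:li.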
Great design Phil, that UserDefined capability has already saved me twice!!
Again, thanks for creating this great software,
Best regards,
Jeffrey
Hi Jeffrey,
I love it when someone solves their own problem (and posts the solution!). :)
- Phil