Why are sidecar files used?

Started by Skids, March 08, 2020, 07:50:13 AM


Skids

Hi,

I am in the process of trying to add some logic to how my raw images are stored.  This process has demonstrated the problems of having thousands of sidecar files alongside the original raw files, most of which just store a few keywords and other metadata.  So I would like to be rid of them.

Now, much like Pandora, I lifted the lid of the box of goodies that is a raw or TIFF file.  Peeking inside, I think I have a basic understanding of how the tables of data tags in "Image File Directories" (IFDs) work, i.e. the first table lists a number of tags, each of which points to a block of data, and then closes with a pointer to the next IFD.  Adding or modifying IPTC metadata involves inserting the ASCII data as a block, adding a tag that points to the block, and updating any pointers to blocks after the modification so that they point to the new locations.  So if a new block of 30 bytes is added, then all blocks following the new block are located 30 bytes further along the file.  Exiftool is capable of doing this.
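A minimal sketch of the IFD chain described above, assuming a baseline TIFF header (byte-order mark, magic number 42, offset of the first IFD) and 12-byte entries. The function name `walk_ifds` and the tiny in-memory sample file are made up for illustration; this is not how ExifTool is implemented, and raw files that deviate from baseline TIFF would need extra handling.

```python
import struct


def walk_ifds(data: bytes):
    """Return one list per IFD; each item is (tag, type, count, value_or_offset)."""
    if data[:2] == b"II":
        endian = "<"  # little-endian ("Intel") byte order
    elif data[:2] == b"MM":
        endian = ">"  # big-endian ("Motorola") byte order
    else:
        raise ValueError("not a TIFF-style header")

    magic, offset = struct.unpack(endian + "HI", data[2:8])
    if magic != 42:
        raise ValueError("unexpected magic number")

    ifds = []
    seen = set()  # guard against a chain that loops back on itself
    while offset != 0 and offset not in seen:
        seen.add(offset)
        (count,) = struct.unpack_from(endian + "H", data, offset)
        entries = [
            struct.unpack_from(endian + "HHII", data, offset + 2 + 12 * i)
            for i in range(count)
        ]
        ifds.append(entries)
        # The four bytes after the last entry point to the next IFD (0 = end).
        (offset,) = struct.unpack_from(endian + "I", data, offset + 2 + 12 * count)
    return ifds


# Hypothetical sample: header, one IFD with a single ASCII tag, then the data block.
sample = (
    b"II" + struct.pack("<HI", 42, 8)           # header, first IFD at byte 8
    + struct.pack("<H", 1)                      # one entry in this IFD
    + struct.pack("<HHII", 0x010E, 2, 6, 26)    # ImageDescription, ASCII, 6 bytes at offset 26
    + struct.pack("<I", 0)                      # no further IFD
    + b"hello\x00"                              # the data block the entry points to
)

print(walk_ifds(sample))  # [[(270, 2, 6, 26)]]
```

The last field of an entry holds a file offset whenever the value does not fit in four bytes, which is why inserting a block means rewriting every offset that points past the insertion point.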

The other fact worth noting is that most raw files are based on the TIFF specification.  So it seems to me that any application that manipulates TIFF files should be able to manipulate raw files, meaning that there has never been any need for all those .xmp files.

Obviously a little knowledge is a dangerous thing and I am mistaken.  So why were .xmp files forced upon us poor users of digital cameras?

best wishes

Simon

Phil Harvey

Hi Simon,

For a bit more insight into the raw file mess, read here.

As you can see, various manufacturers take liberties with the TIFF specification (ie. their programmers don't know what they are doing), which makes rewriting these files properly a bit of a challenge to say the least.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

StarGeek

A couple additional items.

Some file types either have poor software support for reading embedded metadata, e.g. PNGs, or they don't allow metadata at all, e.g. BMPs.  Also, while it is possible to add metadata to some files, such as AVIs and MKVs, the documentation on doing so is nearly non-existent.  Or you have to do so through programs like FFmpeg, which, IMO, are harder to understand and not as flexible as exiftool.

Secondly, it can speed up backups immensely.  The NEF files from my old D5100 are 15-25 megs each and I can shoot a couple thousand images over a weekend at some conventions I go to.  Newer, better cameras will have even larger file sizes.  If I directly edit those files, even changing a few bytes such as removing a couple of spaces from a description, that could mean, depending upon the backup program, hours of copying the new backup.  Or if backing up to the cloud, days.  It'd be even worse if you're changing metadata in a video file.  But a few bytes in a sidecar?  That'd be pretty quick and easy.
* Did you read FAQ #3 and use the command listed there?
* Please use the Code button for exiftool code/output.
 
* Please include your OS, Exiftool version, and type of file you're processing (MP4, JPG, etc).

obetz

I fully agree with StarGeek.

Dealing with sidecars is somewhat inconvenient but worth doing.

Using DxO Photolab as raw processor, I have two sidecar files for each raw file: One for the metadata and one for the processing settings.

Renaming and moving these triplets is best done with a DAM like Photools IMatch, although this one does modify (write) raw files in the default setting so you need to change the settings if you want to avoid touching the raw files.

Skids

Quote: For a bit more insight into the raw file mess, read here.

Wow!  A master of understatement; what a mess indeed.  It seems that the claims I have read that the original manufacturers know how to get the best from their raw files are probably incorrect, as a number of them seem unable to follow their own internal specification.  Don't they do any testing?

Quote: Secondly, it can speed up backups immensely.

An excellent point.

I am coming round to the idea that what is needed is a robust method of working with image and multiple sidecar "groups".

Thinks - how about a new file format that wraps an image and all its associated sidecar files... only half joking ;-)

best wishes
Simon

Phil Harvey

Hi Simon,

Quote from: Skids on March 10, 2020, 08:53:45 AM
Thinks - how about a new file format that wraps an image and all its associated sidecar files... only half joking ;-)

Not joking:  I invented just such a format in 2005.  It is called MIE (Meta Information Encapsulation; current specification here).  But the chance of anything but ExifTool actually using this format is infinitesimal.

- Phil

StarGeek

Quote from: obetz on March 08, 2020, 03:02:03 PM
I fully agree with StarGeek.

Well, I was only stating good reasons to do it.  I don't use sidecars myself because it adds so much extra work and bookkeeping.

In the examples I gave, I try to losslessly convert any format that has poor metadata support to something better.  MKVs get converted to MP4s with either FFmpeg or AVIDemux (lossless container conversion).  PNGs go to TIFF if I need the result to be lossless (clip art, simple graphics), otherwise to JPEG if that format is more appropriate (photos).

In the case of backups, I use Duplicati, which uses a block-based backup based on rdiff, which is further based on rsync.  The result is that even if I make a minor change to thousands of files, it saves only the diffs and there is only a slight increase in the backup size.
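To illustrate why a block-based backup grows only slightly after a small edit, here is a rough sketch, assuming fixed-size blocks identified by their SHA-256 hash. The block size and the names `block_hashes` and `new_blocks` are made up for the example; it is not a description of how Duplicati, rdiff, or rsync actually work internally.

```python
import hashlib

BLOCK_SIZE = 4096  # hypothetical block size, chosen only for the example


def block_hashes(data, block_size=BLOCK_SIZE):
    """Split data into fixed-size blocks and return the hash of each block."""
    return [
        hashlib.sha256(bytes(data[i:i + block_size])).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def new_blocks(old, new, block_size=BLOCK_SIZE):
    """Count blocks of `new` that a store already holding `old` would have to add."""
    known = set(block_hashes(old, block_size))
    return sum(1 for h in block_hashes(new, block_size) if h not in known)


# A stand-in for a 1 MiB raw file: 256 blocks, each with different content.
original = b"".join(i.to_bytes(4, "big") * 1024 for i in range(256))

# Overwrite four bytes in place, as a same-length metadata edit would.
edited = bytearray(original)
edited[10000:10004] = b"EDIT"

print(len(block_hashes(original)))    # 256
print(new_blocks(original, edited))   # 1
```

Note that this only holds for edits that keep the file length the same. With fixed-size blocks, inserting bytes shifts everything after the insertion point, which is the case rsync's rolling checksum is designed to handle.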

obetz

Quote from: StarGeek on March 12, 2020, 12:05:43 PM
Well, I was only stating good reasons to do it.  I don't use sidecars myself because it adds so much extra work and bookkeeping.

I see.

Using IMatch, it's easy to deal with file sets consisting of raw image, XMP sidecar, DxO sidecar, and developed images.

BTW: I'm also using Duplicati as one of my two backup methods, but it's rather slow on large data sets (e.g. thousands of raw images).


StarGeek

Quote from: obetz on March 13, 2020, 01:17:28 PM
BTW: I'm also using Duplicati as one of my two backup methods, but it's rather slow on large data sets (e.g. thousands of raw images).

Yep.  I've split my backups into multiple groups that get spread out over a three day period.  Each group takes 1-2 hours each evening, even for batches that haven't had any changes.  But I just love the fact that I can revert to a version from months ago if I need to.

Alan Clifford

I use rsync for backups.  So if a file hasn't changed, today's backup is a hard link to the previous backup.  If a file has changed, rsync uses the previous version on the server as a basis so that minimal data is sent over the internet.
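The hard-link idea can be sketched in a few lines, assuming a flat directory of files and a local filesystem that supports hard links. The function name `snapshot` and the file names are hypothetical; this is similar in spirit to rsync's `--link-dest` option, but it does not reproduce rsync's delta transfer over the network.

```python
import filecmp
import os
import shutil
import tempfile


def snapshot(source_dir, previous_dir, new_dir):
    """Create new_dir as a full-looking backup; unchanged files are hard links."""
    os.makedirs(new_dir, exist_ok=True)
    linked, copied = [], []
    for name in sorted(os.listdir(source_dir)):
        src = os.path.join(source_dir, name)
        if not os.path.isfile(src):
            continue  # sketch handles a flat directory only
        dst = os.path.join(new_dir, name)
        prev = os.path.join(previous_dir, name) if previous_dir else None
        if prev and os.path.isfile(prev) and filecmp.cmp(src, prev, shallow=False):
            os.link(prev, dst)      # unchanged: share storage with the last backup
            linked.append(name)
        else:
            shutil.copy2(src, dst)  # new or changed: store a fresh copy
            copied.append(name)
    return linked, copied


with tempfile.TemporaryDirectory() as root:
    src = os.path.join(root, "photos")
    os.makedirs(src)
    with open(os.path.join(src, "a.nef"), "wb") as f:
        f.write(b"raw image data")
    with open(os.path.join(src, "a.xmp"), "w") as f:
        f.write("keywords: convention")

    day1 = os.path.join(root, "day1")
    day2 = os.path.join(root, "day2")
    first = snapshot(src, None, day1)

    # Only the sidecar changes before the second backup.
    with open(os.path.join(src, "a.xmp"), "w") as f:
        f.write("keywords: convention, cosplay")
    second = snapshot(src, day1, day2)

    print(first)   # ([], ['a.nef', 'a.xmp'])
    print(second)  # (['a.nef'], ['a.xmp'])
```

Each snapshot directory looks like a complete backup, yet the unchanged raw file occupies disk space only once.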