Best way to name RAW files

Started by Jom, July 14, 2019, 09:15:11 PM

Jom

I've been trying for a long time to create a universal RAW-file naming algorithm.
I consider all the proposals I have found on the Internet from various photographers to be flawed.
Their naming systems are based on subjective preferences and do not take the diversity of shooting processes into account.
A naming system must be based on fixed, constant parameters stored in the metadata of the particular file.

Basic requirements for file names:

1. Ability to fully restore the file name if it has been damaged.
The name will be associated with many processes, so it should never change.
This is possible if all the data needed for naming always stays with the file. That data is the metadata created by the camera at the time of shooting.

2. The file name must be unique in the world.
Unfortunately, camera manufacturers don't think about this.
I checked the RAW files of several popular professional cameras and did not find a single standalone field that could serve as a unique name.

As a result, I concluded that no single method works for all cameras.

But a general principle for creating names does exist.

The file name can be made unique using a combination of:

date;
time;
camera model;
internal serial number of camera;
name of the RAW-file at birth.


Unfortunately, I have found that manufacturers do not write the file name at birth (such as IMG_236) to the metadata.
That name matters for telling files apart and for keeping their order within one second.
Also, not all cameras store a serial number or an internal serial number.

The file name becomes long, but this is acceptable if you do not make a deep hierarchy of folders with long names.

Ideally, when you receive a file from the camera, you would immediately write the new unique name into its metadata permanently, so that it can be restored if the file name is ever damaged.

In theory, the result currently looks like this:

2019/
    20191021/
            20191021_145325_CanonEOS5DMarkIV_ZA2562417_IMG_236.cr2
            or without camera name
            20191021_145325_ZA2562417_IMG_236.cr2
            or with version
            20191021_145325_ZA2562417_IMG_236_v02.jpg


You also need to remove spaces and other special characters when creating the name:

...Canon EOS 5D MarkIV... -> ...CanonEOS5DMarkIV...
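
As a rough illustration of the scheme above, here is a minimal Python sketch that joins the five components into one name and strips spaces and other special characters. The function names `sanitize` and `build_name` are hypothetical, and the values are passed in by hand; reading them from the actual RAW metadata is left out.

```python
import re
from datetime import datetime


def sanitize(part: str) -> str:
    """Drop every character that is not an ASCII letter, digit or underscore."""
    return re.sub(r"[^A-Za-z0-9_]", "", part)


def build_name(shot_at: datetime, model: str, serial: str,
               original_stem: str, ext: str) -> str:
    """Join date, time, model, serial number and original name with underscores."""
    parts = [
        shot_at.strftime("%Y%m%d"),
        shot_at.strftime("%H%M%S"),
        sanitize(model),
        sanitize(serial),
        sanitize(original_stem),
    ]
    return "_".join(parts) + "." + ext.lower()


name = build_name(datetime(2019, 10, 21, 14, 53, 25),
                  "Canon EOS 5D MarkIV", "ZA2562417", "IMG_236", "cr2")
print(name)  # 20191021_145325_CanonEOS5DMarkIV_ZA2562417_IMG_236.cr2
```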

I am interested in your thoughts on this matter.

Hayo Baan

Since the very beginning of my digital photography I have adopted a simple but efficient naming convention, which I only needed to extend once (to cover case b below). Like you, I wanted file names to always be unique and derived from metadata. I also wanted the files to sort by time of shooting, and I wanted to see quickly when the image was taken.

I ended up with the date in sortable order (YYYYMMDD) and the time which, because you can shoot multiple images in a second, includes the subseconds (HHMMSSss), with date and time separated by an underscore. This is quite unique, but not always:
a) Sometimes the subseconds are not precise enough (e.g. the Nikon D4 only records 1/10 seconds, so every now and then two shots in a series get the same subseconds).
b) When shooting with multiple cameras/people, it can happen that two shots have exactly the same time, down to the subsecond.
I solved this as follows:
a) I add 1/100 to the subsecond in case there are two (or more, though I've never seen that happen) files of the same camera with exactly the same subsecond.
b) I add the abbreviated model name to the filename (e.g. _D500 for a Nikon D500, _5Dii for a Canon 5D Mark II, i4s for an iPhone 4s, etc.).
True, this might not always be unique if I ever used two copies of the same camera model at the same time (but that could be solved by adding another distinguishing mark, derived e.g. from the serial number).
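
A minimal Python sketch of this idea, under some assumptions: the function names are hypothetical, the "ss" part is taken to be hundredths of a second, and only the names are computed (writing the adjusted subsecond back to the metadata is not shown).

```python
from datetime import datetime, timedelta


def stamp(shot_at: datetime, model_tag: str) -> str:
    """Format as YYYYMMDD_HHMMSSss_<model>, where ss is hundredths of a second."""
    hundredths = shot_at.microsecond // 10_000
    return f"{shot_at:%Y%m%d_%H%M%S}{hundredths:02d}_{model_tag}"


def assign_names(shots, model_tag):
    """Name shots in time order, adding 1/100 s while a name is already taken."""
    used = set()
    names = []
    for shot_at in sorted(shots):
        name = stamp(shot_at, model_tag)
        while name in used:
            shot_at += timedelta(milliseconds=10)  # case a): bump by 1/100 s
            name = stamp(shot_at, model_tag)
        used.add(name)
        names.append(name)
    return names


# Two frames that a camera with 1/10 s resolution stamped identically.
burst = [datetime(2019, 7, 14, 21, 15, 11, 300_000)] * 2
print(assign_names(burst, "D500"))
```

Keeping a separate `used` set per camera model mirrors case b): names from different cameras cannot clash because the model tag differs.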

Notes:

  • Times recorded this way are always local time. This has its own problems (travel, the lost hour during the daylight saving time changeover), but it is the "clearest" and most informative for the user. And anyway, local time is the only thing recorded by all cameras (if you're lucky there's a timezone, but not always).
  • In case a) above, the subsecond change is also written to the metadata.
Hayo Baan – Photography
Web: www.hayobaan.nl

Jom

If you think very seriously about the naming of photo files (or any other files), it turns out to be a deep philosophical question.
In reality, everything is unique in itself and identifies itself.
But in practice we need some kind of conventional naming system, which cannot exist without a set of agreements.

Even your and my systems (by the way, they are similar) are not universal on a global scale.
For example, we use the date and time only for our own convenience.

Can you give a complete example of your organization of files and folders?

Hayo Baan

Quote from: andreikorzhyts on July 15, 2019, 12:20:59 PM
If you think very seriously about the naming of photo files (or any other files), it turns out to be a deep philosophical question.
In reality, everything is unique in itself and identifies itself.
But in practice we need some kind of conventional naming system, which cannot exist without a set of agreements.

Even your and my systems (by the way, they are similar) are not universal on a global scale.
Sure, they are not globally universal, in that there could still be multiple photographs worldwide with identical names. Going to UTC times would help with sorting them worldwide, but we'd still need something to make them all globally unique. The full serial number could be a potential tag to use (but not all cameras store it in a readable fashion). But hey, I don't care about "globally" unique, I care about "my" unique.

Quote from: andreikorzhyts on July 15, 2019, 12:20:59 PM
For example, we use the date and time only for our own convenience.
Yes, but what else should the naming convention support? I don't do this for someone else, only for myself. If the need arises, (a selection of) my images can always be renamed to suit some other naming convention, e.g. for a client.

Quote from: andreikorzhyts on July 15, 2019, 12:20:59 PM
Can you give a complete example of your organization of files and folders?

All my images are organized like this:
Personal pictures go in a folder by year, and then in a folder by "event", which is always named YYYYMMDD description[, period in English], where the period in English is optional. "Generic" photos don't get a period, and they get MMDD as 0000 (e.g. most years have a folder called "YYYY0000 Misc"). Sometimes (e.g. on larger trips with visits to multiple locations), the "event" folder is split again into subfolders. These can be named freely, e.g. "Day 03-05 Location".

My commissioned work goes into a folder that is subdivided into one folder per client, each of which is in turn subdivided into "assignments", named similarly to the "events" above.

Cheers,
Hayo

Alan Clifford

Apparently there have been no sha256 collisions yet so that could be a candidate for uniqueness

shasum -a 256 ahc_5510.nef

3b21a7f325090f1abf7044356e1de3e198802004d8d831ca25211e7726121c01  ahc_5510.nef
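
For anyone who prefers to script this, a minimal Python sketch that should produce the same digest as the shasum command above, using the standard hashlib module. The function name `sha256_of_file` and the chunk size are arbitrary choices for illustration.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large RAW files are not loaded at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling `sha256_of_file("ahc_5510.nef")` hashes the whole file, metadata included.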

Jom

Thanks, Hayo.
YYYY0000: I find this practical for various non-standard cases, for example files with damaged dates and times.
A folder named 00000000 would then sit at the root of the folder tree  :).
But you use spaces in folder names. Don't you have any difficulty with ExifTool commands?
Spaces are not recommended from a technical standpoint.
I think it is best to use only letters, digits, and underscores.

Phil Harvey

ExifTool has no problems with spaces in folder names, but these names need to be quoted or the spaces need to be escaped on the command line.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).
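
To illustrate the quoting point for scripted use, a small Python sketch. The folder name is a made-up example, and the quoting shown by `shlex.join` follows POSIX shell rules only (Windows cmd quotes differently). When the argument list is handed to `subprocess.run` as a list, no shell is involved, so spaces need no escaping at all.

```python
import shlex

# Hypothetical folder name containing spaces.
folder = "2019/20190718 Summer trip"

# Arguments as a list: each element reaches the program unchanged.
args = ["exiftool", "-DateTimeOriginal", folder]

# What the same command looks like when typed into a POSIX shell.
print(shlex.join(args))
# exiftool -DateTimeOriginal '2019/20190718 Summer trip'

# To actually run it without a shell (requires ExifTool to be installed):
# import subprocess
# subprocess.run(args, check=True)
```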

Jom

Quote
Apparently there have been no sha256 collisions yet so that could be a candidate for uniqueness
Yes, that's the first thing I thought of a long time ago.
But a SHA is computed over the whole file, including its metadata.
If you change the metadata, you must update the name.

If you compute the SHA of the image content only, there will be a collision for overexposed files in which all pixels are 255,255,255 (I haven't checked, but it is logical).
But such overexposed files are not needed, so this collision can be ignored.

Ok, next...
A SHA as the name is not human-readable. We need the agreements I mentioned earlier. The best agreement is chronology.
The result (SHA of the content only):
20190523_124551_0d1776c13dc533df0fcb302de3faf56f722637a3_v2.cr2

If we compare it with this,
20191021_145325_CanonEOS5DMarkIV_ZA2562417_IMG_236_v02.cr2
we have lost readability and have not gained brevity.


Jom

The time zone still needs to be taken into account.
20191021_145325_p0430_CanonEOS5DMarkIV_ZA2562417_IMG_236_v02.cr2
But not all cameras write it, so you will have to add it yourself.

If you are photographing in orbit, it will be hell for you: several time zones in one session. :)

Jom

Quote from: Phil Harvey on July 16, 2019, 10:26:32 AM
ExifTool has no problems with spaces in folder names, but these names need to be quoted or the spaces need to be escaped on the command line.
Perhaps I am being overcautious, but it's not just about ExifTool. You never know where your archive will end up, what operating system or file system it will be on, or how old it will be. Therefore, I try not to use any characters that were not available at the dawn of the computer era. Only a-z, A-Z, 0-9 and _.
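
A rule like this is easy to check automatically. Below is a minimal Python sketch (the helper name `unsafe_components` is hypothetical) that lists the parts of a relative path containing anything other than ASCII letters, digits and underscores. It assumes forward slashes as separators and ignores the file extension of the last component.

```python
import re
from pathlib import PurePosixPath

# Only ASCII letters, digits and underscore are allowed.
SAFE = re.compile(r"[A-Za-z0-9_]+")


def unsafe_components(relative_path: str) -> list:
    """Return the path components that break the letters/digits/underscore rule."""
    parts = PurePosixPath(relative_path).parts
    bad = []
    for index, part in enumerate(parts):
        is_file = index == len(parts) - 1
        # For the file itself, check the name without its extension.
        stem = PurePosixPath(part).stem if is_file else part
        if not SAFE.fullmatch(stem):
            bad.append(part)
    return bad


print(unsafe_components("2019/20190718 Minsk/20190718_IMG0024.cr2"))
# ['20190718 Minsk']
```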

Hayo Baan

Quote from: andreikorzhyts on July 16, 2019, 11:18:52 AM
Quote from: Phil Harvey on July 16, 2019, 10:26:32 AM
ExifTool has no problems with spaces in folder names, but these names need to be quoted or the spaces need to be escaped on the command line.
Perhaps I am being overcautious, but it's not just about ExifTool. You never know where your archive will end up, what operating system or file system it will be on, or how old it will be. Therefore, I try not to use any characters that were not available at the dawn of the computer era. Only a-z, A-Z, 0-9 and _.

Well, I just happen to like folder names that read well, hence the spaces (my image files themselves have no spaces).

Apart from DOS (though even there it was possible, I think), all operating systems I have ever used (including OS/2!) never had big problems with spaces in names, and all command lines of those OSs have autocompletion which automatically takes care of (escaping) the spaces as well. So it's really quite seamless to work with spaces in file names.

The only place I found spaces in filenames a no-go is when developing software with "make"; there spaces really trip you up. There are likely solutions for that too, but I have not bothered to look into it and simply keep my development in files/directories without spaces :D

Jom

Quote
Well, I just happen to like folder names that read well, hence the spaces...
For me, on the contrary, underscores make names easier to read. The gap between letters is often hard to distinguish from a space, whereas an underscore unambiguously marks the division between words. Especially with poor vision and small text ...

Alan Clifford

Quote from: andreikorzhyts on July 16, 2019, 10:40:16 AM
Quote
Apparently there have been no sha256 collisions yet so that could be a candidate for uniqueness

If you compute the SHA of the image content only, there will be a collision for overexposed files in which all pixels are 255,255,255 (I haven't checked, but it is logical).
But such overexposed files are not needed, so this collision can be ignored.


I was only being half serious with the suggestion.  However, regarding your point: if all pixels are the same as in another photograph, then both photographs are actually the same photograph, even if they were taken at different times and places.

Jom

Yes, I already wrote that this is a deep philosophical question.
This issue can be solved only by adopting restrictive agreements, for example by limiting the scope to planet Earth (for date and time).
If photos are taken in another part of the Universe, you need to extend the existing system or create an additional one.


Jom

################
I am currently converging on the following basis for a naming system for photos (Windows).
This is a short statement. The system takes many nuances into account that I don't describe here, but I am ready to provide arguments in response to any remarks.

################

Reasons

It is necessary to create relatively unique names for works, of minimum length, that avoid collisions in the archive under the widest variety of shooting conditions.

Agreements

0. All works have correct and undamaged data.
1. The rules are not strict and are only a basis; the system can be changed at your own risk if necessary.
2. There must be a staging folder for collecting works before they are organized.

_\

3. The folder and file structure must be at the root of the storage. This minimizes the length of the full path (Windows allows a maximum of 260 characters for the full path).
4. Use only the characters

a-z A-Z 0-9 _

(Latin letters only).
5. Names and hierarchy of folders are based on the chronology of the works.

STORAGE \ YYYY \ YYYYMMDD \ WORK

6. Names of works consist of their constant parameters, which are sufficient for relative global uniqueness.

20190718: date.
032000: time.
f0300: time zone (you may need to add it yourself); f = forward, b = backward.
CanonEOS5DMarkIV: model (this can be omitted, because the probability of matching serial numbers from different manufacturers is negligible).
ZA2561817: internal serial number or serial number of the camera (choose the shortest).
IMG_0024: the first name of the shot (you may need to add it yourself, but some cameras write something similar to the metadata, for example IMG_0024.cr2 carries FileNumber: 100-0024).
v02: version of postprocessing (for non-RAW files only).
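
The component list above could be assembled roughly as in the following Python sketch. The function names are hypothetical. It also assumes that "forward" means ahead of UTC and "backward" means behind UTC, and that a zero offset is written as f0000; neither detail is settled in the post.

```python
from datetime import datetime, timedelta, timezone


def zone_token(offset: timedelta) -> str:
    """Encode a UTC offset as f/b plus HHMM (f = ahead of UTC, an assumption)."""
    total_minutes = int(offset.total_seconds()) // 60
    prefix = "f" if total_minutes >= 0 else "b"
    hours, minutes = divmod(abs(total_minutes), 60)
    return f"{prefix}{hours:02d}{minutes:02d}"


def work_name(shot_at: datetime, serial: str, first_name: str, ext: str,
              model: str = "", version: int = 0) -> str:
    """Join the constant parameters of a work into one file name."""
    offset = shot_at.utcoffset()
    if offset is None:
        raise ValueError("shot_at needs a time zone; add it yourself if the camera did not")
    parts = [f"{shot_at:%Y%m%d}", f"{shot_at:%H%M%S}", zone_token(offset)]
    if model:                      # the model is optional in the scheme
        parts.append(model)
    parts += [serial, first_name]
    if version:                    # version suffix for non-RAW files only
        parts.append(f"v{version:02d}")
    return "_".join(parts) + "." + ext


shot = datetime(2019, 7, 18, 3, 20, 0, tzinfo=timezone(timedelta(hours=3)))
print(work_name(shot, "ZA2561817", "IMG_0024", "cr2", model="CanonEOS5DMarkIV"))
print(work_name(shot, "ZA2561817", "IMG_0024", "jpg", version=2))
```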

Examples

Full compliance with the basic naming system

D:\ 2019 \ 20190718 \ 20190718_032000_f0300_CanonEOS5DMarkIV_ZA2561817_IMG_0024.cr2
D:\ 2019 \ 20190718 \ 20190718_032000_f0300_CanonEOS5DMarkIV_ZA2561817_IMG_0024.psd
D:\ 2019 \ 20190718 \ 20190718_032000_f0300_CanonEOS5DMarkIV_ZA2561817_IMG_0024_v02.psd
D:\ 2019 \ 20190718 \ 20190718_032000_f0300_CanonEOS5DMarkIV_ZA2561817_IMG_0024.jpg
D:\ 2019 \ 20190718 \ 20190718_032000_f0300_CanonEOS5DMarkIV_ZA2561817_IMG_0024_v02.jpg


Individual options

D:\ 2019 \ 20190718 \ 20190718_032000_f0300_ZA2561817_IMG_0024.cr2
D:\ 2019 \ 20190718 \ 20190718_032000_f0300_ZA2561817_IMG_0024_v02.jpg

D:\ 2019 \ 20190718_Minsk \ 20190718_032000_f0300_CanonEOS5DMarkIVZA2561817IMG0024.cr2

D:\ 2019 \ 20190718_Photography_workshop \ 20190718032000_f0300ZA2561817IMG0024.cr2

D:\ 2019 \ 20190718 \ 20190718_032000f0300_IMG0024.cr2


One of the most vulnerable variants
D:\ 2019 \ 20190718 \ 20190718_IMG0024.cr2
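
The STORAGE \ YYYY \ YYYYMMDD \ WORK layout and the 260-character concern can be checked with a short Python sketch. The names `archive_path` and `fits` are hypothetical; the sketch assumes every work name starts with its YYYYMMDD date, and it treats 260 as including one terminating character, so the usable length is taken to be 259.

```python
from pathlib import PureWindowsPath

MAX_FULL_PATH = 260  # the Windows limit mentioned in the post


def archive_path(storage: str, work: str) -> PureWindowsPath:
    """Build STORAGE / YYYY / YYYYMMDD / WORK from the date prefix of the name."""
    day = work[:8]
    if len(day) != 8 or not day.isdigit():
        raise ValueError("work name must start with YYYYMMDD")
    return PureWindowsPath(storage, day[:4], day, work)


def fits(path: PureWindowsPath) -> bool:
    """True if the full path stays under the assumed 260-character limit."""
    return len(str(path)) < MAX_FULL_PATH


target = archive_path("D:\\", "20190718_032000_f0300_ZA2561817_IMG_0024.cr2")
print(target, fits(target))
```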