Dear Phil,
first of all, thanks to you and all the contributors for this absolutely wonderful program. It helps me a lot in organizing my >70000 pictures, and it is really exciting what a variety of information can be found in EXIF data (e.g. camera temperature, lighting value...).
When recursively finding all pictures on all my hard disks I ran into a problem that can be easily solved:
I use exiftool under Cygwin in Windows, and a recursive search led to directory loops because of Windows' use of symbolic links in "Documents and Settings".
I solved this by skipping symlinks in function ScanDir. Here's my version of the first few lines:
# Scan directory for image files
# Inputs: 0) ExifTool ref, 1) directory name, 2) list ref to return file names
sub ScanDir($$;$)
{
    my ($exifTool, $dir, $list) = @_;
    opendir(DIR_HANDLE, $dir) or Warn("Error opening directory $dir\n"), return;
    my @fileList = readdir(DIR_HANDLE);
    closedir(DIR_HANDLE);
    my $file;
    $dir =~ /\/$/ or $dir .= '/';
    foreach $file (@fileList) {
        my $path = "$dir$file";
        if (-d $path) {
            if (-l $path) {
                print STDERR "Ignoring link:", $path,"\n";
                next;
            }
            next if $file =~ /^\./; # ignore dirs starting with "."
            next if grep /^$file$/, @ignore;
            $recurse and ScanDir($exifTool, $path, $list);
            next;
        }
The only change is the if block starting with "if (-l $path) {".
I saved this version under a new filename ("exiftool-nosymlinks"), but it would probably be nice to have a command-line option (e.g. -dont-follow-symlinks) to switch this behaviour on and off, as it might be useful for others too...
Regards
Michael
Hi Michael,
Thanks for this report. I hadn't considered this problem and I'll have to think about it and do some testing. I'm thinking it may be better if I could just avoid processing the same directory twice, but I'm not sure how to make this determination.
- Phil
Dear Phil,
thanks for your fast reply. Checking whether the dir was already visited might be a very good way to prevent never-ending loops in recursion while still being able to follow symlinks (perhaps you can put all visited dirs in a hash and check for the existence of the path in it).
But still it might be useful to disable the following of any symlinks as I don't know if there is a platform independent way in Perl to know in which physical directory you are after accessing it via a symlink and using that for the check.
And Windows has a lot of links pointing to the same location:
On my 64-bit German-language system I have the following links reaching C:\Users\Public\Pictures\USA:
/cygdrive/c/Documents\ and\ Settings -> /cygdrive/c/Users
/cygdrive/c/Dokumente\ und\ Einstellungen -> /cygdrive/c/Users
/cygdrive/c/ProgramData/Documents -> /cygdrive/c/Users/Public/Documents
/cygdrive/c/ProgramData/Dokumente -> /cygdrive/c/Users/Public/Documents
/cygdrive/c/Users/Public/Documents/Eigene\ Bilder -> /cygdrive/c/Users/Public/Pictures
/cygdrive/c/Users/Public/Documents/My\ Pictures -> /cygdrive/c/Users/Public/Pictures
so that I can reach this directory via 23 different paths:
/cygdrive/c/Documents\ and\ Settings/All\ Users/Documents/Eigene\ Bilder/USA
/cygdrive/c/Documents\ and\ Settings/All\ Users/Documents/My\ Pictures/USA
/cygdrive/c/Documents\ and\ Settings/All\ Users/Dokumente/Eigene\ Bilder/USA
/cygdrive/c/Documents\ and\ Settings/All\ Users/Dokumente/My\ Pictures/USA
/cygdrive/c/Documents\ and\ Settings/Public/Documents/Eigene\ Bilder/USA
/cygdrive/c/Documents\ and\ Settings/Public/Documents/My\ Pictures/USA
/cygdrive/c/Documents\ and\ Settings/Public/Pictures/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/All\ Users/Documents/Eigene\ Bilder/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/All\ Users/Documents/My\ Pictures/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/All\ Users/Dokumente/Eigene\ Bilder/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/All\ Users/Dokumente/My\ Pictures/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/Public/Documents/Eigene\ Bilder/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/Public/Documents/My\ Pictures/USA
/cygdrive/c/Dokumente\ und\ Einstellungen/Public/Pictures/USA
/cygdrive/c/ProgramData/Documents/Eigene\ Bilder/USA
/cygdrive/c/ProgramData/Documents/My\ Pictures/USA
/cygdrive/c/ProgramData/Dokumente/Eigene\ Bilder/USA
/cygdrive/c/ProgramData/Dokumente/My\ Pictures/USA
/cygdrive/c/Users/All\ Users/Documents/Eigene\ Bilder/USA
/cygdrive/c/Users/All\ Users/Documents/My\ Pictures/USA
/cygdrive/c/Users/All\ Users/Dokumente/Eigene\ Bilder/USA
/cygdrive/c/Users/All\ Users/Dokumente/My\ Pictures/USA
/cygdrive/c/Users/Public/Documents/Eigene\ Bilder/USA
The default of the "find" utility is not to follow symlinks at all. Might be a good option for exiftool too...
Regards
Michael
Quote from: MichaelRath on January 07, 2011, 07:30:09 PM
(perhaps you can put all visited dirs in a hash and check for the existence of the path in it).
Yes, but how is this done? I need to do some research.
If I walk the directory tree to "/a/b/c/b/c", how do I know that this is the same directory as "/a/b/c"? (Using system-independent functions only -- i.e. no system calls).
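To make the question concrete, here is a minimal sketch of the visited-directory hash, assuming the core Cwd module's abs_path() is acceptable as a way to get one canonical name per physical directory. The helper names first_visit and scan_dir are hypothetical and not part of exiftool, and whether abs_path() resolves every kind of Windows link on every Perl build is an assumption that would need testing.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

my %visited;    # canonical directory path => number of times seen

# Hypothetical helper: true only the first time a physical directory is seen.
sub first_visit {
    my ($dir) = @_;
    # abs_path resolves symlinks and "." / ".." components;
    # eval guards against implementations that die on unresolvable paths
    my $real = eval { abs_path($dir) };
    return 0 unless defined $real;
    return 0 if $visited{$real}++;
    return 1;
}

# Hypothetical recursive scan that enters each physical directory only once,
# so "/a/b/c/b/c" is skipped if it resolves to the already visited "/a/b/c".
sub scan_dir {
    my ($dir, $list) = @_;
    return unless first_visit($dir);
    opendir(my $dh, $dir) or do { warn "Error opening directory $dir\n"; return };
    my @entries = grep { !/^\./ } readdir($dh);   # skip names starting with "."
    closedir($dh);
    foreach my $entry (@entries) {
        my $path = "$dir/$entry";
        if (-d $path) {
            scan_dir($path, $list);
        } else {
            push @$list, $path;
        }
    }
}
```

With this approach symlinks are still followed, but a directory reachable via many paths is processed only through the first path that reaches it.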
Quote
But still it might be useful to disable the following of any symlinks as I don't know if there is a platform independent way in Perl to know in which physical directory you are after accessing it via a symlink and using that for the check.
Exactly. I should have read this before typing my response above. Maybe this is the only reasonable alternative, but I really hate adding new options, so I avoid doing this whenever possible.
- Phil
Edit: I did a quick search, and there is a system-dependent "readlink" function built into Perl that I might be able to use. It gives a fatal error if used on systems which don't support symlinks, but I can trap this. I will look into this.
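Trapping that fatal error can be sketched with a block eval. This is only an illustration; safe_readlink is a hypothetical name and not something in exiftool.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the link target, or undef when $path is not a
# symlink or when the platform does not implement symlinks at all.
sub safe_readlink {
    my ($path) = @_;
    # readlink raises an exception where symlinks are unimplemented;
    # the block eval traps it and yields undef instead
    my $target = eval { readlink($path) };
    return $target;
}
```

Note that the returned target may be relative to the directory containing the link, so it would still have to be resolved further before it could be used as a hash key.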
The readlink function didn't pan out -- too much work generating the actual directory names for hashing.
But I have an idea which gives you the feature you want without adding a new option. I will add a feature which uses the existing -i (-ignore) option to disable following of symlinks:
exiftool -i SYMLINKS ...
I'm happy with this, and the default behaviour of exiftool doesn't change, which is also good for backward compatibility.
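How this will be wired up inside exiftool is not shown in this thread. Purely as an illustration of the idea, and assuming the names given with -i end up in a list like the @ignore used by ScanDir, the check could look roughly like this (skip_dir is a hypothetical helper):

```perl
use strict;
use warnings;

# Hypothetical helper: true if a directory entry should not be descended into.
# $ignore is a reference to the list of names given with -i (an assumption).
sub skip_dir {
    my ($path, $file, $ignore) = @_;
    # the special name SYMLINKS turns off following of symbolic links
    return 1 if -l $path and grep { $_ eq 'SYMLINKS' } @$ignore;
    return 1 if $file =~ /^\./;                  # dirs starting with "."
    return 1 if grep { $_ eq $file } @$ignore;   # explicitly ignored names
    return 0;
}
```

Without SYMLINKS in the list the symlink test never triggers, so the default behaviour stays as it is today.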
- Phil
Hi Phil,
this is a good idea (not having to add another option and keeping backward compatibility) and it is all I need. I think the probability that anyone uses "SYMLINKS" as a directory name is really low...
Thank you very much.
Regards
Michael