Hey! I made this some time ago for myself, but eventually I found that some people need such a tool, so I'm sharing it:
https://github.com/Krzysiu/ImageDataDupes
One day I will ask GPT to rewrite it in Python, so I'll be able to compile it. There are more things to do, all described under "todo" at the link above.
Main points:
* this tool finds duplicate images using ExifTool's $imagedatamd5, which means it looks for identical image data; the metadata may differ. So a file with GPS data and the same file without it will be marked as duplicates.
* it just lists files; no changes are made to your files, and nothing is deleted - i.e. it's safe!
* in short, if you already have PHP and ExifTool, you run it with php digest.php dir
where dir is an optional path parameter (or the starting path, in recursive mode)
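The core idea - hash only the image data via ExifTool, then group files that share a hash - could be sketched in Python roughly like this. This is a hypothetical sketch, not digest.php's actual logic: the ExifTool invocation assumes a version that supports requesting the ImageDataMD5 tag explicitly, and all function names here are made up for illustration.

```python
import subprocess
from collections import defaultdict

def collect_hashes(directory):
    """Run ExifTool recursively and map each file name to its image-data MD5.

    Assumes an ExifTool build where -ImageDataMD5 can be requested;
    -T prints tab-separated values, -r recurses into subdirectories.
    Files ExifTool cannot hash come back as "-" and are skipped.
    """
    out = subprocess.run(
        ["exiftool", "-r", "-T", "-FileName", "-ImageDataMD5", directory],
        capture_output=True, text=True, check=True,
    ).stdout
    hashes = {}
    for line in out.splitlines():
        name, md5 = line.split("\t")
        if md5 != "-":
            hashes[name] = md5
    return hashes

def group_duplicates(hashes):
    """Group file names by identical image-data hash; keep groups of 2+.

    Because only the image data is hashed, two files whose metadata
    differs (e.g. with and without GPS) still land in the same group.
    """
    groups = defaultdict(list)
    for name, md5 in hashes.items():
        groups[md5].append(name)
    return [sorted(g) for g in groups.values() if len(g) > 1]
```

For example, group_duplicates({"a.cr2": "x", "a~copy.cr2": "x", "b.jpg": "y"}) yields one group, ["a.cr2", "a~copy.cr2"], and the unique b.jpg is not reported - the same listing-only, no-deletion behavior the tool has.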
Preview (the items on the gray background are "flags" - metadata blocks that are, or aren't, present; we can see that the base file and ~copy.cr2 lack "I-2", which means "two fields in IPTC"):
(https://private-user-images.githubusercontent.com/2560298/418357817-53d05729-1b0a-47af-a908-b38f728fad9a.png)