Revisiting Samsung SEFT

Started by Neal Krawetz, May 31, 2024, 03:00:23 PM


Neal Krawetz

Hi Phil,

I'm beginning to dive into the Samsung SEFT trailer.
I have a few new tags for you:

  struct {
        uint16_t Type,AltType; // all have a type, some names associated with two types
        const char *Name;
        } KnownNames[] = {
        // Tested against 3 months of FotoForensics data: 2024-01 to 2024-03
        { 0x8c0, 0, "Auto_Enhance_Info" },
        { 0x001, 0, "Auto_Enhance_Unprocessed" }, // jpeg
        { 0xa41, 0, "BackupRestore_Data" },
        { 0x9e0, 0, "Burst_Shot_Info" },
        { 0x9e1, 0, "BurstShot_Best_Photo_Info" },
        { 0xc61, 0, "Camera_Capture_Mode_Info" }, // common
        { 0xd01, 0, "Camera_Scene_Info2" },
        { 0xd01, 0, "Camera_Scene_Info3" },
        { 0xd01, 0xd21, "Camera_Scene_Info" }, // 60% 0xd01, 40% 0xd21
        { 0xb30, 0, "Camera_Sticker_Info" },
        { 0xda1, 0, "Captured_App_Info" },
        { 0xcc1, 0, "Color_Display_P3" }, // common
        { 0xba2, 0, "Copy_Available_Edit_Info" },
        { 0xba1, 0, "deco_doodle_bitmap" }, // PNG; user drew on the picture
        { 0xba1, 0, "deco_sticker_bitmap" }, // PNG; user pasted picture into picture
        { 0xba1, 0, "deco_text_bitmap" }, // PNG; user put text on the picture
        { 0xb90, 0, "Document_Scan_Info" },
        { 0xbd0, 0, "Dual_Relighting_Bokeh_Info" },
        { 0x001, 0, "DualShot_1" }, // jpeg
        { 0x001, 0, "DualShot_2" }, // jpeg
        { 0x001, 0, "DualShot_3" }, // not seen, assumed jpeg
        { 0x001, 0, "DualShot_4" }, // not seen, assumed jpeg
        { 0x001, 0, "DualShot_5" }, // jpeg
        { 0x001, 0, "DualShot_6" }, // not seen, assumed jpeg
        { 0x001, 0, "DualShot_7" }, // not seen, assumed jpeg
        { 0x001, 0, "DualShot_8" }, // not seen, assumed jpeg
        { 0x001, 0, "DualShot_9" }, // not seen, assumed jpeg
        { 0xab4, 0, "DualShot_Core_Info" },
        { 0xab1, 0, "DualShot_DepthMap_1" },
        { 0xab3, 0, "DualShot_Extra_Info" },
        { 0xab0, 0, "DualShot_Meta_Info" },
        { 0xd31, 0, "Food_Blur_Effect_Info" },
        { 0x910, 0, "Front_Cam_Selfie_Info" }, // common
        { 0xce1, 0, "Gallery_DC_Data" },
        { 0xa01, 0, "Image_UTC_Data" }, // common
        { 0xb51, 0, "Intelligent_PhotoEditor_Data" },
        { 0xbe0, 0, "Livefocus_JDM_Info" },
        { 0xaa1, 0, "MCC_Data" }, // common
        { 0xa30, 0, "MotionPhoto_Data" },
        { 0xba1, 0, "Original_Path_Hash_Key" }, // binary
        { 0x8e0, 0, "Panorama_Shot_Info" },
        { 0xd91, 0, "PEg_Info" },
        { 0xcd2, 0, "Photo_HDR_Info" },
        { 0xba1, 0, "PhotoEditor_Re_Edit_Data" }, // JSON, common
        { 0xc21, 0, "Portrait_Effect_Info" },
        { 0x9f0, 0, "Pro_Mode_Info" },
        { 0xc71, 0, "Pro_White_Balance_Info" },
        { 0xbf0, 0, "Remaster_Info" },
        { 0xc51, 0, "Samsung_Capture_Info" }, // common
        { 0xbc0, 0, "Single_Relighting_Bokeh_Info" },
        { 0xb41, 0, "SingeShot_DepthMap_1" }, // "Singe" instead of Single! (Their typo, not mine!)
        { 0xb41, 0, "SingeShot_DepthMap_2" }, // not seen, assumed
        { 0xb41, 0, "SingeShot_DepthMap_3" }, // not seen, assumed
        { 0xb41, 0, "SingeShot_DepthMap_4" }, // not seen, assumed
        { 0xb41, 0, "SingeShot_DepthMap_5" },
        { 0xb41, 0, "SingeShot_DepthMap_6" },
        { 0xb41, 0, "SingeShot_DepthMap_7" },// not seen, assumed
        { 0xb41, 0, "SingeShot_DepthMap_8" },
        { 0xb41, 0, "SingeShot_DepthMap_9" },// not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_1" },
        { 0xb41, 0, "SingleShot_DepthMap_2" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_3" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_4" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_5" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_6" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_7" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_8" }, // not seen, assumed
        { 0xb41, 0, "SingleShot_DepthMap_9" }, // not seen, assumed
        { 0xb40, 0x001, "SingleShot" }, // 001 is jpeg
        { 0xb60, 0, "UltraWide_PhotoEditor_Data" },
        { 0xd11, 0, "Video_Snapshot_Info" },
        { 0xc81, 0, "Watermark_Info" },
        { 0,0,NULL }
        };
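For anyone wanting to use a table like this, here is a minimal lookup sketch. It assumes the same layout as the table above, but everything named here (`struct KnownName`, `ExampleNames`, `find_known_name`, `type_matches`) is hypothetical and the table is only a short excerpt. The lookup takes an explicit length because the names stored in the file are length-prefixed rather than null-terminated.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct KnownName {
    uint16_t Type, AltType;  /* AltType is 0 when the name has only one type */
    const char *Name;
};

/* Abbreviated excerpt of the full table; the sentinel entry ends the list. */
static const struct KnownName ExampleNames[] = {
    { 0xa01, 0,     "Image_UTC_Data" },
    { 0xaa1, 0,     "MCC_Data" },
    { 0xb41, 0,     "SingeShot_DepthMap_5" },
    { 0xb40, 0x001, "SingleShot" },
    { 0, 0, NULL }
};

/* Return the entry whose name matches exactly, or NULL if it is unknown.
 * The name is passed with its length because it is not null-terminated. */
static const struct KnownName *find_known_name(const struct KnownName *table,
                                               const char *name, size_t len)
{
    for (; table->Name != NULL; table++) {
        if (strlen(table->Name) == len && memcmp(table->Name, name, len) == 0)
            return table;
    }
    return NULL;
}

/* True if rectype is one of the types recorded for this name. */
static int type_matches(const struct KnownName *e, uint16_t rectype)
{
    return e->Type == rectype || (e->AltType != 0 && e->AltType == rectype);
}
```

Looking up by name and then checking the type (rather than the reverse) fits the data, since several names share one type.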

I'm beginning to dive into the nightmare of the DOFS subformat.
I think the first uint32 is a version.
Version 3 is easy to parse (more details coming after I validate my decoding).
Version 2 is weird.
I haven't seen any other versions (so far).
The DOFS is needed for decoding the width and height for the DepthMaps and deco bitmaps.

Also, the "_1", "_2", etc. seem to be related to the DOFS structures. If it lists multiple elements, then the first element is "_1", the second is "_2", etc. I don't know what they do after 9, but then again, the highest I've seen so far is "_8".

Neal Krawetz

DOFS: First uint32 is a version.

Version 2 uses fixed-length values. I assume that, if you know the structure, then you know the purpose of each field. Having said that, I've worked out where the null-terminated strings are located: 0x10, 0x9c, 0x11c, 0x1a4, 0x1c4, 0x1e4, 0x224, 0x424, 0x78a. Most of these appear to be driver build versions, but I haven't worked out the purposes.

Version 3 uses a saner format. First comes the header:
  • uint32: version = 3
  • uint32: unknown, zero
  • uint32: length of DOFS
  • uint16: number of records

Then it loops over the records. Each record contains:
  • uint8: type (this is important for processing the value)
  • uint8: field name length (flen)
  • flen bytes: field name
  • uint16: value length (vlen)
  • vlen bytes: data, format is based on the type

The types for the data:
  • 0x0b: String (usually null-terminated, but not always; do not assume a terminator, just read vlen bytes)
  • 0x05: uint32 (vlen should be 4)
  • 0x09: float (vlen should be 4 bytes)
I have not seen any other types.
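
The version 3 layout above can be walked roughly like this. It is only a sketch: the byte order is an assumption (little-endian, like the SEFT index), the names `DofsField` and `dofs3_parse` are made up for illustration, and the "length of DOFS" field is read past but not checked because it is not yet clear what it covers.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: multi-byte values are little-endian. */
static uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

struct DofsField {           /* hypothetical; pointers refer into the input */
    uint8_t type;            /* 0x0b string, 0x05 uint32, 0x09 float */
    const uint8_t *name;     /* name_len bytes, not null-terminated */
    uint8_t name_len;
    const uint8_t *value;    /* value_len bytes, interpreted per type */
    uint16_t value_len;
};

/* Walk a version 3 DOFS block. Stores up to max fields in out and returns
 * the number of records declared, or -1 if the block is malformed. */
static int dofs3_parse(const uint8_t *buf, size_t len,
                       struct DofsField *out, int max)
{
    if (len < 14 || rd32(buf) != 3)      /* 14-byte header, version 3 */
        return -1;
    int count = rd16(buf + 12);
    size_t pos = 14;
    for (int i = 0; i < count; i++) {
        if (len - pos < 2)
            return -1;
        uint8_t type = buf[pos];
        uint8_t flen = buf[pos + 1];
        pos += 2;
        if (len - pos < (size_t)flen + 2)  /* name plus the uint16 vlen */
            return -1;
        const uint8_t *name = buf + pos;
        pos += flen;
        uint16_t vlen = rd16(buf + pos);
        pos += 2;
        if (len - pos < vlen)
            return -1;
        if (i < max) {
            out[i].type = type;
            out[i].name = name;
            out[i].name_len = flen;
            out[i].value = buf + pos;
            out[i].value_len = vlen;
        }
        pos += vlen;
    }
    return count;
}
```

Every length is checked against the remaining bytes before it is used, since these blocks come from untrusted files.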

Neal Krawetz

Oh! That typo turns out to be important!

SingleShot_DepthMap_1 is a bitmap. ("Single" spelled correctly.) The dimensions are found in some other field, like Livefocus_JDM_Info.

SingeShot_DepthMap_1 is a self-contained PNG. ("Singe", not "Single". I guess they, ahem, compressed the spelling?)
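
Since the spelling is an easy thing to get wrong, a safer way to tell the two apart may be to sniff the data itself. A minimal sketch, where `is_png` is a hypothetical helper that checks for the standard 8-byte PNG signature:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the buffer starts with the 8-byte PNG file signature. */
static int is_png(const uint8_t *data, size_t len)
{
    static const uint8_t sig[8] = {
        0x89, 'P', 'N', 'G', 0x0d, 0x0a, 0x1a, 0x0a
    };
    return len >= 8 && memcmp(data, sig, 8) == 0;
}
```

Anything that fails this check would be treated as a raw bitmap whose dimensions have to come from another field.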

Neal Krawetz

Going back to the SEFT index table:

The last 4 bytes are "SEFT". With some files, they may be followed by newlines (0x0a or 0x0d) or null padding. Basically, read from the end and skip any nulls or whitespace. If the last four characters that remain are SEFT, then you have hit the table.

uint32 at EOF-4 = "SEFT"
uint32 at EOF-8 = length of the SEFT index table (little endian). Given their use of negative offsets and the size of the data, I don't think it will ever be larger than 65536, so expecting "len len 0 0 SEFT" is probably safe.
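
The tail scan described above can be sketched as follows, assuming the whole file is already in memory. `find_seft` is a hypothetical helper, and it skips only the three padding bytes named above (0x00, 0x0a, 0x0d).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Scan back from the end of the file image, skipping trailing nulls, CR,
 * and LF. On success, store the offset of the 'S' in "SEFT" and the
 * little-endian uint32 that precedes it, and return 1. Return 0 if no
 * marker is found. */
static int find_seft(const uint8_t *buf, size_t len,
                     size_t *seft_pos, uint32_t *index_len)
{
    while (len > 0 && (buf[len - 1] == 0x00 ||
                       buf[len - 1] == 0x0a ||
                       buf[len - 1] == 0x0d))
        len--;
    if (len < 8 || memcmp(buf + len - 4, "SEFT", 4) != 0)
        return 0;
    const uint8_t *p = buf + len - 8;
    *seft_pos = len - 4;
    *index_len = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return 1;
}
```

A caller would still want to confirm that the returned length is no larger than `seft_pos` before stepping back by it.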

Now, counting backwards from the "S" in SEFT, go back the length. This is the start of the SEFT directory. Counting forward:
uint32: length of the index table. This position + the length = right before SEFT.
uint32: number of index table elements.
E.g.:
0028A380   48 6B 00 00  00 08 00 00  00 00 00 41  0B 22 7D 17  Hk.........A."}.
0028A390   00 50 8A 00  00 00 00 41  0B D2 F2 16  00 20 48 00  .P.....A..... H.
0028A3A0   00 00 00 41  0B B2 AA 16  00 69 5A 00  00 00 00 C0  ...A.....iZ.....
0028A3B0   0B 49 50 16  00 36 04 00  00 00 00 41  0B 13 4C 16  .IP..6.....A..L.
0028A3C0   00 1A 29 00  00 00 00 01  00 F9 22 16  00 C3 22 16  ..)......."...".
0028A3D0   00 00 00 01  0A 36 00 00  00 23 00 00  00 00 00 A1  .....6...#......
0028A3E0   0A 13 00 00  00 13 00 00  00 6C 00 00  00 53 45 46  .........l...SEF
0028A3F0   54                                                  T
SEFT is at the end.
SEFT says the index begins 0x0000006c bytes before "SEFT".
Jumping back, the start of the index is at that letter "k" at position 0x0028A381.
The size of the index is 0x0000006b bytes. And this places us right back at the start of SEFT.

This index contains 8 items (0x00000008 at position 0x0028A385).

Each element in the index is 12 bytes. (So, the index size better be divisible by 12! 0x6c / 0xc = 9: eight entries plus 12 bytes for the two leading uint32s and the trailing length. It checks!)

Each element contains:
  uint16: Unknown, always zero
  uint16: record type (rectype)
  uint32: record offset; negative offset to the record (recoffset)
  uint32: record length (recsize)

In the hex dump above, the first entry is:
  00 00 41 0B 22 7D 17 00 50 8A 00 00
  rectype = 0x0b41
  recoffset = 0x0177d22
  recsize = 0x00008a50 bytes
So starting from the start of the index (the "k" in this example), go back 0x0177d22 bytes. Position 0x28a381 - 0x177d22 = 0x11265f!
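
The decoding and subtraction above can be sketched like this. The struct and function names are hypothetical, and the position is counted back from the start of the index exactly as in this walkthrough.

```c
#include <stdint.h>

struct SeftEntry {          /* hypothetical name for one 12-byte element */
    uint16_t rectype;
    uint32_t recoffset;     /* counted backwards from the start of the index */
    uint32_t recsize;
};

static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one little-endian index element. */
static struct SeftEntry seft_entry_decode(const uint8_t e[12])
{
    struct SeftEntry out;
    /* e[0..1] is the unknown uint16 that has always been zero */
    out.rectype = (uint16_t)(e[2] | (e[3] << 8));
    out.recoffset = rd32(e + 4);
    out.recsize = rd32(e + 8);
    return out;
}

/* Compute the file position of the record. Returns 0 if the offset would
 * reach back past the start of the file, 1 otherwise. */
static int seft_record_pos(uint64_t index_start, const struct SeftEntry *e,
                           uint64_t *pos)
{
    if (e->recoffset > index_start)
        return 0;
    *pos = index_start - e->recoffset;
    return 1;
}
```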

00112650   C7 DD A7 FD  73 8B FA 50  00 FF D9 00  00 41 0B 14  ....s..P.....A..
00112660   00 00 00 53  69 6E 67 65  53 68 6F 74  5F 44 65 70  ...SingeShot_Dep
00112670   74 68 4D 61  70 5F 35 89  50 4E 47 0D  0A 1A 0A 00  thMap_5.PNG.....
This position drops us at a 4-byte value: 0x00000014. That's the length of the field name. Then comes the field name: "SingeShot_DepthMap_5". NOTE: This name is NOT null-terminated.

What about the recsize? The recsize bytes contain: 4 bytes for the name length + name + data + four bytes (type or SEFH). In this example, the data after the name is a PNG image. The record ends at position 0x11265f + 0x8a50 bytes = 0x11b0af, with the PNG ending four bytes before that.
0011B090   8F FF E5 A7  2F FE 2F 97  D1 44 ED AB  70 1C 86 00  ...././..D..p...
0011B0A0   00 00 00 49  45 4E 44 AE  42 60 82 00  00 41 0B 14  ...IEND.B`...A..
Perfect -- that's the end of the PNG!
The last four bytes reiterate the data type: 0x0000 (always zero) and 0x0b41. NOTE: Sometimes it reiterates the type, other times it's "SEFH" (not sure what that means, but it's at the end of strings).
(That next 0x14 value? That's the start of the next field and is referenced by the SEFT index table.)
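
Splitting a record along those lines can be sketched as below. This follows the layout as described in this post (name length, name, data, four trailing bytes); `SeftRecord` and `seft_record_split` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct SeftRecord {          /* hypothetical; pointers refer into the input */
    const uint8_t *name;     /* name_len bytes, not null-terminated */
    uint32_t name_len;
    const uint8_t *data;
    size_t data_len;
    const uint8_t *trailer;  /* last 4 bytes: 00 00 + type, or "SEFH" */
};

/* Split recsize bytes into name, data, and the 4 trailing bytes.
 * Returns 1 on success, 0 if the lengths do not fit. */
static int seft_record_split(const uint8_t *rec, size_t recsize,
                             struct SeftRecord *out)
{
    if (recsize < 8)                       /* name length + trailer */
        return 0;
    uint32_t nlen = (uint32_t)rec[0] | ((uint32_t)rec[1] << 8) |
                    ((uint32_t)rec[2] << 16) | ((uint32_t)rec[3] << 24);
    if (nlen > recsize - 8)
        return 0;
    out->name = rec + 4;
    out->name_len = nlen;
    out->data = rec + 4 + nlen;
    out->data_len = recsize - 8 - nlen;
    out->trailer = rec + recsize - 4;
    return 1;
}
```

For the example record, 0x8a50 - 4 - 0x14 - 4 = 35380 bytes of data, which matches the size ExifTool reports for SingeShot_DepthMap_5 in the output below.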

Back to ExifTool processing:

"ExifTool -g -a -u" currently lists these in groups by data type. I think that's misleading.
Samsung Trailer 0x0b41 Name     : SingeShot_DepthMap_5
Single Shot Depth Map           : (Binary data 35380 bytes, use -b option to extract)
Samsung Trailer 0x0b41 Name     : SingeShot_DepthMap_8
Single Shot Depth Map           : (Binary data 18436 bytes, use -b option to extract)
Samsung Trailer 0x0b41 Name     : SingeShot_DepthMap_6
Single Shot Depth Map           : (Binary data 23117 bytes, use -b option to extract)
Samsung Trailer 0x0bc0 Name     : Single_Relighting_Bokeh_Info
Samsung Trailer 0x0bc0          : (Binary data 1042 bytes, use -b option to extract)
Samsung Trailer 0x0b41 Name     : SingeShot_DepthMap_10
Single Shot Depth Map           : (Binary data 10493 bytes, use -b option to extract)
Embedded Image Name             : SingleShot
Embedded Image 2                : (Binary data 1450673 bytes, use -b option to extract)
Samsung Trailer 0x0a01 Name     : Image_UTC_Data
Time Stamp                      : 2023:03:05 14:04:19.486-07:00
Samsung Trailer 0x0aa1 Name     : MCC_Data
MCC Data                        : Brazil (724)
The names appear to always be unique, but the types are not always unique.

I think the output should be name-centric. Something like:
SingeShot_DepthMap_5 : (Binary data that extracts the PNG)
SingeShot_DepthMap_8 : ...
Single_Relighting_Bokeh_Info : ...
SingeShot_DepthMap_10 : ...
...
Image_UTC_Data : 2023:03:05 14:04:19.486-07:00
MCC_Data : Brazil (724)

"What about the type?" The type is only needed to help decode the binary data. ExifTool normally doesn't show the types from EXIF or MakerNotes data, so it doesn't need to show the types here.

Neal Krawetz

Minor correction:

The first uint32 of the index (in the previous example, 0x0000006b = 'k'): That's not a length or an offset. (I now think the size match in the previous example was coincidental.) I've found lots of examples where it doesn't match the index size.

Instead, I think that's a version number.
Values I've seen:
0x6b = k = 107 (most common value in 2023)
0x6a = j = 106 (2nd most common value in 2023)
0x69 = i = 105
0x67 = g = 103 (a few sightings in 2023)

Phil Harvey

Hi Neal,

Thanks for all of this.

I've noted the new SEFT names that you have discovered, but am not going to make any code changes right now.  I think you're right that these should be name-centric, but that would be a big change that will have to wait.

I'll also look into decoding the DOFS, but that too will have to wait until I have more time.

- Phil