Is it possible to manipulate/add/change metadata tracks?

Started by Cheloute, November 16, 2022, 09:54:15 AM


Cheloute

Hi,

Sorry for the length of this post, but I need to explain my goal before asking my question...

I own both Pro1 and Pro2 cameras. Both are stereoscopic 360 cameras from insta360, but the "newer" model can stabilize the image with what they call "FlowState". The newer model has an integrated GPS module, while an external GPS module can be plugged into the Pro1.

FlowState is a feature that all the consumer insta360 cameras have, as well as the Pro2 and Titan, but the Pro1 doesn't. No matter how hard I try, I can't understand why, because it's not something performed by the camera: it's a software feature applied during the stitching phase, after all the video clips to stitch together have been extracted, using the gyro data.
So, being curious, I wondered why FlowState can't be applied to Pro1 footage, and I decided to run some tests.
The official stitching app is divided into 2 programs (more or less): a GUI to configure the stitching parameters and perform some verification, and a command-line program that interprets those parameters and performs the stitching.
What I did first was to launch a stitch job using Pro1 footage, and compare it to another stitch job using Pro2 footage.
The Pro1 gyro data originally used by the official insta360 stitching program are stored in an external gyro.dat binary file, and there's no further information about what these data look like. But running exiftool on one of the 6 clips this camera produces gave me a list of 3 tracks: a video track, an audio track and a gyro data track.
On the Pro2, on the other hand, gyro and GPS data are stored in 2 different metadata tracks, so 4 tracks in total: video, audio, gyro and GPS, according to exiftool.
In both cases, Track3 (gyro data) is in CAMM format and is similar on both cameras, except that Pro1 footage has an <Acceleration> field that Pro2 footage doesn't show.
So my first attempt was to use the Pro2 stitching parameters when launching the command-line program, but... although the job ends correctly and the video is stitched as it should be, the final footage is not stabilized because "gyro data are not found". Well. My parameters are OK: I correctly specified which clip contains the metadata, changed the format to camm, etc. So, if it can't locate these data, I suppose it's because not all the data are available, or they're not formatted as the software expects.
As the Pro2 generates a fourth track containing GPS data, I also plugged the GPS receiver into the Pro1 and ran another test. But in this case, GPS data are stored in Track3, together with the gyro data, and not in a separate track.

So... my questions are:
  • Is it possible to manually add a metadata track to make my Pro1 tracks look like Pro2 tracks?
  • The Pro2 is supposed to have a 9-axis gyro, while the Pro1 has a 6-axis gyro. Nevertheless, I can't see anything in the Pro2 metadata that shows 9 different axes. Is that normal? Is not everything extracted from the camm metadata by exiftool? Or am I missing something?

Here is an extract of what these data look like:

Pro 1
...
<Track3:SampleTime> 0s</Track3:SampleTime>
<Track3:SampleDuration> 0s</Track3:SampleDuration>
<Track3:AngularVelocity>-0.0039936788380146 0.0157533567398787 -0.0104818996042013</Track3:AngularVelocity>
<Track3:SampleTime> 0.00s</Track3:SampleTime>
<Track3:SampleDuration> 0.00s</Track3:SampleDuration>
<Track3:Acceleration>-0.282324224710464 0.579003930091858 9.749755859375</Track3:Acceleration>
<Track3:SampleTime> 0.00s</Track3:SampleTime>
<Track3:SampleDuration> 0.00s</Track3:SampleDuration>
<Track3:GPSDateTime>2022:11:16 00:00:00Z</Track3:GPSDateTime>
<Track3:GPSMeasureMode>Unknown (1)</Track3:GPSMeasureMode>
<Track3:GPSLatitude>0 deg 0&#39; 0.00&quot; N</Track3:GPSLatitude>
<Track3:GPSLongitude>0 deg 0&#39; 0.00&quot; E</Track3:GPSLongitude>
<Track3:GPSAltitude>0 m</Track3:GPSAltitude>
<Track3:GPSHorizontalAccuracy></Track3:GPSHorizontalAccuracy>
<Track3:GPSVerticalAccuracy></Track3:GPSVerticalAccuracy>
<Track3:GPSVelocityEast>0</Track3:GPSVelocityEast>
<Track3:GPSVelocityNorth>0</Track3:GPSVelocityNorth>
<Track3:GPSVelocityUp>0</Track3:GPSVelocityUp>
<Track3:GPSSpeedAccuracy>0</Track3:GPSSpeedAccuracy>
...

And Pro2:
...
<Track3:SampleTime> 0s</Track3:SampleTime>
<Track3:SampleDuration> 0s</Track3:SampleDuration>
<Track3:AngularVelocity>-0.0039936788380146 0.0157533567398787 -0.0104818996042013</Track3:AngularVelocity>
...
<Track4:SampleTime> 0.00s</Track4:SampleTime>
<Track4:SampleDuration> 0.00s</Track4:SampleDuration>
<Track4:GPSDateTime>2022:11:16 00:00:00Z</Track4:GPSDateTime>
<Track4:GPSMeasureMode>Unknown (1)</Track4:GPSMeasureMode>
<Track4:GPSLatitude>0 deg 0&#39; 0.00&quot; N</Track4:GPSLatitude>
<Track4:GPSLongitude>0 deg 0&#39; 0.00&quot; E</Track4:GPSLongitude>
<Track4:GPSAltitude>0 m</Track4:GPSAltitude>
<Track4:GPSHorizontalAccuracy></Track4:GPSHorizontalAccuracy>
<Track4:GPSVerticalAccuracy></Track4:GPSVerticalAccuracy>
<Track4:GPSVelocityEast>0</Track4:GPSVelocityEast>
<Track4:GPSVelocityNorth>0</Track4:GPSVelocityNorth>
<Track4:GPSVelocityUp>0</Track4:GPSVelocityUp>
<Track4:GPSSpeedAccuracy>0</Track4:GPSSpeedAccuracy>
...

Thanks for reading all of this  ;)
Cheers

Phil Harvey

ExifTool does not have the ability to create new tracks in video files.

But regarding the 9-axis gyro: If you post a (small) sample I'll take a look to see if there is anything new that ExifTool could be decoding.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

Cheloute

Hi,

Thanks Phil for your quick answer!
I've put the output of the following command on my drive:

exiftool -ee -u -U -b -G3 -X video.mp4

The Pro2 footage was larger, which is why the file is bigger. If you need any other kind of output, a trimmed output, or the original mp4 files directly, please ask!

Url:Google Drive

Thanks

Phil Harvey

I need to see the original MP4 file to see if there is something ExifTool isn't extracting.

- Phil

Cheloute

Ok sorry. I updated the drive, same url.
The dat file is supposed to contain the gyro data from Pro footage, but it seems that same info is already present in Track3.
The Pro2 footage contains all the 9 axis gyro data, but I can't see where.

Thanks

Phil Harvey

I got the samples, thanks.  The Pro is conforming to the CAMM specification, and writes camm2 (gyro), camm3 (acceleration), and camm6 (gps) metadata.  Here is what those parts of the -ee -v3 output look like:

Track3 Type='camm' Format='camm', Sample 1 of 9104 (16 bytes)
  1d9adb: 00 00 02 00 68 dd 82 bb 2f 0d 81 3c 46 bc 2b bc [....h.../..<F.+.]
SampleTime = 0
SampleDuration = 1e-06
camm2 (SubDirectory) -->
- Tag 'camm' (16 bytes):
  1d9adb: 00 00 02 00 68 dd 82 bb 2f 0d 81 3c 46 bc 2b bc [....h.../..<F.+.]
+ [BinaryData directory, 16 bytes]
| AngularVelocity = -0.0039936788380146 0.0157533567398787 -0.0104818996042013
| - Tag 0x0004 (12 bytes, float[3]):
|   1d9adf: 68 dd 82 bb 2f 0d 81 3c 46 bc 2b bc             [h.../..<F.+.]

Track3 Type='camm' Format='camm', Sample 2 of 9104 (16 bytes)
  1d9aeb: 00 00 03 00 cd 8c 90 be 9a 39 14 3f 00 ff 1b 41 [.........9.?...A]
SampleTime = 1e-06
SampleDuration = 1e-06
camm3 (SubDirectory) -->
- Tag 'camm' (16 bytes):
  1d9aeb: 00 00 03 00 cd 8c 90 be 9a 39 14 3f 00 ff 1b 41 [.........9.?...A]
+ [BinaryData directory, 16 bytes]
| Acceleration = -0.282324224710464 0.579003930091858 9.749755859375
| - Tag 0x0004 (12 bytes, float[3]):
|   1d9aef: cd 8c 90 be 9a 39 14 3f 00 ff 1b 41             [.....9.?...A]

Track3 Type='camm' Format='camm', Sample 3 of 9104 (60 bytes)
  1d9afb: 00 00 06 00 00 00 00 00 00 00 00 00 01 00 00 00 [................]
  1d9b0b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  1d9b1b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  1d9b2b: 00 00 00 00 00 00 00 00 00 00 00 00             [............]
SampleTime = 2e-06
SampleDuration = 0.003187
camm6 (SubDirectory) -->
- Tag 'camm' (60 bytes):
  1d9afb: 00 00 06 00 00 00 00 00 00 00 00 00 01 00 00 00 [................]
  1d9b0b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  1d9b1b: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
  1d9b2b: 00 00 00 00 00 00 00 00 00 00 00 00             [............]

The camm2 and camm3 records are 16 bytes (12 bytes plus 4-byte header) and the camm6 record is 60 bytes (56 bytes plus 4-byte header), which is all according to specification.
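
As a cross-check of the layout described above, the first Pro sample can be decoded by hand. This is only a minimal Python sketch, assuming the little-endian layout given in the CAMM specification (a 2-byte reserved field, a 2-byte type, then the payload); the names `sample` and `decode_vec3_record` are made up for illustration.

```python
import struct

# Sample 1 of the Pro Track3 dump above (camm2, 16 bytes).
sample = bytes.fromhex("00000200" "68dd82bb" "2f0d813c" "46bc2bbc")

def decode_vec3_record(data):
    """Decode one 16-byte CAMM record that holds three float32 values.

    Assumed layout (little-endian): uint16 reserved, uint16 type, float[3].
    """
    reserved, camm_type, x, y, z = struct.unpack("<HH3f", data[:16])
    return reserved, camm_type, (x, y, z)

reserved, camm_type, gyro = decode_vec3_record(sample)
print(camm_type, gyro)
```

The type field comes out as 2 and the three floats agree with the AngularVelocity line in the dump, which suggests the 4-byte header plus 12-byte payload reading is right for this record.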

However, the Pro2 writes camm1, camm2 and camm6, like this:

Track3 Type='camm' Format='camm', Sample 108 of 21209 (72 bytes)
   27ec8: d0 de 01 00 aa 72 ca 01 00 00 00 00 56 49 01 00 [.....r......VI..]
   27ed8: aa 72 ca 01 00 00 00 00 5f 31 01 00 aa 72 ca 01 [.r......_1...r..]
   27ee8: 00 00 00 00 69 6e 01 00 aa 72 ca 01 00 00 00 00 [....in...r......]
   27ef8: 00 d1 01 00 aa 72 ca 01 00 00 00 00 00 00 01 00 [.....r..........]
   27f08: aa 72 ca 01 00 00 00 00                         [.r......]
SampleTime = 0.215
SampleDuration = 1e-06
camm1 (SubDirectory) -->
- Tag 'camm' (72 bytes):
   27ec8: d0 de 01 00 aa 72 ca 01 00 00 00 00 56 49 01 00 [.....r......VI..]
   27ed8: aa 72 ca 01 00 00 00 00 5f 31 01 00 aa 72 ca 01 [.r......_1...r..]
   27ee8: 00 00 00 00 69 6e 01 00 aa 72 ca 01 00 00 00 00 [....in...r......]
   27ef8: 00 d1 01 00 aa 72 ca 01 00 00 00 00 00 00 01 00 [.....r..........]
   27f08: aa 72 ca 01 00 00 00 00                         [.r......]
+ [BinaryData directory, 72 bytes]
| PixelExposureTime = 30044842
| - Tag 0x0004 (4 bytes, int32s[1]):
|    27ecc: aa 72 ca 01                                     [.r..]
| RollingShutterSkewTime = 0
| - Tag 0x0008 (4 bytes, int32s[1]):
|    27ed0: 00 00 00 00                                     [....]

Track3 Type='camm' Format='camm', Sample 1 of 21209 (32 bytes)
    10f0: 00 00 02 00 84 70 d1 3b 63 14 1d 3c 58 a0 8b bc [.....p.;c..<X...]
    1100: 74 61 03 00 00 00 78 3d 00 10 7a bf 00 00 51 be [ta....x=..z...Q.]
SampleTime = 0
SampleDuration = 0.002
camm2 (SubDirectory) -->
- Tag 'camm' (32 bytes):
    10f0: 00 00 02 00 84 70 d1 3b 63 14 1d 3c 58 a0 8b bc [.....p.;c..<X...]
    1100: 74 61 03 00 00 00 78 3d 00 10 7a bf 00 00 51 be [ta....x=..z...Q.]
+ [BinaryData directory, 32 bytes]
| AngularVelocity = 0.00639158673584461 0.00958738010376692 -0.0170442312955856
| - Tag 0x0004 (12 bytes, float[3]):
|     10f4: 84 70 d1 3b 63 14 1d 3c 58 a0 8b bc             [.p.;c..<X...]

Track4 Type='camm' Format='camm', Sample 1 of 428 (60 bytes)
    1000: b8 00 06 00 86 e6 52 e3 2d af d5 41 00 00 00 00 [......R.-..A....]
    1010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
    1020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
    1030: 00 00 00 00 00 00 00 00 00 00 00 00             [............]
SampleTime = 0
SampleDuration = 0.099965
camm6 (SubDirectory) -->
- Tag 'camm' (60 bytes):
    1000: b8 00 06 00 86 e6 52 e3 2d af d5 41 00 00 00 00 [......R.-..A....]
    1010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
    1020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
    1030: 00 00 00 00 00 00 00 00 00 00 00 00             [............]

The camm6 record is the correct size, but camm2 is twice the size it should be.  Looking at the binary, it seems there is a camm3 record attached to the end of the camm2 data.  I can decode this, but I don't know if this is correct according to the specification.

Then there is camm1, which should be 12 bytes total, but is 72.  That's 60 extra bytes.  It looks like the other 60 bytes are just repeats of the camm1 record with only the first 2 (reserved) header bytes changing.  I can extract these too, but there is no new information here.
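
The repetition described above can be checked directly on the bytes of sample 108. The following Python sketch assumes each camm1 record is 12 bytes (2-byte reserved, 2-byte type, two little-endian int32 values, matching the PixelExposureTime and RollingShutterSkewTime tags in the dump); the variable names are illustrative.

```python
import struct

# Sample 108 of the Pro2 Track3 dump above (72 bytes), one 12-byte record per line.
sample = bytes.fromhex(
    "d0de0100aa72ca0100000000"
    "56490100aa72ca0100000000"
    "5f310100aa72ca0100000000"
    "696e0100aa72ca0100000000"
    "00d10100aa72ca0100000000"
    "00000100aa72ca0100000000"
)

# Assumed record layout: uint16 reserved, uint16 type, int32 exposure, int32 skew.
records = [
    struct.unpack("<HHii", sample[i:i + 12])
    for i in range(0, len(sample), 12)
]

for reserved, camm_type, exposure, skew in records:
    print(hex(reserved), camm_type, exposure, skew)
```

All six records carry type 1 with the same exposure and skew values; only the reserved field differs from one record to the next.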

If I extract these additional camm records, you will get the Accelerometer data, but that's it for any extra useful information.  The Pro already gives the Accelerometer.  If you count GPS latitude/longitude/elevation as 3 axes, and AngularVelocity and Acceleration as 6 more, then the Pro2 does give you 9 axes, but that is already what you had with the Pro.

- Phil

Cheloute

Hi,

Impressive... OK, I don't need to ask you to change your tool. I needed to understand how this works and why the insta360 stitcher app wasn't applying FlowState to Pro1 footage, and I have the answer: the data are present in Pro1 footage, but not located where the app looks for them when I force it with Pro2 params.
Ok.
Another question then: is it possible to modify these metadata in Pro1 footage to organize them as if it were Pro2 footage? Maybe not using exiftool, but by manipulating the binary data? I mean, my goal is to open FlowState up to the Pro1, and obviously I can't change the official app, but I may develop a dirty process that modifies these files to reorganize the data. I'll keep on trying.

Cheloute

Hi,

I thought a little about it, and I think I need to clearly understand how both the mp4 container and the metadata work before trying anything.
I really like the -ee -v[num] output. I hadn't seen it in the manual (sorry...), but it's very helpful for understanding how this works.
I think I'll try to identify each part of the container and directly modify the binary to create another consistent container, and see what happens. It probably won't work, but it will be useful for my understanding  :)

UPDATE: I found this document: https://www.cimarronsystems.com/wp-content/uploads/2017/04/Elements-of-the-H.264-VideoAAC-Audio-MP4-Movie-v2_0.pdf which is exactly what I was looking for to understand the basics I need to manipulate my mp4. With this document and the -v[num] param, I think I know how to proceed to rebuild my video as needed to apply FlowState!

Thanks for your help

Phil Harvey

Rebuilding the video will be difficult.  To split the camm records into separate samples you will need to modify the stts, stsc, stsz and stco lists in the sample table for the metadata track.

- Phil

Cheloute

Quote from: Phil Harvey on November 17, 2022, 08:15:00 AM
Rebuilding the video will be difficult.  To split the camm records into separate samples you will need to modify the stts, stsc, stsz and stco lists in the sample table for the metadata track.

- Phil

I'm seeing that. I also wonder why the mdat box is not before moov in Pro1 files, and whether that matters for my purpose. Probably not, but who knows...
For now, every attempt has led to an unreadable video file :D but I'm still hoping!
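
To see in which order the top-level boxes (ftyp, moov, mdat, ...) appear in a given file, a small box walker is enough. This is a rough Python sketch of the standard size/type box header parsing; the function name `list_top_level_boxes` and the synthetic `demo` bytes are made up for illustration and are not taken from the Pro1 or Pro2 files.

```python
import struct

def list_top_level_boxes(data):
    """Return (type, offset, size) for each top-level box in MP4 bytes."""
    boxes = []
    pos = 0
    while pos + 8 <= len(data):
        # Box header: 32-bit big-endian size, then a 4-character type.
        size, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type.
            size = struct.unpack_from(">Q", data, pos + 8)[0]
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the file.
            size = len(data) - pos
        if size < header:
            raise ValueError(f"bad box size at offset {pos}")
        boxes.append((box_type.decode("latin-1"), pos, size))
        pos += size
    return boxes

# Tiny synthetic file: ftyp, then moov, then mdat (contents are dummies).
demo = (
    struct.pack(">I4s", 16, b"ftyp") + b"isom\x00\x00\x00\x00"
    + struct.pack(">I4s", 8, b"moov")
    + struct.pack(">I4s", 12, b"mdat") + b"\x00" * 4
)
boxes = list_top_level_boxes(demo)
print([name for name, _offset, _size in boxes])
```

For real footage it would be better to seek through the file rather than load it whole, since the mdat box of a video clip can be very large.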

Cheloute

Quote from: Phil Harvey on November 17, 2022, 08:15:00 AM
Rebuilding the video will be difficult.  To split the camm records into separate samples you will need to modify the stts, stsc, stsz and stco lists in the sample table for the metadata track.

- Phil

Hi Phil, I need some additional details about how to decode camm data.
I managed to write some Java code to move the mdat and moov boxes and update the moov offsets, but I can't find much documentation on decoding camm data. I'm using Mp4Parser to manipulate my mp4 file, but there's no CammBox, so I'm trying to implement my own. I just can't understand how the Google documentation maps onto the hex dump (I had a well-written PDF to understand the rest of the tags, but this one wasn't explained  :'( ).
Well, I'm trying to figure it out with exiftool -ee -v[num], but I'm not sure I understand it correctly. Would you mind giving me a hand?

Thanks

Phil Harvey

I presume you have seen the CAMM specification.

And you can see how the metadata track containing the CAMM information is organized using the ExifTool -v3 option.

What the specification may not mention is that each CAMM record is a fixed length (you'll have to determine the length yourself by adding the sizes of each structure element), and apparently there may be multiple CAMM records stored within a single sample.  So if the sample size is larger than expected according to the first CAMM record in the sample, you need to check after the first record to see if there are more.
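
The check described above (look past the first record when the sample is larger than expected) could be sketched as follows in Python. The payload sizes in `PAYLOAD_SIZE` are an assumption: they were obtained by adding up the structure elements in the CAMM specification, and only the sizes for types 1, 2, 3 and 6 are confirmed by the record sizes quoted in this thread. The function name `split_camm_sample` is illustrative.

```python
import struct

# Payload size in bytes per CAMM type, excluding the 4-byte header.
# Assumption: summed from the CAMM specification's structure definitions.
PAYLOAD_SIZE = {0: 12, 1: 8, 2: 12, 3: 12, 4: 12, 5: 24, 6: 56, 7: 12}

def split_camm_sample(sample):
    """Split one track sample into a list of (type, payload) CAMM records."""
    records = []
    pos = 0
    while pos + 4 <= len(sample):
        # Header: uint16 reserved, uint16 type, both little-endian.
        _reserved, camm_type = struct.unpack_from("<HH", sample, pos)
        size = PAYLOAD_SIZE.get(camm_type)
        if size is None or pos + 4 + size > len(sample):
            raise ValueError(f"unexpected CAMM record at byte {pos}")
        records.append((camm_type, sample[pos + 4:pos + 4 + size]))
        pos += 4 + size
    if pos != len(sample):
        raise ValueError("trailing bytes after the last CAMM record")
    return records

# Sample 1 of the Pro2 Track3 dump quoted earlier in this thread (32 bytes).
pro2_sample = bytes.fromhex(
    "000002008470d13b63141d3c58a08bbc"
    "746103000000783d00107abf000051be"
)
records = split_camm_sample(pro2_sample)
print([camm_type for camm_type, _payload in records])
```

On that 32-byte Pro2 sample this yields a type 2 record followed by a type 3 record, matching the observation that a camm3 record is attached to the end of the camm2 data.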

- Phil

Cheloute

Quote from: Phil Harvey on November 18, 2022, 01:03:11 PM
I presume you have seen the CAMM specification.

And you can see how the metadata track containing the CAMM information is organized using the ExifTool -v3 option.

What the specification may not mention is that each CAMM record is a fixed length (you'll have to determine the length yourself by adding the sizes of each structure element), and apparently there may be multiple CAMM records stored within a single sample.  So if the sample size is larger than expected according to the first CAMM record in the sample, you need to check after the first record to see if there are more.

- Phil

Sorry, I'm still unable to understand. I read the CAMM specs several times (it's pretty short), and studied the output of exiftool -ee -v3 with the hex dump side by side, but there's something I can't understand.
Track 3 (in my sample) contains CAMM Metadata, ok. That's easy to "see", it's written into the hexdump.
But how do I know where in the mdat atom the camm data are? That's what I can't see.

I tried to open another mp4 from Pro2 I made today to see the "7 differences" and exiftool tells me the following:

---- Extract Embedded ----
Track3 Type='camm' Format='camm', Sample 1 of 3465 (32 bytes)
    10f0: 00 00 02 00 8f e4 62 bd 94 9e eb bc 52 e6 82 bc [......b.....R...]
    1100: 74 61 03 00 00 80 d2 3d 00 c0 77 bf 00 40 6e be [ta.....=..w..@n.]
SampleTime = 0
SampleDuration = 0.002
camm2 (SubDirectory) -->
- Tag 'camm' (32 bytes):
    10f0: 00 00 02 00 8f e4 62 bd 94 9e eb bc 52 e6 82 bc [......b.....R...]
    1100: 74 61 03 00 00 80 d2 3d 00 c0 77 bf 00 40 6e be [ta.....=..w..@n.]
+ [BinaryData directory, 32 bytes]
| AngularVelocity = -0.0553937517106533 -0.0287621393799782 -0.015978965908289
| - Tag 0x0004 (12 bytes, float[3]):
|     10f4: 8f e4 62 bd 94 9e eb bc 52 e6 82 bc             [..b.....R...]
Track3 Type='camm' Format='camm', Sample 2 of 3465 (32 bytes)
    1110: 00 00 02 00 89 2a 5a bd 55 43 07 bd 8f e4 62 bc [.....*Z.UC....b.]
    1120: 74 61 03 00 00 00 d1 3d 00 f0 76 bf 00 40 69 be [ta.....=..v..@i.]
SampleTime = 0.002
SampleDuration = 0.002
camm2 (SubDirectory) -->
- Tag 'camm' (32 bytes):
    1110: 00 00 02 00 89 2a 5a bd 55 43 07 bd 8f e4 62 bc [.....*Z.UC....b.]
    1120: 74 61 03 00 00 00 d1 3d 00 f0 76 bf 00 40 69 be [ta.....=..v..@i.]
+ [BinaryData directory, 32 bytes]
| AngularVelocity = -0.05326322093606 -0.0330231972038746 -0.0138484379276633
| - Tag 0x0004 (12 bytes, float[3]):
|     1114: 89 2a 5a bd 55 43 07 bd 8f e4 62 bc             [.*Z.UC....b.]

How does exiftool know that my camm data start at offset 10F0? How does it know there are 3465 samples ?
And how does it know the last camm data is actually the last one?

Thanks

Cheloute

Ok, I'll answer my own question... in the stco atom of the camm track  ;D
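
For anyone following along: the stco atom gives the file offset of each chunk, stsc says how many samples each chunk holds, and stsz gives the sample count and sizes, so the position of every sample (and where the last one ends) follows from those three tables. Below is a minimal Python sketch of that lookup; the function name and the example values are illustrative, and the single-chunk layout used in the example is only a guess that happens to be consistent with the offsets 10f0 and 1110 in the dump above.

```python
def sample_offsets(chunk_offsets, stsc_entries, sample_sizes):
    """Compute the file offset of every sample in a track.

    chunk_offsets: list from stco, one file offset per chunk
    stsc_entries:  list of (first_chunk, samples_per_chunk), sorted,
                   with first_chunk counted from 1
    sample_sizes:  list from stsz, one size per sample
    """
    offsets = []
    sample = 0
    for chunk_index, chunk_offset in enumerate(chunk_offsets, start=1):
        # The applicable stsc entry is the last one with first_chunk <= chunk_index.
        per_chunk = 0
        for first_chunk, samples_per_chunk in stsc_entries:
            if first_chunk <= chunk_index:
                per_chunk = samples_per_chunk
        pos = chunk_offset
        for _ in range(per_chunk):
            if sample >= len(sample_sizes):
                break  # stsz sample count reached
            offsets.append(pos)
            pos += sample_sizes[sample]
            sample += 1
    return offsets

# Hypothetical layout: one chunk at 0x10f0 holding two 32-byte samples.
print([hex(o) for o in sample_offsets([0x10F0], [(1, 2)], [32, 32])])
```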

Cheloute

Hi,

Well, I'm now able to parse my MP4 shot with the Insta360 Pro1, reorganizing the MDAT and MOOV boxes (updating offsets), and creating a second camm track with only the GPS data, as the Pro2 does.
But there's still an error I can't find in how I write the new sample tables.
When I launch exiftool with the -ee -v3 params, I can't see any GPS data, but get the error message "wrong sample size" instead. I'm not sure what exactly it's complaining about.
As I only have GPS data in this camm track, I created a stsz box with the sample size set to 60 (as suggested by the camm specification) and no per-sample size entries.
My GPS records are supposed to be 60 bytes each (that's also what I get from exiftool when parsing the original file), and the size of the stsz box itself appears to be correct, too.
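
For reference, a stsz box with a uniform sample size can be sketched like this in Python. It assumes the usual full-box layout (box size, 'stsz', 1-byte version and 3-byte flags, sample_size, sample_count, then per-sample entries only when sample_size is 0); `build_stsz` is an illustrative name and the count of 428 is borrowed from the Pro2 Track4 dump earlier in the thread purely as an example.

```python
import struct

def build_stsz(sample_size, sample_count, sizes=()):
    """Build a stsz box as bytes.

    With a non-zero sample_size every sample has that size and no table
    follows, but sample_count is still written. With sample_size == 0,
    one 32-bit entry per sample follows.
    """
    # version (1 byte) + flags (3 bytes) packed as one zero uint32.
    body = struct.pack(">III", 0, sample_size, sample_count)
    if sample_size == 0:
        if len(sizes) != sample_count:
            raise ValueError("need exactly one size entry per sample")
        body += b"".join(struct.pack(">I", s) for s in sizes)
    return struct.pack(">I4s", 8 + len(body), b"stsz") + body

box = build_stsz(60, 428)
print(len(box), box.hex())
```

One thing that could be compared against a dump of the rebuilt file is whether sample_count is filled in even though the per-sample table is empty, and whether the box comes out at 20 bytes in that case.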

I created the new stco box from the original stco box of Track3, keeping only the offsets that point to a GPS record. I calculated the corresponding stsc box accordingly.
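
The filtering step above could look roughly like this Python sketch. It assumes the per-sample offsets and sizes of the original track are already known, that the CAMM type is the little-endian uint16 at byte 2 of each sample, and that the new track stores one sample per chunk (so stsc would reduce to a single entry); the names and the synthetic test bytes are made up for illustration.

```python
import struct

GPS_TYPE = 6  # camm6, the GPS record type seen in the dumps in this thread

def gps_sample_offsets(data, samples, gps_type=GPS_TYPE):
    """Keep the offsets of samples whose first CAMM record is a GPS record.

    data:    the file contents as bytes
    samples: list of (offset, size) for the original metadata track
    """
    kept = []
    for offset, size in samples:
        if size < 4:
            continue  # too short to hold a CAMM header
        _reserved, camm_type = struct.unpack_from("<HH", data, offset)
        if camm_type == gps_type:
            kept.append(offset)
    return kept

# Synthetic data: 8 padding bytes, then camm2, camm3 and camm6 records.
data = (
    bytes(8)
    + struct.pack("<HH", 0, 2) + bytes(12)
    + struct.pack("<HH", 0, 3) + bytes(12)
    + struct.pack("<HH", 0, 6) + bytes(56)
)
samples = [(8, 16), (24, 16), (40, 60)]

new_stco = gps_sample_offsets(data, samples)
# One sample per chunk: (first_chunk, samples_per_chunk, sample_description_index).
new_stsc = [(1, 1, 1)]
print(new_stco, new_stsc)
```

Note that this only works if each kept offset is a true file offset of the sample, not merely the offset of the chunk that contains it.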

And to get the stts box, I iterate over the original stts box, summing all the deltas until I detect a GPS data sample. Then I reset the sum and start again, until the end. The count is always 1, as I only have GPS data samples on this track.

So... how could I check what exactly is wrong?
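
One way to sanity-check the stts computation is to derive it from start times rather than from a running sum. The Python sketch below assumes that an stts delta is the time from one sample's start to the next sample's start, so a GPS sample's new delta spans everything up to the following GPS sample. The function name, the tick values (1, 1, 3187, loosely based on the Pro1 durations in the dump with an assumed timescale of 1000000) and the handling of the time before the first GPS sample are all illustrative assumptions.

```python
def gps_only_stts(samples, gps_type=6):
    """Derive stts entries for a track that keeps only the GPS samples.

    samples: list of (camm_type, delta) for the original track, in order,
             with delta in timescale ticks.
    Returns (first_start, entries) where entries is a list of
    (sample_count, sample_delta) and first_start is the start time of the
    first GPS sample, which stts alone cannot express.
    """
    starts = []
    t = 0
    for camm_type, delta in samples:
        if camm_type == gps_type:
            starts.append(t)
        t += delta
    total = t

    # Each GPS sample lasts until the next GPS sample starts; the last one
    # lasts until the end of the original track.
    deltas = [b - a for a, b in zip(starts, starts[1:])]
    if starts:
        deltas.append(total - starts[-1])

    # Run-length encode equal consecutive deltas, as stts does.
    entries = []
    for d in deltas:
        if entries and entries[-1][1] == d:
            entries[-1][0] += 1
        else:
            entries.append([1, d])
    first_start = starts[0] if starts else 0
    return first_start, [tuple(e) for e in entries]

original = [(2, 1), (3, 1), (6, 3187), (2, 1), (3, 1), (6, 3187)]
print(gps_only_stts(original))
```

If the sum is instead reset when a GPS sample is reached, each delta covers the time before that sample rather than after it, which shifts every timestamp by one record; whether that matters to the stitcher is unknown.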