
Unable to read metadata from large audiobooks

Started by vance003, March 21, 2015, 02:27:59 PM

vance003

I am in the process of converting audiobooks from Audible's AAX format to more generic M4B files. I transfer most of the metadata with FFmpeg, since it also handles the conversion, but I use ExifTool (version 9.82 on Windows 7 64-bit) to extract the cover art. Although AAX files use a proprietary format for the recorded sound, the file metadata is the same as for M4A files, and in general ExifTool has no problems at all with them. What I am finding, however, is that if a file is around 1.5 GB or larger, ExifTool cannot read its metadata; that's true for AAX, M4A, and M4B files. I have a few files that are around 1.3 GB, and ExifTool has no problem with them, but at some point between 1.3 and 1.5 GB it gives up the ghost. I can confirm with the program MP3Tag that for the 1.5 GB files (both AAX and M4B) the text tags are indeed present, but they are simply invisible to ExifTool.

Using exiftool -v3 on one of the large AAX files yields:

  ExifToolVersion = 9.82
  FileName = 3.aax
  Directory = ..
  FileSize = 1615619801
  FileModifyDate = 1423202266
  FileAccessDate = 1426961611
  FileCreateDate = 1426960503
  FilePermissions = 33206
  FileType = MP4
  MIMEType = video/mp4
  FileType (SubDirectory) -->
  - Tag 'ftyp' (28 bytes):
      0008: 61 61 78 20 00 00 00 01 61 61 78 20 4d 34 42 20 [aax ....aax M4B ]
      0018: 6d 70 34 32 69 73 6f 6d 00 00 00 00             [mp42isom....]
  + [BinaryData directory, 28 bytes]
  | MajorBrand = aax
  | - Tag 0x0000 (4 bytes, undef[4]):
  |     0008: 61 61 78 20                                     [aax ]
  | MinorVersion = .
  | - Tag 0x0001 (4 bytes, undef[4]):
  |     000c: 00 00 00 01                                     [....]
  | CompatibleBrands = aax M4B mp42isom
  | - Tag 0x0002 (20 bytes, undef[20]):
  |     0010: 61 61 78 20 4d 34 42 20 6d 70 34 32 69 73 6f 6d [aax M4B mp42isom]
  |     0020: 00 00 00 00                                     [....]
  Movie (SubDirectory) -->
  - Tag 'moov' at offset 0x002c (18866552 bytes)
  MovieDataSize = 1596753197
  MovieDataOffset = 18866604
  MovieData
  - Tag 'mdat' at offset 0x11fe1ac (1596753197 bytes)


Using exiftool -v3 on one of the large M4B files yields:

  ExifToolVersion = 9.82
  FileName = 3.m4b
  Directory = ..
  FileSize = 1615943100
  FileModifyDate = 1426957873
  FileAccessDate = 1426961885
  FileCreateDate = 1426960536
  FilePermissions = 33206
  FileType = M4A
  MIMEType = audio/mp4
  FileType (SubDirectory) -->
  - Tag 'ftyp' (16 bytes):
      0008: 4d 34 41 20 00 00 02 00 69 73 6f 6d 69 73 6f 32 [M4A ....isomiso2]
  + [BinaryData directory, 16 bytes]
  | MajorBrand = M4A
  | - Tag 0x0000 (4 bytes, undef[4]):
  |     0008: 4d 34 41 20                                     [M4A ]
  | MinorVersion = .
  | - Tag 0x0001 (4 bytes, undef[4]):
  |     000c: 00 00 02 00                                     [....]
  | CompatibleBrands = isomiso2
  | - Tag 0x0002 (8 bytes, undef[8]):
  |     0010: 69 73 6f 6d 69 73 6f 32                         [isomiso2]
  Free =
  - Tag 'free' (0 bytes):
  Movie (SubDirectory) -->
  - Tag 'moov' at offset 0x0028 (17240421 bytes)
  Free =
  - Tag 'free' (2007 bytes):
   1071195: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
   10711a5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
   10711b5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
   10711c5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
   10711d5: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [................]
      [snip 1927 bytes]
  MovieDataSize = 1598700616
  MovieDataOffset = 17242484
  MovieData
  - Tag 'mdat' at offset 0x1071974 (1598700616 bytes)


In contrast, running the same command on files of 1.3 GB or less unleashes a torrent of metadata, including Author, Album, Description, Genre, etc. (If you need me to, I can post the output for one of the 1.3 GB files.)

Are there any workarounds for handling these big files? Thanks.

Phil Harvey

I'm away this week so I can't look into this in detail right now.  Unfortunately it may take a sample file for me to reproduce the problem for testing, but we should wait until I can investigate further before trying that.

- Phil
...where DIR is the name of a directory/folder containing the images.  On Mac/Linux/PowerShell, use single quotes (') instead of double quotes (") around arguments containing a dollar sign ($).

vance003

Thanks. If you need a copy of a file, let me know. I would prefer not to post a public link to it, since these recordings are NOT public domain, and they're just too big to email. But there are other ways of getting you the info, if you need it.

Phil Harvey

I'm back now and could use a file for testing.  If you can upload it and email me with the URL it would be great.  My email is philharvey66 at gmail.com

- Phil

vance003

I am now sending you an email linking to two files in Dropbox, one an Audible AAX and the other a converted M4B. The email will be from a different email address from the one I use on this board, but the user ID is still vance003. Thanks!

Phil Harvey

I got the files, thanks.

Unfortunately though, there is nothing I can do to fix this because the files you sent simply do not contain any metadata.  This is a problem with the writer, not with ExifTool.  I confirmed this by opening the files in Adobe Bridge, and by doing a binary dump.

- Phil

(P.S: Atlas Shrugged is the novel that put me off reading.  In my teens I was an avid sci-fi reader, until I came up against this beast -- I just couldn't grind my way through it.  It was the last novel I read for about 30 years.)

vance003

Quote from: Phil Harvey on April 03, 2015, 09:58:53 AM
Unfortunately though, there is nothing I can do to fix this because the files you sent simply do not contain any metadata.  This is a problem with the writer, not with ExifTool.  I confirmed this by opening the files in Adobe Bridge, and by doing a binary dump.

???

That's weird. If I use mp3tag with them, it shows lots of metadata --- Title, Artist (a/k/a author), Comment, Cover Art, etc. It's there.

I had read that AAX files use M4A format for their metadata, but it's possible that this standard has been "tweaked" by Audible. Nevertheless, the tags should be there. Would it help if I uploaded a couple of slightly smaller files where ExifTool does show the info?

Oh, and thanks for helping with this!

Phil Harvey

Odd.  Try using the ExifTool -v3 option on the files (good and bad), and see if you can discover anything.

In both files you sent, the movie data (which contains no metadata as far as I know) went right to the end of the files, so there was no metadata afterwards.  I didn't look all that closely at the file header, but there wasn't much there.

If you leave your files up on the server, and tell me what information mp3tag shows, then I'll take another look at them tomorrow.

- Phil

Edit: I got curious, so I dropped in to work again today.

You're right.  I didn't look far enough into the header.  ExifTool was skipping this metadata header because the atom is very large (> 16 MB).  I don't know why Adobe Bridge didn't return anything.  I will change ExifTool to allow QuickTime metadata atoms up to 32 MB and warn if it encounters a larger atom.

ExifTool 9.91 should be able to read metadata from these files.

Thanks for pointing this out, and for providing the samples.
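
For anyone who wants to check their own files for the same condition, here is a minimal Python sketch. It is not ExifTool's code; it only walks the top-level atoms of an MP4/M4A/M4B/AAX file and lists those larger than a threshold. The function names and the `LIMIT` constant are made up for this illustration, and it assumes "16 MB" means 16 × 1024 × 1024 = 16777216 bytes. In the -v3 output earlier in the thread, the 'moov' atoms are 18866552 and 17240421 bytes, which is over that figure in both files.

```python
import struct
from typing import BinaryIO, Iterator, List, Tuple

# Assumption: "16 MB" is taken to mean 16 * 1024 * 1024 bytes.
LIMIT = 16 * 1024 * 1024

Atom = Tuple[str, int, int]  # (type, offset of atom header, total size in bytes)


def iter_top_level_atoms(stream: BinaryIO) -> Iterator[Atom]:
    """Yield (type, offset, size) for each top-level atom in an MP4-style file."""
    stream.seek(0, 2)          # seek to the end to learn the file length
    end = stream.tell()
    offset = 0
    while offset + 8 <= end:
        stream.seek(offset)
        # Each atom starts with a 32-bit big-endian size and a 4-byte type.
        size, kind = struct.unpack(">I4s", stream.read(8))
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = stream.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        elif size == 0:
            # A size of 0 means the atom runs to the end of the file.
            size = end - offset
        if size < header_len:
            break              # corrupt header; stop rather than loop forever
        yield kind.decode("latin-1"), offset, size
        offset += size


def oversized_atoms(stream: BinaryIO, limit: int = LIMIT) -> List[Atom]:
    """Return the top-level atoms whose total size exceeds `limit` bytes."""
    return [atom for atom in iter_top_level_atoms(stream) if atom[2] > limit]
```

To try it, open the file in binary mode and pass the handle in, e.g. `with open("3.m4b", "rb") as f: print(oversized_atoms(f))`. Note that the offsets and sizes reported here include the 8- or 16-byte atom header, so they will not match the numbers in the -v3 listing exactly. A large 'mdat' atom is expected in any audiobook; it is a large 'moov' atom that matters for the problem described above.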

vance003

Version 9.91 works like a charm! Many thanks for tackling this, and thanks for your great program.