Fork of https://github.com/mv2devnul/taglib

Mark VandenBrink 5d65517579 cleanup continues, formatting, etc 12 years ago
.gitignore cd4937747a Initial check in 12 years ago
LICENSE cd4937747a Initial check in 12 years ago
README.md d62d4bf303 documentation, cleanup, reworking mpeg.lisp to be faster 12 years ago
audio-streams.lisp 5d65517579 cleanup continues, formatting, etc 12 years ago
id3-frame.lisp 47b6eeb390 cleanup 12 years ago
iso-639-2.lisp e831a401b5 whitespace cleanup 12 years ago
logging.lisp 0c4935dcb9 documentation, cleanup, etc 12 years ago
mp3-tag.lisp 47b6eeb390 cleanup 12 years ago
mp4-atom.lisp 5d65517579 cleanup continues, formatting, etc 12 years ago
mp4-tag.lisp e831a401b5 whitespace cleanup 12 years ago
mpeg.lisp 5d65517579 cleanup continues, formatting, etc 12 years ago
packages.lisp 5d65517579 cleanup continues, formatting, etc 12 years ago
taglib-tests.asd dd54b78d43 documentation, cleanup, reworking mpeg.lisp to be faster 12 years ago
taglib-tests.lisp 5d65517579 cleanup continues, formatting, etc 12 years ago
taglib.asd e831a401b5 whitespace cleanup 12 years ago
utils.lisp 5d65517579 cleanup continues, formatting, etc 12 years ago

README.md

Copyright (c) 2013, Mark VandenBrink. All rights reserved.

A pure Lisp implementation for reading tags and audio information from MPEG-4 audio (MP4/M4A) and MP3 (MPEG audio Layer III) files.

Mostly complete. Your mileage may vary. Most definitely, NOT portable. Heavily dependent on Clozure CCL.

Note: There are a lot of good (some great) audio file resources out there. Here are a few of them that I found useful:

  • l-smash: Exhaustively comprehensive MP4 box parser in C.
  • taglib: Clean library in C++.
  • mplayer: For me, the definitive tool on how to crack audio files.
  • eyeD3: Great command line tool.
  • MP3Diags: Good GUI-based tool. Tends to be slow, but very thorough.
  • MediaInfo: C++, can dump out all the info to command line and also has a GUI.
  • The MP4 Book: I actually didn't order this until well into writing this code. What a maroon. It would have saved me TONS of time.

Notes II:

  • Depends on the Quicklisp packages LOG5 and ALEXANDRIA. See taglib.asd.
  • As the author(s) of taglib state in their comments, parsing ID3s is actually pretty hard. There are so many broken taggers out there that it is tough to compensate for all their errors.
  • The parsing of MP3 audio properties (mpeg.lisp) is far from complete, especially when dealing with odd cases WRT Xing headers.
  • I've parsed just enough of the MP4 atoms/boxes to suit the needs of this tool. l-smash appears to parse all boxes. Maybe one day this lib will too.
  • WRT error handling: in some cases, I've made errors recoverable, but in general, I've gone down the path of erroring out when I hit problems.
  • I've run this tool across my collection of 19,000+ audio files and compared the results to some of the tools above, with little to no variation. That said, I have a pretty uniform collection, mostly from ripping CDs, then iTunes purchases/matched, and then Amazon matched. YMMV.
  • Parsing the audio info in an MP3 is hideously inefficient and needs a rewrite (mpeg.lisp). There is a global parameter in audio-streams.lisp called *get-audio-info* that controls whether parse-mp4-file/parse-mp3-file try to extract this info. To speed things up, you can bind this parameter to nil (e.g., (let ((audio-streams:*get-audio-info* nil)) (parse-...))). As an example, in the timings below, ten runs of mp3-test1 take about 11 milliseconds when we're not getting audio info and about 1.6 seconds when we are:

    TAGLIB-TESTS> (time (dotimes (i 10) (mp3-test1)))
    (DOTIMES (I 10) (MP3-TEST1))
    took 1,640,067 microseconds (1.640067 seconds) to run.
        15,628 microseconds (0.015628 seconds, 0.95%) of which was spent in GC.
    During that period, and with 4 available CPU cores,
     1,636,000 microseconds (1.636000 seconds) were spent in user mode
         8,000 microseconds (0.008000 seconds) were spent in system mode
    121,941,600 bytes of memory allocated.
    1 minor page faults, 0 major page faults, 0 swaps.
    NIL
    TAGLIB-TESTS> (let ((audio-streams:*get-audio-info* nil)) (time (dotimes (i 10) (mp3-test1))))
    (DOTIMES (I 10) (MP3-TEST1))
    took 11,195 microseconds (0.011195 seconds) to run.
    During that period, and with 4 available CPU cores,
      8,000 microseconds (0.008000 seconds) were spent in user mode
          0 microseconds (0.000000 seconds) were spent in system mode
    575,520 bytes of memory allocated.
    NIL
    
  • For now, USE-MMAP in *features* is purely experimental. It seems pretty flaky, but that's probably because I'm using ccl:: methods without much regard to sanity...
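
As background for the notes above on MP3 audio properties, here is a minimal sketch of decoding a single MPEG 1, Layer III frame header. It is not taken from mpeg.lisp; the names decode-mpeg1-layer3-header, *mpeg1-layer3-bit-rates*, and *mpeg1-sample-rates* are hypothetical, and only the MPEG 1, Layer III tables are covered. In a VBR file each frame can carry a different bit rate, so without a usable Xing header a parser may have to visit every frame to compute duration, which is one plausible reason the audio-info pass is so much more expensive than tag parsing.

```lisp
;; Illustrative only: these names are hypothetical and not part of taglib.
(defparameter *mpeg1-layer3-bit-rates*
  #(nil 32 40 48 56 64 80 96 112 128 160 192 224 256 320 nil)
  "Kbps by bit-rate index for MPEG 1, Layer III; NIL marks free/invalid.")

(defparameter *mpeg1-sample-rates* #(44100 48000 32000 nil)
  "Hz by sample-rate index for MPEG 1; NIL marks the reserved index.")

(defun decode-mpeg1-layer3-header (header)
  "Decode a 32-bit frame header given as an integer. Returns a plist,
or NIL if HEADER is not an MPEG 1, Layer III frame header."
  (let ((sync    (ldb (byte 11 21) header))  ; 11 sync bits, all ones
        (version (ldb (byte 2 19) header))   ; #b11 = MPEG 1
        (layer   (ldb (byte 2 17) header)))  ; #b01 = Layer III
    (when (and (= sync #x7FF) (= version #b11) (= layer #b01))
      (list :bit-rate    (aref *mpeg1-layer3-bit-rates*
                               (ldb (byte 4 12) header))
            :sample-rate (aref *mpeg1-sample-rates*
                               (ldb (byte 2 10) header))
            :padded      (logbitp 9 header)))))
```

For the header bytes FF FB 90 00, (decode-mpeg1-layer3-header #xFFFB9000) reports a bit rate of 128 Kbps and a sample rate of 44100 Hz.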

And now for some sample invocations and outputs:

(let (foo)
  (unwind-protect
       (setf foo (parse-mp4-file "01 Keep Yourself Alive.m4a"))
    (when foo (stream-close foo)))    ; make sure underlying open file is closed
  (mp4-tag:show-tags foo))

Yields:

01 Keep Yourself Alive.m4a
sample rate: 44100.0 Hz, # channels: 2, bits-per-sample: 16, max bit-rate: 314 Kbps, avg bit-rate: 256 Kbps, duration: 4:03
    album: Queen I
    album-artist: Queen
    artist: Queen
    compilation: no
    disk: (1 1)
    genre: 80 (Hard Rock)
    title: Keep Yourself Alive
    track: (1 11)
    year: 1973
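
The tags above are stored in MP4 atoms (boxes). As a rough illustration of the container format, and not of how mp4-atom.lisp is written, the sketch below reads box headers from an in-memory byte vector: each box starts with a 32-bit big-endian size followed by a four-character type. The function names are hypothetical, and the special sizes 0 (box runs to end of file) and 1 (64-bit size follows) are deliberately not handled.

```lisp
;; Illustrative only: hypothetical helpers, not taglib's API.
(defun read-u32-be (bytes offset)
  "Read a 32-bit big-endian unsigned integer from BYTES at OFFSET."
  (logior (ash (aref bytes offset) 24)
          (ash (aref bytes (+ offset 1)) 16)
          (ash (aref bytes (+ offset 2)) 8)
          (aref bytes (+ offset 3))))

(defun read-box-header (bytes offset)
  "Return (values TYPE SIZE) for the box starting at OFFSET."
  (values (map 'string #'code-char
               (subseq bytes (+ offset 4) (+ offset 8)))
          (read-u32-be bytes offset)))

(defun list-top-level-boxes (bytes)
  "Walk BYTES and return a list of (TYPE . SIZE) for each top-level box."
  (let ((offset 0)
        (boxes '()))
    (loop
      ;; Stop when fewer than 8 bytes remain for another header.
      (when (> (+ offset 8) (length bytes)) (return))
      (multiple-value-bind (type size) (read-box-header bytes offset)
        ;; Sizes below 8 include the special values 0 and 1, which this
        ;; sketch does not handle, so it simply stops there.
        (when (< size 8) (return))
        (push (cons type size) boxes)
        (incf offset size)))
    (nreverse boxes)))
```

Container boxes such as moov hold further boxes in their payload, so a fuller parser would recurse into them with the same header logic.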

The show-tags methods also have a "raw" capability. Example:

(let (foo)
  (unwind-protect
       (setf foo (parse-mp3-file "Queen/At the BBC/06 Great King Rat.mp3"))
    (when foo (stream-close foo)))    ; make sure underlying open file is closed
  (mp3-tag:show-tags foo :raw t))

Yields:

Queen/At the BBC/06 Great King Rat.mp3: MPEG 1, Layer III, VBR, sample rate: 44,100 Hz, bit rate: 128 Kbps, duration: 5:60
Header: version/revision: 3/0, flags: 0x00: 0/0/0/0, size = 11,899 bytes; No extended header; No V21 tag
    Frames[9]:
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 0, version = 3, id: TIT2, len: 15, NIL, encoding = 0, info = <Great King Rat>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 25, version = 3, id: TPE1, len: 6, NIL, encoding = 0, info = <Queen>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 41, version = 3, id: TPE2, len: 6, NIL, encoding = 0, info = <Queen>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 57, version = 3, id: TALB, len: 11, NIL, encoding = 0, info = <At the BBC>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 78, version = 3, id: TRCK, len: 4, NIL, encoding = 0, info = <6/8>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 92, version = 3, id: TPOS, len: 4, NIL, encoding = 0, info = <1/1>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 106, version = 3, id: TYER, len: 5, NIL, encoding = 0, info = <1995>
        frame-text-info: flags: 0x0000: 0/0/0/0/0/0, offset: 121, version = 3, id: TCON, len: 5, NIL, encoding = 0, info = <(79)>
        frame-txxx: flags: 0x0000: 0/0/0/0/0/0, offset: 136, version = 3, id: TXXX, len: 33, NIL, <Tagging time/2013-08-08T16:38:38>
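
One detail behind the Header line above: the ID3v2 specification stores the tag size as a "synchsafe" integer, four bytes with only the low 7 bits of each byte significant. The sketch below shows that decoding; decode-synchsafe is a hypothetical name, not a function from id3-frame.lisp, and treating 11,899 as the exact value stored in this file's header is an assumption made for the sake of the example.

```lisp
;; Illustrative only: hypothetical helper, not taglib's API.
(defun decode-synchsafe (b0 b1 b2 b3)
  "Combine four synchsafe bytes, most significant first, into an integer.
The high bit of each byte is ignored, leaving 28 significant bits."
  (logior (ash (logand b0 #x7F) 21)
          (ash (logand b1 #x7F) 14)
          (ash (logand b2 #x7F) 7)
          (logand b3 #x7F)))
```

Under that assumption, the header bytes 00 00 5C 7B decode to 11,899, whereas reading the same bytes as a plain 32-bit integer would give 23,675. ID3v2.3 frame sizes are plain integers while ID3v2.4 frame sizes are synchsafe, and taggers that mix the two up are one source of the broken tags mentioned earlier.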

I have a semi-complete logging strategy in place that is primarily used to figure out what happened when I get an unexpected error parsing a file. To see the output of ALL logging statements to STANDARD-OUTPUT, you can do the following:

(with-logging ()
    (taglib-tests::test2))

To see only the MP4-ATOM related logging output and redirect logging to a file called "foo.txt":

(with-logging ("foo.txt" :categories (categories '(mp4-atom::cat-log-mp4-atom)))
    (taglib-tests::test2))

See logging.lisp for more info.

If you really want to create a lot of output, you can do the following:

(with-logging ("log.txt")
    (redirect "q.txt" (test2 :dir "somewhere-where-you-have-all-your-audio" :raw t)))

For my 19,000+ files, this generates 218,788,792 lines in "log.txt" and 240,727 lines in "q.txt".
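
Because the library mostly signals an error when it hits a problem file, a run over a large collection may want to record failures and keep going. The following is a minimal sketch of that pattern using standard handler-case; parse-all is a hypothetical helper, and the parser is passed in as a function so that nothing is assumed about the signatures of parse-mp3-file or parse-mp4-file beyond what the samples above show.

```lisp
;; Illustrative only: hypothetical helper, not part of taglib.
(defun parse-all (paths parse-fn)
  "Call PARSE-FN on each element of PATHS.
Returns two values: a list of (PATH . RESULT) for successes and a list
of (PATH . MESSAGE) for paths where PARSE-FN signaled an error."
  (let ((results '())
        (failures '()))
    (dolist (path paths)
      (handler-case
          (push (cons path (funcall parse-fn path)) results)
        ;; Record the printed condition and move on to the next file.
        (error (c)
          (push (cons path (princ-to-string c)) failures))))
    (values (nreverse results) (nreverse failures))))
```

If the function passed in returns an open stream, as the parse samples above suggest, the caller would still need to close each result once it is done with it.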