Force reindex of media files
  • muhoo
    I'm having a terrible time trying to force Airtime to reindex/reimport a bunch of files.

    I have some 2500 files in my /home/drop directory, which is watched by Airtime.

    Only 387 of them are showing up in Airtime's file list! Where are the rest of them?

    I tried the trick of removing and re-adding the watch dir in the Airtime interface, and also doing the same from airtime-import. It doesn't help.

    So I tried using airtime-import -m to move them from /home/drop. That appeared to work: all the files got physically moved from /home/drop into /srv/airtime/stor/imported, /srv/airtime/stor/organize, etc., but they still never appeared in the database! So I moved them back to /home/drop. I tried restarting airtime-media-monitor several times.

    I've been thrashing at this for some time, and I'm getting very annoyed and frustrated. What is the command to force the whole directory to be reimported and, most importantly, reindexed so that it shows up in the playlist builder? There are thousands of files, all valid ogg or mp3 files, and yet Airtime doesn't even appear to be trying to add them.
  • Please tar the logs for media-monitor and post them here.

    /var/log/airtime/media-monitor/
    Airtime Pro Hosting: http://airtime.pro
  • muhoo
    Ah, I see. Bad ID3 tags in only one file will cause the whole import to fail!

    It'd be nice if there were an exception wrapper that caused it to just skip over bad files. I looked at the source, and rather than modify it, I just ran the following bash hack on my folder of files:

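    # Convert ID3 tags from CP1251 to Unicode; move files mid3iconv cannot handle out of the way.
    mkdir -p ../broken
    find . -name "*.mp3" | while IFS= read -r i; do
        echo "$i"
        # A non-zero exit status means the conversion failed for this file.
        if ! mid3iconv -e CP1251 --remove-v1 "$i"; then
            mv "$i" ../broken/
        fi
    done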
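
    The "exception wrapper" idea above can be sketched in Python. This is only an illustration of the skip-and-continue pattern, not Airtime's code: `import_all` is a made-up helper name, and `import_one` is a hypothetical stand-in for whatever reads a single file's metadata.

```python
from typing import Callable, Iterable, List, Tuple


def import_all(
    paths: Iterable[str],
    import_one: Callable[[str], None],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Try every path; collect failures instead of aborting the whole run.

    `import_one` is a hypothetical per-file importer (an assumption, not an
    Airtime function). Any exception it raises is recorded and skipped.
    """
    imported: List[str] = []
    skipped: List[Tuple[str, str]] = []
    for path in paths:
        try:
            import_one(path)
        except Exception as exc:  # deliberately broad: one bad file must not stop the rest
            skipped.append((path, repr(exc)))
        else:
            imported.append(path)
    return imported, skipped


def demo_importer(path: str) -> None:
    """Toy importer for the demo: rejects any name containing 'bad'."""
    if "bad" in path:
        raise ValueError("unreadable tags")


if __name__ == "__main__":
    ok, failed = import_all(["a.mp3", "bad.ogg", "c.mp3"], demo_importer)
    print(ok)      # files that went through
    print(failed)  # (path, error) pairs to inspect later
```

    Returning the skipped list, rather than only logging it, makes it easy to move those files aside afterwards, much like the ../broken/ folder in the shell loop.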
  • Could you post the lines from the log that mentioned where it was failing? We fixed some similar problems for the upcoming patch, and I'm interested if these are the same ones.
  • muhoo
    Sure:

    2012-07-09 11:48:10,652 DEBUG - [MainThread] [channel.py : __init__()] : LINE 70 - using channel_id: 1
    2012-07-09 11:48:10,653 DEBUG - [MainThread] [channel.py : _open_ok()] : LINE 484 - Channel open
    2012-07-09 11:48:11,026 INFO - [MainThread] [media_monitor.py : <module>()] : LINE 129 - Added watch to /srv/airtime/stor/
    2012-07-09 11:48:11,026 INFO - [MainThread] [media_monitor.py : <module>()] : LINE 130 - wdd result 2
    2012-07-09 11:48:11,163 ERROR - [MainThread] [media_monitor.py : <module>()] : LINE 143 - Exception: 'utf8' codec can't decode bytes in position 11-13: invalid data
    2012-07-09 11:48:11,173 ERROR - [MainThread] [media_monitor.py : <module>()] : LINE 144 - traceback: Traceback (most recent call last):
      File "/usr/lib/airtime/media-monitor/media_monitor.py", line 133, in <module>
        wdd = notifier.watch_directory(dir)
      File "/usr/lib/airtime/media-monitor/airtimefilemonitor/airtimenotifier.py", line 194, in watch_directory
        return self.wm.add_watch(directory, self.mask, rec=True, auto_add=True)
      File "/usr/lib/airtime/airtime_virtualenv/lib/python2.6/site-packages/pyinotify.py", line 1887, in add_watch
        for rpath in self.__walk_rec(apath, rec):
      File "/usr/lib/airtime/airtime_virtualenv/lib/python2.6/site-packages/pyinotify.py", line 2075, in __walk_rec
        for root, dirs, files in os.walk(top):
      File "/usr/lib/airtime/airtime_virtualenv/lib/python2.6/os.py", line 284, in walk
        if isdir(join(top, name)):
      File "/usr/lib/airtime/airtime_virtualenv/lib/python2.6/posixpath.py", line 68, in join
        path +=  b
      File "/usr/lib/airtime/airtime_virtualenv/lib/python2.6/encodings/utf_8.py", line 16, in decode
        return codecs.utf_8_decode(input, errors, True)
    UnicodeDecodeError: 'utf8' codec can't decode bytes in position 11-13: invalid data

    2012-07-09 11:48:39,680 INFO - [MainThread] [media_monitor.py : <module>()] : LINE 78 -

    Also, the above file was an OGG file, not an MP3 file. There were some MP3s with dodgy tags, but the error which crashes the media monitor does NOT appear to be related to ID3, but rather to OGG!

    It turns out this is a surprisingly hard problem. I'm trying every damn automated command line id3 conversion tool I can find, and none seem to be doing anything other than destroying mp3 files.

    I'm trying id3iconv right now, let's see how this one goes.
    Post edited by muhoo at 2012-07-10 03:04:27
  • Thanks for posting this,

    We actually have *not* fixed this for our upcoming patch. However looking at the stack trace it looks like one of our libraries (pyinotify) is crashing while doing "os.walk" on your filesystem.

    Do you know what encoding your filenames are being stored as? The stack trace suggests that they aren't in utf8 format since it could not decode using this format.

    What is the output of "locale" on your system?
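
    One way to check the filename encoding question is to list every name under the watched directory that is not valid UTF-8. The following is a minimal sketch, assuming Python 3 (not the Python 2.6 shown in the traceback); the helper names `is_valid_utf8` and `find_non_utf8_names` are made up for illustration, and /home/drop is the watched directory from the first post.

```python
import os
from typing import List


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode cleanly as UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def find_non_utf8_names(root: bytes) -> List[bytes]:
    """Walk `root` in bytes mode and return paths whose names are not UTF-8.

    Passing a bytes path makes os.walk yield raw bytes, so no filename is
    decoded (and none can raise UnicodeDecodeError) during the walk itself.
    """
    bad: List[bytes] = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            if not is_valid_utf8(name):
                bad.append(os.path.join(dirpath, name))
    return bad


if __name__ == "__main__":
    # Print the offending paths as byte literals so the bad bytes stay visible.
    for path in find_non_utf8_names(b"/home/drop"):
        print(repr(path))
```

    An empty result would suggest the filenames themselves are fine and the problem lies elsewhere, for example in the tags.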
  • Hi!

    I'm having a similar problem: I'm trying to watch a removable disk with 65000+ files, but the media monitor crashes with an error similar to the one reported by muhoo.

    I'm attaching the media-monitor.log file. My locale reports:

    root@microserver:~# locale
    LANG=en_US.UTF-8
    LANGUAGE=en_US:en
    LC_CTYPE="en_US.UTF-8"
    LC_NUMERIC=it_IT.UTF-8
    LC_TIME=it_IT.UTF-8
    LC_COLLATE="en_US.UTF-8"
    LC_MONETARY=it_IT.UTF-8
    LC_MESSAGES="en_US.UTF-8"
    LC_PAPER=it_IT.UTF-8
    LC_NAME=it_IT.UTF-8
    LC_ADDRESS=it_IT.UTF-8
    LC_TELEPHONE=it_IT.UTF-8
    LC_MEASUREMENT=it_IT.UTF-8
    LC_IDENTIFICATION=it_IT.UTF-8
    LC_ALL=

    The music files are a bunch of messy mp3, ogg and other things that came from everywhere, with or without ID3, any ID3 version and encoding, any bitrate, anything. The disk is a VFAT filesystem.

    I'm trying the mid3iconv script proposed by muhoo too; I'll report back as soon as it finishes converting all those files.

    Thanks for the great job! :)

    incastratamente,
    Francesco P.

  • Hello,

    So what is happening is that you have files with invalid file name encodings on your filesystem. The quick way to fix this is to run:

    sudo convmv -r -f iso8859-1 -t utf8 /path/to/watch/directory

    Take a look at the output and verify that the changes it will make are correct. Then actually make the changes by adding the --no-test option, like so:

    sudo convmv --no-test -r -f iso8859-1 -t utf8 /srv/airtime/stor
  • It seems to be something related to Ubuntu 12.04 and Python 2.7:

    https://bugs.launchpad.net/ubuntu/+source/duplicity/+bug/989496

    I've partially solved it by changing the file

    /usr/lib/airtime/airtime_virtualenv/lib/python2.7/encodings/utf_8.py

    so that the function decode() looks like:
    def decode(input, errors='strict'):
        return codecs.utf_8_decode(input, 'replace', True)
    I'm just forcing the "replace" parameter. I don't know whether paths are imported in a broken form this way, so that files show up in the library but cannot be played. At least the media monitor doesn't seem to crash anymore!

    tentatamente,
    Francesco P.

  • Reply to @Francesco P. Sileno:

    That is one way to make the errors disappear, but I'm not sure if that actually fixes the problem.

    I'd strongly suggest you use the convmv command instead.
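
    To illustrate the concern, here is a minimal sketch of what the "replace" error handler does to a filename that is not valid UTF-8. The helper names `decode_name` and `survives_round_trip` and the example filename are made up for the demonstration. Whether this actually breaks playback depends on how Airtime stores and reopens the decoded path, which is not verified here.

```python
def decode_name(raw: bytes, errors: str) -> str:
    """Decode raw filename bytes as UTF-8 with the given error handler."""
    return raw.decode("utf-8", errors)


def survives_round_trip(raw: bytes, errors: str) -> bool:
    """True if decoding and re-encoding gives back the original bytes."""
    return decode_name(raw, errors).encode("utf-8") == raw


if __name__ == "__main__":
    latin1_name = "café.mp3".encode("latin-1")  # b'caf\xe9.mp3', not valid UTF-8
    utf8_name = "café.mp3".encode("utf-8")      # b'caf\xc3\xa9.mp3'

    # The invalid byte is swapped for U+FFFD (the replacement character).
    print(ascii(decode_name(latin1_name, "replace")))   # 'caf\ufffd.mp3'
    # The decoded name no longer maps back to the bytes on disk.
    print(survives_round_trip(latin1_name, "replace"))  # False
    # Names that were already UTF-8 are unaffected.
    print(survives_round_trip(utf8_name, "replace"))    # True
```

    Renaming the files with convmv avoids the question entirely, because the names on disk then decode cleanly with the default strict handler.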
  • Also post #7 from the bug report says the same thing:


    Incorrect filename encoding.
  • Thanks Martin, I reverted utf_8.py to its original state and tried the convmv command, but it seems that it is still reporting exceptions, this time without crashing. I'm checking again; there are files that even convmv doesn't seem to recognize...

    Now I'm having another problem: when I add that huge directory as a watch, after a while media monitor starts logging these messages:

    2012-07-27 14:05:45,343 ERROR - [Thread #4] [api_client.py : get_response_from_server()] : LINE 177 - Error Authenticating with remote server: HTTP Error 500: Internal Server Error
    2012-07-27 14:05:45,343 ERROR - [Thread #4] [api_client.py : get_response_from_server()] : LINE 183 - Error connecting to server, waiting 5 seconds and trying again.
    Then, after another while, it logs some more files and starts again with those errors. In a whole night it wasn't able to import those 65000+ files. I'm attaching the full log from a fresh start.

    Is it worth opening a new thread on this subject, or are these two problems related?

    progreditamente,
    Francesco P.

  • Just to publicly report what I've achieved so far:

    - I ran the convmv command suggested by Martin.

    - when I get the "HTTP Error 500: Internal Server Error", I have to change the status of the watched directory from exists = FALSE to exists = TRUE in the cc_music_dirs table of the database. Then everything starts working again.

    - with the change in utf_8.py that I wrote before, the "import" of the watched directory works to the end.

    - I added a small sleep() in AirtimeNotifier::update_airtime(), to avoid excessive load on the machine (HP Microserver).

    - I then wrote the small PHP script that I attach here to verify that imported files are stored with the right path/name.

    In my case, all the 65484 files stored in the database seem to have been correctly imported!

    completatamente,
    Francesco P.