Invalid match and poor confidence #63

Closed
balnagy opened this issue Nov 17, 2014 · 7 comments

@balnagy

balnagy commented Nov 17, 2014

Hi,

I would like to use this library to find the position within a single song, but it does not work at all for the song I'm using. I don't know whether the problem is the song or the algorithm, but even the tests fail with invalid matches and zero confidence.

The song is very popular (Taylor Swift - Shake It Off), but it's licensed. You can try to get it with a few commands.

youtube-dl https://www.youtube.com/watch?v=nfWlot6h_JM
ffmpeg -i Taylor\ Swift\ -\ Shake\ It\ Off-nfWlot6h_JM.mp4 Taylor\ Swift\ -\ Shake\ It\ Off-nfWlot6h_JM.wav

Then I modified test_dejavu.sh to scan wav files and ran it. I used wav files because the mp3 had a strange length, but the wav seems to be fine.

Can you help me to fix this issue?

Thanks!

@worldveil
Owner

Could you please share the modifications to the script so that I or others can run it? Just so the issue is completely reproducible.

@balnagy
Author

balnagy commented Nov 18, 2014

Sure, I will try and add more instructions.

  1. I modified this line: https://github.com/worldveil/dejavu/blob/master/test_dejavu.sh#L11 to
python dejavu.py fingerprint ./mp3/ wav
  2. I removed all the other mp3 files from the mp3 directory, so only my wav file remained.
  3. I created a clean database named dejavu_test2 in the local MySQL.
  4. I created a dejavu.cnf config file like this:
{
    "database": {
        "host": "127.0.0.1",
        "user": "root",
        "passwd": "",
        "db": "dejavu_test2"
    }
}
  5. Then I ran ./test_dejavu.sh

Results

  • Confidence_*sec.png has only 0 values
  • matching_perc_*sec.png has only invalid values

Thanks!

@worldveil
Owner

@balnagy ah I see. There is no problem with Dejavu, but the testing framework has a bug where if the name of your track on disk includes an underscore (_), then the match will always be invalid because the strings compared will be different.

If you look in the results/dejavu-tests.log generated by the testing suite (as you should!), you'll see it is predicting the correct song, but the track name excludes the part following the last underscore:

file: Taylor Swift - Shake It Off-nfWlot6h_JM_69_3sec.wav
song: Taylor Swift - Shake It Off-nfWlot6h
song_result: Taylor Swift - Shake It Off-nfWlot6h_JM
invalid match

But if I extract a random 3 second segment from the Taylor Swift track to mytest.wav and use the command line tool, everything works fine:

$ python dejavu.py recognize file mytest.wav 
{'match_time': 0.3390800952911377, 'song_id': 1, 'confidence': 1326, 'song_name': 'Taylor Swift - Shake It Off-nfWlot6h_JM', 'offset': 646L}

The problem is here. I didn't originally write the testing suite, but had a contributor kind enough to make it. It needs some love, though. Long term, this obviously needs to be fixed.

In the short term, I might instead use 3-4 underscores (____) as a separator, which is even more hackish (ugh), but is a temporary fix. In the even shorter term, you could change the filename part nfWlot6h_JM to nfWlot6hJM by removing the underscore.
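To make the truncation concrete, here is a minimal sketch of one way it could arise, and of a parse that tolerates underscores in the track name. It assumes test clips are named `<song>_<start>_<N>sec.wav`, as the log above suggests. The functions `parse_naive` and `parse_from_right` are hypothetical illustrations, not the test suite's actual code.

```python
import os


def parse_naive(filename):
    """Treat everything before the first underscore as the song name.

    Hypothetical: one way a truncated name like the logged one could arise.
    """
    stem = os.path.splitext(filename)[0]
    return stem.split("_")[0]


def parse_from_right(filename):
    """Peel the two trailing fields (<start>, <N>sec) off from the right.

    Assumes the clip name ends in _<start>_<N>sec; underscores inside
    the song name are left untouched.
    """
    stem = os.path.splitext(filename)[0]
    song, start, length = stem.rsplit("_", 2)
    return song, int(start), int(length.replace("sec", ""))


name = "Taylor Swift - Shake It Off-nfWlot6h_JM_69_3sec.wav"
print(parse_naive(name))       # song name cut at the first underscore
print(parse_from_right(name))  # full song name, start second, clip length
```

Splitting from the right only relies on the suffix the test suite generates itself, so it should not matter how many underscores the original track name contains.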

@balnagy
Author

balnagy commented Nov 20, 2014

@worldveil, wow, thanks. I have now repeated the test after renaming the file to 1.wav, and I get higher confidence (40-700), but the offset still doesn't match. I could imagine the song is repetitive, but none of the 5 samples matches, which I think is very unlikely.

DEBUG:root:--------------------------------------------------
DEBUG:root:file: 1_170_5sec.wav
DEBUG:root:song: 1
DEBUG:root:song_result: 1
DEBUG:root:correct match
DEBUG:root:query duration: 0.599
DEBUG:root:confidence: 146
DEBUG:root:song start_time: 170
DEBUG:root:result start time: 94.0
DEBUG:root:inaccurate match
DEBUG:root:--------------------------------------------------

Song: https://www.youtube.com/watch?v=nfWlot6h_JM&t=170s
Result: https://www.youtube.com/watch?v=nfWlot6h_JM&t=94s

@worldveil
Owner

@balnagy, apologies, I haven't had much time to look into these issues lately. Any progress or thoughts?

@balnagy
Author

balnagy commented Dec 16, 2014

It's very hard to debug such a problem, so I gave up and implemented my own algorithm just to find the offset, since I know the song. And it's kind of weird, since if the offset is not precise, then it makes the whole result questionable.

Where would you start debugging?

@worldveil
Owner

Not necessarily. The way music is produced now, many of the sounds are direct copies or looped clips, meaning that it might actually be legitimately ambiguous as to which loop the fingerprints matched to.

I would start with the test case script and make sure it is really the algorithm that is messing up and not just the test suite. The test suite was contributed by someone else and, while I applaud the effort, it leaves some room for improvement.

If that isn't it, you might just need to tweak the parameters of the hashing to ensure better offset matching.
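As a rough illustration of the looping ambiguity, here is a toy sketch in which short string tokens stand in for fingerprint hashes. It votes for each candidate start offset by counting how many clip hashes line up with the track at that offset. The function `offset_votes` and the toy data are assumptions made for illustration and are not Dejavu's code.

```python
from collections import Counter


def offset_votes(track_hashes, clip_hashes):
    """Count, for each candidate start offset, how many clip hashes align."""
    # Map each hash to every position where it occurs in the track.
    positions = {}
    for t, h in enumerate(track_hashes):
        positions.setdefault(h, []).append(t)

    votes = Counter()
    for c, h in enumerate(clip_hashes):
        for t in positions.get(h, []):
            votes[t - c] += 1  # offset of the clip start within the track
    return votes


# Toy track: the same four-token loop appears twice.
loop = ["a", "b", "c", "d"]
track = ["intro"] + loop + ["x"] + loop + ["outro"]
clip = loop  # a sample taken from either occurrence looks identical

votes = offset_votes(track, clip)
best = max(votes.values())
candidates = sorted(off for off, n in votes.items() if n == best)
print(candidates)  # → [1, 6]
```

Both offsets receive the same number of votes, so a matcher that reports only one of them can return a position that differs from where the sample was actually cut, even though the song itself is identified correctly.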
