Aligning offsets with bars #5
Comments
Yes, that's pretty much the idea. The NN is not "bar-aware"; it simply chunks the audio into ~50 ms frames, which are then grouped into non-overlapping segments of ~6 seconds. Each 6-second segment is one input to the NN, which outputs a chord label for each 50 ms frame in that segment. Contiguous identical chord labels are merged into a single label for a longer segment, e.g. 20 contiguous labels of
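To make the merging step concrete, here is a minimal sketch of how runs of identical per-frame labels could be collapsed into longer segments. It is not taken from the repository: the function name `merge_frame_labels` and the fixed 50 ms frame length are assumptions for illustration only.

```python
from itertools import groupby

FRAME_SECONDS = 0.05  # assumed frame length (~50 ms, as described above)


def merge_frame_labels(labels, frame_seconds=FRAME_SECONDS):
    """Collapse runs of identical per-frame labels into (start, end, label) tuples."""
    segments = []
    index = 0  # index of the first frame in the current run
    for label, run in groupby(labels):
        length = sum(1 for _ in run)
        start = round(index * frame_seconds, 3)
        end = round((index + length) * frame_seconds, 3)
        segments.append((start, end, label))
        index += length
    return segments


# Hypothetical input: four frames of C followed by two frames of G.
print(merge_frame_labels(["C", "C", "C", "C", "G", "G"]))
```

Because every frame boundary is a potential chord boundary, the resulting offsets land on multiples of the frame length rather than on bar lines.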
This sounds like a nice improvement! One way I can think of is to use a bar estimation algorithm (I don't have much experience with this, so I can't recommend one, but popular libraries such as Essentia may have something of the sort), run the NN on each bar, then pick out the most prominent chord from the NN output as the single chord label for that bar.
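A rough sketch of the "most prominent chord per bar" step, assuming the bar boundaries are already known (from a bar estimator or from the tempo). Note that this is a simplification of the suggestion above: it takes a majority vote over frame labels that were already predicted instead of re-running the NN on each bar. The name `chord_per_bar` and the 50 ms frame length are hypothetical.

```python
from collections import Counter


def chord_per_bar(frame_labels, bar_boundaries, frame_seconds=0.05):
    """Return one (start, end, label) tuple per bar using a majority vote.

    frame_labels   -- one chord label per frame, in time order
    bar_boundaries -- increasing bar start times in seconds, plus the final end time
    """
    results = []
    for start, end in zip(bar_boundaries, bar_boundaries[1:]):
        first = int(round(start / frame_seconds))
        last = int(round(end / frame_seconds))
        votes = Counter(frame_labels[first:last])
        if votes:  # skip bars that fall outside the predicted frames
            results.append((start, end, votes.most_common(1)[0][0]))
    return results


# Hypothetical prediction: the change to G is detected 0.25 s before the bar line.
labels = ["C"] * 35 + ["G"] * 45
print(chord_per_bar(labels, [0.0, 2.0, 4.0]))
```

With a vote like this, a change detected slightly early or late no longer moves the segment boundary, as long as the correct chord still covers most of the bar.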
Hi CJ, thanks for your explanation :) It helped me understand your code a little better, allowing me to implement a basic version of bar-based guessing. I made PR #6, hope you like it :)
My question here only applies to songs where both the tempo and the time signature are known, but that should be most of the songs out there.
Imagine a song in 4/4 at 120 qpm that changes chord every 4 bars: you would have a change every 8 seconds (a quarter note is 60 s / 120 = 0.5 s, a bar is 0.5 s * 4 = 2 s, and a chord lasts 2 s * 4 = 8 s). So the ideal output would be, for example:
In reality, the time offsets in the predictions are a bit wonky, probably because in real sound there is no exact instant at which a chord starts. I have also tested this on a .wav render of a MIDI file.
If the tempo is low and bars are long, durations can be quantised, sort of "with a wrench hit", by rounding to the closest bar. But when the tempo is high enough (100+?), the timing error becomes too big, making it impossible to pin down exactly where in the score the chord changes.
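For reference, the "wrench hit" quantisation could look roughly like this. It is only a sketch under the assumptions stated in this issue: the tempo is constant and known, the beat is a quarter note, and the first bar starts at 0.0 s. The helper names `bar_seconds` and `snap_to_bar` are made up for the example.

```python
def bar_seconds(qpm, beats_per_bar=4):
    """Length of one bar in seconds, with the tempo given in quarter notes per minute."""
    return 60.0 / qpm * beats_per_bar


def snap_to_bar(offset_seconds, qpm, beats_per_bar=4):
    """Round a predicted offset to the nearest bar boundary."""
    bar = bar_seconds(qpm, beats_per_bar)
    return round(offset_seconds / bar) * bar


# 4/4 at 120 qpm: one bar lasts 2 s, so an offset of 7.8 s snaps to 8.0 s.
print(bar_seconds(120))       # 2.0
print(snap_to_bar(7.8, 120))  # 8.0
```

Snapping picks the right bar only while the timing error stays below half a bar, and that margin shrinks as the tempo rises (at 120 qpm in 4/4 it is 1 s).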
I don't know much about how your NN works, but perhaps this is because the wave is analysed "continuously"? Could it be made to analyse segments that are aligned with bars instead? In the previous song, for example, could the prediction function be made to guess which chord is there from 0.0 to 2.0, then from 2.0 to 4.0, etc.?
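To illustrate what I mean by bar-aligned segments, here is a small sketch that lists the windows a prediction function would be asked about. It again assumes a constant, known tempo, a quarter-note beat, and a first bar starting at 0.0 s; `bar_windows` is a hypothetical helper, not something from the repository.

```python
def bar_windows(total_seconds, qpm, beats_per_bar=4):
    """List the (start, end) times of every complete bar in the audio."""
    bar = 60.0 / qpm * beats_per_bar  # seconds per bar
    windows = []
    index = 0
    while (index + 1) * bar <= total_seconds:
        windows.append((index * bar, (index + 1) * bar))
        index += 1
    return windows  # a trailing partial bar is dropped


# 8 seconds of audio in 4/4 at 120 qpm gives four 2-second windows.
print(bar_windows(8.0, 120))
```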
Thanks!