Oh this is very nice, I hadn't seen it before. A few random thoughts:
- The Vamp Plugin Pack for Mac finally got an ARM/Intel universal build in its 2.0 release last year, so hopefully the caveat mentioned about the M1 Mac should no longer apply
- Most of the Vamp plugins in the Pack pre-date the pervasive use of deep learning in academia, and use classic AI or machine-learning methods with custom feature design and filtering/clustering/state models etc. (The associated papers can be an interesting read, because the methods are so explicitly tailored to the domain)
- Audacity as host only supports plugins that emit time labels as output - this obviously includes beats and chords, but there are other forms of analysis that plugins can do if the host (e.g. Sonic Visualiser) supports them
- Besides the simple host in the Vamp SDK, there is another command-line Vamp host called Sonic Annotator (https://vamp-plugins.org/sonic-annotator/) which is even harder to use, equally poorly documented, and even more poorly maintained, but capable of some quite powerful batch analysis and supporting a wider range of audio file formats. Worth checking out if you're curious
(I'm the main author of the Vamp SDK and wrote bits of some of the plugins, so if you have other questions I may be able to help)
Any Dylan Beattie post must be accompanied by a recommendation for his hit single, You Give REST a Bad Name
https://www.youtube.com/watch?v=nSKp2StlS6s
This made my day. Thank you.
> I’ve created 5-channel mixes of all the backing tracks so we can fade out specific instruments if somebody wants to play them live
How was this done? This seems like an even more difficult task to do well than what’s described in the article
Probably either creating stems from karaoke multitracks (e.g. [0]) or using Spleeter [1] 5-stem mode
[0] https://www.karaoke-version.com/
[1] https://github.com/deezer/spleeter
Also, there are the original audio tracks from Rock Band, if you have access to them (you see plenty of Beatles isolated this-and-that on YouTube thanks to Rock Band).
Is there an AI model that can do this at the moment?
I've tried a few sites that show up in a search for ai stem separation. Some work pretty well for rock music.
If I recall correctly, https://vocalremover.org/ worked pretty well, though it's pretty limited in the free tier and only allows payment via Patreon. I never tried the paid version because I don't have a Patreon account and don't want one.
There's an OpenVINO plugin for Audacity that can do music separation, but it only supports 4 stems at the moment (drums, bass, vocals, other).
https://github.com/intel/openvino-plugins-ai-audacity/blob/m...
I'm using https://github.com/adefossez/demucs to split drums, bass, voice and "everything else".
Works pretty well for my personal/hobbyist use (quality also depends on genre and instruments used - synth stuff tends to bleed into voice a bit).
Steinberg spectral Layers is one of the commercial ones. It sounds really good.
Ultimate vocal remover is a common one
Try Moises
Logic and Serato do 4-stem separation with a few clicks. The results in both are pretty good - definitely good enough for what this person is trying to do, although with one less stem I guess.
https://www.karaoke-version.co.uk will happily give you a song with any combination of instruments you like, as long as it's in their catalogue, and as long as you give them money
Excellent - also check out the https://alphatab.net library, which would let you render Guitar Pro tracks for the video.
Around 2013 I built a proof-of-concept that synced guitar tabs to YouTube videos, and promptly let it rot. Should have done more with it!
Soundslice (https://www.soundslice.com/) is a fully baked version of this idea. Tabs synced with videos, with a full notation/tab editor and tons of built-in music practice tools.
Not sure if the author will eventually show up here, but I'm curious if they managed to get it working at scale and if they ran into other challenges?
One 'feature' that immediately came to mind for me is automatic transposing for use with a capo. Many hobby guitarists cannot play barre chords for an entire track, especially if they don't know it already. Transposing is already a thing for vocal karaoke and quite common. Some players may be skilled enough to transpose in their head to take advantage of the capo, but juggling the lyrics, instrument, and transposing at once is quite taxing mentally.
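The capo part is mostly modular arithmetic on the chord root. A minimal sketch in Python, with deliberately naive chord parsing (the function name and sharp-only note spelling are my own choices, not from any existing tool):

```python
# Shift a chord's root down by the capo fret, so that the shape you
# play plus the capo sounds like the original chord.
NOTES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def transpose_for_capo(chord: str, capo_fret: int) -> str:
    """Return the shape to play with a capo at the given fret.

    Naive parsing: the root is the first one or two characters;
    anything after it (m, 7, sus4, ...) is kept verbatim.
    """
    root = chord[:2] if len(chord) > 1 and chord[1] == "#" else chord[:1]
    quality = chord[len(root):]
    idx = (NOTES.index(root) - capo_fret) % 12
    return NOTES[idx] + quality

# With a capo on fret 2, a sounding F#m is played as an Em shape -
# no barre needed.
print(transpose_for_capo("F#m", 2))  # Em
```

Picking a capo fret that turns the song's chord set into open shapes (E, A, D, G, C and their minors) is then just trying all 12 offsets and scoring each one.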
Cool project!
What a cool thread! I like how you included the specifics of your workflow, especially the details of the commands you used - particularly the vamp commands, because as you say, they are somewhat inscrutably named and documented.
I started dabbling with Vamp as well a couple of years ago, but lost track of the project as my goals ballooned. The code is still sitting somewhere, waiting to be resuscitated.
For years I've had the idea of building chord analysis out further so that a functional chart can be made from it. With Vamp, most or all of the ingredients are there. I think that's probably what chordify.com does, but they clearly haven't solved segmentation or mapping real time to musical time, as their charts are terrible. I don't think they're using Chordino, and whatever they do use is actually worse.
I got as far as a Python script that converts the audio files in a directory into separate MIDI files, to start collecting the data needed to construct a chart.
For your use case, you'd probably just need to quantize the chords to the nearest beat, so you could maybe use:
vamp-aubio_aubiotempo_beats, or vamp-plugins_qm-barbeattracker_bars
and then combine those values with the actual time values that you are getting from chordino.
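The snapping step is just a nearest-neighbour lookup on the beat list. A rough sketch, assuming Chordino gives you (time, chord) pairs and the beat tracker a sorted list of timestamps (all names here are made up):

```python
import bisect

def quantize_to_beats(chord_events, beat_times):
    """Snap each (time, chord) event to the nearest beat time.

    chord_events: list of (seconds, chord_name), e.g. from Chordino.
    beat_times: sorted list of beat timestamps from a beat tracker.
    """
    snapped = []
    for t, chord in chord_events:
        i = bisect.bisect_left(beat_times, t)
        # Compare the beats on either side of t and keep the closer one.
        candidates = beat_times[max(0, i - 1):i + 1]
        nearest = min(candidates, key=lambda b: abs(b - t))
        snapped.append((nearest, chord))
    return snapped

beats = [0.0, 0.5, 1.0, 1.5, 2.0]
chords = [(0.07, "C"), (0.93, "Am"), (1.62, "F")]
print(quantize_to_beats(chords, beats))
# [(0.0, 'C'), (1.0, 'Am'), (1.5, 'F')]
```

With the bar tracker instead of raw beats, the same idea gives you one chord slot per bar, which is most of the way to a chart.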
I'd love to talk about this more, as this is a seemingly niche area. I've only heard about this rarely if at all, so I was happy to read this!
> I think that's probably what chordify.com does [...] I don't think they are using chordino
I think they were initially using the Chordino chroma features (NNLS-Chroma) but a different chord language model "front end". Their page at https://chordify.net/pages/technology-algorithm-explained/ seems to imply they've since switched to a deep learning model (not surprisingly)
Hey, I built this exact concept months ago: beat detection, video generation, automated video creation. Check out the videos I uploaded at https://youtube.com/@nevertwenty
Very nice, have you posted any info on your setup?
I haven't but this thread is inspiring me to. Should I? If this post gets enough attention I will. If anyone wants to do this together send me an email brandon@olivers.army
Really good work. I like the way the author breaks down the procedure using open-source tools. Nice, and thanks for sharing.
May I add Chord ai for anyone who wants to see similar projects? It's a paid app powered by AI. Personally, it helped me where I could not figure out the chord progressions myself.
I don't know if this is off topic, but a while back I searched for a tool to summarize research papers. I found some, and I was really flabbergasted by the progress of these AI tools. When I graduated from uni 22 years ago, these things existed only in science-fiction movies. Oh well...
Pretty off-topic to the article but somewhat related: does anybody know of open-source, or at least not subscription-based, alternatives to things like Songsterr [1], whose mobile app is really nice for learning to play a song on an instrument?
[1] https://www.songsterr.com/
I'm not very familiar with Songsterr, but from what I can see at that link, I would recommend downloading Guitar Pro files from ultimate-guitar.com and displaying them in TuxGuitar (https://www.tuxguitar.app/), which is free and open source.
Very cool indeed! Does anyone know how it's possible for Vamp to extract guitar chords from audio? What if there are multiple guitars, like lead and bass, or lead and rhythm?
That particular algorithm doesn't care whether the instruments are guitars or something else. There are other algorithms in Vamp that deal with individual notes, but in terms of separating tracks, Vamp doesn't do that. There are some new ML-based solutions for that, though, so you could separate the stems and run Vamp on those outputs.
But to get the chords I don't think you need to worry about that.
To add to this, it’s vanishingly seldom that you would have a finished track where multiple parts of the ensemble are playing different chords, rather than each part making up the same overall chord structure.
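To expand on why it's instrument-agnostic: chord estimators like Chordino reduce the audio to a 12-bin chroma vector (energy per pitch class, regardless of octave or timbre) and recognise chords on that. A toy illustration of template matching on chroma - real systems use NNLS chroma and a chord-transition model, not a bare dot product:

```python
import numpy as np

# A chord detector never sees "guitar" or "piano": everything is folded
# into a 12-bin chroma vector, and chords are matched against templates.
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def chord_templates():
    """Binary templates for major and minor triads in all 12 keys."""
    templates = {}
    for root in range(12):
        for name, intervals in (("", (0, 4, 7)), ("m", (0, 3, 7))):
            t = np.zeros(12)
            for iv in intervals:
                t[(root + iv) % 12] = 1.0
            templates[PITCH_CLASSES[root] + name] = t
    return templates

def match_chord(chroma):
    """Return the template label with the highest dot product against chroma."""
    return max(chord_templates().items(), key=lambda kv: chroma @ kv[1])[0]

# A chroma frame with energy on C, E and G matches the C major template,
# whoever played those notes.
frame = np.zeros(12)
frame[[0, 4, 7]] = 1.0
print(match_chord(frame))  # C
```

Two guitars playing different voicings of the same harmony just add energy to the same pitch classes, which is why separation usually isn't needed for chords.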
I still remember his “The Art of Code” talk, highly recommended: https://www.youtube.com/watch?v=6avJHaC3C2U
Very cool! I think an interesting variation could be to try and generate ASS subtitles for the video with the chords and/or karaoke. There are some very cool transition and vector graphic effects possible there.
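A minimal sketch of what emitting those ASS karaoke lines could look like - the {\k} override tags hold durations in centiseconds, so each chord lights up in turn. The script header/styles are omitted and all helper names are made up:

```python
def ass_timestamp(seconds: float) -> str:
    """Format seconds as an ASS H:MM:SS.cc timestamp."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, cs = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{cs:02d}"

def chords_to_ass_line(chord_events, end_time):
    """Build one Dialogue line where {\\k} tags highlight each chord in turn.

    chord_events: list of (start_seconds, chord_name), sorted by time.
    {\\kNN} durations are in centiseconds.
    """
    parts = []
    times = [t for t, _ in chord_events] + [end_time]
    for (t, chord), nxt in zip(chord_events, times[1:]):
        dur_cs = round((nxt - t) * 100)
        parts.append(f"{{\\k{dur_cs}}}{chord} ")
    start, end = ass_timestamp(times[0]), ass_timestamp(end_time)
    return f"Dialogue: 0,{start},{end},Default,,0,0,0,,{''.join(parts).rstrip()}"

print(chords_to_ass_line([(0.0, "C"), (2.0, "Am"), (4.0, "F")], 6.0))
# Dialogue: 0,0:00:00.00,0:00:06.00,Default,,0,0,0,,{\k200}C {\k200}Am {\k200}F
```

Muxing that into the video with ffmpeg's subtitles filter (or just shipping the .ass alongside it) gets you the highlight effect for free.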
A multi-lane Rocksmith-like interface that dynamically adjusts the difficulty of the chords based on the performer's ability would be amazing.
"This one goes to Alan Holdsworth"
lol
Even Rick Beato is out of a job now.
That's akin to declaring that James Gosling or Guido van Rossum are out of a job now that we have LLMs in our IDEs.