This project deals with clustering of harmonic patterns. The main goals are to work with large-scale datasets and to discover interesting patterns.
This webpage is a work in progress. Don't hesitate to email me if you don't find all the necessary code here; I'll send it to you.
All the code in this project is in Python. Compared to Matlab, it is more efficient, more scalable, and free to distribute. It is easy to pick up if you have never used it! To test this code, I suggest using the IPython shell. Our code was mostly developed in Python 2.5 with scipy/numpy. Having the scikits library ANN speeds things up a lot, but it is not required. This tutorial may be incomplete, and the code might have bugs! Please write me an email (tb2332 @ columbia . edu) if you have any trouble using it.
You will also need an Echo Nest API account. It's free; you simply have to register. Note that the API limits the number of calls per minute.
CREATE / GATHER DATA
In this part, you upload songs that you possess to the Echo Nest API and
receive their analysis. Each analysis is saved as a Matlab file (Matfile),
one per song (yes, Python can deal with Matlab files). A Matfile
contains a per-beat analysis, meaning one 12-dimensional vector
of chroma values for each beat identified by the Echo Nest.
import features as FEAT
for songpath in allsongpath:
From a set of Matfiles (previous section), we will train a codebook using vector quantization.
Download: model.py oracle_matfiles.py initializer.py trainer.py
We assume that all matfiles created above are in some subdirectory of "matdirectory".
The experiments will be saved in subdirs of "./expdir".
We will train a codebook of 100 codewords, each codeword representing 2 bars encoded as 8 beats.
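With 12 chroma values per beat and 8 beats per pattern, each codeword is a 12 x 8 = 96-dimensional vector. The sketch below shows one way such patterns could be cut out of a song. It is only an illustration: the function names, the `bar_starts` input, the nearest-index resampling, and the beat-by-beat flattening order are all assumptions of mine, not necessarily what initializer.py or trainer.py do. It is written in plain Python 3 so it runs without scipy/numpy.

```python
def resample_beats(segment, psize):
    """Pick psize beats at evenly spaced positions (nearest lower index).

    Assumption: simple index picking; the real code may interpolate instead.
    """
    n = len(segment)
    return [segment[(i * n) // psize] for i in range(psize)]


def extract_patterns(beat_chromas, bar_starts, usebars=2, psize=8):
    """Cut a song into consecutive groups of `usebars` bars.

    beat_chromas: list of 12-dimensional chroma vectors, one per beat.
    bar_starts:   hypothetical list of beat indices where each bar begins.
    Returns a list of flattened patterns of length 12 * psize.
    """
    bounds = list(bar_starts) + [len(beat_chromas)]
    patterns = []
    # Step through the bars `usebars` at a time; leftover bars are dropped.
    for k in range(0, len(bar_starts) - usebars + 1, usebars):
        segment = beat_chromas[bounds[k]:bounds[k + usebars]]
        if not segment:
            continue
        resampled = resample_beats(segment, psize)
        # Flatten beat by beat: [beat0 chroma..., beat1 chroma..., ...]
        patterns.append([value for beat in resampled for value in beat])
    return patterns


# A toy song: 16 beats in 4/4, so 4 bars and therefore 2 two-bar patterns.
toy_beats = [[float(b)] * 12 for b in range(16)]
toy_patterns = extract_patterns(toy_beats, bar_starts=[0, 4, 8, 12])
print(len(toy_patterns), len(toy_patterns[0]))
```

When a pair of bars does not contain exactly 8 beats (for example two bars of 3 beats), the resampling step is what brings it back to the fixed pattern size.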
To initialize, we create codebook.mat (command line):
python initializer.py -pSize 8 -usebars 2 -oraclemat matdirectory 100 codebook.mat
To train for 10K iterations, that is, 10K songs drawn at random from matdirectory (command line):
python trainer.py -pSize 8 -usebars 2 -lrate 1e-3 -oraclemat matdirectory -expdir ./expdir codebook.mat
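To give an idea of what training with a learning rate means, here is a minimal sketch of online vector quantization. It assumes that each pattern pulls its nearest codeword toward itself by a fraction `lrate`, and that one iteration shows one randomly chosen song. The function names and the exact update rule are my assumptions for illustration; trainer.py may differ in its details. Plain Python 3, no scipy/numpy needed.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(codebook, pattern):
    """Return the index of the codeword closest to the pattern."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(codebook[i], pattern))


def online_vq_step(codebook, pattern, lrate=1e-3):
    """Move the winning codeword a small step toward the pattern (in place)."""
    idx = nearest_codeword(codebook, pattern)
    codebook[idx] = [c + lrate * (p - c)
                     for c, p in zip(codebook[idx], pattern)]
    return idx


def train(codebook, songs, n_iterations, lrate=1e-3, seed=0):
    """songs: list of songs, each song being a list of pattern vectors."""
    rng = random.Random(seed)
    for _ in range(n_iterations):
        song = rng.choice(songs)  # one randomly chosen song per iteration
        for pattern in song:
            online_vq_step(codebook, pattern, lrate)
    return codebook


# Toy example with 2-dimensional "patterns" instead of 96-dimensional ones.
toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
toy_songs = [[[0.1, 0.0], [0.9, 1.0]], [[1.0, 0.8]]]
train(toy_codebook, toy_songs, n_iterations=100, lrate=0.1)
print(toy_codebook)
```

A small learning rate such as 1e-3 means each song only nudges the codewords slightly, which is why many iterations are needed.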
TEST MODEL - ENCODE A SONG
To see the encoding using a codebook and the distortion:
python encode_song.py -pSize 8 -usebars 2 song_enfeats.mat codebook.mat
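Encoding a song amounts to replacing each of its patterns by the nearest codeword, and the distortion measures how much is lost by doing so. The sketch below assumes the distortion is the average squared Euclidean distance per dimension between each pattern and its codeword. That definition, like the function names, is my assumption for illustration and may not match what encode_song.py reports. Plain Python 3, no scipy/numpy needed.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_song(patterns, codebook):
    """Return (codes, distortion) for a list of pattern vectors.

    codes:      index of the nearest codeword for each pattern.
    distortion: mean squared error per dimension, averaged over patterns
                (an assumed definition, see the text above).
    """
    codes = []
    total = 0.0
    for pattern in patterns:
        dists = [squared_distance(cw, pattern) for cw in codebook]
        best = min(range(len(codebook)), key=dists.__getitem__)
        codes.append(best)
        total += dists[best] / len(pattern)
    distortion = total / len(patterns) if patterns else 0.0
    return codes, distortion


# Toy example with 2-dimensional "patterns" instead of 96-dimensional ones.
toy_codebook = [[0.0, 0.0], [1.0, 1.0]]
toy_song = [[0.0, 0.0], [1.0, 0.5]]
codes, distortion = encode_song(toy_song, toy_codebook)
print(codes, distortion)
```

A lower distortion means the codebook describes the song more faithfully.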
This project is described in the following paper:
T. Bertin-Mahieux, R. Weiss and D. Ellis, Clustering beat-chroma patterns in a large music database, In Proceedings of the 11th International Conference on Music Information Retrieval (ISMIR), 2010. [pdf]