CS5489 -Assignment 2 -Game Music Tagging
Due date: see Assignment 2 on Canvas
Goal
In this assignment, the task is to annotate or tag a music clip with descriptive (semantic) keywords. This kind of content-based tagging system could be useful to musicians and sound engineers who want to automatically
organize their sound library, or search for sound or music by keyword.
Dataset of 80s Game Music
This dataset contains video game music from an 80s-era programmable sound generator based on FM (frequency modulation). The music is from Sega and MSX PC games. Each song is annotated with emotion tags. The data is publicly available; please do not search for it and cheat.
Methodology
Semantic annotation is a multi-label classification problem: each label corresponds to one sound tag and is treated as a binary classification problem. Labels can co-occur (multiple labels can be assigned to the same sound), which makes this different from multi-class classification (where exactly one label is assigned). Sound is a temporal process, so the key question is how to define the feature space for representing the sound before learning the binary classifiers. You are free to choose appropriate methods (e.g., feature extraction, dimensionality reduction, and clustering methods) to help define a suitable feature space for sound annotation. You are also free to use methods that were not introduced in class, as long as you present the details in the report. You can also consider the co-occurrence of the labels to help with the multi-label classification.
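As a concrete starting point, one common (but not mandated) approach is to train one binary classifier per tag. Below is a minimal sketch using scikit-learn's OneVsRestClassifier on hypothetical fixed-length features and a random tag matrix; all shapes and data here are made up for illustration only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# hypothetical fixed-length features (e.g., pooled MFCCs) and binary tag matrix
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))           # 40 sounds, 10-dim features (made up)
Y = rng.integers(0, 2, size=(40, 3))    # 3 tags; labels can co-occur

# one binary classifier per tag; OneVsRestClassifier accepts a multi-label Y
clf = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, Y)
scores = clf.predict_proba(X)           # per-tag scores, needed later for AUC
print(scores.shape)                     # (40, 3)
```

Note that `predict_proba` returns a score per tag, which is exactly the kind of output the AUC-based evaluation below requires.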
Evaluation of Tagging
For evaluation, you will predict the presence/absence of tags for each test sound. The evaluation metric is "Mean column-wise AUC". AUC is the area under the ROC curve, which plots the false positive rate (FPR) vs. the true positive rate (TPR). "Mean column-wise" computes the average of the AUCs over the tags. To compute AUC, you will need to predict a score for each label (e.g., decision function value, probability, etc.) rather than a hard label.
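The metric can be computed locally with scikit-learn's `roc_auc_score`, averaged over the tag columns. Here is a small sketch on made-up labels and scores (the data is hypothetical, chosen only to show the computation):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# hypothetical example: 6 test sounds, 3 tags
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0],
                   [0, 0, 1], [1, 0, 0], [0, 1, 1]])
y_scores = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.3], [0.8, 0.6, 0.2],
                     [0.3, 0.1, 0.9], [0.7, 0.4, 0.1], [0.2, 0.9, 0.8]])

# mean column-wise AUC: average the per-tag AUCs
mean_auc = np.mean([roc_auc_score(y_true[:, j], y_scores[:, j])
                    for j in range(y_true.shape[1])])
print(mean_auc)  # 1.0 here, since every positive outscores every negative
```

Each column must contain both classes for its AUC to be defined, which is worth checking when evaluating on a small validation split.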
Evaluation on Kaggle
You need to submit your test predictions to Kaggle for evaluation. 50% of the test data will be used to show your ranking on the live leaderboard. After the assignment deadline, the remaining 50% will be used to calculate
your final ranking. The entry with the highest final ranking will win a prize! Also, the top-ranked entries will be asked to give a short 5-minute presentation on what they did.
To submit to Kaggle you need to create an account, and use the competition invitation that will be posted on
Canvas. You must submit your Kaggle account name to the "Kaggle Username" assignment on Canvas 1 week before the Assignment 2 deadline. This is to prevent students from creating multiple Kaggle accounts to gain unfair advantage.
Note: You can only submit 2 times per day to Kaggle!
What to hand in
You need to turn in the following things:
1. This ipynb file Assignment2.ipynb with your source code and documentation. You should write about all the various attempts that you make to find a good solution. You may also submit python scripts as
source code, but your documentation must be in the ipynb file.
2. Your final csv submission file to Kaggle.
3. The ipynb file Assignment2-Final.ipynb ,which contains the code that generates the final submission file that you submit to Kaggle. This code will be used to verify that your Kaggle submission is
reproducible.
4. Your Kaggle username (submitted to the "Kaggle Username" assignment on Canvas 1 week before the Assignment 2 deadline).
Files should be uploaded to Assignment 2 on Canvas.
Grading
The marks of the assignment are distributed as follows:
· 45% - Results using various feature representations, dimensionality reduction methods, classification methods, etc.
· 30% - Trying out feature representations (e.g., adding additional features, combining features from different sources) or methods not used in the tutorials.
· 20% - Quality of the written report. More points for insightful observations and analysis.
· 5% - Final ranking on the Kaggle test data (private leaderboard). If a submission cannot be reproduced by the submitted code, it will not receive marks for ranking.
· Late Penalty: 25 marks will be subtracted for each day late.
Note: This is an individual assignment. Every student must turn in their own work!
In [1]:
%matplotlib inline
import matplotlib_inline
# set output image format of the matplotlib inline backend
matplotlib_inline.backend_inline.set_matplotlib_formats('svg')
import matplotlib.pyplot as plt
import matplotlib
from numpy import *
from sklearn import *
from numpy import random
random.seed(100)
import csv
from scipy import io
import pickle
from IPython.display import Audio, display
import os.path
In [2]:
def showAudio(info):
    myfile = 'musicmp3/' + info['fname'] + '.mp3'
    if os.path.exists(myfile):
        display(Audio(myfile))
    else:
        print("*** mp3 file " + myfile + " could not be found ***")

def load_pickle(fname):
    f = open(fname, 'rb')
    out = pickle.load(f)
    f.close()
    return out
Load the Data
The training and test data are stored in various pickle files. Here we assume the data is stored in the musicdata directory. The below code will load the data, including tags and extracted features.
In [3]:
train_tags  = load_pickle('musicdata/train_tags.pickle3')
train_mfccs = load_pickle('musicdata/train_mfccs.pickle3')
train_mels  = load_pickle('musicdata/train_mels.pickle3')
train_info  = load_pickle('musicdata/train_info.pickle3')
test_mfccs  = load_pickle('musicdata/test_mfccs.pickle3')
test_mels   = load_pickle('musicdata/test_mels.pickle3')
test_info   = load_pickle('musicdata/test_info.pickle3')
Here are the things in the dataset:
· train_info - info about each sound in the training set.
· train_mels - the Mel-frequency spectrogram for each sound in the training set. Mel-frequency is a logarithmically-transformed frequency with better perceptual distance. More details here (https://towardsdatascience.com/learning-from-audio-the-mel-scale-mel-spectrograms-and-mel-frequency-cepstral-coefficients-f5752b6324a8).
· train_mfccs - MFCCs (Mel-frequency cepstral coefficients) are a dimensionality-reduced version of the Mel-frequency spectrogram. Specifically, the log is applied to the magnitudes, and then a Discrete Cosine Transform is applied at each time step.
· train_tags - the descriptive tags for each sound in the training set.
· test_info - info about each sound in the test set.
· test_mels - the Mel spectrogram for each sound in the test set.
· test_mfccs - the MFCC features for each sound in the test set.
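To make the mel-to-MFCC relationship concrete, here is a rough sketch of the log + DCT step described above, applied to a random stand-in for a mel spectrogram. The real features are already provided in the pickle files, so this is purely illustrative, and the exact DCT type/normalization used to create the dataset is an assumption.

```python
import numpy as np
from scipy.fftpack import dct

# random stand-in for a mel spectrogram: T=143 frames, B=128 mel bins
mel = np.abs(np.random.randn(143, 128)) + 1e-6  # keep magnitudes positive

log_mel = np.log(mel)  # log of the magnitudes

# DCT along the mel-bin axis at each time step; keep the first 20 coefficients
# (type-2 DCT with 'ortho' norm is a common convention, assumed here)
mfcc = dct(log_mel, type=2, axis=1, norm='ortho')[:, :20]
print(mfcc.shape)  # (143, 20), matching the provided MFCC shape below
```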
Here is one song in the training set, along with its tags and other info. To play the audio, we assume the mp3s are available in the musicmp3 directory.
In [4]:
ii = 4
showAudio(train_info[ii])
print(train_tags[ii])
print(train_info[ii])
['fluttered', 'calm']
{'id': 'eb7jboiu', 'tags': ['fluttered', 'calm'], 'top_tag': 'fluttered', 'fname': 'eb7jboiu'}
Here is the Mel-frequency spectrogram, which shows the frequency content over time. The spectrogram is stored as a T x B matrix, where B is the number of mel bins and T is the temporal length. The left plot shows the original Mel spectrogram (with time increasing to the right). The right plot shows the log magnitude, which better visualizes the differences. Here we use B=128 Mel bins.
In [5]:
print(train_mels[ii].shape)
plt.figure(figsize=(7,3))
plt.subplot(121)
plt.imshow(train_mels[ii].T)
plt.xlabel('time'); plt.ylabel('mel bin')
plt.title('Mel spectrogram')
plt.subplot(122)
plt.imshow(log(train_mels[ii].T))
plt.xlabel('time'); plt.ylabel('mel bin')
plt.title('log Mel spectrogram')
plt.tight_layout()
(143,128)
In [6]:
print(train_mfccs[ii].shape)
plt.figure(figsize=(8,3))
plt.subplot(121)
plt.imshow(train_mfccs[ii].T)
plt.xlabel('time'); plt.ylabel('mfcc bin')
plt.subplot(122)
plt.plot(train_mfccs[ii])
plt.xlabel('time'); plt.ylabel('mfcc value')
plt.tight_layout()
(143,20)
Data Pre-processing - Delta MFCCs
The first thing you might notice is that the MFCC vectors form a time series. One trick to include time-series information in a vector representation is to append the difference between consecutive feature vectors. This way, we include some relationship between adjacent time steps in the representation.
In [7]:
# compute delta MFCCs
def compute_delta_mfccs(mfccs):
    dmfccs = []
    for m in mfccs:
        tmp = m[1:] - m[0:-1]        # difference between consecutive frames
        dm = hstack((m[0:-1], tmp))  # append delta features to the original
        dmfccs.append(dm)
    return dmfccs
In [8]:
train_dmfccs = compute_delta_mfccs(train_mfccs)
test_dmfccs = compute_delta_mfccs(test_mfccs)
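Since the (delta-)MFCCs are still variable-length time series, one simple way to obtain the fixed-length vectors that standard classifiers require is mean/std pooling over time (a "bag of frames"). This is only one option among many; the sketch below runs on random stand-in sequences with a made-up feature dimension.

```python
import numpy as np

# pool each (T_i x D) sequence into a single 2D-dimensional vector
def pool_features(seqs):
    return np.vstack([np.hstack([s.mean(axis=0), s.std(axis=0)])
                      for s in seqs])

# random stand-ins for delta-MFCC sequences of varying length, D=40 dims
seqs = [np.random.randn(t, 40) for t in (120, 143, 98)]
X = pool_features(seqs)
print(X.shape)  # (3, 80): one fixed-length row per sound
```

The pooled matrix `X` can then be fed directly to the per-tag binary classifiers described in the Methodology section.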
2023-03-21