ECS731 Music Analysis and Synthesis: Labs 2017

There are four lab sessions for ECS731, which will help you gain practical experience of some of the
topics covered in lectures and prepare for the assessed coursework. A brief description of each lab
is given below, and there are data files available on QMplus. Solutions may be written in Matlab or
Octave (which I used in preparing the exercises). There's no unique solution to any of the exercises;
please explore and discuss possible approaches and see what works best.

Lab 1 - Introduction to Matlab and Audio


If you need to brush up on basic Matlab, see the tutorial notes on QMplus. Otherwise start immedi-
ately on the audio-related exercises.
In the first part of this lab, we examine the waveform and magnitude spectrum of a single piano tone.
Read in the file pianoTone.wav and plot the time domain waveform, with time in seconds on the
horizontal axis. You should see that there is a sharp attack around 70ms, followed by a gradual decay
over the next 2 seconds. Starting at 70ms, calculate and plot the log magnitude spectrum for a frame
of 8192 samples at the beginning of the tone (i.e. from 0.07 seconds), and another (on the same axis
but in a different colour) at the end of the tone (from 1.8 seconds). What differences do you notice
between the two plots?
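For example, you might start along these lines (a minimal sketch, assuming the file is mono; the variable names and exact frame positions are illustrative):
[x, fs] = audioread('pianoTone.wav');
t = (0:length(x)-1) / fs;                  % time axis in seconds
figure; plot(t, x); xlabel('Time (s)'); ylabel('Amplitude');
N = 8192;                                  % frame length in samples
frame1 = x(round(0.07*fs) + (1:N));        % frame starting at 70ms
frame2 = x(round(1.8*fs) + (1:N));         % frame starting at 1.8s
f = (0:N-1) * fs / N;                      % frequency axis in Hz
figure; hold on;
plot(f, 20*log10(abs(fft(frame1))), 'b');  % spectrum at start of tone
plot(f, 20*log10(abs(fft(frame2))), 'r');  % spectrum near end of tone
xlabel('Frequency (Hz)'); ylabel('Magnitude (dB)');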
Now write a loop to plot log magnitude spectra for 5 successive frames of 8192 samples starting from
70ms, using 5 different colours. Zoom in to see the frequencies up to 1kHz only.
Find the frequency of the first partial (fundamental frequency) using the first frame. If you have time,
find the frequencies of the first 20 partials. (You will need to think about how to identify the correct
peaks. Look carefully at the plot and note that the partials are not equally spaced.)
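One possible starting point for the fundamental is to restrict the search to a plausible range and take the largest spectral peak there (a sketch; the 50 Hz to 1 kHz search range is an assumption):
X = abs(fft(frame1));
lo = round(50 * N / fs); hi = round(1000 * N / fs);   % assumed f0 search range, in bins
[~, k] = max(X(lo:hi));
f0 = (lo + k - 2) * fs / N                            % convert 1-based bin index to Hz
You could then search near integer multiples of f0 for the higher partials, widening the search window as you go up to allow for the inharmonic stretching noted above.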
Now load the file vibrato.wav and use the autocorrelation function (xcorr) to find the fundamental
period of the signal, and convert this to fundamental frequency. Divide the signal into frames to track
how the fundamental frequency changes over time. Plot the results.
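A sketch of the autocorrelation approach for a single frame (assuming a mono file; the frame size and the 50 Hz to 1 kHz lag range are assumptions):
[y, fs] = audioread('vibrato.wav');
frame = y(1:2048);                               % one analysis frame
[r, lags] = xcorr(frame);
r = r(lags >= 0);                                % keep non-negative lags; r(i) is lag i-1
minLag = round(fs/1000); maxLag = round(fs/50);  % assumed pitch range
[~, k] = max(r(minLag+1:maxLag+1));              % largest peak within the lag range
T0 = (minLag + k - 1) / fs;                      % fundamental period in seconds
f0 = 1 / T0
Repeating this over successive frames gives the pitch track; a hop size of half a frame is a reasonable default.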
Finally load the file stab.wav, listen to it, and plot its magnitude spectrum. What musical information
(if any) do you think you could extract from the spectrum?

Lab 2 - Onset Detection Functions


Implement the following onset detection functions (ODFs) as described in lectures: RMS energy, HFC,
SF, CD, RCD, PD and WPD. Rather than processing the audio separately for each function, calculate
them together, storing the resulting ODFs in an N × 7 matrix, where N is the number of frames, and
there are 7 ODFs. Then normalisation and peak picking can be performed in a few lines of code, e.g.
% octave issues warning messages here, but this gives a correct result:
odf = (odf - mean(odf)) ./ std(odf); % standardise: zero mean, unit std dev
d = diff(odf); % positive followed by negative diff indicates a peak
z = zeros(1,7); % diff(odf) is one row shorter than odf
isPeak = (odf > threshold) & ([z; d] > 0) & ([d; z] < 0);
[frame, odfNum] = find(isPeak); % row = frame number, column = which ODF peaked
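For the ODF computation itself, the frame loop might look something like this (a sketch covering only RMS energy, HFC and spectral flux; the filename, frame size and hop size are assumptions):
[x, fs] = audioread('example.wav');          % placeholder filename
x = x(:,1);                                  % use first channel in case the file is stereo
M = 1024; hop = 512;                         % assumed frame and hop sizes
nFrames = floor((length(x) - M) / hop) + 1;
odf = zeros(nFrames, 7);
prevMag = zeros(M/2, 1);
w = 0.5 - 0.5*cos(2*pi*(0:M-1)'/M);          % Hann window
for n = 1:nFrames
  frame = x((n-1)*hop + (1:M)) .* w;
  X = fft(frame);
  mag = abs(X(1:M/2));                       % magnitudes up to Nyquist
  odf(n,1) = sqrt(mean(frame.^2));           % RMS energy
  odf(n,2) = sum((1:M/2)' .* mag.^2);        % high frequency content
  odf(n,3) = sum(max(mag - prevMag, 0));     % spectral flux (half-wave rectified)
  % columns 4-7 (CD, RCD, PD, WPD) also need the phase, angle(X)
  prevMag = mag;
end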
Compare the performance of the ODFs on the given audio examples. Note that the performance is
strongly dependent on the peak picking, thresholds and any other postprocessing (e.g. comparison
with median-filtered values) that you perform. The Matlab file evaluate.m is supplied to help with
evaluation; you may use it if you wish.

Lab 3 - YIN
Implement the YIN function for analysing the pitch of the given file containing monophonic singing.
Start with the normalised difference function, then add quadratic interpolation to obtain a more
accurate pitch estimate. What is a suitable threshold value for distinguishing voiced from unvoiced
frames? If you have time, implement temporal smoothing and automatic segmentation (finding note
boundaries).
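A sketch of the core computation for one frame (the filename, window size W and threshold are assumptions; the quadratic interpolation step is left to you):
[x, fs] = audioread('singing.wav');     % placeholder filename; assumed mono
W = 1024;                               % assumed integration window size
frame = x(1:2*W);                       % need 2W samples for lags up to W
d = zeros(W, 1);
for tau = 1:W
  d(tau) = sum((frame(1:W) - frame(tau+1:tau+W)).^2);   % difference function
end
dn = d .* (1:W)' ./ cumsum(d);          % cumulative mean normalised difference
tau0 = find(dn < 0.1, 1);               % first lag below an assumed threshold of 0.1
if isempty(tau0)
  f0 = 0;                               % treat frame as unvoiced
else
  f0 = fs / tau0;
end
Strictly, YIN takes the local minimum of dn following the first threshold crossing rather than the crossing itself; refining this, and then interpolating around the chosen lag, improves accuracy.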
If you are interested, you can read about the research involving this and other recordings of Happy
Birthday here:
www.eecs.qmul.ac.uk/~simond/pub/2014/MauchFrielerDixon_IntonationInUnaccompaniedSinging.pdf

Lab 4 - ADRess
In this lab you will implement an offline version of the Azimuth Discrimination and Resynthesis
(ADRess) source separation algorithm. Given a stereo audio file, the resolution β of the azimuth
space, a position d in the azimuth space where 0 ≤ d ≤ β, and a radius h < β around the position d,
the algorithm outputs a stereo audio file containing the separated sources in the given range of
positions. If you have time, try to visualise the data and work out (manually or automatically) where
sources are present in the azimuth plane.
The main loop consists of the following steps: get the next frame of audio data; apply a window
function; compute the FFT of the frame; compute the magnitude spectrum of the source(s) present
at the chosen azimuth position(s); reconstruct the complex spectrum using the original phase values;
perform an inverse FFT; apply a window function; overlap-add the frame to the output. Some audio
files are provided for testing. (ADRess should be able to separate some of the sources in these files.)
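A skeleton of that loop might look as follows (the filename, window and hop sizes, and azimuth parameters are assumptions, and adressMagnitude is a hypothetical helper standing in for your azimuth-plane magnitude estimation):
[x, fs] = audioread('mix.wav');                % placeholder filename; x is N-by-2
M = 4096; hop = M/4;                           % assumed frame and hop sizes
w = 0.5 - 0.5*cos(2*pi*(0:M-1)'/M);            % Hann window
beta = 100; d = 50; h = 10;                    % example azimuth parameters
out = zeros(size(x));
for s = 1:hop:(size(x,1) - M + 1)
  L = fft(x(s:s+M-1, 1) .* w);
  R = fft(x(s:s+M-1, 2) .* w);
  magL = adressMagnitude(L, R, beta, d, h);    % hypothetical helper: your code
  magR = adressMagnitude(R, L, beta, d, h);
  yL = real(ifft(magL .* exp(1i*angle(L)))) .* w;   % original phase, synthesis window
  yR = real(ifft(magR .* exp(1i*angle(R)))) .* w;
  out(s:s+M-1, :) = out(s:s+M-1, :) + [yL yR];      % overlap-add
end
audiowrite('separated.wav', out / max(abs(out(:))), fs);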
