Monday, September 2, 2013

Activity 12: Playing Notes by Image Processing


Another great application of image processing is converting images into sound. Say I have a music sheet in image format. I can use morphological filters to isolate the notes, find their locations, convert those locations to frequencies, and essentially play the notes!

Here, I have a song sheet for Twinkle Twinkle Little Star, one of the easiest songs to play on the piano.


Here are the notes placed on one single line to make image processing easier. This at least normalizes the y-coordinates.





Then I use the code snippet below:
[Row,Column] = size(MS);
SE1 = CreateStructureElement('vertical_line',3);
SE2 = CreateStructureElement('horizontal_line',2);
A = ErodeImage(~MS, SE1); //Removes Horizontal lines
B = ErodeImage(A, SE2);  //Removes Vertical Lines
imwrite(mat2gray(B),'C:\Users\Phil\Desktop\Academic Folder\Academic Folder 13-14 First Sem\AP 186\Activity 12\MorpedTwinkle.png');
BlobImage = SearchBlobs(B);
NumberofBlobs = max(BlobImage); //Blobs are labelled 1..N, so the maximum label is the count
IsCalculated = CreateFeatureStruct(%f);
IsCalculated.Centroid = %t;

BlobStatistics = AnalyzeBlobs(BlobImage,IsCalculated);
for i=1:NumberofBlobs
    xc(i) = BlobStatistics(i).Centroid(1);   //x-centroid of blob i
    yc(i) = BlobStatistics(i).Centroid(2);   //y-centroid of blob i
    Area(i) = size(find(BlobImage==i),2);    //Pixel count of blob i
end

This removes the horizontal and vertical lines from the song sheet. The centroids of the remaining blobs then give me the location of each note.
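To illustrate the idea outside Scilab, here is a minimal pure-Python sketch of the same two steps: erosion with a short vertical and a short horizontal line, followed by blob centroids. It is not the IPD toolbox implementation; the helper names (`erode`, `blob_centroids`) and the toy `sheet` image are made up for this example, and the image is assumed to be already inverted so that ink is 1.

```python
from collections import deque

def erode(img, offsets):
    """Binary erosion: a pixel survives only if every offset neighbour is 1."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            out[r][c] = int(all(
                0 <= r + dr < rows and 0 <= c + dc < cols and img[r + dr][c + dc]
                for dr, dc in offsets
            ))
    return out

def blob_centroids(img):
    """Find 4-connected blobs, left to right; return (xc, yc, area) per blob."""
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for c0 in range(cols):
        for r0 in range(rows):
            if not img[r0][c0] or seen[r0][c0]:
                continue
            queue, pixels = deque([(r0, c0)]), []
            seen[r0][c0] = True
            while queue:  # breadth-first flood fill of one blob
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and img[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            area = len(pixels)
            blobs.append((sum(c for _, c in pixels) / area,
                          sum(r for r, _ in pixels) / area,
                          area))
    return blobs

VERTICAL_3 = [(-1, 0), (0, 0), (1, 0)]  # 3-pixel vertical line
HORIZONTAL_2 = [(0, 0), (0, 1)]         # 2-pixel horizontal line

# Toy sheet (hypothetical): one staff line, two 3x3 note heads, two thin stems.
sheet = [
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
]

no_staff = erode(sheet, VERTICAL_3)       # thin horizontal lines vanish
heads = erode(no_staff, HORIZONTAL_2)     # thin vertical stems vanish
blobs = blob_centroids(heads)
print(blobs)
```

Only the parts of the image that are at least three pixels tall and two pixels wide survive both erosions, which is why the note heads remain while the staff line and stems disappear.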


I then plot the centroids of the remaining blobs in x and y coordinates, and do the following conversion:



As one can see, both graphs are similar, which is exactly what I need. The frequencies are then played using the code snippet below:

function n = note(f, t)
n = sin (2*%pi*f*t);
endfunction;

//Notes 1st Octave
R = 0;             //Rest: sin(0) gives silence
G0_lo = 196.00*2;  //Low G, one octave below G0
A0 = 220.00*2;
B0 = 246.94*2;
C0 = 261.63*2;
D0 = 293.66*2;
E0 = 329.63*2;
F0 = 349.23*2;
G0 = 392.00*2;

//Notes 2nd Octave
G1_lo = 196.00*4;  //Low G, one octave below G1
A1 = 220.00*4;
B1 = 246.94*4;
C1 = 261.63*4;
D1 = 293.66*4;
E1 = 329.63*4;
F1 = 349.23*4;
G1 = 392.00*4;

//Durations 
EN = soundsec(0.125); //Eighth note
QN = soundsec(0.25); //quarter note
HN = soundsec(0.5); //Half note
FR = soundsec(1.); //Full note
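
As a rough cross-check of the synthesis step, here is a small Python sketch of the same idea: build a vector of time stamps, then evaluate sin(2πft) on it. The Python `soundsec` below only approximates the role of Scilab's function (the exact number of samples may differ by one), and the 22050 Hz sample rate is an assumption based on Scilab's default.

```python
import math

SAMPLE_RATE = 22050  # assumed: Scilab's soundsec defaults to 22050 samples/s

def soundsec(seconds, rate=SAMPLE_RATE):
    """Time stamps, in seconds, covering `seconds` of audio at `rate` Hz."""
    return [i / rate for i in range(int(seconds * rate))]

def note(f, t):
    """Sample a sine wave of frequency f (Hz) at the time stamps in t."""
    return [math.sin(2 * math.pi * f * ti) for ti in t]

QN = soundsec(0.25)       # quarter note, as in the Scilab listing
a_note = note(440.0, QN)  # A0 = 220.00*2 in the listing
rest = note(0, QN)        # R = 0 produces pure silence

print(len(QN), max(a_note), max(rest))
```

Because the duration only enters through the length of the time vector, notes of different lengths produce vectors of different sizes.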

I was able to convert the y-coordinates to notes successfully using the code snippet below. The x-spacing conversion still needs clean-up: so far I can convert the x-spacing into a matrix of durations, but when I pass those durations to soundsec(), the resulting matrices have different sizes, which I have yet to work around.

//Converts Y position to note frequency
yn = yc;
sens = 0.02; //Sets sensitivity (half-width of each band)
yn(find(yc<0.15 +sens & yc> 0.15-sens)) = C0;
yn(find(yc<0.382 +sens & yc> 0.382-sens)) = G0;
yn(find(yc<0.436 +sens & yc> 0.436-sens)) = A1;
yn(find(yc<0.317 +sens & yc> 0.317-sens)) = F0;
yn(find(yc<0.271 +sens & yc> 0.271-sens)) = E0;
yn(find(yc<0.214 +sens & yc> 0.214-sens)) = D0;

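//Converts X spacing to note duration

for k=1:length(xc)-1
    xc(k) = abs(xc(k+1) - xc(k)); //Gap to the next note
end
xn = xc;
xn(find(xc>=0.033)) = 0.5;  //Wide gap: half note
xn(find(xc<0.033)) = 0.25;  //Narrow gap: quarter note
game = [];
for j=1:length(yn)
    game($+1,:) = note(yn(j),soundsec(0.5)); //Every note is played for 0.5 s for now
end
SOUND = matrix(game', 1, length(game)); //Flattens the rows into one long signal
sound(SOUND);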
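
One possible way around the size mismatch is to append each note to one flat signal instead of stacking notes as matrix rows (in Scilab this would be a row concatenation such as `SOUND = [SOUND, n]`). The Python sketch below shows the idea end to end under several assumptions: the pitch table copies the values from the listing above, unmatched positions are treated as rests, the last note is given a half-note duration, and the `xs`/`ys` centroids are invented for illustration.

```python
import math

SAMPLE_RATE = 22050  # assumed sample rate in Hz

# (y-centroid, frequency in Hz) pairs taken from the Scilab lookup above.
PITCH_TABLE = [
    (0.150, 523.26),  # C0
    (0.214, 587.32),  # D0
    (0.271, 659.26),  # E0
    (0.317, 698.46),  # F0
    (0.382, 784.00),  # G0
    (0.436, 880.00),  # A1
]

def y_to_freq(y, sens=0.02):
    """Return the frequency whose band contains y, or 0.0 (assumed: a rest)."""
    for centre, freq in PITCH_TABLE:
        if abs(y - centre) < sens:
            return freq
    return 0.0

def gaps_to_durations(xs, threshold=0.033, last=0.5):
    """Wide gap to the next note -> 0.5 s, narrow gap -> 0.25 s.

    The final note has no following gap, so its duration is an assumption.
    """
    durations = [0.5 if (b - a) >= threshold else 0.25
                 for a, b in zip(xs, xs[1:])]
    return durations + [last]

def tone(freq, seconds, rate=SAMPLE_RATE):
    """Sine samples for one note of the given length."""
    return [math.sin(2 * math.pi * freq * i / rate)
            for i in range(int(seconds * rate))]

def render(xs, ys):
    """Concatenate notes of different lengths into one flat sample list."""
    samples = []
    for y, d in zip(ys, gaps_to_durations(xs)):
        samples.extend(tone(y_to_freq(y), d))
    return samples

xs = [0.10, 0.12, 0.14, 0.18]      # hypothetical x-centroids
ys = [0.150, 0.150, 0.382, 0.436]  # hypothetical y-centroids: C, C, G, A
melody = render(xs, ys)
print(gaps_to_durations(xs), len(melody))
```

Since the notes are joined end to end, each one can have its own length and no padding or reshaping is needed.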

Anyhow, here is the resulting sound in .mp4 format:




References:
1. Jing, Soriano - Playing notes by Image Processing, 2013

I would give myself a 10 for this activity because I was able to use morphological filters to play notes in Scilab. I would love to update the code to be able to:

1. Play any music sheet, so that I don't have to edit the song sheet first.
2. Expand the code's dictionary to include eighth notes, rests, sharps, etc.
3. Use the area of each blob to help determine the note length. This is particularly useful for half notes, which I have yet to put to use...

