MUSICAL UNIVERSE
G. Pavlidis, D. Tsiafakis, F. Arnaoutoglou, K. Balla, C. Chamzas
Cultural and Educational Technology Institute (CETI)
Research and Innovation Center in Information, Communication and Knowledge Technologies (ATHENA)
58 Tsimiski Str., 67100, Xanthi, Greece
{gpavlid, tsiafaki, fotarny, kballa, chamzas}@ceti.gr
http://www.ceti.gr
A. Pikrakis
Dept. of Informatics and Telecommunications
University of Athens (UOA)
Panepistimioupolis, TYPA Buildings, 15784, Athens, Greece
pikrakis@di.uoa.gr
http://www.di.uoa.gr/dsp
INTRODUCTION
As 3D technologies become increasingly common and gain ever more attention, acceptance and appreciation from the wide public, there is a growing demand for more sophisticated user interfaces that use metaphors of real life and keep the user away from low-level information and data structures. Nowadays, users demand access to data in a more human-like and comprehensible style rather than through typical text-based interfaces. Additionally, there is an increasing demand for interfaces that reflect the content of the data and are able to provide content-based retrieval and representations. These facts, complemented by the continuous development of 3D display hardware in the form of either passive or active technologies, form an open field for research and development on new means of man-machine interfaces at the algorithmic and software level.
The system developed in this work is an innovative interactive visualization system for digital music collections and is, in effect, a proposal for future user interfaces and front-end virtual environments for multimedia, multi-dimensional databases that store cultural data, such as musical archives and collections. Digital music can be accompanied by complementary information and metadata that describe its content. How this accompanying information is extracted from the musical data and inserted into the collection is not a subject of this work. The focus here is on the 3D visualization environment, which is specifically designed for digital music collections and is capable of multi-scale representations and direct access to all the digital content. The system provides a virtual space based mainly on a taxonomy of musical genre categories and sub-categories.
_____________________________________________________________________
1/9
Third International Conference of Museology & Annual Conference of AVICOM
Mytilene, June 5 – 9, 2006
Specifically, the interface provides 3D data clustering and visualization according to metadata and content-based descriptions that have already been extracted and stored within the archive. The system is based on two main sub-systems:
• the clustering sub-system, which uses Self-Organizing Maps (SOM) (Kohonen 1982, 1995, 2001) to cluster the data and construct the 3D virtual space in terms of 3D coordinates of the data, and
• the visualization and user interaction sub-system, which uses OpenGL graphics capabilities to draw the virtual space on the user’s screen and provide the required interactivity.
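As a rough illustration of the clustering step, the sketch below trains a tiny self-organizing map and maps each feature vector to the grid coordinates of its best-matching neuron. It is a minimal stand-in under simplifying assumptions (2D grid, toy parameters), not the system's actual implementation, which produces coordinates for the 3D virtual space.

```python
# Minimal SOM sketch: illustrative only, not the authors' implementation.
import math
import random

def train_som(data, grid=(4, 4), epochs=50, lr0=0.5, radius0=2.0, seed=0):
    """Train a small 2D SOM; returns weight vectors keyed by grid position."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = {(i, j): [rng.random() for _ in range(dim)]
               for i in range(grid[0]) for j in range(grid[1])}
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)          # decaying learning rate
        radius = radius0 * (1 - epoch / epochs)  # shrinking neighborhood
        for x in data:
            # best-matching unit: neuron whose weights are closest to x
            bmu = min(weights, key=lambda p: sum((w - v) ** 2
                      for w, v in zip(weights[p], x)))
            for pos, w in weights.items():
                d = math.dist(pos, bmu)  # distance on the neuron grid
                if d <= radius + 1e-9:
                    h = math.exp(-d * d / (2 * (radius + 1e-9) ** 2))
                    weights[pos] = [wi + lr * h * (xi - wi)
                                    for wi, xi in zip(w, x)]
    return weights

def bmu_coords(weights, x):
    """Map a feature vector to the grid coordinates of its best neuron."""
    return min(weights, key=lambda p: sum((w - v) ** 2
               for w, v in zip(weights[p], x)))
```

After training, nearby pieces of music land on nearby neurons, so the BMU coordinates can serve directly as positions in the visualization space.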
The clustering sub-system was trained and evaluated using a collection of Greek traditional music, formed at CETI as part of a national R&D project named “Polymnia” (http://www.polymnia.gr) and containing about 1000 files. The collection takes the form of a native XML database (eXist, http://exist.sourceforge.net) that includes all the digital information: links to digital musical files and audio samples, accompanying information (such as scores, references and analysis work) and metadata (such as the genre, the beat and the meter, or other automatically extracted low-level characteristics).
Capturing the enchanting beauty of the universe in all its colorful formations and structures, and combining it with today’s 3D technologies to bring it to a common computer system as an enhanced, friendly and appealing user interface for multi-dimensional data visualization, was the driving force behind this work. Contemporary digital music archives can be thought of as multi-dimensional databases consisting of both the musical content and its description. Descriptions of musical data can be multiple and multi-dimensional. Because these descriptions usually characterize a piece of music in terms of multiple consecutive millisecond fractions, a single characteristic can take the form of a numerical vector of as many as a thousand dimensions. Combined with the fact that each of these descriptions captures only one or a small number of acoustic features of the music, one can imagine how many dimensions have to be used to provide an overall signature-like description of a piece of music. Provided that such descriptions exist in the musical collection, this work describes a way of exploiting high-dimensional descriptions to provide a visualization, interaction and data access interface using the visual metaphor of the universe.
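To make the dimensionality argument concrete, the sketch below computes one simple per-frame feature (RMS energy) over 10 ms frames of a synthetic tone. The sample rate, frame length and feature choice are illustrative assumptions, not the descriptors used by the actual system; the point is only that even one feature over consecutive frames yields a vector with hundreds of dimensions.

```python
# Sketch of how frame-based description inflates dimensionality.
import math

def frame_energies(samples, frame_len):
    """Root-mean-square energy of consecutive, non-overlapping frames."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        feats.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return feats

sr = 8000                      # assumed sample rate (Hz)
frame = sr // 100              # 10 ms frames
# 3 seconds of a 440 Hz sine as a stand-in for real audio
signal = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr * 3)]
vector = frame_energies(signal, frame)
# 3 s -> 300 frames -> a 300-dimensional description from a single feature
```

A full clip with several such features quickly reaches the thousand-dimensional signatures discussed above.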
The visualization and interaction scenario in the proposed virtual space is as follows:
• a “planet” is the data unit of the system and represents a piece of digital music in the collection. It is placed within a “planetary system” that consists of similar and directly related musical pieces in terms of a selected low-level characteristic (such as the beat, the meter, or a combination of both). The planet provides a link not only to the digital musical file but also to the accompanying information and metadata. Furthermore,
• a “planetary system”, which groups similar planets, belongs to a “galaxy” consisting of planetary systems that share a common high-level characteristic (such as the musical genre) but are distinguishable through a lower-level characteristic. Finally,
• the galaxies, all together, form the “musical universe”. Through its organized representation, the universe visually describes the entire body of information stored in the digital music collection.
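The planet / planetary system / galaxy / universe hierarchy can be modeled with a few small data structures. The sketch below is purely illustrative; the class and field names are hypothetical and not taken from the actual system.

```python
# Hypothetical data model for the hierarchy described above.
from dataclasses import dataclass, field

@dataclass
class Planet:                       # one musical piece
    title: str
    file_link: str                  # link to the digital musical file
    metadata: dict = field(default_factory=dict)

@dataclass
class PlanetarySystem:              # pieces similar in a low-level feature
    feature: str                    # e.g. "beat" or "meter" (assumed names)
    planets: list = field(default_factory=list)

@dataclass
class Galaxy:                       # systems sharing a high-level feature
    genre: str
    systems: list = field(default_factory=list)

@dataclass
class Universe:                     # the whole collection
    galaxies: list = field(default_factory=list)

    def count_planets(self):
        return sum(len(s.planets)
                   for g in self.galaxies for s in g.systems)
```

Each level carries just enough information to draw its members and to drill down one level on selection.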
In the following paragraphs, a brief review of related works is given and a detailed description of the system is provided.
Among the many works in the field of audio data organization and visualization, we reference here just two that played a very important role in the conception of the Musical Universe:
• In 2001, Pampalk, in his diploma thesis “Islands of Music”, building on the work of Freuhwirth and Rauber (2001) as well as that of Rauber and Freuhwirth (2001), developed a system that extracts musical features and uses Self-Organizing Maps to construct a 2D visualization environment based on the visual metaphor of islands in the sea. His environment displays a pseudo-3D landscape in which island formations represent groups of musical pieces with similar characteristics. The placement of the various islands and their relative distances indicate how well the characteristics chosen to represent the music can distinguish the data.
• In 2001, Tzanetakis et al. proposed a set of features to describe musical genre in musical collections and showed how their genre classification scheme can be accompanied by visualization, either as a histogram, which they named GenreGram, or as a virtual space, which they named GenreSpace. GenreSpace is a 3D interactive space that represents the classes formed after genre classification and after reduction of the data to three dimensions.
The Musical Universe, in its pilot form, consists of two main sub-systems:
1. Clustering sub-system for Universe formation
2. Enhanced user interface sub-system
1. Data structure formation: the information the system uses to produce the appropriate clustering is a multi-dimensional combination of selected metadata from the musical database. Several such combinations have been tested, each producing different clustering results. For the purposes of this work, the final dataset was seven-dimensional and consisted of the following features:
a. the textual musical genre (represented by an index)
_____________________________________________________________________
4/9
Third International Conference of Museology & Annual Conference of AVICOM
Mytilene, June 5 – 9, 2006
G. Pavlidis, D. Tsiafakis, F. Arnaoutoglou, K. Balla, C. Chamzas
_____________________________________________________________________
b. the mean value of the genre feature
c. the meter
d. the beat
e. the beat detection certainty
The data undergo a Principal Component Analysis (PCA) (Pearson 1901, Smith 2002) that reduces the dimensions to three. It is significant to note here that the dataset can consist of several thousand entries, as a music collection can grow enormously.
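The PCA step can be sketched in a few lines. The following is a minimal illustration using numpy's eigendecomposition of the covariance matrix, not the authors' implementation:

```python
# Minimal PCA sketch: project seven-dimensional feature vectors onto
# their three principal components (illustrative, not the actual code).
import numpy as np

def pca_project(data, n_components=3):
    """Project rows of `data` onto the top principal components."""
    X = np.asarray(data, dtype=float)
    X = X - X.mean(axis=0)                # center each feature
    cov = np.cov(X, rowvar=False)         # feature covariance matrix
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    top = vecs[:, np.argsort(vals)[::-1][:n_components]]
    return X @ top                        # coordinates in the virtual space
```

Each row of the result is a 3D point, one per musical piece, ready to be placed in the virtual space.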
_____________________________________________________________________
5/9
Third International Conference of Museology & Annual Conference of AVICOM
Mytilene, June 5 – 9, 2006
G. Pavlidis, D. Tsiafakis, F. Arnaoutoglou, K. Balla, C. Chamzas
_____________________________________________________________________
The k-means algorithm determines the clusters according to the spatial distribution of the data and was explicitly set not to exceed three clusters. In this way, the neurons representing the musical pieces that belong to a specific genre are clustered, according to their content-based descriptions, into sub-genre classes. In a final step, the actual musical pieces of the database are mapped to the three-dimensional virtual space according to a best-matching-neuron topology representation: the data are labeled according to the neurons that best represent them and, thus, the musical galaxies and planetary systems are formed. Each galaxy’s data positions are stored and used to draw the galaxies in the visualization environment. Such a clustering for a particular genre of Greek traditional music is shown in Figure 2. In this figure, the dots represent musical pieces and the stars represent the centroids of the classes, which are distinguished by different gray levels. As seen in this figure, the unsupervised clustering performed by the k-means algorithm can efficiently recognize the different data characteristics.
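A plain k-means pass, capped at three clusters as described above, can be sketched as follows. Initializing from the first k points is a simplifying assumption here (practical implementations usually randomize initialization), and this is a stand-in rather than the system's code:

```python
# Lloyd's k-means sketch for the sub-genre clustering step (k <= 3).
import math

def kmeans(points, k=3, iters=50):
    """Return (labels, centroids); first k points seed the centroids."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # assign each point to its nearest centroid
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # move each centroid to the mean of its assigned points
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return labels, centroids
```

The resulting labels correspond to the sub-genre classes; the class centroids are the "stars" of Figure 2.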
_____________________________________________________________________
6/9
Third International Conference of Museology & Annual Conference of AVICOM
Mytilene, June 5 – 9, 2006
G. Pavlidis, D. Tsiafakis, F. Arnaoutoglou, K. Balla, C. Chamzas
_____________________________________________________________________
If the user selects a galaxy, the user interface displays the sub-categorization already performed within that galaxy (Figure 5). That is, it presents the planetary systems with similar characteristics, distinguishable by appropriate coloring. Again, the interaction follows the same principles: the user can navigate and rotate to get a preferred view of the data, read the titles above the planets, get all the information regarding a song of his choice as well as a link to the actual digital musical file (Figure 6), and get the full table of contents of the shown galaxy in textual form, as shown in Figure 7. In this way the user intuitively reaches the final step of this visualization scenario, which is to obtain the information about a song in a digital musical collection.
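Selecting a planet in the virtual space ultimately reduces to finding the stored 3D position nearest to the user's selection point. The helper below is a hypothetical illustration of that lookup, not part of the actual interface:

```python
# Hypothetical picking helper: given the 3D positions stored for a galaxy,
# find the planet nearest to the point the user selected.
import math

def pick_planet(positions, click, max_dist=1.0):
    """Return the index of the nearest planet, or None if all are too far."""
    best, best_d = None, max_dist
    for i, pos in enumerate(positions):
        d = math.dist(pos, click)
        if d <= best_d:
            best, best_d = i, d
    return best
```

The returned index can then be used to fetch the planet's title, metadata and link to the digital musical file.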
CONCLUSIONS
REFERENCES
Freuhwirth, M. and Rauber, A. (2001), Self-organizing maps for content-based music clustering, in Proceedings of the 12th Italian Workshop on Neural Nets (WIRN01), Vietri sul Mare, Italy.
Kohonen, T. (1982), Self-organizing formation of topologically correct feature maps,
Biological Cybernetics, 43, pp. 59-62.
Kohonen, T. (1995), Self-organizing maps, Springer-Verlag, Berlin, Germany.
Kohonen, T. (2001), Self-organizing maps, Springer Series in Information Sciences,
30, 3rd edition, Berlin, Germany.
Martinez, J. (ed.) (2004), MPEG-7 Overview, available from http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm, accessed 29 April 2006.
McQueen, J. B. (1967), Some Methods for Classification and Analysis of Multivariate Observations, in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, Berkeley, University of California Press, 1, pp. 281-297.
Pampalk, E. (2001), Islands of Music: Analysis, Organization and Visualization of Music Archives, Diploma Thesis, Vienna, Austria.
Pampalk, E., Rauber, A. and Merkl, W. (2002), Content-based organization and visu-
alization of music archives, in Proceedings of the ACM Multimedia Conference
(ACM MM02), pp. 570 - 579.
Pearson, K. (1901), On Lines and Planes of Closest Fit to Systems of Points in Space,
Philosophical Magazine, 2/6, pp. 559–572.
Pikrakis, A., Antonopoulos J. and Theodoridis S. (2004), Music meter and tempo
tracking from raw polyphonic audio, Proceedings of the 5th International Con-
ference on Music Information Retrieval (ISMIR 2004), Barcelona, Spain.
Rauber, A. and Freuhwirth, M. (2001), Automatically analyzing and organizing music
archives, in Proceedings of the 5th European Conference on Research and Ad-
vanced Technology for Digital Libraries (ECDL 2001).
Smith, L. (2002), A Tutorial on Principal Component Analysis, available from http://csnet.otago.ac.nz/cosc453/student_tutorials/principal_components.pdf, accessed 29 April 2006.
Tzanetakis, G., Essl, G. and Cook, P. (2001), Automatic musical genre classification of audio signals, in International Symposium on Music Information Retrieval (ISMIR 2001), Indiana, USA.
Tzanetakis, G. and Cook, P. (2002), Musical genre classification of audio signals,
IEEE Transactions on Speech and Audio Processing, 10/5, pp. 293-302.