On the Concept of Mobility
The words mobility and network are found together in many contexts. The issue alone of modeling geographical user mobility in wireless networks has countless applications. Just to name a few:
• resource allocation, typically bandwidth;
• pervasive computing, in a world where every object is a computing device that can be of help to persons or other devices;
• service continuity: in the Internet realm, this is the case of the various mobile IP solutions, but we can also focus on the lower-level issue of physical hand-off, possibly between heterogeneous networks;
• location-aware services, from navigation to advertising;
• disruption-tolerant networks, for example in vehicular, military, social, or cooperative applications.
Depending on one's background, the concept is investigated with very different tools and aims. Given also the huge number of works, it would simply not be practical to cite the "most relevant" contributions to a field so difficult to define. Nevertheless, to highlight once more the breadth of the subject, let us cite [1], where Nokia points out the scientific and commercial need of considering even cultural factors when answering the "who is doing what and where" question.
It may seem that by now there is not much left to add to the concept of mobility, but this is not so. Actually, the last decade also saw a growing interest in code mobility, i.e. the
possibility for software applications (or parts thereof) to migrate and keep working on different devices and in different environments.
The usual idea is that of stateful agents, which can carry with them all the code (i.e. the program itself) and the data they need. A notable real-life and successful application is distributed computing, which under certain hypotheses can obviate the need for expensive supercomputers. The general rationale is splitting a very demanding computing task (such as protein folding or integer factorization) into a large number of independent sub-problems, each addressable by limited-power machines that are weakly connected (typically through the Internet, the quintessence of a wired network).
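Purely to fix ideas, the following Python fragment sketches this split-and-distribute pattern under simplifying assumptions: local worker processes stand in for remote, weakly connected volunteer machines, and the task (a naive trial-division factor search), the number to factor and the chunk size are made-up illustrations, not anything used later in this thesis.

# Illustrative sketch only: an embarrassingly parallel task (naive trial
# division) split into independent chunks, each solvable by a modest worker.
# A real volunteer-computing system would ship the chunks to remote machines
# over the Internet instead of to local processes.
from multiprocessing import Pool

N = 2_000_003 * 3_000_017      # hypothetical composite number to factor
CHUNK = 500_000                # size of each independent sub-problem

def scan_range(bounds):
    """Search one independent range of candidate divisors of N."""
    lo, hi = bounds
    return [d for d in range(max(lo, 2), hi) if N % d == 0]

if __name__ == "__main__":
    limit = int(N ** 0.5) + 1
    chunks = [(lo, min(lo + CHUNK, limit)) for lo in range(2, limit, CHUNK)]
    with Pool() as pool:                       # stand-in for remote workers
        divisors = [d for part in pool.map(scan_range, chunks) for d in part]
    print("non-trivial divisors found:", divisors)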
One of the most extreme visions, however, is probably ActiveNet [2], a replacement for the Internet. It is not just a fancy name, but rather a thorough reworking of passive packet-switched networks that would evolve packets into "capsules", a sort of micro-agent. Eventually, any participating node (from routers to user devices) would perform some computation, according to the instructions embedded in the capsules it receives or relays.
However appealing this proposal might be, we decided to concern ourselves with present-day issues. In particular, we organized this thesis in two distinct and independent parts. We of course refer the reader to the respective introductory sections for the details, but let us anticipate that:
• Part I
deals with audio fingerprinting, with a special emphasis on the application of broadcast monitoring and on implementation aspects (part of this work is being used commercially). Although the problem is tackled from many sides, one of the most prominent difficulties is the high computing power required for the task. We thus devised and operated a distributed-computing solution, which is described in detail. Tests were conducted on the computing cluster available at the Department of Engineering of the University of Ferrara.
• Part II
focuses instead on wireless networks. Even if the approach is quite general, the stress is on WiFi networks. More specifically, we tried to evaluate how mobile users' experience can be improved. Two tools are considered. In the first place, we wrote a simulator and used it to estimate the impact of pricing strategies in allocating the bandwidth resource, showing the need for such solutions. Secondly, we developed a high-level simulator whose results strongly suggest investigating further the topic of user cooperation for the selection of the "best" access point, when many are available. We also propose one such policy.
Part I
Mobility of Applications:
a Framework for
Audio Fingerprinting
Chapter 1
What is Audio Fingerprinting?
Besides the unquestionable scientific interest, the availability of vast digital archives of music and the related commercial interests are pushing many efforts into the field of automatic audio recognition. The opportunity of exploiting ever more computing power, at steadily decreasing costs, certainly plays a key role as well. Indeed, chances are that what a decade ago would have been practical only by means of specialized hardware can now be accomplished in software (sometimes even on portable devices).
Though related to speech recognition, this truly multiform and demanding topic is actually addressed in distinct ways. Audio recognition may aim to discern whether or not two pieces are in fact the same, regardless of their outer appearance (i.e. coding, distortions). More generally, it comes into play when we are interested in reliably identifying an unknown excerpt of e.g. music, given a large set of references. A first glimpse of its rationale and possible applications is given in Fig. 1.1.
The task of automated audio recognition used to be accomplished by using invasive watermarking techniques. However, this requires either permanent human intervention, or that a single watermarked source accounts for every possible instance of a given pattern. Neither method is feasible: the former needs an active listener who correctly marks pieces, whereas the latter is clearly not realistic. A different approach is therefore needed.
Figure 1.1: A simplified overview of what audio fingerprinting is.
A working, and actually very good, solution is the so-called audio fingerprinting [3,4], also known as robust audio hashing. Its purpose is to allow electronic devices to identify perceptual similarities between audio contents. The term "fingerprint" recalls the fact that every piece of audio bears unique features, as detectable by listeners.
The usual system in which a fingerprinting algorithm operates involves the building of a large fingerprint database, used as a reference source for identifying unknown fingerprints (compare also Fig. 2.1). Note that the search task is heavily demanding, since we are supposed to find the most similar (not simply an exact) match in a huge amount of data (e.g. some 100,000 song fingerprints).
Such tight search-time requirements (a speed-up factor of roughly 100 over real time, at least, should generally be attained) and the high complexity of the task have also led to considerable effort being put into the matching problem. In [5], for example, gene-sequence matching algorithms are borrowed from biology, allowing for a lightning-fast identification of an event-based fingerprint, computable over the output of any feature extraction algorithm.
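To fix ideas, the matching problem can be stated in its brute-force form as a minimum bit-error-rate (i.e. normalized Hamming distance) search over binary fingerprints. The sketch below illustrates only this formulation; practical systems replace the linear scan with indexed lookups, and the block length, acceptance threshold and random database used here are purely illustrative assumptions.

# Illustrative sketch only: brute-force "most similar" search over binary
# fingerprint blocks using the bit error rate (normalized Hamming distance).
# Real systems use indexed lookups; block length and threshold are made up.
import numpy as np

BLOCK_BITS = 256 * 32    # hypothetical block: 256 sub-fingerprints of 32 bits
THRESHOLD = 0.35         # accept a match only below this bit error rate

def bit_error_rate(a, b):
    """Fraction of differing bits between two equal-length bit vectors."""
    return np.count_nonzero(a != b) / a.size

def best_match(query, database):
    """Return (title, ber) of the closest reference block, or None."""
    title, ber = min(((name, bit_error_rate(query, ref))
                      for name, ref in database.items()),
                     key=lambda item: item[1])
    return (title, ber) if ber < THRESHOLD else None

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    db = {f"song-{i}": rng.integers(0, 2, BLOCK_BITS, dtype=np.uint8)
          for i in range(1000)}
    query = db["song-42"].copy()            # simulate a distorted excerpt:
    flips = rng.random(BLOCK_BITS) < 0.10   # flip 10% of its bits
    query[flips] ^= 1
    print(best_match(query, db))            # expected: ('song-42', ~0.10)

Unrelated random fingerprints differ in about half of their bits, so the threshold mainly trades off false negatives under heavy distortion against false positives with respect to the rest of the database.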
On our side, we focused our attention on a simple yet very robust algorithm, whose basic operation was first described by Haitsma et al. in [6]. The distinctive feature it extracts is intimately related to the energy of the audio signal. More specifically, it takes into account how the energy difference among a set of sub-bands varies in time; in other words, it evaluates a second-order difference (along frequency and then time) of the energy spectrogram. The reasons that drove our choice and the details of the algorithm are provided later, in Section 2.1.
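Since the details are deferred to Section 2.1, the fragment below is only a minimal sketch of the kind of energy-difference sign extraction described in [6]: each analysis frame yields one 32-bit sub-fingerprint, where bit m is the sign of the band-to-band energy difference, differentiated again from one frame to the next. Frame length, overlap and band edges are illustrative choices, not necessarily the parameters adopted in this work.

# Minimal sketch of energy-difference sign extraction in the spirit of [6].
# Parameters (frame length, hop, band edges) are illustrative only.
import numpy as np

def subband_energies(signal, sr, frame_len=2048, hop=64,
                     n_bands=33, f_lo=300.0, f_hi=2000.0):
    """Energy E[n, m] of frame n in logarithmically spaced band m."""
    edges = np.geomspace(f_lo, f_hi, n_bands + 1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / sr)
    window = np.hanning(frame_len)
    starts = range(0, len(signal) - frame_len + 1, hop)
    energies = np.empty((len(starts), n_bands))
    for n, start in enumerate(starts):
        spectrum = np.abs(np.fft.rfft(window * signal[start:start + frame_len])) ** 2
        for m in range(n_bands):
            in_band = (freqs >= edges[m]) & (freqs < edges[m + 1])
            energies[n, m] = spectrum[in_band].sum()
    return energies

def sub_fingerprints(energies):
    """Bit F[n, m] = 1 iff (E[n,m] - E[n,m+1]) - (E[n-1,m] - E[n-1,m+1]) > 0."""
    d_freq = energies[:, :-1] - energies[:, 1:]   # difference along frequency
    d_time = d_freq[1:, :] - d_freq[:-1, :]       # difference along time
    return (d_time > 0).astype(np.uint8)          # one row per sub-fingerprint

if __name__ == "__main__":
    sr = 8000
    t = np.arange(0, 3.0, 1.0 / sr)
    demo = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
    bits = sub_fingerprints(subband_energies(demo, sr))
    print(bits.shape)     # (number of frames - 1, 32)

Because only the signs of these differences are kept, moderate distortions such as equalization or lossy coding tend to leave most bits unchanged, which is what makes a comparison by bit error rate meaningful.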
We analyze in this work many essential aspects of the algorithm [7]. We chose it as the core of a more general framework for audio fingerprinting, which we will describe in detail. Focusing especially on the application of broadcast monitoring, we also provide effective strategies for improving the overall system and discuss the results of extensive experiments, based on a database of approximately 100,000 songs. By broadcast monitoring we mean the continuous tracking of an audio source such as a radio channel. Music TV and satellite channels may also be processed in a similar fashion, even if a combined approach including their video component would probably be a better choice. But let us take a look at how audio fingerprinting has been performed so far.
As we pointed out, interest in the field has been growing steadily in recent years, producing quite a large number of good contributions. Significant reviews can be found in the already cited [3,4] and also in [8], but they date back to 2005 and cannot account for the latest developments.
First of all, we stress that the applications are really manifold. For example, copyright issues could be easily addressed in p2p file-sharing networks or video-sharing services, and meta-data retrieval and broadcast monitoring [5,9] can be achieved as well. A music consumer could also quickly check whether his or her large collection already contains a given song [10]. Another nice application is so-called "query by humming" [11], an early work in multimedia searching. A similar algorithm is now freely available on-line [12].
According to the researcher's background and goals, the topic can be addressed in many different ways. The first step invariably involves isolating a sequence of
"features" in the piece of audio; the longer the piece of audio, the more features there are. This set of features is said to be the "fingerprint" of the piece.
Most of all, the fingerprint must be sufficiently distinguishable, so that two fingerprints can be reliably told apart or regarded as similar [13]. Along with the retrieval technique, it must also prove robust, i.e. exhibit high reliability even when various kinds of distortion occur, such as equalization, white noise, pitch shifting and so on. Finally, it must be fast to compute from e.g. a PCM (Pulse Code Modulation) signal, and it must also achieve a significant size reduction with respect to the original PCM samples.
On the basis of these considerations, we can also speak of "robust hashing" since, similarly to standard hashing techniques, audio fingerprinting aims to produce a compact and fast-to-check-and-retrieve representation of a complex audio signal. Usually, as in our case, one needs to work only on the waveform representation of the audio signal, with no further semantic aid (such as a score would provide). Needless to say, regular hash functions cannot be employed, since they rely on the precise bit sequence of the particular digitization of the audio signal.
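A toy comparison makes the point concrete: a conventional cryptographic hash changes completely under a perceptually irrelevant perturbation, whereas a fingerprint compared by bit error rate degrades gracefully. The signals and the energy-sign "fingerprint" below are deliberately simplistic stand-ins, not the algorithm analyzed in this thesis.

# Toy comparison only: why a conventional hash cannot identify audio.
# The "fingerprint" here (sign of the frame-energy variation) is a
# deliberately simplistic stand-in for a real fingerprinting algorithm.
import hashlib
import numpy as np

def toy_fingerprint(signal, frame=1024):
    """One bit per frame transition: did the frame energy grow or shrink?"""
    n = len(signal) // frame
    energy = (signal[: n * frame].reshape(n, frame) ** 2).sum(axis=1)
    return (np.diff(energy) > 0).astype(np.uint8)

sr = 8000
t = np.arange(0, 2.0, 1.0 / sr)
original = np.sin(2 * np.pi * 440 * t) * np.linspace(0.2, 1.0, t.size)
distorted = original + 0.01 * np.random.default_rng(0).normal(size=t.size)

# A regular hash of the raw samples differs completely after the barely
# audible perturbation...
print(hashlib.sha256(original.tobytes()).hexdigest()[:16])
print(hashlib.sha256(distorted.tobytes()).hexdigest()[:16])

# ...while the toy fingerprints still agree on (almost) every bit.
fp_a, fp_b = toy_fingerprint(original), toy_fingerprint(distorted)
print("bit error rate:", np.count_nonzero(fp_a != fp_b) / fp_a.size)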
A large selection of different audio features can be extracted, usually chosen on the basis of heuristic considerations, with a few exceptions [14]. It is also possible to combine several of them into one single identification model [15], aiming to mutually compensate for their possible weaknesses. However, this can be computationally expensive, if not prohibitive beyond small databases. Other, less common approaches include Principal Component Analysis (PCA), as in [16] and [17], or statistical modeling combined with Hidden Markov Models (HMMs), for example [18]. Although effective, these are often difficult to render efficient enough for real-time employment.
The usual feature extraction technique involves a combined time-frequency analysis, mainly performed through Fourier transforms. A few works propose wavelet transforms instead, which may be a more rewarding choice [19-21]. A combination of an FFT and a DCT is employed in [22]. This paper is particularly interesting since it builds on [6], as our approach does, but introduces a further processing stage (involving the DCT) in order to improve the reliability of the indexed search. The technique