Chapter 1
Introduction
1.1 Preface
This thesis is meant to be the final report of three years of research in the context of the doctoral curriculum Dottorato di Ricerca in Ingegneria dell'Informazione (XIV Ciclo) on the topic of real-time video analysis.
High-speed processing of videos is a key need for many fields. First, multimedia applications, in which videos are growing in relevance. Consider for instance the videos delivered through the web: standards such as MPEG-1, MPEG-2, MPEG-4 and the upcoming MPEG-7 are video co-decs (COmpressor-DECompressors) very frequently used to broadcast videos through the web. In fact, the bandwidth limitation of current web infrastructures prevents the transmission of a huge video as it is. Compressing the video before transmission and decompressing it on the other side is more efficient, since it consumes less bandwidth. In the MPEG standards (especially in the more recent ones) the main part of the co-dec algorithm is the shape coding of the objects moving in the scene: this is, indeed, a typical video analysis task.
A second very large field of application of video analysis is the pure extraction of information from the video itself. The level of the information to be extracted characterizes the video analysis application. Such applications range from shot detection (low level of information), through object detection and tracking (medium level), to scene understanding and modeling (high level). For example, the shot detection task is used to segment a video into scenes, where a scene is a sub-sequence of the video (i.e. a sequence of consecutive frames) with a homogeneous context. This is a very useful task for indexing videos and for context-based information retrieval from videos.
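As an illustrative sketch (ours, not from the thesis), shot detection is often implemented by comparing coarse intensity histograms of consecutive frames and declaring a cut where the distance jumps; the flat-frame format, bin count and threshold below are assumptions made for the example:

```python
def gray_histogram(frame, bins=16):
    """Coarse grayscale histogram, normalized to sum to 1."""
    hist = [0] * bins
    for pixel in frame:                      # frame: flat list of 0..255 values
        hist[pixel * bins // 256] += 1
    total = len(frame)
    return [h / total for h in hist]

def detect_shots(frames, threshold=0.5):
    """Return frame indices where a new shot starts (L1 histogram jump)."""
    cuts = []
    prev = gray_histogram(frames[0])
    for i, frame in enumerate(frames[1:], start=1):
        cur = gray_histogram(frame)
        dist = sum(abs(a - b) for a, b in zip(cur, prev))
        if dist > threshold:
            cuts.append(i)
        prev = cur
    return cuts

# Synthetic clip: 5 dark frames, then 5 bright frames -> one cut at frame 5.
dark = [20] * 64
bright = [220] * 64
clip = [dark] * 5 + [bright] * 5
print(detect_shots(clip))   # -> [5]
```

A real detector would work on decoded frames and tune the threshold per context, but the histogram-difference core is the same low-level operation.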
Object detection and tracking from a sequence of images is probably the most widespread field of video analysis applications. It is a key process for video-based traffic analysis and management systems, for video-surveillance and security systems, for target detection and pointing in military applications, and for many others. Accordingly, most of the research on video analysis reported in the literature addresses this topic.
Lastly, the scene understanding and modeling task uses the information
from the lower levels to model the scene (and, typically, also the objects
present in the scene) in order to understand the behaviour of the objects or
to represent the scene with a higher level of description.
All the above-mentioned applications typically require real-time (or quasi real-time) execution and are characterized by a huge amount of data to be processed. Consider, for instance, the real-time processing of a PAL-standard video (25 frames/sec) at a low resolution of 320×240 pixels. If we have color images (that is, 3 channels per pixel using the RGB color space), each frame will require 320×240×3 = 230,400 bytes and must be processed within 40 msec. Even at this low resolution, a simple transfer of the data requires a bandwidth of 5.49 MB/sec. Studying and, consequently, improving the performance and the efficiency of such systems is one of the main topics of the research described in this thesis. The study has addressed both the hardware and the software point of view, trying to propose solutions that fit both an embedded specialized system and a general-purpose one.
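The arithmetic behind these figures can be checked directly; the sketch below (ours, not from the thesis) reproduces the 40 msec deadline and the 5.49 MB/sec bandwidth quoted above:

```python
width, height, channels = 320, 240, 3    # RGB frame, one byte per channel
frame_rate = 25                          # PAL frames per second

bytes_per_frame = width * height * channels          # 230,400 bytes per frame
deadline_ms = 1000 / frame_rate                      # 40 ms to process each frame
bandwidth_mb = bytes_per_frame * frame_rate / 2**20  # bytes/sec -> MB/sec

print(f"{bytes_per_frame} bytes/frame, {deadline_ms:.0f} ms deadline, "
      f"{bandwidth_mb:.2f} MB/sec")
```

Note that 5.49 here uses binary megabytes (2^20 bytes); in decimal units the same stream is 5.76 MB/sec.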
Besides improving the performance of video analysis applications, during this research new computational models and algorithms for video analysis have been analyzed and defined. In particular, this research has developed novel algorithms for motion detection and moving object segmentation in cluttered and hostile environments, such as outdoor scenes in which sudden changes of the lighting conditions, frequent occlusions of moving objects by buildings, poles, and so on, and the presence of shadows are very limiting factors.
Moreover, this research also covers motion analysis in high-speed videos, that is, videos in which the objects move at very high speed and in which noise often renders the images almost unusable. This last topic is very promising, and little research has been done on it so far by the computer vision community.
This interest in video analysis has resulted, in the last decades, in the wide diffusion of international journals, conferences and workshops on this and related topics. Moreover, substantial funding has been addressed to this topic. In fact, the research described in this thesis has been supported by the following funding:
• Funds for the Progetto di Ricerca Orientata "Estrazione di informazioni visuali complesse in tempo reale: modelli computazionali e tecniche di elaborazione di immagini" (a project for oriented research titled "Complex visual information extraction in real-time: computational models and image processing techniques")
• Financial support to young researchers for the research on "Analisi di sequenze di immagini per sorveglianza e controllo del traffico" ("Analysis of image sequences for surveillance and traffic control")
• A contract for the "Analysis of Camera Car Video of Formula 1", supported by Ferrari SpA - Gestione Sportiva
• Partial support from, and collaboration with, the Department of Electrical and Computer Engineering of the University of California, San Diego (UCSD), Computer Vision and Robotics Research (CVRR) laboratory, headed by Prof. Mohan M. Trivedi, for work on the project ATON (Autonomous Agents for On-Scene Networked Incident Management)
• Partial funding from the "PROGRAMMA STRATEGICO PER LA MOBILITÀ NELLE AREE METROPOLITANE - BOLOGNA" of the Italian Ministry of Public Works
• Funds for a Progetto di Ricerca di Interesse Nazionale, supported by MIUR (Ministero dell'Istruzione, dell'Università e della Ricerca), with the title "Sistemi Web ad elevata qualità del servizio" (a national project on "Web systems with high quality of service")
1.2 Research Goals
Having in mind this preface, we can now outline the goals of this research. As already stated, the first goal is the definition of computational models able to satisfy the highly demanding requirements of real-time video analysis. The first solution studied exploits the natural speed of hardware systems in performing time-consuming operations. In fact, we initially explored the possible architectures able to improve the performance of certain algorithms, for example by parallelizing the computation. We studied and developed dedicated architectures for real-time (frame rate) video processing on an FPGA (Field Programmable Gate Array) board. The aim was to evaluate the hardware solution for a vision-based traffic control system that must satisfy real-time constraints.
This dedicated solution has been discarded for two main reasons: the first is the still high cost of reconfigurable devices (such as FPGAs) and the lack of availability of such resources at our lab; the second is the availability nowadays of cheap and powerful general-purpose systems able to reach almost the same performance as specialized systems. For this reason, we focused on models able to improve the performance on general-purpose systems. It is well known that in such systems the bottleneck is represented by the memory hierarchy, which delays the CPU execution. In particular, in the computer architecture community much effort has been devoted to studying, modeling and improving the performance of cache memories. For this reason, this work will present a comprehensive study of the locality, the obtainable performance and the possible improvements of a cache for image processing and multimedia applications, with particular focus on video processing applications.
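To make the role of locality concrete, the following toy sketch (our illustration; the cache geometry of 64 lines × 64 bytes is an arbitrary assumption, not the configuration studied in this thesis) simulates a direct-mapped cache and compares a row-major scan of a frame, which reuses each fetched line, with a column-major scan, whose large stride defeats the cache:

```python
def hit_rate(addresses, num_lines=64, line_size=64):
    """Simulate a tiny direct-mapped cache; return the fraction of hits."""
    tags = [None] * num_lines
    hits = 0
    for addr in addresses:
        line = addr // line_size      # which cache line the byte belongs to
        index = line % num_lines      # direct-mapped placement
        tag = line // num_lines
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag         # miss: the line is fetched and installed
    return hits / len(addresses)

W, H = 320, 240   # one 8-bit grayscale frame, stored row-major

# Row-major scan: consecutive addresses, each 64-byte line serves 64 accesses.
row_major = [y * W + x for y in range(H) for x in range(W)]

# Column-major scan: stride of W bytes, every access lands on a new line.
col_major = [y * W + x for x in range(W) for y in range(H)]

print(f"row-major hit rate:    {hit_rate(row_major):.3f}")
print(f"column-major hit rate: {hit_rate(col_major):.3f}")
```

With these parameters the row-major scan hits about 98% of the time, while the column-major scan misses on every access: exactly the kind of gap that locality analysis and prefetching for image processing workloads aim to exploit.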
In the general-purpose context, besides the performance analysis and improvement, novel algorithms and approaches for motion detection have been studied. The goal is to develop new techniques for moving object segmentation and for object and feature tracking that can result in a further improvement of both efficiency and efficacy. To this end, a complete system, called Sakbot (Statistical And Knowledge-Based Object Tracker), has been developed and deeply tested in many different contexts and applications, from traffic analysis to video-surveillance.
1.3 Video Analysis Requirements
Real-time video analysis applications are necessarily performed on-line, that is, with the images fed directly (live) from a camera. In this context we can have, basically, the two situations reported in Fig. 1.1. The setup sketched in Fig. 1.1(a) is the case in which the camera is directly connected to the computer by means of a frame grabber or another acquisition device. In this case the real-time constraints are due to the video standard used and to the speed of the acquisition device. In the second case (Fig. 1.1(b)) the data acquired by the camera are processed by a video server that has the task of, if necessary, compressing/decompressing the video data, performing some pre-processing and providing a user-friendly interface for the application. The video server then sends the video data through a web architecture to client computers, where they are visualized or further processed. With these premises, in the second situation the performance is also degraded by the web's bandwidth.
Figure 1.1: Local vs remote processing in a video analysis application: (a) local processing; (b) distributed processing.

We can summarize the factors that drive the requirements for video analysis applications into four classes:
1) data type: whether we have color images or not, and at which resolution and frame rate, is relevant information for the application we are going to study and develop. In the case of video analysis this implies a huge amount of data and a large required bandwidth. Moreover, in the case of distributed applications the data type will influence the co-dec functioning too;
2) hardware available: which hardware is available in the system is very important. Besides being of great relevance for the performance of the system (see the above considerations on the acquisition device), the hardware can be used to improve the efficiency of the system. Consider for instance the MPEG decoder/encoder boards that are currently spreading in home PCs. In conclusion, the requirements of the application can be relaxed by devolving some processing to specialized hardware;
3) local/distributed processing: as reported in Fig. 1.1, the video analysis task has different requirements depending on the type of processing;
4) type of the information to be extracted: as already stated, the level, the amount and the complexity of the information to be extracted by the application (i.e. the final aim of the application) are key factors in evaluating the required computational load.
1.4 Structure of the Thesis
This thesis has been divided into three main parts, in accordance with the steps in which this research has been conducted. The first part will describe the study of embedded, special-purpose systems as a solution to real-time video analysis. We will first describe the development of a Real Time Convolver (RTC) with a parallelized systolic architecture. The system has been improved by functionally partitioning it onto a multi-FPGA device. The performance and the limits of the proposal will be discussed.
Moreover, the FPGA solution is proposed for a UTC (Urban Traffic Control) system called VTTS (Vehicular Traffic Tracking System). In this case, the low-level module for daytime conditions is described in depth, proposing a multi-FPGA implementation and reporting its performance on our prototypal board.
The second part of the thesis will focus on the multimedia cache research. In this part, which has been the main part of this three-year research, a cache tuned to multimedia and image processing applications has been studied. An a-priori feasibility analysis by means of a locality study has enabled the comprehensive development of novel prefetching techniques able to improve the overall performance of the cache by up to 140%. The cache has been tested on a complete benchmark including both multimedia and image processing algorithms.
The last part is, indeed, the largest, since it includes two of the most profitable topics of our research: motion detection and the study of shadow detection algorithms. In the first chapter of this part the already-mentioned Sakbot system will be described in depth, with particular focus on the shadow detection algorithm. The second chapter will, instead, partly summarize the previous one to present a comprehensive empirical evaluation and comparison of the state of the art on moving shadow detection. A two-layer taxonomy will be introduced and more than 21 papers dealing with this topic will be classified. Four of them (including the one that we developed for Sakbot) have been implemented in software and compared by means of novel quantitative and qualitative metrics.
Lastly, preliminary results of high-speed video analysis, with the aim of computing the steering-wheel angle of a Formula 1 car, will be presented in the last chapter of this part. This topic is, indeed, very new and only preliminary results are available. Nonetheless, it seems very promising and will hopefully be a very relevant topic for future research.
Part I
Architectures and Models with
Embedded Systems
Chapter 2
Introduction
2.1 Preface
Dedicated hardware solutions and reconfigurable/user-dedicated CCMs (Custom Computing Machines) are the topic of intense worldwide research. In particular, great effort has been made to implement dedicated architectures for image processing algorithms [1][2][3]. This is due to the high computational load, the large amount of resources needed and (sometimes) the real-time constraints typical of these applications. Several different solutions have been proposed, for example dedicated VLSI chips (e.g. Plessey's PDSP 16488) or DSPs optimized for image processing (e.g. Texas Instruments' TMS320C80). In this research, an FPGA (Field Programmable Gate Array)-based solution is adopted (as in [1][2][4] and many others), since it seems the most promising choice in applications where the processing speed typical of dedicated solutions has to be matched with low-cost, flexible systems capable of performing several different tasks. Moreover, FPGAs are ISPDs (In-System Programmable Devices), since re-programmability is assured at run-time (or quasi run-time). Finally, FPGAs do not suffer from the parallel-processing scalability problems of DSPs. Recently, the rapid development of FPGAs has made possible the implementation of many real-time image processing algorithms in a single FPGA chip. For example, Greenbaum and Baxter in [2] enhance the work exposed in [1] to bring 2-D block motion estimation into a single Xilinx XC4013 FPGA with several off-chip memories. Furthermore, the rapid growth of FPGA complexity and device size (e.g. the Xilinx Virtex family [5]) and the contemporary decrease of device prices make FPGA solutions more and more attractive and suitable.
The work presented in this chapter is one of the results of a research activity aimed at implementing dedicated architectures for image processing. In particular, the research activity focuses on reconfigurable devices like Field Programmable Gate Arrays (FPGAs) [6] as the most promising ones in applications where the processing speed typical of dedicated solutions has to be matched with low-cost, flexible systems capable of performing several different tasks.
The first part of this chapter will present our proposal for the implementation of a Real-Time Convolver (RTC) on an FPGA board. In particular, we focused our attention on the 2-D convolution, where an input image of size M × N has to be convolved with a K × R kernel to obtain an output image in which each pixel depends on a K × R window of neighboring pixels in the input image [7]. Results have been presented in [8].
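As a software reference for the operation the RTC implements in hardware, the following sketch (ours; a plain 'valid'-region correlation with the kernel flip omitted, so it matches true convolution only for symmetric kernels) computes each output pixel from its K × R window of input neighbors:

```python
def convolve2d(image, kernel):
    """Direct 2-D windowed sum ('valid' region only, pure Python reference).

    image: M x N list of lists; kernel: K x R list of lists.
    Output pixel (i, j) is the weighted sum of the K x R window of
    neighboring input pixels, i.e. the per-pixel job of the convolver.
    """
    M, N = len(image), len(image[0])
    K, R = len(kernel), len(kernel[0])
    out = []
    for i in range(M - K + 1):
        row = []
        for j in range(N - R + 1):
            acc = 0
            for u in range(K):
                for v in range(R):
                    acc += kernel[u][v] * image[i + u][j + v]
            row.append(acc)
        out.append(row)
    return out

# 3x3 box kernel on a 4x4 ramp image.
img = [[c + 4 * r for c in range(4)] for r in range(4)]
box = [[1] * 3 for _ in range(3)]
print(convolve2d(img, box))   # -> [[45, 54], [81, 90]]
```

The four nested loops make the cost per output pixel explicit (K × R multiply-accumulates), which is precisely the work that the systolic architecture of the next sections parallelizes.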
The second part of this chapter will instead address the study of a dedicated hardware (embedded) solution to the problem of traffic management. Traffic flow monitoring based on computer vision aims to extract information about the traffic flow from traffic scenes acquired with cameras. This information is required to substantially support traffic management policies with regularly updated data, such as the number of vehicles passing on a road per time unit, vehicle turning rates at intersections, queue length measurements, and many others. Results are described in [9] and in [10].
2.2 Prototypal Board
Both the systems mentioned in the preface have been developed using the VHDL language. The final prototype has been simulated and implemented on a multi-FPGA board designed for rapid prototyping [11]. For this purpose, this section describes the main characteristics of this board, in order to refer to specific implementation issues in the following sections. In particular, the characteristics that limit the degrees of freedom in the mapping and routing of the prototype will be highlighted.
The prototypal board we used is the GigaOps G800 Spectrum board [11], sketched in Fig. 2.1.

Figure 2.1: Block diagram of the GigaOps G800 prototypal board

The main blocks of this board are:
• Computation modules, called XMODs, each containing a pair of Xilinx XC4010E FPGAs that perform the actual computation: in Fig. 2.1 four modules (MOD0 through MOD3) are shown. The two FPGAs in each module are called YPGA and XPGA (from the name of the bus they are connected to). Both these FPGAs have two memory ports: one connected only to a 2 MByte DRAM and one connected both to a 2 MByte DRAM and to a 128 KByte SRAM device. XPGA and YPGA communicate through a bus switch on the first memory port. This switch works on two virtual busses: a 16-bit data bus and a 10-bit address bus. It is important to stress that only the YPGAs are connected to the YBUS, i.e. to the input/output data bus
• A module called SCVIDMOD (S-VIDEO, COMPOSITE VIDEO MODULE), which decodes/encodes video signals (PAL or NTSC). This module interfaces to the YBUS for data input and output
• An input FPGA (here called VLPGA) connected to the VESA local bus of the PC hosting the board. The VLPGA is interfaced with the HBUS and the YBUS. It contains all the registers needed for correct board operation (e.g. the CLKMODE register, which sets the frequencies of the clocks distributed on the board)
• An output FPGA (here called VMC) connected to the SCVIDMOD. This is an additional FPGA, directly interfaced with the video output and the XBUS
• Three main busses that allow connections among the various blocks of the board. These busses are:
– YBUS, a 32-bit I/O bus connected with the VLPGA, the VMC and the YPGAs of the XMODs
– HBUS, a 16-bit bus used to configure and to load the FPGAs
– XBUS, a 64-bit bus normally used as four 16-bit data busses. Each of these busses is connected only to the XPGAs and to the VMC.
The main data path of our applications is the following: pixels generated by the video decoder are passed through the YBUS both to the VLPGA and to the YPGAs of the XMODs. These modules process the data and pass the results either to the VMC through the YBUS or to the XPGAs through the bus switches. In the latter case, the XPGAs can perform a further computation or simply pass the results to the VMC through the 64-bit XBUS. In both cases, the VMC outputs the results of its processing on the data coming from the XBUS or the YBUS.