Thesis etd-03202025-001521

Thesis type

Master's thesis

Author

DAVOLI, DAVIDE

URN

etd-03202025-001521

Title

Avatar 3D: Dalla Cattura Multi-Camera alla Ricostruzione di Avatar 3D Animabili.

Title in English

Full-Body Avatars: From Multi-View Capture to Animatable 3D Avatar Reconstruction.

Academic unit

Department of Engineering

Degree programme

Computer Engineering

Committee

Committee member	Role
VEZZANI ROBERTO	Primary supervisor
GARATTONI LORENZO	Co-supervisor

Keywords

3D Avatar
3D Computer Vision
Gaussian Splatting
Human Body Modeling
3D Reconstruction

Session start date

2025-04-14

Availability

3-year embargo

Release date

2028-04-14

Summary

This thesis presents a framework for generating photorealistic, animatable 3D avatars from multi-camera video recordings of actors. The process begins with the design and installation of a multi-camera system of eight synchronized cameras, which capture the actors from different angles as they perform various actions.

The footage is then used to obtain the subject's pose and shape at each time step, producing an approximate body geometry through a 3D parametric model. The resulting geometry then guides the reconstruction of a photorealistic, animatable 3D representation of the subject. This is done with 3D Gaussian Splatting, in which each Gaussian is tied to the underlying parametric model. Specifically, each Gaussian is attached to a triangle of the body mesh and follows the motion and deformation of that triangle throughout the input sequence. This formulation lets the 3D Gaussians capture the person's detailed appearance while preserving the animation capabilities of the underlying parametric model. The Gaussian representation is then rendered and optimized to match the input frames, ensuring accurate alignment with the captured images.
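The attachment of a Gaussian to a mesh triangle can be illustrated with a minimal sketch. It assumes one common way of rigging a point to a triangle: a local frame built from the centroid, one edge, and the face normal, with offsets scaled by the mean edge length. The record does not state the exact formulation used in the thesis, so the function names (`triangle_frame`, `gaussian_world_center`) and the choice of scale are hypothetical.

```python
import math

# Small 3D vector helpers (tuples of floats), kept explicit to stay dependency-free.
def add(a, b): return tuple(x + y for x, y in zip(a, b))
def sub(a, b): return tuple(x - y for x, y in zip(a, b))
def scale(a, s): return tuple(x * s for x in a)
def dot(a, b): return sum(x * y for x, y in zip(a, b))
def norm(a): return math.sqrt(dot(a, a))
def normalize(a): return scale(a, 1.0 / norm(a))
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def triangle_frame(v0, v1, v2):
    """Local frame of a triangle: origin (centroid), orthonormal axes, size.

    Assumption: the size is the mean edge length; other choices are possible.
    """
    origin = scale(add(add(v0, v1), v2), 1.0 / 3.0)
    edge = sub(v1, v0)
    x_axis = normalize(edge)
    normal = normalize(cross(edge, sub(v2, v0)))
    y_axis = cross(normal, x_axis)  # completes a right-handed frame
    size = (norm(edge) + norm(sub(v2, v0)) + norm(sub(v2, v1))) / 3.0
    return origin, (x_axis, y_axis, normal), size

def gaussian_world_center(local_mu, v0, v1, v2):
    """Map a Gaussian center stored in triangle-local coordinates to world space."""
    origin, (ax, ay, az), size = triangle_frame(v0, v1, v2)
    offset = add(add(scale(ax, local_mu[0]), scale(ay, local_mu[1])),
                 scale(az, local_mu[2]))
    return add(origin, scale(offset, size))

# The same local offset evaluated on a rest-pose triangle and on a moved copy:
local_mu = (0.0, 0.0, 0.5)  # half a "triangle size" above the face
rest = gaussian_world_center(local_mu, (0, 0, 0), (1, 0, 0), (0, 1, 0))
posed = gaussian_world_center(local_mu, (0, 0, 0), (0, 1, 0), (-1, 0, 0))
print(rest, posed)
```

Because the Gaussian stores only its local offset, moving or deforming the triangle moves the Gaussian with it. A full implementation would transform the Gaussian's rotation and scale in the same way.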

Moreover, because this formulation is differentiable with respect to the parametric body model, the body parameters can be updated using the rendered 3D Gaussians as a supervision signal. Building on this capability, the thesis proposes a method for estimating the parametric body model through a photometric term obtained from the Gaussians. For this to work, the Gaussians must be well aligned with the mesh face they are attached to, so that the photometric signal is backpropagated accurately to the underlying geometry. Compared with previous estimation techniques, which rely mainly on sparse signals for optimization, our method achieves better alignment of the meshes with the input frames.

The thesis then uses these photometrically estimated meshes to reconstruct the 3D avatar from scratch, demonstrating significant improvements in the final reconstruction. This improvement is attributed to the greater precision of the estimation process.

Once reconstructed, the 3D avatar can be rendered from arbitrary viewpoints and animated by manipulating the parameters of the underlying 3D body model.

Abstract

This thesis introduces a novel framework for generating photorealistic and animatable 3D avatars from multi-camera video recordings of actors. The process begins with the design and setup of a multi-camera system of eight synchronized cameras, which captures actors performing various actions from multiple angles.

The captured footage is then used to track the subject's pose and shape, generating a coarse geometry of the body through a 3D parametric body model. Finally, the tracked geometry is used to drive the reconstruction of a photorealistic and animatable 3D representation of the subject. This is accomplished through 3D Gaussian Splatting, where each Gaussian is rigged to the underlying body model. Specifically, each Gaussian is attached to a triangle of the body mesh and follows the motion and deformation of its parent triangle across the input sequence. This formulation allows the 3D Gaussians to capture the detailed appearance of the person while maintaining the full animation capabilities of the underlying body model. The Gaussian representation is then optimized to match the input frames, ensuring accurate alignment with the captured data.

Moreover, because the rigging formulation is differentiable with respect to the parametric body model, the body parameters can be updated using the rendered 3D Gaussians as a supervision signal. Leveraging this capability, the thesis proposes a method for tracking the parametric body model driven by the photometric term obtained from the Gaussians. To achieve this, the Gaussians must be well aligned with their parent face, ensuring accurate backpropagation of the photometric signal to the underlying geometry. Compared to previous tracking techniques, which primarily use sparse signals for optimization, our method demonstrates superior alignment of the meshes with the input frames.

The thesis then leverages these photometrically tracked meshes to reconstruct the 3D avatar from scratch, demonstrating significant improvements in the final reconstruction. This enhancement is attributed to the increased precision of the tracking process. Once reconstructed, the 3D avatar can be rendered from arbitrary viewpoints and animated by manipulating the parameters of the underlying 3D body model.
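The idea of tracking a body parameter from a photometric signal can be illustrated with a toy example. The sketch below is not the thesis pipeline: it replaces the splatting renderer with a single 1D Gaussian profile, uses one hypothetical scalar parameter `theta` in place of the body model parameters, and computes the gradient by hand instead of using automatic differentiation. All names, the fixed `offset`, and the step size are assumptions chosen for illustration.

```python
import math

def render(theta, offset=2.0, sigma=4.0, width=64):
    """Render a 1D 'image': one Gaussian whose center follows the parameter theta.

    theta stands in for a body parameter; offset is the Gaussian's fixed
    attachment relative to the geometry that theta moves.
    """
    center = theta + offset
    return [math.exp(-((x - center) ** 2) / (2.0 * sigma ** 2)) for x in range(width)]

def photometric_loss_and_grad(theta, target, offset=2.0, sigma=4.0):
    """Sum of squared pixel differences and its analytic derivative w.r.t. theta."""
    center = theta + offset
    image = render(theta, offset, sigma, len(target))
    loss = 0.0
    grad = 0.0
    for x, (pred, ref) in enumerate(zip(image, target)):
        residual = pred - ref
        loss += residual ** 2
        # d(pred)/d(theta) = pred * (x - center) / sigma^2
        grad += 2.0 * residual * pred * (x - center) / sigma ** 2
    return loss, grad

def fit_theta(target, theta_init, lr=1.0, steps=100):
    """Plain gradient descent on the photometric loss."""
    theta = theta_init
    for _ in range(steps):
        _, grad = photometric_loss_and_grad(theta, target)
        theta -= lr * grad
    return theta

# The "captured frame" is rendered at the true parameter; tracking starts from
# a rough initial estimate and refines it using only pixel differences.
target = render(30.0)
theta = fit_theta(target, theta_init=26.0)
print(theta)
```

The toy also shows why alignment matters: the gradient is only informative where the rendered and target profiles overlap, so a poor initial estimate or a Gaussian that has drifted away from its parent face yields a weak or misleading signal.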

Files

1 file is restricted at the author's request.