Depth map compression, streaming, and reconstruction for immersive and accessible 3D telepresence

Stephen Siemonsma

doi:10.25820/etd.008041

Dissertation

University of Iowa

Doctor of Philosophy (PhD), University of Iowa

Spring 2025

Files and links (1)

Dissertation (PDF, 39.28 MB)

Embargoed Access, Embargo ends: 06/26/2026

Abstract

The recent surge in remote work and digital communication has highlighted limitations in traditional video conferencing, creating demand for more immersive alternatives. Despite advances in 3D hardware, widespread adoption of 3D telepresence applications remains limited due to challenges in encoding, compressing, and transmitting 3D data over standard connections. This dissertation addresses these challenges through novel techniques for 3D range geometry compression, streaming, and reconstruction. We introduce HoloKinect, a user-friendly 3D video conferencing platform using off-the-shelf hardware that leverages multiwavelength depth (MWD) encoding and standard video codecs for efficient transmission. The limitations of MWD motivated the development of N-DEPTH, a neural depth-to-RGB encoding scheme optimized through a differentiable JPEG approximation layer. N-DEPTH demonstrates superior compression resilience and lower reconstruction error by learning an encoding strategy robust to compression artifacts. Addressing the computational cost of neural methods, we introduce GraDE (Gray Depth Encoding), a computationally efficient algorithm inspired by N-DEPTH, MWD, and Gray codes. GraDE achieves competitive or superior performance to previous approaches, particularly for video streaming, offering a lightweight alternative for resource-constrained devices. Finally, our Collaborative Spatial Streaming platform demonstrates a multi-device application using GraDE, enabling dynamic, spatially coherent 3D capture and streaming from commodity mobile devices in immersive AR/VR environments. Overall, this research advances the state of the art in 3D telepresence by providing efficient, accessible, and robust solutions for depth compression and reconstruction.

Computer Science

3D Compression

3D Streaming

3D Telepresence

Depth Map Compression

Neural Compression

Virtual Reality (VR)

Details

Title: Depth map compression, streaming, and reconstruction for immersive and accessible 3D telepresence
Creators: Stephen Siemonsma
Contributors: Tyler Bell (Advisor); Guadalupe Canahuate (Committee Member); Ibrahim Demir (Committee Member); Hans Johnson (Committee Member); Kishlay Jha (Committee Member)
Resource Type: Dissertation
Degree Awarded: Doctor of Philosophy (PhD), University of Iowa
Degree in: Electrical and Computer Engineering
Date degree season: Spring 2025
DOI: 10.25820/etd.008041
Publisher: University of Iowa
Number of pages: xxvi, 160 pages
Comment: This thesis has been optimized for improved web viewing. If you require the original version, contact the University Archives at the University of Iowa: https://www.lib.uiowa.edu/sc/contact/
Language: English
Date submitted: 04/29/2025
Description illustrations: illustrations (some color)
Description bibliographic: Includes bibliographical references (pages 149-160).
Public Abstract (ETD): In our increasingly digital world, video calls have become essential for work, education, and socializing. Yet traditional video conferencing lacks the natural depth and presence of face-to-face interaction. This dissertation develops innovative solutions to make three-dimensional (3D) video conferencing more accessible and practical for everyday use. This research explores 3D telepresence, which involves transmitting realistic 3D imagery of individuals to remote locations in order to make digital communication feel more immersive. However, sending detailed 3D video data requires a lot of internet bandwidth, and specialized equipment has often been expensive. This dissertation introduces new methods to make 3D telepresence more efficient and accessible. First, we developed HoloKinect, a system using affordable, common devices (like the Microsoft Kinect sensor and a special 3D display) to enable live 3D video calls over standard internet connections. Second, we created new computer algorithms, N-DEPTH and GraDE, to cleverly compress the 3D depth information into regular video formats. These methods significantly reduce the internet speeds needed to maintain good visual quality. N-DEPTH uses artificial intelligence, while GraDE is a faster, non-AI version suitable for less powerful devices. Finally, we built a platform allowing multiple smartphones to capture a scene in 3D together and stream it into virtual reality, demonstrating how this technology can work flexibly with everyday mobile devices. This work demonstrates that realistic 3D communication is achievable with today’s consumer technology, paving the way for more natural and engaging digital interactions.
Academic Unit: Electrical and Computer Engineering
Record Identifier: 9984830920302771
