Ph.D. Dissertation Defense: Hui Su

Friday, September 26, 2014
2:00 p.m.
Kim 2211
Maria Hoo
301 405 3681
mch@umd.edu

ANNOUNCEMENT: Ph.D. Dissertation Defense
 
Name: Hui Su
 
Committee:
Professor Min Wu, Chair
Professor K. J. Ray Liu
Professor Gang Qu
Professor Carol Espy-Wilson
Professor Kari Kraus, Dean's Representative
 
Date/Time: Friday, September 26, 2014 at 2:00pm
 
Location: Kim 2211

Title: Temporal and Spatial Alignment of Multimedia Signals

Abstract: With the increasing availability of cameras and other mobile devices, digital images and videos are becoming ubiquitous. Research efforts have been made to develop technologies that use multiple pieces of multimedia information simultaneously. This dissertation focuses on the temporal and spatial alignment of multimedia signals, a fundamental problem that must be solved to enable applications that deal with multiple pieces of multimedia data.

The first part of the dissertation addresses the synchronization of video/audio signals. We propose a new modality for audio and video synchronization based on the electric network frequency (ENF) signal naturally embedded in multimedia recordings. Synchronization of audio and video is achieved by aligning the ENF signals. The proposed method differs substantially from existing approaches to the audio/video synchronization problem and has strong potential to address difficult scenarios that are otherwise intractable.
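
As a rough illustration of what "aligning the ENF signals" can mean, the sketch below estimates the offset between two ENF sequences by locating the peak of their cross-correlation. This is an assumption made for illustration: the announcement does not state which alignment procedure the dissertation uses, and the function name `estimate_lag` and its inputs are hypothetical.

```python
import numpy as np

def estimate_lag(enf_a, enf_b):
    """Return the offset (in samples) into enf_a where enf_b matches best.

    Illustrative only: plain cross-correlation of mean-removed sequences,
    not necessarily the alignment method used in the dissertation.
    """
    a = np.asarray(enf_a, dtype=float)
    b = np.asarray(enf_b, dtype=float)
    # Remove the nominal frequency (e.g. ~60 Hz) so only fluctuations remain.
    a = a - a.mean()
    b = b - b.mean()
    corr = np.correlate(a, b, mode="full")
    # In 'full' mode, output index i corresponds to lag i - (len(b) - 1).
    return int(np.argmax(corr)) - (len(b) - 1)
```

A positive return value means the second sequence starts that many ENF samples after the start of the first; dividing by the ENF estimation rate converts it to seconds.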

Estimation of the ENF signal from video is a challenging task and has not been well addressed previously. To overcome the insufficient sampling rate of video, we propose to exploit the rolling shutter mechanism commonly adopted in CMOS camera sensors. Several techniques are designed to mitigate the distortions that motion and brightness changes in videos introduce into ENF estimation.

We also address several challenges that are unique to the synchronization of digitized analog audio recordings. Speed offset often occurs in digitized analog audio recordings due to inconsistency in the tape rolling speed. We show that the ENF signal captured by the original analog audio recording can be retained in the digitized version. The ENF signal is approximated as a single-tone signal and used as a reference to detect and correct speed offsets automatically.
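
To make the single-tone idea concrete, the sketch below estimates a playback speed factor by comparing the strongest spectral peak near the nominal ENF with the nominal value itself. It is a minimal sketch under stated assumptions, not the dissertation's procedure: the function name `estimate_speed_factor`, the 60 Hz default, and the search band width are all hypothetical choices.

```python
import numpy as np

def estimate_speed_factor(signal, sample_rate, nominal_hz=60.0, search_hz=5.0):
    """Estimate playback speed as (observed ENF tone) / (nominal ENF).

    Illustrative only: treats the ENF as a single tone and picks the
    largest FFT magnitude within nominal_hz +/- search_hz.
    """
    x = np.asarray(signal, dtype=float)
    # A Hann window reduces spectral leakage before the FFT.
    spectrum = np.abs(np.fft.rfft(x * np.hanning(len(x))))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sample_rate)
    band = (freqs >= nominal_hz - search_hz) & (freqs <= nominal_hz + search_hz)
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return peak_hz / nominal_hz
```

A factor above 1 suggests the digitized recording plays faster than the original; a correction step could then resample the audio by that factor.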

The last part of the dissertation examines the quality assessment of local image features for efficient and robust spatial alignment. We propose a scheme to evaluate the quality of SIFT features in terms of their robustness and discriminability. A quality score is assigned to every SIFT feature based on its contrast value, scale, and descriptor, using a quality metric kernel that is obtained in a one-time training phase. Feature selection is carried out by retaining features with high quality scores. The proposed approach is also applicable to other local image features such as the Speeded Up Robust Features (SURF).


Audience: Graduate  Faculty 
