Publications:Text Driven Face-Video Synthesis Using GMM and Spatial Correlation

Title Text Driven Face-Video Synthesis Using GMM and Spatial Correlation
Author Dereje Teferi, Maycel I. Faraj and Josef Bigun
Year 2007
PublicationType Conference Paper
Journal
HostPublication Image Analysis: 15th Scandinavian Conference, SCIA 2007, Aalborg, Denmark, June 10-14, 2007; Proceedings
Conference 15th Scandinavian Conference on Image Analysis, Aalborg, Denmark, June 10-14, 2007
DOI
Diva url http://hh.diva-portal.org/smash/record.jsf?searchId=1&pid=diva2:239348
Abstract Liveness detection is increasingly planned for incorporation into biometric systems to reduce the risk of spoofing and impersonation. Techniques used include detection of head motion while posing/speaking, iris size under varying illumination, fingerprint sweat, text-prompted speech, and speech-to-lip motion synchronization. In this paper, we propose to build a biometric signal to test the attack resilience of biometric systems by creating text-driven video synthesis of faces. We synthesize new, realistic-looking video sequences from real image sequences representing utterances of digits. We determine the image sequences for each digit by using a GMM-based speech recognizer. Then, depending on the system prompt (a sequence of digits), our method regenerates a video signal to test the attack resilience of a biometric system that asks for random digit utterances to prevent playback of pre-recorded data representing both audio and images. The discontinuities in the new image sequence, created at the connection of each digit, are removed by using a frame prediction algorithm that makes use of the well-known block matching algorithm. Other uses of our results include web-based video communication for electronic commerce and frame interpolation for low frame rate video.
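
The frame prediction step mentioned in the abstract relies on block matching to bridge discontinuities where the per-digit clips are concatenated. The sketch below is a minimal illustration of that general idea, not the authors' implementation: it performs exhaustive SAD-based block matching between two grayscale frames (NumPy arrays) and predicts an in-between frame by averaging each block with its motion-compensated counterpart. The block size (16) and search range (8) are arbitrary illustrative values, and the function names are hypothetical.

 import numpy as np
 
 def block_matching_sad(prev: np.ndarray, curr: np.ndarray,
                        block: int = 16, search: int = 8) -> np.ndarray:
     """Exhaustive block matching: for each block of `curr`, find the
     displacement (dy, dx) into `prev` that minimizes the sum of
     absolute differences (SAD). Returns one motion vector per block."""
     h, w = curr.shape
     n_by, n_bx = h // block, w // block
     vectors = np.zeros((n_by, n_bx, 2), dtype=int)
     for by in range(n_by):
         for bx in range(n_bx):
             y0, x0 = by * block, bx * block
             ref = curr[y0:y0 + block, x0:x0 + block].astype(np.int32)
             best = (np.inf, 0, 0)
             for dy in range(-search, search + 1):
                 for dx in range(-search, search + 1):
                     y1, x1 = y0 + dy, x0 + dx
                     if y1 < 0 or x1 < 0 or y1 + block > h or x1 + block > w:
                         continue  # candidate block falls outside the frame
                     cand = prev[y1:y1 + block, x1:x1 + block].astype(np.int32)
                     sad = np.abs(ref - cand).sum()
                     if sad < best[0]:
                         best = (sad, dy, dx)
             vectors[by, bx] = best[1], best[2]
     return vectors
 
 def predict_midframe(prev: np.ndarray, curr: np.ndarray,
                      block: int = 16, search: int = 8) -> np.ndarray:
     """Predict an in-between frame: each block is the average of the
     current block and its motion-compensated match in the previous frame."""
     vectors = block_matching_sad(prev, curr, block, search)
     mid = curr.astype(np.float32).copy()
     for by in range(curr.shape[0] // block):
         for bx in range(curr.shape[1] // block):
             dy, dx = vectors[by, bx]
             y0, x0 = by * block, bx * block
             comp = prev[y0 + dy:y0 + dy + block,
                         x0 + dx:x0 + dx + block].astype(np.float32)
             mid[y0:y0 + block, x0:x0 + block] = (
                 comp + curr[y0:y0 + block, x0:x0 + block]) / 2.0
     return mid.astype(prev.dtype)

For example, given two consecutive grayscale frames a and b as equally sized uint8 arrays taken from either side of a digit boundary, predict_midframe(a, b) returns a frame that can be spliced between them to soften the visual jump.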