Fake videos pose a significant misinformation threat. While existing
forensic networks have demonstrated strong performance on image forgeries,
recent results reported on the Adobe VideoSham dataset show that these networks
fail to identify fake content in videos. In this paper, we propose a new
network that is able to detect and localize a wide variety of video forgeries
and manipulations. To overcome challenges that existing networks face when
analyzing videos, our network combines forensic embeddings that capture
traces left by manipulation, context embeddings that exploit the conditional
dependence of forensic traces on local scene content, and spatial attention
provided by a deep, transformer-based mechanism. We create several
new video forgery datasets and use these, along with publicly available data,
to experimentally evaluate our network's performance. These results show that
our proposed network is able to identify a diverse set of video forgeries,
including those not encountered during training. Furthermore, our results
reinforce recent findings that image forensic networks largely fail to identify
fake content in videos.
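The abstract does not give implementation details, so the following is only a minimal sketch of the fusion idea it describes: per-patch forensic and context embeddings are concatenated and passed through scaled dot-product self-attention to produce spatial attention over the frame. All names, dimensions, and the NumPy formulation are hypothetical illustrations, not the authors' actual VideoFACT architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attend(x):
    # Scaled dot-product self-attention; here queries, keys, and
    # values are all the fused embeddings (no learned projections).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    return softmax(scores) @ x

# Hypothetical per-patch embeddings for a frame split into 4 patches.
rng = np.random.default_rng(0)
forensic = rng.normal(size=(4, 8))  # traces left by manipulation
context = rng.normal(size=(4, 8))   # local scene content

# Fuse the two embedding types, then attend spatially across patches.
fused = np.concatenate([forensic, context], axis=-1)  # shape (4, 16)
attended = self_attend(fused)
print(attended.shape)  # (4, 16)
```

In a real network the queries, keys, and values would come from learned projections inside stacked transformer layers, and the attended features would feed a detection/localization head; this sketch only shows where the two embedding streams meet.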
VideoFACT: Detecting Video Forgeries Using Attention, Scene Context, and Forensic Traces