Deepfake Detection: Analyzing Model Generalization Across Architectures, Datasets, and Pre-Training Paradigms
Journal article, Peer reviewed
Published version
Permanent link: https://hdl.handle.net/11250/3146455
Publication date: 2023
Abstract
As deepfake technology gains traction, reliable detection systems become crucial. Recent research has introduced various deep learning-based detection systems, yet they generalise poorly to data distributions that differ from their training data. Our study focuses on understanding this generalisation challenge by exploring different aspects such as deep learning model architectures, pre-training strategies and datasets. Through a comprehensive comparative analysis, we evaluate multiple supervised and self-supervised deep learning models for deepfake detection. Specifically, we evaluate eight supervised deep learning architectures and two transformer-based models pre-trained using self-supervised strategies (DINO, CLIP) on four deepfake detection benchmarks (FakeAVCeleb, CelebDF-V2, DFDC and FaceForensics++). Our analysis encompasses both intra-dataset and inter-dataset evaluations, with the objectives of identifying the top-performing models, determining which datasets equip trained models with the best generalisation capabilities, and assessing the influence of image augmentations on model performance. We also investigate the trade-off between model size, efficiency and performance. Our main goal is to provide insights into the effectiveness of different deep learning architectures (transformers, CNNs), training strategies (supervised, self-supervised) and deepfake detection benchmarks. Following an extensive empirical analysis, we conclude that transformer models surpass CNN models in deepfake detection. Furthermore, we show that the FaceForensics++ and DFDC datasets equip models with better generalisation capabilities than the FakeAVCeleb and CelebDF-V2 datasets. Our analysis also demonstrates that image augmentations can be beneficial for achieving improved performance, particularly for transformer models.
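To make the intra-/inter-dataset evaluation protocol concrete, the sketch below trains a binary deepfake detector on each benchmark in turn and scores it (frame-level AUC) on the test split of every benchmark, so the diagonal of the resulting grid gives intra-dataset performance and the off-diagonal cells measure cross-dataset generalisation. This is a minimal illustration, not the paper's actual pipeline: the `make_loader` helper returns random placeholder tensors instead of the real benchmark frames, and the small ResNet-18 backbone stands in for the eight supervised and two self-supervised models compared in the study.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18
from sklearn.metrics import roc_auc_score

DATASETS = ["FakeAVCeleb", "CelebDF-V2", "DFDC", "FaceForensics++"]


def make_loader(name: str, split: str, n: int = 64) -> DataLoader:
    """Hypothetical stand-in for a real benchmark loader.

    Returns random face crops and binary real/fake labels; in practice this
    would read pre-extracted face frames from the named benchmark split.
    """
    x = torch.randn(n, 3, 224, 224)
    y = torch.randint(0, 2, (n,)).float()
    return DataLoader(TensorDataset(x, y), batch_size=16)


def build_detector() -> nn.Module:
    # Any backbone with a single-logit real/fake head works here; the study
    # compares CNNs and transformers, a small CNN keeps the sketch light.
    model = resnet18(weights=None)
    model.fc = nn.Linear(model.fc.in_features, 1)
    return model


def train(model: nn.Module, loader: DataLoader, epochs: int = 1) -> None:
    opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
    loss_fn = nn.BCEWithLogitsLoss()
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            loss = loss_fn(model(x).squeeze(1), y)
            loss.backward()
            opt.step()


@torch.no_grad()
def evaluate_auc(model: nn.Module, loader: DataLoader) -> float:
    model.eval()
    scores, labels = [], []
    for x, y in loader:
        scores += torch.sigmoid(model(x).squeeze(1)).tolist()
        labels += y.tolist()
    return roc_auc_score(labels, scores)


# Train on each benchmark, test on all four: the diagonal is the
# intra-dataset score, off-diagonal cells measure generalisation.
for train_set in DATASETS:
    model = build_detector()
    train(model, make_loader(train_set, "train"))
    row = {t: round(evaluate_auc(model, make_loader(t, "test")), 3) for t in DATASETS}
    print(train_set, row)
```

Under this protocol, a training dataset that "equips models with better generalisation" is simply one whose row in the grid stays high off the diagonal; image augmentations would be applied inside the training loader before the same grid is recomputed.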