[Dataset Survey] Audio-Visual Dataset Survey
LRS3
Introduced by Afouras et al., this dataset consists exclusively of real videos. It contains 5,594 videos spanning over 400 hours of TED and TEDx talks in English. The videos are processed so that each frame contains faces and the audio and visual streams are in sync.
https://mmai.io/datasets/lip_reading/
FakeAVCeleb - Google Form required
The FakeAVCeleb dataset is a deepfake detection dataset consisting of 20,000 video clips in total. It comprises 500 real videos sampled from VoxCeleb2 and 19,500 deepfake samples generated by applying different manipulation methods to that set of real videos. The dataset covers the following manipulation categories; the deepfake algorithms used in each category are given in parentheses.
• RVFA: Real Visuals - Fake Audio (SV2TTS)
• FVRA-FS: Fake Visuals - Real Audio (FaceSwap)
• FVFA-FS: Fake Visuals - Fake Audio (SV2TTS + FaceSwap)
• FVFA-GAN: Fake Visuals - Fake Audio (SV2TTS + FaceSwapGAN)
• FVRA-GAN: Fake Visuals - Real Audio (FaceSwapGAN)
• FVRA-WL: Fake Visuals - Real Audio (Wav2Lip)
• FVFA-WL: Fake Visuals - Fake Audio (SV2TTS + Wav2Lip)
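As a minimal sketch, the seven categories above can be encoded as a lookup table that yields per-modality real/fake labels. The names `Manipulation`, `FAKEAVCELEB_CATEGORIES`, and `modality_labels` are hypothetical and exist only for this illustration. The category codes are the shorthand used in the list above and are not assumed to match the directory or metadata names in the released dataset.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Manipulation:
    """One FakeAVCeleb manipulation category, as listed in the text above."""
    fake_visuals: bool
    fake_audio: bool
    methods: Tuple[str, ...]


# Codes and methods copied from the list above (illustrative shorthand only;
# the released dataset may use different folder or label names).
FAKEAVCELEB_CATEGORIES: Dict[str, Manipulation] = {
    "RVFA": Manipulation(False, True, ("SV2TTS",)),
    "FVRA-FS": Manipulation(True, False, ("FaceSwap",)),
    "FVFA-FS": Manipulation(True, True, ("SV2TTS", "FaceSwap")),
    "FVFA-GAN": Manipulation(True, True, ("SV2TTS", "FaceSwapGAN")),
    "FVRA-GAN": Manipulation(True, False, ("FaceSwapGAN",)),
    "FVRA-WL": Manipulation(True, False, ("Wav2Lip",)),
    "FVFA-WL": Manipulation(True, True, ("SV2TTS", "Wav2Lip")),
}


def modality_labels(code: str) -> Dict[str, int]:
    """Return binary labels per modality (1 = fake, 0 = real) for a category code."""
    m = FAKEAVCELEB_CATEGORIES[code]
    return {"visual": int(m.fake_visuals), "audio": int(m.fake_audio)}


if __name__ == "__main__":
    for code, m in FAKEAVCELEB_CATEGORIES.items():
        print(code, modality_labels(code), "+".join(m.methods))
```

Keeping the visual and audio labels separate makes it easy to filter for, say, only the clips with manipulated audio. In this list, those are exactly the categories that use SV2TTS.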
https://github.com/DASH-Lab/FakeAVCeleb
KoDF - Google Form required
KoDF is a large-scale dataset comprising real and synthetic videos of 400+ subjects speaking Korean. It consists of 62K+ real videos and 175K+ fake videos synthesized using the following six algorithms: FaceSwap, DeepFaceLab, FaceSwapGAN, FOMM, ATFHP, and Wav2Lip. We use a subset of this dataset, following prior work, to evaluate the cross-dataset generalization performance of our model.
https://deepbrainai-research.github.io/kodf/
KoDF paper: Patrick Kwon*, Jaeseong You*, Gyuhyeon Nam, Sungwoo Park, Gyeongsu Chae (* equal contribution). arXiv:2103.10094
DF-TIMIT
The Deepfake TIMIT dataset comprises deepfake videos manipulated using FaceSwapGAN. The real videos used for manipulation were sourced by sampling similar-looking identities from the VidTIMIT dataset. We use the higher-quality (HQ) version, which consists of 320 videos, to evaluate cross-dataset generalization performance.
https://zenodo.org/records/4068245
DFDC
The DeepFake Detection Challenge (DFDC) dataset is, besides FakeAVCeleb, another deepfake dataset that contains samples with fake audio. It consists of over 100K video clips in total, generated using deepfake algorithms such as MM/NN Face Swap, NTH, FaceSwapGAN, StyleGAN, and TTS Skins. We use a subset of this dataset consisting of 3,215 videos, as used in [21, 22], to evaluate the model’s cross-dataset generalization performance.
https://ai.meta.com/datasets/dfdc/
* The content above is excerpted from the paper "AVFF: Audio-Visual Feature Fusion for Video Deepfake Detection", CVPR 2024.
https://arxiv.org/abs/2406.02951