[Subjective Paper Summary] Collaborative Diffusion for Multi-Modal Face Generation and Editing

Collaborative Diffusion for Multi-Modal Face Generation and Editing

Diffusion models arise as a powerful generative tool recently. Despite the great progress, existing diffusion models mainly focus on uni-modal control, i.e., the diffusion process is driven by only one modality of condition. To further unleash the users' c

arxiv.org

Key excerpts only

In this work, we propose Collaborative Diffusion to exploit pre-trained uni-modal diffusion models (e.g., text-driven and mask-driven models) to achieve multi-modal conditioning without model re-training.

The key of our framework is the dynamic diffuser, which adaptively predicts the influence functions to enhance or suppress the contributions of the pre-trained models based on the spatial-temporal influences of the modalities.

A similar sentence emphasizing the 'dynamic diffuser' appears again.

The core of our framework is the dynamic diffuser, which determines the extent of contribution from each collaborator by predicting the spatial-temporal influence functions.

influence function that represents the desired level of contribution from each pre-trained diffusion model.
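
One way to read these excerpts as a formula, offered as a sketch rather than the paper's exact notation: the combined noise prediction at step t is a per-position weighted sum of the individual models' predictions. The symbols M (number of pre-trained models), c_m (the condition for model m), I_{m,t} (the influence function of model m at step t), and p (a spatial position) are names assumed here for illustration, and the constraint that the influences sum to one at each position is likewise an assumption.

```latex
\hat{\epsilon}_t(p) \;=\; \sum_{m=1}^{M} I_{m,t}(p)\,\epsilon_{\theta_m}(x_t, t, c_m)(p),
\qquad I_{m,t}(p) \ge 0,
\qquad \sum_{m=1}^{M} I_{m,t}(p) = 1 .
```

Under this reading, the dependence on t corresponds to the "temporal" part and the dependence on p to the "spatial" part of the spatial-temporal influence.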

4. Background on diffusion models

In diffusion models, each step of the reverse process requires predicting the noise ε. With multiple diffusion models collaborating, we need to carefully determine when, where, and how much each diffusion model contributes to the prediction ε.
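
As a rough illustration of "when, where, and how much", the NumPy sketch below blends the noise predictions of several models using per-pixel weights for a single reverse step. It is a minimal sketch under stated assumptions, not the authors' implementation: the names `combine_noise` and `influence_logits` are hypothetical, the softmax normalization across models is an assumption, and random arrays stand in for the outputs of real pre-trained models and of the dynamic diffuser.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def combine_noise(eps_preds, influence_logits):
    """Blend per-model noise predictions for ONE reverse step.

    eps_preds:        (M, C, H, W) noise predicted by each of M models.
    influence_logits: (M, 1, H, W) unnormalized per-pixel scores, one map
                      per model (hypothetical stand-in for what a dynamic
                      diffuser might output at this timestep).
    Returns the blended noise (C, H, W) and the normalized weights.
    """
    # "Where / how much": weights vary per pixel and sum to 1 across models.
    weights = softmax(influence_logits, axis=0)
    # Broadcast each weight map over the channel axis, then sum over models.
    eps_hat = (weights * eps_preds).sum(axis=0)
    return eps_hat, weights


# Placeholder data: 2 models (e.g. text-driven and mask-driven), 3x4x4 noise.
rng = np.random.default_rng(0)
M, C, H, W = 2, 3, 4, 4
eps_preds = rng.standard_normal((M, C, H, W))
influence_logits = rng.standard_normal((M, 1, H, W))

# "When": in a full sampler this call would be repeated at every timestep
# with freshly predicted logits, so the weights can change over time.
eps_hat, weights = combine_noise(eps_preds, influence_logits)
```

With equal logits everywhere, this reduces to a plain average of the models' predictions; a spatially varying map lets one model dominate in some regions and another elsewhere.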

5. My thoughts after reading the paper

The authors report that their model performs well on face generation both qualitatively and quantitatively, but the discussion of the evaluation and the ablation studies is quite thin.

The datasets are also limited to CelebAMask-HQ for masks and CelebA-Dialog for text.

The ablation study is said to address whether the influence function is really necessary, but it only reports that the experimental results were good and does not explain why.

There is also no discussion of what role the mask-conditioned diffusion branch plays in this model, or how important it is.


AI 연구하는 깨굴이
