Diffusion Models are Secretly Zero-Shot 3DGS Harmonizers

3D Gaussian Splatting (3DGS) has become a popular technique for various 3D computer vision tasks, including novel view synthesis, scene reconstruction, and dynamic scene rendering. However, the challenge of natural-looking object insertion, where the object's appearance seamlessly matches the scene, remains unsolved. In this work, we propose a method, dubbed D3DR, for inserting a 3DGS-parametrized object into a 3DGS scene while correcting its lighting, shadows, and other visual artifacts to ensure consistency. We reveal a hidden ability of diffusion models trained on large real-world datasets to implicitly understand correct scene lighting, and leverage it in our pipeline. After inserting the object, we optimize a diffusion-based objective inspired by the Delta Denoising Score (DDS) to adjust the object's 3D Gaussian parameters and correct its lighting. We introduce a novel diffusion personalization technique that preserves object geometry and texture across diverse lighting conditions, and use it to keep the identity of the inserted object consistent with the original. Finally, we demonstrate the effectiveness of the method by comparing it to existing approaches, achieving a 2.0 dB PSNR improvement in relighting quality.
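
For readers unfamiliar with DDS, the gradient of the original Delta Denoising Score (as introduced for image editing) can be written as below. This is only the standard DDS form given for orientation; the abstract describes the D3DR objective as DDS-inspired, so its exact formulation may differ. The reading of \theta as the inserted object's 3D Gaussian parameters is an assumption made here, and timestep weighting is omitted.

```latex
% Standard DDS gradient (not necessarily the exact D3DR objective).
% z        : latent of the image being optimized, a function of \theta
% \hat{z}  : latent of the fixed reference image
% y, \hat{y} : target and reference text prompts
% z_t, \hat{z}_t : both latents noised with the SAME noise \epsilon at the SAME timestep t
% \epsilon_\phi^\omega : noise prediction of the pretrained diffusion model
%                        with classifier-free guidance scale \omega
\nabla_\theta \mathcal{L}_{\mathrm{DDS}}
  = \mathbb{E}_{t,\epsilon}\!\left[
      \left(
        \epsilon_\phi^\omega(z_t, y, t)
        - \epsilon_\phi^\omega(\hat{z}_t, \hat{y}, t)
      \right)
      \frac{\partial z}{\partial \theta}
    \right]
```

Subtracting the reference branch cancels the noisy component of the score that both branches share, leaving an update direction that reflects the difference between the target and reference conditions.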

Dataset	Metric	D3DR (Ours)	Copy-Paste	LBM	TIP-Editor	R3DG	Instruct-GS2GS
Synthetic Dataset
Synthetic	PSNR_part ↑	11.966	6.519	10.075	6.960	8.598	6.892
Synthetic	PSNR_cropped ↑	18.039	13.032	16.271	12.502	14.454	13.360
Synthetic	SSIM_cropped ↑	0.640	0.582	0.638	0.439	0.449	0.526
Synthetic	CTIS ↑	0.646	0.642	0.643	0.619	0.639	0.644
Synthetic	DTIS ↑	0.529	0.529	0.526	0.507	0.527	0.529
Real-World Dataset
Real	CTIS ↑	0.643	0.638	0.638	0.625	0.623	0.641
Real	DTIS ↑	0.510	0.505	0.506	0.497	0.501	0.510
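
The PSNR rows above are region-restricted variants (the "part" and "cropped" suffixes). As a reminder of how a PSNR over a selected pixel region can be computed, here is a minimal sketch. The function name `masked_psnr`, the flat-list pixel layout, the boolean mask convention, and the [0, 1] value range are all assumptions made for illustration; this is not the paper's evaluation code, and the exact regions behind PSNR_part and PSNR_cropped are not specified here.

```python
import math
from typing import Sequence


def masked_psnr(
    pred: Sequence[float],
    target: Sequence[float],
    mask: Sequence[bool],
    max_val: float = 1.0,
) -> float:
    """Return PSNR in dB computed only over pixels where mask is True.

    Hypothetical helper: images are flattened to 1D sequences of
    intensities in [0, max_val].
    """
    if not (len(pred) == len(target) == len(mask)):
        raise ValueError("pred, target and mask must have the same length")

    # Squared errors restricted to the selected region.
    sq_errors = [(p - t) ** 2 for p, t, m in zip(pred, target, mask) if m]
    if not sq_errors:
        raise ValueError("mask selects no pixels")

    mse = sum(sq_errors) / len(sq_errors)
    if mse == 0.0:
        return math.inf  # identical region: PSNR is unbounded

    # PSNR = 10 * log10(MAX^2 / MSE)
    return 10.0 * math.log10(max_val ** 2 / mse)


# Usage: only the first two pixels are inside the region of interest.
pred = [0.5, 0.5, 0.9, 0.1]
target = [0.4, 0.6, 0.0, 1.0]
mask = [True, True, False, False]
score = masked_psnr(pred, target, mask)  # about 20 dB (MSE of 0.01 in the region)
```

Because PSNR is logarithmic in the mean squared error, a gain of 2 dB corresponds to roughly 37% lower MSE over the evaluated region.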

Metric	D3DR (Ours)	Copy-Paste	LBM	TIP-Editor	R3DG	Instruct-GS2GS
Training Time (min) ↓	40	0	24	140	185	37
Storage (GB) ↓	0.076	0.076	0.076	0.097	0.955	0.076
N_Gaussians (×10⁶) ↓	0.330	0.330	0.330	1.870	1.970	0.330

Metric	D3DR (Ours)	Copy-Paste	LBM	TIP-Editor	R3DG	Instruct-GS2GS
PSNR_shadow ↑	19.993	19.023	19.544	11.899	14.899	17.923
PSNR_background ↑	23.383	24.169	23.533	16.786	15.232	20.238
SSIM_shadow ↑	0.890	0.924	0.894	0.768	0.738	0.802
SSIM_background ↑	0.910	0.948	0.902	0.815	0.679	0.780



Overview of the task: Our method inserts a 3DGS object at a specified location in a 3DGS scene and then adjusts the object's appearance to match the scene's lighting. The result is a new 3DGS scene that contains both the input scene and the object, rendered with realistic lighting.

Qualitative Results

Comparisons

Quantitative Results

Table 1 — Comparison with Baselines

Table 2 — Efficiency Comparison

Table 3 — Shadow & Background Metrics (Synthetic Dataset)
