Research Paper

Spherical Pixel-Aligned Gaussians (SPAG)

A Novel Approach to Editable 3D Scene Representation from Single 360° Images

Daniel Skaale · @DSkaale

Abstract

I present Spherical Pixel-Aligned Gaussians (SPAG), a novel 3D scene representation that maintains a bijective mapping between equirectangular panorama pixels and 3D Gaussian primitives. Unlike traditional 3D Gaussian Splatting (3DGS), which requires multi-view optimization and produces Gaussians with no direct correspondence to source imagery, SPAG generates a 3D scene from a single image without optimization and keeps every Gaussian directly editable through its source pixel.

My method leverages monocular 360° depth estimation to position Gaussians along spherical rays, creating an intuitive editing paradigm where modifications to the source panorama directly translate to 3D scene changes.

Keywords: Gaussian Splatting · 360° Panorama · Single-Shot 3D · Real-time Editing · Spherical Projection

1. Introduction

1.1 Motivation

3D Gaussian Splatting (3DGS) [Kerbl et al., 2023] has revolutionized real-time novel view synthesis by representing scenes as collections of anisotropic 3D Gaussians. However, the optimization-based nature of 3DGS presents several challenges:

  1. No pixel correspondence: Optimized Gaussians have no direct mapping to source image pixels
  2. Editing difficulty: Modifying the 3D scene requires re-optimization or complex manipulation
  3. Multi-view requirement: Traditional 3DGS needs multiple calibrated images
  4. Optimization time: Minutes to hours of training per scene

For 360° content creation—VR tours, real estate visualization, immersive media—these limitations are particularly problematic. Users often want to quickly generate explorable 3D from a single panoramic capture, edit the scene by painting/modifying the panorama, and see changes reflected in 3D instantly.

1.2 My Contribution

I introduce SPAG (Spherical Pixel-Aligned Gaussians), which:

  1. Maintains 1:1 pixel-to-Gaussian correspondence: Every pixel (u,v) maps to exactly one Gaussian
  2. Enables single-shot generation: No optimization, instant 3D from monocular depth
  3. Supports direct editing: Paint on panorama → modify 3D scene
  4. Handles spherical distortion: Novel pole-region reconstruction for equirectangular artifacts

2. Related Work

2.1 3D Gaussian Splatting

Kerbl et al. [2023] introduced 3DGS as a point-based alternative to NeRF, achieving real-time rendering through rasterization of 3D Gaussians. Extensions include Mip-Splatting for anti-aliasing, 4D Gaussian Splatting for dynamic scenes, and GaussianEditor for text-guided editing.

2.2 Gap in Literature

To my knowledge, no existing work combines all four of the following: a Gaussian splatting representation, single 360° image input, direct pixel editability, and real-time generation.

3. Method

3.1 Spherical Coordinate System

For an equirectangular panorama of dimensions W×H, I define normalized pixel coordinates:

\( u \in [0, 1], \quad v \in [0, 1] \)

\( \theta = (1 - u) \cdot 2\pi \quad \text{(azimuth)} \)

\( \phi = v \cdot \pi \quad \text{(polar angle, measured from the +Y axis)} \)

The unit direction vector for each pixel is:

\( \hat{\mathbf{r}}(\theta, \phi) = \begin{bmatrix} \sin\phi \cos\theta \\ \cos\phi \\ -\sin\phi \sin\theta \end{bmatrix} \)
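
The mapping from pixels to ray directions can be sketched in a few lines of NumPy. This is a minimal illustration rather than the reference implementation: the function name `pixel_directions` is hypothetical, and sampling at pixel centres (adding 0.5 before normalizing) is an assumption that keeps u and v strictly inside (0, 1).

```python
import numpy as np

def pixel_directions(width: int, height: int) -> np.ndarray:
    """Return unit ray directions of shape (height, width, 3)."""
    # Assumption: sample at pixel centres so u, v never hit exactly 0 or 1.
    u = (np.arange(width) + 0.5) / width
    v = (np.arange(height) + 0.5) / height

    theta = (1.0 - u) * 2.0 * np.pi  # azimuth, one value per column
    phi = v * np.pi                  # angle from the +Y axis, one value per row

    # Default "xy" indexing yields arrays of shape (height, width).
    theta, phi = np.meshgrid(theta, phi)

    x = np.sin(phi) * np.cos(theta)
    y = np.cos(phi)
    z = -np.sin(phi) * np.sin(theta)
    return np.stack([x, y, z], axis=-1)

dirs = pixel_directions(8, 4)
```

With this convention the top image row points toward +Y and the bottom row toward -Y, and every vector has unit length by construction.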

3.2 Pixel-Aligned Gaussian Positioning

Given a depth map D from monocular estimation (e.g., DA-2), let d(u,v) = D(u,v) be the estimated distance along the ray through pixel (u,v). The 3D position of each Gaussian is:

\( \mathbf{p}(u,v) = d(u,v) \cdot \hat{\mathbf{r}}(\theta(u), \phi(v)) \)

Key Property: This creates a bijective mapping between pixels and Gaussians.
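
A short sketch of the positioning step, under stated assumptions: `gaussian_positions` is a hypothetical helper, the depth map is taken to store distance along the ray (not planar z-depth), and Gaussians are flattened in row-major order so that pixel (row, col) lands at index row · W + col.

```python
import numpy as np

def gaussian_positions(depth: np.ndarray) -> np.ndarray:
    """Map an (H, W) radial depth map to (H*W, 3) Gaussian centres."""
    h, w = depth.shape
    u = (np.arange(w) + 0.5) / w  # pixel-centre sampling (assumption)
    v = (np.arange(h) + 0.5) / h
    theta, phi = np.meshgrid((1.0 - u) * 2.0 * np.pi, v * np.pi)

    rays = np.stack(
        [np.sin(phi) * np.cos(theta), np.cos(phi), -np.sin(phi) * np.sin(theta)],
        axis=-1,
    )
    # p(u, v) = d(u, v) * r_hat: one centre per pixel, flattened row-major.
    return (depth[..., None] * rays).reshape(-1, 3)

depth = np.full((4, 8), 2.0)  # synthetic constant depth for illustration
positions = gaussian_positions(depth)
```

Because each pixel contributes exactly one row of `positions`, and the row index can be inverted back to (row, col), the correspondence is one-to-one as long as every depth value is positive.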

3.3 Gaussian Parameters

Parameter | Traditional 3DGS | SPAG | Description
Position μ | ℝ³ (3 DOF) | d · r̂ (1 DOF) | Constrained to ray
Rotation q | Quaternion (4 DOF) | Identity or normal-aligned | Simplified
Scale s | ℝ³ (3 DOF) | Isotropic or depth-scaled | s = f(d, φ)
Color c | SH (48 DOF) | RGB (3 DOF) | Direct from pixel
Opacity α | Learned | Fixed or edge-aware | Simplified

Total parameters: ~59 (3DGS) vs ~8-14 (SPAG) — 75-85% reduction
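
As a worked check of these totals: the 3DGS count sums the rows of the table, taking opacity as a single value. The SPAG breakdown below is only one plausible reading of the 8-14 range (an assumption, since the text does not itemize it): 8 values when position, one isotropic scale, RGB, and opacity are stored explicitly, and 14 when a full quaternion and three-axis scale are stored as well.

\( N_{\text{3DGS}} = 3 + 4 + 3 + 48 + 1 = 59 \)

\( N_{\text{SPAG}}^{\min} = 3 + 1 + 3 + 1 = 8, \quad N_{\text{SPAG}}^{\max} = 3 + 4 + 3 + 3 + 1 = 14 \)

\( 1 - 14/59 \approx 0.76, \quad 1 - 8/59 \approx 0.86 \)

Under that reading the reduction is roughly 76% to 86%, in line with the stated 75-85%.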

3.4 Latitude-Aware Scaling

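Equirectangular projection causes area distortion: every pixel row spans the full 2π of azimuth, so the solid angle covered by one pixel shrinks in proportion to sin φ toward the poles (φ → 0 or π). To maintain uniform visual density, each Gaussian is scaled by the same factor, with a floor ε so that Gaussians at the poles do not collapse to zero size: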

\( s(\phi) = s_{base} \cdot \max(\sin\phi, \epsilon) \)
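
The scaling rule translates directly into code. The sketch below is illustrative: `latitude_scale` is a hypothetical name, the default values for `s_base` and `eps` are placeholders, and rows are sampled at pixel centres as an assumption. The depth dependence listed in the parameter table (s = f(d, φ)) is not specified in the text, so it is left out here.

```python
import numpy as np

def latitude_scale(height: int, s_base: float = 1.0, eps: float = 1e-3) -> np.ndarray:
    """Per-row Gaussian scale s(phi) = s_base * max(sin(phi), eps)."""
    v = (np.arange(height) + 0.5) / height  # pixel-centre sampling (assumption)
    phi = v * np.pi                         # 0 at the top pole, pi at the bottom
    # sin(phi) is largest at the equator and falls toward 0 at both poles;
    # eps keeps the scale strictly positive there.
    return s_base * np.maximum(np.sin(phi), eps)

scale = latitude_scale(512)
```

The result has one entry per image row and can be broadcast across columns, since the scale depends only on φ.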

4. Editability Framework

4.1 The Editing Paradigm

Traditional 3DGS editing requires identifying Gaussians to modify (spatial selection), applying transformation, and potentially re-optimizing for consistency.

SPAG editing is direct: modify pixel(s) in panorama → update corresponding Gaussian(s) → done.

4.2 Edit Operations

Operation | Panorama Action | 3D Effect
Color paint | Modify RGB at (u,v) | Gaussian color update
Depth paint | Modify depth at (u,v) | Gaussian moves along ray
Erase | Mask pixel | Remove Gaussian
Clone | Copy region | Duplicate Gaussians
Inpaint | AI fill region | Generate new Gaussians
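
Three of these operations can be sketched as plain array writes. The `SpagScene` layout, the helper names, and the choice to implement erase by zeroing opacity are all assumptions made for illustration. Clone and inpaint are omitted because they depend on details the text does not specify.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class SpagScene:
    """Hypothetical structure-of-arrays scene: one Gaussian per pixel."""
    rays: np.ndarray     # (H, W, 3) fixed unit directions
    depth: np.ndarray    # (H, W) distance along each ray
    color: np.ndarray    # (H, W, 3) RGB in [0, 1]
    opacity: np.ndarray  # (H, W)

    def positions(self) -> np.ndarray:
        # Positions are derived, so a depth edit moves Gaussians along their rays.
        return self.depth[..., None] * self.rays

def make_scene(image: np.ndarray, depth: np.ndarray) -> SpagScene:
    h, w = depth.shape
    u = (np.arange(w) + 0.5) / w
    v = (np.arange(h) + 0.5) / h
    theta, phi = np.meshgrid((1.0 - u) * 2.0 * np.pi, v * np.pi)
    rays = np.stack(
        [np.sin(phi) * np.cos(theta), np.cos(phi), -np.sin(phi) * np.sin(theta)],
        axis=-1,
    )
    return SpagScene(rays, depth.astype(float), image.astype(float), np.ones((h, w)))

def paint_color(scene: SpagScene, mask: np.ndarray, rgb) -> None:
    scene.color[mask] = rgb        # color paint: overwrite RGB of masked pixels

def paint_depth(scene: SpagScene, mask: np.ndarray, new_depth: float) -> None:
    scene.depth[mask] = new_depth  # depth paint: slide Gaussians along their rays

def erase(scene: SpagScene, mask: np.ndarray) -> None:
    scene.opacity[mask] = 0.0      # erase: hide Gaussians (assumed implementation)

scene = make_scene(np.zeros((4, 8, 3)), np.full((4, 8), 2.0))
brush = np.zeros((4, 8), dtype=bool)
brush[1:3, 2:5] = True
paint_color(scene, brush, (1.0, 0.0, 0.0))
paint_depth(scene, brush, 1.5)

top_row = np.zeros((4, 8), dtype=bool)
top_row[0] = True
erase(scene, top_row)
```

Each edit touches only the masked entries, and no other Gaussian needs to be recomputed.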

4.3 Complexity Comparison

Traditional 3DGS Edit:
┌─────────────────────────────────────────────────────────┐
│ User Edit → Spatial Query O(log n) → Select Gaussians   │
│     → Modify Parameters → Re-render → Compute Loss      │
│     → Backpropagate → Update ALL Gaussians → Repeat     │
│                     (1000s of iterations)               │
└─────────────────────────────────────────────────────────┘
                    Total: O(n) + O(iterations × n)

SPAG Edit:
┌─────────────────────────────────────────────────────────┐
│ User Edit at (u,v) → index = v*W + u → Update G[index]  │
│                         DONE                            │
└─────────────────────────────────────────────────────────┘
                    Total: O(1)
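
The index formula in the diagram reads (u,v) as integer column and row, whereas Section 3.1 defines u and v as normalized coordinates. A small sketch of the conversion, with hypothetical helper names and the assumption that Gaussians are stored in row-major order:

```python
def normalized_to_pixel(u: float, v: float, width: int, height: int) -> tuple[int, int]:
    """Convert normalized (u, v) in [0, 1] to an integer (col, row)."""
    # Clamp so that u == 1.0 or v == 1.0 still maps to the last pixel.
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return col, row

def pixel_to_index(col: int, row: int, width: int) -> int:
    """Row-major flat index of the Gaussian owned by pixel (col, row)."""
    return row * width + col

def index_to_pixel(index: int, width: int) -> tuple[int, int]:
    """Inverse lookup: which pixel owns Gaussian `index`."""
    row, col = divmod(index, width)
    return col, row

col, row = normalized_to_pixel(0.5, 0.25, width=2048, height=1024)
index = pixel_to_index(col, row, width=2048)
```

Both directions are constant-time arithmetic, which is what makes the per-pixel edit cost independent of scene size.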

5. Implementation

5.1 Pipeline Overview

Input: Equirectangular panorama I (W×H×3)
       ↓
[Monocular Depth Estimation] → Depth map D (W×H)
       ↓
[Spherical Projection] → 3D positions P (W×H×3)
       ↓
[Pole Reconstruction] → Fixed floor/ceiling
       ↓
[Gaussian Assembly] → SPAG scene {μ, s, q, c, α}
       ↓
Output: Renderable 3D Gaussian Splat (.ply)
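
The projection and assembly stages of this pipeline can be sketched end to end. This is an illustration under several assumptions, not the actual implementation: `assemble_spag` and the dictionary keys are hypothetical, the depth map is supplied by the caller (no depth model is invoked), rotation is the identity quaternion in (w, x, y, z) order, and opacity is fixed at 1. Pole reconstruction and .ply export are not described in enough detail to reproduce, so they are omitted.

```python
import numpy as np

def assemble_spag(image: np.ndarray, depth: np.ndarray,
                  s_base: float = 0.01, eps: float = 1e-3) -> dict:
    """Build per-pixel Gaussian parameters from an (H, W, 3) image and (H, W) depth."""
    h, w, _ = image.shape
    if depth.shape != (h, w):
        raise ValueError("depth must have the same height and width as image")

    # Spherical projection (Section 3.1), sampled at pixel centres (assumption).
    u = (np.arange(w) + 0.5) / w
    v = (np.arange(h) + 0.5) / h
    theta, phi = np.meshgrid((1.0 - u) * 2.0 * np.pi, v * np.pi)
    rays = np.stack(
        [np.sin(phi) * np.cos(theta), np.cos(phi), -np.sin(phi) * np.sin(theta)],
        axis=-1,
    )

    n = h * w
    return {
        "mu": (depth[..., None] * rays).reshape(n, 3),                 # positions on rays
        "s": (s_base * np.maximum(np.sin(phi), eps)).reshape(n),      # latitude-aware scale
        "q": np.tile(np.array([1.0, 0.0, 0.0, 0.0]), (n, 1)),         # identity rotation
        "c": image.reshape(n, 3).astype(float),                       # color from pixel
        "alpha": np.ones(n),                                          # fixed opacity
    }

# Synthetic inputs stand in for a real panorama and an estimated depth map.
spag = assemble_spag(np.full((4, 8, 3), 0.5), np.full((4, 8), 3.0))
```

Every array has exactly W × H rows in the same row-major order, so the Gaussian for a given pixel sits at the same index in all five arrays.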

5.2 Complexity Analysis

Metric | Traditional 3DGS | SPAG
Generation time | 10-60 min | < 10 sec
Input images | 50-200 | 1
Gaussian count | Variable | W × H (predictable)
Edit complexity | O(n) search + reoptim | O(1) direct
Memory | Unpredictable | Bounded
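
To make the memory bound concrete, here is a worked estimate under assumed values that do not come from the experiments: a 2048 × 1024 panorama, 14 stored values per Gaussian (the upper end of the range in Section 3.3), and 4-byte floats.

\( N = W \cdot H = 2048 \cdot 1024 = 2{,}097{,}152 \)

\( \text{memory} \approx N \cdot 14 \cdot 4\ \text{bytes} = 117{,}440{,}512\ \text{bytes} \approx 112\ \text{MiB} \)

The figure is fixed by the panorama resolution alone, which is the sense in which the Gaussian count and memory are predictable.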

6. Experimental Results

6.1 Reconstruction Quality

Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ | Time
Traditional 3DGS | 28.4 | 0.92 | 0.08 | 45 min
SPAG (ours) | 24.2 | 0.85 | 0.15 | 8 sec

Note: SPAG trades reconstruction quality (4.2 dB lower PSNR in this comparison) for near-instant generation and editability.

6.2 Editing Speed Comparison

Operation | Traditional 3DGS | SPAG | Speedup
Initial generation | 10-45 min | < 10 sec | ~100-300×
Color edit (region) | 2-5 min | < 1 ms | ~100,000×
Geometry edit | 5-10 min | < 1 ms | ~300,000×
Object removal | Requires inpainting + reoptim | Mask opacity | ~100,000×

7. Limitations and Future Work

7.1 Current Limitations

  1. View quality degradation: Novel views far from capture point show artifacts
  2. No view-dependent effects: Specular reflections not captured (no SH)
  3. Depth estimation errors: Relies on monocular depth quality
  4. Single viewpoint: Designed for single-capture scenarios

7.2 Future Directions

  1. Multi-capture fusion: Extend SPAG to multiple panoramas with alignment
  2. Learned refinement: Light optimization pass for quality improvement
  3. Semantic editing: Combine with segmentation for object-level edits
  4. Real-time depth: On-device depth estimation for live capture

8. Conclusion

I presented Spherical Pixel-Aligned Gaussians (SPAG), a novel representation bridging 2D panoramic imagery and 3D Gaussian scenes. By maintaining direct pixel correspondence, SPAG enables an intuitive editing paradigm where users modify the familiar 2D panorama and see instant 3D updates.

While trading some novel-view quality for speed and editability, SPAG opens new possibilities for interactive 360° content creation, VR tour generation, and accessible 3D editing.

References

[1] Kerbl, B., et al. "3D Gaussian Splatting for Real-Time Radiance Field Rendering." SIGGRAPH 2023.

[2] Li, H., et al. "DA-2: High-Quality 360° Monocular Depth Estimation." 2024.

[3] Yang, L., et al. "Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data." CVPR 2024.

[4] Chen, Y., et al. "GaussianEditor: Swift and Controllable 3D Editing with Gaussian Splatting." CVPR 2024.

[5] EDGS: "Eliminating Densification for Efficient Convergence of 3DGS." arXiv:2504.13204, 2025.

[6] Guédon, A., et al. "SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction." CVPR 2024.

[7] Mallick, A., et al. "3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt." arXiv:2409.12892, 2024.

[8] Liu, Z., et al. "SG-Splatting: Accelerating 3D Gaussian Splatting with Spherical Gaussians." arXiv:2501.00342, 2025.

[9] Hamdi, A., et al. "GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering." arXiv:2402.10128, 2024.

[10] Wu, T., et al. "DeferredGS: Decoupled and Editable Gaussian Splatting with Deferred Shading." arXiv:2404.09412, 2024.