Novel view synthesis (NVS) of urban scenes enables virtual, interactive exploration of cities, which can further be used for urban planning, navigation, digital tourism, etc. However, many current NVS methods require a large number of images from known views as input and are sensitive to intrinsic and extrinsic camera parameters. In this paper, we propose a new unified framework for NVS of urban scenes that requires fewer views, via the integration of scene priors and the joint optimization of camera parameters, under a geometric constraint, along with NeRF weights. The integration of scene priors makes full use of priors from neighboring reference views to reduce the number of required known views. The joint optimization corrects errors in camera parameters, which are usually derived from algorithms such as Structure-from-Motion (SfM), thereby further improving the quality of the generated novel views. Experiments show that our method achieves about 25.375 dB and 25.512 dB on average in terms of peak signal-to-noise ratio (PSNR) on synthetic and real data, respectively. It outperforms popular state-of-the-art methods (i.e., BungeeNeRF and MegaNeRF) by about 2–4 dB in PSNR. Notably, our method achieves results better than or competitive with those of the baseline method while requiring only one third of the known-view images that the baseline needs. The code and dataset are available at https://github.com/Dongber/PriNeRF.
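To make the central idea concrete, the sketch below illustrates one common way such a joint optimization can be set up: learnable per-view pose residuals are composed with initial SfM poses and updated by the same optimizer as the radiance-field weights through a photometric loss. This is a minimal illustration only, not the authors' released implementation; the toy MLP, the axis-angle pose parameterization, the learning rates, and all names (e.g., `TinyField`, `so3_exp`, `refined_pose`) are assumptions, and the paper's geometric constraint on the poses is noted but not implemented here.

```python
# Minimal sketch (NOT the PriNeRF implementation) of jointly optimizing
# NeRF weights and per-view camera extrinsics initialized from SfM.
# All module, function, and hyperparameter choices are illustrative.
import torch
import torch.nn as nn

class TinyField(nn.Module):
    """Toy radiance field: 3D point -> (RGB, density)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 4))

    def forward(self, x):
        out = self.mlp(x)
        return torch.sigmoid(out[..., :3]), torch.relu(out[..., 3])

def so3_exp(v):
    """Rodrigues' formula: axis-angle vector (3,) -> rotation matrix (3, 3)."""
    theta = v.norm().clamp(min=1e-8)
    k = v / theta
    zero = v.new_zeros(())
    K = torch.stack([torch.stack([zero, -k[2], k[1]]),
                     torch.stack([k[2], zero, -k[0]]),
                     torch.stack([-k[1], k[0], zero])])
    I = torch.eye(3, dtype=v.dtype, device=v.device)
    return I + theta.sin() * K + (1 - theta.cos()) * (K @ K)

def render_rays(field, origins, dirs, n_samples=32, near=2.0, far=6.0):
    """Uniform samples along each ray + alpha compositing (NeRF quadrature)."""
    t = torch.linspace(near, far, n_samples)
    pts = origins[:, None, :] + dirs[:, None, :] * t[None, :, None]
    rgb, sigma = field(pts)                           # (N,S,3), (N,S)
    alpha = 1 - torch.exp(-sigma * (far - near) / n_samples)
    trans = torch.cumprod(torch.cat([torch.ones_like(alpha[:, :1]),
                                     1 - alpha[:, :-1] + 1e-10], dim=1), dim=1)
    return ((alpha * trans)[..., None] * rgb).sum(dim=1)

# Learnable pose corrections on top of the SfM estimates, one per view.
n_views = 10
field = TinyField()
rot_delta = nn.Parameter(torch.zeros(n_views, 3))    # axis-angle residuals
trans_delta = nn.Parameter(torch.zeros(n_views, 3))  # translation residuals
opt = torch.optim.Adam([
    {"params": field.parameters(), "lr": 5e-4},
    {"params": [rot_delta, trans_delta], "lr": 1e-4},  # poses move slowly
])

def refined_pose(R_sfm, t_sfm, i):
    """Compose the learnable residual of view i with its initial SfM pose."""
    return so3_exp(rot_delta[i]) @ R_sfm, t_sfm + trans_delta[i]

def train_step(R_sfm, t_sfm, i, pixel_dirs_cam, pixel_rgb):
    # Gradients flow into both the field weights and the pose residuals.
    R, tvec = refined_pose(R_sfm, t_sfm, i)
    dirs = pixel_dirs_cam @ R.T                       # camera -> world
    pred = render_rays(field, tvec.expand_as(dirs), dirs)
    loss = ((pred - pixel_rgb) ** 2).mean()           # photometric term only;
    # the paper additionally applies a geometric constraint on the poses,
    # which is omitted in this sketch.
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

A design point this setup highlights: because pose residuals and network weights share one optimizer, noisy SfM extrinsics are corrected gradually as the radiance field converges, rather than being fixed up front.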