Is this a trick that can increase PSNR a lot? #37

Open
Terry10086 opened this issue Dec 12, 2024 · 0 comments

Thank you for your impressive work and the valuable contributions you have made!

I noticed that the way you calculate ray directions differs slightly from NeRF in the following lines:
directions = F.pad(
    torch.stack(
        [
            (x - self.K[0, 2] + 0.5) / self.K[0, 0],
            (y - self.K[1, 2] + 0.5) / self.K[1, 1] * self.sign_z,
        ],
        dim=-1,
    ),
    (0, 1),
    value=self.sign_z,
)  # [H,W,3] torch.Size([800, 800, 3])
Here, adding 0.5 to both x and y implies that the ray is cast from the center of the pixel, which I find quite reasonable. Furthermore, I believe this approach has additional advantages when dealing with datasets of varying resolutions, such as 800×800 and 100×100, as it reduces ambiguity in ray direction calculations.
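To make the effect of the offset concrete, here is a minimal pure-Python sketch of the same per-pixel expression. The function name `ray_direction`, the 4×4 image, the intrinsics, and `sign_z=-1.0` are all assumptions chosen for illustration, not values taken from the repository.

```python
def ray_direction(x, y, fx, fy, cx, cy, offset=0.5, sign_z=-1.0):
    """Camera-space direction for pixel (x, y), mirroring the quoted expression.

    offset=0.5 casts the ray through the pixel center, offset=0.0 through its corner.
    sign_z=-1.0 is an assumed convention (camera looks down -z), not taken from the repo.
    """
    dx = (x - cx + offset) / fx
    dy = (y - cy + offset) / fy * sign_z
    return (dx, dy, sign_z)


# Hypothetical 4x4 image with the principal point at the image center.
W, fx, fy, cx, cy = 4, 2.0, 2.0, 2.0, 2.0

left_center = ray_direction(0, 0, fx, fy, cx, cy, offset=0.5)
right_center = ray_direction(W - 1, 0, fx, fy, cx, cy, offset=0.5)
left_corner = ray_direction(0, 0, fx, fy, cx, cy, offset=0.0)
right_corner = ray_direction(W - 1, 0, fx, fy, cx, cy, offset=0.0)

print(left_center[0], right_center[0])  # symmetric about the optical axis
print(left_corner[0], right_corner[0])  # skewed toward the left edge
```

With the offset, the leftmost and rightmost rays mirror each other about the optical axis. Without it, the whole ray bundle is shifted by half a pixel toward the top-left.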

For instance, using the Mip-NeRF method, every ray of a 100×100 image, and hence every sample point along it, reappears in the 800×800 image, potentially causing ambiguity. With your method, rays from two resolutions coincide only when the ratio between the resolutions is odd, such as 300×300 and 900×900. In datasets whose resolutions span 100×100 to 800×800, the rays do not overlap as much, which naturally decreases ambiguity.
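The overlap argument can be checked along one image axis with exact arithmetic. This is only a sketch: `shared_centers` is a hypothetical helper, and it assumes the intrinsics scale proportionally with resolution, so that equal normalized pixel positions correspond to the same ray.

```python
from fractions import Fraction


def shared_centers(low_res, high_res, offset):
    """Count 1D ray positions that two resolutions have in common.

    Positions are normalized to [0, 1] so both images cover the same field of view.
    """
    low = {(Fraction(i) + offset) / low_res for i in range(low_res)}
    high = {(Fraction(j) + offset) / high_res for j in range(high_res)}
    return len(low & high)


half = Fraction(1, 2)

# Even ratio (8x): every low-res ray reappears without the offset, none with it.
no_offset_100_800 = shared_centers(100, 800, Fraction(0))
offset_100_800 = shared_centers(100, 800, half)

# Odd ratio (3x): the rays still coincide even with the offset.
offset_300_900 = shared_centers(300, 900, half)

print(no_offset_100_800, offset_100_800, offset_300_900)
```

Under these assumptions, the half-pixel offset removes shared rays for even resolution ratios but not for odd ones.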

To validate this hypothesis, I conducted an experiment comparing results with and without adding 0.5. Below are the PSNR results:

[Image: PSNR comparison with and without the 0.5 offset]

It seems that PSNR improves significantly when 0.5 is added while calculating ray directions. Do you think this improvement could be attributed to the dataset's resolution setup that I mentioned, or is it more likely due to the intrinsic advantage of starting ray calculations from the pixel center?

I would greatly appreciate your thoughts on this!
