Thank you for your impressive work and the valuable contributions you have made!
I noticed that the way you calculate ray directions differs slightly from NeRF in the following lines:

```python
directions = F.pad(
    torch.stack(
        [
            (x - self.K[0, 2] + 0.5) / self.K[0, 0],
            (y - self.K[1, 2] + 0.5) / self.K[1, 1] * self.sign_z,
        ],
        dim=-1,
    ),
    (0, 1),
    value=self.sign_z,
)  # [H, W, 3], e.g. torch.Size([800, 800, 3])
```
Here, adding 0.5 to both x and y implies that the ray is cast from the center of the pixel, which I find quite reasonable. Furthermore, I believe this approach has additional advantages when dealing with datasets of varying resolutions, such as 800×800 and 100×100, as it reduces ambiguity in ray direction calculations.
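For concreteness, here is a minimal sketch of the two conventions side by side (plain NumPy; the image width and intrinsics below are made-up illustrative values, not taken from the repo). The only difference is the half-pixel shift, which moves each ray from the pixel's corner to its center:

```python
# Illustrative comparison of the two ray-direction conventions (x-axis only).
import numpy as np

W, fx, cx = 4, 5.0, 2.0          # tiny made-up image width and intrinsics
x = np.arange(W)

corner_dirs_x = (x - cx) / fx          # NeRF-style: ray through the pixel's corner
center_dirs_x = (x - cx + 0.5) / fx    # with +0.5: ray through the pixel's center

print(corner_dirs_x)   # [-0.4 -0.2  0.   0.2]
print(center_dirs_x)   # [-0.3 -0.1  0.1  0.3]
# The two sets differ by exactly half a pixel (0.5 / fx = 0.1); with +0.5 the
# directions are also symmetric about the optical axis when cx = W / 2.
```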
For instance, using the Mip-NeRF convention, every ray cast in a 100×100 image (and therefore its sample points) reappears exactly in the corresponding 800×800 image, potentially causing ambiguity. With your method, rays from two resolutions coincide only when the scale factor between them is odd, for example 300×300 and 900×900 (a factor of 3). For datasets spanning 100×100 to 800×800, where the scale factors are 2, 4, and 8, the rays never overlap exactly, which naturally reduces the ambiguity.
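To make the overlap argument concrete, here is a small numerical check (a sketch only, with arbitrary but consistently scaled intrinsics, not code from either repo). It counts how many pixel columns of the low-resolution image cast exactly the same ray direction as some column of the high-resolution image, under both conventions:

```python
# Sketch: count exactly-coinciding ray directions across two resolutions.
import numpy as np

def pixel_dirs_x(W, fx, cx, offset):
    """x-component of the ray direction for every pixel column.
    offset=0.5 casts rays through pixel centers, offset=0.0 through corners."""
    x = np.arange(W)
    return (x - cx + offset) / fx

def count_shared_rays(W_lo, W_hi, offset):
    """Number of low-res columns whose ray direction also occurs in the high-res image."""
    fx_lo, cx_lo = W_lo * 1.2, W_lo / 2.0        # arbitrary but consistent intrinsics
    scale = W_hi / W_lo
    fx_hi, cx_hi = fx_lo * scale, cx_lo * scale  # intrinsics scale with resolution
    d_lo = pixel_dirs_x(W_lo, fx_lo, cx_lo, offset)
    d_hi = pixel_dirs_x(W_hi, fx_hi, cx_hi, offset)
    return np.isclose(d_lo[:, None], d_hi[None, :]).any(axis=1).sum()

for W_lo, W_hi in [(100, 800), (300, 900)]:
    print(f"{W_lo} -> {W_hi} | corner convention: {count_shared_rays(W_lo, W_hi, 0.0)}"
          f" | center (+0.5): {count_shared_rays(W_lo, W_hi, 0.5)}")
```

Under these assumptions, the corner convention shares every low-resolution ray in both cases, while the +0.5 convention shares none of them for the even factor (100 → 800) and all of them for the odd factor (300 → 900), which matches the reasoning above.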
To validate this hypothesis, I conducted an experiment comparing results with and without adding 0.5. Below are the PSNR results:
It seems that PSNR improves significantly when the 0.5 offset is included in the ray-direction calculation. Do you think this improvement could be attributed to the dataset's resolution setup I mentioned, or is it more likely due to the intrinsic advantage of casting rays through the pixel center?
I would greatly appreciate your thoughts on this!