make image similarity check less sensitive #258

Closed
wants to merge 1 commit into from

Conversation

jbitton
@jbitton jbitton commented Feb 25, 2025

Summary:
as part of my personal side quest to make augly's tests pass again, i am making a change to our tests.

currently, to assess image similarity, we use the `np.allclose` function. while that's less sensitive than an MD5 hash, it's not much better, because visually imperceptible changes can still produce large differences in pixel values between numpy image arrays.
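
To illustrate the point about `np.allclose`, here is a minimal sketch (not taken from AugLy's test suite). It assumes a hypothetical 64x64 grayscale image with a sharp edge that a different library version renders one pixel further right. The array names and sizes are made up for the example.

```python
import numpy as np

# Hypothetical reference image: black left half, white right half.
height, width = 64, 64
reference = np.zeros((height, width), dtype=np.uint8)
reference[:, 32:] = 255

# Hypothetical re-render where the edge lands one column further right.
rendered = np.zeros((height, width), dtype=np.uint8)
rendered[:, 33:] = 255

# Cast to a signed type before subtracting to avoid uint8 wrap-around.
diff = np.abs(reference.astype(np.int16) - rendered.astype(np.int16))
max_abs_diff = int(diff.max())               # largest per-pixel difference
changed_fraction = float((diff > 0).mean())  # share of pixels that differ

# Default tolerances (rtol=1e-05, atol=1e-08) reject this pair outright.
arrays_match = np.allclose(reference, rendered)

print(max_abs_diff, changed_fraction, arrays_match)
```

Only one column out of 64 differs, but each of those pixels is off by the full value range, so an element-wise tolerance would have to be loosened to the point of being meaningless to accept this pair.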

thus, to make augly's tests less affected by slight version updates by PIL or whatever else, we are switching to using imagehash.

we're specifically using the phash - you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

phash isn't a perfect fit long term, though: it's not sensitive to color, scaling, or aspect ratio changes. to deal with the latter two, i'm keeping the size equality check. for color, i want to do more research into an efficient way to check it. nonetheless, this is still better than what we have right now.

Differential Revision: D70137163

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Feb 25, 2025
@facebook-github-bot
This pull request was exported from Phabricator. Differential Revision: D70137163

jbitton added a commit to jbitton/AugLy that referenced this pull request Feb 25, 2025
jbitton added a commit to jbitton/AugLy that referenced this pull request Feb 26, 2025
facebookresearch#258)

Summary:

as part of my personal side quest to make augly's tests pass again, i am making a change to our tests.

## overlay wrap text fix

it seems we were modifying the original image in place in overlay wrap text, which is not the augly way (we always copy + modify + return a new image). this change fixes that, which also fixes 99% of the image tests.
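
The copy + modify + return pattern mentioned above can be sketched as follows. This is not AugLy's overlay wrap text implementation; `overlay_text` and its parameters are hypothetical names used only to show the pattern with Pillow.

```python
from PIL import Image, ImageDraw


def overlay_text(image, text, position=(5, 5), fill=(255, 0, 0)):
    """Draw text on a copy of the image and return the copy.

    Hypothetical helper: the input image is never modified.
    """
    result = image.copy()  # copy first, so the caller's image stays intact
    ImageDraw.Draw(result).text(position, text, fill=fill)  # modify the copy
    return result          # return the new image


# Usage: the original's pixel data is the same before and after the call.
original = Image.new("RGB", (80, 40), "white")
pixels_before = original.tobytes()
augmented = overlay_text(original, "hi")
print(original.tobytes() == pixels_before, augmented is original)
```

Drawing directly on `image` instead of `result` would mutate the caller's object, so any test that reuses the same input image would see it change between augmentations.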

## imagehash change

currently, to assess image similarity, we use the `np.allclose` function. while that's less sensitive than an MD5 hash, it's not much better, because visually imperceptible changes can still produce large differences in pixel values between numpy image arrays.

thus, to make augly's tests less affected by slight version updates by PIL or whatever else, we are switching to using imagehash.

we're specifically using the phash - you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html

phash isn't a perfect fit long term, though: it's not sensitive to color, scaling, or aspect ratio changes. to deal with the latter two, i'm keeping the size equality check. for color, i want to do more research into an efficient way to check it. nonetheless, this is still better than what we have right now.

Reviewed By: joelicohk, mayaliliya

Differential Revision: D70137163
@facebook-github-bot
This pull request has been merged in 0bfda4c.
