Remove set fp32 math mode & increase tolerance #1860
Conversation
I'm not really sure why this is happening and it seems weird to me. Do you know if the test ever passed and when it stopped passing?
It is working with IPEX, but it never worked with upstream PyTorch.
@whitneywhtsang, @etiotto, @FMarno: background info for this change.
@@ -327,7 +327,6 @@ def matmul(a, b, accum_dtype, res_dtype):
     # Still we can test our matrix multiplication with block pointers against a native torch implementation (i.e., cuBLAS).
     torch.manual_seed(0)
-    torch.xpu.set_fp32_math_mode(torch.xpu.utils.FP32MathMode.TF32)
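For reference, one way the IPEX-only call could be guarded instead of removed outright is sketched below; this is illustrative only (the PR simply drops the line), and it assumes stock PyTorch exposes `torch.xpu` without `set_fp32_math_mode`:

```python
import torch

# Only enable the TF32 math mode when the IPEX-provided API is present;
# on stock PyTorch (no set_fp32_math_mode yet) this is a no-op.
if hasattr(torch.xpu, "set_fp32_math_mode"):
    torch.xpu.set_fp32_math_mode(torch.xpu.utils.FP32MathMode.TF32)
```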
This should also work when using upstream PyTorch. I do not think we should be fixing the tutorial.
@guangyey's point is that we should use this workaround until this feature is implemented upstream.
This tutorial is not currently present upstream
Hi @etiotto, `set_fp32_math_mode` is not yet upstreamed to stock PyTorch, and we need time to redesign this API to account for other backends. So I personally recommend that @ZzEeKkAa use this workaround until we complete this API.
OK. We can put this in to unblock the work of migrating to PyTorch (instead of IPEX). @ZzEeKkAa please add a FIXME in the code and open an issue so that once PyTorch supports `set_fp32_math_mode` we can go back and revert this change. Once that is done I will be able to approve the PR. Thanks.
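For illustration only, a rough sketch of what the requested FIXME plus a relaxed tolerance could look like in the tutorial; the tensor shapes, dtype arguments, and the exact `atol` value are assumptions, and `matmul` refers to the tutorial's own Triton matmul wrapper:

```python
import torch

# FIXME: restore the TF32 math-mode call and tighten the tolerance once an
# equivalent of torch.xpu.set_fp32_math_mode is available in upstream PyTorch.
torch.manual_seed(0)
a = torch.randn(512, 512, device="xpu", dtype=torch.float32)
b = torch.randn(512, 512, device="xpu", dtype=torch.float32)

triton_output = matmul(a, b, accum_dtype="fp32", res_dtype="fp32")  # tutorial's matmul; argument values assumed
torch_output = torch.matmul(a, b)

# A larger absolute tolerance absorbs the gap between TF32-style accumulation
# and the fp32 reference result.
torch.testing.assert_close(triton_output, torch_output, atol=1e-2, rtol=0)
```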
Going by Julian's comment, we might also want to add a ticket to support true fp32 matmul. That will of course have the downside of not using DPAS, so it will be slow by default (not ideal, I know).
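To make such a ticket concrete, here is a minimal, self-contained sketch of opting into true fp32 accumulation at the kernel level; the kernel is hypothetical, and the `input_precision` argument name may differ across Triton versions (older releases used `allow_tf32`):

```python
import torch
import triton
import triton.language as tl


@triton.jit
def single_tile_dot(a_ptr, b_ptr, c_ptr, BLOCK: tl.constexpr):
    # Load one BLOCK x BLOCK tile of A and B (row-major, contiguous).
    offs = tl.arange(0, BLOCK)
    a = tl.load(a_ptr + offs[:, None] * BLOCK + offs[None, :])
    b = tl.load(b_ptr + offs[:, None] * BLOCK + offs[None, :])
    # "ieee" requests full fp32 precision for the dot product instead of the
    # TF32/DPAS path, trading speed for accuracy.
    c = tl.dot(a, b, input_precision="ieee")
    tl.store(c_ptr + offs[:, None] * BLOCK + offs[None, :], c)


a = torch.randn(16, 16, device="xpu", dtype=torch.float32)
b = torch.randn(16, 16, device="xpu", dtype=torch.float32)
c = torch.empty_like(a)
single_tile_dot[(1,)](a, b, c, BLOCK=16)
```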
@ZzEeKkAa Please create the 2 issues as described above.
Ping @ZzEeKkAa: are the 2 issues mentioned above opened? Links?
@whitneywhtsang @etiotto I've just opened the issues and updated the PR with a FIXME comment:
#1956
#1957
I think you should at least add a comment explaining why this is needed. Alternatively, could you detect whether IPEX is available and deactivate this particular test until upstream support is available?
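A sketch of that detection idea, assuming IPEX is importable as `intel_extension_for_pytorch` and using a hypothetical pytest-style test name (this is not what the PR ended up doing):

```python
import importlib.util

import pytest

# True only when Intel Extension for PyTorch (IPEX) is installed.
HAS_IPEX = importlib.util.find_spec("intel_extension_for_pytorch") is not None


@pytest.mark.skipif(not HAS_IPEX,
                    reason="torch.xpu.set_fp32_math_mode currently requires IPEX")
def test_tutorial_matmul_matches_torch():
    # The tutorial's comparison against torch.matmul would run here.
    ...
```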
The alternative is not an option. The dependency on IPEX in our code base will be fully deprecated within a week to a week and a half.
Closes #1840

CI status on PyTorch main with noop IPEX: https://github.com/intel/intel-xpu-backend-for-triton/actions/runs/10422253663

For rolling, there is the following error (tutorial 10):
* `AttributeError: module 'torch.xpu' has no attribute 'set_fp32_math_mode'` (should be fixed by #1860)

For LTS:
* `Failed to create Level Zero tracer: 2013265921` (instrumentation/test_gpuhello.py)

Signed-off-by: Anatoly Myachev <[email protected]>
I think this is a good enough solution for now, but we should create an issue to actually support fp32 precision through the dot operation.
Force-pushed from aeae30d to 75b3ff7.
Kind of resolves #1769 by removing the math-mode change and increasing the absolute tolerance.