Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve error message with mismatched valtype/dtype #2739

Merged
merged 5 commits into from
Aug 6, 2024

Conversation

jacobhinkle
Copy link
Collaborator

@jacobhinkle jacobhinkle commented Aug 1, 2024

We already check types when binding arguments, but now we explicitly state which arguments mismatch, and in which positions.

Fixes #2670 by improving the error message. Before this change the error message was

Expected T0_g[ iS0{i1} ] to be bound to an at::Tensor, but got l

After this PR, the error is

Expected input 0, T0_g[ iS0{i0} ], to be an at::Tensor but got scalar 2

We already check types when binding arguments, but now we explicitly
state which arguments mismatch, and in which positions.

Fixes #2670 by improving the error message. Before this change the error
message was
```
RuntimeError: Expected T0_g[ iS0{i1} ] to be bound to an at::Tensor, but got l
```
After this PR, the error is
```
Mismatch in input type: argument 0 (T0_g[ iS0{i0} ]) should be a float tensor but double scalar 2 was provided.
```
@jacobhinkle
Copy link
Collaborator Author

!build

Copy link
Collaborator

@tfogal tfogal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That error message is a great improvement! Thank you.

@jacobhinkle
Copy link
Collaborator Author

This is not an ideal fix since it replicates (poorly) the validation that already exists in ExpressionEvaluator::bind. I did it this way to be able to provide the argument number, which we lose once we get to the bind call. I'll try again to just improve the error message locally within bind and omit the argument number for now. In the future, it might be nice to be able to declare error contexts with guards so that additional info is printed. For example in bindInputs we could have:

for (const auto i : c10::irange(inputs.size())) {
  ErrorContext e("Error binding input " + std::to_string(i));
  expr_eval.bind(inputs[i], *args[i], true);
}

Then if NVF_CHECK or NVF_ERROR fails inside the lifetime of e, we would get an error message like Error binding input 2: <actual error>. That's a more general problem that we should handle separately (I'll post an issue for it). For now I will just try and clean up the bind error messages and revert the current changes.

@jacobhinkle
Copy link
Collaborator Author

I fixed this up by determining the input position inside the bind_ call when a check fails. I also added some cases to the test to check that our error messages work properly for other kinds of malformed inputs.

Copy link
Collaborator

@tfogal tfogal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-+1'ing the new changes.

But I also want to plug #1758: I'd really encourage an up-front pass to validate as soon as we can. As we found with latency work (#2136), ExprEvaluator is on the critical path, and so I worry about adding more checks in there.

Admittedly--these checks have to be on the critical path (since we can only validate when we see what inputs we actually receive). But it would probably behoove us to do an up-front check for UX purposes and avoid checks inside each Expr evaluation.

csrc/expr_evaluator.cpp Outdated Show resolved Hide resolved
@jacobhinkle jacobhinkle merged commit 37289f4 into main Aug 6, 2024
5 checks passed
@jacobhinkle jacobhinkle deleted the input_tensor_check branch August 6, 2024 12:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Expected ... to be bound to an at::Tensor, but got l
2 participants