Using Megatron Train GPT3 #21

Open
Kingsleyandher opened this issue May 30, 2023 · 3 comments

Kingsleyandher commented May 30, 2023

Hello, I hit an error when using the Sophia optimizer to train GPT3 with Megatron. The gradient passed to the optimizer does not have requires_grad = True, so it cannot be differentiated again to compute the second derivative. Do you know how to solve this problem?

File "/root/miniconda3/envs/torch18/lib/python3.7/site-packages/torch/autograd/__init__.py", line 277, in grad
    allow_unused, accumulate_grad=False)  # Calls into the C++ engine to run the backward pass
RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.
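
For reference, the same RuntimeError can be reproduced outside Megatron with a few lines of plain PyTorch. This is only a sketch of one way the error can arise, not a confirmed diagnosis of the Megatron setup: the toy parameter `p` and the loss are hypothetical, and the assumption is that the gradient reaching the estimator was produced by an ordinary backward pass and is therefore detached from the graph.

```python
import torch

# Hypothetical toy parameter and loss, for illustration only.
p = torch.randn(3, requires_grad=True)
loss = (p ** 3).sum()

# A plain backward() stores a gradient that is not part of the autograd graph.
loss.backward()
grad = p.grad
print(f"grad requires grad: {grad.requires_grad}, grad_fn: {grad.grad_fn}")

u = torch.randn_like(grad)
grad_dot_u = torch.sum(grad * u)

# Differentiating grad_dot_u with respect to p fails, because nothing
# connects the detached gradient back to p.
try:
    torch.autograd.grad(grad_dot_u, p)
    error_message = None
except RuntimeError as exc:
    error_message = str(exc)
print(error_message)
```

If the gradient in the failing run is detached in this way, the fix has to happen where the gradient is produced rather than inside the optimizer.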

Kingsleyandher changed the title from "Megatron" to "Using Megatron Train GPT3" on May 30, 2023
Kingsleyandher (Author) commented

class HutchinsonEstimator(HessianEstimator):
    def estimate(self, p, grad):
        u = torch.randn_like(grad)
        grad_dot_u = torch.sum(grad * u)
        print(f"grad_dot_u requires grad: {grad_dot_u.requires_grad}")   #  -> False
        
        # ↓  RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn.
        hessian_vector_product = torch.autograd.grad(    
            grad_dot_u, p, retain_graph=True)[0]
        return u * hessian_vector_product
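
One possible way to make a Hessian-vector product like this work is to compute the first-order gradient with `create_graph=True`, so that it stays connected to the parameter. The sketch below shows this in plain PyTorch. The function `hutchinson_diag_estimate` and the toy loss are hypothetical names for illustration, not part of the Sophia or Megatron code, and the thread does not say whether this can be wired into Megatron's training loop.

```python
import torch


def hutchinson_diag_estimate(loss, p):
    """Hutchinson-style estimate of the Hessian diagonal of `loss` w.r.t. `p`."""
    # Keep the gradient in the graph so it can be differentiated a second time.
    (grad,) = torch.autograd.grad(loss, p, create_graph=True)

    # Random probe vector, as in the snippet above.
    u = torch.randn_like(grad)
    grad_dot_u = torch.sum(grad * u)

    # d(grad . u)/dp is the Hessian-vector product H @ u.
    (hessian_vector_product,) = torch.autograd.grad(grad_dot_u, p)

    # Return the estimate and a detached copy of the first-order gradient.
    return u * hessian_vector_product, grad.detach()


# Toy check: for loss = sum(p^3) the Hessian is diagonal with entries 6 * p.
p = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
loss = (p ** 3).sum()
estimate, grad = hutchinson_diag_estimate(loss, p)
print(estimate, grad)
```

Building the graph for the second pass costs extra memory and time, which is one reason to compute such an estimate only every few steps rather than on every update.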

Kingsleyandher (Author) commented

This looks like the same problem as #7.


liuslnlp commented Jun 4, 2023

Hello @Kingsleyandher, I'm running into the same problem. Has yours been solved?
