
[Operator] Add Conv3d forward function #412

Open · wants to merge 1 commit into base: master
Conversation

@Gxiandy (Contributor) commented Jan 12, 2025

PR Category

Operator

Type of Change

New Feature

Description

Add Conv3d forward function and related tests
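For context, the semantics a Conv3d forward pass must reproduce can be written out directly. The sketch below is a naive pure-Python reference (single group, no bias, symmetric stride/padding), illustrative only; it is not the PR's GPU kernel, and all names here are hypothetical:

```python
# Naive reference semantics of a Conv3d forward pass (pure Python sketch,
# single group, no bias). Shapes follow torch.nn.functional.conv3d:
#   input (N, C_in, D, H, W), weight (C_out, C_in, kD, kH, kW).
def conv3d_forward(inp, weight, stride=1, padding=0):
    N, C_in = len(inp), len(inp[0])
    D, H, W = len(inp[0][0]), len(inp[0][0][0]), len(inp[0][0][0][0])
    C_out = len(weight)
    kD, kH, kW = len(weight[0][0]), len(weight[0][0][0]), len(weight[0][0][0][0])

    def osz(n, k):  # output length along one spatial dim
        return (n + 2 * padding - k) // stride + 1

    oD, oH, oW = osz(D, kD), osz(H, kH), osz(W, kW)

    def px(n, c, d, h, w):  # padded read: zeros outside the input bounds
        d -= padding; h -= padding; w -= padding
        if 0 <= d < D and 0 <= h < H and 0 <= w < W:
            return inp[n][c][d][h][w]
        return 0.0

    out = [[[[[0.0] * oW for _ in range(oH)] for _ in range(oD)]
            for _ in range(C_out)] for _ in range(N)]
    for n in range(N):
        for co in range(C_out):
            for od in range(oD):
                for oh in range(oH):
                    for ow in range(oW):
                        acc = 0.0
                        for ci in range(C_in):
                            for kd in range(kD):
                                for kh in range(kH):
                                    for kw in range(kW):
                                        acc += (px(n, ci,
                                                   od * stride + kd,
                                                   oh * stride + kh,
                                                   ow * stride + kw)
                                                * weight[co][ci][kd][kh][kw])
                        out[n][co][od][oh][ow] = acc
    return out

# Example: 2x2x2 all-ones input and kernel, stride 1, no padding.
ones = [[[[[1.0] * 2 for _ in range(2)] for _ in range(2)]]]  # (1,1,2,2,2)
print(conv3d_forward(ones, ones))  # -> [[[[[8.0]]]]]
```

A real kernel would of course tile and parallelize this loop nest rather than iterate scalar-by-scalar; the sketch only pins down the arithmetic being tested.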

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by unit tests.

Performance

Operator: conv3d Performance Test (dtype=torch.float16, mode=cuda, level=core)

| Status | Torch Latency (ms) | Gems Latency (ms) | Gems Speedup | Size Detail |
|---|---|---|---|---|
| SUCCESS | 6.790144 | 6.793216 | 1.000 | {'input': torch.Size([104, 16, 32, 32, 32]), 'weight': torch.Size([32, 16, 4, 4, 4]), 'bias': None, 'groups': 1, 'stride': 1, 'padding': 0} |
| SUCCESS | 4.712448 | 4.705280 | 1.002 | {'input': torch.Size([64, 32, 18, 180, 18]), 'weight': torch.Size([32, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.943104 | 0.941056 | 1.002 | {'input': torch.Size([4, 32, 110, 110, 10]), 'weight': torch.Size([64, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 2.338816 | 2.342912 | 0.998 | {'input': torch.Size([4, 64, 110, 110, 10]), 'weight': torch.Size([16, 64, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.208896 | 0.208896 | 1.000 | {'input': torch.Size([16, 32, 120, 12, 12]), 'weight': torch.Size([24, 32, 3, 3, 3]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 7.403520 | 7.422976 | 0.997 | {'input': torch.Size([16, 32, 240, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 1} |
| SUCCESS | 0.224256 | 0.224256 | 1.000 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 2, 'padding': 2} |
| SUCCESS | 0.933888 | 0.931840 | 1.002 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 2} |

Operator: conv3d Performance Test (dtype=torch.float32, mode=cuda, level=core)

| Status | Torch Latency (ms) | Gems Latency (ms) | Gems Speedup | Size Detail |
|---|---|---|---|---|
| SUCCESS | 35.387390 | 35.483646 | 0.997 | {'input': torch.Size([104, 16, 32, 32, 32]), 'weight': torch.Size([32, 16, 4, 4, 4]), 'bias': None, 'groups': 1, 'stride': 1, 'padding': 0} |
| SUCCESS | 20.566015 | 21.796864 | 0.944 | {'input': torch.Size([64, 32, 18, 180, 18]), 'weight': torch.Size([32, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 5.250048 | 5.249024 | 1.000 | {'input': torch.Size([4, 32, 110, 110, 10]), 'weight': torch.Size([64, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 5.134336 | 5.169152 | 0.993 | {'input': torch.Size([4, 64, 110, 110, 10]), 'weight': torch.Size([16, 64, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.425984 | 0.431104 | 0.988 | {'input': torch.Size([16, 32, 120, 12, 12]), 'weight': torch.Size([24, 32, 3, 3, 3]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 103.790588 | 103.080963 | 1.007 | {'input': torch.Size([16, 32, 240, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 1} |
| SUCCESS | 0.407552 | 0.414720 | 0.983 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 2, 'padding': 2} |
| SUCCESS | 10.667008 | 10.584064 | 1.008 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 2} |

Operator: conv3d Performance Test (dtype=torch.bfloat16, mode=cuda, level=core)

| Status | Torch Latency (ms) | Gems Latency (ms) | Gems Speedup | Size Detail |
|---|---|---|---|---|
| SUCCESS | 11.389952 | 11.388928 | 1.000 | {'input': torch.Size([104, 16, 32, 32, 32]), 'weight': torch.Size([32, 16, 4, 4, 4]), 'bias': None, 'groups': 1, 'stride': 1, 'padding': 0} |
| SUCCESS | 4.552704 | 4.488192 | 1.014 | {'input': torch.Size([64, 32, 18, 180, 18]), 'weight': torch.Size([32, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.941056 | 0.943104 | 0.998 | {'input': torch.Size([4, 32, 110, 110, 10]), 'weight': torch.Size([64, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 3.181568 | 3.180544 | 1.000 | {'input': torch.Size([4, 64, 110, 110, 10]), 'weight': torch.Size([16, 64, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.256000 | 0.256000 | 1.000 | {'input': torch.Size([16, 32, 120, 12, 12]), 'weight': torch.Size([24, 32, 3, 3, 3]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 7.410688 | 7.752704 | 0.956 | {'input': torch.Size([16, 32, 240, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 1} |
| SUCCESS | 0.222208 | 0.223232 | 0.995 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 2, 'padding': 2} |
| SUCCESS | 0.931840 | 0.934912 | 0.997 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 2} |
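The shape details in the tables above follow the standard convolution output-size rule, floor((n + 2·padding − k) / stride) + 1 per spatial dimension. As a quick sanity check on the benchmarked configurations (a sketch, not part of the PR):

```python
def conv_out_size(n, k, stride, padding):
    # Output length along one spatial dimension:
    # floor((n + 2*padding - k) / stride) + 1
    return (n + 2 * padding - k) // stride + 1

# First case: input (104, 16, 32, 32, 32), weight (32, 16, 4, 4, 4),
# stride=1, padding=0 -> output spatial dims (29, 29, 29).
print(conv_out_size(32, 4, 1, 0))   # -> 29

# Second case: input (64, 32, 18, 180, 18), weight (32, 32, 5, 5, 5),
# stride=2, padding=1 -> output spatial dims (8, 89, 8).
print(conv_out_size(18, 5, 2, 1))   # -> 8
print(conv_out_size(180, 5, 2, 1))  # -> 89
```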

@StrongSpoon StrongSpoon self-requested a review January 13, 2025 02:18
@StrongSpoon (Collaborator) commented:
Impressive performance! We will review soon.
