
[Operator] Add Conv3d forward function #412

Open · wants to merge 1 commit into base: master
Conversation

@Gxiandy (Contributor) commented Jan 12, 2025

PR Category

Operator

Type of Change

New Feature

Description

Add Conv3d forward function and related tests
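For context, the semantics a Conv3d forward pass must reproduce can be written out directly. The sketch below is a naive pure-Python reference (single group, no bias, symmetric stride/padding), illustrative only; it is not the PR's GPU kernel, and all names here are hypothetical:

```python
# Naive reference semantics of a Conv3d forward pass (pure Python sketch,
# single group, no bias). Shapes follow torch.nn.functional.conv3d:
#   input (N, C_in, D, H, W), weight (C_out, C_in, kD, kH, kW).
def conv3d_forward(inp, weight, stride=1, padding=0):
    N, C_in = len(inp), len(inp[0])
    D, H, W = len(inp[0][0]), len(inp[0][0][0]), len(inp[0][0][0][0])
    C_out = len(weight)
    kD, kH, kW = len(weight[0][0]), len(weight[0][0][0]), len(weight[0][0][0][0])

    def osz(n, k):  # output length along one spatial dim
        return (n + 2 * padding - k) // stride + 1

    oD, oH, oW = osz(D, kD), osz(H, kH), osz(W, kW)

    def px(n, c, d, h, w):  # padded read: zeros outside the input bounds
        d -= padding; h -= padding; w -= padding
        if 0 <= d < D and 0 <= h < H and 0 <= w < W:
            return inp[n][c][d][h][w]
        return 0.0

    out = [[[[[0.0] * oW for _ in range(oH)] for _ in range(oD)]
            for _ in range(C_out)] for _ in range(N)]
    for n in range(N):
        for co in range(C_out):
            for od in range(oD):
                for oh in range(oH):
                    for ow in range(oW):
                        acc = 0.0
                        for ci in range(C_in):
                            for kd in range(kD):
                                for kh in range(kH):
                                    for kw in range(kW):
                                        acc += (px(n, ci,
                                                   od * stride + kd,
                                                   oh * stride + kh,
                                                   ow * stride + kw)
                                                * weight[co][ci][kd][kh][kw])
                        out[n][co][od][oh][ow] = acc
    return out

# Example: 2x2x2 all-ones input and kernel, stride 1, no padding.
ones = [[[[[1.0] * 2 for _ in range(2)] for _ in range(2)]]]  # (1,1,2,2,2)
print(conv3d_forward(ones, ones))  # -> [[[[[8.0]]]]]
```

A real kernel would of course tile and parallelize this loop nest rather than iterate scalar-by-scalar; the sketch only pins down the arithmetic being tested.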

Issue

Progress

  • Change is properly reviewed (1 reviewer required, 2 recommended).
  • Change responds to an issue.
  • Change is fully covered by unit tests.

Performance

Operator: conv3d Performance Test (dtype=torch.float16, mode=cuda, level=core)

| Status | Torch Latency (ms) | Gems Latency (ms) | Gems Speedup | Size Detail |
|---|---|---|---|---|
| SUCCESS | 6.790144 | 6.793216 | 1.000 | {'input': torch.Size([104, 16, 32, 32, 32]), 'weight': torch.Size([32, 16, 4, 4, 4]), 'bias': None, 'groups': 1, 'stride': 1, 'padding': 0} |
| SUCCESS | 4.712448 | 4.705280 | 1.002 | {'input': torch.Size([64, 32, 18, 180, 18]), 'weight': torch.Size([32, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.943104 | 0.941056 | 1.002 | {'input': torch.Size([4, 32, 110, 110, 10]), 'weight': torch.Size([64, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 2.338816 | 2.342912 | 0.998 | {'input': torch.Size([4, 64, 110, 110, 10]), 'weight': torch.Size([16, 64, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.208896 | 0.208896 | 1.000 | {'input': torch.Size([16, 32, 120, 12, 12]), 'weight': torch.Size([24, 32, 3, 3, 3]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 7.403520 | 7.422976 | 0.997 | {'input': torch.Size([16, 32, 240, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 1} |
| SUCCESS | 0.224256 | 0.224256 | 1.000 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 2, 'padding': 2} |
| SUCCESS | 0.933888 | 0.931840 | 1.002 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 2} |

Operator: conv3d Performance Test (dtype=torch.float32, mode=cuda, level=core)

| Status | Torch Latency (ms) | Gems Latency (ms) | Gems Speedup | Size Detail |
|---|---|---|---|---|
| SUCCESS | 35.387390 | 35.483646 | 0.997 | {'input': torch.Size([104, 16, 32, 32, 32]), 'weight': torch.Size([32, 16, 4, 4, 4]), 'bias': None, 'groups': 1, 'stride': 1, 'padding': 0} |
| SUCCESS | 20.566015 | 21.796864 | 0.944 | {'input': torch.Size([64, 32, 18, 180, 18]), 'weight': torch.Size([32, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 5.250048 | 5.249024 | 1.000 | {'input': torch.Size([4, 32, 110, 110, 10]), 'weight': torch.Size([64, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 5.134336 | 5.169152 | 0.993 | {'input': torch.Size([4, 64, 110, 110, 10]), 'weight': torch.Size([16, 64, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.425984 | 0.431104 | 0.988 | {'input': torch.Size([16, 32, 120, 12, 12]), 'weight': torch.Size([24, 32, 3, 3, 3]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 103.790588 | 103.080963 | 1.007 | {'input': torch.Size([16, 32, 240, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 1} |
| SUCCESS | 0.407552 | 0.414720 | 0.983 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 2, 'padding': 2} |
| SUCCESS | 10.667008 | 10.584064 | 1.008 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 2} |

Operator: conv3d Performance Test (dtype=torch.bfloat16, mode=cuda, level=core)

| Status | Torch Latency (ms) | Gems Latency (ms) | Gems Speedup | Size Detail |
|---|---|---|---|---|
| SUCCESS | 11.389952 | 11.388928 | 1.000 | {'input': torch.Size([104, 16, 32, 32, 32]), 'weight': torch.Size([32, 16, 4, 4, 4]), 'bias': None, 'groups': 1, 'stride': 1, 'padding': 0} |
| SUCCESS | 4.552704 | 4.488192 | 1.014 | {'input': torch.Size([64, 32, 18, 180, 18]), 'weight': torch.Size([32, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.941056 | 0.943104 | 0.998 | {'input': torch.Size([4, 32, 110, 110, 10]), 'weight': torch.Size([64, 32, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 3.181568 | 3.180544 | 1.000 | {'input': torch.Size([4, 64, 110, 110, 10]), 'weight': torch.Size([16, 64, 5, 5, 5]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 0.256000 | 0.256000 | 1.000 | {'input': torch.Size([16, 32, 120, 12, 12]), 'weight': torch.Size([24, 32, 3, 3, 3]), 'bias': None, 'groups': 1, 'stride': 2, 'padding': 1} |
| SUCCESS | 7.410688 | 7.752704 | 0.956 | {'input': torch.Size([16, 32, 240, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 1} |
| SUCCESS | 0.222208 | 0.223232 | 0.995 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 2, 'padding': 2} |
| SUCCESS | 0.931840 | 0.934912 | 0.997 | {'input': torch.Size([16, 32, 24, 24, 24]), 'weight': torch.Size([24, 16, 3, 3, 3]), 'bias': None, 'groups': 2, 'stride': 1, 'padding': 2} |
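The shape details in the tables above follow the standard convolution output-size rule, floor((n + 2·padding − k) / stride) + 1 per spatial dimension. As a quick sanity check on the benchmarked configurations (a sketch, not part of the PR):

```python
def conv_out_size(n, k, stride, padding):
    # Output length along one spatial dimension:
    # floor((n + 2*padding - k) / stride) + 1
    return (n + 2 * padding - k) // stride + 1

# First case: input (104, 16, 32, 32, 32), weight (32, 16, 4, 4, 4),
# stride=1, padding=0 -> output spatial dims (29, 29, 29).
print(conv_out_size(32, 4, 1, 0))   # -> 29

# Second case: input (64, 32, 18, 180, 18), weight (32, 32, 5, 5, 5),
# stride=2, padding=1 -> output spatial dims (8, 89, 8).
print(conv_out_size(18, 5, 2, 1))   # -> 8
print(conv_out_size(180, 5, 2, 1))  # -> 89
```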

@StrongSpoon StrongSpoon self-requested a review January 13, 2025 02:18
@StrongSpoon (Collaborator) commented:
Impressive performance! We will review soon.
