Fuse multiple outputs for pointwise ops #3870

pfultz2 · 2025-03-06T23:31:44Z

No description provided.

codecov · 2025-03-07T01:16:14Z

Codecov Report

Attention: Patch coverage is 97.59615% with 5 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/fuse_concat.cpp	88.24%	2 Missing ⚠️
src/fuse_pointwise.cpp	98.28%	2 Missing ⚠️
src/fuse_pointwise_reduce.cpp	0.00%	1 Missing ⚠️

Additional details and impacted files

@@             Coverage Diff             @@
##           develop    #3870      +/-   ##
===========================================
- Coverage    92.41%   92.07%   -0.34%     
===========================================
  Files          522      525       +3     
  Lines        22532    24272    +1740     
===========================================
+ Hits         20822    22348    +1526     
- Misses        1710     1924     +214

Files with missing lines	Coverage Δ
src/include/migraphx/fuse_pointwise.hpp	`100.00% <ø> (ø)`
src/include/migraphx/instruction.hpp	`100.00% <ø> (ø)`
src/include/migraphx/matcher.hpp	`96.05% <100.00%> (-1.23%)`	⬇️
src/include/migraphx/module.hpp	`100.00% <ø> (ø)`
src/include/migraphx/shape.hpp	`91.18% <ø> (ø)`
src/instruction.cpp	`88.46% <100.00%> (-0.43%)`	⬇️
src/module.cpp	`86.68% <100.00%> (-<0.01%)`	⬇️
src/param_utils.cpp	`97.14% <100.00%> (-2.86%)`	⬇️
src/replace_allocate.cpp	`100.00% <100.00%> (ø)`
src/shape.cpp	`92.20% <100.00%> (-0.72%)`	⬇️
... and 3 more

... and 351 files with indirect coverage changes


amd-jmacaran · 2025-03-07T22:48:07Z

/AzurePipelines run

azure-pipelines · 2025-03-07T22:48:16Z

Azure Pipelines successfully started running 1 pipeline(s).

migraphx-bot · 2025-04-15T02:57:21Z

Test	Batch	Rate new 45eb6e	Rate old ecb974	Diff	Compare
torchvision-resnet50	64	3,229.57	3,231.64	-0.06%	✅
torchvision-resnet50_fp16	64	6,861.91	6,867.39	-0.08%	✅
torchvision-densenet121	32	2,431.54	2,432.50	-0.04%	✅
torchvision-densenet121_fp16	32	4,206.60	4,212.33	-0.14%	✅
torchvision-inceptionv3	32	1,612.96	1,613.30	-0.02%	✅
torchvision-inceptionv3_fp16	32	2,698.99	2,696.30	0.10%	✅
cadene-inceptionv4	16	749.73	750.25	-0.07%	✅
cadene-resnext64x4	16	809.50	809.71	-0.03%	✅
slim-mobilenet	64	6,649.39	6,654.55	-0.08%	✅
slim-nasnetalarge	64	196.73	203.03	-3.11%	🔴
slim-resnet50v2	64	3,319.66	3,434.71	-3.35%	🔴
bert-mrpc-onnx	8	1,142.41	1,142.05	0.03%	✅
bert-mrpc-tf	1	463.66	464.19	-0.11%	✅
pytorch-examples-wlang-gru	1	324.94	476.27	-31.77%	🔴
pytorch-examples-wlang-lstm	1	442.33	442.88	-0.12%	✅
torchvision-resnet50_1	1	812.72	813.23	-0.06%	✅
cadene-dpn92_1	1	427.60	421.24	1.51%	✅
cadene-resnext101_1	1	392.16	392.62	-0.12%	✅
onnx-taau-downsample	1	395.93	395.87	0.01%	✅
dlrm-criteoterabyte	1	31.80	31.80	-0.01%	✅
dlrm-criteoterabyte_fp16	1	50.99	50.96	0.05%	✅
agentmodel	1	8,878.79	9,458.32	-6.13%	🔴
unet_fp16	2	58.37	58.33	0.08%	✅
resnet50v1_fp16	1	1,084.91	1,071.23	1.28%	✅
resnet50v1_int8	1	881.90	893.66	-1.31%	✅
bert_base_cased_fp16	64	1,158.60	1,162.33	-0.32%	✅
bert_large_uncased_fp16	32	353.70	353.92	-0.06%	✅
bert_large_fp16	1	194.51	194.83	-0.16%	✅
distilgpt2_fp16	16	2,218.18	2,215.11	0.14%	✅
yolov5s	1	496.67	543.55	-8.62%	🔴
tinyllama	1	43.60	43.59	0.02%	✅
vicuna-fastchat	1	43.81	44.05	-0.55%	✅
whisper-tiny-encoder	1	411.71	411.57	0.03%	✅
whisper-tiny-decoder	1	411.43	411.31	0.03%	✅
llama2_7b	1	nan	nan	nan%	❌
qwen1.5-7b	1	23.41	23.41	-0.03%	✅
phi3-3.8b	1	nan	nan	nan%	❌
mask-rcnn	1	18.44	18.55	-0.58%	✅
llama3-8b	1	21.65	21.65	0.00%	✅
whisper-large-encoder	1	10.17	10.17	-0.01%	✅
whisper-large-decoder	1	99.05	97.78	1.30%	✅
mistral-7b	1	23.65	23.63	0.08%	✅
FLUX.1-schnell	1	894.15	904.60	-1.16%	✅
nan	nan	nan	nan	nan%	❌

This build is not recommended to merge 🔴
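For readers checking the regressions flagged above, the Diff column appears to be the relative change of the new rate against the old rate. The snippet below is a minimal sketch under that assumption; `percent_diff` is a hypothetical helper, not part of the migraphx-bot tooling, and the rates are copied from the table.

```python
def percent_diff(new_rate: float, old_rate: float) -> float:
    """Relative change of new_rate versus old_rate, in percent, rounded to 2 places."""
    return round((new_rate - old_rate) / old_rate * 100.0, 2)


# Rates copied from the table above (new 45eb6e vs old ecb974).
rows = {
    "pytorch-examples-wlang-gru": (324.94, 476.27),
    "agentmodel": (8878.79, 9458.32),
    "yolov5s": (496.67, 543.55),
}

for name, (new, old) in rows.items():
    print(f"{name}: {percent_diff(new, old):+.2f}%")
```

Because the table shows rounded rates, recomputing from them can differ from the reported Diff in the last digit (slim-nasnetalarge comes out as -3.10% here against the reported -3.11%).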

migraphx-bot · 2025-04-15T02:57:23Z

✅ bert-mrpc-onnx: PASSED: MIGraphX meets tolerance

✅ bert-mrpc-tf: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-gru: PASSED: MIGraphX meets tolerance

✅ pytorch-examples-wlang-lstm: PASSED: MIGraphX meets tolerance

✅ torchvision-resnet50_1: PASSED: MIGraphX meets tolerance

✅ cadene-dpn92_1: PASSED: MIGraphX meets tolerance

✅ cadene-resnext101_1: PASSED: MIGraphX meets tolerance

✅ dlrm-criteoterabyte: PASSED: MIGraphX meets tolerance

✅ agentmodel: PASSED: MIGraphX meets tolerance

✅ unet: PASSED: MIGraphX meets tolerance

✅ resnet50v1: PASSED: MIGraphX meets tolerance

✅ bert_base_cased_fp16: PASSED: MIGraphX meets tolerance

🔴bert_large_uncased_fp16: FAILED: MIGraphX is not within tolerance - check verbose output

✅ bert_large: PASSED: MIGraphX meets tolerance

✅ yolov5s: PASSED: MIGraphX meets tolerance

✅ tinyllama: PASSED: MIGraphX meets tolerance

✅ vicuna-fastchat: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-encoder: PASSED: MIGraphX meets tolerance

✅ whisper-tiny-decoder: PASSED: MIGraphX meets tolerance

✅ distilgpt2_fp16: PASSED: MIGraphX meets tolerance

❌llama2_7b: ERROR - check error output

Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 205, in main
    model = migraphx.parse_onnx(model_name, default_dim_value=batch)
RuntimeError: /src/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: /new-saved-models/llama2_7b/decoder_model.onnx

❌#qwen1.5-7b: ERROR - check error output

usage: accuracy_checker.py [-h] [--onnx ONNX] [--tf TF] [--provider PROVIDER]
                           [--batch BATCH] [--fill1] [--fill0] [--fp16]
                           [--argmax] [--verbose] [--tolerance TOLERANCE]
                           [--input-dim INPUT_DIM] [--target TARGET]
                           [--ort-run] [--ort-logging]
                           [--disable-offload-copy] [--disable-fast-math]
                           [--exhaustive_tune]
accuracy_checker.py: error: unrecognized arguments: input_ids attention_mask position_ids 1 256 @attention_mask 1 256 @position_ids 1 256

❌phi3-3.8b: ERROR - check error output

Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 205, in main
    model = migraphx.parse_onnx(model_name, default_dim_value=batch)
RuntimeError: /src/AMDMIGraphX/src/onnx/onnx_parser.cpp:264: parse_from: PARSE_FROM: Failed reading onnx file: /new-saved-models/phi3-3.8b/model.onnx

🔴mask-rcnn: FAILED: MIGraphX is not within tolerance - check verbose output

✅ llama3-8b: PASSED: MIGraphX meets tolerance

❌#whisper-large-encoder: ERROR - check error output

Traceback (most recent call last):
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 340, in <module>
    main()
  File "/src/AMDMIGraphX/tools/accuracy/accuracy_checker.py", line 205, in main
    model = migraphx.parse_onnx(model_name, default_dim_value=batch)
RuntimeError: /src/AMDMIGraphX/src/include/migraphx/op/convolution.hpp:100: normalize_compute_shape: CONVOLUTION: mismatched channel numbers

✅ whisper-large-decoder: PASSED: MIGraphX meets tolerance

✅ mistral-7b: PASSED: MIGraphX meets tolerance

✅ FLUX.1-schnell: PASSED: MIGraphX meets tolerance

pfultz2 added 6 commits March 6, 2025 13:20

Fuse multi outputs

6273bc1

Add flag

c41854a

Add unit tests

09df8fb

Enable multi output fusions

b676bbe

Format

e2fffa0

Update license

63d1099

pfultz2 requested a review from causten as a code owner March 6, 2025 23:31

pfultz2 requested review from shivadbhavsar and kahmed10 March 6, 2025 23:33

turneram closed this Mar 7, 2025

turneram deleted the fuse-pointwise-multi-out2 branch March 7, 2025 15:48

turneram restored the fuse-pointwise-multi-out2 branch March 7, 2025 16:12

turneram reopened this Mar 7, 2025

pfultz2 self-assigned this Mar 12, 2025

pfultz2 added 13 commits March 13, 2025 13:56

Fix loop

d76a317

Format

eac7d58

Add unit tests

2350b4f

Format

5175ac4

Skip dynamic shapes for now

082761d

Use builtin fuse function

ef75ca0

Format

f0e6494

Move replacement to separate function

208d223

Format

d436bd6

Add another test for larger horiz fusion

1051b53

Format

09453ae

Fix some more tests

b67ba10

Format

b6884e3

pfultz2 and others added 15 commits March 18, 2025 11:05

Only replace input when it's not used

2c798dd

Format

ec2ee9a

Fix tidy

5992d79

Tidy fixes

a97e180

Format

f9bd2b9

Merge branch 'develop' into fuse-pointwise-multi-out2

eeb1e52

Fix handling across modules

c5dcad4

Format

ab8e0b9

Don't traverse in other modules for reaches

ee0c5ec

Format

261d2c4

Make pointer

4873973

Format

f091722

Update license

e514b11

Format

bbf9e23

Merge branch 'develop' into fuse-pointwise-multi-out2

fab6d0e

pfultz2 mentioned this pull request Apr 11, 2025

Horiz fuse after pointwise #3920

Merged

pfultz2 added 7 commits April 12, 2025 08:36

Add test for multi out pointwise fusion

c3b71d2

Format

298c1ea

Add verify test

6109612

Format

bc12c3f

Skip layernorm multi out fusion

a827c28

Format

4120dbb

Update license

45eb6e4

pfultz2 mentioned this pull request Apr 15, 2025

find_splits::is_dependent refactor #3953

Draft
