Commit 2bcfedb
Adjust request "processing time" to current load (#189)
* Validate max-num-seqs
Signed-off-by: Qifan Deng <[email protected]>
* Validate PrefillTimeStdDev
Signed-off-by: Qifan Deng <[email protected]>
* Add param time-factor-under-load
Signed-off-by: Qifan Deng <[email protected]>
* The factor applies on time-to-first-token
Signed-off-by: Qifan Deng <[email protected]>
* Test TTFT when partially loaded
Signed-off-by: Qifan Deng <[email protected]>
* Apply time factor under load to prefill and inter token latency
Signed-off-by: Qifan Deng <[email protected]>
* Improve param desc
Signed-off-by: Qifan Deng <[email protected]>
* Use nRunningReqs instead of runReqChan
Signed-off-by: Qifan Deng <[email protected]>
* unstage manifests/dev-config.yaml
Signed-off-by: Qifan Deng <[email protected]>
* Update readme
Signed-off-by: Qifan Deng <[email protected]>
* Restore changes for inter token latency (lost during conflict resolution)
Signed-off-by: Qifan Deng <[email protected]>
* Calc inter token latency based on load instead of one-calc-for-whole request
Signed-off-by: Qifan Deng <[email protected]>
* Calc inter token latency based on load instead of one-calc-for-whole request
Signed-off-by: Qifan Deng <[email protected]>
* Move methods to simulator
Signed-off-by: Qifan Deng <[email protected]>
* Rename helper func
Signed-off-by: Qifan Deng <[email protected]>
* Rename helper func
Signed-off-by: Qifan Deng <[email protected]>
* Fix inter token latency test
Signed-off-by: Qifan Deng <[email protected]>
---------
Signed-off-by: Qifan Deng <[email protected]>
1 parent 40ec02c commit 2bcfedb
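
The commit message describes scaling time-to-first-token, prefill time, and inter-token latency by a factor that depends on the current number of running requests, recomputed per token rather than once per request. The diff content is not reproduced here, so the following is only a minimal sketch of one plausible way to do this, assuming the factor is interpolated linearly from 1.0 (a single running request) up to `time-factor-under-load` (when `max-num-seqs` requests are running). The function names `loadFactor` and `scaledLatencyMs` and their signatures are hypothetical and are not taken from the commit.

```go
package main

// loadFactor is a hypothetical helper (not the commit's actual code).
// It assumes a linear interpolation: 1.0 when at most one request is
// running, rising to timeFactorUnderLoad when nRunningReqs reaches
// maxNumSeqs.
func loadFactor(timeFactorUnderLoad float64, nRunningReqs, maxNumSeqs int) float64 {
	// With a single slot or a single running request there is no
	// contention, so the base latency is used unchanged.
	if maxNumSeqs <= 1 || nRunningReqs <= 1 {
		return 1.0
	}
	// Clamp so the factor never exceeds the configured maximum.
	if nRunningReqs > maxNumSeqs {
		nRunningReqs = maxNumSeqs
	}
	return 1.0 + (timeFactorUnderLoad-1.0)*
		float64(nRunningReqs-1)/float64(maxNumSeqs-1)
}

// scaledLatencyMs applies the factor to a base latency in milliseconds
// (for example a time-to-first-token or inter-token latency value).
// Calling it once per generated token, with the factor recomputed from
// the current load, mirrors the "based on load instead of
// one-calc-for-whole request" idea from the commit message.
func scaledLatencyMs(baseMs int, factor float64) int {
	return int(float64(baseMs) * factor)
}
```

Under these assumptions, a factor of 2.0 with 3 of 5 slots busy gives `loadFactor(2.0, 3, 5) == 1.5`, so a 100 ms base latency becomes 150 ms.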
File tree
6 files changed, +189 -19 lines changed
- pkg
- common
- llm-d-inference-sim