
Disallow direct configuration of max_num_batched_tokens #75

Merged

Conversation

@sunggg (Member) commented Nov 21, 2023

Instead, we use max_num_sequences and max_input_len, since they are more intuitive.

cc. @masahi
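The intent of the change can be sketched as follows. This is an illustration only, not the PR's actual code: the derivation by multiplication and all default values are assumptions.

```python
import argparse

# Hypothetical sketch: instead of exposing --max-num-batched-tokens directly,
# the engine could derive an upper bound from the two more intuitive limits.
parser = argparse.ArgumentParser()
parser.add_argument("--max-num-sequences", type=int, default=8)
parser.add_argument("--max-input-len", type=int, default=512)
args = parser.parse_args([])  # use the (assumed) defaults for this sketch

# Worst case for one batch: every sequence is at its maximum input length.
max_num_batched_tokens = args.max_num_sequences * args.max_input_len
print(max_num_batched_tokens)  # 8 * 512 = 4096
```

Deriving the token bound this way removes one knob the user could misconfigure, which matches the PR title: the value is no longer set directly.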

@masahi (Member) left a comment

Please make sure that the benchmark and tests preserve their behavior after this change.

@@ -179,7 +178,6 @@ def main(args: argparse.Namespace):
parser.add_argument("--local-id", type=str, required=True)
parser.add_argument("--artifact-path", type=str, default="dist")
parser.add_argument("--use-staging-engine", action="store_true")
parser.add_argument("--max-num-batched-tokens", type=int, default=-1)
Review comment (Member):

Need to add max_num_sequences if you remove this

@sunggg (Member Author) replied:

Yep, just figured it out and added it :)

@@ -120,7 +119,6 @@ def test(args: argparse.Namespace):
parser.add_argument("--local-id", type=str, required=True)
parser.add_argument("--artifact-path", type=str, default="dist")
parser.add_argument("--num-shards", type=int, default=1)
parser.add_argument("--max-num-batched-tokens", type=int, default=-1)
Review comment (Member):
Need to add max_num_sequences if you remove this
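The reviewer's point, applied to the argparse block shown in the hunk above, could look like this. The flag spellings follow the PR discussion, but the default values and the example invocation are assumptions for illustration:

```python
import argparse

# Hypothetical sketch: after dropping --max-num-batched-tokens, the script
# adds the two replacement flags so the limit is still configurable indirectly.
parser = argparse.ArgumentParser()
parser.add_argument("--local-id", type=str, required=True)
parser.add_argument("--artifact-path", type=str, default="dist")
parser.add_argument("--num-shards", type=int, default=1)
parser.add_argument("--max-num-sequences", type=int, default=8)  # replaces the removed flag
parser.add_argument("--max-input-len", type=int, default=512)    # defaults are assumptions

# Example invocation for this sketch.
args = parser.parse_args(["--local-id", "test", "--max-num-sequences", "16"])
print(args.max_num_sequences)  # 16
```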

@sunggg sunggg merged commit a5deaed into octoml:batch-serving Nov 21, 2023
5 checks passed