Describe the bug
I am using InferenceClient to generate text against a TGI endpoint. To control repetitive generations I need to pass the no_repeat_ngram_size parameter, but I get the error below:
TypeError: InferenceClient.text_generation() got an unexpected keyword argument 'no_repeat_ngram_size'
The parameter is supported in transformers' GenerationConfig.
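A minimal sketch of the failing call (the endpoint URL, prompt, and other generation values are placeholders, not my actual app):

```python
from huggingface_hub import InferenceClient

# Hypothetical TGI endpoint URL, used only to illustrate the call.
client = InferenceClient(model="https://my-tgi-endpoint.example.com")

# Raises:
# TypeError: InferenceClient.text_generation() got an unexpected keyword
# argument 'no_repeat_ngram_size'
for token in client.text_generation(
    "Once upon a time",
    max_new_tokens=128,
    stream=True,
    no_repeat_ngram_size=3,  # accepted by transformers' GenerationConfig, rejected here
):
    print(token, end="")
```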
Reproduction
No response
Logs
TypeError: InferenceClient.text_generation() got an unexpected keyword argument 'no_repeat_ngram_size'
Traceback (most recent call last):
File "/opt/conda/lib/python3.10/site-packages/gradio/queueing.py", line 489, in call_prediction
output = await route_utils.call_process_api(
File "/opt/conda/lib/python3.10/site-packages/gradio/route_utils.py", line 232, in call_process_api
output = await app.get_blocks().process_api(
File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1561, in process_api
result = await self.call_function(
File "/opt/conda/lib/python3.10/site-packages/gradio/blocks.py", line 1191, in call_function
prediction = await utils.async_iteration(iterator)
File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
return await iterator.__anext__()
File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 640, in asyncgen_wrapper
async for response in f(*args, **kwargs):
File "/opt/conda/lib/python3.10/site-packages/gradio/chat_interface.py", line 481, in _stream_fn
first_response = await async_iteration(generator)
File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 519, in async_iteration
return await iterator.__anext__()
File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 512, in __anext__
return await anyio.to_thread.run_sync(
File "/opt/conda/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2134, in run_sync_in_worker_thread
return await future
File "/opt/conda/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 851, in run
result = context.run(func, *args)
File "/opt/conda/lib/python3.10/site-packages/gradio/utils.py", line 495, in run_sync_iterator_async
return next(iterator)
File "/app/app.py", line 56, in generate
for token in client.text_generation(
TypeError: InferenceClient.text_generation() got an unexpected keyword argument 'no_repeat_ngram_size'
System info
- huggingface_hub version: 0.20.3
- Platform: Linux-5.15.0-92-generic-x86_64-with-glibc2.31
- Python version: 3.10.13
- Running in iPython ?: No
- Running in notebook ?: No
- Running in Google Colab ?: No
- Token path ?: /user/.cache/huggingface/token
- Has saved token ?: True
- FastAI: N/A
- Tensorflow: N/A
- Torch: 2.1.1
- Jinja2: 3.1.3
- Graphviz: N/A
- Pydot: N/A
- Pillow: 10.2.0
- hf_transfer: 0.1.5
- gradio: 4.12.0
- tensorboard: N/A
- numpy: 1.26.3
- pydantic: 2.6.1
- aiohttp: 3.9.3
- ENDPOINT: https://huggingface.co
- HF_HUB_CACHE: /data
- HF_ASSETS_CACHE: /user/.cache/huggingface/assets
- HF_TOKEN_PATH: /user/.cache/huggingface/token
- HF_HUB_OFFLINE: False
- HF_HUB_DISABLE_TELEMETRY: False
- HF_HUB_DISABLE_PROGRESS_BARS: None
- HF_HUB_DISABLE_SYMLINKS_WARNING: False
- HF_HUB_DISABLE_EXPERIMENTAL_WARNING: False
- HF_HUB_DISABLE_IMPLICIT_TOKEN: False
- HF_HUB_ENABLE_HF_TRANSFER: True
- HF_HUB_ETAG_TIMEOUT: 10
- HF_HUB_DOWNLOAD_TIMEOUT: 10