I run this command: python -m web_demo.server --model_path /mnt/local_path/demo_VITA_ckpt, and it prints the following output:
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 0% Completed | 0/4 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:09<00:29, 9.71s/it]
Loading safetensors checkpoint shards: 25% Completed | 1/4 [00:10<00:30, 10.23s/it]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:22<00:22, 11.35s/it]
Loading safetensors checkpoint shards: 50% Completed | 2/4 [00:22<00:22, 11.27s/it]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:47<00:17, 17.77s/it]
Loading safetensors checkpoint shards: 75% Completed | 3/4 [00:48<00:18, 18.13s/it]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [01:14<00:00, 21.39s/it]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [01:14<00:00, 18.73s/it]
[2025-03-26 17:34:14.578] Uninitialized parameters: ['model.audio_encoder.encoder.global_cmvn.istd', 'model.audio_encoder.encoder.global_cmvn.mean']
Loading safetensors checkpoint shards: 100% Completed | 4/4 [01:15<00:00, 21.77s/it]
Loading safetensors checkpoint shards: 100% Completed | 4/4 [01:15<00:00, 18.88s/it]
[2025-03-26 17:34:15.188] Uninitialized parameters: ['model.audio_encoder.encoder.global_cmvn.istd', 'model.audio_encoder.encoder.global_cmvn.mean']
INFO 03-26 17:34:15 model_runner.py:890] Loading model weights took 15.5767 GB
INFO 03-26 17:34:15 model_runner.py:890] Loading model weights took 15.5767 GB
XGPU-lite: L-235:func: cuInit, pid: 15824, tid: 15824, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15823, tid: 15823, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15823, tid: 15823, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15824, tid: 15824, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15824, tid: 15824, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15823, tid: 15823, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15824, tid: 15824, flags: 0
XGPU-lite: L-235:func: cuInit, pid: 15823, tid: 15823, flags: 0
INFO 03-26 17:34:29 gpu_executor.py:121] # GPU blocks: 55016, # CPU blocks: 4681
INFO 03-26 17:34:30 gpu_executor.py:121] # GPU blocks: 55016, # CPU blocks: 4681
INFO 03-26 17:34:34 model_runner.py:1181] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 03-26 17:34:34 model_runner.py:1185] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing gpu_memory_utilization or enforcing eager mode. You can also reduce the max_num_seqs as needed to decrease memory usage.
INFO 03-26 17:34:35 model_runner.py:1181] Capturing the model for CUDA graphs. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI.
INFO 03-26 17:34:35 model_runner.py:1185] CUDA graphs can take additional 1~3 GiB memory per GPU. If you are running out of memory, consider decreasing gpu_memory_utilization or enforcing eager mode. You can also reduce the max_num_seqs as needed to decrease memory usage.
INFO 03-26 17:34:55 model_runner.py:1300] Graph capturing finished in 21 secs.
WARNING 03-26 17:34:55 sampling_params.py:221] temperature 0.001 is less than 0.01, which may cause numerical errors nan or inf in tensors. We have maxed it out to 0.01.
INFO 03-26 17:34:56 model_runner.py:1300] Graph capturing finished in 21 secs.
WARNING 03-26 17:34:56 sampling_params.py:221] temperature 0.001 is less than 0.01, which may cause numerical errors nan or inf in tensors. We have maxed it out to 0.01.
There seems to be no error, but I can't open this URL: https://127.0.0.1:8081
Can you give me some suggestions to solve this issue?
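In case it helps narrow things down, here is a minimal diagnostic sketch I can run on the same machine (my own script, not part of the VITA repo). It assumes the demo serves HTTPS on 127.0.0.1:8081 with a self-signed certificate, which is just my guess from the URL above:

```python
# Quick local check: is anything listening on 127.0.0.1:8081, and does it answer HTTPS?
# (Standalone diagnostic, not part of web_demo.server; port and HTTPS are assumptions.)
import socket
import ssl
import urllib.request

HOST, PORT = "127.0.0.1", 8081

# 1) Is the TCP port open at all?
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    s.settimeout(3)
    port_open = s.connect_ex((HOST, PORT)) == 0
print(f"TCP port {PORT} open: {port_open}")

# 2) Does it answer an HTTPS request? Skip certificate verification in case
#    the demo uses a self-signed certificate.
ctx = ssl.create_default_context()
ctx.check_hostname = False
ctx.verify_mode = ssl.CERT_NONE
try:
    with urllib.request.urlopen(f"https://{HOST}:{PORT}", context=ctx, timeout=5) as resp:
        print("HTTPS status:", resp.status)
except Exception as exc:
    print("HTTPS request failed:", exc)
```

One more thought: if the browser is on a different machine than the server, 127.0.0.1 is only reachable from the server itself (unless I forward the port, e.g. over SSH), so that may be part of my problem.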
@lxysl @linhaojia13 @BradyFU @longzw1997