I'm using the attached install_gpu_driver.sh on Dataproc 2.2. The GPU is not being recognized in Spark. The installation logs are attached for reference:
dataproc-gpu-main.txt
dataproc-initialization-script-0.log
install_gpu_driver.txt
Command:
Library: tensorflow[and-cuda]
import tensorflow as tf
print(tf.config.list_physical_devices('CPU'))
print(tf.config.list_physical_devices('GPU'))
Log:
2025-02-04 06:04:29.413125: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2025-02-04 06:04:31.327492: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1738649071.978220 80542 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738649072.400870 80542 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-02-04 06:04:36.546724: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
2025-02-04 06:04:50.776297: E external/local_xla/xla/stream_executor/cuda/cuda_driver.cc:152] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error
2025-02-04 06:04:50.776349: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:137] retrieving CUDA diagnostic information for host: gpu-nvidia-l4-a363a292-a17a0463-m
2025-02-04 06:04:50.776359: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:144] hostname: gpu-nvidia-l4-a363a292-a17a0463-m
2025-02-04 06:04:50.776482: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:168] libcuda reported version is: 570.86.15
2025-02-04 06:04:50.776521: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:172] kernel reported version is: 570.86.15
2025-02-04 06:04:50.776531: I external/local_xla/xla/stream_executor/cuda/cuda_diagnostics.cc:259] kernel version seems to match DSO: 570.86.15
[PhysicalDevice(name='/physical_device:CPU:0', device_type='CPU')]
[]
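
The log above shows libcuda and the kernel module both reporting 570.86.15, yet cuInit fails with CUDA_ERROR_UNKNOWN, so TensorFlow alone does not say much about why. A minimal sketch of a host-side check that does not depend on TensorFlow might help narrow this down. The helper name `collect_gpu_diagnostics` is hypothetical, and the sketch assumes the usual Linux NVIDIA driver locations (`/dev/nvidia*`, `/proc/driver/nvidia/version`) and that `nvidia-smi` is on the PATH if the driver installed cleanly; none of that is confirmed for this Dataproc image.

```python
import glob
import os
import shutil
import subprocess


def collect_gpu_diagnostics():
    """Gather basic host-side facts about the NVIDIA driver (hypothetical helper)."""
    info = {}

    # Device nodes normally created by the driver, e.g. /dev/nvidia0, /dev/nvidiactl.
    info["device_nodes"] = sorted(glob.glob("/dev/nvidia*"))

    # Version string exposed by the loaded kernel module, if any.
    version_path = "/proc/driver/nvidia/version"
    try:
        with open(version_path) as fh:
            info["kernel_driver"] = fh.read().strip()
    except OSError:
        info["kernel_driver"] = None

    # nvidia-smi -L lists the GPUs the driver can see.
    smi = shutil.which("nvidia-smi")
    info["nvidia_smi_path"] = smi
    if smi:
        try:
            result = subprocess.run(
                [smi, "-L"], capture_output=True, text=True, timeout=30
            )
            info["nvidia_smi_output"] = (result.stdout or result.stderr).strip()
        except (OSError, subprocess.SubprocessError) as exc:
            info["nvidia_smi_output"] = f"failed: {exc}"
    else:
        info["nvidia_smi_output"] = None

    return info


if __name__ == "__main__":
    for key, value in collect_gpu_diagnostics().items():
        print(f"{key}: {value}")
```

Running this on the affected node and comparing it with the TensorFlow output could show whether the failure sits at the driver level (no device nodes, `nvidia-smi` failing) or only inside the CUDA runtime that TensorFlow loads.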