Building wheel error during installation #978

Closed
Drzhishi opened this issue Jul 1, 2024 · 3 comments
Labels
bug Something isn't working build Build system

Comments

@Drzhishi
Drzhishi commented Jul 1, 2024

I manually downloaded flash-attn, then ran 'pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable' to install Transformer Engine. The install failed with 'Building wheel for transformer_engine (setup.py) ... error'.

Environment: torch 2.2, CUDA 11.8

(tuling) xx@DESKTOP-UA3C67F:~/ChatTTS$ pip install git+https://github.com/NVIDIA/TransformerEngine.git@stable
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting git+https://github.com/NVIDIA/TransformerEngine.git@stable
Cloning https://github.com/NVIDIA/TransformerEngine.git (to revision stable) to /tmp/pip-req-build-9lezr884
Running command git clone --filter=blob:none --quiet https://github.com/NVIDIA/TransformerEngine.git /tmp/pip-req-build-9lezr884
Running command git checkout -b stable --track origin/stable
Switched to a new branch 'stable'
Branch 'stable' set up to track remote branch 'stable' from 'origin'.
Resolved https://github.com/NVIDIA/TransformerEngine.git to commit c81733f
Running command git submodule update --init --recursive -q
Preparing metadata (setup.py) ... done
Requirement already satisfied: pydantic in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from transformer_engine==1.6.0+c81733f) (2.7.4)
Requirement already satisfied: torch in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from transformer_engine==1.6.0+c81733f) (2.2.2)
Requirement already satisfied: flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from transformer_engine==1.6.0+c81733f) (2.4.2)
Requirement already satisfied: einops in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6->transformer_engine==1.6.0+c81733f) (0.8.0)
Requirement already satisfied: packaging in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6->transformer_engine==1.6.0+c81733f) (24.1)
Requirement already satisfied: ninja in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from flash-attn!=2.0.9,!=2.1.0,<=2.4.2,>=2.0.6->transformer_engine==1.6.0+c81733f) (1.11.1.1)
Requirement already satisfied: annotated-types>=0.4.0 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from pydantic->transformer_engine==1.6.0+c81733f) (0.7.0)
Requirement already satisfied: pydantic-core==2.18.4 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from pydantic->transformer_engine==1.6.0+c81733f) (2.18.4)
Requirement already satisfied: typing-extensions>=4.6.1 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from pydantic->transformer_engine==1.6.0+c81733f) (4.11.0)
Requirement already satisfied: filelock in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from torch->transformer_engine==1.6.0+c81733f) (3.13.1)
Requirement already satisfied: sympy in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from torch->transformer_engine==1.6.0+c81733f) (1.12)
Requirement already satisfied: networkx in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from torch->transformer_engine==1.6.0+c81733f) (3.2.1)
Requirement already satisfied: jinja2 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from torch->transformer_engine==1.6.0+c81733f) (3.1.4)
Requirement already satisfied: fsspec in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from torch->transformer_engine==1.6.0+c81733f) (2024.6.1)
Requirement already satisfied: MarkupSafe>=2.0 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from jinja2->torch->transformer_engine==1.6.0+c81733f) (2.1.3)
Requirement already satisfied: mpmath>=0.19 in /home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages (from sympy->torch->transformer_engine==1.6.0+c81733f) (1.3.0)
Building wheels for collected packages: transformer_engine
Building wheel for transformer_engine (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [242 lines of output]
Could not determine CUDA Toolkit version
/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/__init__.py:81: _DeprecatedInstaller: setuptools.installer and fetch_build_eggs are deprecated.
!!

          ********************************************************************************
          Requirements should be satisfied by a PEP 517 installer.
          If you are using pip, you can try `pip install --use-pep517`.
          ********************************************************************************

  !!
    dist.fetch_build_eggs(dist.setup_requires)
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-cpython-310
  creating build/lib.linux-x86_64-cpython-310/transformer_engine
  copying transformer_engine/_version.py -> build/lib.linux-x86_64-cpython-310/transformer_engine
  copying transformer_engine/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/sharding.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/softmax.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/cpp_extensions.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/fused_attn.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/fp8.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/layernorm.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/mlp.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/dot.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  copying transformer_engine/jax/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/common
  copying transformer_engine/common/utils.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/common
  copying transformer_engine/common/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/common
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/float8_tensor.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/utils.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/numerics_debug.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/export.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/softmax.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/cpu_offload.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/graph.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/attention.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/jit.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/distributed.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/fp8.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/constants.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/transformer.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/te_onnx_extensions.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  copying transformer_engine/pytorch/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/utils.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/recompute.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/cpp_extensions.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/distributed.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/fp8.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/constants.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/profile.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  copying transformer_engine/paddle/fp8_buffer.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/jax/flax
  copying transformer_engine/jax/flax/module.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax/flax
  copying transformer_engine/jax/flax/transformer.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax/flax
  copying transformer_engine/jax/flax/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax/flax
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/jax/praxis
  copying transformer_engine/jax/praxis/module.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax/praxis
  copying transformer_engine/jax/praxis/transformer.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax/praxis
  copying transformer_engine/jax/praxis/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/jax/praxis
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/common/recipe
  copying transformer_engine/common/recipe/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/common/recipe
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/layernorm_mlp.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/layernorm.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/layernorm_linear.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/base.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/rmsnorm.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/linear.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/_common.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  copying transformer_engine/pytorch/module/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/module
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/gemm.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/transpose.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/fused_attn.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/cast.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/normalization.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  copying transformer_engine/pytorch/cpp_extensions/activation.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/pytorch/cpp_extensions
  creating build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/softmax.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/attention.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/layernorm_mlp.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/layernorm.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/layernorm_linear.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/base.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/transformer.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/rmsnorm.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/linear.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  copying transformer_engine/paddle/layer/__init__.py -> build/lib.linux-x86_64-cpython-310/transformer_engine/paddle/layer
  running build_ext
  Building CMake extension transformer_engine
  Running command /tmp/pip-req-build-9lezr884/.eggs/cmake-3.29.6-py3.10-linux-x86_64.egg/cmake/data/bin/cmake -S /tmp/pip-req-build-9lezr884/transformer_engine -B /tmp/pip-req-build-9lezr884/build/cmake -DPython_EXECUTABLE=/home/cx/anaconda3/envs/tuling/bin/python -DPython_INCLUDE_DIR=/home/cx/anaconda3/envs/tuling/include/python3.10 -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/tmp/pip-req-build-9lezr884/build/lib.linux-x86_64-cpython-310 -GNinja
  -- The CUDA compiler identification is NVIDIA 11.8.89
  -- The CXX compiler identification is GNU 11.4.0
  -- Detecting CUDA compiler ABI info
  -- Detecting CUDA compiler ABI info - done
  -- Check for working CUDA compiler: /usr/local/cuda-11.8/bin/nvcc - skipped
  -- Detecting CUDA compile features
  -- Detecting CUDA compile features - done
  -- Detecting CXX compiler ABI info
  -- Detecting CXX compiler ABI info - done
  -- Check for working CXX compiler: /usr/bin/c++ - skipped
  -- Detecting CXX compile features
  -- Detecting CXX compile features - done
  -- Found CUDAToolkit: /usr/local/cuda-11.8/targets/x86_64-linux/include (found version "11.8.89")
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD
  -- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
  -- Found Threads: TRUE
  -- cudnn found at /usr/local/cuda-11.8/lib64/libcudnn.so.
  CMake Warning (dev) at /tmp/pip-req-build-9lezr884/.eggs/cmake-3.29.6-py3.10-linux-x86_64.egg/cmake/data/share/cmake-3.29/Modules/FindPackageHandleStandardArgs.cmake:438 (message):
    The package name passed to `find_package_handle_standard_args` (LIBRARY)
    does not match the name of the calling package (CUDNN).  This can lead to
    problems in calling code that expects `find_package` result variables
    (e.g., `_FOUND`) to follow a certain pattern.
  Call Stack (most recent call first):
    cmake/FindCUDNN.cmake:44 (find_package_handle_standard_args)
    CMakeLists.txt:24 (find_package)
  This warning is for project developers.  Use -Wno-dev to suppress it.

  -- Found LIBRARY: /usr/local/cuda-11.8/targets/x86_64-linux/include
  -- cuDNN: /usr/local/cuda-11.8/lib64/libcudnn.so
  -- cuDNN: /usr/local/cuda-11.8/targets/x86_64-linux/include
  -- cudnn_adv_infer found at /usr/local/cuda-11.8/lib64/libcudnn_adv_infer.so.
  -- cudnn_adv_train found at /usr/local/cuda-11.8/lib64/libcudnn_adv_train.so.
  -- cudnn_cnn_infer found at /usr/local/cuda-11.8/lib64/libcudnn_cnn_infer.so.
  -- cudnn_cnn_train found at /usr/local/cuda-11.8/lib64/libcudnn_cnn_train.so.
  -- cudnn_ops_infer found at /usr/local/cuda-11.8/lib64/libcudnn_ops_infer.so.
  -- cudnn_ops_train found at /usr/local/cuda-11.8/lib64/libcudnn_ops_train.so.
  -- Found Python: /home/cx/anaconda3/envs/tuling/bin/python (found version "3.10.14") found components: Interpreter Development.Module
  -- JAX support: OFF
  -- Configuring done (9.9s)
  -- Generating done (0.0s)
  -- Build files have been written to: /tmp/pip-req-build-9lezr884/build/cmake
  Running command /tmp/pip-req-build-9lezr884/.eggs/cmake-3.29.6-py3.10-linux-x86_64.egg/cmake/data/bin/cmake --build /tmp/pip-req-build-9lezr884/build/cmake
  [1/32] Building CXX object common/CMakeFiles/transformer_engine.dir/transformer_engine.cpp.o
  [2/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/gemm/cublaslt_gemm.cu.o
  /tmp/pip-req-build-9lezr884/transformer_engine/common/gemm/cublaslt_gemm.cu(73): warning #550-D: variable "counter" was set but never used

  /tmp/pip-req-build-9lezr884/transformer_engine/common/gemm/cublaslt_gemm.cu(73): warning #550-D: variable "counter" was set but never used

  /tmp/pip-req-build-9lezr884/transformer_engine/common/gemm/cublaslt_gemm.cu(73): warning #550-D: variable "counter" was set but never used

  /tmp/pip-req-build-9lezr884/transformer_engine/common/gemm/cublaslt_gemm.cu(73): warning #550-D: variable "counter" was set but never used

  [3/32] Building CXX object common/CMakeFiles/transformer_engine.dir/layer_norm/ln_api.cpp.o
  [4/32] Building CXX object common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn.cpp.o
  [5/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/transpose/transpose.cu.o
  [6/32] Building CXX object common/CMakeFiles/transformer_engine.dir/rmsnorm/rmsnorm_api.cpp.o
  [7/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/transpose/transpose_fusion.cu.o
  [8/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/activation/swiglu.cu.o
  [9/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/activation/relu.cu.o
  [10/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/util/cast.cu.o
  [11/32] Building CXX object common/CMakeFiles/transformer_engine.dir/util/cuda_driver.cpp.o
  [12/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/rmsnorm/rmsnorm_bwd_semi_cuda_kernel.cu.o
  [13/32] Building CXX object common/CMakeFiles/transformer_engine.dir/util/cuda_runtime.cpp.o
  [14/32] Building CXX object common/CMakeFiles/transformer_engine.dir/util/rtc.cpp.o
  [15/32] Building CXX object common/CMakeFiles/transformer_engine.dir/util/system.cpp.o
  [16/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/activation/gelu.cu.o
  [17/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_rope/fused_rope.cu.o
  [18/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/recipe/delayed_scaling.cu.o
  [19/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/transpose/cast_transpose.cu.o
  [20/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/layer_norm/ln_bwd_semi_cuda_kernel.cu.o
  [21/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/transpose/multi_cast_transpose.cu.o
  [22/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_softmax/scaled_aligned_causal_masked_softmax.cu.o
  [23/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_softmax/scaled_masked_softmax.cu.o
  [24/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/rmsnorm/rmsnorm_fwd_cuda_kernel.cu.o
  [25/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_softmax/scaled_upper_triang_masked_softmax.cu.o
  [26/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/transpose/cast_transpose_fusion.cu.o
  [27/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_fp8.cu.o
  FAILED: common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_fp8.cu.o
  /usr/local/cuda-11.8/bin/nvcc -forward-unknown-to-host-compiler -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-9lezr884/transformer_engine -I/tmp/pip-req-build-9lezr884/transformer_engine/common/include -I/tmp/pip-req-build-9lezr884/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-9lezr884/build/cmake/common/string_headers -isystem /usr/local/cuda-11.8/targets/x86_64-linux/include --threads 4 --expt-relaxed-constexpr -O3 -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -Xcompiler=-fPIC -MD -MT common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_fp8.cu.o -MF common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_fp8.cu.o.d -x cu -c /tmp/pip-req-build-9lezr884/transformer_engine/common/fused_attn/fused_attn_fp8.cu -o common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_fp8.cu.o
  Killed
  [28/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_arbitrary_seqlen.cu.o
  FAILED: common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_arbitrary_seqlen.cu.o
  /usr/local/cuda-11.8/bin/nvcc -forward-unknown-to-host-compiler -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-9lezr884/transformer_engine -I/tmp/pip-req-build-9lezr884/transformer_engine/common/include -I/tmp/pip-req-build-9lezr884/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-9lezr884/build/cmake/common/string_headers -isystem /usr/local/cuda-11.8/targets/x86_64-linux/include --threads 4 --expt-relaxed-constexpr -O3 -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -Xcompiler=-fPIC -MD -MT common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_arbitrary_seqlen.cu.o -MF common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_arbitrary_seqlen.cu.o.d -x cu -c /tmp/pip-req-build-9lezr884/transformer_engine/common/fused_attn/fused_attn_f16_arbitrary_seqlen.cu -o common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_arbitrary_seqlen.cu.o
  Killed
  Killed
  Killed
  [29/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_max512_seqlen.cu.o
  FAILED: common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_max512_seqlen.cu.o
  /usr/local/cuda-11.8/bin/nvcc -forward-unknown-to-host-compiler -Dtransformer_engine_EXPORTS -I/tmp/pip-req-build-9lezr884/transformer_engine -I/tmp/pip-req-build-9lezr884/transformer_engine/common/include -I/tmp/pip-req-build-9lezr884/transformer_engine/common/../../3rdparty/cudnn-frontend/include -I/tmp/pip-req-build-9lezr884/build/cmake/common/string_headers -isystem /usr/local/cuda-11.8/targets/x86_64-linux/include --threads 4 --expt-relaxed-constexpr -O3 -O3 -DNDEBUG -std=c++17 "--generate-code=arch=compute_70,code=[compute_70,sm_70]" "--generate-code=arch=compute_80,code=[compute_80,sm_80]" "--generate-code=arch=compute_89,code=[compute_89,sm_89]" "--generate-code=arch=compute_90,code=[compute_90,sm_90]" -Xcompiler=-fPIC -MD -MT common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_max512_seqlen.cu.o -MF common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_max512_seqlen.cu.o.d -x cu -c /tmp/pip-req-build-9lezr884/transformer_engine/common/fused_attn/fused_attn_f16_max512_seqlen.cu -o common/CMakeFiles/transformer_engine.dir/fused_attn/fused_attn_f16_max512_seqlen.cu.o
  Killed
  Killed
  [30/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/layer_norm/ln_fwd_cuda_kernel.cu.o
  [31/32] Building CUDA object common/CMakeFiles/transformer_engine.dir/fused_attn/utils.cu.o
  ninja: build stopped: subcommand failed.
  Traceback (most recent call last):
    File "/tmp/pip-req-build-9lezr884/setup.py", line 336, in _build_cmake
      subprocess.run(command, cwd=build_dir, check=True)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/subprocess.py", line 526, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['/tmp/pip-req-build-9lezr884/.eggs/cmake-3.29.6-py3.10-linux-x86_64.egg/cmake/data/bin/cmake', '--build', '/tmp/pip-req-build-9lezr884/build/cmake']' returned non-zero exit status 1.

  During handling of the above exception, another exception occurred:

  Traceback (most recent call last):
    File "<string>", line 2, in <module>
    File "<pip-setuptools-caller>", line 34, in <module>
    File "/tmp/pip-req-build-9lezr884/setup.py", line 617, in <module>
      main()
    File "/tmp/pip-req-build-9lezr884/setup.py", line 602, in main
      setuptools.setup(
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/__init__.py", line 104, in setup
      return distutils.core.setup(**attrs)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 184, in setup
      return run_commands(dist)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/core.py", line 200, in run_commands
      dist.run_commands()
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/wheel/bdist_wheel.py", line 368, in run
      self.run_command("build")
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/command/build.py", line 132, in run
      self.run_command(cmd_name)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/cmd.py", line 316, in run_command
      self.distribution.run_command(command)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/dist.py", line 967, in run_command
      super().run_command(command)
    File "/home/cx/anaconda3/envs/tuling/lib/python3.10/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "/tmp/pip-req-build-9lezr884/setup.py", line 368, in run
      ext._build_cmake(
    File "/tmp/pip-req-build-9lezr884/setup.py", line 338, in _build_cmake
      raise RuntimeError(f"Error when running CMake: {e}")
  RuntimeError: Error when running CMake: Command '['/tmp/pip-req-build-9lezr884/.eggs/cmake-3.29.6-py3.10-linux-x86_64.egg/cmake/data/bin/cmake', '--build', '/tmp/pip-req-build-9lezr884/build/cmake']' returned non-zero exit status 1.
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for transformer_engine
Running setup.py clean for transformer_engine
Failed to build transformer_engine
ERROR: Could not build wheels for transformer_engine, which is required to install pyproject.toml-based projects

How can I get the installation to succeed, and is the failure related to CMake? I would be very grateful for a detailed answer.
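
In the log above, each FAILED ninja target is followed by one or more "Killed" lines. That usually means the compiler process received SIGKILL, often from the Linux out-of-memory killer, although the log alone cannot confirm it. Below is a minimal sketch for listing those targets from a saved copy of the build output. The function name `killed_targets` is hypothetical, and the parsing assumes the ninja output layout seen in this log.

```python
import re

# Matches ninja failure lines such as "FAILED: common/.../fused_attn_fp8.cu.o"
FAILED_RE = re.compile(r"^\s*FAILED:\s+(\S+)")
# Matches ninja progress lines such as "[27/32] Building CUDA object ..."
STEP_RE = re.compile(r"^\s*\[\d+/\d+\]")


def killed_targets(log_text):
    """Map each FAILED target to the number of 'Killed' lines that follow it."""
    counts = {}
    current = None
    for line in log_text.splitlines():
        match = FAILED_RE.match(line)
        if match:
            current = match.group(1)
            counts.setdefault(current, 0)
        elif line.strip() == "Killed" and current is not None:
            counts[current] += 1
        elif STEP_RE.match(line):
            # A new build step started, so later lines belong to another target.
            current = None
    return counts
```

Applied to this log, such a scan would point at the three fused_attn CUDA objects, which is consistent with resource exhaustion rather than a source-level compile error.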

@timmoon10
Collaborator

We use Ninja to parallelize the build process and I suspect it's overwhelming your system resources. We're thinking about ways to handle this more gracefully, but for now can you try running with CMAKE_BUILD_PARALLEL_LEVEL=1 in your environment? You may also want to see #976 (comment).
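
The suggestion above can be sketched as a small Python wrapper that sets CMAKE_BUILD_PARALLEL_LEVEL=1 for the pip subprocess only. The helper names (`serial_build_env`, `pip_install_command`, `install_serially`) are hypothetical. Whether the variable is honored depends on the Transformer Engine version being built, so treat this as a sketch, not a guaranteed fix.

```python
import os
import subprocess
import sys

# Same install target as in the original report.
TE_URL = "git+https://github.com/NVIDIA/TransformerEngine.git@stable"


def serial_build_env(base_env=None):
    """Return a copy of the environment with CMake build parallelism capped at 1."""
    env = dict(os.environ if base_env is None else base_env)
    env["CMAKE_BUILD_PARALLEL_LEVEL"] = "1"
    return env


def pip_install_command(url=TE_URL):
    """Build the pip command, tied to the interpreter running this script."""
    return [sys.executable, "-m", "pip", "install", url]


def install_serially():
    """Run the install; raises subprocess.CalledProcessError if pip fails."""
    subprocess.run(pip_install_command(), env=serial_build_env(), check=True)
```

Calling `install_serially()` from the activated conda environment has the same effect as exporting the variable in the shell before running pip. A serial build takes longer but keeps peak memory use lower.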

timmoon10 added the bug and build labels Jul 1, 2024
@timmoon10
Collaborator

With #987, you can control the number of parallel build jobs with the MAX_JOBS environment variable.
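
One way to choose a MAX_JOBS value is to bound it by available memory as well as CPU count. The sketch below assumes Linux (it reads /proc/meminfo). The default of 8 GiB per job is an assumption for illustration only, not a measured figure for Transformer Engine. In the log above each nvcc invocation is also passed --threads 4, so the real per-job footprint may be higher. The function names are hypothetical.

```python
import os


def available_memory_gib(meminfo_path="/proc/meminfo"):
    """Read MemAvailable (reported in kB) from a Linux meminfo file, in GiB."""
    with open(meminfo_path) as handle:
        for line in handle:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1]) / (1024 * 1024)
    raise RuntimeError("MemAvailable not found in " + meminfo_path)


def suggest_max_jobs(mem_gib, gib_per_job=8.0, cpu_count=None):
    """Pick a job count limited by memory and CPU count, never below 1.

    gib_per_job is an assumed budget per compile job, not a measured value.
    """
    cpus = cpu_count if cpu_count is not None else (os.cpu_count() or 1)
    by_memory = int(mem_gib // gib_per_job)
    return max(1, min(cpus, by_memory))
```

The result can be exported before the install, for example `os.environ["MAX_JOBS"] = str(suggest_max_jobs(available_memory_gib()))`. If the build is still killed, dropping to MAX_JOBS=1 is the conservative fallback.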

timmoon10 mentioned this issue Aug 9, 2024
@timmoon10
Collaborator

Current guidance for disabling parallel builds: #1077 (comment)
