Memory management when using amd blis with block2 #30

Open
floatingCatty opened this issue Feb 20, 2025 · 9 comments

Comments

@floatingCatty

Hi!

I recently tried using AMD-blis and AMD-libflame as the linear algebra backend for numerical calculations. I linked these libraries into the block2 software, but during the iterative computation the memory usage keeps growing. The corresponding issue in the pyblock2 package is: block-hczhai/block2-preview#163

The developer identified that the problem might be caused by the AMD-blis library. Can you help me check why this is the case and how to address it? Thanks!

Best,

Zhanghao

@kvaragan
Collaborator

Hi, can you write a mail to [email protected]?
If it's a direct issue with AOCL BLAS, we can address it here.
Thanks!

@kvaragan
Collaborator

Thanks for reporting. The details are mentioned here: block-hczhai/block2-preview#163
Will check!

@floatingCatty
Author

> Hi, can you write a mail to [email protected]? If it's a direct issue with AOCL BLAS, we can address it here. Thanks!

Thanks for the quick reply!

Yes, it might be a direct issue with AOCL BLAS. Do I still need to write an email?

@kvaragan
Collaborator

Please do write.
In the meantime, can you try out the dev branch: https://github.com/amd/blis/tree/dev
We update this branch biweekly.

$ ./configure --enable-cblas -t openmp amdzen --prefix=
$ make -j
$ make install

Building with sup disabled will also work: the sup code still gets built, but we don't call those functions.

@floatingCatty
Author

> Please do write. In the meantime, can you try out the dev branch: https://github.com/amd/blis/tree/dev We update this branch biweekly.
>
> $ ./configure --enable-cblas -t openmp amdzen --prefix=
> $ make -j
> $ make install
>
> Building with sup disabled will also work: the sup code still gets built, but we don't call those functions.

Hi, I just built the dev branch using:

./configure --prefix=$PWD/install --disable-sup-handling --enable-cblas -t openmp auto

The code has to run for a while before the memory growth becomes visible, so I will report back. I hope it works. Thanks.

@floatingCatty
Author

Hi, the new dev branch didn't help. I have switched to OpenBLAS for now. I tested with both MKL and OpenBLAS, and both worked well without this problem. I hope this report helps to identify some possible issues.
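
When comparing backends like this, it can help to confirm which BLAS/LAPACK shared library a run actually loaded. The following is only a rough sketch under some assumptions: it is Linux-specific (it parses the text of /proc/<pid>/maps), the helper name `loaded_blas_libraries` is made up for illustration, and the list of file-name hints is a guess rather than an exhaustive list.

```python
import os

# Substrings that commonly appear in BLAS/LAPACK shared-library file names.
# This list is an assumption for illustration, not an exhaustive registry.
BLAS_HINTS = ("blis", "flame", "openblas", "mkl")


def loaded_blas_libraries(maps_text):
    """Return the sorted, unique shared-object paths found in a
    /proc/<pid>/maps dump whose file name contains one of BLAS_HINTS."""
    found = set()
    for line in maps_text.splitlines():
        # Fields: address, perms, offset, dev, inode, pathname (optional).
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping without a backing file
        path = fields[5].strip()
        name = path.rsplit("/", 1)[-1].lower()
        if ".so" in name and any(hint in name for hint in BLAS_HINTS):
            found.add(path)
    return sorted(found)


if __name__ == "__main__" and os.path.exists("/proc/self/maps"):
    # Import the package under test before this point so that its
    # native libraries are already mapped into the process.
    with open("/proc/self/maps") as fh:
        for lib in loaded_blas_libraries(fh.read()):
            print(lib)
```

Running this inside the same Python process after importing the package under test should list the mapped libraries, which makes it easier to tell whether a given run really used AMD BLIS, OpenBLAS, or MKL.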

@sireeshasanga
Collaborator

Thank you for sharing the issue details. We will analyze further and keep you posted.

@kvaragan
Collaborator

> Hi, the new dev branch didn't help. I have switched to OpenBLAS for now. I tested with both MKL and OpenBLAS, and both worked well without this problem. I hope this report helps to identify some possible issues.

Hi, how do we reproduce this issue?

@floatingCatty
Author

> > Hi, the new dev branch didn't help. I have switched to OpenBLAS for now. I tested with both MKL and OpenBLAS, and both worked well without this problem. I hope this report helps to identify some possible issues.
>
> Hi, how do we reproduce this issue?

Hi, I just sent a mail to the toolchains support address with the details for reproduction. I have also copied the reproduction info here:

I installed the latest block2 from its repository, block-hczhai/block2-preview (Efficient parallel quantum chemistry DMRG in MPO formalism).
I am using Python 3.10.13, and the system info is "Linux frea 4.19.87-amd64-nvm #4 SMP Thu Dec 12 07:06:36 EST 2019 x86_64 GNU/Linux".

I used this command to build block2:

cmake .. -DBUILD_LIB=ON -DLARGE_BOND=ON -DMPI=ON -DUSE_MKL=OFF -DF77UNDERSCORE=ON -DUSE_COMPLEX=ON -DUSE_SG=ON -DBLAS_LIBRARIES=Path_to_AMD_BLIS -DLAPACK_LIBRARIES=path_to_amd_libflame -DCMAKE_BUILD_TYPE=Release -DCMAKE_CXX_FLAGS="-O3 -march=native"

To reproduce the memory growth, use the example I gave in the block2 issue:
block-hczhai/block2-preview#163 (comment)

The example is in the params.zip file; the reply contains code that can run it once block2-preview is installed.

Thanks!
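
For anyone trying to quantify the growth while reproducing, a minimal sketch of a per-iteration memory tracker is given below. It is not taken from block2 or BLIS: the helper names are hypothetical, `step` is a placeholder for one iteration of the workload (for example one sweep), and it relies on Python's standard `resource` module, which reports peak resident set size (in KiB on Linux). Native allocations made inside a BLAS library are not visible to Python-level tools such as `tracemalloc`, which is why the process-level RSS is sampled here.

```python
import resource


def peak_rss():
    """Peak resident set size of the current process.

    On Linux the unit is KiB (macOS reports bytes).
    """
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def track_peak_rss(step, n_iters):
    """Call step() n_iters times and record the peak RSS after each call.

    `step` is a placeholder callable standing in for one iteration of
    the real workload.
    """
    samples = []
    for _ in range(n_iters):
        step()
        samples.append(peak_rss())
    return samples


def growth_per_iter(samples):
    """Average increase in peak RSS per iteration over the recorded run."""
    if len(samples) < 2:
        return 0.0
    return (samples[-1] - samples[0]) / (len(samples) - 1)
```

Because the peak value never decreases, a flat series suggests a stable footprint, while a steadily rising series across many iterations is consistent with the growth described in this issue. Comparing `growth_per_iter` between runs linked against different BLAS backends would give a simple number to report.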
