Question: Externally owned data in hypre_CSRMatrix #1095
I am re-opening this to update it to what seems to be working. I have only medium confidence here, so grain of salt. Any input on whether this is valid would still be appreciated. The setup is as follows. This is not a direct copy of the code but should get the point across.
A few of these steps are non-trivial and potentially wrong. Step 6 allocates the […] Step 13 and the layouts of […]
@ictmay thank you for the update, and apologies for the delay. On the surface, the logic seems correct, including your comments about column ordering (except for the remaining column indices being sorted in ascending order: that is not strictly necessary but advisable). Is this implementation openly available somewhere in AMP so we can take a look? We can also discuss adding […]

Thank you!
@victorapm No worries on the delay, and thank you for getting back to us. For access to the source I'll have to talk to Bobby. I only work on the internal (restricted enclave) version, and I'm not sure what all has made it into the externally viewable version.

Regarding the […]

The column map isn't so large (except when it is...), so it could easily fall into the same category as the […]

A nice solution that would avoid most of the touchy steps above, though one that would take some non-zero dev time and testing time, would be to make […]

I don't know how often people want to do shallow-copies into Hypre, and without broader demand it makes more sense to have this as an interface we maintain rather than something that gets added to Hypre.
@victorapm All of the shallow copy code is present in our external code here: […]

Lines 41 -- 43 create the IJMatrix, and the process described above starts on line 136 and continues to the end of the file.
Hi @ictmay, I'm sorry I dropped the ball on this issue! I looked at the link you shared in the previous post and saw there have been some recent changes in AMP. Is this still an issue for you? Is there something we can help with, e.g., providing an ownership flag for […]
@victorapm No worries. Short answer, yes a flag guarding […]

There have been some changes from our side, but the majority of the shallow copy approach is the same as outlined above. The main points are swapping non-zeros per row for the starting positions of each row, and deleting the global column indices when no longer needed. We have since tested on more machines and I have some more confidence that the above is valid, at least for now. Adding a guard around […]

The biggest hurdle was getting column ordering within each row correct (diag block with diagonal element first and ascending after that, offd ascending order on local column specifically). Having that documented somewhere would be helpful. Even better would be a higher level function to call that would do the sort. That way if the expected order changes we wouldn't have to rewrite this.

The ideal would be some function to create a ParCSR matrix from user data abstracting all of this away. It would take in diag/offd row pointers, local column indices, data fields, and offd column map from the user and a flag to allow/disallow mutation of them. When disallowed Hypre would check that everything is compatible with its assumptions (e.g. column ordering) and error if not. That last point is not needed by us presently, but if there is broader interest in supporting these shallow copies that would go a long way.
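For anyone else hitting the column ordering hurdle, here is a minimal standalone sketch of the per-row ordering described above: diagonal entry first and the rest ascending for the diag block, plain ascending for offd. `csr_sort_rows` and `entry_t` are hypothetical names made up for illustration, not hypre functions, and the sketch assumes local column indices with no duplicate columns inside a row.

```c
#include <assert.h>
#include <stdlib.h>

/* One (column, value) pair of a CSR row. */
typedef struct { int col; double val; } entry_t;

static int cmp_entry(const void *a, const void *b)
{
    const entry_t *x = (const entry_t *)a;
    const entry_t *y = (const entry_t *)b;
    return (x->col > y->col) - (x->col < y->col);
}

/* Hypothetical helper (not part of hypre). Sorts every row of a local CSR
 * block by ascending column index, moving values along with their columns.
 * If diag_first is nonzero, the entry whose column equals the row index is
 * placed first (when present) and the remaining entries follow in ascending
 * order. Assumes no duplicate columns within a row.
 * Returns 0 on success, -1 on allocation failure. */
int csr_sort_rows(int nrows, const int *row_ptr, int *cols, double *vals,
                  int diag_first)
{
    for (int r = 0; r < nrows; ++r) {
        int begin = row_ptr[r];
        int len = row_ptr[r + 1] - begin;
        assert(len >= 0);
        if (len < 2) continue;

        entry_t *tmp = (entry_t *)malloc((size_t)len * sizeof *tmp);
        if (!tmp) return -1;
        for (int k = 0; k < len; ++k) {
            tmp[k].col = cols[begin + k];
            tmp[k].val = vals[begin + k];
        }
        qsort(tmp, (size_t)len, sizeof *tmp, cmp_entry);

        int out = begin;
        int d = -1; /* position of the diagonal entry in tmp, if any */
        if (diag_first) {
            for (int k = 0; k < len; ++k) {
                if (tmp[k].col == r) { d = k; break; }
            }
            if (d >= 0) {
                cols[out] = tmp[d].col;
                vals[out] = tmp[d].val;
                ++out;
            }
        }
        for (int k = 0; k < len; ++k) {
            if (k == d) continue; /* diagonal already written */
            cols[out] = tmp[k].col;
            vals[out] = tmp[k].val;
            ++out;
        }
        free(tmp);
    }
    return 0;
}
```

Calling it with `diag_first = 1` on the diag block and `diag_first = 0` on the offd block would give the layout described above. Whether that layout matches what Hypre expects in every code path is exactly the part that would be nice to have documented.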
@ictmay I got confused earlier, now I see the code is already checking […]

Sounds good, I will check with the team about this, thanks for the suggestion.

Lastly, if I may suggest a performance improvement to the HypreMatrixAdaptor code: When running on GPUs, we want to minimize the number of calls to […]

Hope this helps
@victorapm I guess I was also confused. It looks like […]

If you all do decide to make a function to create a matrix from user data I would be happy to put it to use.

Regarding the performance improvement, I will fix this if I get time, but that path is not performance critical right now anyway. That should only get called if we can't do a shallow copy, which in turn only happens if the incoming matrix isn't in our own format (e.g. a user has a Trilinos matrix and wants BoomerAMG as a preconditioner).

Thank you again for all the help!
@victorapm Apologies, the intricacies of this are coming back to me in waves. The […]

The real problem is […]

Guarding those fields means that I can safely pass my equivalent field(s) in without them getting free'd by Hypre. That does open the question of whether the […]

Hopefully that makes sense.
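To make the guarding idea concrete, here is a small standalone mock of the ownership pattern being discussed. It is not hypre code: `mock_csr` and its functions are hypothetical stand-ins, and the only behavior borrowed from the thread is that the row pointer array is always freed by the struct while the column and value arrays are freed only when `owns_data` is set.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a CSR block. This is NOT hypre's struct. */
typedef struct {
    int    *i;         /* row pointers, always owned by the struct here */
    int    *j;         /* column indices, freed only if owns_data != 0  */
    double *data;      /* values, freed only if owns_data != 0          */
    int     num_rows;
    int     owns_data;
} mock_csr;

/* Allocates the struct and its row pointers; j and data start as NULL. */
mock_csr *mock_csr_create(int num_rows)
{
    mock_csr *m = (mock_csr *)calloc(1, sizeof *m);
    if (!m) return NULL;
    m->i = (int *)calloc((size_t)num_rows + 1, sizeof *m->i);
    if (!m->i) { free(m); return NULL; }
    m->num_rows = num_rows;
    m->owns_data = 1;
    return m;
}

/* Shallow copy: point at caller-owned arrays and drop ownership. */
void mock_csr_wrap(mock_csr *m, int *j, double *data)
{
    assert(m != NULL && m->j == NULL && m->data == NULL);
    m->j = j;
    m->data = data;
    m->owns_data = 0;
}

/* Frees what the struct owns. Returns the number of arrays released
 * so the guard can be checked from the outside. */
int mock_csr_destroy(mock_csr *m)
{
    int freed = 0;
    if (!m) return 0;
    free(m->i);
    ++freed;
    if (m->owns_data) { /* the guard: skip caller-owned arrays */
        free(m->j);
        free(m->data);
        freed += 2;
    }
    free(m);
    return freed;
}
```

With the guard in place, a caller can wrap its own arrays, destroy the mock, and still use those arrays afterwards. Any field freed outside such a guard is where a double-free can come from.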
@ictmay thank you for explaining everything. We could add a […]

Regarding […]
@victorapm Adding a guard around the […]

I take your point about changing the behavior of a free being touchy, though making […]
I am working with the LANL AMP team. We have a parallel CSR matrix format that should be compatible with the Hypre format. Ideally we would wrap our format into a Hypre IJMatrix with a minimal amount of duplicate data.
Question #609 provides some initial information on how to approach this. I have looked at which (de)allocations are guarded by `owns_data` inside `hypre_ParCSRMatrixCreate`, `hypre_CSRMatrixCreate`, and the matching destroy functions to decide what we should own versus what Hypre should own. The heavy data all lives in the `diag` and `offd` blocks.

At present I do the following:
1. Create an `IJMatrix` and set type to `HYPRE_PARCSR`, do not call initialize.
2. Call `hypre_IJMatrixCreateParCSR`. This sets owns_data in `ParCSR` to 1 and creates `diag` and `offd` structs.
3. Set `need_aux` to zero inside the `IJMatrix->translator`.
4. Get the `diag` and `offd` blocks and call `hypre_CSRMatrixInitialize` manually. This allocates the `->i` field which is not guarded by `owns_data`.
5. Set `owns_data` to zero in `diag` and `offd` via `hypre_CSRMatrixSetDataOwner`.
6. […] `->i` fields.
7. Set the `->big_j`, `->j`, and `->data` fields of `diag` and `offd` to reference our data. These are all guarded by `owns_data` and are the heaviest memory requirements.

The issue comes from the `ParCSR->col_map_offd` field. I'd prefer to leave `ParCSR->owns_data==1` so that it will manage the setup/teardown of the CSRMatrix structs themselves and then I can just set `owns_data` to zero in there. This means that Hypre needs to also own the `col_map_offd` field. The main benefit is that `HYPRE_IJMatrixDestroy` does (seem to) do all of the correct deallocations.

That is a long-winded set up to premise a couple of questions:
1. Should I set `ParCSR->owns_data=0` and manage more of the setup/teardown on my end?
2. […] `HYPRE_IJMatrixAssemble` to get `col_map_offd` created and filled. The internal function `hypre_IJMatrixAssembleParCSR` does not check the value of `offd->owns_data` and frees `offd->big_j`, yielding an eventual double-free (line 2922 of IJMatrix_parcsr.c).
3. […] `col_map_offd` and rearranges `offd->j`. What are the exact requirements on `col_map_offd`? It must be more than just a map from local->global indices since `offd->j` also gets shuffled.

I did add a check for the `offd->owns_data` flag into `hypre_IJMatrixAssembleParCSR`, and that does avoid the double-free issue. Surprisingly this allows BoomerAMG to work as a solver, but I think that is a fluke. This doesn't really solve the overall issue. That function still modifies values in the `big_j` and `j` arrays that it doesn't own. As a minimal solution it should check the `owns_data` flag and throw an error.

That's quite the wall of text. I appreciate any help, and please let me know if any clarifications are needed.
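For reference, this is a standalone sketch of one way an off-diagonal column map and matching local indices can be built from global column indices. It is not taken from hypre: `build_col_map` is a hypothetical helper, and it assumes the map is kept sorted ascending and duplicate-free, which would explain why the local indices change when the map is created, but I have not confirmed that this is the exact requirement.

```c
#include <assert.h>
#include <stdlib.h>

static int cmp_ll(const void *a, const void *b)
{
    long long x = *(const long long *)a;
    long long y = *(const long long *)b;
    return (x > y) - (x < y);
}

/* Hypothetical helper (not a hypre function). Given nnz global column
 * indices in big_j, builds a sorted, duplicate-free map of the global
 * columns that appear and writes into j[k] the position of big_j[k]
 * inside that map, so that map[j[k]] == big_j[k] for every k.
 * Returns a malloc'd map of length *num_cols_out (caller frees), or
 * NULL when nnz <= 0 or allocation fails. */
long long *build_col_map(int nnz, const long long *big_j, int *j,
                         int *num_cols_out)
{
    *num_cols_out = 0;
    if (nnz <= 0) return NULL;

    long long *map = (long long *)malloc((size_t)nnz * sizeof *map);
    if (!map) return NULL;
    for (int k = 0; k < nnz; ++k) map[k] = big_j[k];
    qsort(map, (size_t)nnz, sizeof *map, cmp_ll);

    /* Compact the sorted array in place to drop duplicates. */
    int n = 1;
    for (int k = 1; k < nnz; ++k) {
        if (map[k] != map[n - 1]) map[n++] = map[k];
    }

    /* Translate each global index to its local position in the map. */
    for (int k = 0; k < nnz; ++k) {
        const long long *hit = (const long long *)bsearch(
            &big_j[k], map, (size_t)n, sizeof *map, cmp_ll);
        assert(hit != NULL);
        j[k] = (int)(hit - map);
    }

    *num_cols_out = n;
    return map;
}
```

Note that this version reads `big_j` without modifying it and writes the local indices to a separate array, which is the non-mutating behavior I would want when the arrays are not owned by the library.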