Proposal for batch.filter(mask) method to enhance efficiency #9681
leonardcaquot94
started this conversation in
Ideas
We absolutely need a more efficient Batch splicing method. +1. You might be able to propose a pull request to get the ball rolling on this.
@leonardcaquot94 I've created a partial solution here. If you or anyone else has feedback, please let me know. We need to update the
Here is a PR linked to this discussion: #9911
Hello PyG team!
I’d like to propose the addition of a `batch.filter(mask)` method to PyTorch Geometric, which would take a mask or a list of indices and efficiently filter the batch without using the `new_batch = Batch.from_data_list(batch[mask])` conversion. Below is a sample comparison between two approaches, demonstrating significant performance improvement.
Code Example:
Execution Results:
Explanation:
Additional Suggestions:
To make this feature more efficient, it could be beneficial to reuse some of the existing code from the collate function to handle custom node and edge attributes iteratively. Additionally, it might be useful to provide an option to return a "negative sub-batch" (i.e., the elements that are excluded from the mask) alongside the positive sub-batch.
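To make the "negative sub-batch" idea concrete, here is a minimal sketch of the index bookkeeping involved. It is not PyG code: plain Python lists stand in for tensors so the renumbering is easy to follow, and the function name `filter_flat_batch` and the flat layout `(x, batch_vec, edge_index)` are assumptions made for illustration. A real implementation would presumably do the same steps with vectorized tensor operations and would also have to handle custom attributes.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical flat layout: node features, node-to-graph vector, edge list.
FlatBatch = Tuple[List[float], List[int], List[Tuple[int, int]]]


def filter_flat_batch(
    x: List[float],
    batch_vec: List[int],
    edge_index: List[Tuple[int, int]],
    keep: Sequence[bool],
) -> Tuple[FlatBatch, FlatBatch]:
    """Return (kept, excluded) sub-batches without building per-graph objects."""

    def select(flag: bool) -> FlatBatch:
        # Renumber the selected graphs 0..k-1, preserving their order.
        new_graph_id: Dict[int, int] = {}
        for g, k in enumerate(keep):
            if bool(k) == flag:
                new_graph_id[g] = len(new_graph_id)

        # Keep the nodes of selected graphs and renumber them contiguously.
        new_node_id: Dict[int, int] = {}
        sub_x: List[float] = []
        sub_batch: List[int] = []
        for n, g in enumerate(batch_vec):
            if g in new_graph_id:
                new_node_id[n] = len(sub_x)
                sub_x.append(x[n])
                sub_batch.append(new_graph_id[g])

        # Assumption: edges never connect nodes of different graphs,
        # so checking the source node is enough.
        sub_edges = [
            (new_node_id[s], new_node_id[d])
            for s, d in edge_index
            if s in new_node_id
        ]
        return sub_x, sub_batch, sub_edges

    return select(True), select(False)
```

For example, with three graphs of 2, 1 and 2 nodes and `keep = [True, False, True]`, the kept side contains graphs 0 and 2 with nodes renumbered 0 to 3, and the excluded side contains only the middle graph.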
Usage:
Currently, when we filter batch data like `sub_batch = batch[mask]`, it returns a data list. But I believe it would be more convenient if it kept the result as a `Batch` object. This way, users can maintain the `Batch` format and call `sub_batch.to_data_list()` only when they explicitly need a data list. This would streamline operations where batch structure needs to be preserved.
I believe these improvements could greatly enhance performance, especially in batch filtering scenarios.
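The API shape described under Usage could look roughly like the toy container below. `MiniBatch` is a hypothetical stand-in, not PyG's `Batch`: it stores only concatenated node features and a `ptr` offset list, which is enough to show `filter(mask)` returning the same container type while `to_data_list()` splits into per-graph pieces only on request.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class MiniBatch:
    """Hypothetical batched container used only to illustrate the API shape."""

    x: List[float]  # node features of all graphs, concatenated
    ptr: List[int]  # x[ptr[i]:ptr[i + 1]] holds the nodes of graph i

    def filter(self, mask: Sequence[bool]) -> "MiniBatch":
        """Return a new MiniBatch with only the graphs where mask is True."""
        if len(mask) != len(self.ptr) - 1:
            raise ValueError("mask must have one entry per graph")
        new_x: List[float] = []
        new_ptr: List[int] = [0]
        for i, keep in enumerate(mask):
            if keep:
                new_x.extend(self.x[self.ptr[i]:self.ptr[i + 1]])
                new_ptr.append(len(new_x))
        return MiniBatch(new_x, new_ptr)

    def to_data_list(self) -> List[List[float]]:
        """Split into per-graph feature lists only when explicitly requested."""
        return [self.x[a:b] for a, b in zip(self.ptr, self.ptr[1:])]


batch = MiniBatch(x=[10.0, 11.0, 20.0, 30.0, 31.0], ptr=[0, 2, 3, 5])
sub_batch = batch.filter([True, False, True])  # still a MiniBatch
graphs = sub_batch.to_data_list()              # per-graph lists on demand
```

The point of the design is that the result stays batched by default, so code that chains several filtering steps does not need to rebuild the batch between them.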
Let me know what you think of this idea, and if any further clarifications are needed!