
SEAL - Utilizing multiple edge features #17

Open
bits-glitch opened this issue Jul 27, 2021 · 6 comments

Comments

@bits-glitch

Hello SEAL Team,

thank you very much for the great implementation!

I noticed that you can incorporate the node features and edge weight into the SEAL learning process (--use_feature and --use_edge_weight). Dealing with the edge weight seems pretty straightforward to me, but have you thought about a possibility of combining further edge features (not only a single weight) in SEAL? Is there a way to do this as well?

@muhanzhang
Collaborator

Hi,

Absolutely! You may replace the graph convolution layers in SEAL with a GNN that deals with edge features, such as GINE. Since there are currently no link prediction datasets with edge features, I did not add such a GNN.
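For reference, the GINE update (from "Strategies for Pre-training Graph Neural Networks") extends GIN by adding edge features to the neighbor messages before aggregation. A minimal sketch of that update in plain PyTorch (the layer and the toy sizes are illustrative, not part of SEAL):

```python
import torch
import torch.nn as nn

class GINELayer(nn.Module):
    """GINE-style update: h_i' = MLP((1 + eps) * h_i + sum_j ReLU(h_j + e_ij))."""
    def __init__(self, dim):
        super().__init__()
        self.eps = nn.Parameter(torch.zeros(1))
        self.mlp = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x, edge_index, edge_attr):
        # x: [num_nodes, dim], edge_index: [2, num_edges], edge_attr: [num_edges, dim]
        src, dst = edge_index
        msg = torch.relu(x[src] + edge_attr)               # add edge features to messages
        agg = torch.zeros_like(x).index_add_(0, dst, msg)  # sum messages per target node
        return self.mlp((1 + self.eps) * x + agg)

x = torch.randn(4, 8)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
edge_attr = torch.randn(3, 8)
out = GINELayer(8)(x, edge_index, edge_attr)
print(out.shape)  # torch.Size([4, 8])
```

In practice you would use torch_geometric's GINEConv rather than hand-rolling this; the sketch just shows where the edge features enter the update.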

@bits-glitch
Author

Thank you very much!

@bits-glitch
Author

I am currently implementing this, and I have one remaining question. Currently, you compress the edge features into a single edge weight.

The k-hop subgraph extraction method relies on A and no longer obtains the edges via edge_index, so obtaining the edge features by indexing into Data.edge_attr is not easily feasible. In my opinion, the nicest solution would be to extend the dimensions of the ssp.csr_matrix to (num_edge_features, num_nodes, num_nodes), but if I understand the CSR matrix documentation correctly, that is not possible. Do you know a straightforward solution for this?

@bits-glitch bits-glitch reopened this Aug 2, 2021
@muhanzhang
Collaborator

@bits-glitch One possible solution is to create a dictionary storing the map from (i,j) to edge attribute vector, and feed this dict into the subgraph extraction function.
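A sketch of that dictionary approach (the edge_index/edge_attr names follow the PyG convention; building the map once before subgraph extraction is an assumption, not existing SEAL code):

```python
import torch

edge_index = torch.tensor([[0, 1, 1], [1, 0, 2]])  # [2, num_edges]
edge_attr = torch.randn(3, 6)                      # [num_edges, num_edge_features]

# Map (i, j) -> edge-attribute vector, built once up front.
attr_map = {(int(i), int(j)): edge_attr[k]
            for k, (i, j) in enumerate(edge_index.t())}

# Inside the subgraph extraction, query by endpoint pair:
vec = attr_map[(1, 2)]
print(vec.shape)  # torch.Size([6])
```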

One other built-in data structure is the "Hybrid sparse COO tensors" in PyTorch. Check this doc. It supports querying a dense tensor via sparse indices, and slicing along a dense dimension. You may convert the edge_attr into this format and do the querying.
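The hybrid-COO alternative looks roughly like this (a sketch; how far direct indexing on the sparse tensor is supported depends on the PyTorch version, so the example verifies the layout via to_dense()):

```python
import torch

edge_index = torch.tensor([[0, 1, 1], [1, 0, 2]])  # sparse (row, col) indices
edge_attr = torch.randn(3, 6)                      # one 6-dim vector per edge

# Hybrid sparse COO tensor: 2 sparse dims (num_nodes x num_nodes)
# plus 1 dense dim (num_edge_features).
A = torch.sparse_coo_tensor(edge_index, edge_attr, size=(3, 3, 6)).coalesce()

dense = A.to_dense()  # [3, 3, 6]
vec = dense[1, 2]     # edge-attribute vector of edge (1, 2)
print(vec.shape)  # torch.Size([6])
```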

@bits-glitch
Author

bits-glitch commented Aug 3, 2021

Thank you for the advice, that was really helpful!
Getting all edge attributes by using the map (as you suggested) was straightforward:

# Assemble (node1id, node2id) pairs following the scipy.sparse.coo_matrix layout
coo = subgraph.tocoo()
node1id = [nodes[i] for i in coo.col]
node2id = [nodes[i] for i in coo.row]

edg_idx = list(zip(node1id, node2id))
# Remove the target link from the subgraph (both directions).
edg_idx.remove((src, dst))
edg_idx.remove((dst, src))

# attr_map is the (i, j) -> edge-attribute dictionary suggested above
edge_features = torch.stack([attr_map[i] for i in edg_idx]).t()

Two questions remain:

  1. How to order the edge attributes, i.e. whether the ordering in this implementation is correct. Could you give me some advice on that matter?
  2. How to feed the edge attributes to GINE. I get a dimensionality problem with GINE: the dimensions of edge_index and edge_attr are consistent with each other, but I am not sure how to fix the mismatch with x:

x: torch.Size([270, 32])
edge_attr: torch.Size([6, 422])
edge_index: torch.Size([2, 422])

Do we need to use a different GNN operator?

@muhanzhang
Collaborator

  1. I guess the order is correct here. When you construct the pyg subgraph here, you stack the row and col indices returned by ssp.find(). If a scipy coo matrix's col and row attributes have the same order as find(), then it is fine. My guess is that they are consistent (both expand row-first). You can verify this by feeding in a small test example.

  2. According to GINE, x and edge_attr need to have the same feature dimension so that they can be summed. You may use another GNN that supports edge features, such as NNConv, or manually add a linear layer to transform edge_attr to the same dimension as x.
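That linear-layer fix can be sketched as follows, using the shapes printed above (note that PyG expects edge_attr as [num_edges, num_features], i.e. the transpose of the [6, 422] tensor):

```python
import torch
import torch.nn as nn

x = torch.randn(270, 32)
edge_attr = torch.randn(6, 422).t()  # -> [422, 6], PyG's [num_edges, num_features]

edge_lin = nn.Linear(6, 32)          # project edge features to x's feature dimension
edge_attr = edge_lin(edge_attr)      # [422, 32], now summable with x inside GINE
print(edge_attr.shape)  # torch.Size([422, 32])
```

Newer torch_geometric releases also let GINEConv do this projection internally via its edge_dim argument.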
