Reuse entity, item and user embeddings #29

lfomendes · 2020-06-19T18:56:42Z

I'm experimenting with KGAT using my own user-item and KG data.
I would like to reuse the embeddings learned during training in other models, and to visualize and cluster the items.

How can I do that?
Looking at KGAT.py, I see that you keep:

all_weights['user_embed']
all_weights['entity_embed']
all_weights['relation_embed']
all_weights['trans_W']

Does it make sense to take the embeddings directly from 'entity_embed', or, since you are using TransR, do they only make sense after I transform them with a specific relation?

Thanks

lfomendes · 2020-06-22T15:00:17Z

And is it possible to save these vectors?
Is the only output of this code the recommendation metrics?

xiangwang1223 · 2020-06-22T15:05:52Z

Yes, you can save these embedding vectors with code like this:

knowledge_graph_attention_network/Model/Main.py

Lines 170 to 177 in 530327a

    
    user_embed, entity_embed, relation_embed = sess.run(
        [model.weights['user_embed'], model.weights['entity_embed'], model.weights['relation_embed']],
        feed_dict={})
    temp_save_path = '%spretrain/%s/%s.npz' % (args.proj_path, args.dataset, args.model_type)
    ensureDir(temp_save_path)
    np.savez(temp_save_path, user_embed=user_embed, entity_embed=entity_embed, relation_embed=relation_embed)
    print('save the weights of kgat in path: ', temp_save_path)

Hope it can be helpful.
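Once the file is written, the arrays can be read back with NumPy and used outside the model. The following is a minimal sketch rather than code from the repository: the helper names `load_embeddings` and `nearest_neighbors` are hypothetical, the demo file holds random stand-in values, and it assumes that items occupy the first `n_items` rows of `entity_embed`, which should be checked against your own ID mapping.

```python
import os
import tempfile

import numpy as np


def load_embeddings(path):
    """Read every array stored in an .npz file into a plain dict."""
    with np.load(path) as data:
        return {name: data[name] for name in data.files}


def nearest_neighbors(embed, query_id, k=5):
    """Return the ids of the k rows most cosine-similar to row query_id."""
    norms = np.linalg.norm(embed, axis=1, keepdims=True)
    unit = embed / np.maximum(norms, 1e-12)  # guard against zero vectors
    sims = unit @ unit[query_id]
    sims[query_id] = -np.inf  # never return the query itself
    return np.argsort(-sims)[:k]


# Stand-in file with random values, using the same keys as the np.savez call.
rng = np.random.default_rng(0)
demo_path = os.path.join(tempfile.mkdtemp(), "kgat.npz")
np.savez(
    demo_path,
    user_embed=rng.normal(size=(10, 8)),
    entity_embed=rng.normal(size=(20, 8)),
    relation_embed=rng.normal(size=(3, 8)),
)

weights = load_embeddings(demo_path)
n_items = 12  # hypothetical; use the n_items value of your dataset
# Assumption: item ids are the first n_items entity ids.
item_embed = weights["entity_embed"][:n_items]
neighbors = nearest_neighbors(item_embed, query_id=0, k=5)
print("items closest to item 0:", neighbors)
```

The same `item_embed` matrix can be passed to any clustering or dimensionality-reduction tool for visualization.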

lfomendes · 2020-06-22T19:58:25Z

I will try that
Thank you =D

Do the relation and entity embeddings make sense on their own, or do I have to combine them in some way?
TransR uses a projection matrix, correct?

xiangwang1223 · 2020-06-23T01:21:35Z

I think that storing the relation matrices and entity embeddings separately is reasonable, such that you can combine them later in flexible ways.
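One way to combine them later is the standard TransR formulation, where each relation has its own projection matrix and a triple (h, r, t) is scored by the squared distance between the projected head plus the relation vector and the projected tail. The sketch below illustrates that formulation with random stand-in arrays. It assumes `entity_embed` has shape `[n_entities, emb_dim]`, `relation_embed` has shape `[n_relations, kge_dim]`, and `trans_W` has shape `[n_relations, emb_dim, kge_dim]`; verify these against your own saved arrays. Note that the saving snippet stores only the user, entity and relation embeddings, so `trans_W` would have to be fetched and saved as well. The function names are hypothetical.

```python
import numpy as np


def project_entities(entity_embed, trans_w, relation_id):
    """Map all entities into the space of one relation: e_r = e @ W_r."""
    return entity_embed @ trans_w[relation_id]


def transr_score(entity_embed, relation_embed, trans_w, head, relation_id, tail):
    """TransR distance ||h W_r + r - t W_r||^2 (lower means more plausible)."""
    w_r = trans_w[relation_id]
    h = entity_embed[head] @ w_r
    t = entity_embed[tail] @ w_r
    diff = h + relation_embed[relation_id] - t
    return float(np.sum(diff ** 2))


# Random stand-in arrays with the assumed shapes.
rng = np.random.default_rng(0)
n_entities, n_relations, emb_dim, kge_dim = 6, 3, 8, 4
entity_embed = rng.normal(size=(n_entities, emb_dim))
relation_embed = rng.normal(size=(n_relations, kge_dim))
trans_w = rng.normal(size=(n_relations, emb_dim, kge_dim))

projected = project_entities(entity_embed, trans_w, relation_id=1)
score = transr_score(entity_embed, relation_embed, trans_w, head=0, relation_id=1, tail=2)
print(projected.shape, score)
```

The unprojected rows of `entity_embed` remain usable as general-purpose features; the projection only matters when a comparison should be specific to one relation.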

lfomendes · 2020-06-23T12:59:28Z

Thank you for your response =D
I got it running and saved the embeddings like you showed me.

But now I'm trying to run with "real" data, using my own KG data, and I get an error. I'm running the following command:

    !python Main.py --model_type kgat --alg_type bi --dataset bh15 --regs [1e-5,1e-5] --layer_size [64,32,16] --embed_size 64 --lr 0.001 --epoch 100 --verbose 50 --save_flag 1 --pretrain 1 --batch_size 256 --node_dropout [0.1] --mess_dropout [0.1,0.1,0.1] --use_att True --use_kge True

And this is the data:

    [n_users, n_items]=[252346, 462485]
    [n_train, n_test]=[722370, 80211]
    [n_entities, n_relations, n_triples]=[483176, 5, 1593039]
    [batch_size, batch_size_kg]=[256, 564]

And the error:

    without pretraining.
    Traceback (most recent call last):
      File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1365, in _do_call
        return fn(*args)
      File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1350, in _run_fn
        target_list, run_metadata)
      File "/home/ec2-user/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/tensorflow_core/python/client/session.py", line 1443, in _call_tf_sessionrun
        run_metadata)
    tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[735522,32] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
      [[{{node gradients/SparseTensorDenseMatMul_251/SparseTensorDenseMatMul_grad/SparseTensorDenseMatMul}}]]
    Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.

I will try to understand what is happening, but if you have any tips I would appreciate it.
Thanks

PS: I'm using TensorFlow 2, but I don't think this is the problem.


FrankChengGD · 2021-12-20T19:37:27Z


@lfomendes
Hi! I am also facing the same 'ResourceExhaustedError'. Does it mean the GPU is short of memory?
Could you share how you fixed this error?

Thanks a lot!
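For context, an OOM `ResourceExhaustedError` raised by the `GPU_0_bfc` allocator does mean that the GPU ran out of memory at that allocation. A rough calculation from the numbers in the log, assuming "type float" means 4-byte float32, shows that the single tensor named in the traceback is only about 90 MiB, and that its row count equals `n_users + n_entities`. This suggests, without proving it, that memory use is driven by many graph-sized tensors (one set per layer and per gradient) rather than by `batch_size`.

```python
# Values copied from the log in this thread.
n_users, n_entities = 252346, 483176
rows, cols = 735522, 32  # shape[735522,32] from the OOM message

# Assumption: "type float" is float32, i.e. 4 bytes per element.
bytes_per_float32 = 4

tensor_bytes = rows * cols * bytes_per_float32
tensor_mib = tensor_bytes / 2**20

print("rows equal n_users + n_entities:", rows == n_users + n_entities)
print("bytes for one tensor:", tensor_bytes)
print("MiB for one tensor: %.1f" % tensor_mib)
```

If that reading is right, options worth trying include smaller `--layer_size` and `--embed_size` values, a smaller knowledge graph, or a GPU with more memory; none of these were confirmed in this thread.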

lfomendes changed the title ~~Reuse entity, item embeddings afterwards~~ Reuse entity, item and user embeddings Jun 22, 2020

Genius-pig mentioned this issue Jan 21, 2024

Question about pretrain embedding LunaBlack/KGAT-pytorch#3

Open
