
Best hyper parameters in your paper section 4.2.3 #19

Open
rugezhao opened this issue Dec 8, 2019 · 6 comments

Comments

@rugezhao

rugezhao commented Dec 8, 2019

Hi,
I am trying to reproduce the results in your paper, but I could not find the best hyper-parameters in the paper or repo.
Can you share more information on hyperparameters for each dataset?

@xiangwang1223
Owner

Thanks for your interest. Please get the latest version from GitHub. For the parameter settings, please refer to the README file; for the corresponding training logs, please refer to the log files. Thanks.

@rugezhao
Author

rugezhao commented Dec 9, 2019

Both the README and https://github.com/xiangwang1223/knowledge_graph_attention_network/blob/master/Log/training_log_amazon-book.log use the pretrained embeddings. I'm wondering what parameters are used when training from scratch.

@srtianxia

@rugezhao @xiangwang1223 I think this code differs from the paper, for example in the KGE loss. In the paper, the KGE loss contains Wr, but in the code it does not (in the code, Wr is used to calculate attention) ... I am rather confused by this approach.
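For context, Equation (1) in the KGAT paper is a TransR-style plausibility score, where the relation-specific matrix Wr projects the head and tail embeddings into the relation space before the translation h + r ≈ t is scored. A minimal sketch of that formula (my own function and variable names, not the ones used in KGAT.py):

import tensorflow as tf

def transr_plausibility(h_e, r_e, t_e, W_r):
    # Equation (1): g(h, r, t) = || W_r e_h + e_r - W_r e_t ||_2^2
    # h_e, t_e: (batch, entity_dim); r_e: (batch, relation_dim);
    # W_r: (entity_dim, relation_dim) relation-specific projection matrix.
    h_proj = tf.matmul(h_e, W_r)   # project head into the relation space
    t_proj = tf.matmul(t_e, W_r)   # project tail into the relation space
    return tf.reduce_sum(tf.square(h_proj + r_e - t_proj), axis=1)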

@xiangwang1223
Owner

Please CAREFULLY check lines 194-199 in KGAT.py, where the model parameters "trans_W" are used to calculate the KGE loss, which is CONSISTENT WITH Equation (1) in the paper; and check line 395, where the same parameters "trans_W" are used to calculate the attention scores, which is also CONSISTENT WITH Equation (4) in the paper.

ALL the code is the same as the formulation in the paper.
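In other words, the projection by "trans_W" happens when h_e, r_e, and the tail embeddings are constructed (the lines 194-199 referenced above), so the phase-II loss already operates on projected tensors and Wr enters the KGE loss implicitly. A rough sketch of that flow (assumed shapes and names, not the exact code from KGAT.py):

import tensorflow as tf

def project_kg_triples(entity_embeddings, relation_embeddings, trans_W,
                       h_ids, r_ids, t_ids):
    # Look up the raw embeddings for the sampled triples.
    h_embed = tf.nn.embedding_lookup(entity_embeddings, h_ids)    # (batch, entity_dim)
    t_embed = tf.nn.embedding_lookup(entity_embeddings, t_ids)
    r_e = tf.nn.embedding_lookup(relation_embeddings, r_ids)      # (batch, relation_dim)
    W_r = tf.nn.embedding_lookup(trans_W, r_ids)                  # (batch, entity_dim, relation_dim)

    # Project the entity embeddings into the relation space *before* the loss
    # is built, so Equation (1)'s W_r is already baked into h_e and t_e.
    h_e = tf.squeeze(tf.matmul(tf.expand_dims(h_embed, 1), W_r), axis=1)  # (batch, relation_dim)
    t_e = tf.squeeze(tf.matmul(tf.expand_dims(t_embed, 1), W_r), axis=1)
    return h_e, r_e, t_e

# _build_loss_phase_II then consumes the already-projected h_e / r_e / t_e, e.g.
# kg_score = tf.reduce_sum(tf.square(h_e + r_e - t_e), 1, keepdims=True)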

@srtianxia

Please CAREFULLY check lines 194-199 in KGAT.py, where the model parameters "trans_W" are used to calculate the KGE loss, which is CONSISTENT WITH Equation (1) in the paper; and check line 395, where the same parameters "trans_W" are used to calculate the attention scores, which is also CONSISTENT WITH Equation (4) in the paper.

ALL the code is the same as the formulation in the paper.

Thanks for your reply! However, the KGE loss is built in lines 229-252, and Wr really does not appear in that loss. If I have made a mistake, please point it out, thank you!

def _build_loss_phase_II(self):
    def _get_kg_score(h_e, r_e, t_e):
        # Translation-based score on the triple: ||h + r - t||_2^2.
        kg_score = tf.reduce_sum(tf.square((h_e + r_e - t_e)), 1, keepdims=True)
        return kg_score

    pos_kg_score = _get_kg_score(self.h_e, self.r_e, self.pos_t_e)
    neg_kg_score = _get_kg_score(self.h_e, self.r_e, self.neg_t_e)

    # Using the softplus as BPR loss to avoid the nan error.
    kg_loss = tf.reduce_mean(tf.nn.softplus(-(neg_kg_score - pos_kg_score)))
    # maxi = tf.log(tf.nn.sigmoid(neg_kg_score - pos_kg_score))
    # kg_loss = tf.negative(tf.reduce_mean(maxi))

    # L2 regularization over the head, relation, and tail embeddings.
    kg_reg_loss = tf.nn.l2_loss(self.h_e) + tf.nn.l2_loss(self.r_e) + \
                  tf.nn.l2_loss(self.pos_t_e) + tf.nn.l2_loss(self.neg_t_e)
    kg_reg_loss = kg_reg_loss / self.batch_size_kg

    self.kge_loss2 = kg_loss
    self.reg_loss2 = self.regs[1] * kg_reg_loss
    self.loss2 = self.kge_loss2 + self.reg_loss2

    # Optimization process.
    self.opt2 = tf.train.AdamOptimizer(learning_rate=self.lr).minimize(self.loss2)

@srtianxia

I found my mistake, thank you for your correction @xiangwang1223
