Memory footprint of graph storage, and where is the split operator implemented? #287

Open
TheFinalHydra opened this issue Jul 23, 2020 · 4 comments

Comments

@TheFinalHydra

I'd like to ask a few questions about Euler 2.0.

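  1. In what form does the graph live once it has been loaded? Is it held entirely in memory as adjacency lists, or are only the node and edge ids kept in memory while the actual features are stored separately? If it is not entirely in memory, what form does it take, and roughly what fraction of the graph's size does the memory footprint come to?
  2. Where exactly is the code that, according to the docs, automatically adds the split and merge operators in distributed mode? As far as I can tell, IDSplit is never actually called. Is the main benefit of this op that it makes the DAG easier to optimize in the distributed setting?
  3. For multi-hop sampling in distributed mode, the docs seem to say that the next hop can only be sampled after the previous hop's nodes have been returned. Doesn't that mean the benefit of parallel DAG scheduling cannot be fully realized?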
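
To make the dependency in question 3 concrete, here is a minimal sketch of multi-hop neighbor sampling over a toy in-memory adjacency dict. It is not Euler's API: `sample_neighbors`, `multi_hop_sample`, and the `adj` layout are hypothetical names chosen for illustration. It only shows why hop k+1 cannot start before hop k has returned.

```python
import random

def sample_neighbors(adj, nodes, count, rng):
    """Sample `count` neighbors (with replacement) for every node in `nodes`."""
    sampled = []
    for node in nodes:
        neighbors = adj.get(node, [])
        if neighbors:  # nodes without neighbors contribute nothing
            sampled.extend(rng.choice(neighbors) for _ in range(count))
    return sampled

def multi_hop_sample(adj, roots, fanouts, seed=0):
    """Return one list of node ids per hop, starting with the roots."""
    rng = random.Random(seed)
    layers = [list(roots)]
    for count in fanouts:
        # The input of this hop is the output of the previous hop,
        # so the hops form a chain and cannot run in parallel.
        layers.append(sample_neighbors(adj, layers[-1], count, rng))
    return layers

# Hypothetical toy graph: node id -> list of neighbor ids.
adj = {1: [2, 3], 2: [4], 3: [4], 4: []}
layers = multi_hop_sample(adj, roots=[1], fanouts=[2, 2])
```

Within a single hop the per-node lookups are independent of each other, so that is where parallelism is available in this sketch; the chain between hops stays sequential.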

These questions came up after reading the docs and a little of the code. I'd appreciate it if you could help clarify them. Thanks!

@zakheav
Contributor

zakheav commented Jul 23, 2020 via email

@zakheav
Contributor

zakheav commented Jul 23, 2020 via email

@TheFinalHydra
Author

Thanks for such a detailed reply!

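So the features of each node/edge are stored directly in the node/edge class rather than separately? From what I can see, the json2partdat.py code likewise just flattens all attributes of each node and edge and serializes them to the file one after another.
The third point sounds good: it is like batching several ops into a single RPC.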
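
As a rough illustration of the "flatten every attribute and serialize it in order" layout mentioned for json2partdat.py, here is a minimal sketch using Python's `struct` module. The record layout (id, type, weight, feature count, float features) and the function names are assumptions made for this example, not the actual format that Euler writes.

```python
import struct

HEADER = "<qif"  # assumed layout: int64 id, int32 type, float32 weight

def serialize_node(node_id, node_type, weight, dense_features):
    """Pack one node as header, feature count, then the float32 features."""
    header = struct.pack(HEADER, node_id, node_type, weight)
    count = struct.pack("<I", len(dense_features))
    body = struct.pack("<%df" % len(dense_features), *dense_features)
    return header + count + body

def deserialize_node(buf, offset=0):
    """Read one node starting at `offset`; return (record, next_offset)."""
    node_id, node_type, weight = struct.unpack_from(HEADER, buf, offset)
    offset += struct.calcsize(HEADER)
    (n,) = struct.unpack_from("<I", buf, offset)
    offset += 4
    feats = list(struct.unpack_from("<%df" % n, buf, offset)) if n else []
    offset += 4 * n
    return (node_id, node_type, weight, feats), offset

# Two records written back to back, as they would be in a flat file.
buf = serialize_node(7, 1, 1.0, [0.5, 1.5]) + serialize_node(8, 0, 2.0, [])
first, pos = deserialize_node(buf)
second, end = deserialize_node(buf, pos)
```

Because each record carries its own feature count, a reader can walk the file sequentially without a separate index. That is also why, in a layout like this, the features end up stored together with the node rather than in a separate store.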

The code paths are really helpful. Thank you very much!
