This repository was archived by the owner on Mar 8, 2020. It is now read-only.

Avoid using JNI refs as map keys #116

Merged
merged 9 commits into from
Aug 28, 2019

Contributor

@bzz bzz commented Aug 16, 2019

This is a proposal for addressing #113.

The original implementation used a reference as a key in a native map, but in JNI object references are neither constant nor unique, so using them as keys violated the native map's contract for keys (its equivalent of hashCode()).
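
To illustrate the point about keys, here is a minimal C++ sketch with no real JNI involved. `FakeRef`, `hashCodeOf` and `NodeCache` are hypothetical names introduced only for this example; in the real code the key would presumably come from a JVM `hashCode()` call made through JNI. Two distinct references to the same managed object would miss if the reference itself were the key, and hit when a value derived from the object is the key.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <memory>

// Hypothetical stand-in for a JNI reference: several distinct references
// (different refId) may designate the same managed object (same objectHash).
struct FakeRef {
  int refId;           // identity of the reference itself, not stable
  int32_t objectHash;  // what a JVM hashCode() call would return (assumed)
};

// Assumption: in the real code this would be a JNI call to hashCode().
inline int32_t hashCodeOf(const FakeRef &ref) { return ref.objectHash; }

struct Node {
  int32_t key;
};

// Cache keyed by the object's hash code rather than by the reference.
class NodeCache {
 public:
  Node *lookupOrCreate(const FakeRef &ref) {
    const int32_t key = hashCodeOf(ref);
    auto it = nodes_.find(key);
    if (it != nodes_.end()) {
      return it->second.get();  // cache hit: early return
    }
    auto inserted = nodes_.emplace(key, std::make_unique<Node>(Node{key}));
    assert(inserted.second);  // the key was absent, so insertion must succeed
    return inserted.first->second.get();
  }

  std::size_t size() const { return nodes_.size(); }

 private:
  std::map<int32_t, std::unique_ptr<Node>> nodes_;
};
```

Looking up `FakeRef{1, 42}` and then `FakeRef{2, 42}` returns the same `Node`, even though the two references differ. Hash codes can collide between different objects, so a real implementation would have to account for that; this sketch ignores it.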

This should allow us to get rid of the global reference in the interface lookup, but there are more (clearly identified) cases where we still rely on the side effects of having a global reference and thus leak some memory.

This seems to work on macOS, but needs further work:

  • this change fails on Linux, but works on macOS
  • when NewGlobalRef in the interface lookup is removed, it breaks a pre-order iterator test

This branch is intentionally pushed to upstream repo, so @ncordon please feel free to take over and help.



@bzz bzz assigned bzz and ncordon Aug 16, 2019
This was referenced Aug 16, 2019
Contributor Author

bzz commented Aug 16, 2019

CI passes now, before rebasing on #115

Contributor Author

bzz commented Aug 16, 2019

Rebased on latest master (after #115 was merged).

CI now fails (only on Linux) for FilterManagedTest, most probably because the iterator fails to call JVM .hashCode() when looking up a node for ValueAt() while iterating.

Easy to reproduce on Linux with

./sbt 'testOnly org.bblfsh.client.v2.FilterManagedTest -- -z "XPath filter should find all positions under context"'

@ncordon ncordon force-pushed the fix-leaky-keys branch 2 times, most recently from 89e7d66 to 7738d78 Compare August 19, 2019 22:12
@kuba-- kuba-- added the wip work in progress label Aug 20, 2019
@ncordon ncordon removed the wip work in progress label Aug 20, 2019
Member

ncordon commented Aug 20, 2019

I have gotten rid (I think) of the memory leaks from #113. I think it might make sense to remove all the comments about new references and stolen references, because there is no such thing here as there is in Python (all the references have to be cleaned up through JNI, either explicitly if they are global refs, or implicitly as with local refs, which are cleaned up automatically by JNI). WDYT?

@ncordon ncordon changed the title WIP avoid using JNI refs as map keys Avoid using JNI refs as map keys Aug 20, 2019
Contributor

@creachadair creachadair left a comment

Reviewed 1 of 1 files at r2.
Reviewable status: all files reviewed, 2 unresolved discussions (waiting on @bzz and @dennwc)


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 368 at r2 (raw file):

        ObjectMethod(env, "valueAt", METHOD_JNODE_VALUE_AT, CLS_JNODE, obj, i);
    // TODO(#113) investigate, it looks like a potential memory leak
    return lookupOrCreate(val);  // borrows the reference

Depending on the resolution of @ncordon's question: If we are borrowing a pointer here I think that should be part of the function's doc comment, rather than a comment on the return statement, since that's something the caller will need to be aware of.


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 435 at r2 (raw file):

    if (obj2node.count(obj) > 0) {
      return obj2node[obj];
    } else {

I suggest treating the cache hit as an early return case:

if (obj2node.count(obj) > 0) {
   return obj2node[obj];
}
Node *node  = new Node(this, obj);
// etc.

Relatedly: Please be consistent about pointer/name grouping. It looks like most of this file follows the traditional "separate" convention (A *a vs. A* a).

(That's the style I personally prefer too, because of how it binds in the grammar, but I don't actually care as long as we are reasonably consistent)

Member

ncordon commented Aug 21, 2019

I am going to block my own PR because I found a big memory leak this afternoon (simply parsing a file in a while loop, similar to what I did with the Python client). The fix for the memory leak is not useful unless I make some further changes (I have to think about how). The memory leak was caused by the ContextExt dispose method in JNI having a flipped condition, so it never deleted the tree. I am going to try to explain the problem:

Consider the following code:

import scala.io.Source
import org.bblfsh.client.v2.{BblfshClient, NodeExt}, BblfshClient._
import gopkg.in.bblfsh.sdk.v2.protocol.driver.Mode

val client = BblfshClient("localhost", 9432)

val fileName = "src/test/resources/python_file.py"
val fileContent = Source.fromFile(fileName).getLines.mkString("\n")

val resp = client.parse(fileName, fileContent)
val uast = resp.uast.decode()
val rootNode: NodeExt = uast.root()
val root = rootNode.load()

  • uast is a ContextExt, which holds an integer that points to the native context (in JNI).
  • When I do val rootNode = uast.root(), I generate a NodeExt, which again holds a pointer to the native context and the handle of the node.
  • When I do rootNode.load(), the ContextExt must not have been disposed of, since otherwise I am going to try to access freed memory.

The problem is that NodeExt should not hold a pointer to the native context, but a reference to the ContextExt. Otherwise, the JVM garbage collector could run between those calls and free uast (because nothing else points to it). This can be checked with:

val uast = resp.uast.decode()
val rootNode: NodeExt = uast.root()
uast.dispose()
val root = rootNode.load()

and we end up with a SIGSEGV.

Before my commit flipped the condition, we had SIGSEGVs in Context, because it is similar to ContextExt but was actually deleting the underlying native data. That's why I started digging into this: I was getting core dumps from time to time when running the tests (hardly ever, because the JVM GC has to be triggered at a very precise point).
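
The lifetime problem described in this comment can be mimicked in plain C++, with `std::shared_ptr` standing in for a JVM reference that keeps an object reachable. This is only an analogy under that assumption: `Ctx`, `NodeView` and `rootOf` are hypothetical names, not the client's classes, and no JNI or garbage collector is involved.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for the context that owns the tree.
struct Ctx {
  std::string rootLabel;
};

// A node that boxes the context (shared ownership) instead of a raw handle.
// While any NodeView is alive, the Ctx it refers to cannot be released.
class NodeView {
 public:
  explicit NodeView(std::shared_ptr<Ctx> ctx) : ctx_(std::move(ctx)) {
    assert(ctx_ != nullptr);  // a node without a context makes no sense here
  }

  // Analogous to load(): reads data owned by the context.
  std::string load() const { return ctx_->rootLabel; }

 private:
  std::shared_ptr<Ctx> ctx_;
};

// Analogous to uast.root(): the returned node keeps its context alive.
inline NodeView rootOf(const std::shared_ptr<Ctx> &ctx) { return NodeView(ctx); }
```

Dropping the caller's own pointer after calling `rootOf` (the analogue of the GC collecting an otherwise unreferenced `uast`) leaves `load()` valid, because the node still shares ownership. The analogy only covers reachability; it says nothing about an explicit `dispose()` call.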

Member

ncordon commented Aug 21, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 368 at r2 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Depending on the resolution of @ncordon's question: If we are borrowing a pointer here I think that should be part of the function's doc comment, rather than a comment on the return statement, since that's something the caller will need to be aware of.

I am going to restructure docs then

Member

ncordon commented Aug 21, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 435 at r2 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

I suggest treating the cache hit as an early return case:

if (obj2node.count(obj) > 0) {
   return obj2node[obj];
}
Node *node  = new Node(this, obj);
// etc.

Relatedly: Please be consistent about pointer/name grouping. It looks like most of this file follows the traditional "separate" convention (A *a vs. A* a).

(That's the style I personally prefer too, because of how it binds in the grammar, but I don't actually care as long as we are reasonably consistent)

I will fix this, and yes, I'll try to be consistent with the current style :)

@ncordon ncordon mentioned this pull request Aug 22, 2019
bzz and others added 7 commits August 23, 2019 19:55
Original impl used a reference as a key in a native map,
but in JNI object references are neither constant nor unique,
and thus were violating the contract on hashCode for a map key.

This should allow to get rid of global reference in interface lookup,
but there are 2 more (clearly identified) cases where
we still rely on side-effects of having a global reference
and thus leak some memory

Signed-off-by: Alexander Bezzubov <[email protected]>
The condition of the pointer if(!p) was flipped

Signed-off-by: ncordon <[email protected]>
Gets rid of the ContextExt leak of memory

Signed-off-by: ncordon <[email protected]>
Avoids unwanted deallocations by boxing ctx handler in a Context / ContextExt

Signed-off-by: ncordon <[email protected]>
Also alleviates NIO direct buffers deallocation slowness

Signed-off-by: ncordon <[email protected]>
Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 819 at r4 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

Note that now the iterators do not contain only the opaque handle to the context, but a reference to the Java objet which contains the handle, to avoid deallocations of that context.

*object

Member

ncordon commented Aug 27, 2019


src/main/scala/org/bblfsh/client/v2/ContextExt.scala, line 36 at r4 (raw file):

    override def finalize(): Unit = {
      this.dispose()
    }

Note the finalizer for this class was missing

Member

ncordon commented Aug 27, 2019


src/test/scala/org/bblfsh/client/v2/libuast/IteratorNativeTest.scala, line 48 at r4 (raw file):

    iter.hasNext() should be(false)
    iter.ctx should be(null)

This is ugly Scala, I know, since null references should be avoided. But before, this was a handle and the null value was 0. Here we either clean the values with ContextExt(0) or assign null. I am inclined toward the second option.

Member

ncordon commented Aug 27, 2019

This is ready for review now. I have cleaned up SIGSEGVs, memory leaks and so on as much as possible.

Contributor

@creachadair creachadair left a comment

Reviewed 8 of 9 files at r4, 1 of 1 files at r5.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @bzz, @creachadair, @dennwc, and @ncordon)


src/main/native/jni_utils.cc, line 67 at r5 (raw file):

const char FIELD_ITER_NODE[] = "Ljava/lang/Object;";
const char FIELD_ITER_CTX[] = "Lorg/bblfsh/client/v2/Context;";
const char FIELD_ITER_EXT_NODE[] = "Ljava/lang/Object;";

Really Object rather than Node or NodeExt? (That's fine if intended, but might benefit from a comment)


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 100 at r4 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

The rationale behind this is that I have to tie the ContextExt on the C++ side to the jCtx: ContextExt Scala object. That way, every time we ask for a node and get back a NodeExt, it can box jCtx so the GC thinks: okay, jCtx is being used by a NodeExt, so I cannot deallocate it. The former implementation only included the handle (a pointer) in the Scala NodeExt. That did not prevent the GC from collecting jCtx before we had finished working with our node (for example, when using iterators). The same criterion applies to iterators, which now also box a Context / ContextExt.

I think that makes sense, since we want to ensure the mirror object doesn't exit its dynamic scope until the real one does.


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 613 at r4 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

p also deletes ctx

Maybe put that directly into the comment? (e.g., // Since p owns the underlying context, deleting p deletes that too.)


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 847 at r5 (raw file):

  if (p) {
    delete p;
    setHandle<Context>(env, self, 0, nativeContext);

Is there a race condition here where JVM might access the pointer that has just been deleted before we finish updating the field? (I don't know that there is—but if so this could be use-after-free; I think nulling the handle first would avert that problem)
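
One way to sketch the "null the handle first" ordering in plain C++ is an atomic exchange: the handle is cleared and the old pointer captured in a single step, and only then deleted. `NativeCtx` and `Handle` below are hypothetical and this is not the JNI `setHandle` code. The sketch makes a repeated `dispose()` harmless, but it does not protect a reader that had already copied the pointer before the exchange.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical native object owned through a handle.
struct NativeCtx {
  int payload = 0;
};

// Hypothetical holder of the handle, standing in for the managed field.
class Handle {
 public:
  explicit Handle(NativeCtx *p) : ptr_(p) {}
  ~Handle() {
    dispose();
    assert(isNull());  // after dispose the handle is always cleared
  }
  Handle(const Handle &) = delete;
  Handle &operator=(const Handle &) = delete;

  // Clears the handle first, then deletes what it used to point to.
  // Returns true if this call actually released the object.
  bool dispose() {
    NativeCtx *old = ptr_.exchange(nullptr);
    if (old == nullptr) {
      return false;  // already disposed, nothing to do
    }
    delete old;
    return true;
  }

  bool isNull() const { return ptr_.load() == nullptr; }

 private:
  std::atomic<NativeCtx *> ptr_;
};
```

Calling `dispose()` twice on the same `Handle` releases the object once; the second call finds a null handle and returns `false`.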


src/main/scala/org/bblfsh/client/v2/BblfshClient.scala, line 153 at r5 (raw file):

      buf.copyTo(bufDirectCopy)
      val result = BblfshClient.decode(bufDirectCopy)
      // Sometimes the direct buffer can take a lot to deallocate,

Just to double check that I understood this correctly:

The issue is that the handle is small, so it doesn't add memory pressure to the JVM heap, but it pins a large buffer on the mirror side which keeps the process RSS high until other JVM allocations trigger a GC?

I wonder if we could avoid doing this every time by only doing the GC when we know the buffer size is "fairly big" for some reasonable definition? (E.g., > 1MiB or something)

@bzz bzz removed their assignment Aug 27, 2019
Contributor Author

@bzz bzz left a comment

Reviewed 7 of 9 files at r4, 1 of 1 files at r5.
Reviewable status: all files reviewed, 18 unresolved discussions (waiting on @bzz, @creachadair, @dennwc, and @ncordon)


src/main/native/jni_utils.cc, line 69 at r5 (raw file):

const char FIELD_ITER_EXT_NODE[] = "Ljava/lang/Object;";
const char FIELD_ITER_EXT_CTX[] = "Lorg/bblfsh/client/v2/ContextExt;";
const char FIELD_NODE_EXT_CTX[] = "Lorg/bblfsh/client/v2/ContextExt;";

Maybe one constant could be enough here, since the signatures are the same?


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 100 at r4 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

I think that makes sense, since we want to ensure the mirror object doesn't exit its dynamic scope until the real one does.

👍


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 163 at r4 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

jCtxExt being a WeakReference makes it possible for the GC to collect jCtxExt from outside JNI if the only thing using it is the native ContextExt. Otherwise we would end up with a reference cycle: the Scala ContextExt has a handle to a native ContextExt, which in turn contains jCtxExt, with the result that nothing is ever deallocated.

As this is a Scala project, a minor suggestion would be to avoid saying Java unless it really means something language-specific, to avoid Scala/Java compatibility confusion.

One terminology distinction I found useful: instead of constantly mixing up C/C++ vs Scala/Java, talk about the native part vs the managed part, which works for multiple client implementations. Or just say JVM to refer to a particular managed environment.

One side effect that may be worth documenting: jCtxExt may now hold a valid (non-null) reference that points to the JVM null object. So every native client consuming this (if we have one) might need to do env->IsSameObject(jCtxExt, NULL).

It's even a bit more involved: every native client would actually need to get a local ref from the weak one even if it made the above check, because the weak ref might be GCed from another thread after the check and before the usage, and would then refer to a null JVM object anyway.
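
The check-then-use race described here has a close analogue in standard C++: a `std::weak_ptr` has to be promoted with `lock()` before use, much as a JNI weak global reference would be promoted to a local reference (`NewLocalRef`) and then tested for null. The sketch below uses only the C++ analogue, so it is an illustration rather than the client's code; `ManagedCtx` and `NativeSide` are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-in for the managed ContextExt object.
struct ManagedCtx {
  std::string name;
};

// Hypothetical native side that only keeps a weak link to the managed object,
// so it does not keep that object alive on its own (no ownership cycle).
class NativeSide {
 public:
  explicit NativeSide(const std::shared_ptr<ManagedCtx> &ctx) : weak_(ctx) {
    assert(ctx != nullptr);  // must start from a live managed object
  }

  // Promote the weak link to a strong one, then check the result. Testing
  // expired() first and dereferencing later would leave a window in which
  // the object could disappear between the check and the use.
  std::string describe() const {
    std::shared_ptr<ManagedCtx> strong = weak_.lock();
    if (!strong) {
      return "<collected>";
    }
    return strong->name;
  }

 private:
  std::weak_ptr<ManagedCtx> weak_;
};
```

While the managed object is alive, `describe()` returns its name; once the last strong owner is gone it returns `"<collected>"` instead of touching freed memory.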


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 63 at r5 (raw file):

void setObjectField(JNIEnv *env, jobject obj, jobject field, const char *name, const char *sig) {
  jfieldID fId = FieldID(env, obj, name, sig);
  env->SetObjectField(obj, fId, field);

We should probably also either check for JNI error/exception here, or at every client of this method with checkJvmException.


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 106 at r5 (raw file):

    JNIEnv *env = getJNIEnv();
    jobject jObj = NewJavaObject(env, CLS_NODE, "(Lorg/bblfsh/client/v2/ContextExt;J)V", jCtxExt, node);

probably nit pick, but as soon as signature gets longer and becomes very specific - it might be worth extracting a named constant there


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 611 at r5 (raw file):

    // NodeExt contains a ctx: ContextExt and ContextExt the
    // handle for the native context, called nativeContext
    jobject jCtxExt = ObjectField(env, src, "ctx", FIELD_NODE_EXT_CTX);

How about

// NodeExt contains a ctx: ContextExt (JVM ref) and a nativeContext: ContextExt (handle)

or some words to that effect in order to avoid the confusion of : notation only used for one field, but not for the other?


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 653 at r5 (raw file):

  jobject jCtxExt = NewJavaObject(env, CLS_CTX_EXT, "(J)V", p);

  // Associates the JVM context ext to the native ContextExt

May be worth spelling actual JVM class name here as well, e.g

// Saves weak reference to JVM ContextExt in the native ContextExt

or words to that effect.


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 740 at r5 (raw file):

  jobject jCtxExt = ObjectField(env, nodeExt, "ctx", FIELD_NODE_EXT_CTX);

  if (!jCtxExt)

Is there a reason not to follow the convention of having no empty line between getting the value and the error check here?


src/main/native/project/build.properties, line 1 at r5 (raw file):

sbt.version=1.2.8

Why is this dir and file needed, in such an unusual place?


src/test/scala/org/bblfsh/client/v2/libuast/IteratorNativeTest.scala, line 48 at r4 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

This is ugly Scala, I know, since null references should be avoided. But before, this was a handle and the null value was 0. Here we either clean the values with ContextExt(0) or assign null. I am inclined toward the second option.

In this case it's very reasonable and looks good to me.

Contributor Author

bzz commented Aug 27, 2019

Can not add myself to reviewer box in GH UI

Member

ncordon commented Aug 27, 2019

Probably because you opened the PR? Do not worry, I will not merge this until you approve changes

Member

ncordon commented Aug 27, 2019


src/main/native/jni_utils.cc, line 69 at r5 (raw file):

Previously, bzz (Alexander) wrote…

Maybe one constant could be enough here, since the signatures are the same?

I am going to name fields but what they return and not the class they are applied to

Member

ncordon commented Aug 27, 2019


src/main/native/jni_utils.cc, line 69 at r5 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

I am going to name fields but what they return and not the class they are applied to

*by what they return

Member

ncordon commented Aug 27, 2019


src/main/native/jni_utils.cc, line 67 at r5 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Really Object rather than Node or NodeExt? (That's fine if intended, but might benefit from a comment)

Yes, that was intentional. It was @bzz who first noticed it, but Scala classes for iterators extend an abstract class that is "templated" in a type: abstract class UastAbstractIter[T >: Null](var node: T, var treeOrder: Int, var iter: Long) and node must be accessed as an arbitrary object. Probably because Scala is compiling this in Java as a type that both NodeExt and Node extend

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 63 at r5 (raw file):

Previously, bzz (Alexander) wrote…

We should probably also either check for JNI error/exception here, or at every client of this method with checkJvmException.

Done! Also added the check to setHandle function

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 106 at r5 (raw file):

Previously, bzz (Alexander) wrote…

probably nit pick, but as soon as signature gets longer and becomes very specific - it might be worth extracting a named constant there

Done!

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 163 at r4 (raw file):

Previously, bzz (Alexander) wrote…

As this is a Scala project, a minor suggestion would be to avoid saying Java unless it really means something language-specific, to avoid Scala/Java compatibility confusion.

One terminology distinction I found useful: instead of constantly mixing up C/C++ vs Scala/Java, talk about the native part vs the managed part, which works for multiple client implementations. Or just say JVM to refer to a particular managed environment.

One side effect that may be worth documenting: jCtxExt may now hold a valid (non-null) reference that points to the JVM null object. So every native client consuming this (if we have one) might need to do env->IsSameObject(jCtxExt, NULL).

It's even a bit more involved: every native client would actually need to get a local ref from the weak one even if it made the above check, because the weak ref might be GCed from another thread after the check and before the usage, and would then refer to a null JVM object anyway.

I have changed the name of the method to setManagedContext. Thanks for the suggestion.

I have a question regarding the second part, because I am not sure it can happen that jCtxExt is not null while the object it points to is. I mean, the Scala jCtx: Context is associated with a native ctx: Context which contains the weak reference jCtxExt = jCtx. If jCtx points to null, it means we have called dispose on it, and therefore we have deleted ctx and jCtxExt. The only way we can access jCtxExt while it points to null is if we manipulate it from the C++ side (copy it, for example, and then call dispose on the managed side), isn't it? If we return it to the client wrapped in a NodeExt, for example, it stops being a weak reference.

Probably I misunderstood what you said, so feel free to correct me please :)

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 435 at r2 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

I will fix this, and yes, I'll try to be consistent with the current style :)

This was changed already :)

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 368 at r2 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

I am going to restructure docs then

Done!

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 611 at r5 (raw file):

// NodeExt contains a ctx: ContextExt (JVM ref) and a nativeContext: ContextExt (handle)
Yeah, much better

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 611 at r5 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

// NodeExt contains a ctx: ContextExt (JVM ref) and a nativeContext: ContextExt (handle)
Yeah, much better

Changed now!

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 653 at r5 (raw file):

// Saves weak reference to JVM ContextExt in the native ContextExt

Done!

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 613 at r4 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Maybe put that directly into the comment? (e.g., // Since p owns the underlying context, deleting p deletes that too.)

This is strange, because the comment is there, but Reviewable is eating that line:

  if (env->ExceptionCheck() || !jCtxExt) {
    jCtxExt = nullptr;
    // This also deletes the underlying ctx
    delete (p);
    checkJvmException("failed to instantiate ContextExt class");
  }

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 740 at r5 (raw file):

Previously, bzz (Alexander) wrote…

Is there a reason not to follow the convention of having no empty line between getting the value and the error check here?

None, only myself not being consistent 😆

Member

ncordon commented Aug 27, 2019


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 847 at r5 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Is there a race condition here where JVM might access the pointer that has just been deleted before we finish updating the field? (I don't know that there is—but if so this could be use-after-free; I think nulling the handle first would avert that problem)

Yeah, good advice. I do not know if there is, but in any case we cannot make dispose a private method, so it is still worth nulling the field.

Member

ncordon commented Aug 27, 2019


src/main/native/project/build.properties, line 1 at r5 (raw file):

Previously, bzz (Alexander) wrote…

Why is this dir and file needed, in such an unusual place?

Sorry, it seems I opened sbt in the native folder by mistake. Deleted.

Complies with the team Reviewable review

Signed-off-by: ncordon <[email protected]>
Member

ncordon commented Aug 27, 2019


src/main/scala/org/bblfsh/client/v2/BblfshClient.scala, line 153 at r5 (raw file):

Previously, creachadair (M. J. Fromberger) wrote…

Just to double check that I understood this correctly:

The issue is that the handle is small, so it doesn't add memory pressure to the JVM heap, but it pins a large buffer on the mirror side which keeps the process RSS high until other JVM allocations trigger a GC?

I wonder if we could avoid doing this every time by only doing the GC when we know the buffer size is "fairly big" for some reasonable definition? (E.g., > 1MiB or something)

Yeah, almost. I would add that since the buffer is going to be much bigger than the JVM ContextExt objects we generate with it (which live in the JVM heap and only contain a Long), we would have to accumulate a huge number of ContextExt objects for the GC to wake up if we did not use this line. And we would be using a lot of RAM (I have peaked at 7 GiB) without the GC collecting the buffers 🙄 Furthermore, if the buffer is not released, the garbage collector cannot release the native memory for ContextExt (because it somehow interprets that ContextExt is using it, probably because of the call stack we generated and because it keeps a dependency tree internally to know what to deallocate first).

Contributor

@creachadair creachadair left a comment

Reviewed 4 of 4 files at r6.
Reviewable status: all files reviewed, 11 unresolved discussions (waiting on @bzz, @dennwc, and @ncordon)


src/main/native/jni_utils.cc, line 67 at r5 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

Yes, that was intentional. It was @bzz who first noticed it, but Scala classes for iterators extend an abstract class that is "templated" in a type: abstract class UastAbstractIter[T >: Null](var node: T, var treeOrder: Int, var iter: Long) and node must be accessed as an arbitrary object. Probably because Scala is compiling this in Java as a type that both NodeExt and Node extend

👍


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 613 at r4 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

This is strange, because the comment is there, but Reviewable is eating that line:

  if (env->ExceptionCheck() || !jCtxExt) {
    jCtxExt = nullptr;
    // This also deletes the underlying ctx
    delete (p);
    checkJvmException("failed to instantiate ContextExt class");
  }

Yeah, I don't know what was going on there. Maybe I just looked at the wrong section.


src/main/native/org_bblfsh_client_v2_libuast_Libuast.cc, line 847 at r5 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

Yeah, good advice. I do not know if there is, but in any case we cannot make dispose a private method, so it is still worth nulling the field.

It definitely makes sense to null it either way, as you've done here, but I meant the order between nulling and deletion.

But then I realized it doesn't actually matter, because even if we null first, it is possible for a field access to capture the handle just before the null and use it after the free, even if we've done it in the other order. So we are going to have to rely on the runtime to synchronize either way.


src/main/scala/org/bblfsh/client/v2/BblfshClient.scala, line 153 at r5 (raw file):

Previously, ncordon (Nacho Cordón) wrote…

Yeah, almost. I would add that since the buffer is going to be much bigger than the JVM ContextExt objects we generate with it (which live in the JVM heap and only contain a Long), we would have to accumulate a huge number of ContextExt objects for the GC to wake up if we did not use this line. And we would be using a lot of RAM (I have peaked at 7 GiB) without the GC collecting the buffers 🙄 Furthermore, if the buffer is not released, the garbage collector cannot release the native memory for ContextExt (because it somehow interprets that ContextExt is using it, probably because of the call stack we generated and because it keeps a dependency tree internally to know what to deallocate first).

👍

@ncordon ncordon merged commit 44375ee into master Aug 28, 2019
@ncordon ncordon deleted the fix-leaky-keys branch August 28, 2019 17:07