-
Notifications
You must be signed in to change notification settings - Fork 61
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Validate connections before using them and expose validation as config. #249
Conversation
return new DatabaseConnectionConfiguration(adapter, host, dbName, port, parallelTesting, username, password); | ||
Optional<Integer> connectionMaxRetries = loadOpt(dbInfo, "connection_max_retries", | ||
envVar(CONNECTION_MAX_RETRIES, dbNameKey), prop(CONNECTION_MAX_RETRIES, dbNameKey), new ToLong()) | ||
.transform(Long::intValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is super annoying, but you need to explicitly convert these to int
s like this because the cast at https://github.com/LiveRamp/jack/blob/master/jack-core/src/com/rapleaf/jack/DatabaseConnectionConfiguration.java#L164 is not sufficient, and the underlying type is a Long
, not an Integer
because the YAML loader reads all numbers as Long
s. The only reason we don't see this issue with the port
config as well is because we never actually treat it as an int
later on.
Without doing this, we'll get a ClassCastException
in DatabaseConnection
at this line: https://github.com/LiveRamp/jack/pull/249/files#diff-bb343e2007095498d574a0c34e39f6c9R98
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation here!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Good explanation. Could you add some of that as a comment? I think I'd have a hard time reconstructing that just from the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I added the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM % a few nits. You'll probably want to get some other eyes on this from #dev-cat (in other words, I don't trust myself to be a primary reviewer on jack
code).
FYI @pwestling is no longer with us at LiveRamp...
jack-core/src/com/rapleaf/jack/DatabaseConnectionConfiguration.java
Outdated
Show resolved
Hide resolved
return new DatabaseConnectionConfiguration(adapter, host, dbName, port, parallelTesting, username, password); | ||
Optional<Integer> connectionMaxRetries = loadOpt(dbInfo, "connection_max_retries", | ||
envVar(CONNECTION_MAX_RETRIES, dbNameKey), prop(CONNECTION_MAX_RETRIES, dbNameKey), new ToLong()) | ||
.transform(Long::intValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the explanation here!
8b5e3e9
to
f1f30c2
Compare
Unfortunately, this PR doesn't address connections that are pooled by the The other option would be to implement Edit: So it does actually work. At some point
|
f1f30c2
to
f790a75
Compare
I decided to implement the above comment, which validates connections from the connection pool on borrowing them. The commit message, copied below, explains the logic for doing this:
|
jack-core/pom.xml
Outdated
@@ -71,7 +71,7 @@ | |||
<dependency> | |||
<groupId>org.apache.commons</groupId> | |||
<artifactId>commons-pool2</artifactId> | |||
<version>2.4.2</version> | |||
<version>2.7.0</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, there is an issue in commons-pool2
described here: #195 (comment)
I roughly remembered that the root cause is that the new timer object is not run as a daemon thread and can only be shutdown manually, which will block workflow termination. Not sure if the issue has been addressed in 2.7.0
. I highly doubt that.
Although the object pool should be explicitly closed in general / as a best practice, but when I left LiveRamp, most of the workflows don't do that. Also when the object pool was introduced to jack, one of the goal was to shield user from having to worry about closing and releasing the resource.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, okay. I will put it back to 2.4.2 then. The bug which caused me to update this is not so serious, it just means we validate the connections on create as well as borrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any other feedback, or does this look good from your side @tuliren? You seem like the correct person to ask considering you have the most commits to this repo ;)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good.
I guess maybe there should be a comment here that warns people about the commons-pool2
version issue. Can you add that with this PR? Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added the comment. Thanks for the feedback.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
18d4a7c
to
3077e05
Compare
This is technically covered by the implementation of most methods in `AbstractDatabaseModel`, which check for exceptions and then call `resetConnection`, but this explicitly ensures that no matter the underlying method implementation, we'll still validate the connection before using it. We also validate while connections are idle to reduce the number of destroy/create cycles that we might need at the time of borrowing a connection. This is especially relevant for long running processes that might attempt to use very old pooled connections.
3077e05
to
e42c212
Compare
isClosed
is not sufficient to validate connections. You need to check that they are actually still usable, which is whatisValid
does. This PR prevents applications from seeing issues when databases failover and connections are no longer valid.