-
Notifications
You must be signed in to change notification settings - Fork 76
Join kdocs #1574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Join kdocs #1574
Conversation
Jolanrensen
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I ever made any KoDEx proficiency awards, you'd have earned one by now :D.
Great docs! Just some small notes: Check @ExcludeFromSources to not grow the .jar size. Anything that's not reachable by the user and just used to @include somewhere else can be excluded.
Adding a String example inside the DSLs would be nice
| * such columns are treated as distinct. Such a column from the right [DataFrame] will be automatically | ||
| * renamed in the resulting [DataFrame]. | ||
| */ | ||
| private interface JoinBehavior |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ExcludeFromSources, check other places too :)
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/join.kt
Outdated
Show resolved
Hide resolved
| private interface JoinBehavior | ||
|
|
||
| /** | ||
| * Joins this [DataFrame] with [other][\other] [DataFrame] using selected key columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*the other, the selected key columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check other places too
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/join.kt
Outdated
Show resolved
Hide resolved
| * // Join rows where the `date` value in the right DataFrame | ||
| * // falls within the interval defined by the `startDate` and `endDate` | ||
| * // values in the left DataFrame. | ||
| * dfLeft.{@get [JoinWithMethod] joinWith}(dfRight) { right.date in startDate..endDate } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same as the other example, can you add another with Strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it looks really awful in this DSL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still, it's important for people to see it's possible
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/joinWith.kt
Outdated
Show resolved
Hide resolved
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/joinWith.kt
Outdated
Show resolved
Hide resolved
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/joinWith.kt
Outdated
Show resolved
Hide resolved
| joinWith(right, JoinType.Full, joinExpression) | ||
|
|
||
| /** | ||
| * Performs [filter join][JoinType.Filter] of this [DataFrame] with the [right][\right] [DataFrame] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm wondering... Is "Performs a filter join of this DataFrame with the right DataFrame" better? Depends whether we think of "join" as a special concept or as a generic action. You can say "I performed a cleanup", but at the same time "I performed gradlew clean".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, a good question!
I believe you’re right — “performs a join” is better since join is used here as a technical operation rather than just an action.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
check correct usage of "a" vs "an" :)
core/src/main/kotlin/org/jetbrains/kotlinx/dataframe/api/join.kt
Outdated
Show resolved
Hide resolved
| filterJoin(other) { columns.toColumnSet() } | ||
|
|
||
| /** | ||
| * Performs a [exclude join][JoinType.Exclude] of this [DataFrame] with [other][\other] [DataFrame] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*an
| ): DataFrame<A> = join(other, type) { columns.toColumnSet() } | ||
|
|
||
| /** | ||
| * Performs a [inner join][JoinType.Inner] of this [DataFrame] with the [other][\other] [DataFrame] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*an
| private interface JoinDocs | ||
|
|
||
| // `join` method used in the example | ||
| @Suppress("ktlint:standard:class-naming") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you @Suppress("ClassName") you suppress both intelliJ and KtLint at the same time :)
| * | ||
| * In such cases, columns with duplicate names are automatically renamed | ||
| * using the pattern `"\$name\$n"`, where `name` is the original column name | ||
| * and `n` is a unique index (1, 2, 3, and so on); the first name column goes without a number. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the first time the name of the column is encountered, no number is appended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you mean "not added"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"appending" is adding something at the end
| * dfLeft.{@get [JoinMethod] join}(dfRight, "name", "city") | ||
| * ``` | ||
| */ | ||
| private interface JoinStringApiExample |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
exclude from sources
| leftJoin(other) { columns.toColumnSet() } | ||
|
|
||
| /** | ||
| * Performs a [right join][JoinType.Right] of this [DataFrame] with [other][\other] [DataFrame] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*the other
|
|
||
| /** | ||
| * Performs a [right join][JoinType.Right] of this [DataFrame] with [other][\other] [DataFrame] | ||
| * using selected key columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the selected key columns
| * Each join type has a corresponding shortcut function: | ||
| * [innerJoinWith], [leftJoinWith], [rightJoinWith], [fullJoinWith], [filterJoinWith], and [excludeJoinWith]. | ||
| * | ||
| * See also [join], which performs a join by exact values equality in the selected columns. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*exact value equality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Like "I sort my directory by file size" not "files size". It's an attribute of a single file, so it should be in singular form. (In Dutch/German it would be 1 word)
It's a bit strange with equality, I agree, as equality can only be expressed between two values. So maybe it could be better written like "which performs a join based on the exact equality between values in the selected columns"
| ): DataFrame<A> = joinWithImpl(right, type, addNewColumns = type.addNewColumns, joinExpression) | ||
|
|
||
| /** | ||
| * Performs a [inner join][JoinType.Inner] of this [DataFrame] with the [right][\right] [DataFrame] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*an
| * | ||
| * // String API; join rows where `score` value in the left `DataFrame` is higher than 3.4 | ||
| * // and the `passed` value in the right `DataFrame` is `true`. | ||
| * dfLeft.{@get [JoinWithMethod] joinWith}(dfRight) { "score"<Int>() > 3.4 && right["passed"] as Boolean } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
You can also avoid the cast like this by writing:
right[{ "passed"<Boolean>() }]
or
right.getValue<Boolean>("passed")
whatever you like best :)
please also add this example to the "normal" join
Closes #1387.