@@ -69,13 +69,13 @@ rows having the index value of `'c'`.
6969| Reduce multiple values | ` df['z'].mean(skipna = False) ` | ` mean(df.z) ` |
7070| | ` df['z'].mean() ` | ` mean(skipmissing(df.z)) ` |
7171| | ` df[['z']].agg(['mean']) ` | ` combine(df, :z => mean ∘ skipmissing) ` |
72- | Add new columns | ` df.assign(z1 = df['z'] + 1) ` | ` df.z1 = df.z .+ 1 ` |
73- | | | ` insertcols!(df, :z1 => df.z .+ 1) ` |
74- | | | ` transform(df, :z => (v -> v .+ 1) => :z1) ` |
72+ | Add new columns | ` df.assign(z1 = df['z'] + 1) ` | ` transform(df, :z => (v -> v .+ 1) => :z1) ` |
7573| Rename columns | ` df.rename(columns = {'x': 'x_new'}) ` | ` rename(df, :x => :x_new) ` |
7674| Pick & transform columns | ` df.assign(x_mean = df['x'].mean())[['x_mean', 'y']] ` | ` select(df, :x => mean, :y) ` |
7775| Sort rows | ` df.sort_values(by = 'x') ` | ` sort(df, :x) ` |
7876| | ` df.sort_values(by = ['grp', 'x'], ascending = [True, False]) ` | ` sort(df, [:grp, order(:x, rev = true)]) ` |
77+ | Drop missing rows | ` df.dropna() ` | ` dropmissing(df) ` |
78+ | Select unique rows | ` df.drop_duplicates() ` | ` unique(df) ` |
7979
8080Note that pandas skips ` NaN ` values in its analytic functions by default. By contrast,
8181Julia functions do not skip ` NaN ` values. If necessary, you can filter out
@@ -93,6 +93,21 @@ examples above do not synchronize the column names between pandas and DataFrames
9393(you can pass ` renamecols=false ` keyword argument to ` select ` , ` transform ` and
9494` combine ` functions to retain old column names).
9595
96+ ### Mutating operations
97+
98+ | Operation | pandas | DataFrames.jl |
99+ | :----------------- | :---------------------------------------------------- | :------------------------------------------- |
100+ | Add new columns | ` df['z1'] = df['z'] + 1 ` | ` df.z1 = df.z .+ 1 ` |
101+ | | | ` transform!(df, :z => (x -> x .+ 1) => :z1) ` |
102+ | | ` df.insert(1, 'const', 10) ` | ` insertcols!(df, 2, :const => 10) ` |
103+ | Rename columns | ` df.rename(columns = {'x': 'x_new'}, inplace = True) ` | ` rename!(df, :x => :x_new) ` |
104+ | Sort rows | ` df.sort_values(by = 'x', inplace = True) ` | ` sort!(df, :x) ` |
105+ | Drop missing rows | ` df.dropna(inplace = True) ` | ` dropmissing!(df) ` |
106+ | Select unique rows | ` df.drop_duplicates(inplace = True) ` | ` unique!(df) ` |
107+
108+ Generally speaking, DataFrames.jl follows the Julia convention of appending ` ! ` to a
109+ function name to indicate that the function mutates at least one of its arguments.
110+
96111### Grouping data and aggregation
97112
98113DataFrames.jl provides a ` groupby ` function to apply operations