Xplorer provides a set of functions for convenient exploration of
datasets in R: see() and browse().
see() prints the following details of an object to the console:
- attributes (useful to inspect labels)
- class, typeof, mode, storage.mode
- table (frequency and percentage of values)
- summary statistics
Several other packages attempt to do something similar, but none of
them offers the same ease of use and thorough inspection of objects.
For example, base R offers the functions summary(), str() and
structure(). However, each of these is limited in what it displays and
leaves other aspects untouched. see() combines a number of functions
that would otherwise have to be invoked separately and mirrors Stata’s
‘codebook’.
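To make the comparison concrete, here is a rough sketch of the separate base R calls needed to collect similar information by hand. It only illustrates the manual workflow and is not Xplorer's implementation; the object names (x, type_info, freq, pct, freq_table) are chosen for this example.

```r
# Example vector carrying a "label" attribute
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
attr(x, "label") <- "mylabel"

# 1. Attributes (e.g. variable labels)
attributes(x)

# 2. Type information, collected in one named character vector
type_info <- c(
  class        = class(x),
  typeof       = typeof(x),
  mode         = mode(x),
  storage.mode = storage.mode(x)
)
type_info

# 3. Frequency table with percentages
freq <- table(x, useNA = "ifany")
pct  <- 100 * prop.table(freq)
freq_table <- data.frame(
  Frequency = as.vector(freq),
  Percent   = as.vector(pct),
  row.names = names(freq)
)
freq_table

# 4. Basic summary statistics
summary(x)
```

Each step is a separate call with its own output format, which is the overhead a single see() call is meant to remove.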
see() is an S3 generic and currently comes with methods for the
following data types:
- character
- factor
- numeric
- data.frame
- labelled
- default (all other types)
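For readers unfamiliar with S3, the sketch below shows how a generic with per-type methods and a default fallback dispatches in general. The generic describe() and its methods are hypothetical and written only for this illustration; they are not part of Xplorer and do not reproduce what see() prints.

```r
# Hypothetical generic; named describe() to avoid clashing with see()
describe <- function(x, ...) UseMethod("describe")

# Fallback used when no more specific method exists
describe.default <- function(x, ...) {
  cat("Object of class:", paste(class(x), collapse = ", "), "\n")
  invisible(x)
}

# Method for numeric vectors (doubles and integers both dispatch here)
describe.numeric <- function(x, ...) {
  cat("Numeric vector, range:", min(x, na.rm = TRUE), "to",
      max(x, na.rm = TRUE), "\n")
  invisible(x)
}

# Method for factors
describe.factor <- function(x, ...) {
  cat("Factor with levels:", paste(levels(x), collapse = ", "), "\n")
  invisible(x)
}

describe(c(1.5, 2, 3))          # dispatches to describe.numeric
describe(factor(c("a", "b")))   # dispatches to describe.factor
describe(TRUE)                  # no logical method, so describe.default
```

Supporting a new data type under this pattern only requires adding another method; the generic itself stays unchanged.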
library("Xplorer")
# Create example data
numeric.labelled <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)
attributes(numeric.labelled)$label <- "mylabel"
# see the object
see(numeric.labelled)
#> $label
#> [1] "mylabel"
#>
#>
#>
#> Class Typeof Mode Storage.mode
#> numeric double numeric double
#>
#>
#> Frequency Percent
#> 1 1 10
#> 2 2 20
#> 3 3 30
#> 4 4 40
#>
#>
#> N Missings Distinct.values Min Max Mean Median Std.Deviation Variance
#> 10 0 4 1 4 3 3 1.054093 1.111111
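The summary row above can be checked by hand with base R. The sketch below recomputes each column for the same vector. It assumes that N is the length of the vector and that the standard deviation and variance are the sample versions (denominator n - 1) returned by sd() and var(); how see() computes these internally is not confirmed here.

```r
x <- c(1, 2, 2, 3, 3, 3, 4, 4, 4, 4)

stats <- data.frame(
  N               = length(x),               # assumption: total length
  Missings        = sum(is.na(x)),
  Distinct.values = length(unique(x)),
  Min             = min(x, na.rm = TRUE),
  Max             = max(x, na.rm = TRUE),
  Mean            = mean(x, na.rm = TRUE),
  Median          = median(x, na.rm = TRUE),
  Std.Deviation   = sd(x, na.rm = TRUE),     # sample SD (n - 1)
  Variance        = var(x, na.rm = TRUE)     # sample variance (n - 1)
)
stats
```

With a mean of 3, the squared deviations sum to 10, so the sample variance is 10/9 and the standard deviation is its square root, matching the values printed by see().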
The aim of browse() is to create a data.frame that allows you to use
the search field of the RStudio View() function to search for
variables based on their names or values, which is currently not
possible. Particularly when a dataset is very large and no
comprehensive codebook is available, quickly searching for variable
names makes finding variables of interest much easier.
If you are not working in RStudio, View() may not work in the same way
and may not offer a search option. Hence browse() does not call View()
directly. Instead, store the output of browse() in a new object first,
which can then be opened with View() in RStudio. browse() can also be
called on a data.frame directly to print the output to the console.
library("Xplorer")
# Create an example data.frame and add some variable labels
data(mtcars)
attributes(mtcars$mpg)$label <- "Miles/(US) gallon"
attributes(mtcars$cyl)$label <- "Number of cylinders"
attributes(mtcars$disp)$label <- "Displacement (cu.in.)"
attributes(mtcars$hp)$label <- "Gross horsepower"
attributes(mtcars$drat)$label <- "Rear axle ratio"
attributes(mtcars$wt)$label <- "Weight (1000 lbs)"
attributes(mtcars$qsec)$label <- "1/4 mile time"
attributes(mtcars$vs)$label <- "V/S"
# Call browse() directly...
browse(mtcars)
#> variable_name variable_label range distinct_values
#> 1 mpg Miles/(US) gallon 10.4, 33.9 25
#> 2 cyl Number of cylinders 4, 8 3
#> 3 disp Displacement (cu.in.) 71.1, 472.0 27
#> 4 hp Gross horsepower 52, 335 22
#> 5 drat Rear axle ratio 2.76, 4.93 22
#> 6 wt Weight (1000 lbs) 1.513, 5.424 29
#> 7 qsec 1/4 mile time 14.5, 22.9 30
#> 8 vs V/S 0, 1 2
#> 9 am - 0, 1 2
#> 10 gear - 3, 5 3
#> 11 carb - 1, 8 6
#> class typeof N missings
#> 1 numeric double 32 0
#> 2 numeric double 32 0
#> 3 numeric double 32 0
#> 4 numeric double 32 0
#> 5 numeric double 32 0
#> 6 numeric double 32 0
#> 7 numeric double 32 0
#> 8 numeric double 32 0
#> 9 numeric double 32 0
#> 10 numeric double 32 0
#> 11 numeric double 32 0
# ...or assign to df and open with View()
df <- browse(mtcars)
Then call View(df) and use the search field to search variable names
and labels. Or, even simpler, just use View(browse(mtcars)).
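To show what such a searchable overview involves, the following is a minimal base R sketch that builds a similar one-row-per-variable table. The function browse_sketch() and the helper get_label() are hypothetical names for this example and are not Xplorer's implementation. The sketch assumes labels live in a "label" attribute and that unlabelled or non-numeric entries are shown as "-".

```r
# Hypothetical re-creation of a browse()-like overview; not Xplorer's code
browse_sketch <- function(data) {
  stopifnot(is.data.frame(data))

  # Return the "label" attribute of a column, or "-" when it has none
  get_label <- function(col) {
    lab <- attr(col, "label", exact = TRUE)
    if (is.null(lab)) "-" else as.character(lab)[1]
  }

  # Minimum and maximum as one string, for numeric columns only
  get_range <- function(col) {
    if (is.numeric(col)) {
      paste(range(col, na.rm = TRUE), collapse = ", ")
    } else {
      "-"
    }
  }

  data.frame(
    variable_name   = names(data),
    variable_label  = unname(vapply(data, get_label, character(1))),
    range           = unname(vapply(data, get_range, character(1))),
    distinct_values = unname(vapply(data, function(col) length(unique(col)),
                                    integer(1))),
    class           = unname(vapply(data, function(col) class(col)[1],
                                    character(1))),
    typeof          = unname(vapply(data, typeof, character(1))),
    N               = nrow(data),
    missings        = unname(vapply(data, function(col) sum(is.na(col)),
                                    integer(1))),
    stringsAsFactors = FALSE
  )
}

# Work on a copy so the built-in dataset is left untouched
cars <- mtcars
attr(cars$mpg, "label") <- "Miles/(US) gallon"

overview <- browse_sketch(cars)

# Console equivalent of typing "gallon" into the View() search field
overview[grepl("gallon", overview$variable_label), ]
```

Because every variable becomes a row, a plain text search over the result matches names and labels, which is what makes the View() search field useful on this kind of table.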
You can install Xplorer from GitHub with:
remotes::install_github("fschaffner/Xplorer")
Please report issues or requests for additional functionality to https://github.com/fschaffner/Xplorer/issues.