Sample Data Frames by a Group Variable
Arguments
- data
A data frame or tibble with at least 1 variable.
- group
A variable in data that will be used for groupings.
- n, prop
Supply either n, the number of groups, or prop, the proportion of groups to select. n must be an integer greater than or equal to 1. prop must be a numeric value greater than 0 and less than or equal to 1. Default is n = 1.
- prob
Optional. A vector of probability weights for obtaining the elements of the group being sampled. Must be the same length as the number of unique values in data's group variable.
- group_output
A logical value, TRUE or FALSE. If TRUE, returns a grouped tibble. Default is FALSE.
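The arguments above describe sampling whole groups rather than individual rows. The following is a minimal base R sketch of that idea, not the package's implementation. The helper name sketch_group_sample is hypothetical, it takes the group column as a string rather than an unquoted variable, and the way prop is converted to a group count (rounding, with a minimum of one group) is an assumption.

```r
# Hypothetical helper illustrating group-level sampling; not the package's code.
sketch_group_sample <- function(data, group, n = 1, prop = NULL, prob = NULL) {
  groups <- unique(data[[group]])
  if (!is.null(prop)) {
    stopifnot(prop > 0, prop <= 1)
    # Assumption: round to the nearest whole group and keep at least one.
    n <- max(1, round(prop * length(groups)))
  }
  stopifnot(n >= 1, n <= length(groups))
  if (!is.null(prob)) {
    # One weight per unique group value.
    stopifnot(length(prob) == length(groups))
  }
  # Sample positions, not values, so a single numeric group is handled safely.
  picked <- groups[sample.int(length(groups), size = n, prob = prob)]
  # Keep every row that belongs to a selected group.
  data[data[[group]] %in% picked, , drop = FALSE]
}

df <- data.frame(x = 1:10, group_col = rep(1:5, each = 2))
res_prop <- sketch_group_sample(df, "group_col", prop = 0.2)
res_n <- sketch_group_sample(df, "group_col", n = 2, prob = c(5, 1, 1, 1, 1))
```

Here res_prop holds both rows of one group and res_n holds all four rows of two groups. In the second call the weights make the first group more likely to be selected than the others.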
Examples
# Ten rows in five groups of two rows each
vec_coords <- 1:10
df_data <-
  data.frame(
    "x" = vec_coords,
    "y" = vec_coords,
    "group_col" = group_numbers(1:5) |> rep(each = 2)
  )
# Sample 20% of the five groups (one group); the group chosen is random
df_sampled_data_prop <-
  df_data |>
  group_sample(group_col, prop = .2)
df_sampled_data_prop
#>    x  y group_col
#> 1  9  9         5
#> 2 10 10         5
# Sample two whole groups; the groups chosen are random
df_sampled_data_n <-
  df_data |>
  group_sample(group_col, n = 2)
df_sampled_data_n
#>   x y group_col
#> 1 5 5         3
#> 2 6 6         3
#> 3 7 7         4
#> 4 8 8         4