Sample Data Frames by a Group Variable

Usage

group_sample(
  data,
  group,
  n = 1,
  prop = NULL,
  prob = NULL,
  group_output = FALSE
)

Arguments

data

A data frame or tibble with at least 1 variable.

group

The variable in data that defines the groups to sample from.

n, prop

Supply either n, the number of groups to select, or prop, the proportion of groups to select. n must be an integer greater than or equal to 1. prop must be a numeric value greater than 0 and less than or equal to 1.

Default is n = 1.
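
The idea of sampling whole groups rather than individual rows can be sketched in base R. This is only an illustration, not the package's implementation: sample_groups_sketch is a hypothetical helper, and turning prop into a group count by rounding is an assumption that this page does not confirm.

```r
# Hypothetical helper: a base-R sketch of group-level sampling.
# It is NOT the implementation of group_sample().
sample_groups_sketch <- function(data, group_name, n = 1, prop = NULL) {
  groups <- unique(data[[group_name]])
  if (!is.null(prop)) {
    # Assumption: convert the proportion to a count by rounding,
    # keeping at least one group.
    n <- max(1, round(prop * length(groups)))
  }
  # Sample positions rather than values to avoid sample()'s
  # special handling of a single number.
  picked <- groups[sample.int(length(groups), size = n)]
  # Keep every row that belongs to a selected group.
  data[data[[group_name]] %in% picked, , drop = FALSE]
}

df_sketch <- data.frame(x = 1:10, g = rep(1:5, each = 2))
set.seed(1)
res_n <- sample_groups_sketch(df_sketch, "g", n = 2)
res_prop <- sample_groups_sketch(df_sketch, "g", prop = 0.2)
```

In this sketch, n = 2 keeps the four rows of two groups, and prop = 0.2 of five groups keeps the two rows of one group.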

prob

Optional. A vector of probability weights used when selecting groups, with one weight per group. Must be the same length as the number of unique values in data's group variable.
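
The length requirement can be illustrated with base R's sample.int(), which accepts a prob vector of the same kind. This is a sketch only: the data frame, the weights, and the assumption that weights line up with groups in order of first appearance are all illustrative and not stated on this page.

```r
# Illustrative data: 5 groups of 2 rows each.
df_weights <- data.frame(x = 1:10, g = rep(1:5, each = 2))

groups <- unique(df_weights$g)          # one entry per distinct group
weights <- c(0.5, 0.2, 0.1, 0.1, 0.1)   # hypothetical weights, one per group

# The weight vector must match the number of unique groups.
stopifnot(length(weights) == length(groups))

set.seed(42)
# Assumption: weights are matched to groups in order of first appearance.
picked <- groups[sample.int(length(groups), size = 1, prob = weights)]
rows <- df_weights[df_weights$g %in% picked, , drop = FALSE]
```

Here the first group is the most likely to be drawn, and whichever group is picked contributes all of its rows.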

group_output

A logical value, TRUE or FALSE. If TRUE, returns a grouped tibble.

Default is FALSE.

Value

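A data frame containing every row of data that belongs to the sampled groups. If group_output = TRUE, a grouped tibble.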

Examples

# Build 10 rows spread across 5 groups, 2 rows per group
vec_coords <- 1:10
df_data <-
 data.frame(
   "x" = vec_coords,
   "y" = vec_coords,
   "group_col" = group_numbers(1:5) |> rep(each = 2)
 )

# Sample 20% of the 5 groups (1 group); all rows of that group are returned
df_sampled_data_prop <-
 df_data |>
 group_sample(group_col, prop = .2)

df_sampled_data_prop
#>    x  y group_col
#> 1  9  9         5
#> 2 10 10         5

# Sample 2 groups; all rows of both groups are returned
df_sampled_data_n <-
 df_data |>
 group_sample(group_col, n = 2)

df_sampled_data_n
#>   x y group_col
#> 1 5 5         3
#> 2 6 6         3
#> 3 7 7         4
#> 4 8 8         4