In this vignette, we will demonstrate how to retrieve and work with metadata for NBDC studies, including data dictionaries, levels tables, session/event information, and identifier columns.
Overview
The NBDCtools package provides functions to access the following metadata elements for ABCD and HBCD studies:
- Data dictionary: Variable and table definitions, data types, and other information.
- Levels table: Values, labels, and order for categorical variables.
- Sessions table: Information about study sessions/events.
- Identifier columns: Variables used to identify unique observations.
Data dictionary
The data dictionary contains information about all variables in the tabulated data for a given study, including their names, labels, data types, which tables they belong to, etc. To read more about the structure of the data dictionary for NBDC studies, see here.
Basic usage
# Get data dictionary for the ABCD Study (latest release)
get_dd("abcd")
#> # A tibble: 83,206 × 44
#> study domain sub_domain source metric atlas table_name table_label name
#> <chr> <chr> <chr> <chr> <chr> <chr> <glue> <glue> <glu>
#> 1 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 2 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 3 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 4 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 5 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 6 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 7 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 8 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 9 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 10 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> # ℹ 83,196 more rows
#> # ℹ 35 more variables: label <glue>, instruction <chr>, header <chr>,
#> # note <chr>, unit <chr>, type_var <chr>, type_data <chr>, type_level <chr>,
#> # type_field <chr>, order_display <chr>, branching_logic <chr>,
#> # label_es <chr>, instruction_es <chr>, header_es <chr>, note_es <chr>,
#> # table_nda <chr>, table_nda_5_0 <chr>, table_redcap <chr>, name_nda <chr>,
#> # name_deap <chr>, name_redcap <chr>, name_redcap_exp <chr>, …
# Get data dictionary for the HBCD Study (latest release)
get_dd("hbcd")
#> # A tibble: 48,699 × 30
#> study domain source table_name table_label name label instruction header
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Spec… NA NA
#> 2 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Nail… NA NA
#> 3 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Numb… NA NA
#> 4 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Spec… NA NA
#> 5 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Any … NA NA
#> 6 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Scre… NA NA
#> 7 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Scre… NA NA
#> 8 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Conf… NA NA
#> 9 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Conf… NA NA
#> 10 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Conf… NA NA
#> # ℹ 48,689 more rows
#> # ℹ 21 more variables: note <chr>, unit <chr>, type_var <chr>, type_data <chr>,
#> # type_level <chr>, type_field <chr>, order_display <chr>,
#> # branching_logic <chr>, label_es <chr>, instruction_es <chr>,
#> # header_es <chr>, note_es <chr>, name_short <chr>, name_stata <chr>,
#> # url_table <chr>, url_warn_use <chr>, url_warn_data <chr>,
#> # url_table_warn_use <chr>, url_table_warn_data <chr>, …
# Get data dictionary for a specific release
get_dd("abcd", release = "6.0")
#> # A tibble: 83,206 × 44
#> study domain sub_domain source metric atlas table_name table_label name
#> <chr> <chr> <chr> <chr> <chr> <chr> <glue> <glue> <glu>
#> 1 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 2 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 3 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 4 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 5 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 6 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 7 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 8 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 9 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 10 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> # ℹ 83,196 more rows
#> # ℹ 35 more variables: label <glue>, instruction <chr>, header <chr>,
#> # note <chr>, unit <chr>, type_var <chr>, type_data <chr>, type_level <chr>,
#> # type_field <chr>, order_display <chr>, branching_logic <chr>,
#> # label_es <chr>, instruction_es <chr>, header_es <chr>, note_es <chr>,
#> # table_nda <chr>, table_nda_5_0 <chr>, table_redcap <chr>, name_nda <chr>,
#> # name_deap <chr>, name_redcap <chr>, name_redcap_exp <chr>, …
# If you are not sure which releases are available, pass an invalid
# release value; the resulting error message lists the valid releases
get_dd("abcd", release = "999.0")
#> Error in `get_dd()`:
#> ! Invalid release '999.0'. Valid releases are: 6.0, latest
#> If you believe this version should exist, it might be the metadata
#> is outdated. Please update the `NBDCtoolsData` package to get the latest metadata.
Study-specific functions
For convenience, you can use study-specific functions that do not require specifying the study parameter:
# ABCD-specific function
get_dd_abcd()
#> # A tibble: 83,206 × 44
#> study domain sub_domain source metric atlas table_name table_label name
#> <chr> <chr> <chr> <chr> <chr> <chr> <glue> <glue> <glu>
#> 1 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 2 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 3 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 4 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 5 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 6 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 7 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 8 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 9 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 10 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> # ℹ 83,196 more rows
#> # ℹ 35 more variables: label <glue>, instruction <chr>, header <chr>,
#> # note <chr>, unit <chr>, type_var <chr>, type_data <chr>, type_level <chr>,
#> # type_field <chr>, order_display <chr>, branching_logic <chr>,
#> # label_es <chr>, instruction_es <chr>, header_es <chr>, note_es <chr>,
#> # table_nda <chr>, table_nda_5_0 <chr>, table_redcap <chr>, name_nda <chr>,
#> # name_deap <chr>, name_redcap <chr>, name_redcap_exp <chr>, …
# HBCD-specific function
get_dd_hbcd(release = "1.0")
#> # A tibble: 48,699 × 30
#> study domain source table_name table_label name label instruction header
#> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Spec… NA NA
#> 2 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Nail… NA NA
#> 3 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Numb… NA NA
#> 4 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Spec… NA NA
#> 5 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Any … NA NA
#> 6 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Scre… NA NA
#> 7 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Scre… NA NA
#> 8 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Conf… NA NA
#> 9 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Conf… NA NA
#> 10 Core BioSpecim… Biolo… bio_bm_bi… USDTL Nail… bio_… Conf… NA NA
#> # ℹ 48,689 more rows
#> # ℹ 21 more variables: note <chr>, unit <chr>, type_var <chr>, type_data <chr>,
#> # type_level <chr>, type_field <chr>, order_display <chr>,
#> # branching_logic <chr>, label_es <chr>, instruction_es <chr>,
#> # header_es <chr>, note_es <chr>, name_short <chr>, name_stata <chr>,
#> # url_table <chr>, url_warn_use <chr>, url_warn_data <chr>,
#> # url_table_warn_use <chr>, url_table_warn_data <chr>, …
Filtering by variables
You can retrieve a subset of the data dictionary for specific variables:
# Get data dictionary for specific variables
vars_of_interest <- c(
"ab_g_dyn__visit_dtt",
"ab_g_dyn__visit_age",
"ab_g_stc__design_id__fam"
)
get_dd("abcd", vars = vars_of_interest)
#> # A tibble: 3 × 44
#> study domain sub_domain source metric atlas table_name table_label name label
#> <chr> <chr> <chr> <chr> <chr> <chr> <glue> <glue> <glu> <glu>
#> 1 Core ABCD … Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g… Visi…
#> 2 Core ABCD … Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g… Visi…
#> 3 Core ABCD … Standard … Gener… NA NA ab_g_stc ABCD Stati… ab_g… Desi…
#> # ℹ 34 more variables: instruction <chr>, header <chr>, note <chr>, unit <chr>,
#> # type_var <chr>, type_data <chr>, type_level <chr>, type_field <chr>,
#> # order_display <chr>, branching_logic <chr>, label_es <chr>,
#> # instruction_es <chr>, header_es <chr>, note_es <chr>, table_nda <chr>,
#> # table_nda_5_0 <chr>, table_redcap <chr>, name_nda <chr>, name_deap <chr>,
#> # name_redcap <chr>, name_redcap_exp <chr>, url_table <chr>,
#> # url_warn_use <chr>, url_warn_data <chr>, url_table_warn_use <chr>, …
Filtering by tables
You can also retrieve a subset of the data dictionary for specific tables:
# Get data dictionary for specific tables
tables_of_interest <- c(
"ab_g_dyn",
"ab_g_stc"
)
get_dd_abcd(tables = tables_of_interest)
#> # A tibble: 73 × 44
#> study domain sub_domain source metric atlas table_name table_label name
#> <chr> <chr> <chr> <chr> <chr> <chr> <glue> <glue> <glu>
#> 1 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 2 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 3 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 4 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 5 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 6 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 7 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 8 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 9 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 10 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> # ℹ 63 more rows
#> # ℹ 35 more variables: label <glue>, instruction <chr>, header <chr>,
#> # note <chr>, unit <chr>, type_var <chr>, type_data <chr>, type_level <chr>,
#> # type_field <chr>, order_display <chr>, branching_logic <chr>,
#> # label_es <chr>, instruction_es <chr>, header_es <chr>, note_es <chr>,
#> # table_nda <chr>, table_nda_5_0 <chr>, table_redcap <chr>, name_nda <chr>,
#> # name_deap <chr>, name_redcap <chr>, name_redcap_exp <chr>, …
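Because the data dictionary is returned as a tibble, it can be explored with ordinary data-frame operations. The following is a minimal sketch in base R. It assumes a small hand-built data frame, `dd_toy` (a hypothetical stand-in, not real package output), that keeps only the `name` and `table_name` columns for the three variables from the variable-filtering example. With NBDCtools installed, the result of `get_dd()` could presumably be used in its place.

```r
# Hypothetical stand-in for a few data dictionary rows (only two columns kept)
dd_toy <- data.frame(
  name = c(
    "ab_g_dyn__visit_dtt",
    "ab_g_dyn__visit_age",
    "ab_g_stc__design_id__fam"
  ),
  table_name = c("ab_g_dyn", "ab_g_dyn", "ab_g_stc"),
  stringsAsFactors = FALSE
)

# Group variable names by the table they belong to
vars_by_table <- split(dd_toy$name, dd_toy$table_name)

# Count the number of variables per table
n_vars <- lengths(vars_by_table)
print(n_vars)
```

Grouping by `table_name` in this way gives a quick overview of which variables live in which table before requesting the data itself.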
Levels table
The levels table provides value labels for categorical variables, showing what each numeric value in the data corresponds to, as well as the order of levels. To read more about the structure of the levels table for NBDC studies, see here.
Basic usage
# Get levels table for ABCD Study (latest release)
get_levels("abcd")
#> # A tibble: 63,502 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__cohort_edu__cgs 1 1 Up to high school (No d… NA
#> 2 ab_g_dyn__cohort_edu__cgs 2 2 High school diploma/GED NA
#> 3 ab_g_dyn__cohort_edu__cgs 3 3 Some college NA
#> 4 ab_g_dyn__cohort_edu__cgs 4 4 Bachelor’s degree NA
#> 5 ab_g_dyn__cohort_edu__cgs 5 5 Graduate school or prof… NA
#> 6 ab_g_dyn__cohort_grade 0 1 Kindergarten NA
#> 7 ab_g_dyn__cohort_grade 1 2 1st grade NA
#> 8 ab_g_dyn__cohort_grade 2 3 2nd grade NA
#> 9 ab_g_dyn__cohort_grade 3 4 3rd grade NA
#> 10 ab_g_dyn__cohort_grade 4 5 4th grade NA
#> # ℹ 63,492 more rows
# Get levels table for HBCD Study (latest release)
get_levels("hbcd")
#> # A tibble: 17,051 × 5
#> name value order_level label label_es
#> <chr> <chr> <int> <chr> <lgl>
#> 1 bio_bm_biosample_nails_results_bio_test_ord… 1 1 Cust… NA
#> 2 bio_bm_biosample_nails_results_bio_test_ord… 2 2 Only… NA
#> 3 bio_bm_biosample_nails_results_bio_test_ord… 3 3 Canc… NA
#> 4 bio_bm_biosample_nails_results_bio_test_ord… 4 4 no r… NA
#> 5 bio_bm_biosample_nails_results_bio_c_any_sp… 1 1 posi… NA
#> 6 bio_bm_biosample_nails_results_bio_c_any_sp… 0 2 nega… NA
#> 7 bio_bm_biosample_nails_results_bio_c_any_sp… 3 3 QNS NA
#> 8 bio_bm_biosample_nails_results_bio_c_any_st… 1 1 posi… NA
#> 9 bio_bm_biosample_nails_results_bio_c_any_st… 0 2 nega… NA
#> 10 bio_bm_biosample_nails_results_bio_c_any_st… 3 3 QNS NA
#> # ℹ 17,041 more rows
# Get levels table for a specific release
get_levels("abcd", release = "6.0")
#> # A tibble: 63,502 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__cohort_edu__cgs 1 1 Up to high school (No d… NA
#> 2 ab_g_dyn__cohort_edu__cgs 2 2 High school diploma/GED NA
#> 3 ab_g_dyn__cohort_edu__cgs 3 3 Some college NA
#> 4 ab_g_dyn__cohort_edu__cgs 4 4 Bachelor’s degree NA
#> 5 ab_g_dyn__cohort_edu__cgs 5 5 Graduate school or prof… NA
#> 6 ab_g_dyn__cohort_grade 0 1 Kindergarten NA
#> 7 ab_g_dyn__cohort_grade 1 2 1st grade NA
#> 8 ab_g_dyn__cohort_grade 2 3 2nd grade NA
#> 9 ab_g_dyn__cohort_grade 3 4 3rd grade NA
#> 10 ab_g_dyn__cohort_grade 4 5 4th grade NA
#> # ℹ 63,492 more rows
Study-specific functions
# ABCD-specific function
get_levels_abcd(release = "6.0")
#> # A tibble: 63,502 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__cohort_edu__cgs 1 1 Up to high school (No d… NA
#> 2 ab_g_dyn__cohort_edu__cgs 2 2 High school diploma/GED NA
#> 3 ab_g_dyn__cohort_edu__cgs 3 3 Some college NA
#> 4 ab_g_dyn__cohort_edu__cgs 4 4 Bachelor’s degree NA
#> 5 ab_g_dyn__cohort_edu__cgs 5 5 Graduate school or prof… NA
#> 6 ab_g_dyn__cohort_grade 0 1 Kindergarten NA
#> 7 ab_g_dyn__cohort_grade 1 2 1st grade NA
#> 8 ab_g_dyn__cohort_grade 2 3 2nd grade NA
#> 9 ab_g_dyn__cohort_grade 3 4 3rd grade NA
#> 10 ab_g_dyn__cohort_grade 4 5 4th grade NA
#> # ℹ 63,492 more rows
# HBCD-specific function
get_levels_hbcd()
#> # A tibble: 17,051 × 5
#> name value order_level label label_es
#> <chr> <chr> <int> <chr> <lgl>
#> 1 bio_bm_biosample_nails_results_bio_test_ord… 1 1 Cust… NA
#> 2 bio_bm_biosample_nails_results_bio_test_ord… 2 2 Only… NA
#> 3 bio_bm_biosample_nails_results_bio_test_ord… 3 3 Canc… NA
#> 4 bio_bm_biosample_nails_results_bio_test_ord… 4 4 no r… NA
#> 5 bio_bm_biosample_nails_results_bio_c_any_sp… 1 1 posi… NA
#> 6 bio_bm_biosample_nails_results_bio_c_any_sp… 0 2 nega… NA
#> 7 bio_bm_biosample_nails_results_bio_c_any_sp… 3 3 QNS NA
#> 8 bio_bm_biosample_nails_results_bio_c_any_st… 1 1 posi… NA
#> 9 bio_bm_biosample_nails_results_bio_c_any_st… 0 2 nega… NA
#> 10 bio_bm_biosample_nails_results_bio_c_any_st… 3 3 QNS NA
#> # ℹ 17,041 more rows
Filtering by variables and/or tables
As with the data dictionary, you can retrieve a subset of the levels table for specific variables or tables:
# Get levels for specific categorical variables
get_levels("abcd", vars = c("ab_g_dyn__visit_type"))
#> # A tibble: 3 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__visit_type 1 1 On-site NA
#> 2 ab_g_dyn__visit_type 2 2 Remote NA
#> 3 ab_g_dyn__visit_type 3 3 Hybrid NA
# Get levels for all categorical variables in specific tables
get_levels("abcd", tables = "ab_g_dyn")
#> # A tibble: 123 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__visit_type 1 1 On-site NA
#> 2 ab_g_dyn__visit_type 2 2 Remote NA
#> 3 ab_g_dyn__visit_type 3 3 Hybrid NA
#> 4 ab_g_dyn__visit__day1_inform 1 1 Biological mother NA
#> 5 ab_g_dyn__visit__day1_inform 2 2 Biological father NA
#> 6 ab_g_dyn__visit__day1_inform 3 3 Adoptive mother NA
#> 7 ab_g_dyn__visit__day1_inform 4 4 Adoptive father NA
#> 8 ab_g_dyn__visit__day1_inform 5 5 Custodial mother NA
#> 9 ab_g_dyn__visit__day1_inform 6 6 Custodial father NA
#> 10 ab_g_dyn__visit__day1_inform 7 7 Grandmother NA
#> # ℹ 113 more rows
# Get levels for a combination of specific variables and tables
# (the result contains the levels for both selections)
get_levels_abcd(vars = "ab_g_dyn__visit_type", tables = "ab_g_stc")
#> # A tibble: 54 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__visit_type 1 1 On-site NA
#> 2 ab_g_dyn__visit_type 2 2 Remote NA
#> 3 ab_g_dyn__visit_type 3 3 Hybrid NA
#> 4 ab_g_stc__design_famrel 0 1 Single NA
#> 5 ab_g_stc__design_famrel 1 2 Sibling NA
#> 6 ab_g_stc__design_famrel 2 3 Twin NA
#> 7 ab_g_stc__design_famrel 3 4 Triplet NA
#> 8 ab_g_stc__design_sstwin 0 1 No NA
#> 9 ab_g_stc__design_sstwin 1 2 Yes NA
#> 10 ab_g_stc__cohort_ethn 1 1 Hispanic NA
#> # ℹ 44 more rows
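One common use of the levels table is to turn raw codes into labeled factors. The sketch below uses base R and assumes a hand-built copy of the four `ab_g_stc__design_famrel` rows shown in the output above, plus a made-up vector of raw values (`raw_codes` is hypothetical). It is an illustration only, not necessarily how NBDCtools itself applies labels.

```r
# Hand-built copy of the levels rows for one variable (taken from the output above)
lv <- data.frame(
  name = rep("ab_g_stc__design_famrel", 4),
  value = c("0", "1", "2", "3"),
  order_level = c(1, 2, 3, 4),
  label = c("Single", "Sibling", "Twin", "Triplet"),
  stringsAsFactors = FALSE
)

# Hypothetical raw codes as they might appear in a data column
raw_codes <- c("2", "0", "1", "0", NA)

# Sort by order_level so that the factor levels follow the intended order
lv <- lv[order(lv$order_level), ]

# Map each raw value to its label
famrel <- factor(raw_codes, levels = lv$value, labels = lv$label)
print(famrel)
```

Note that `value` and `order_level` differ for this variable (values start at 0, the order starts at 1), which is why the levels are sorted by `order_level` rather than by `value`.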
Sessions table
The sessions table lists the event/session IDs that are part of a given release, along with their labels.
Basic usage
# Get sessions information for ABCD Study (latest release)
get_sessions("abcd")
#> # A tibble: 26 × 2
#> session_id label
#> <fct> <fct>
#> 1 ses-00S Screener
#> 2 ses-00A Baseline
#> 3 ses-00M 0.5 Year
#> 4 ses-01A 1 Year
#> 5 ses-01M 1.5 Year
#> 6 ses-02A 2 Year
#> 7 ses-02M 2.5 Year
#> 8 ses-03A 3 Year
#> 9 ses-03M 3.5 Year
#> 10 ses-04A 4 Year
#> # ℹ 16 more rows
# Get sessions information for HBCD Study (latest release)
get_sessions("hbcd")
#> # A tibble: 3 × 2
#> session_id label
#> <chr> <chr>
#> 1 ses-V01 Visit 1
#> 2 ses-V02 Visit 2
#> 3 ses-V03 Visit 3
Study-specific functions
# ABCD-specific function (for a specified release)
get_sessions_abcd(release = "6.0")
#> # A tibble: 26 × 2
#> session_id label
#> <fct> <fct>
#> 1 ses-00S Screener
#> 2 ses-00A Baseline
#> 3 ses-00M 0.5 Year
#> 4 ses-01A 1 Year
#> 5 ses-01M 1.5 Year
#> 6 ses-02A 2 Year
#> 7 ses-02M 2.5 Year
#> 8 ses-03A 3 Year
#> 9 ses-03M 3.5 Year
#> 10 ses-04A 4 Year
#> # ℹ 16 more rows
# HBCD-specific function (for a specified release)
get_sessions_hbcd(release = "1.0")
#> # A tibble: 3 × 2
#> session_id label
#> <chr> <chr>
#> 1 ses-V01 Visit 1
#> 2 ses-V02 Visit 2
#> 3 ses-V03 Visit 3
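The sessions table can also be used to put session IDs into a meaningful order. The sketch below is written in base R and assumes that the row order of the sessions table reflects the study timeline (an assumption, although the ABCD output above lists the screener before baseline). The `sessions_toy` data frame copies the first four ABCD rows shown above, and `observed` is a made-up vector of session IDs.

```r
# Hand-built copy of the first four ABCD session rows shown above
sessions_toy <- data.frame(
  session_id = c("ses-00S", "ses-00A", "ses-00M", "ses-01A"),
  label = c("Screener", "Baseline", "0.5 Year", "1 Year"),
  stringsAsFactors = FALSE
)

# Hypothetical session IDs observed in some data, in arbitrary order
observed <- c("ses-01A", "ses-00A", "ses-00S")

# Use the table's row order as the factor level order and attach the labels
observed_f <- factor(
  observed,
  levels = sessions_toy$session_id,
  labels = sessions_toy$label
)

# Sorting a factor follows the level order, not alphabetical order
sorted_sessions <- sort(observed_f)
print(sorted_sessions)
```

Relying on the level order matters here because the IDs themselves do not sort chronologically: `ses-00S` (Screener) precedes `ses-00A` (Baseline) in the table.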
Identifier columns
Identifier columns are the variables used to uniquely identify observations in the dataset. These columns are essential for joining data from different tables. The get_id_cols() function retrieves the identifier columns for a given study.
Basic usage
# Get identifier columns for ABCD Study (latest release)
get_id_cols("abcd")
#> [1] "participant_id" "session_id"
# Get identifier columns for HBCD Study (latest release)
get_id_cols("hbcd")
#> [1] "participant_id" "session_id" "run_id"
Study-specific functions
# ABCD-specific function (for a specified release)
get_id_cols_abcd(release = "6.0")
#> [1] "participant_id" "session_id"
# HBCD-specific function (for a specified release)
get_id_cols_hbcd(release = "1.0")
#> [1] "participant_id" "session_id" "run_id"
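To illustrate why the identifier columns matter for joins, here is a minimal base R sketch. It hard-codes the ABCD identifier columns shown above rather than calling `get_id_cols()`, and the two small tables (`tbl_a`, `tbl_b`) with their participant IDs and values are entirely made up for the example.

```r
# Identifier columns for ABCD, as returned by get_id_cols("abcd") above
id_cols <- c("participant_id", "session_id")

# Two hypothetical tables that share the identifier columns
tbl_a <- data.frame(
  participant_id = c("sub-01", "sub-01", "sub-02"),
  session_id = c("ses-00A", "ses-01A", "ses-00A"),
  x = c(1, 2, 3),
  stringsAsFactors = FALSE
)
tbl_b <- data.frame(
  participant_id = c("sub-01", "sub-02"),
  session_id = c("ses-00A", "ses-00A"),
  y = c("a", "b"),
  stringsAsFactors = FALSE
)

# Check that the identifier columns uniquely identify rows in each table
stopifnot(anyDuplicated(tbl_a[id_cols]) == 0, anyDuplicated(tbl_b[id_cols]) == 0)

# Left join on the identifier columns; unmatched rows get NA for y
joined <- merge(tbl_a, tbl_b, by = id_cols, all.x = TRUE)
print(joined)
```

For HBCD, the same pattern would use the three identifier columns shown above, including `run_id`.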
General metadata function
The get_metadata() function is the low-level function that is used by all specific metadata functions. You can use it directly to retrieve any type of metadata:
# Get data dictionary (same as get_dd)
get_metadata("abcd", type = "dd", release = "6.0")
#> # A tibble: 83,206 × 44
#> study domain sub_domain source metric atlas table_name table_label name
#> <chr> <chr> <chr> <chr> <chr> <chr> <glue> <glue> <glu>
#> 1 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 2 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 3 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 4 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 5 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 6 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 7 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 8 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 9 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> 10 Core ABCD (Gene… Standard … Gener… NA NA ab_g_dyn ABCD Dynam… ab_g…
#> # ℹ 83,196 more rows
#> # ℹ 35 more variables: label <glue>, instruction <chr>, header <chr>,
#> # note <chr>, unit <chr>, type_var <chr>, type_data <chr>, type_level <chr>,
#> # type_field <chr>, order_display <chr>, branching_logic <chr>,
#> # label_es <chr>, instruction_es <chr>, header_es <chr>, note_es <chr>,
#> # table_nda <chr>, table_nda_5_0 <chr>, table_redcap <chr>, name_nda <chr>,
#> # name_deap <chr>, name_redcap <chr>, name_redcap_exp <chr>, …
# Get levels table (same as get_levels)
get_metadata("abcd", type = "levels", vars = "ab_g_dyn__visit_type")
#> # A tibble: 3 × 5
#> name value order_level label label_es
#> <glue> <chr> <dbl> <chr> <chr>
#> 1 ab_g_dyn__visit_type 1 1 On-site NA
#> 2 ab_g_dyn__visit_type 2 2 Remote NA
#> 3 ab_g_dyn__visit_type 3 3 Hybrid NA
# Get sessions table (same as get_sessions)
get_metadata("abcd", type = "sessions")
#> # A tibble: 26 × 2
#> session_id label
#> <fct> <fct>
#> 1 ses-00S Screener
#> 2 ses-00A Baseline
#> 3 ses-00M 0.5 Year
#> 4 ses-01A 1 Year
#> 5 ses-01M 1.5 Year
#> 6 ses-02A 2 Year
#> 7 ses-02M 2.5 Year
#> 8 ses-03A 3 Year
#> 9 ses-03M 3.5 Year
#> 10 ses-04A 4 Year
#> # ℹ 16 more rows