| Title: | Rocket-Fast Clinical Research Reporting |
|---|---|
| Description: | Describes tables, grouped or ungrouped, with associated data management actions such as sorting the levels of variables and dropping levels with zero counts. |
| Authors: | USMR CHU de Bordeaux [aut, cre], Valentine Renaudeau [aut], Marion Kret [aut], Matisse Decilap [aut], Sahardid Mohamed Houssein [aut], Mohamedou Sow [aut], Thomas Ferté [aut] |
| Maintainer: | USMR CHU de Bordeaux <[email protected]> |
| License: | GPL (>= 3) |
| Version: | 1.1.5 |
| Built: | 2026-05-22 16:33:38 UTC |
| Source: | https://github.com/biostatusmr/rastarocket |
This function merges the missing-data row into the label row of a gtsummary object.
add_missing_info(base_table)
base_table |
A |
A gtsummary table object with missing value information and modifications applied.
This function adds p-values to a gtsummary table using the specified tests and separates the p-value footnotes.
add_pvalues(res, tests)
res |
A |
tests |
A list of tests to pass to |
A gtsummary table object with p-values added and footnotes separated.
library(gtsummary)
tbl <- trial %>% tbl_summary(by = trt)
tbl <- add_pvalues(tbl, tests = TRUE)
This function appends the text "n (dm ; %dm)" to the labels of all variables in a dataset.
It uses the labelled package to modify and update variable labels in-place.
ajouter_label_ndm(data, col_to_skip = NULL)
data |
A data frame containing the dataset whose variable labels need to be updated. |
col_to_skip |
A column to skip when adding |
The function iterates over all columns in the dataset and performs the following steps:
Retrieves the current label of each variable using labelled::var_label.
Creates a new label by appending the text "n (dm ; %dm)" to the existing label.
Updates the variable's label using labelled::set_variable_labels.
This is useful when preparing a dataset for descriptive analysis, where it is helpful to display
missing data statistics (n, dm, and %dm) alongside variable labels in summary tables.
A data frame with updated variable labels.
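The labelling steps described above can be sketched in base R. This is only an illustration, not the package's implementation: it writes the `label` attribute directly (the attribute that labelled::var_label reads) instead of calling the labelled helpers. The helper name `append_ndm_label` and the fallback to the column name for unlabelled columns are assumptions made for this example.

```r
# Hypothetical helper: append "n (dm ; %dm)" to each column's label.
append_ndm_label <- function(data, col_to_skip = NULL, suffix = "n (dm ; %dm)") {
  for (col in setdiff(names(data), col_to_skip)) {
    current <- attr(data[[col]], "label", exact = TRUE)
    # Assumption: use the column name when no label has been set.
    if (is.null(current)) current <- col
    attr(data[[col]], "label") <- paste(current, suffix)
  }
  data
}

data <- data.frame(var1 = c(1, 2, NA), var2 = c("A", "B", NA))
attr(data$var1, "label") <- "Variable 1"

# var2 is skipped, so only var1 receives the suffix.
labelled_data <- append_ndm_label(data, col_to_skip = "var2")
attr(labelled_data$var1, "label")
```

Skipping a column this way mirrors what `col_to_skip` is documented to do, for instance to leave a grouping variable's label untouched.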
# Example usage:
library(labelled)

# Create a sample dataset
data <- data.frame(
  var1 = c(1, 2, NA),
  var2 = c("A", "B", NA)
)

# Assign initial labels
data <- labelled::set_variable_labels(
  data,
  var1 = "Variable 1",
  var2 = "Variable 2"
)

# Add "n (dm ; %dm)" to labels
data <- ajouter_label_ndm(data)

# Check updated labels
labelled::var_label(data)
This function generates a summary table from a data frame with specified
grouping and variable types. It uses the gtsummary package to create
descriptive statistics for categorical and continuous variables, with
options for customizing the rounding and labels.
base_table(
  data1,
  show_missing_data,
  by_group = FALSE,
  var_group,
  quali = NULL,
  quanti = NULL,
  stat_var_quanti = c("{mean} ({sd})", "{median} ({p25} ; {p75})", "{min} ; {max}"),
  digits = list(r_quanti = 1, r_quali = 1),
  freq_relevel = FALSE
)
data1 |
A data frame containing the dataset to be analyzed. |
show_missing_data |
Default to
|
by_group |
A boolean (default is FALSE) to analyse by group. |
var_group |
A variable used for grouping (if applicable). Defaults to |
quali |
A vector of qualitative variables to be described. Defaults to |
quanti |
A vector of quantitative variables to be described. Defaults to |
stat_var_quanti |
A character vector specifying the statistics to display for continuous variables. Default is |
digits |
A list giving the number of decimal places used to round continuous (r_quanti) and categorical (r_quali) variables. r_quanti and r_quali can each be a single integer or a vector of integers. Default is list(r_quanti = 1, r_quali = 1) |
freq_relevel |
Boolean (default = FALSE). If TRUE, reorder factors by frequency (most to least frequent) using gtsummary. |
A gtsummary table summarizing the specified variables,
grouped by var_group if provided, with customizable statistics
and rounding options.
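The effect of `freq_relevel = TRUE` can be pictured with a small base-R sketch. The package is described as delegating the reordering to gtsummary, so the helper below (`relevel_by_freq`, a hypothetical name) only illustrates the resulting order of levels, from most to least frequent.

```r
# Hypothetical helper: order factor levels by decreasing frequency.
relevel_by_freq <- function(x) {
  counts <- sort(table(x), decreasing = TRUE)
  factor(x, levels = names(counts))
}

x <- c("B", "A", "B", "C", "B", "C")
levels(relevel_by_freq(x))
```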
# Example usage with the iris dataset
base_table(iris, var_group = "Species", show_missing_data = TRUE)
Generates CSS to be included in a Quarto document.
css_generator(path_logo = NULL)
path_logo |
The path to the logo; if not provided, it is guessed automatically from the package. |
A CSS string.
This function takes a gtsummary or gt table and applies custom formatting. It allows you to align columns,
apply bold text to certain rows, and adjust column widths if specified.
custom_format(gt_table, align = "right", column_size = NULL)
gt_table |
A |
align |
A character string defining the alignment of specific columns. Passed to the
|
column_size |
A named list or vector defining the width of columns (optional). The list should specify the width for one or more columns. If not provided, column widths will not be modified. |
A gt table object with the specified formatting applied.
The table will have columns aligned according to the align parameter,
and cells in the "label" rows will have bold text. If column_size is provided,
the column widths will be adjusted accordingly.
# Example usage
tbl <- RastaRocket::desc_var(iris, table_title = "test", by_group = TRUE, var_group = "Species")
formatted_tbl <- custom_format(tbl, align = "center", column_size = list(label ~ gt::pct(50)))
formatted_tbl
This function customizes the column headers, optional spanning header, and table caption
for a gtsummary table. It supports adding a feature name, total label, group title, and
formats missing data presentation.
custom_headers(
  base_table_missing,
  var_characteristic = NULL,
  show_missing_data = TRUE,
  show_n_per_group = TRUE,
  var_tot = NULL,
  var_group = NULL,
  group_title = NULL,
  table_title
)
base_table_missing |
A |
var_characteristic |
Optional. A string to label the features column. |
show_missing_data |
Logical. If |
show_n_per_group |
A boolean indicating whether to display group sizes (n) for each level of the grouping variable. |
var_tot |
Optional. A string to label the total column. |
var_group |
Optional. Name of a grouping variable for adding a spanning header. |
group_title |
Optional. Title for the spanning header. If |
table_title |
Title for the entire table. |
A gtsummary table object with updated headers, spanning header, and caption.
Rounds a numeric value to a specified number of decimal places and formats it to always show the specified number of decimal places, including trailing zeros.
custom_round(x, digits = 1)
x |
A numeric vector to be rounded and formatted. |
digits |
An integer indicating the number of decimal places to round to. Defaults to 1. |
A character vector with the rounded and formatted numbers.
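A minimal sketch of rounding with fixed decimal places, assuming base R's `formatC()` is an acceptable stand-in; the name `round_fixed` is hypothetical and this is not necessarily how `custom_round()` is implemented. The point is that the result is a character vector, so trailing zeros survive.

```r
# Hypothetical helper: round, then format with a fixed number of decimals.
round_fixed <- function(x, digits = 1) {
  formatC(round(x, digits), format = "f", digits = digits)
}

round_fixed(3.14159)
round_fixed(c(2, 2.5), 2)
```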
RastaRocket::custom_round(3.14159)        # "3.1"
RastaRocket::custom_round(3.14159, 3)     # "3.142"
RastaRocket::custom_round(c(2, 2.5), 2)   # "2.00" "2.50"
This function customizes a gtsummary summary table by adding an overall column,
handling missing data, applying group-specific statistics, and updating headers
and captions. It provides flexible options for grouping, displaying missing data,
and customizing table titles.
customize_table(
  base_table,
  by_group = FALSE,
  var_group,
  add_total,
  show_missing_data,
  show_n_per_group,
  group_title,
  table_title,
  var_title,
  var_tot = NULL,
  var_characteristic = NULL
)
base_table |
A |
by_group |
A boolean (default is FALSE) to analyse by group. |
var_group |
A variable used for grouping (if applicable). Defaults to |
add_total |
A boolean (default is TRUE) to add total column or not when var_group is specified. |
show_missing_data |
Default to
|
show_n_per_group |
Default to
|
group_title |
A character string specifying the title for the grouping variable. Default is |
table_title |
A character string specifying the title of the table. |
var_title |
A character string for the title of the variable column in the table. Defaults to |
var_tot |
A string specifying the name of total column. Default is |
var_characteristic |
A string specifying the name of characteristic column. Default is |
The show_missing_data parameter determines whether missing data counts and
percentages are displayed:
If TRUE, missing data columns are added.
If FALSE, only non-missing data counts are displayed.
Headers for columns and spanning headers are customized using the group_title,
table_title, and var_title arguments.
A customized gtsummary table object with added columns, headers, captions,
and modifications based on the provided arguments.
# Example usage with a sample gtsummary table
library(gtsummary)
base_table <- trial %>%
  gtsummary::tbl_summary(
    type = list(gtsummary::all_continuous() ~ "continuous2"),
    by = "trt",
    missing = "always",
    missing_stat = "{N_nonmiss} ({N_miss})",
    statistic = list(
      gtsummary::all_continuous2() ~ c("{mean} ({sd})", "{median} ({p25} ; {p75})", "{min} ; {max}")
    )
  )
customize_table(
  base_table,
  var_group = "trt",
  add_total = TRUE,
  show_missing_data = TRUE,
  show_n_per_group = FALSE,
  group_title = "Treatment Group",
  table_title = "Summary Statistics",
  var_title = "Variables",
  var_tot = "Total"
)
This function modifies a data frame by updating the stat_0 column. If any values in
stat_0 are missing (NA), they are replaced by the values from the n column.
After the replacement, the n column is removed from the data frame.
customize_table_body(data)
data |
A data frame that must contain at least two columns:
|
The function uses dplyr::case_when to conditionally update the stat_0 column.
After the replacement process, the n column is dropped using dplyr::select(-n).
This function is particularly useful for cleaning and preparing table data.
A modified data frame with:
Updated stat_0 values (replaced with n values where NA is found).
The n column removed after integration.
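The replace-then-drop logic can be sketched without any dependency. The package is described as using dplyr::case_when and dplyr::select; the base-R version below is an equivalent written for illustration only, and `fill_stat_from_n` is a hypothetical name. Because stat_0 holds text, the n values are converted to character before being copied in.

```r
# Hypothetical base-R equivalent of the described behaviour.
fill_stat_from_n <- function(data) {
  # Where stat_0 is NA, take the value of n (as text); otherwise keep stat_0.
  data$stat_0 <- ifelse(is.na(data$stat_0), as.character(data$n), data$stat_0)
  # Drop n once its values have been integrated.
  data$n <- NULL
  data
}

data <- data.frame(
  stat_0 = c(NA, "B", "C"),
  n = c(10, 20, 30),
  stringsAsFactors = FALSE
)
res <- fill_stat_from_n(data)
res
```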
# Example data
data <- data.frame(
  stat_0 = c(NA, "B", "C"),
  n = c(10, 20, 30)
)

# Apply the function
modified_data <- RastaRocket::customize_table_body(data)
print(modified_data)
A function to describe adverse events (AE) by grade.
desc_ei_per_grade(
  df_pat_grp,
  df_pat_grade,
  id_col = "USUBJID",
  group_col = "RDGRPNAME",
  ei_num_col = "EINUM",
  ei_grdm_col = "EIGRDM",
  ei_grav_col = "EIGRAV",
  severity = TRUE,
  digits = 1,
  language = "fr"
)
df_pat_grp |
A dataframe with two columns: USUBJID (Patient id) and RDGRPNAME (the RCT arm). |
df_pat_grade |
A dataframe with four columns: USUBJID (Patient id), EINUM (the AE id), EIGRDM (the AE grade) and EIGRAV (the AE severity, which must be either "Grave" or "Non grave"). |
id_col |
Patient id column (default: "USUBJID"). |
group_col |
Group column, the RCT arm (default: "RDGRPNAME"). |
ei_num_col |
AE id column (default: "EINUM"). |
ei_grdm_col |
AE grade column (default: "EIGRDM"). |
ei_grav_col |
AE severity column (default: "EIGRAV"). |
severity |
A boolean to show severe adverse event line or not (default: TRUE). |
digits |
Number of digits for percentages |
language |
Language of the output: 'fr' (default) or 'en'. |
A gt table summarizing the AE by grade.
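To make the kind of summary concrete, here is a base-R sketch that counts, per arm, the patients with at least one AE of each grade and expresses them as a percentage of the arm size. The counting rule (a patient counts once per grade) and the use of the full arm size as denominator are assumptions for illustration; the exact rules applied by `desc_ei_per_grade()` are not stated here.

```r
df_pat_grp <- data.frame(
  USUBJID = paste0("ID_", 1:10),
  RDGRPNAME = c(rep("A", 3), rep("B", 3), rep("C", 4))
)
df_pat_grade <- data.frame(
  USUBJID = c("ID_1", "ID_1", "ID_2", "ID_8", "ID_9"),
  EINUM = c(1, 2, 1, 1, 1),
  EIGRDM = c(1, 3, 4, 2, 4)
)

# Attach the arm to each AE, then keep one row per patient and grade.
ae <- merge(df_pat_grade, df_pat_grp, by = "USUBJID")
ae <- unique(ae[, c("USUBJID", "RDGRPNAME", "EIGRDM")])

# Arm sizes come from the patient table so arms without any AE are kept.
n_arm <- table(df_pat_grp$RDGRPNAME)
counts <- table(factor(ae$RDGRPNAME, levels = names(n_arm)), ae$EIGRDM)

# Percentage of patients in each arm, rounded to 1 digit.
pct <- round(100 * sweep(counts, 1, as.vector(n_arm), "/"), 1)
counts
pct
```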
df_pat_grp <- data.frame(
  USUBJID = paste0("ID_", 1:10),
  RDGRPNAME = c(rep("A", 3), rep("B", 3), rep("C", 4))
)
df_pat_grade <- data.frame(
  USUBJID = c("ID_1", "ID_1", "ID_2", "ID_8", "ID_9"),
  EINUM = c(1, 2, 1, 1, 1),
  EIGRDM = c(1, 3, 4, 2, 4),
  EIGRAV = c("Grave", "Non grave", "Non grave", "Non grave", "Grave")
)
desc_ei_per_grade(df_pat_grp = df_pat_grp, df_pat_grade = df_pat_grade)
A function to describe adverse events (AE) by System Organ Class (SOC) and Preferred Term (PT).
desc_ei_per_pt(
  df_pat_grp,
  df_pat_llt,
  id_col = "USUBJID",
  group_col = "RDGRPNAME",
  ei_num_col = "EINUM",
  ei_llt_col = "EILLTN",
  ei_soc_col = "EISOCPN",
  ei_pt_col = "EIPTN",
  language = "fr",
  order_by_freq = TRUE,
  digits = 1
)
df_pat_grp |
A dataframe with two columns: id_pat (patient id) and grp (the RCT arm) |
df_pat_llt |
A dataframe with five columns: id_pat (patient id), num_ae (AE id), llt (AE LLT), pt (AE PT), soc (AE SOC) |
id_col |
Patient id column (default: "USUBJID"). |
group_col |
Group column, the RCT arm (default: "RDGRPNAME"). |
ei_num_col |
AE id column (default: "EINUM"). |
ei_llt_col |
AE LLT column (default: "EILLTN"). |
ei_soc_col |
AE SOC column (default: "EISOCPN"). |
ei_pt_col |
AE PT column (default: "EIPTN") |
language |
Language of the output: 'fr' (default) or 'en'. |
order_by_freq |
Logical. Should PT and SOC be ordered by frequency? Defaults to TRUE. If FALSE, PT and SOC are ordered alphabetically. |
digits |
Number of digits for percentages |
A gt table
df_pat_grp <- data.frame(
  USUBJID = paste0("ID_", 1:10),
  RDGRPNAME = c(rep("A", 3), rep("B", 3), rep("C", 4))
)
df_pat_llt <- data.frame(
  USUBJID = c("ID_1", "ID_1", "ID_2", "ID_4", "ID_9"),
  EINUM = c(1, 2, 1, 1, 1),
  EILLTN = c("llt1", "llt1", "llt4", "llt3", "llt1"),
  EIPTN = c("Arrhythmia", "Myocardial Infarction", "Arrhythmia", "Pneumonia", "Pneumonia"),
  EISOCPN = c("Cardiac Disorders", "Cardiac Disorders", "Cardiac Disorders", "Infections", "Infections")
)
desc_ei_per_pt(df_pat_grp = df_pat_grp, df_pat_llt = df_pat_llt)
This function creates descriptive tables for variables in a dataset. It can handle qualitative and quantitative variables, grouped or ungrouped, and supports multiple configurations for handling missing data (DM).
desc_var(
  data1,
  table_title = "",
  quali = NULL,
  quanti = NULL,
  add_total = TRUE,
  var_title = "Variable",
  by_group = FALSE,
  var_group = NULL,
  group_title = NULL,
  stat_var_quanti = c("{mean} ({sd})", "{median} ({p25} ; {p75})", "{min} ; {max}"),
  digits = list(r_quanti = 1, r_quali = 0),
  drop_levels = TRUE,
  freq_relevel = FALSE,
  tests = FALSE,
  show_n_per_group = FALSE,
  show_missing_data = NULL,
  var_tot = NULL,
  var_characteristic = NULL,
  include_all_na_cat = TRUE
)
data1 |
A data frame containing the dataset to be analyzed. |
table_title |
A character string specifying the title of the table. |
quali |
A vector of qualitative variables to be described. Defaults to |
quanti |
A vector of quantitative variables to be described. Defaults to |
add_total |
A boolean (default is TRUE) to add total column or not when var_group is specified. |
var_title |
A character string for the title of the variable column in the table. Defaults to |
by_group |
A boolean (default is FALSE) to analyse by group. |
var_group |
A variable used for grouping (if applicable). Defaults to |
group_title |
A character string specifying the title for the grouping variable. Default is |
stat_var_quanti |
A character vector specifying the statistics to display for continuous variables. Default is |
digits |
A list giving the number of decimal places used to round continuous (r_quanti) and categorical (r_quali) variables. r_quanti and r_quali can each be a single integer or a vector of integers. Default is list(r_quanti = 1, r_quali = 0) |
drop_levels |
Boolean (default = TRUE). Drop unused levels. |
freq_relevel |
Boolean (default = FALSE). If TRUE, reorder factors by frequency (most to least frequent) using gtsummary. |
tests |
A value in order to add p value. Default to
|
show_n_per_group |
Default to
|
show_missing_data |
Default to
|
var_tot |
A string specifying the name of total column. Default is |
var_characteristic |
A string specifying the name of characteristic column. Default is |
include_all_na_cat |
Whether categorical variables whose values are all missing (NA) should be displayed. Default to |
The function processes the dataset according to the specified parameters and generates descriptive tables.
It first uses the ajouter_label_ndm() function to append missing data statistics to variable labels.
Depending on the group and DM arguments, it adjusts the dataset and creates tables using helper functions like desc_group, desc_degroup, and desc_degroup_group.
Qualitative variables are reordered, and unused levels are dropped when necessary.
A gtsummary table object containing the descriptive statistics.
# Example usage:
library(dplyr)

# Sample dataset
data1 <- data.frame(
  group = c("A", "B", "B", "C"),
  var1 = c(1, 2, 3, NA),
  var2 = c("X", "Y", "X", NA)
)

# Generate descriptive table
table <- desc_var(
  data1 = data1,
  table_title = "Descriptive Table",
  quanti = "var1"
)

# Order categorical features by frequency
table1 <- desc_var(
  data1 = data1,
  table_title = "Descriptive Table",
  quanti = "var1",
  freq_relevel = TRUE
)

# Round quantitative and qualitative features using a vector of integers
table2 <- desc_var(
  data1 = iris,
  quanti = "Sepal.Length",
  stat_var_quanti = c("{sum}", "{mean} ({sd})"),
  digits = list(r_quanti = c(1, 3, 2), r_quali = c(0, 2))
)
Prepare a dataframe for creating AE plots
df_builder_ae(df_pat_grp, df_pat_llt, ref_grp = NULL)
df_pat_grp |
A data frame of patient groups. Must contain columns |
df_pat_llt |
A data frame with USUBJID (subject ID), EINUM (AE ID), EILLTN (LLT identifier), EIPTN (PT identifier), EISOCPN (soc identifier) and EIGRDM (severity grade) |
ref_grp |
(Optional) A reference group for comparisons. Defaults to the first group in |
A dataframe with all the info to build AE plots
This function transforms a given name into an email address following the format [email protected].
from_name_to_adress(name = "Peter Parker")
name |
A character string representing a full name. Default is "Peter Parker". |
A character string containing the generated email address.
from_name_to_adress("John Doe")
from_name_to_adress()
Text indentation for gt tables
indent_gt_table(g_table, indent = 0)
g_table |
A |
indent |
A numerical value corresponding to the pixel value, which defines the text indentation (default = 0 corresponding to px(0)). 30 ~ px(30) should be a good compromise. |
A gt table object with indentation applied.
tbl_bis <- RastaRocket::desc_var(
  iris,
  table_title = "test",
  quali = "Species"
)
tbl_bis |> indent_gt_table(indent = 30)
tbl_bis |> indent_gt_table(indent = 60)
Text indentation for gtsummary tables
indent_gtsummary_table(gts_table, indent = 4)
gts_table |
A |
indent |
A numerical value indicating how many spaces to indent text (default = 4). A value of 8 should be a good compromise. |
A gtsummary table object with indentation applied.
tbl <- iris |>
  dplyr::select(Species, Sepal.Length) |>
  RastaRocket::desc_var(
    table_title = "test",
    quali = "Species"
  )
tbl_1 <- tbl |> indent_gtsummary_table(indent = 4)
tbl_2 <- tbl |> indent_gtsummary_table(indent = 8)
tbl_3 <- tbl |> indent_gtsummary_table(indent = 16)
Combines multiple descriptive tables into a single table with customized row group headers and styling.
This function accepts a list of tables and corresponding group headers, applies consistent styling,
and outputs a styled gt table.
intermediate_header(
  tbls,
  group_header,
  color = "#8ECAE6",
  size = 16,
  align = "center",
  weight = "bold"
)
tbls |
A list of descriptive tables (generated by |
group_header |
A character vector specifying the headers for each group of tables.
Must be the same length as |
color |
A character string specifying the background color for the row group headers.
Default is |
size |
An integer specifying the font size for the row group headers. Default is |
align |
A character string specifying text alignment for the row group headers.
Options are |
weight |
A character string specifying the font weight for the row group headers.
Options include |
A styled gt table combining the input tables with row group headers.
# Load necessary libraries
library(RastaRocket)
library(dplyr)

# Generate sample data
data <- data.frame(
  Age = c(rnorm(45, mean = 50, sd = 10), rep(NA, 5)),
  sexe = sample(c("Femme", "Homme"), 50, replace = TRUE, prob = c(0.6, 0.4)),
  quatre_modalites = sample(c("A", "B", "C", "D"), 50, replace = TRUE)
)

# Create descriptive tables
tb1 <- data %>%
  dplyr::select(Age, sexe) %>%
  RastaRocket::desc_var(table_title = "Demographics", by_group = FALSE)
tb2 <- data %>%
  dplyr::select(quatre_modalites) %>%
  RastaRocket::desc_var(table_title = "Modalities", by_group = FALSE)

# Combine and style tables
intermediate_header(
  tbls = list(tb1, tb2),
  group_header = c("Demographics", "Modalities")
)
Creates a butterfly stacked bar plot to visualize the frequency of adverse event (AE) grades across patient groups, with system organ class (SOC) and preferred terms (PT) as labels.
plot_butterfly_stacked_barplot(
  df_pat_grp,
  df_pat_llt,
  ref_grp = NULL,
  max_text_width = 9,
  vec_fill_color = viridis::viridis(n = 4)
)
df_pat_grp |
A data frame of patient groups. Must contain columns |
df_pat_llt |
A data frame with USUBJID (subject ID), EINUM (AE ID), EILLTN (LLT identifier), EIPTN (PT identifier), EISOCPN (soc identifier) and EIGRDM (severity grade) |
ref_grp |
A character string specifying the reference group (used for alignment in the plot).
If NULL (default), the first level of |
max_text_width |
An integer specifying the maximum width (in characters) for SOC labels before wrapping to the next line. Default is 9. |
vec_fill_color |
A vector of colors used for filling the AE grade bars. Default is
|
The function processes input data to calculate the frequency of adverse events per patient group and AE grade. It then generates a stacked bar plot where:
The x-axis represents the percentage of patients experiencing an AE.
The y-axis represents PTs (with SOCs as facets).
Bars are stacked by AE grade.
Labels for PTs are displayed in the center.
The left and right panels correspond to different patient groups.
The function utilizes the ggh4x package to adjust panel sizes and axes for a symmetrical
butterfly plot.
A ggplot2 object representing the butterfly stacked bar plot.
df_pat_grp <- data.frame(
  USUBJID = paste0("ID_", 1:10),
  RDGRPNAME = c(rep("A", 5), rep("B", 5))
)
df_pat_llt <- data.frame(
  USUBJID = c("ID_1", "ID_1", "ID_2", "ID_4", "ID_9"),
  EINUM = c(1, 2, 1, 1, 1),
  EILLTN = c("llt1", "llt2", "llt1", "llt3", "llt4"),
  EIPTN = c("Arrhythmia", "Myocardial Infarction", "Arrhythmia", "Pneumonia", "Pneumonia"),
  EISOCPN = c("Cardiac Disorders", "Cardiac Disorders", "Cardiac Disorders", "Infections", "Infections"),
  EIGRDM = c(1, 3, 4, 2, 4)
)
plot_butterfly_stacked_barplot(df_pat_grp, df_pat_llt)
This function creates a dumbbell plot comparing the occurrence of adverse events across different patient groups. The plot includes the total number of adverse events, the proportion of patients affected, and the risk difference with confidence intervals.
plot_dumbell(
  df_pat_grp,
  df_pat_llt,
  ref_grp = NULL,
  colors_arm = c("#1b9e77", "#7570b3"),
  color_label = "Arm"
)
df_pat_grp |
A data frame of patient groups. Must contain columns |
df_pat_llt |
A data frame with USUBJID (subject ID), EINUM (AE ID), EILLTN (LLT identifier), EIPTN (PT identifier), EISOCPN (soc identifier) and EIGRDM (severity grade) |
ref_grp |
(Optional) A reference group for comparisons. Defaults to the first group in |
colors_arm |
A vector of colors for the patient groups. Defaults to |
color_label |
A string specifying the legend label for the groups. Defaults to |
A ggplot object displaying the dumbbell chart.
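The risk difference shown in such a chart can be illustrated with a short base-R sketch. The Wald (normal approximation) interval used here is an assumption chosen for simplicity; which interval `plot_dumbell()` actually computes is not stated in this documentation, and `risk_difference` is a hypothetical helper name.

```r
# Hypothetical helper: risk difference between an arm and the reference arm,
# with a Wald confidence interval (assumed method, for illustration only).
risk_difference <- function(x_trt, n_trt, x_ref, n_ref, level = 0.95) {
  p_trt <- x_trt / n_trt   # proportion of patients with the AE, compared arm
  p_ref <- x_ref / n_ref   # proportion of patients with the AE, reference arm
  rd <- p_trt - p_ref
  se <- sqrt(p_trt * (1 - p_trt) / n_trt + p_ref * (1 - p_ref) / n_ref)
  z <- qnorm(1 - (1 - level) / 2)
  c(estimate = rd, lower = rd - z * se, upper = rd + z * se)
}

# 2 of 5 patients affected in the compared arm, 1 of 5 in the reference arm.
res <- risk_difference(x_trt = 2, n_trt = 5, x_ref = 1, n_ref = 5)
res
```

With arms this small the normal approximation is crude, so the interval is very wide; it is shown only to make the quantity concrete.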
df_pat_grp <- data.frame(
  USUBJID = paste0("ID_", 1:10),
  RDGRPNAME = c(rep("A", 5), rep("B", 5))
)
df_pat_llt <- data.frame(
  USUBJID = c("ID_1", "ID_1", "ID_2", "ID_4", "ID_9"),
  EINUM = c(1, 2, 1, 1, 1),
  EILLTN = c("llt1", "llt2", "llt1", "llt3", "llt4"),
  EIPTN = c("Arrhythmia", "Myocardial Infarction", "Arrhythmia", "Pneumonia", "Pneumonia"),
  EISOCPN = c("Cardiac Disorders", "Cardiac Disorders", "Cardiac Disorders", "Infections", "Infections"),
  EIGRDM = c(1, 3, 4, 2, 4)
)
plot_dumbell(df_pat_llt = df_pat_llt, df_pat_grp = df_pat_grp)
This function visualizes the timeline of adverse events (AEs), treatments, and randomization for a selected patient. The span chart helps track AE duration and treatment events relative to randomization.
plot_patient_panchart(
  df_soc_pt,
  df_pat_grp_rando,
  df_pat_pt_grade_date,
  df_pat_treatment_date,
  pat_id,
  vec_fill_color = viridis::viridis(n = 4, direction = -1, end = 0.95, option = "magma")
)
df_soc_pt |
A data frame mapping System Organ Class (SOC) to Preferred Terms (PT). |
df_pat_grp_rando |
A data frame containing patient IDs, randomization groups, and randomization dates. |
df_pat_pt_grade_date |
A data frame with patient IDs, PT terms, AE grades, start and end dates of AEs. |
df_pat_treatment_date |
A data frame with patient IDs and treatment dates. |
pat_id |
A character string specifying the patient ID to plot. |
vec_fill_color |
A vector of colors for AE grades. Default is |
A ggplot object representing the patient span chart.
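Placing events "relative to randomization" amounts to converting dates into offsets from the randomization date. The base-R sketch below shows that conversion on a few rows; expressing offsets in days and the column names `start_day`, `end_day` and `duration` are assumptions for illustration, not the function's internal representation.

```r
# Randomization date of the selected patient.
rando_date <- as.Date("2020-12-01")

ae <- data.frame(
  pt = c("Arrhythmia", "Pneumonia"),
  start = as.Date(c("2021-01-01", "2021-03-05")),
  end = as.Date(c("2021-01-14", "2021-05-05"))
)

# Offsets in days since randomization (subtracting Dates yields days).
ae$start_day <- as.numeric(ae$start - rando_date)
ae$end_day <- as.numeric(ae$end - rando_date)
# Length of each AE span in days.
ae$duration <- as.numeric(ae$end - ae$start)
ae
```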
df_pat_grp_rando <- data.frame(
  id_pat = c("ID_1", "ID_2"),
  grp = c("A", "B"),
  rando_date = c("2020-12-01", "2021-01-03")
)
df_pat_pt_grade_date <- data.frame(
  id_pat = c("ID_1", "ID_1", "ID_1", "ID_1", "ID_2"),
  pt = c("Arrhythmia", "Myocardial Infarction", "Arrhythmia", "Pneumonia", "Pneumonia"),
  grade = c(4, 2, 1, 3, 4),
  start = c("2021-01-01", "2021-02-03", "2021-01-02", "2021-03-05", "2021-02-01"),
  end = c("2021-01-14", "2021-03-03", "2021-01-22", "2021-05-05", "2021-02-03")
)
df_pat_treatment_date <- data.frame(
  id_pat = c("ID_1", "ID_1", "ID_1"),
  treatment_date = c("2021-01-25", "2021-03-01", "2021-01-20")
)
df_soc_pt <- data.frame(
  pt = c("Arrhythmia", "Myocardial Infarction", "Pneumonia", "Sepsis"),
  soc = c("Cardiac Disorders", "Cardiac Disorders", "Infections", "Infections")
)
plot_patient_panchart(
  df_soc_pt = df_soc_pt,
  df_pat_grp_rando = df_pat_grp_rando,
  df_pat_pt_grade_date = df_pat_pt_grade_date,
  df_pat_treatment_date = df_pat_treatment_date,
  pat_id = "ID_1"
)
Generates a volcano plot to visualize the association between adverse events and patient groups.
plot_volcano(
  df_pat_grp,
  df_pat_llt,
  ref_grp = NULL,
  colors_arm = c("#1b9e77", "#7570b3"),
  size = "nb_pat"
)
df_pat_grp |
A data frame of patient groups. Must contain the columns USUBJID and RDGRPNAME (as in the example below). |
df_pat_llt |
A data frame with USUBJID (subject ID), EINUM (AE ID), EILLTN (LLT identifier), EIPTN (PT identifier), EISOCPN (SOC identifier) and EIGRDM (severity grade). |
ref_grp |
(Optional) A reference group for comparisons. Defaults to the first group in |
colors_arm |
A character vector of length two specifying the colors for the two patient groups in the plot.
Default is c("#1b9e77", "#7570b3"). |
size |
A character string specifying the metric used for point sizes in the plot. Options are:
|
The function first processes input data using df_builder_ae(), then calculates relevant statistics
such as risk difference (RD) and p-values. The volcano plot displays:
RD on the x-axis (risk difference between groups).
-log10(p-value) on the y-axis (significance level).
Point colors indicating which group has an increased risk.
Point sizes reflecting either the number of patients or events.
A horizontal dashed line at p = 0.05 for significance threshold.
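The quantities listed above can be sketched in base R. This is only an illustration under stated assumptions: the per-term counts are typed in by hand (they mirror the example data, with 5 patients per arm), the column names are hypothetical, and Fisher's exact test stands in for whatever test the package applies internally.

```r
# Hypothetical per-term counts: number of patients with the event in each arm
ae_counts <- data.frame(
  term = c("Arrhythmia", "Myocardial Infarction", "Pneumonia"),
  n_a = c(2, 1, 1), # patients with the event in arm A
  n_b = c(0, 0, 1)  # patients with the event in arm B
)
n_arm_a <- 5
n_arm_b <- 5

# x-axis: risk difference between arms
ae_counts$rd <- ae_counts$n_a / n_arm_a - ae_counts$n_b / n_arm_b

# p-value per term (assumption: Fisher's exact test on the 2x2 table)
ae_counts$p_value <- mapply(function(a, b) {
  tab <- matrix(c(a, n_arm_a - a, b, n_arm_b - b), nrow = 2)
  fisher.test(tab)$p.value
}, ae_counts$n_a, ae_counts$n_b)

# y-axis: -log10(p-value); the p = 0.05 threshold sits at -log10(0.05)
ae_counts$neg_log10_p <- -log10(ae_counts$p_value)
threshold <- -log10(0.05)

# Color: which arm carries the higher risk
ae_counts$higher_risk_arm <- ifelse(
  ae_counts$rd > 0, "A",
  ifelse(ae_counts$rd < 0, "B", "none")
)
ae_counts
```

With so few patients none of the terms would reach the dashed threshold line, which is expected for a toy data set.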
A ggplot2 object representing the volcano plot.
df_pat_grp <- data.frame(
  USUBJID = paste0("ID_", 1:10),
  RDGRPNAME = c(rep("A", 5), rep("B", 5))
)
df_pat_llt <- data.frame(
  USUBJID = c("ID_1", "ID_1", "ID_2", "ID_4", "ID_9"),
  EINUM = c(1, 2, 1, 1, 1),
  EILLTN = c("llt1", "llt2", "llt1", "llt3", "llt4"),
  EIPTN = c("Arrhythmia", "Myocardial Infarction", "Arrhythmia", "Pneumonia", "Pneumonia"),
  EISOCPN = c("Cardiac Disorders", "Cardiac Disorders", "Cardiac Disorders", "Infections", "Infections"),
  EIGRDM = c(1, 3, 4, 2, 4)
)
plot_volcano(df_pat_grp, df_pat_llt)
This function prepares a data frame for summarization by handling missing data
based on the show_missing_data argument and applying the specified data manipulation
(DM) option to factor variables. It provides flexibility for data cleaning and ordering
before summarizing with functions like gtsummary.
prepare_table(
  data1,
  by_group = FALSE,
  var_group = NULL,
  drop_levels = TRUE,
  show_missing_data = TRUE,
  include_all_na_cat = TRUE
)
data1 |
A data frame containing the dataset to be analyzed. |
by_group |
A boolean (default is FALSE) to analyse by group. |
var_group |
A variable used for grouping (if applicable). Defaults to NULL. |
drop_levels |
Boolean (default = TRUE). Drop unused levels. |
show_missing_data |
Defaults to TRUE.
|
include_all_na_cat |
Whether categorical variables whose values are all missing (all NA) should be displayed. Defaults to TRUE. |
The DM option defines the data manipulation to be applied to factor variables:
"tout": Both order factor levels and drop unused levels.
"tri": Only order factor levels.
"remove": Drop unused factor levels without ordering.
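The two operations behind these options can be sketched in base R on a single factor. The ordering criterion is an assumption made for illustration (levels sorted by decreasing frequency); the package may order levels differently, and the object names are hypothetical.

```r
# A factor with one unused level ("d")
x <- factor(c("b", "a", "b", "c"), levels = c("a", "b", "c", "d"))

# "remove"-like step: drop unused levels, keep the original order
dropped <- droplevels(x)

# "tri"-like step (assumed criterion): order levels by decreasing frequency
reordered <- factor(x, levels = names(sort(table(x), decreasing = TRUE)))

# "tout"-like step: order, then drop unused levels
both <- droplevels(reordered)

levels(dropped)
levels(both)
```

Dropping and ordering are independent, which is why they can be offered separately or combined.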
# Example usage with the iris dataset
prepare_table(iris)
Creates a transformation object for a reverse log scale, which can be used in ggplot2 scales.
reverselog_trans(base = exp(1))
base |
A numeric value specifying the logarithm base. Default is the natural logarithm (exp(1)). |
This function defines a reverse logarithmic transformation, where the transformation function is $-\log_{b}(x)$ and the inverse function is $b^{-x}$, with $b$ the value of base. It is useful for cases where a decreasing log scale is needed.
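A minimal base R sketch of such a transformation pair follows. It assumes the transform is the negated logarithm in the given base and the inverse is the base raised to the negated value, which is consistent with the documented example values (base 10 maps 100 to -2). The name reverselog_pair is hypothetical and this is not the package's implementation.

```r
# Build the forward and inverse functions for a reverse log scale
reverselog_pair <- function(base = exp(1)) {
  list(
    trans = function(x) -log(x, base), # forward: negated log in the given base
    inverse = function(x) base^(-x)    # inverse: undoes the forward transform
  )
}

pair <- reverselog_pair(10)
pair$trans(100)                # large values map to small ones
pair$inverse(pair$trans(250))  # round trip returns the input
```

To use such a pair in a ggplot2 scale, the two functions would typically be wrapped into a transformation object, for example with scales::trans_new().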
A transformation object compatible with ggplot2 scales.
library(scales)
rev_log <- reverselog_trans(10)
rev_log$trans(100) # -2
rev_log$inverse(-2) # 100
A function from the fmsb package that computes the risk difference (a kind of attributable or excess risk) and its confidence interval based on an approximation, then tests the null hypothesis that the risk difference equals 0.
riskdifference(a, b, N1, N0, CRC = FALSE, conf.level = 0.95)
a |
The number of disease occurrences in the exposed cohort. |
b |
The number of disease occurrences in the non-exposed cohort. |
N1 |
The population at risk of the exposed cohort. |
N0 |
The population at risk of the unexposed cohort. |
CRC |
Logical. If TRUE, calculate confidence intervals for each risk. Default is FALSE. |
conf.level |
Probability for confidence intervals. Default is 0.95. |
A list with the results
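For orientation, a risk difference with a normal-approximation (Wald-type) confidence interval and test can be sketched as follows. The function name rd_wald and the names of the returned elements are hypothetical, and the exact variance formula used by riskdifference() is not stated here, so treat this as an assumption rather than a reimplementation.

```r
# Wald-type risk difference: estimate, confidence interval and two-sided p-value
rd_wald <- function(a, b, N1, N0, conf.level = 0.95) {
  r1 <- a / N1 # risk in the exposed cohort
  r0 <- b / N0 # risk in the non-exposed cohort
  rd <- r1 - r0
  # Standard error from the two binomial variances
  se <- sqrt(r1 * (1 - r1) / N1 + r0 * (1 - r0) / N0)
  z <- qnorm(1 - (1 - conf.level) / 2)
  list(
    estimate = rd,
    conf.int = c(rd - z * se, rd + z * se),
    p.value = 2 * (1 - pnorm(abs(rd / se))) # test of H0: risk difference = 0
  )
}

res <- rd_wald(a = 20, b = 10, N1 = 100, N0 = 100)
res$estimate
res$conf.int
res$p.value
```

The approximation degrades when counts are small or a risk is close to 0 or 1, in which case the standard error can even be zero.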
This function extends dplyr::select() by allowing the dynamic addition of one or more grouping
variables (var_group) to the selection.
select_plus(.data, ..., var_group = NULL)
.data |
A data frame. |
... |
Columns to select (as in dplyr::select()). |
var_group |
A character string or vector of column names to additionally include,
typically one or more grouping variables. Can be NULL. |
It is especially useful when switching between an ungrouped analysis (e.g., all observations together) and a grouped analysis (e.g., stratified or including interaction terms), without rewriting code.
For instance, this allows you to write a single analysis command for both the RDD (Rapport de Démarrage des Données)
and the final report, simply by changing the .qmd file, without modifying the core analysis code.
A data frame with the selected columns, including var_group if specified.
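One way such a wrapper could be written is sketched below with dplyr. The name select_plus_sketch is hypothetical and the body is an assumption about the general idea, not the package's source.

```r
library(dplyr)

# Select the columns in `...`, plus the grouping column(s) named in `var_group`
select_plus_sketch <- function(.data, ..., var_group = NULL) {
  if (is.null(var_group)) {
    # Ungrouped analysis: behave exactly like dplyr::select()
    return(dplyr::select(.data, ...))
  }
  # Grouped analysis: append the grouping variable(s) given as character names
  dplyr::select(.data, ..., dplyr::all_of(var_group))
}

df <- data.frame(x = 1:3, y = 4:6, z = 7:9)
ungrouped <- select_plus_sketch(df, x)
grouped <- select_plus_sketch(df, x, var_group = "z")
names(ungrouped)
names(grouped)
```

Because var_group is an ordinary character value, it can come from a report parameter, so the same call serves both the ungrouped and the grouped report.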
library(dplyr)
df <- data.frame(x = 1:3, y = 4:6, z = 7:9)
# Simple selection
select_plus(df, x, y)
# Selection with grouping variable
select_plus(df, x, var_group = "z")
This function creates and writes a .qmd file, together with its CSS and HTML files, for reporting statistical analyses.
start_new_reporting(
  folder_path,
  output_folder,
  name = "report",
  structure = "USMR",
  path_logo = NULL,
  confidential = FALSE,
  report_type = "Data review report",
  study_id = "CHUBXYYYY/NN",
  study_name = "The Study Name",
  study_abbreviation = "TSN",
  investigator = "Investigator name",
  methodologist = "Jean Dupont",
  biostatistician = "George Frais",
  datamanager = "Peter Parker",
  methodologist_mail = NULL,
  biostatistician_mail = NULL,
  datamanager_mail = NULL,
  language = "fr"
)
folder_path |
The folder in which the report files are created. |
output_folder |
The folder where the rendered HTML is saved. |
name |
The name of the files. |
structure |
Character string indicating the organizational structure, either "USMR" or "EUCLID". Default is "USMR". |
path_logo |
Character string specifying the path to the logo image. If NULL, a default logo is used. |
confidential |
Logical value indicating whether the report should be marked as confidential. Default is FALSE. |
report_type |
Character string specifying the type of report. Default is "Data review report". |
study_id |
Character string representing the study identifier. Default is "CHUBXYYYY/NN". |
study_name |
Character string specifying the name of the study. Default is "The Study Name". |
study_abbreviation |
Character string providing the abbreviation of the study. Default is "TSN". |
investigator |
Character string representing the investigator's name. Default is "Investigator name". |
methodologist |
Character string specifying the methodologist's name. Default is "Jean Dupont". |
biostatistician |
Character string specifying the biostatistician's name. Default is "George Frais". |
datamanager |
Character string specifying the data manager's name. Default is "Peter Parker". |
methodologist_mail |
Character string specifying the methodologist's email. If NULL, it is generated automatically. |
biostatistician_mail |
Character string specifying the biostatistician's email. If NULL, it is generated automatically. |
datamanager_mail |
Character string specifying the data manager's email. If NULL, it is generated automatically. |
language |
Character string indicating the language of the report, either "fr" (French) or "en" (English). Default is "fr". |
None. The function writes an HTML report to the specified file path.
This function creates and writes a CSS file with predefined styling for tables and text formatting.
write_css(path)
path |
Character string specifying the file path where the CSS file will be saved. |
None. The function writes a CSS file to the specified file path.
This function creates and writes an HTML report file based on specified study and structure details.
write_html_file(
  path,
  structure = "USMR",
  path_logo = NULL,
  confidential = FALSE,
  report_type = "Data review report",
  study_id = "CHUBXYYYY/NN",
  study_name = "The Study Name",
  study_abbreviation = "TSN",
  investigator = "Investigator name",
  methodologist = "Jean Dupont",
  biostatistician = "George Frais",
  datamanager = "Peter Parker",
  methodologist_mail = NULL,
  biostatistician_mail = NULL,
  datamanager_mail = NULL,
  language = "fr"
)
path |
Character string specifying the file path where the HTML file will be saved. |
structure |
Character string indicating the organizational structure, either "USMR" or "EUCLID". Default is "USMR". |
path_logo |
Character string specifying the path to the logo image. If NULL, a default logo is used. |
confidential |
Logical value indicating whether the report should be marked as confidential. Default is FALSE. |
report_type |
Character string specifying the type of report. Default is "Data review report". |
study_id |
Character string representing the study identifier. Default is "CHUBXYYYY/NN". |
study_name |
Character string specifying the name of the study. Default is "The Study Name". |
study_abbreviation |
Character string providing the abbreviation of the study. Default is "TSN". |
investigator |
Character string representing the investigator's name. Default is "Investigator name". |
methodologist |
Character string specifying the methodologist's name. Default is "Jean Dupont". |
biostatistician |
Character string specifying the biostatistician's name. Default is "George Frais". |
datamanager |
Character string specifying the data manager's name. Default is "Peter Parker". |
methodologist_mail |
Character string specifying the methodologist's email. If NULL, it is generated automatically. |
biostatistician_mail |
Character string specifying the biostatistician's email. If NULL, it is generated automatically. |
datamanager_mail |
Character string specifying the data manager's email. If NULL, it is generated automatically. |
language |
Character string indicating the language of the report, either "fr" (French) or "en" (English). Default is "fr". |
None. The function writes an HTML report to the specified file path.
This function generates a Quarto Markdown (.qmd) file with predefined metadata and a sample table.
write_qmd(path, path_html, path_css, study_abbreviation, name)
path |
Character string specifying the output file path for the .qmd file. |
path_html |
Character string specifying the path to an HTML file to be included before the body of the document. |
path_css |
Character string specifying the path to a CSS file for styling the document. |
study_abbreviation |
Character string providing the abbreviation of the study. |
name |
The name of the files |
The function creates a Quarto Markdown file with metadata fields such as title, author, date, and format settings.
The HTML file specified in path_html is included before the body, and the CSS file specified in path_css
is used for styling. The generated document includes an example of a start of report.
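To make the description concrete, here is a minimal base R sketch that writes a .qmd file whose YAML header references a CSS file and includes an HTML file before the body, using Quarto's css and include-before-body HTML format options. The function name write_qmd_sketch, the metadata fields chosen and the body text are hypothetical; the file the package actually generates will differ.

```r
# Write a minimal Quarto document with a YAML header
write_qmd_sketch <- function(path, path_html, path_css, title) {
  lines <- c(
    "---",
    paste0("title: \"", title, "\""),
    "date: today",
    "format:",
    "  html:",
    paste0("    css: ", path_css),                  # stylesheet for the report
    paste0("    include-before-body: ", path_html), # HTML inserted before the body
    "---",
    "",
    "## Introduction"
  )
  writeLines(lines, path)
  invisible(NULL)
}

# Usage with a temporary file and placeholder paths
qmd_path <- tempfile(fileext = ".qmd")
write_qmd_sketch(qmd_path, "header.html", "style.css", "TSN")
qmd_lines <- readLines(qmd_path)
qmd_lines
```

Keeping the HTML header and the CSS in separate files lets the same .qmd template be reused across studies with different branding.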
None. The function writes a .qmd file to the specified path.
Writes the Quarto extension YAML file.
write_quarto_yml(path)
path |
The path to the Quarto YAML file. |
Nothing.
A function that writes the rendercopy R script.
write_rendercopy(output_folder, path, name)
output_folder |
The output folder |
path |
The path of the R script |
name |
The name of the files |
Nothing