Package 'tjmisc'

Title: TJ's Miscellany
Description: A collection of helper functions.
Authors: Tristan Mahr [aut, cre]
Maintainer: Tristan Mahr <[email protected]>
License: GPL-3
Version: 0.0.0.9000
Built: 2024-10-29 04:17:27 UTC
Source: https://github.com/tjmahr/tjmisc

Help Index


Annotating plots with a grey background

Description

Annotating plots with a grey background

Usage

annotate_label_grey(
  label,
  x,
  y,
  size = 4,
  fill = "#EBEBEB99",
  hjust = 0,
  vjust = 0,
  label.size = 0,
  ...
)

Arguments

label

Text to write on the plot.

x, y

x and y positions.

size, fill, hjust, vjust, label.size

Plotting aesthetics that this function handles. They can be overridden.

...

Other parameters to pass onto ggplot2::annotate().

Value

An annotation layer for a ggplot2 plot.


Compare pairs of categorical variables

Description

Compare pairs of categorical variables

Usage

compare_pairs(data, levels, values, f = `-`)

Arguments

data

a dataframe

levels

a column with a categorical variable. All pairs of values in levels will be compared.

values

a column with values to compare.

f

comparison function to apply to values in each pair. Defaults to - to compute the pairwise differences.

Value

a dataframe with pairwise comparisons

Examples

to_compare <- nlme::Machines %>%
  dplyr::group_by(Worker) %>%
  dplyr::summarise(avg_score = mean(score)) %>%
  print()

to_compare %>%
  compare_pairs(Worker, avg_score) %>%
  dplyr::rename(difference = value) %>%
  dplyr::mutate_if(is.numeric, round, 1)

Compare two vectors using R's set operations

Description

Compare two vectors using R's set operations

Usage

compare_sets(x, y)

Arguments

x, y

vectors to compare

Value

a list with lengths (the lengths of the other elements), x, y, unique(x), unique(y), setequal(x, y), setdiff(x, y), setdiff(y, x), intersect(x, y), union(x, y).

Examples

yours <- c(1, 2, 3, 4, 4)
mine <- c(3, 5, 6, 4)
compare_sets(yours, mine)

Count words in an Rmarkdown file

Description

These functions strip away code and non-prose elements before counting words.

Usage

count_words_in_rmd_file(path)

count_words_in_rmd_lines(lines)

simplify_rmd_lines(lines)

Arguments

path

path to an Rmarkdown file

lines

a character vector of text (from an Rmarkdown file)

Details

The helper function simplify_rmd_lines() strips down an Rmarkdown file so that dubious things do not contribute to the word count. It does the following.

  1. Remove all lines that fall between a pair of ```` lines. (These are sometimes used to show blocks with three tick marks verbatim.)

  2. Remove all lines that fall between a pair of ``` lines.

  3. Lines that end with `r are merged with the following line.

  4. Inline code spans are replaced with a single word (`code`).

  5. Single-line HTML comments are deleted.

These steps are very ad hoc, updated and expanded as I run into new things that need to be excluded from my word counts. Let's not pretend that this thing is at all comprehensive.

The word-count is computed by stringi::stri_stats_latex().
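
As a rough illustration of steps 2, 4, and 5 above, here is a minimal base-R sketch. It is not the package's implementation: the helper name sketch_simplify_rmd() is hypothetical, it treats any line starting with three backticks as a fence (so it skips the four-backtick case in step 1 and the line merging in step 3), and it counts words by splitting on whitespace instead of calling stringi::stri_stats_latex().

```r
# Hypothetical helper: a simplified sketch, not tjmisc's actual code.
sketch_simplify_rmd <- function(lines) {
  # Step 2 (simplified): drop fence lines and everything between each pair.
  # A line is inside a block while an odd number of fences has been seen.
  is_fence <- grepl("^```", lines)
  inside <- cumsum(is_fence) %% 2 == 1
  lines <- lines[!(inside | is_fence)]

  # Step 4: replace each inline code span with the single word "code".
  lines <- gsub("`[^`]+`", "code", lines)

  # Step 5: delete HTML comments that open and close on the same line.
  gsub("<!--.*?-->", "", lines, perl = TRUE)
}

rmd <- c(
  "Some `x + 1` text.",
  "```{r}",
  "1 + 1",
  "```",
  "<!-- note -->",
  "More words here."
)
simplified <- sketch_simplify_rmd(rmd)

# Naive whitespace-based word count (an assumption made for this sketch).
words <- unlist(strsplit(simplified, "[[:space:]]+"))
n_words <- sum(nzchar(words))
```

Here the code chunk and the comment contribute nothing to the count, and the inline span counts as one word.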

Value

a data-frame with the counts of words, characters in words, and whitespace characters. simplify_rmd_lines() returns a character vector of simplified Rmarkdown lines.


Format the labels of a factor

Description

Format the labels of a factor

Usage

fct_glue_labels(xs, fmt = "{levels}", first_fmt = fmt)

fct_add_counts(xs, fmt = "{levels} ({counts})", first_fmt = fmt)

Arguments

xs

a factor

fmt

glue-style format to use. Defaults to "{levels}" for fct_glue_labels() and "{levels} ({counts})" for fct_add_counts().

first_fmt

glue-style format to use for very first label. Defaults to value of fmt.

Details

At this point, only the magic variables "{levels}" and "{counts}" are available. In principle, others could be defined. fct_add_counts() is a special case of fct_glue_labels().
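
To make the roles of fmt and first_fmt concrete, the following base-R sketch mimics what a call like fct_add_counts(xs, first_fmt = "{levels} (n = {counts})") would plausibly produce. It assumes that {counts} is the number of observations in each level, and it uses sprintf() in place of glue-style formatting; neither detail is confirmed by this page.

```r
xs <- factor(c("a", "b", "b", "c", "c", "c"))

# Assumed meaning of {counts}: observations per level, in level order.
counts <- as.vector(table(xs))

# Analogue of fmt = "{levels} ({counts})" applied to every level.
new_labels <- sprintf("%s (%d)", levels(xs), counts)

# Analogue of first_fmt = "{levels} (n = {counts})" for the first level only.
new_labels[1] <- sprintf("%s (n = %d)", levels(xs)[1], counts[1])

levels(xs) <- new_labels
levels(xs)
```

A distinct first label is handy in plot legends, where only the first entry needs to spell out what the number in parentheses means.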

Value

a factor with the labels updated


Plot columns of a matrix

Description

Creates plots of matrices like graphics::matplot() but uses ggplot2, defaults to drawing lines, and can specify a column to use for the x-axis.

Usage

ggmatplot(x, x_axis_column = NULL, n_colors = 6, unique_rows = TRUE)

Arguments

x

A matrix.

x_axis_column

Index (number) of the column to plot for the x-axis. Defaults to NULL in which case it uses row index (number) as the x-axis.

n_colors

Number of colors to cycle through. Defaults to 6.

unique_rows

Whether to first take the unique rows of the matrix. Defaults to TRUE.

Value

a ggplot2 plot.


Preview a file that would be created by ggsave()

Description

This function saves a plot to a temporary file with ggsave() and opens the temporary file in the system viewer. This function is useful for quickly previewing how a plot will look when it is saved to a file.

Usage

ggpreview(..., device = "png")

Arguments

...

options passed onto ggplot2::ggsave()

device

the file extension of the device to use. Defaults to "png".
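
The save-then-open idea in the description can be sketched by hand as follows. This is only an approximation under stated assumptions: the plot, the dimensions, and the use of utils::browseURL() to open the file are illustrative choices, not necessarily what ggpreview() does internally.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(wt, mpg)) + geom_point()

# Save to a throwaway file; the extension selects the graphics device.
path <- tempfile(fileext = ".png")
ggsave(path, plot = p, width = 6, height = 4)

# Open the file with the system's default handler (interactive sessions only).
if (interactive()) utils::browseURL(path)
```

Because the file goes through ggsave(), the preview reflects the saved width, height, and resolution instead of the size of the on-screen plotting pane.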


Check for locally repeating values

Description

Check for locally repeating values

Usage

is_same_as_last(xs)

replace_if_same_as_last(xs, replacement = "")

Arguments

xs

a vector

replacement

a value used to replace a repeated value. Defaults to "".

Value

is_same_as_last() returns TRUE when xs[n] is the same as xs[n-1].

Examples

xs <- c("a", "a", "a", NA, "b", "b", "c", NA, NA)
is_same_as_last(xs)
replace_if_same_as_last(xs, "")
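
For intuition, the comparison of xs[n] with xs[n-1] can be sketched in base R as below. This is an illustrative approximation for vectors without missing values; how the package functions treat NA is not specified on this page, so the sketch avoids NA entirely.

```r
xs <- c("a", "a", "b", "b", "b", "c")

# The first element has no predecessor, so it is never a repeat.
same <- c(FALSE, xs[-1] == xs[-length(xs)])

# Blank out repeats, e.g. to declutter a column in a printed table.
replaced <- ifelse(same, "", xs)
```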

Create a Jekyll draft post

Description

This is the function I use to create new posts for my website.

Usage

jekyll_create_rmd_draft(
  slug = NULL,
  date = NULL,
  dir_drafts = "./_R/_drafts",
  open = TRUE
)

Arguments

slug

A "slug" to use for the post. Should be a string consisting of "hyphen-separated-content-words". Defaults to NULL, in which case a random slug is created.

date

Date string to use for the post. Defaults to NULL, in which case the current date, format(Sys.Date()), is used.

dir_drafts

Relative path to the folder to store the drafts. Defaults to "./_R/_drafts".

open

Whether to open the file for editing when using RStudio. Defaults to TRUE.

Value

The path to the created file is invisibly returned.


Randomly sample data from n sub-groups of data

Description

Randomly sample data from n sub-groups of data

Usage

sample_n_of(data, size, ...)

Arguments

data

a dataframe

size

number of groups to sample

...

variables to group by

Value

the data from subgroups

Examples

sample_data <- tibble::tibble(
  letter = rep(letters, 5),
  color = rep(c("red", "green", "yellow", "orange", "blue"), 26),
  value = rnorm(26 * 5)
)

# data from two letters
sample_data %>%
  sample_n_of(2, letter)

# data from two colors
sample_data %>%
  sample_n_of(2, color)

# data from 10 letter-colors pairs
sample_data %>%
  sample_n_of(10, letter, color)
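
The underlying idea, as understood from the description, is to sample whole groups and keep every row that belongs to them. A minimal base-R sketch for a single grouping variable follows; it is an assumption about the behavior, not the package's code, and the data frame is made up for illustration.

```r
df <- data.frame(
  letter = rep(letters[1:5], each = 3),
  value = 1:15
)

# Pick 2 of the 5 groups at random, then keep all rows from those groups.
keep <- sample(unique(df$letter), size = 2)
sampled <- df[df$letter %in% keep, ]
```

This differs from sampling rows directly: each selected group comes back complete.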

Create a sequence along the rows of a dataframe

Description

Create a sequence along the rows of a dataframe

Usage

seq_along_rows(data)

Arguments

data

a dataframe

Value

a sequence of integers along the rows of a dataframe
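
A base-R equivalent of this idea is seq_len(nrow(data)), shown below as a sketch (that seq_along_rows() is implemented this way is an assumption). The point of such a helper is that it behaves sensibly on empty dataframes, unlike 1:nrow(data).

```r
df <- head(mtcars, 3)
rows <- seq_len(nrow(df))

# With zero rows, seq_len() yields an empty sequence, so a loop body never runs.
empty <- df[0, ]
rows_empty <- seq_len(nrow(empty))

# The common idiom 1:nrow(data) counts down from 1 to 0 instead.
rows_bad <- 1:nrow(empty)
```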


Which lines fall in between a delimiter pattern

Description

Which lines fall in between a delimiter pattern

Usage

str_which_between(string, pattern)

Arguments

string

a character vector

pattern

a regular expression pattern to look for

Value

the lines that are contained between pairs of delimiter patterns

Examples

string <- "
```{r}
# some code
```

Here is more code.

```markdown
**bold!**
```
"

lines <- unlist(strsplit(string, "\n"))
str_which_between(lines, "^```")

Generate tidy correlations

Description

This function respects groupings from dplyr::group_by(). When the dataframe contains grouped data, the correlations are computed within each subgroup of data.

Usage

tidy_correlation(data, ..., type = c("pearson", "spearman"))

Arguments

data

a dataframe

...

columns to select, using dplyr::select() semantics.

type

type of correlation, either "pearson" (the default) or "spearman".

Value

a long dataframe (a tibble) with correlations calculated for each pair of columns.

Examples

tidy_correlation(ChickWeight, -Chick, -Diet)

tidy_correlation(ChickWeight, weight, Time)

ChickWeight %>%
  dplyr::group_by(Diet) %>%
  tidy_correlation(weight, Time)

Generate tidy quantiles for a dataframe column

Description

This function respects groupings from dplyr::group_by(). When the dataframe contains grouped data, the quantiles are computed within each subgroup of data.

Usage

tidy_quantile(data, var, probs = seq(0.1, 0.9, 0.2))

Arguments

data

a dataframe

var

a column in the dataframe

probs

quantiles to return. Defaults to c(.1, .3, .5, .7, .9)

Value

a long dataframe (a tibble) with quantiles for the variable.

Examples

tidy_quantile(sleep, extra)

sleep %>%
  dplyr::group_by(group) %>%
  tidy_quantile(extra)

Colors I like

Description

Colors I like

Usage

tjm_colors

Format

An object of class list of length 8.