Compare Numerical Similarity Across Lists — same

Computes similarity scores between two or more lists of numeric values. The supported comparison methods are exact matching, raw absolute difference, exponential decay, percentage difference, normalized difference, and fuzzy threshold-based matching.

Usage

same_number(
  ...,
  method = c("exact", "raw", "exp", "percent", "normalized", "fuzzy"),
  epsilon = 0.05,
  epsilon_pct = 0.02,
  max_diff = NULL,
  digits = 3
)

Arguments

...: Two or more lists containing numeric values to compare.
method: Character vector specifying similarity methods (default: c("exact", "raw", "exp", "percent", "normalized", "fuzzy")).
epsilon: Threshold for fuzzy matching (default: 0.05, per the Usage signature).
epsilon_pct: Relative epsilon percentage for the "fuzzy" method (default: 0.02).
max_diff: Maximum difference for normalization (default: NULL for auto-calculation).
digits: Number of digits to round results (default: 3).

Value

An S3 object containing:

scores: A list of similarity scores for each method and list pair
summary: A list of statistical summaries for each method and list pair
methods: The similarity methods used
list_names: Names of the input lists
raw_values: The original input lists

Details

The available methods are:

exact: Binary similarity (1 if equal, 0 otherwise)
percent: Percentage difference relative to the larger value
normalized: Absolute difference normalized by a maximum difference value
fuzzy: Similarity based on an epsilon threshold
exp: Exponential decay based on the absolute difference (exp(-|num1 - num2|))
raw: Returns the raw absolute difference (|num1 - num2|) instead of a similarity score
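
The method descriptions above can be sketched in base R. This is a minimal illustration and not the package's implementation: sketch_scores is a hypothetical helper, and it assumes that "percent" and "normalized" are reported as one minus the scaled difference and that the auto-calculated max_diff is the range of all input values. The "fuzzy" method is left out because its formula is not specified here. Under these assumptions, the means for the lists used in the Examples section match the documented output for the five methods covered.

```r
# Hypothetical helper (not part of the package): mean score per method for
# two equal-length lists of numbers, following the descriptions above.
sketch_scores <- function(x, y, max_diff = NULL, digits = 3) {
  x <- unlist(x)
  y <- unlist(y)
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))

  abs_diff <- abs(x - y)
  larger <- pmax(abs(x), abs(y))

  # Assumption: the auto-calculated max_diff is the range of all values.
  if (is.null(max_diff)) {
    all_values <- c(x, y)
    max_diff <- max(all_values) - min(all_values)
  }

  scores <- list(
    exact = as.numeric(x == y),
    raw = abs_diff,
    exp = exp(-abs_diff),
    # Assumption: similarity is 1 minus the relative difference;
    # two zeros are treated as identical.
    percent = ifelse(larger == 0, 1, 1 - abs_diff / larger),
    # Assumption: similarity is 1 minus the scaled difference, floored at 0.
    normalized = if (max_diff == 0) {
      rep(1, length(x))
    } else {
      pmax(0, 1 - abs_diff / max_diff)
    }
  )

  # Mean score per method, rounded like the `digits` argument.
  lapply(scores, function(s) round(mean(s), digits))
}

means <- sketch_scores(list(1, 2, 3), list(1, 2.1, 3.2))
str(means)
```

Working on plain numeric vectors keeps each formula a single vectorized expression, which makes it easy to check one method at a time against the reported means.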

Examples

list1 <- list(1, 2, 3)
list2 <- list(1, 2.1, 3.2)

# Using unnamed lists
result1 <- same_number(list1, list2)
#> ℹ Using auto-calculated max_diff: 2.2
#> ✔ Computed exact scores for "list1_list2" [mean: 0.333]
#> ✔ Computed raw scores for "list1_list2" [mean: 0.1]
#> ✔ Computed exp scores for "list1_list2" [mean: 0.908]
#> ✔ Computed percent scores for "list1_list2" [mean: 0.963]
#> ✔ Computed normalized scores for "list1_list2" [mean: 0.955]
#> ✔ Computed fuzzy scores for "list1_list2" [mean: 0.978]

# Using named lists for more control
result2 <- same_number("n1" = list1, "n2" = list2)
#> ℹ Using auto-calculated max_diff: 2.2
#> ✔ Computed exact scores for "n1_n2" [mean: 0.333]
#> ✔ Computed raw scores for "n1_n2" [mean: 0.1]
#> ✔ Computed exp scores for "n1_n2" [mean: 0.908]
#> ✔ Computed percent scores for "n1_n2" [mean: 0.963]
#> ✔ Computed normalized scores for "n1_n2" [mean: 0.955]
#> ✔ Computed fuzzy scores for "n1_n2" [mean: 0.978]