Compare Numerical Similarity Across Lists
same_number.Rd
Computes similarity scores between two or more lists of numeric values using multiple comparison methods.
Usage
same_number(
...,
method = c("exact", "raw", "exp", "percent", "normalized", "fuzzy"),
epsilon = 0.05,
epsilon_pct = 0.02,
max_diff = NULL,
digits = 3
)
Arguments
- ...
Two or more lists containing numeric values to compare. Can be named (e.g., "l1" = list1, "l2" = list2) to control list names.
- method
Character vector specifying similarity methods (default: all)
- epsilon
Absolute threshold for fuzzy matching (default: 0.05, as shown in Usage; NULL for auto-calculation)
- epsilon_pct
Relative epsilon, expressed as a proportion (default: 0.02, i.e. 2%). Only used when method is "fuzzy"
- max_diff
Maximum difference for normalization (default: NULL for auto-calculation)
- digits
Number of digits to round results (default: 3)
Value
An S3 object containing:
- scores: A list of similarity scores for each method and list pair
- summary: A list of statistical summaries for each method and list pair
- methods: The similarity methods used
- list_names: Names of the input lists
- raw_values: The original input lists
Details
The available methods are:
- exact: Binary similarity (1 if equal, 0 otherwise)
- percent: Percentage difference relative to the larger value
- normalized: Absolute difference normalized by a maximum difference value
- fuzzy: Similarity based on an epsilon threshold
- exp: Exponential decay based on absolute difference (e^-diff)
- raw: Returns the raw absolute difference (|num1 - num2|) instead of a similarity score
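The method descriptions above can be made concrete with a small base R sketch. This is not the package's implementation: sketch_scores is a hypothetical helper, and the formulas (1 - diff / larger value for percent, 1 - diff / max_diff for normalized, and max_diff taken as the range of the pooled values) are assumptions chosen because they agree with the means printed in the Examples section. The fuzzy method is left out because its exact formula is not stated here.

```r
# Hypothetical helper, not part of the documented API: element-wise scores
# for two equal-length numeric vectors, following the method descriptions.
sketch_scores <- function(x, y, max_diff = NULL) {
  d <- abs(x - y)
  # Assumption: auto-calculated max_diff is the range of all pooled values.
  if (is.null(max_diff)) max_diff <- max(c(x, y)) - min(c(x, y))
  if (max_diff == 0) max_diff <- 1  # guard against division by zero
  larger <- pmax(abs(x), abs(y))
  list(
    exact      = as.numeric(x == y),                   # 1 if equal, else 0
    raw        = d,                                    # |num1 - num2|
    exp        = exp(-d),                              # e^-diff
    percent    = ifelse(larger == 0, 1, 1 - d / larger), # assumed formula
    normalized = 1 - pmin(d / max_diff, 1)             # assumed formula
  )
}

x <- unlist(list(1, 2, 3))
y <- unlist(list(1, 2.1, 3.2))
means <- sapply(sketch_scores(x, y), mean)
print(round(means, 3))
```

With these inputs the rounded means are 0.333, 0.1, 0.908, 0.963 and 0.955, the same values reported for the exact, raw, exp, percent and normalized methods in the example output, with max_diff evaluating to 2.2.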
Examples
list1 <- list(1, 2, 3)
list2 <- list(1, 2.1, 3.2)
# Using unnamed lists
result1 <- same_number(list1, list2)
#> ℹ Using auto-calculated max_diff: 2.2
#> ✔ Computed exact scores for "list1_list2" [mean: 0.333]
#> ✔ Computed raw scores for "list1_list2" [mean: 0.1]
#> ✔ Computed exp scores for "list1_list2" [mean: 0.908]
#> ✔ Computed percent scores for "list1_list2" [mean: 0.963]
#> ✔ Computed normalized scores for "list1_list2" [mean: 0.955]
#> ✔ Computed fuzzy scores for "list1_list2" [mean: 0.978]
# Using named lists for more control
result2 <- same_number("n1" = list1, "n2" = list2)
#> ℹ Using auto-calculated max_diff: 2.2
#> ✔ Computed exact scores for "n1_n2" [mean: 0.333]
#> ✔ Computed raw scores for "n1_n2" [mean: 0.1]
#> ✔ Computed exp scores for "n1_n2" [mean: 0.908]
#> ✔ Computed percent scores for "n1_n2" [mean: 0.963]
#> ✔ Computed normalized scores for "n1_n2" [mean: 0.955]
#> ✔ Computed fuzzy scores for "n1_n2" [mean: 0.978]