Measure of the similarity between two or more sets.

phi(sets, p, na_value = NaN, ...)



sets: List of character or integer vectors. Must contain at least 2 elements.


p: Total number of possible elements.


na_value: Value that should be returned if the measure is not defined for the input (as described in the note). Default is NaN.


...: Additional arguments. Currently ignored.


Performance value as numeric(1).


The Phi Coefficient is defined as the Pearson correlation between the binary representations of two sets \(A\) and \(B\). The binary representation of \(A\) is a 0/1 vector of length \(p\) whose i-th element is 1 if the i-th possible element is in \(A\), and 0 otherwise.

If more than two sets are provided, the mean of all pairwise scores is calculated.

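This measure is undefined if any set is empty or contains all \(p\) possible elements, because its binary representation then has zero variance and the Pearson correlation cannot be computed.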
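The definition above can be illustrated with a minimal sketch that computes the score by hand. The helper phi_sketch() is hypothetical and not part of the package; it assumes the possible elements are exactly letters[1:p], which is a simplification made only for this example.

```r
# Hypothetical helper for illustration only; not the package implementation.
phi_sketch = function(sets, p, na_value = NaN) {
  # Assumption: the p possible elements are the first p lowercase letters.
  universe = letters[seq_len(p)]
  # Binary representation: one 0/1 vector of length p per set.
  binary = lapply(sets, function(s) as.integer(universe %in% s))
  # Undefined if any set is empty or contains all p elements (zero variance).
  sizes = vapply(binary, sum, integer(1))
  if (any(sizes == 0 | sizes == p)) {
    return(na_value)
  }
  # Pearson correlation for every pair of sets, then the mean over all pairs.
  pairs = combn(length(binary), 2)
  scores = apply(pairs, 2, function(ij) cor(binary[[ij[1]]], binary[[ij[2]]]))
  mean(scores)
}

phi_sketch(list("a", c("a", "b")), p = 3)          # A is a subset of B: 0.5
phi_sketch(list("c", c("a", "b")), p = 3)          # A and B are complements: -1
phi_sketch(list(character(0), c("a", "b")), p = 3) # empty set: NaN
```

With p = 3, one set of size 1 and one set of size 2, only the two scores 0.5 and -1 are possible, depending on whether the smaller set is contained in the larger one.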

Meta Information

  • Type: "similarity"

  • Range: \([-1, 1]\)

  • Minimize: FALSE


Package stabm, which implements many more stability measures with built-in correction for chance.

Other Similarity Measures: jaccard()


sets = list(
  sample(letters[1:3], 1),
  sample(letters[1:3], 2)
)
phi(sets, p = 3)
#> [1] 0.5