Copyright | (c) CNRS 2017 - present |
---|---|
License | AGPL + CECILL v3 |
Maintainer | team@gargantext.org |
Stability | experimental |
Portability | POSIX |
Safe Haskell | Safe-Inferred |
Language | Haskell2010 |

Mainly re-exports functions from `Data.Text.Metrics`.

# Documentation

levenshtein :: Text -> Text -> Int

This module provides metrics for comparing `Text` values. It starts as an API re-exporting the main functions of the great text-metrics library by Mark Karpov.

Levenshtein similarity. In information theory, linguistics, and computer science, the Levenshtein distance is a string metric for measuring the difference between two sequences: the minimum number of single-character insertions, deletions, or substitutions needed to turn one sequence into the other. See: https://en.wikipedia.org/wiki/Levenshtein_distance
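To make the definition concrete, here is a minimal reference sketch of the row-by-row dynamic programme behind the Levenshtein distance. It is not the text-metrics implementation: `levenshteinRef` is a hypothetical name, and it works on `String` with only `base` so that it can be loaded without extra packages.

```haskell
-- Reference sketch only; not the implementation used by text-metrics.
-- `levenshteinRef` is a hypothetical name and operates on String, not Text.
import Data.List (foldl')

levenshteinRef :: String -> String -> Int
levenshteinRef s t = last (foldl' nextRow [0 .. length t] s)
  where
    -- Compute the next row of the edit-distance table from the previous one.
    -- The first cell is the cost of deleting every character seen so far.
    nextRow prev@(p : ps) c = scanl step (p + 1) (zip3 t prev ps)
      where
        -- left: cell to the left (insertion), up: cell above (deletion),
        -- diag: cell above-left (substitution, free when characters match).
        step left (tc, diag, up) =
          minimum [up + 1, left + 1, diag + (if tc == c then 0 else 1)]
    nextRow [] _ = []
```

Loaded in GHCi, `levenshteinRef "kitten" "sitting"` evaluates to `3` (two substitutions and one insertion).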

damerauLevenshtein :: Text -> Text -> Int

Return Damerau-Levenshtein distance between two `Text` values. The function works like `levenshtein`, but the collection of allowed operations also includes transposition of two *adjacent* characters. See also: https://en.wikipedia.org/wiki/Damerau%E2%80%93Levenshtein_distance
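The effect of the extra transposition operation can be illustrated with a small sketch. The code below implements the restricted variant (often called optimal string alignment), which is an assumption made for brevity: the library may behave differently on inputs where a substring is edited more than once. `osaDistance` is a hypothetical name, and the lazily built list-of-lists table favours readability over speed.

```haskell
-- Reference sketch only; not the implementation used by text-metrics.
-- Assumption: this is the restricted (optimal string alignment) variant.
-- `osaDistance` is a hypothetical name and operates on String, not Text.
osaDistance :: String -> String -> Int
osaDistance s t = table !! m !! n
  where
    m = length s
    n = length t
    -- Lazily built (m+1) x (n+1) table: entry (i, j) is the distance between
    -- the first i characters of s and the first j characters of t.
    table = [[cell i j | j <- [0 .. n]] | i <- [0 .. m]]
    at i j = table !! i !! j
    cell i 0 = i
    cell 0 j = j
    cell i j =
      let a = s !! (i - 1)
          b = t !! (j - 1)
          cost = if a == b then 0 else 1
          base =
            minimum
              [ at (i - 1) j + 1          -- deletion
              , at i (j - 1) + 1          -- insertion
              , at (i - 1) (j - 1) + cost -- substitution or match
              ]
          -- True when the last two characters of each prefix are swapped.
          swapped = i > 1 && j > 1 && a == t !! (j - 2) && s !! (i - 2) == b
       in if swapped then min base (at (i - 2) (j - 2) + 1) else base
```

With this sketch, `osaDistance "ab" "ba"` evaluates to `1`, whereas the plain Levenshtein distance between the same strings is `2`, because swapping two adjacent characters counts as a single operation.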

overlap :: Text -> Text -> Ratio Int

Return overlap coefficient for two `Text` values. Returned value is in the range from 0 (no similarity) to 1 (exact match). Return 1 if both `Text` values are empty.

See also: https://en.wikipedia.org/wiki/Overlap_coefficient.
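As a rough illustration of the coefficient, the sketch below divides the size of the intersection by the size of the smaller input. Two points are assumptions rather than documented behaviour: each string is treated as a multiset of characters, and the result is 1 whenever the smaller input is empty (the text above only specifies the case where both are empty). `overlapRef` is a hypothetical name and uses `String` with `base` only.

```haskell
-- Reference sketch only; not the implementation used by text-metrics.
-- Assumptions: strings are compared as multisets of characters, and the
-- result is 1 whenever the shorter input is empty (avoids division by zero).
-- `overlapRef` is a hypothetical name and operates on String, not Text.
import Data.List ((\\))
import Data.Ratio (Ratio, (%))

overlapRef :: String -> String -> Ratio Int
overlapRef a b
  | smaller == 0 = 1
  | otherwise = shared % smaller
  where
    smaller = min (length a) (length b)
    -- (\\) removes one occurrence from a for each character of b, so the
    -- number of characters removed is the multiset intersection size.
    shared = length a - length (a \\ b)
```

For example, `overlapRef "night" "nacht"` evaluates to `3 % 5`: the characters `n`, `h`, and `t` are shared, and both inputs have five characters.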

hamming :: Text -> Text -> Maybe Int

Hamming similarity. In information theory, the Hamming distance between two strings of equal length is the number of positions at which the corresponding symbols differ. In other words, it measures the minimum number of substitutions required to change one string into the other. See: https://en.wikipedia.org/wiki/Hamming_distance
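A short sketch of the idea follows. It assumes that `Nothing` signals inputs of different lengths, which is what the `Maybe Int` result type suggests. `hammingRef` is a hypothetical name and uses `String` with `base` only, so it is not the library's implementation.

```haskell
-- Reference sketch only; not the implementation used by text-metrics.
-- Assumption: Nothing is returned when the inputs differ in length.
-- `hammingRef` is a hypothetical name and operates on String, not Text.
hammingRef :: String -> String -> Maybe Int
hammingRef a b
  | length a /= length b = Nothing
  -- Count the positions at which the two strings disagree.
  | otherwise = Just (length (filter id (zipWith (/=) a b)))
```

Here `hammingRef "karolin" "kathrin"` evaluates to `Just 3`, while `hammingRef "abc" "ab"` evaluates to `Nothing`.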