matthieugomez 18fd1ab91c correction for NormalizedQGram 2015-10-25 11:55:15 -04:00
benchmark default constructor 2015-10-24 08:59:44 -04:00
src correction for NormalizedQGram 2015-10-25 11:55:15 -04:00
test correction for NormalizedQGram 2015-10-25 11:55:15 -04:00
.travis.yml first commit 2015-10-22 12:12:44 -04:00
LICENSE.md first commit 2015-10-22 12:12:44 -04:00
README.md readme 2015-10-25 11:45:15 -04:00
REQUIRE first commit 2015-10-22 12:12:44 -04:00

README.md


StringDistances computes various distances between strings. It works with any string of type AbstractString (in particular, ASCII and UTF-8 strings).

Distances

  • Hamming Distance
  • Jaro Distance
  • Levenshtein Distance
  • Damerau-Levenshtein Distance
  • QGram Distance
  • Cosine Distance
  • Jaccard Distance
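To make two of the listed distances concrete, here is a minimal, self-contained sketch based on their textbook definitions. The function names `hamming_sketch`, `qgram_counts`, and `qgram_sketch` are hypothetical and are not part of the package; the package's own implementation may differ in details such as how it handles strings of unequal length.

```julia
# Hamming distance (textbook definition): the number of positions at which
# two strings of equal length differ.
function hamming_sketch(s1::AbstractString, s2::AbstractString)
    length(s1) == length(s2) || throw(ArgumentError("strings must have the same length"))
    return count(c1 != c2 for (c1, c2) in zip(s1, s2))
end

# Count every substring of `q` consecutive characters (a q-gram) in `s`.
function qgram_counts(s::AbstractString, q::Integer)
    chars = collect(s)
    counts = Dict{String,Int}()
    for i in 1:(length(chars) - q + 1)
        gram = String(chars[i:i+q-1])
        counts[gram] = get(counts, gram, 0) + 1
    end
    return counts
end

# Q-gram distance (textbook definition): the sum, over all q-grams, of the
# absolute difference between their counts in the two strings.
function qgram_sketch(s1::AbstractString, s2::AbstractString, q::Integer)
    c1 = qgram_counts(s1, q)
    c2 = qgram_counts(s2, q)
    total = 0
    for gram in union(keys(c1), keys(c2))
        total += abs(get(c1, gram, 0) - get(c2, gram, 0))
    end
    return total
end

hamming_sketch("martha", "marhta")   # 2: positions 4 and 5 differ
qgram_sketch("martha", "marhta", 2)  # 6: only "ma" and "ar" are shared
```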

Syntax

  • The basic syntax follows the Distances package:

    using StringDistances
    evaluate(Hamming(), "martha", "marhta")
    evaluate(QGram(2), "martha", "marhta")
    
  • Normalize a distance to the range 0 to 1 with Normalized

    evaluate(Normalized(Hamming()), "martha", "marhta")
    evaluate(Normalized(QGram(2)), "martha", "marhta")
    
  • Add a Winkler adjustment with Winkler

    evaluate(Winkler(Jaro()), "martha", "marhta")
    evaluate(Winkler(QGram(2)), "martha", "marhta")
    

    While the Winkler adjustment was originally defined in the context of the Jaro distance, it can be helpful with other distances too. Note: a distance is automatically normalized between 0 and 1 when used with a Winkler adjustment.
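The normalization and Winkler steps described above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's code: it assumes normalization divides a raw distance by its largest possible value, and it uses the usual Winkler parameters (a scaling factor of 0.1 and a common prefix capped at 4 characters). The names `normalize_sketch`, `common_prefix_length`, and `winkler_sketch` are hypothetical, and the package may use different parameters or apply the adjustment only above some similarity threshold.

```julia
# Assumed normalization: divide a raw distance `d` by its largest possible value `dmax`.
normalize_sketch(d::Real, dmax::Real) = dmax == 0 ? 0.0 : d / dmax

# Length of the common prefix of two strings, capped at `maxlength` characters.
function common_prefix_length(s1::AbstractString, s2::AbstractString, maxlength::Integer = 4)
    n = 0
    for (c1, c2) in zip(s1, s2)
        (c1 == c2 && n < maxlength) || break
        n += 1
    end
    return n
end

# Winkler adjustment for a distance `d` already normalized to [0, 1]:
# the distance shrinks in proportion to the length of the shared prefix.
function winkler_sketch(d::Real, s1::AbstractString, s2::AbstractString;
                        scaling::Real = 0.1, maxlength::Integer = 4)
    l = common_prefix_length(s1, s2, maxlength)
    return d * (1 - l * scaling)
end

# The Hamming distance between "martha" and "marhta" is 2, and it can be
# at most 6 for strings of length 6, so the normalized distance is 1/3.
d = normalize_sketch(2, 6)

# The strings share the prefix "mar" (3 characters), so the adjusted
# distance is d * (1 - 3 * 0.1).
dw = winkler_sketch(d, "martha", "marhta")
```

Because the adjustment multiplies the distance by a factor between 0 and 1, it only makes sense for a distance that is already bounded, which is consistent with the note that a distance is normalized before a Winkler adjustment is applied.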