Levenshtein
is a great algorithm to detect typos in a search query. It operates
based on how distant one search term is from another term. Starting with
the “source” word, it counts the number of operations (additions,
subtractions, substitutions) it takes to arrive at the “destination”
word. This make the Levenshtein algorithm particularly good at catching
seach typos, and uncommon spellings.
(https://mrfrosti.com/2011/08/22/fuzzy-matching-in-postgresql-with-nicknames/)
Previamente habilitar la extensión:
CREATE EXTENSION fuzzystrmatch;
Ejemplo:
SELECT levenshtein('iden', 'idem');
Ejemplo: comparando el campo NOMBRE de dos tablas:
SELECT a.nombre1, b.nombre2, levenshtein(a.nombre1, b.nombre2) as leven
from tabla1 a, tabla2 b
where
and levenshtein(a.nombre_com, b.nombre_c_1) >= 0
order by leven
links relacionados:
https://www.postgresql.org/docs/9.3/static/functions-matching.html
https://www.laurivan.com/fuzzy-string-matching-in-postgresql/
https://www.rdegges.com/2013/easy-fuzzy-text-searching-with-postgresql/
No hay comentarios.:
Publicar un comentario