20210307 阅读整理

文章目录

Fuzzy Name Matching in Postgres

https://blog.crunchydata.com/blog/fuzzy-name-matching-in-postgresql
- Remember text_pattern_ops for indexed prefix filtering.
- Use lower() and functional indexes for case-insensitive queries.
- Combine indexes on soundex() with expensive tests like levenshtein() to get fast searches for fuzzy queries.


Fuzzy Name Matching in Postgres

https://blog.crunchydata.com/blog/fuzzy-name-matching-in-postgresql

  1. Prefix Matching
SELECT * FROM names WHERE surname LIKE 'Back%';
CREATE INDEX names_surname_txt ON names (surname);
-- The problem is that the default operator class for the "text" data type does not support pattern matching. 
-- If you are going to do prefix matching, you need to create an index using the "text_pattern_ops" operator class.
CREATE INDEX names_surname_txt ON names (surname text_pattern_ops);
  1. Case Insensitivity
SELECT * FROM names WHERE lower(surname) LIKE lower('Back%');
CREATE INDEX names_surname_txt ON names (lower(surname) text_pattern_ops);
  1. Levenshtein Matching
    • levenshtein(‘mister’,‘mister’) == 0
    • levenshtein(‘mister’,‘master’) == 1
    • levenshtein(‘mister’,‘maser’) == 2
    • levenshtein(‘mister’,‘laser’) == 3

Soundex Indexing
- soundex(‘Harrington’) = H652
- soundex(‘Harington’) = H652
- soundex(‘Herington’) = H652
- soundex(‘Heringtan’) = H652

Soundex + Levenshtein

CREATE INDEX surname_soundex ON names (soundex(surname));

WITH q AS (
  SELECT 'Robert' AS qgn, 'Harington' AS qsn
)
SELECT  
  levenshtein(lower(concat(surname,givenname)),lower(concat(qsn, qgn))) AS leven,
  names.*
FROM names, q
WHERE soundex(surname) = soundex(qsn)
AND levenshtein(lower(concat(surname,givenname)),lower(concat(qsn, qgn))) < 2

Conclusion
- Remember text_pattern_ops for indexed prefix filtering.
- Use lower() and functional indexes for case-insensitive queries.
- Combine indexes on soundex() with expensive tests like levenshtein() to get fast searches for fuzzy queries.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值