Algorithm for autocomplete? algorithm, autocomplete

2023-09-10 22:46:28 Author: 撕心不用裂肺

I am referring to the algorithm used to give query suggestions when a user types a search term in Google.

I am mainly interested in how Google's algorithm is able to show:

1. The most important results (the most likely queries rather than anything that matches)
2. Substring matches
3. Fuzzy matches

I know you could use a trie or a generalized trie to find matches, but it wouldn't meet the above requirements...

A similar question was asked earlier here.

Recommended answer

For (heh) awesome fuzzy/partial string matching algorithms, check out Damn Cool Algorithms:

http://blog.notdot.net/2007/4/Damn-Cool-Algorithms-Part-1-BK-Trees
http://blog.notdot.net/2010/07/Damn-Cool-Algorithms-Levenshtein-Automata
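To make the BK-tree idea concrete, here is a minimal sketch in Python. It is not taken from the linked posts: the class name `BKTree`, the tuple-based node layout, and the choice of plain Levenshtein distance as the metric are all assumptions made for illustration. The point it shows is that the triangle inequality lets a fuzzy search skip whole subtrees instead of comparing the query against every stored word.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic row-by-row dynamic programme."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute
        prev = cur
    return prev[-1]


class BKTree:
    """Hypothetical minimal BK-tree; a node is (word, {distance: child})."""

    def __init__(self, distance=levenshtein):
        self.distance = distance
        self.root = None

    def add(self, word):
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = self.distance(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query, max_dist):
        """Return sorted (distance, word) pairs within max_dist of query."""
        results = []
        stack = [self.root] if self.root else []
        while stack:
            word, children = stack.pop()
            d = self.distance(query, word)
            if d <= max_dist:
                results.append((d, word))
            # Triangle inequality: a match can only sit under an edge
            # whose label lies in [d - max_dist, d + max_dist].
            for edge, child in children.items():
                if d - max_dist <= edge <= d + max_dist:
                    stack.append(child)
        return sorted(results)
```

With the words "apple", "apply", "ample", "maple" and "banana" stored, `search("aple", 1)` returns the three words one edit away and never reports "apply" or "banana". Sorting by distance is only one possible ranking; a real suggester would presumably combine it with query popularity.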

These algorithms don't replace tries, but rather prevent brute-force lookups in tries, which is still a huge win. Next, you probably want a way to bound the size of the trie:

- keep a trie of the recent/top N words used globally;
- for each user, keep a trie of the recent/top N words for that user.
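A bounded trie of this kind might look like the sketch below. It assumes query popularity is available as a `collections.Counter` of query counts; the class name `BoundedTrie`, the `max_words` parameter, and the use of `None` as the end-of-word key are illustrative choices, not anything the answer specifies. Ranking completions by count is one simple way to surface the most likely queries rather than anything that matches.

```python
import heapq
from collections import Counter

END = None  # end-of-word key; cannot collide with a one-character str key


class BoundedTrie:
    """Hypothetical trie holding only the max_words most frequent queries."""

    def __init__(self, counts: Counter, max_words: int):
        self.root = {}
        # Only the top max_words queries are inserted, which bounds the size.
        for word, count in counts.most_common(max_words):
            node = self.root
            for ch in word:
                node = node.setdefault(ch, {})
            node[END] = count

    def suggest(self, prefix: str, k: int = 5):
        """Return up to k stored queries starting with prefix, most frequent first."""
        node = self.root
        for ch in prefix:
            node = node.get(ch)
            if node is None:
                return []
        found = []
        stack = [(node, prefix)]
        while stack:
            cur, text = stack.pop()
            for key, child in cur.items():
                if key is END:
                    found.append((child, text))  # child is the stored count
                else:
                    stack.append((child, text + key))
        # Highest count first; ties broken alphabetically.
        best = heapq.nsmallest(k, found, key=lambda item: (-item[0], item[1]))
        return [text for _, text in best]
```

The same class could serve both bullets: one instance built from global counts, and one small instance per user built from that user's own history, with the per-user suggestions merged ahead of the global ones.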

Finally, you want to prevent lookups whenever possible...

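- cache lookup results: if the user clicks through on any search result, you can serve those very quickly and then asynchronously fetch the full partial/fuzzy lookup.
- precompute lookup results: if the user has typed "appl", they are likely to continue with "apple" or "apply".
- prefetch data: for instance, a web app can send a small set of results to the browser, small enough to make brute-force searching in JS viable.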
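The first two ideas can be sketched together as follows. This is only an illustration under stated assumptions: `precompute`, `make_suggester`, and the `slow_lookup` callback (standing in for whatever full partial/fuzzy search exists) are hypothetical names, and the asynchronous fetch mentioned above is left out. Short prefixes are answered from a precomputed table, and anything else goes through `functools.lru_cache` so that a repeated query does not trigger a second expensive lookup.

```python
from collections import defaultdict
from functools import lru_cache


def precompute(counts, k=3, max_prefix_len=4):
    """Map every short prefix to its top-k completions, most frequent first."""
    table = defaultdict(list)
    # Visit words from most to least frequent so each bucket is already ranked.
    for word, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        for n in range(1, min(len(word), max_prefix_len) + 1):
            bucket = table[word[:n]]
            if len(bucket) < k:
                bucket.append(word)
    return {prefix: tuple(words) for prefix, words in table.items()}


def make_suggester(counts, slow_lookup, cache_size=1024):
    """Build suggest(text): precomputed table first, cached slow lookup second."""
    table = precompute(counts)

    @lru_cache(maxsize=cache_size)
    def cached_slow(text):
        # Tuples are immutable, so a cached result cannot be altered by callers.
        return tuple(slow_lookup(text))

    def suggest(text):
        hit = table.get(text)
        return hit if hit is not None else cached_slow(text)

    return suggest
```

For example, with counts for "apple", "apply", "application" and "banana", typing "appl" is served from the table without calling `slow_lookup` at all, while an input such as "nan" falls through to the slow path once and is served from the cache afterwards.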
