Python: find the closest string (from a list) to another string

2023-09-11 03:38:21 Author: 别等时光非礼了梦想

Let's say I have a string "Hello" and a list:

words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo','question', 'Hallo', 'format']

How can I find the n words in the list words that are closest to "Hello"?

In this case, we would have ['hello', 'hallo', 'Hallo', 'hi', 'format'...]

So the strategy is to sort the list words from the closest word to the furthest.

I thought about something like this:

word = 'Hello'
for i, item in enumerate(words):
    # str provides a .lower() method; there is no built-in lower() function
    if item.lower() > word.lower():
        ...

But it's very slow on large lists.

UPDATE: difflib works, but it's also very slow. (The words list has 630000+ words in it, sorted and one per line.) So checking the list takes 5 to 7 seconds for every search for the closest word!
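One possible way to reduce the cost described in the update, sketched here as an assumption rather than a measured fix: group the words by length once, then hand difflib only the length groups that could reach the cutoff at all. This relies on the fact that a SequenceMatcher ratio can never exceed 2 * min(len(a), len(b)) / (len(a) + len(b)). The helper names build_length_index and close_matches_indexed are hypothetical, and how much time this saves on a 630000-word list has not been tested.

```python
import difflib
from collections import defaultdict


def build_length_index(words):
    """Group words by length; meant to be built once and reused (hypothetical helper)."""
    index = defaultdict(list)
    for w in words:
        index[len(w)].append(w)
    return index


def close_matches_indexed(target, index, n=3, cutoff=0.6):
    """Run get_close_matches only on words whose length could reach the cutoff."""
    t = len(target)
    candidates = []
    for length, bucket in index.items():
        # Upper bound on the similarity ratio for this pair of lengths.
        if 2 * min(t, length) / (t + length) >= cutoff:
            candidates.extend(bucket)
    return difflib.get_close_matches(target, candidates, n=n, cutoff=cutoff)


words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
index = build_length_index(words)
print(close_matches_indexed('Hello', index))
```

Because the length test only discards words that could not pass the cutoff anyway, the result should match a plain get_close_matches call on the full list; the lower the cutoff, the fewer words the filter can skip.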

Recommended answer

Use difflib.get_close_matches.

>>> import difflib
>>> words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']
>>> difflib.get_close_matches('Hello', words)
['hello', 'Hallo', 'hallo']

Please look at the documentation, because by default the function returns only the 3 or fewer closest matches (n=3) and ignores words whose similarity is below cutoff=0.6.
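As a small sketch of how those defaults can be changed to get "the n closest words" from the question: the n and cutoff keyword arguments are part of difflib.get_close_matches, while the variable names top_two and top_five are only illustrative.

```python
import difflib

words = ['hello', 'Hallo', 'hi', 'house', 'key', 'screen', 'hallo', 'question', 'format']

# n caps how many matches are returned (default 3).
top_two = difflib.get_close_matches('Hello', words, n=2)

# cutoff is the minimum similarity ratio in [0, 1] (default 0.6).
# Lowering it to 0.0 keeps every word as a candidate, so n results
# come back whenever the list has at least n entries.
top_five = difflib.get_close_matches('Hello', words, n=5, cutoff=0.0)

print(top_two)
print(top_five)
```

Note that the comparison is case-sensitive, so 'hello' is a close match for 'Hello' rather than an exact one; lowercasing both the target and the list first is one option if case should not matter.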
