Finding repeated words in an infinite stream of words

2023-09-11 06:46:49 Author: 靑春灬狠扯疍

You are given an infinite stream of words that arrive one at a time. Each word can be very long, and its length is not known in advance. How would you determine whether a new word has been seen before, and what data structure would you use to store the words? This question was asked in an interview. Please help me verify my answer.

Recommended answer

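The usual approach is a hash table that tracks the count of each word. Because you only need to answer whether a word is a duplicate, you can reduce each count to a single bit, so the table becomes a bitmask with one bit per hash index. Note that different words can hash to the same index, so a bit-only table can occasionally report a word as repeated when it is not.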
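As a rough illustration of the two options above, here is a minimal Python sketch. The class names `SeenWords` and `BitmaskSeen`, the choice of SHA-256, and the default table size are all assumptions made for this example, not part of the original answer. The first class stores a fixed-size digest per word so that very long words do not have to be kept in memory; the second keeps only one bit per hash index.

```python
import hashlib


def _digest(word: str) -> bytes:
    # Hash the word to a fixed 32 bytes so a huge word is not stored in full.
    return hashlib.sha256(word.encode("utf-8")).digest()


class SeenWords:
    """Hash-set variant: remembers one fixed-size digest per distinct word."""

    def __init__(self) -> None:
        self._digests: set[bytes] = set()

    def is_repeat(self, word: str) -> bool:
        d = _digest(word)
        if d in self._digests:
            return True
        self._digests.add(d)
        return False


class BitmaskSeen:
    """Bitmask variant: one bit per hash index (illustrative size).

    Two different words can map to the same bit, so a True result
    may be a false positive. A False result is always correct.
    """

    def __init__(self, num_bits: int = 1 << 20) -> None:
        self._num_bits = num_bits
        self._bits = bytearray((num_bits + 7) // 8)

    def is_repeat(self, word: str) -> bool:
        index = int.from_bytes(_digest(word)[:8], "big") % self._num_bits
        byte, mask = index // 8, 1 << (index % 8)
        seen = bool(self._bits[byte] & mask)
        self._bits[byte] |= mask  # mark this index as seen
        return seen
```

The bitmask uses far less memory, at the cost of occasional false positives. If exact answers are required, the hash-set variant (or a hash table with collision handling) is the safer choice.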

If the question is about big data, such as how to build a search engine for Google, your answer may need to cover MapReduce or similar distributed techniques, which are rooted in much the same hash-table ideas described above.
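To hint at how the same hashing idea carries over to a distributed setting, the sketch below routes each word to a partition by its hash, so every occurrence of a given word lands in the same partition. This is only a single-process illustration of hash partitioning, not an actual MapReduce job. The names `shard_for` and `ShardedSeen` and the shard count are hypothetical, and in a real system each shard would live on a separate worker.

```python
import hashlib


def shard_for(word: str, num_shards: int) -> int:
    # Use a stable hash: Python's built-in hash() for str varies per process.
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedSeen:
    """Each shard stands in for the word set held by one worker."""

    def __init__(self, num_shards: int = 4) -> None:
        self._shards: list[set[str]] = [set() for _ in range(num_shards)]

    def is_repeat(self, word: str) -> bool:
        # The same word always maps to the same shard, so only that
        # shard needs to be consulted.
        shard = self._shards[shard_for(word, len(self._shards))]
        if word in shard:
            return True
        shard.add(word)
        return False
```

Because the routing depends only on the word's hash, no shard ever needs to ask another shard whether a word has been seen.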
