Algorithm to match an input file with a given number of files

2023-09-11 23:03:51 Author: 背后的伤痛谁能懂

I had an interview last week and got stuck on one of the questions in the algorithm round. I answered it, but the interviewer did not seem convinced, so I am sharing the question here.

Please suggest an optimized approach to this question, so that it helps me in future interviews.

Question:

We are given 20 text files. All of them are ASCII text files, each smaller than 10^9 bytes. We are also given one input file, which is also an ASCII file, say input.txt.

Our task is to devise a strategy for matching the content of this input file against the 20 given files, and to print the name of the closest-matching file. The contents of the input file might match only partially.

Thanks in advance. Looking forward to your reply.

Recommended answer

You can build some kind of index (for example, a trie) that summarizes the input file. Then you can check how many index entries match in each of the documents.

For example, build a trie of the length-10 substrings of the input file. Then, for every (overlapping) string of length 10 in each text file, check how many of them are found in the trie. The file with the most matches can be taken as the closest match.
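The idea above can be sketched roughly as follows in Python. This is only an illustration under some assumptions: the trie is a plain nested dict, the file contents are already in memory as bytes, and the names build_trie, contains, match_count and closest_file are made up for this example. Taking the raw match count as the score is also an assumption; the answer does not say how to rank the files.

```python
from typing import Dict


def build_trie(data: bytes, k: int = 10) -> dict:
    """Insert every overlapping k-byte window of data into a nested-dict trie."""
    root: dict = {}
    for i in range(len(data) - k + 1):
        node = root
        for byte in data[i:i + k]:
            node = node.setdefault(byte, {})
    return root


def contains(trie: dict, window: bytes) -> bool:
    """Return True if the whole window can be followed from the trie root."""
    node = trie
    for byte in window:
        node = node.get(byte)
        if node is None:
            return False
    return True


def match_count(trie: dict, data: bytes, k: int = 10) -> int:
    """Count the overlapping k-byte windows of data that are present in the trie."""
    return sum(contains(trie, data[i:i + k]) for i in range(len(data) - k + 1))


def closest_file(input_data: bytes, candidates: Dict[str, bytes], k: int = 10) -> str:
    """Return the candidate name with the most windows found in the input's trie.

    Assumption: a higher raw match count means a closer match.
    """
    trie = build_trie(input_data, k)
    return max(candidates, key=lambda name: match_count(trie, candidates[name], k))


if __name__ == "__main__":
    # Hypothetical toy data standing in for input.txt and the 20 files.
    input_data = b"the quick brown fox jumps over the lazy dog"
    candidates = {
        "a.txt": b"lorem ipsum dolor sit amet, consectetur",
        "b.txt": b"a quick brown fox jumps high",
        "c.txt": b"the lazy dog sleeps",
    }
    print(closest_file(input_data, candidates))
```

For files close to 10^9 bytes you would probably read each candidate in chunks (keeping the last k - 1 bytes between chunks) instead of loading it whole, and you might divide the count by the file length so that very large files are not favoured just for their size.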