Why are we interested in how long it takes to sort a file that is already sorted?

2023-09-11 04:51:25 Author: 初夏那抹浅蓝

This was asked in a Google interview and I didn't get the answer. Even worse, I didn't understand the question.

When discussing sorting algorithms, we talk about their behavior on files that are already sorted. Why are we interested in how long it takes to sort a file that is already sorted? Explain your answer briefly.

Recommended answer

The question is basically:

Why do we care how a sorting algorithm will behave on an input which is already sorted?

Long story short, files tend to be already sorted with a much higher probability than the "expected" 1/n!, which is the probability you get by assuming the file is a uniformly random permutation (1).

Here are two use cases where we care a lot about the performance of the algorithm when the array/file is already sorted:

Users of an API tend not to check whether their file is already sorted before passing it to the API to be sorted again. Since the probability that it is already sorted is not that slim (someone may well have sorted it at some point), an algorithm whose worst case is sorted input, as it is for quicksort with a naive first-element pivot, will hit that worst case fairly often. This will make "our" API slow compared with competitors who do care about this case.

If we know how the algorithm behaves on a sorted file, it will most likely behave similarly on an almost sorted file, and that kind of input is even more likely. Assume a user has a sorted file, appends some entries to it, and sends it to the sorting algorithm again: the file is almost sorted, and the performance will be very close to the performance expected on sorted files.
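Both use cases can be illustrated by counting comparisons. The following is a minimal sketch, not a benchmark of any real library: `insertion_sort` and `naive_quicksort` are hypothetical helper names chosen for illustration, and the quicksort deliberately uses the first element as pivot, which is an assumption made here to show a sorted-input worst case.

```python
import random


def insertion_sort(items):
    """Return (sorted copy, number of element comparisons)."""
    a = list(items)
    comparisons = 0
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if a[j] <= key:      # key is already in place relative to a[j]
                break
            a[j + 1] = a[j]      # shift the larger element right
            j -= 1
        a[j + 1] = key
    return a, comparisons


def naive_quicksort(items):
    """Quicksort with the FIRST element as pivot (illustrative assumption).

    Returns (sorted copy, number of element comparisons). An explicit stack
    is used instead of recursion so sorted input does not hit Python's
    recursion limit.
    """
    a = list(items)
    comparisons = 0
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[lo]
        i = lo
        for j in range(lo + 1, hi + 1):
            comparisons += 1
            if a[j] < pivot:
                i += 1
                a[i], a[j] = a[j], a[i]
        a[lo], a[i] = a[i], a[lo]   # move the pivot to its final position
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))
    return a, comparisons


if __name__ == "__main__":
    n = 1000
    rng = random.Random(0)
    already_sorted = list(range(n))
    # A sorted file with a few entries appended, as in the second use case.
    almost_sorted = already_sorted + [rng.randrange(n) for _ in range(5)]
    shuffled = already_sorted[:]
    rng.shuffle(shuffled)

    for name, data in [("sorted", already_sorted),
                       ("almost sorted", almost_sorted),
                       ("shuffled", shuffled)]:
        _, ins = insertion_sort(data)
        _, qs = naive_quicksort(data)
        print(f"{name:>13}: insertion={ins:>7}  naive quicksort={qs:>7}")
```

On sorted input of length n, this insertion sort performs n - 1 comparisons while the first-element-pivot quicksort performs n(n - 1)/2, so the same input is a best case for one algorithm and a worst case for the other. The almost sorted input should stay close to the sorted figures for both.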

Footnote:

(1) This is an empirical fact that follows from the nature of incremental processing. A mathematical argument in support: a file that is randomly generated has a probability of 1/n! of being already sorted. Now assume there is some probability p that the file has been sorted since its last update. Then the probability of it being sorted is no longer 1/n!; it is p + (1 - p) * 1/n!. Assuming p > 0, the probability that the file is already sorted is higher than the probability of any other particular ordering.
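To get a feel for the size of the effect, the formula in the footnote can be evaluated exactly. This is a small sketch; the function name `prob_already_sorted` and the sample values p = 1/100 and n = 10 are assumptions picked for illustration, not figures from the answer.

```python
from fractions import Fraction
from math import factorial


def prob_already_sorted(p, n):
    """Evaluate p + (1 - p) * 1/n! exactly using rational arithmetic."""
    p = Fraction(p)
    return p + (1 - p) * Fraction(1, factorial(n))


if __name__ == "__main__":
    n = 10                       # hypothetical file length
    p = Fraction(1, 100)         # hypothetical chance the file was sorted earlier
    baseline = Fraction(1, factorial(n))
    boosted = prob_already_sorted(p, n)
    print(f"1/n!              = {float(baseline):.3e}")
    print(f"p + (1 - p)/n!    = {float(boosted):.3e}")
    print(f"ratio to baseline = {float(boosted / baseline):.1f}")
```

Even a small p dominates the 1/n! term once n is more than a handful of elements, which is the point of the footnote.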