Why is the Sieve of Eratosthenes more efficient than the simple "dumb" algorithm?

2023-09-11 01:49:11 Author: 晨霧清白

If you need to generate the primes from 1 to N, the "dumb" way to do it would be to iterate through all the numbers from 2 to N and check whether each number is divisible by any prime found so far that is at most the square root of the number in question.
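To make the "dumb" approach concrete, here is a minimal Python sketch of trial division by the primes found so far. The function name `primes_by_trial_division` is only an illustrative choice, not something from the original question.

```python
def primes_by_trial_division(limit):
    """Return all primes <= limit, testing each candidate against earlier primes."""
    primes = []
    for candidate in range(2, limit + 1):
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break  # no prime divisor up to sqrt(candidate), so it is prime
            if candidate % p == 0:
                is_prime = False  # found a prime divisor, stop testing early
                break
        if is_prime:
            primes.append(candidate)
    return primes
```

Note that the inner loop stops either at the first prime divisor or once the primes exceed the square root, which is the early exit the question relies on.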

As I see it, the sieve of Eratosthenes does the same thing, only the other way round: when it finds a prime N, it marks off all the numbers that are multiples of N.
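For comparison, a minimal sketch of the sieve in Python might look like the following. The name `sieve_of_eratosthenes` and the choice to start crossing off at p*p are assumptions made for illustration; starting at 2*p would produce the same list of primes.

```python
def sieve_of_eratosthenes(limit):
    """Return all primes <= limit by crossing off multiples of each prime."""
    if limit < 2:
        return []
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Cross off the multiples of p; smaller multiples were already
            # crossed off by smaller primes.
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]
```

The sieve never divides: it only steps through the array by additions of p.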

But whether you mark off X when you find N, or check whether X is divisible by N, the fundamental complexity, the big-O, stays the same: you still do one constant-time operation per number-prime pair. In fact, the dumb algorithm breaks off as soon as it finds a prime divisor, whereas the sieve of Eratosthenes marks each number several times, once for every prime it is divisible by. That's a minimum of twice as many operations for every number except primes.

Am I misunderstanding something here?

Recommended answer

In the trial division algorithm, the most work that may be needed to determine whether a number n is prime is testing divisibility by the primes up to about sqrt(n).

That worst case is met when n is a prime or the product of two primes of nearly the same size (including squares of primes). If n has more than two prime factors, or two prime factors of very different size, at least one of them is much smaller than sqrt(n). So even the accumulated work needed for all these numbers (which form the vast majority of all numbers up to N, for sufficiently large N) is relatively insignificant. I shall ignore it and work with the fiction that composite numbers are determined without doing any work. (The products of two approximately equal primes are few in number, so although individually they cost as much as a prime of similar size, altogether that's a negligible amount of work.)

So, how much work does the testing of the primes up to N take?

By the prime number theorem, the number of primes <= n is (for n sufficiently large) about n/log n (it's n/log n + lower order terms). Conversely, that means the k-th prime is (for k not too small) about k*log k (+ lower order terms).
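As a rough numerical illustration of that estimate, the sketch below counts the primes up to n and compares the count with n/log n. The helper name `prime_count` is an assumption for this example, and the lower order terms are still clearly visible at these sizes, so expect only approximate agreement.

```python
import math

def prime_count(n):
    """pi(n): the number of primes <= n, computed with a simple sieve."""
    if n < 2:
        return 0
    flags = bytearray([1]) * (n + 1)
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(n) + 1):
        if flags[p]:
            # Zero out p*p, p*p + p, ... in one slice assignment.
            flags[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return sum(flags)

for n in (10**3, 10**4, 10**5, 10**6):
    estimate = n / math.log(n)
    # Columns: n, pi(n), rounded n/log n, ratio pi(n) / (n/log n)
    print(n, prime_count(n), round(estimate), round(prime_count(n) / estimate, 3))
```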

Hence, testing the k-th prime p_k requires trial division by pi(sqrt(p_k)), approximately 2*sqrt(k/log k), primes. Summing that for k <= pi(N) ~ N/log N yields roughly 4/3*N^(3/2)/(log N)^2 divisions in total. So, ignoring the composites, we find that finding the primes up to N by trial division (using only primes) is Omega(N^1.5 / (log N)^2). Closer analysis of the composites reveals that it's Theta(N^1.5 / (log N)^2). Using a wheel reduces the constant factors, but doesn't change the complexity.
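The two approximations in that paragraph can be spelled out as follows. This is a heuristic sketch that drops lower order terms throughout, uses log p_k = log k + log log k ≈ log k, and replaces the sum by an integral.

```latex
\begin{align*}
\pi\!\left(\sqrt{p_k}\right)
  &\approx \frac{\sqrt{p_k}}{\log\sqrt{p_k}}
   = \frac{2\sqrt{p_k}}{\log p_k}
   \approx \frac{2\sqrt{k\log k}}{\log k}
   = 2\sqrt{\frac{k}{\log k}}, \\
\sum_{k \le \pi(N)} 2\sqrt{\frac{k}{\log k}}
  &\approx \int_{2}^{\pi(N)} 2\sqrt{\frac{x}{\log x}}\,dx
   \approx \frac{4}{3}\,\frac{\pi(N)^{3/2}}{\sqrt{\log \pi(N)}} \\
  &\approx \frac{4}{3}\,\frac{(N/\log N)^{3/2}}{\sqrt{\log N}}
   = \frac{4}{3}\,\frac{N^{3/2}}{(\log N)^{2}}.
\end{align*}
```

The last line uses pi(N) ≈ N/log N and log pi(N) ≈ log N.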

In the sieve, on the other hand, each composite is crossed off as a multiple of at least one prime. Depending on whether you start crossing off at 2*p or at p*p, a composite n is crossed off as many times as it has distinct prime factors, or distinct prime factors <= sqrt(n). Since any number has at most one prime factor exceeding sqrt(n), the difference isn't large and has no influence on the complexity. But there are a lot of numbers with only two prime factors (or three, with one larger than sqrt(n)), so it makes a noticeable difference in running time. Anyhow, a number n > 0 has only a few distinct prime factors: a trivial estimate shows that the number of distinct prime factors is bounded by lg n (base-2 logarithm), so an upper bound for the number of crossings-off the sieve does is N*lg N.
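The effect of the starting point can be checked with a small experiment. The sketch below records how many times each number gets crossed off; `crossing_counts` and its `start_at_square` flag are hypothetical names introduced only for this illustration.

```python
import math

def crossing_counts(limit, start_at_square):
    """Return a list whose entry n is how often the sieve crosses off n."""
    crossed = [0] * (limit + 1)
    for p in range(2, limit + 1):
        if crossed[p] == 0:  # never crossed off by a smaller prime: p is prime
            first = p * p if start_at_square else 2 * p
            for multiple in range(first, limit + 1, p):
                crossed[multiple] += 1
    return crossed

limit = 10_000
from_2p = crossing_counts(limit, start_at_square=False)
from_pp = crossing_counts(limit, start_at_square=True)

# Starting at 2*p, each n is crossed off once per distinct prime factor,
# which never exceeds lg n.
assert all(from_2p[n] <= math.log2(n) for n in range(2, limit + 1))

# Total crossings-off for the two starting points.
print(sum(from_2p), sum(from_pp))
```

For example, 14 = 2 * 7 is crossed off twice when starting at 2*p, but only once when starting at p*p, because 7 * 7 is already larger than 14.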

By counting not how often each composite gets crossed off, but how many multiples of each prime are crossed off, as IVlad already did, one easily finds that the number of crossings-off is in fact Theta(N*log log N). Again, using a wheel doesn't change the complexity but reduces the constant factors. Here, however, it has a larger influence than for trial division, so at least skipping the evens should be done (apart from reducing the work, it also reduces storage size, and so improves cache locality).
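Counting per prime rather than per composite, the bound can be sketched like this. It relies on Mertens' second theorem for the sum of reciprocals of the primes, with M ≈ 0.2615 the Meissel-Mertens constant; the version shown crosses off from 2*p and is only an upper-bound sketch, not a full proof of the Theta bound.

```latex
\[
\sum_{p \le N} \left\lfloor \frac{N}{p} \right\rfloor
  \;\le\; N \sum_{p \le N} \frac{1}{p}
  \;=\; N\left(\log\log N + M + o(1)\right).
\]
```

If crossing off starts at p*p, only primes p <= sqrt(N) contribute, and since log log sqrt(N) = log log N - log 2, the order of growth is unchanged.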

So, even disregarding that division is more expensive than addition and multiplication, we see that the number of operations the sieve requires is much smaller than the number of operations required by trial division (if the limit is not too small).
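A simple way to see the gap is to count the basic operations of both methods for a few limits. The sketch below does that; `count_trial_divisions` and `count_sieve_crossings` are illustrative names, the sieve variant is assumed to start crossing off at p*p, and raw operation counts are only a proxy for actual running time.

```python
def count_trial_divisions(limit):
    """Number of divisions trial division by primes performs up to limit."""
    primes, divisions = [], 0
    for candidate in range(2, limit + 1):
        is_prime = True
        for p in primes:
            if p * p > candidate:
                break
            divisions += 1
            if candidate % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(candidate)
    return divisions

def count_sieve_crossings(limit):
    """Number of crossings-off the sieve performs up to limit."""
    is_prime = [True] * (limit + 1)
    crossings = 0
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False
                crossings += 1
        p += 1
    return crossings

for limit in (10**3, 10**4, 10**5):
    # Columns: limit, divisions (trial division), crossings-off (sieve)
    print(limit, count_trial_divisions(limit), count_sieve_crossings(limit))
```

The ratio between the two columns should widen as the limit grows, in line with the N^1.5/(log N)^2 versus N*log log N estimates.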

Summarising: Trial division does futile work by dividing primes, the sieve does futile work by repeatedly crossing off composites. There are relatively few primes, but many composites, so one might be tempted to think trial division wastes less work. But: composites have only a few distinct prime factors, while there are many primes below sqrt(p).