Why is floating-point division slow?

2023-09-10 23:29:51 Author: 寻一夜情管饭

What are the steps in the algorithm to do floating point division?

Why is it slower than, say, multiplication?

Is it done the same way we do division by hand? By repeatedly dividing by the divisor, subtracting the result to obtain a remainder, aligning the number again and continuing till the remainder is less than a particular value?

Also, why do we gain performance if, instead of doing

a = b / c

we do

d = 1 / c
a = b * d
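
As a rough illustration of where the reciprocal form can pay off: the gain comes when the same divisor is reused, so one division is followed by many multiplications. The sketch below applies that to the weighted-distribution case mentioned in the question. The function name `distribute` and its parameters are hypothetical, and it assumes the weights are positive.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: split `total` among n contenders in proportion to
 * their weights. A single division computes the reciprocal of the weight
 * sum; every share after that costs only multiplications. */
void distribute(double total, const double *weights, double *shares, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += weights[i];
    assert(sum > 0.0);              /* assumption: weights are positive */

    double inv = 1.0 / sum;         /* d = 1 / c, done once */
    for (size_t i = 0; i < n; i++)
        shares[i] = total * weights[i] * inv;   /* a = b * d, done n times */
}
```

Note that `b * (1 / c)` rounds twice, so it is not guaranteed to give the same bits as `b / c`. That is why a C compiler generally does not make this substitution on its own unless asked to relax floating-point semantics.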

Edit: Basically I was asking because someone asked me to distribute a value among contenders based on assigned weights. I did all of this in integers and was later asked to convert to float, which caused a slowdown in performance. I was just interested in knowing how C or C++ performs these operations and what causes the slowness.

Recommended answer

From a hardware point of view, division is an iterative algorithm, and the time it takes is proportional to the number of bits. Fast dividers use a high-radix algorithm that produces several result bits per iteration: radix-4 gives 2 bits per step and radix-16 gives 4. At 4 bits per iteration, a 32-bit divide needs at least 8 steps.
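
To make the iterative nature concrete, here is a minimal sketch of shift-and-subtract (restoring) division on unsigned 32-bit integers, which is essentially long division by hand in base 2. It produces one quotient bit per iteration. It is only an illustration, not a description of any particular CPU's divider, and the name `divide_bitwise` is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Restoring shift-and-subtract division: one quotient bit per iteration.
 * Returns the quotient and stores the remainder through `remainder`. */
uint32_t divide_bitwise(uint32_t n, uint32_t d, uint32_t *remainder)
{
    assert(d != 0);
    uint64_t rem = 0;    /* wider than 32 bits so the shift cannot overflow */
    uint32_t quot = 0;
    for (int bit = 31; bit >= 0; bit--) {
        rem = (rem << 1) | ((n >> bit) & 1u);   /* bring down the next bit */
        quot <<= 1;
        if (rem >= d) {      /* divisor fits: subtract and record a 1 */
            rem -= d;
            quot |= 1u;
        }
    }
    *remainder = (uint32_t)rem;
    return quot;
}
```

Each iteration needs the remainder left by the previous one, so the 32 steps cannot run side by side. A higher-radix divider follows the same scheme but picks a multi-bit quotient digit per step, which shortens the chain without removing the dependency.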

Multiplication can be done in parallel to a certain degree. Without going into detail, you can break up a large multiplication into several smaller, independent ones. These multiplications can again be broken down until you are at the bit level, or you can stop earlier and use a small lookup table in hardware. This makes the multiplication hardware expensive in terms of silicon area, but also very fast. It's the classic size/speed tradeoff.

You need log2(n) steps to combine the results computed in parallel, so a 32-bit multiply needs 5 logical steps (if you go down to the minimum). Fortunately, these 5 steps are a good deal simpler than the division steps (they are just additions). That means that in practice multiplies are even faster.
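
The log2 argument can be sketched in software, under the assumption that the multiplication is broken all the way down to the bit level: 32 partial products are formed independently of one another, then summed pairwise in rounds. The function name `multiply_tree` is hypothetical, and the loop only models the structure, since hardware performs all additions within a round at the same time.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply two 32-bit values by forming 32 independent partial products
 * and summing them pairwise. Returns the 64-bit product; *levels receives
 * the number of pairwise-addition rounds that were needed. */
uint64_t multiply_tree(uint32_t a, uint32_t b, int *levels)
{
    uint64_t pp[32];
    for (int i = 0; i < 32; i++)     /* each partial product depends only on a and bit i of b */
        pp[i] = ((b >> i) & 1u) ? ((uint64_t)a << i) : 0;

    int rounds = 0;
    for (int count = 32; count > 1; count /= 2) {
        for (int i = 0; i < count / 2; i++)   /* additions within a round are independent */
            pp[i] = pp[2 * i] + pp[2 * i + 1];
        rounds++;
    }
    assert(rounds == 5);             /* log2(32) rounds: 32 -> 16 -> 8 -> 4 -> 2 -> 1 */
    *levels = rounds;
    return pp[0];
}
```

Hardware multipliers typically use carry-save adder trees rather than full additions at every level, but the depth argument is the same: the number of rounds grows with the logarithm of the operand width, not with the width itself.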
