Simplest way to manually convert a decimal float to its bit representation based on IEEE 754, without using any library

2023-09-11 06:08:31 Author: 作业,你被判死刑了

I know there are a number of ways to read every bit of an IEEE 754 float using existing libraries.

I don't want that. I want to be able to manually convert a decimal float to its binary representation based on IEEE 754.

I understand how IEEE 754 works; I am just trying to apply it.

I am asking here to find out whether my approach is reasonable or silly, and I am also wondering how a PC does this quickly.

If I am given a decimal float as a string, I need to figure out what the exponent E is and what the mantissa M is.

Split the number into two parts: the integer part i and the fractional part f.

Deal with f: I repeatedly multiply by 2, take the integer part (either 0 or 1) as the next bit, remove the integer part, and repeat until f becomes 0.

Convert i to bits. This is easy: I repeatedly take mod 2 and div 2 to get all the bits of i.

For example, converting the f part:

0.390625 * 2 = 0.78125   0
0.78125 * 2 = 1.5625     1
0.5625 * 2 = 1.125       1
0.125 * 2 = 0.25         0
0.25 * 2 = 0.5           0
0.5 * 2 = 1              1
0

In this case, the temporary bits of 0.390625 are 0 1 1 0 0 1.
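The two loops described above can be sketched in Python as follows. This is only an illustration of the question's procedure: the function names `int_bits` and `frac_bits` and the `max_bits` cutoff are assumptions introduced here, and `fractions.Fraction` is used so that the decimal input is held exactly. The cutoff matters because most decimal fractions (0.1, for instance) never reach 0 under repeated doubling.

```python
from fractions import Fraction

def int_bits(i: int) -> str:
    """Bits of a non-negative integer, via repeated mod 2 / div 2."""
    bits = []
    while i > 0:
        bits.append(str(i % 2))   # least significant bit comes out first
        i //= 2
    return "".join(reversed(bits)) or "0"

def frac_bits(f: Fraction, max_bits: int = 64) -> str:
    """Bits of a fraction 0 <= f < 1, via repeated multiplication by 2.

    Stops when f reaches 0 or after max_bits bits (hypothetical cutoff).
    """
    bits = []
    while f != 0 and len(bits) < max_bits:
        f *= 2
        bit = int(f)              # integer part is 0 or 1
        bits.append(str(bit))
        f -= bit                  # remove the integer part and repeat
    return "".join(bits)

print(frac_bits(Fraction("0.390625")))   # 011001
print(int_bits(6))                       # 110
print(frac_bits(Fraction("0.1"), 8))     # 00011001 (truncated, not exact)
```

The truncated result for 0.1 shows why this loop alone is not enough in general: the remaining bits have to be rounded, not just cut off.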

Now I have the bits for i and f.

If all the bits of i are 0, I shift the bits of f left until the first 1 has been shifted out, because M has an implicit (hidden) leading 1. That gives me M, and I derive E from the number of shifts, taking the bias of E into account, of course.

If i is not 0, I concatenate the two bit strings and count how many right shifts are needed so that only a single leading 1 remains in front of the binary point, then use this count for E.
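As a rough sketch of the two cases just described, the following Python code assembles the single-precision fields from two bit strings (integer bits and fraction bits). The names `float32_fields` and `float32_bits` are hypothetical. The sketch assumes a nonzero value in the normal range whose significand fits in 24 bits, so it performs no rounding and handles neither subnormals nor overflow.

```python
import struct

def float32_fields(int_part: str, frac_part: str):
    """Return (biased exponent, 23-bit mantissa) for int_part.frac_part.

    Assumes a nonzero, normal value with at most 24 significant bits.
    """
    digits = int_part.lstrip("0")
    if digits:
        # i != 0: the point moves left past everything but the leading 1.
        exponent = len(digits) - 1
        significand = digits + frac_part
    else:
        # i == 0: the point moves right until the first 1 of f has passed.
        first_one = frac_part.index("1")
        exponent = -(first_one + 1)
        significand = frac_part[first_one:]
    significand = significand.rstrip("0")
    if len(significand) > 24 or not -126 <= exponent <= 127:
        raise ValueError("outside the scope of this sketch")
    mantissa = significand[1:].ljust(23, "0")   # drop the hidden leading 1
    return exponent + 127, int(mantissa, 2)     # 127 is the float32 bias

def float32_bits(int_part: str, frac_part: str, negative: bool = False) -> int:
    """Pack sign, exponent and mantissa into a 32-bit integer."""
    e, m = float32_fields(int_part, frac_part)
    return (int(negative) << 31) | (e << 23) | m

# 6.390625 is 110.011001 in binary.
bits = float32_bits("110", "011001")
print(format(bits, "08x"))                                        # 40cc8000
print(bits == struct.unpack(">I", struct.pack(">f", 6.390625))[0])  # True
```

`struct` is used here only to cross-check the hand-built bit pattern against the platform's own conversion; it plays no part in the conversion itself.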

I don't think any of my steps is wrong, but the whole procedure feels very cumbersome.

Is there an easy and clean way?

How does a computer do it?

Recommended answer

See the files src/lib/floating_point.ml and src/lib/floating_point.mli in Frama-C. They implement the conversion from decimal representation to floating-point for both single precision and double precision, without any external library. (The single-precision result cannot be obtained by rounding the double-precision one, because of double rounding issues.) The files are covered by the LGPL 2.1. This implementation is the subject of a couple of blog posts.

This is probably close to the simplest conversion function one can write: I had no performance constraints and only aimed to keep the code as simple and as correct as possible, without depending on an existing library such as MPFR.
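To make the rounding and double-rounding points concrete, here is a minimal Python sketch of correctly rounded conversion using exact rational arithmetic. It is not the Frama-C implementation. The names `round_to_binary` and `nearest` are assumptions made for illustration, and the exact arithmetic comes from Python's standard `fractions.Fraction`. Overflow to infinity is ignored.

```python
from fractions import Fraction

def round_to_binary(x: Fraction, precision: int, emin: int) -> Fraction:
    """Nearest value to x > 0 with `precision` significant bits (ties to even).

    emin is the smallest normal exponent; overflow is not handled.
    """
    # Find e such that 2**e <= x < 2**(e + 1).
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if Fraction(2) ** e > x:
        e -= 1
    e = max(e, emin)                       # subnormals use the exponent emin
    ulp = Fraction(2) ** (e - precision + 1)
    q, r = divmod(x, ulp)                  # x == q * ulp + r, 0 <= r < ulp
    if 2 * r > ulp or (2 * r == ulp and q % 2 == 1):
        q += 1                             # round up; ties go to the even q
    return q * ulp

def nearest(s: str, precision: int, emin: int) -> Fraction:
    """Round the decimal string s; sign is handled separately."""
    x = Fraction(s)
    if x == 0:
        return x
    r = round_to_binary(abs(x), precision, emin)
    return r if x > 0 else -r

# Double precision: 53 significant bits, smallest normal exponent -1022.
print(float(nearest("0.1", 53, -1022)) == 0.1)     # True

# Double rounding: x lies just above the midpoint of two adjacent singles.
x = 1 + Fraction(1, 2**24) + Fraction(1, 2**60)
direct = round_to_binary(x, 24, -126)
twice = round_to_binary(round_to_binary(x, 53, -1022), 24, -126)
print(direct == twice)                              # False
```

In the last example, rounding to double first lands exactly on the midpoint between two single-precision values, and the tie then resolves downward, whereas rounding directly to single precision goes upward. That is the double rounding problem the answer refers to.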

...
type parsed_float = {
  f_nearest : float ;
  f_lower : float ;
  f_upper : float ;
}

val single_precision_of_string: string -> parsed_float
val double_precision_of_string: string -> parsed_float
...
 