Alignment requirements for atomic x86 instructions

2023-09-11 07:32:12 Author: 乖乖哒~

Microsoft offers the InterlockedCompareExchange function for performing atomic compare-and-swap operations. There is also an _InterlockedCompareExchange intrinsic.

On x86, these are implemented using the cmpxchg instruction.
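As a rough illustration of the compare-and-swap semantics described above, here is a portable sketch using std::atomic rather than the Windows API. The name compare_exchange_sketch is hypothetical; it only mirrors the documented behaviour of returning the value that was in the target before the call. Mainstream x86 compilers typically lower this to a lock cmpxchg, but that is an assumption about code generation, not something the C++ standard promises.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: store `exchange` only if `target` still holds
// `comparand`, and return the value observed in `target` before the call.
long compare_exchange_sketch(std::atomic<long>& target, long exchange, long comparand) {
    long expected = comparand;
    // On failure, compare_exchange_strong copies the observed value into
    // `expected`; on success `expected` is untouched and equals the old value.
    target.compare_exchange_strong(expected, exchange);
    return expected;
}

// Usage: the first call matches and swaps, the second does not match.
void compare_exchange_usage() {
    std::atomic<long> value{5};
    assert(compare_exchange_sketch(value, 9, 5) == 5);  // swapped 5 -> 9
    assert(compare_exchange_sketch(value, 1, 5) == 9);  // no swap, value stays 9
    assert(value.load() == 9);
}
```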

However, the documentation for these three approaches does not seem to agree on the alignment requirements.

Intel's reference manual says nothing about alignment (other than that if alignment checking is enabled and an unaligned memory reference is made, an exception is generated).

I also looked up the lock prefix, which specifically states that

The integrity of the LOCK prefix is not affected by the alignment of the memory field.

(Emphasis mine.)

So Intel seems to say that alignment is irrelevant. The operation will be atomic no matter what.

The _InterlockedCompareExchange intrinsic documentation also says nothing about alignment; however, the InterlockedCompareExchange function documentation states that

The parameters for this function must be aligned on a 32-bit boundary; otherwise, the function will behave unpredictably on multiprocessor x86 systems and any non-x86 systems.
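For illustration, one way to satisfy and check a 32-bit boundary requirement like the one quoted above is sketched below. The helper is_aligned and the struct Counter are hypothetical names, and std::int32_t stands in for the 32-bit Windows LONG type so the sketch stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true when `p` sits on an `alignment`-byte boundary.
bool is_aligned(const void* p, std::size_t alignment) {
    return reinterpret_cast<std::uintptr_t>(p) % alignment == 0;
}

// alignas(4) requests the 32-bit boundary explicitly, even inside a
// structure whose packing might otherwise be changed by compiler options.
struct Counter {
    alignas(4) std::int32_t value;
};

// Usage: an aligned member passes the check, a deliberately offset
// address inside a byte buffer does not.
void alignment_usage() {
    Counter c{0};
    assert(is_aligned(&c.value, 4));
    alignas(4) unsigned char raw[8] = {};
    assert(!is_aligned(raw + 1, 4));
}
```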

So what gives? Are the alignment requirements for InterlockedCompareExchange just there to make sure the function works even on pre-486 CPUs, where the cmpxchg instruction isn't available? That seems likely based on the above information, but I'd like to be sure before I rely on it. :)

Or is alignment required by the ISA to guarantee atomicity, and I'm just looking in the wrong places in Intel's reference manuals?

Recommended answer

The PDF you are quoting is from 1999 and clearly outdated.

The up-to-date Intel documentation, specifically Volume 3A, tells a different story.

For example, on a Core i7 processor, you still have to make sure your data does not span cache lines, or else the operation is not guaranteed to be atomic.
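A minimal sketch of how one might test whether an access would span cache lines follows. The 64-byte line size is an assumption (common on current x86 parts, but not fixed by the architecture), and spans_cache_lines is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: 64-byte cache lines. Query the real value on the target
// machine if it matters.
constexpr std::size_t kAssumedCacheLine = 64;

// True when the `size` bytes starting at `addr` touch more than one line.
constexpr bool spans_cache_lines(std::uintptr_t addr, std::size_t size,
                                 std::size_t line = kAssumedCacheLine) {
    return size != 0 && (addr / line) != ((addr + size - 1) / line);
}

// Usage: a 4-byte access at offset 60 ends at byte 63 and stays in one
// line; the same access at offset 62 ends at byte 65 and crosses over.
void cache_line_usage() {
    assert(!spans_cache_lines(60, 4));
    assert(spans_cache_lines(62, 4));
    assert(!spans_cache_lines(64, 8));
}
```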

In Volume 3A, System Programming, for x86/x64, Intel clearly states:

8.1.1 Guaranteed Atomic Operations

The Intel486 processor (and newer processors since) guarantees that the following basic memory operations will always be carried out atomically:

- Reading or writing a byte
- Reading or writing a word aligned on a 16-bit boundary
- Reading or writing a doubleword aligned on a 32-bit boundary

The Pentium processor (and newer processors since) guarantees that the following additional memory operations will always be carried out atomically:

- Reading or writing a quadword aligned on a 64-bit boundary
- 16-bit accesses to uncached memory locations that fit within a 32-bit data bus

The P6 family processors (and newer processors since) guarantee that the following additional memory operation will always be carried out atomically:

- Unaligned 16-, 32-, and 64-bit accesses to cached memory that fit within a cache line

Accesses to cacheable memory that are split across cache lines and page boundaries are not guaranteed to be atomic by the Intel Core 2 Duo, Intel® Atom™, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, P6 family, Pentium, and Intel486 processors. The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium M, Pentium 4, Intel Xeon, and P6 family processors provide bus control signals that permit external memory subsystems to make split accesses atomic; however, nonaligned data accesses will seriously impact the performance of the processor and should be avoided.
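The quoted rules for plain reads and writes can be condensed into a small predicate. This is only a sketch: the Cpu enum and the function name are hypothetical, the 64-byte default line size is an assumption, and the Pentium rule about 16-bit accesses to uncached memory is deliberately left out. A false result means "not guaranteed by the quoted text", not "will definitely tear".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical grouping of the processor generations named in the quote.
enum class Cpu { Intel486, Pentium, P6OrNewer };

// Encodes the quoted guarantees for plain loads/stores to cached memory.
bool plain_access_guaranteed_atomic(Cpu cpu, std::uintptr_t addr,
                                    std::size_t size, std::size_t line = 64) {
    if (size == 1) return true;  // bytes are always atomic
    if (size != 2 && size != 4 && size != 8) return false;  // not covered

    const bool natural = addr % size == 0;
    if (size <= 4 && natural) return true;                          // 486 and later
    if (size == 8 && natural && cpu != Cpu::Intel486) return true;  // Pentium and later

    // P6 and later: unaligned is still fine if it stays inside one line.
    const bool within_line = addr / line == (addr + size - 1) / line;
    return cpu == Cpu::P6OrNewer && within_line;
}

// Usage: an unaligned dword is covered on P6 only while it stays in one line.
void atomicity_usage() {
    assert(plain_access_guaranteed_atomic(Cpu::P6OrNewer, 0x1001, 4));
    assert(!plain_access_guaranteed_atomic(Cpu::P6OrNewer, 0x103E, 4));
    assert(!plain_access_guaranteed_atomic(Cpu::Pentium, 0x1001, 4));
}
```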