Performance of immutable data structures
Tags: data structures, performance

2023-09-03 09:08:22 Author: 转身的微笑

I don't understand how something like a Set can be immutable and still have acceptable performance.

From what I've read, F# Sets use red-black trees internally. If we basically have to recreate the tree every time we add something new to it, how can it ever perform well? What am I missing here?

Although I am asking about F#'s Sets, I think this is just as relevant to any other language that has or uses immutable data structures.

Thanks

Recommended answer

Almost all immutable collections are some form of balanced tree. To create a new tree, you only have to reallocate the nodes on the path from the change (insert, remove, "update") to the root; every other node is shared with the old tree. As long as the tree is balanced, this takes logarithmic time. If you have something like a 2-3-4 tree (similar to red-black trees) with an expected outdegree of three, you can handle a million elements with on the order of 10 allocations per update (log base 3 of a million is roughly 12.6, so a dozen or so path nodes).

And in languages where data structures are expected to be pure, the implementation makes sure allocation is fast. Allocating a four-element node costs a compare, an increment, and four stores. In many cases the cost of the compare can be amortized over several allocations.
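The "compare, increment, four stores" cost describes what is usually called bump allocation. The following Python sketch is a toy model of it, not any real runtime's allocator: the class `BumpArena` and its methods are hypothetical, and a real runtime would trigger a garbage collection where this sketch raises `MemoryError`.

```python
class BumpArena:
    """Toy bump allocator over a fixed block of cells (illustrative only)."""

    def __init__(self, size: int) -> None:
        self.cells = [None] * size
        self.top = 0          # next free cell
        self.limit = size     # end of the arena

    def alloc_node4(self, a, b, c, d) -> int:
        """Allocate one four-element node: one compare, four stores, one increment."""
        if self.top + 4 > self.limit:              # the compare
            raise MemoryError("arena full; a real runtime would collect here")
        addr = self.top
        self.cells[addr] = a                       # four stores
        self.cells[addr + 1] = b
        self.cells[addr + 2] = c
        self.cells[addr + 3] = d
        self.top = addr + 4                        # the increment
        return addr

    def alloc_two_node4(self, first, second) -> tuple:
        """Allocate two nodes with a single limit check (amortized compare)."""
        if self.top + 8 > self.limit:              # one compare for both nodes
            raise MemoryError("arena full; a real runtime would collect here")
        addr = self.top
        self.cells[addr:addr + 4] = first
        self.cells[addr + 4:addr + 8] = second
        self.top = addr + 8
        return addr, addr + 4


arena = BumpArena(16)
n1 = arena.alloc_node4(1, 2, 3, 4)
n2 = arena.alloc_node4(5, 6, 7, 8)
n3, n4 = arena.alloc_two_node4([9, 10, 11, 12], [13, 14, 15, 16])
```

`alloc_two_node4` shows the amortization mentioned above: when the code knows it needs several nodes in a row, one limit check can cover all of them.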

If you want to know more about how these structures work, an excellent source is Purely Functional Data Structures by Chris Okasaki.