为什么我的嵌套的for循环这么长时间来计算?我的、嵌套、长时间、for

2023-09-11 23:16:17 作者:光头是第一个知道下雨的人

我有一个code生成所有的4个整数介于0和36种可能的组合。

I have a code that generates all of the possible combinations of 4 integers between 0 and 36.

这将是37 ^ 4号= 1874161。

This will be 37^4 numbers = 1874161.

我的code被写入在MATLAB:

My code is written in MATLAB:

i=0;
for a = 0:36
    for b= 0:36
        for c = 0:36
            for d = 0:36
                i=i+1;
                combination(i,:) = [a,b,c,d];             
            end          
        end
    end
end

我测试过这种使用数 3 而不是数量 36 和它工作得很好。

I've tested this with using the number 3 instead of the number 36 and it worked fine.

如果有1874161的组合,并与过度提醒的100个时钟周期做了补充和写入值的猜测,那么如果我有一个2.3GHz的个人电脑,这就是:

If there are 1874161 combinations, and with An overly cautions guess of 100 clock cycles to do the additions and write the values, then if I have a 2.3GHz PC, this is:

1874161 *(2300000000分之1)* 100 = 0.08148526086

1874161 * (1/2300000000) * 100 = 0.08148526086

第二次的一小部分。但是,它已经运行了大约半小时为止。

A fraction of a second. But It has been running for about half an hour so far.

我确实收到了警告,组合改变大小每次循环,考虑predefining其规模速度,但这并不能影响它那么多就可以了?

I did receive a warning that combination changes size every loop iteration, consider predefining its size for speed, but this can't effect it that much can it?

推荐答案

作为@horchler建议你需要preallocate目标磁盘阵列

这是因为你的程序是不是 O(N ^ 4)无preallocation。每次添加新行阵列就需要被调整,所以新的更大的数组创建(如MATLAB不知道有多大的数组这将是它可能是由1项只会增加),然后旧的阵列复制到它,最后旧的阵列将被删除。所以,当你有10个项目在数组中,加入11,然后是10项的拷贝将被添加到迭代...如果我没有弄错,导致类似 O(N ^ 12)这是大规模更为庞大

This is because your program is not O(N^4) without preallocation. Each time you add new line to array it need to be resized, so new bigger array is created (as matlab do not know how big array it will be it probably increase only by 1 item) and then old array is copied into it and lastly old array is deleted. So when you have 10 items in array and adding 11th, then a copying of 10 items is added to iteration ... if I am not mistaken that leads to something like O(N^12) which is massively more huge

估计为(N ^ 4)*(1 + 2 + 3 + ... + N ^ 4)=((N ^ 4)^ 3)/ 2

另外,重新分配过程中增加大小违约CACHE障碍放缓更加随上面的每个高速缓存大小的障碍。

Also the reallocation process is increasing in size breaching CACHE barriers slowing down even more with increasing i above each CACHE size barrier.

的唯一解决方案,这一点没有preallocation是结果存储在链表

不知道Matlab的有这个选项,但是这需要每个项目(32/64位值),这使得您的阵列 2 + 倍大单/双指针。

Not sure Matlab has this option but that will need one/two pointer per item (32/64 bit value) which renders your array 2+ times bigger.

如果您需要更多的速度,然后有一些方法(可能不是Matlab的):

使用多线程的阵列填充完全parallelisable 使用的内存块拷贝(代表MOVSD )或DMA数据是周期性重复 您也可以考虑从我计算值上运行,而不是记住整个阵列,根据使用它可以更快地在某些情况下... use multi-threading for array filling is fully parallelisable use memory block copy (rep movsd) or DMA the data is periodically repeating You can also consider to compute the value from i on the run instead of remember the whole array, depending on the usage it can be faster in some cases...