好吧,这不是如何让所有的独有或者如何从我的数组中删除重复在PHP的问题。这是一部关于时间复杂度的问题。
Okay this is not a question of "how to get all uniques" or "How to remove duplicates from my array in php". This is a question about the time complexity.
我想通了 array_unique
是有点为O(n ^ 2 - N),这里是我的实现:
I figured that the array_unique
is somewhat O(n^2 - n) and here's my implementation:
function array_unique2($array)
{
$to_return = array();
$current_index = 0;
for ( $i = 0 ; $i < count($array); $i++ )
{
$current_is_unique = true;
for ( $a = $i+1; $a < count($array); $a++ )
{
if ( $array[$i] == $array[$a] )
{
$current_is_unique = false;
break;
}
}
if ( $current_is_unique )
{
$to_return[$current_index] = $array[$i];
}
}
return $to_return;
}
然而,当对这个标杆的 array_unique
我得到了以下结果:
However when benchmarking this against the array_unique
i got the following result:
测试(array_unique2)...操作了0.52146291732788秒。
Testing (array_unique2)... Operation took 0.52146291732788 s.
测试(array_unique)...操作了0.28323101997375秒。
Testing (array_unique)... Operation took 0.28323101997375 s.
这使得array_unique快两倍,我的问题是,为什么(两者有相同的随机数据)?
Which makes the array_unique twice as fast, my question is, why ( Both had the same random data ) ?
和我的一个朋友写了如下:
And a friend of mine wrote the following:
function array_unique2($a)
{
$n = array();
foreach ($a as $k=>$v)
if (!in_array($v,$n))
$n[$k]=$v;
return $n;
}
这是快两倍的内置于一体的PHP。
which is twice as fast as the built in one in php.
我想知道,为什么?
什么是array_unique的时间复杂度和in_array?
What is the time-complexity of array_unique and in_array?
修改 予除去从两个环路的计数($阵列),只是用在函数的顶部的变量,即获得了2秒100 000元素!
Edit I removed the count($array) from both loops and just used a variable in the top of the function, that gained 2 seconds on 100 000 elements!
虽然我不能代表本地array_unique功能,我可以告诉你,你的朋友的算法是速度更快,因为:
While I can't speak for the native array_unique function, I can tell you that your friends algorithm is faster because:
在他使用,而不是你的双for()循环一个foreach循环。 在foreach循环往往表现比在PHP中的循环速度更快。 在他用一个单一的,如果(!),而使用了两比较,如果()结构 在唯一的附加功能打电话给你的朋友发了in_array而你所谓的计数()的两倍。 您做了三变量声明,你的朋友都没得($ A,$ current_is_unique,$ current_index)尽管所有这些因素,单是巨大的,我能看到的累积效应会使得你的算法需要比你的朋友更长的时间。
While none of these factors alone is huge, I can see where the cumulative effect would make your algorithm take longer than your friends.