Suppose we have a vector of size N, for example:
x = rand(N,1)
I want to find the subset of K elements of this vector that has the minimum standard deviation.
When N and K are small, it is easy to find the best subset, since I can use nchoosek(x,K) to enumerate all possible subsets. But when N and K get larger, say N=50 and K=25, nchoosek fails to compute the combinations because the number of possible subsets is huge.
Is there a more efficient algorithm for finding the subset with the smallest standard deviation, for example via dynamic programming? Any ideas?
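To put a rough number on "huge", here is a small sketch that counts the subsets for N=50, K=25 and estimates the memory a full enumeration would need. The variable names numSubsets and bytesNeeded are made up for illustration, and the estimate assumes one 8-byte double per stored element.

```matlab
N = 50;
K = 25;

% With scalar inputs, nchoosek returns only the number of combinations
% (about 1.26e14 here); it does not enumerate them.
numSubsets = nchoosek(N, K);

% Enumerating would need a numSubsets-by-K matrix of doubles (8 bytes each).
bytesNeeded = numSubsets * K * 8;

fprintf('%.3g subsets, roughly %.3g bytes to store them all\n', ...
        numSubsets, bytesNeeded);
```

That is on the order of 10^16 bytes, so brute-force enumeration is out of reach well before N=50.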
Update:
Following the answers, I implemented the sliding-window approach in a loop and compared it to the combinatorial solution. The results are always the same, but the speed gain is enormous.
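n = 20;
k = 10;
x = rand(n,1);

% Brute force: enumerate every k-element subset (one per row of C).
C = nchoosek(x, k);
tic
mins = realmax;
for i = 1:size(C,1)
    s = std(C(i,:));
    if s < mins
        mins = s;
        bestC = C(i,:);
    end
end
toc

% Sliding window: sort, then check each run of k consecutive sorted values.
tic
[x2, j] = sort(x);
mins2 = realmax;
for i = 1:(n-k+1)
    s = std(x2(i:(i+k-1)));
    if s < mins2
        mins2 = s;
        idx = j(i:(i+k-1));   % indices of the best subset in the original x
    end
end
toc

% Compare with a small tolerance rather than ==, since the two methods
% may sum the same values in a different order.
if abs(mins - mins2) < 1e-12
    'Equal'
end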
This gives:
Elapsed time is 7.786579 seconds.
Elapsed time is 0.002068 seconds.
ans =
Equal
Solution
Sort the array, then compute the standard deviation in one pass with a rolling window of length K.
I'm sure this will give you the right answer; I'll think about whether I can prove it.
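As a minimal sketch of the sort-then-roll idea, the window statistics can be read off cumulative sums, so each window costs constant time after the sort. The data, k, and all variable names below are made up for illustration. The sum-of-squares formula is an assumption of this sketch rather than part of the answer, and it can lose precision when the values are large compared with their spread.

```matlab
x = [0.9; 0.1; 0.45; 0.5; 0.55; 0.05; 0.8];   % example data (made up)
k = 3;                                         % subset size (made up)
n = numel(x);

[xs, order] = sort(x);          % ascending values and their original positions

% Prefix sums with a leading zero, so that sum(xs(i:i+k-1)) equals
% c1(i+k) - c1(i), and likewise for the squares.
c1 = [0; cumsum(xs)];
c2 = [0; cumsum(xs.^2)];

i  = (1:n-k+1)';                % start index of every window
s1 = c1(i+k) - c1(i);           % window sums
s2 = c2(i+k) - c2(i);           % window sums of squares

% Sample variance of each window: (sum(x.^2) - sum(x)^2/k) / (k-1).
% max(..., 0) guards against tiny negative values caused by rounding.
v = max(s2 - s1.^2 / k, 0) / (k - 1);

[vmin, ibest] = min(v);
minStd = sqrt(vmin);            % smallest standard deviation over all windows
idx = order(ibest:ibest+k-1);   % positions of the best subset in the original x
```

For this data the best window is the run 0.45, 0.5, 0.55, so idx points at elements 3, 4 and 5 of x. If precision matters more than speed, calling std on each window is the safer choice.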
Hand-wavy argument, with a possible gap in logic in the "extend this" part:
Consider an element x from your list. Let's find the minimum standard deviation of a set of size 2 containing this element: we get it by choosing x and the element closest to x. Extend this to k elements and we get a set that is a contiguous part of the sorted list containing x. To pick the best subset of k elements overall (i.e. for any x), we therefore just have to iterate over the sorted list as described before.
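One possible way to close the gap in the "extend this" step is an exchange argument. The following is a sketch, assuming the list values are distinct; the symbols S, S', m, m', a, b and y are notation introduced here and are not part of the original argument.

```latex
Let $S$ be a $k$-element subset with mean $m$, let $a = \min S$ and $b = \max S$,
and suppose some list element $y \notin S$ satisfies $a < y < b$
(that is, $S$ is not contiguous in the sorted list).

Since $a \le m \le b$:
if $y \le m$ then $|y - m| < |a - m|$, and we set $S' = (S \setminus \{a\}) \cup \{y\}$;
otherwise $|y - m| < |b - m|$, and we set $S' = (S \setminus \{b\}) \cup \{y\}$.

Let $m'$ be the mean of $S'$. Because the mean minimizes the sum of squared deviations,
\[
  \sum_{s \in S'} (s - m')^2 \;\le\; \sum_{s \in S'} (s - m)^2 \;<\; \sum_{s \in S} (s - m)^2 .
\]
Dividing by $k - 1$ shows that $S'$ has a strictly smaller standard deviation than $S$.
```

So a subset with a "hole" can always be improved, which means an optimal subset must be one of the n-k+1 contiguous windows of the sorted list. With repeated values the strict inequality becomes non-strict, and the argument then only shows that some contiguous window is optimal.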