Finding the index of the nearest point in numpy arrays of x and y coordinates

2023-09-10 23:28:08 Author: 姿态动人

I have two 2D numpy arrays: x_array contains positional information in the x-direction, and y_array contains positions in the y-direction.

I then have a long list of x,y points.

For each point in the list, I need to find the array index of the location (specified in the arrays) that is closest to that point.

I have naively produced some code that works, based on this question: find nearest value in numpy array

i.e.

import time
import numpy

def find_index_of_nearest_xy(y_array, x_array, y_point, x_point):
    distance = (y_array-y_point)**2 + (x_array-x_point)**2
    idy,idx = numpy.where(distance==distance.min())
    return idy[0],idx[0]

def do_all(y_array, x_array, points):
    store = []
    for i in range(points.shape[1]):
        store.append(find_index_of_nearest_xy(y_array,x_array,points[0,i],points[1,i]))
    return store


# Create some dummy data
y_array = numpy.random.random(10000).reshape(100,100)
x_array = numpy.random.random(10000).reshape(100,100)

points = numpy.random.random(10000).reshape(2,5000)

# Time how long it takes to run
start = time.time()
results = do_all(y_array, x_array, points)
end = time.time()
print('Completed in: ', end - start)

I'm doing this over a large dataset and would really like to speed it up a bit. Can anyone optimize this?

Thanks.

UPDATE: SOLUTION following suggestions by @silvado and @justin (below)

import scipy.spatial

# Shoe-horn existing data for entry into KDTree routines
combined_x_y_arrays = numpy.dstack([y_array.ravel(),x_array.ravel()])[0]
points_list = list(points.transpose())


def do_kdtree(combined_x_y_arrays,points):
    mytree = scipy.spatial.cKDTree(combined_x_y_arrays)
    dist, indexes = mytree.query(points)
    return indexes

start = time.time()
results2 = do_kdtree(combined_x_y_arrays,points_list)
end = time.time()
print('Completed in: ', end - start)

The code above sped up my code (searching for 5000 points in 100x100 matrices) by a factor of 100. Interestingly, using scipy.spatial.KDTree (instead of scipy.spatial.cKDTree) gave timings comparable to my naive solution, so it is definitely worth using the cKDTree version...
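
One thing to watch: the KDTree query returns positions in the flattened (ravelled) coordinate list, not the (idy, idx) pairs that the naive version returns. Here is a minimal sketch of one way to map them back, assuming the same (2, n) layout for points as in the question (row 0 is y, row 1 is x). The helper name nearest_grid_indices is made up for illustration, and the sketch is written for Python 3 with a recent NumPy.

```python
import numpy as np
from scipy.spatial import cKDTree


def nearest_grid_indices(y_array, x_array, points):
    """Return (idy, idx) index arrays of the grid cell nearest to each point.

    points has shape (2, n): row 0 holds y values and row 1 holds x values,
    the same layout as in the question.
    """
    # One (y, x) row per grid cell, in the same order as ravel().
    grid = np.column_stack([y_array.ravel(), x_array.ravel()])
    tree = cKDTree(grid)
    # query expects one point per row, hence the transpose.
    _, flat = tree.query(points.T)
    # Convert flat positions back into 2D indices of the original arrays.
    return np.unravel_index(flat, y_array.shape)


# Dummy data, smaller than in the question to keep the example quick.
rng = np.random.default_rng(0)
y_array = rng.random((100, 100))
x_array = rng.random((100, 100))
points = rng.random((2, 50))

idy, idx = nearest_grid_indices(y_array, x_array, points)
```

After this, idy[i] and idx[i] should index the same cell that find_index_of_nearest_xy returns for point i, barring exact ties in distance.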


Recommended answer