Get the n smallest elements of a Python list

2023-09-11 02:03:17 Author: 只顾着爱伱，却忘了自己ゝ

I need to get the n smallest numbers from a list in Python. This needs to be really fast because it is in a performance-critical part of the code and is repeated many times.

n is usually no greater than 10, and the list usually has around 20000 elements. The list is different each time I call the function. Sorting can't be done in place.

Initially, I wrote this function:

def mins(items, n):
    mins = [float('inf')]*n
    for item in items:
        for i, min in enumerate(mins):
            if item < min:
                mins.insert(i, item)
                mins.pop()
                break
    return mins

But this function can't beat a simple sorted(items)[:n], which sorts the entire list. Here is my test:

from random import randint, random
import time

test_data = [randint(10, 50) + random() for i in range(20000)]

init = time.time()
result = mins(test_data, 8)  # separate name, so the mins function is not overwritten
print('mins(items, n):', time.time() - init)

init = time.time()
result = sorted(test_data)[:8]
print('sorted(items)[:n]:', time.time() - init)

Results:

mins(items, n): 0.0632939338684
sorted(items)[:n]: 0.0231449604034

sorted(items)[:n] is nearly three times faster. I believe this is because:

The insert() operation is costly because Python lists are not linked lists. Also, sorted() is an optimized C function, while mine is pure Python.

Is there any way to beat sorted(items)[:n]? Should I use a C extension, Pyrex, Psyco, or something like that?

Thanks in advance for your answers.

Solution

You actually want a sorted sequence of mins.

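def mins2(items, n):
    mins = items[:n]              # seed with the first n items
    mins.sort()
    for i in items[n:]:
        if i < mins[-1]:          # only update when i beats the current largest
            mins.append(i)
            mins.sort()
            mins = mins[:n]       # drop the largest to keep n items
    return mins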

This runs much faster because you don't touch mins at all unless the given item is provably smaller than the largest value currently in it (mins[-1]). It takes about 1/10th the time of the original algorithm.
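
To see why the guard pays off, here is a minimal sketch that counts how often the guarded branch actually runs. The helper name count_updates and the seed are assumptions made for illustration, not part of the original answer, and the helper assumes n >= 1. For randomly ordered input the branch should fire only rarely (on the order of n*ln(N/n) times for N items), while input sorted in descending order is the worst case and triggers it on every item.

```python
import random

def count_updates(items, n):
    """Count how many times the guarded branch runs (hypothetical helper, assumes n >= 1)."""
    mins = sorted(items[:n])          # seed with the first n items
    updates = 0
    for item in items[n:]:
        if item < mins[-1]:           # same guard as in the loop above
            mins.append(item)
            mins.sort()
            mins = mins[:n]
            updates += 1
    return updates

random.seed(0)  # arbitrary seed, only to make the run repeatable
data = [random.randint(10, 50) + random.random() for _ in range(20000)]
updates = count_updates(data, 8)
print(updates, "updates out of", len(data) - 8, "items scanned")
```

The exact count depends on the data, so treat the printed number as indicative only.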

This ran in zero time on my Dell. I had to run it 10 times to get a measurable run time.

mins(items, n): 0.297000169754
sorted(items)[:n]: 0.109999895096
mins2(items)[:n]: 0.0309998989105
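
Since a single run is too short to measure reliably, one option is to let the standard timeit module repeat the call. The sketch below is not the benchmark that produced the numbers above; it restates the loop as a function named mins2 (matching the label in the results) and uses number=10 to mirror the 10 runs mentioned. Timings will differ by machine and Python version.

```python
import timeit
from random import randint, random

def mins2(items, n):
    mins = sorted(items[:n])          # seed with the first n items, sorted
    for i in items[n:]:
        if i < mins[-1]:              # skip items that cannot be among the n smallest
            mins.append(i)
            mins.sort()
            mins = mins[:n]
    return mins

test_data = [randint(10, 50) + random() for _ in range(20000)]

# timeit.timeit returns the total time in seconds for all `number` runs
t_sorted = timeit.timeit(lambda: sorted(test_data)[:8], number=10)
t_mins2 = timeit.timeit(lambda: mins2(test_data, 8), number=10)

print('sorted(items)[:n]:', t_sorted)
print('mins2(items, n):', t_mins2)
```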

Using bisect.insort instead of append and sort may speed this up a hair further.
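
A minimal sketch of that bisect.insort variant follows. The function name mins_bisect is hypothetical, and whether it is actually faster than append and sort for small n has not been measured here. The guard is unchanged; only the update step differs.

```python
import bisect

def mins_bisect(items, n):
    """Return the n smallest values of items in ascending order (hypothetical name)."""
    mins = sorted(items[:n])              # seed with the first n items, sorted
    if not mins:                          # n == 0 or empty input: nothing to track
        return mins
    for item in items[n:]:
        if item < mins[-1]:               # only update when item beats the current largest
            bisect.insort(mins, item)     # insert while keeping mins sorted
            mins.pop()                    # drop the largest so len(mins) stays n
    return mins

print(mins_bisect([5, 1, 9, 3, 7, 2, 8], 3))  # prints [1, 2, 3]
```

Like the original loop, this leaves the input list untouched, which matters because sorting in place is not allowed.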
