可这广度优先搜索进行得更快?广度、更快

2023-09-11 03:34:36 作者:啃猪蹄的小仙女

我有一个较大的加权循环图发生在大约5-6路径循环的周期的数据集。它由约8000节点和每个节点有1-6个(通常为约4-5)连接。我做一对最短路径计算,并已实现了以下code做广度优先搜索。

I have a data set which is a large unweighted cyclic graph The cycles occur in loops of about 5-6 paths. It consists of about 8000 nodes and each node has from 1-6 (usually about 4-5) connections. I'm doing single pair shortest path calculations and have implemented the following code to do a breadth-first search.

from Queue import Queue

q = Queue()
parent = {}
fromNode = 'E1123'
toNode = 'A3455'

# path finding
q.put(fromNode)
parent[fromNode] = 'Root'

while not q.empty():
  # get the next node and add its neighbours to queue
  current = q.get()
  for i in getNeighbours(current):
    # note parent and only continue if not already visited
    if i[0] not in parent:
      parent[i[0]] = current
      q.put(i[0])

  # check if destination
  if current == toNode:
    print 'arrived at', toNode
    break

以上code使用了Python 2.6队列模块和getNeighbours()仅仅是一个子程序,使得单个MySQL调用和返回邻国的元组如清单(('富',),('巴',))。 SQL调用快。

The above code uses the Python 2.6 Queue module and getNeighbours() is simply a subroutine that makes a single MySQL call and returns the neighbours as a list of tuples e.g. (('foo',),('bar',)). The SQL call is quick.

在code工作却可以检测到低至深度约7层,大约需要20秒的运行(2.5GHz的英特尔4GB内存OS X 10.6)

The code works ok however testing to down to depths of about 7 layers takes about 20 seconds to run (2.5GHz Intel 4GB RAM OS X 10.6)

我会欢迎有关如何改善这一code运行时的任何意见。

I'd welcome any comments about how to improve the run time of this code.

推荐答案

那么,给出注释的upvotes,我会让现在的答案。

Well, given the upvotes on the comment, I'll make it an answer now.

在紧密循环的SQL是绝对放慢你失望。我不在乎通话有多快。想想看 - 你问一个查询被解析,查找要运行 - 以最快的速度,也就是说,它仍然在紧密循环。什么是您的数据集是什么样子?你能刚 SELECT 整个数据集到内存,或者至少工作,它外面的MySQL?

The SQL in the tight loop is definitely slowing you down. I don't care how fast the call is. Think about it -- you're asking for a query to be parsed, a lookup to be run -- as fast as that is, it's still in a tight loop. What does your data set look like? Can you just SELECT the entire data set into memory, or at least work with it outside of MySQL?

如果您使用内存中的数据,你会看到一个显著的性能增益。

If you work with that data in memory, you will see a significant performance gain.