I have a list of lists and I am trying to group or cluster them based on their items. A nested list starts a new group if none of the elements are in the previous group.
Input:
paths = [
    ['D', 'B', 'A', 'H'],
    ['D', 'B', 'A', 'C'],
    ['H', 'A', 'C'],
    ['E', 'G', 'I'],
    ['F', 'G', 'I'],
]
My failed Code:
paths = [
    ['D', 'B', 'A', 'H'],
    ['D', 'B', 'A', 'C'],
    ['H', 'A', 'C'],
    ['E', 'G', 'I'],
    ['F', 'G', 'I']
]
groups = []
paths_clone = paths
for path in paths:
    for node in path:
        for path_clone in paths_clone:
            if node in path_clone:
                if not path == path_clone:
                    groups.append([path, path_clone])
                else:
                    groups.append(path)
print(groups)
Expected Output:
[
    [
        ['D', 'B', 'A', 'H'],
        ['D', 'B', 'A', 'C'],
        ['H', 'A', 'C']
    ],
    [
        ['E', 'G', 'I'],
        ['F', 'G', 'I']
    ]
]
Another Example:
paths = [
    ['shifter', 'barrel', 'barrel shifter'],
    ['ARM', 'barrel', 'barrel shifter'],
    ['IP power', 'IP', 'power'],
    ['ARM', 'barrel', 'shifter'],
]
Expected Output Groups:
output = [
    [
        ['shifter', 'barrel', 'barrel shifter'],
        ['ARM', 'barrel', 'barrel shifter'],
        ['ARM', 'barrel', 'shifter'],
    ],
    [
        ['IP power', 'IP', 'power'],
    ],
]
Solution
You are grouping based on sets, so use a set to detect new groups:
def grouper(sequence):
    group, members = [], set()
    for item in sequence:
        if group and members.isdisjoint(item):
            # new group, yield and start new
            yield group
            group, members = [], set()
        group.append(item)
        members.update(item)
    yield group
This gives:
>>> for group in grouper(paths):
...     print(group)
...
[['D', 'B', 'A', 'H'], ['D', 'B', 'A', 'C'], ['H', 'A', 'C']]
[['E', 'G', 'I'], ['F', 'G', 'I']]
Or you can collect the groups into a list:
output = list(grouper(paths))
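Note that this generator only ever compares an item against the group currently being built. As a rough sketch of what that means, here is the same function run on the second example from the question (the name `paths2` is only illustrative):

```python
def grouper(sequence):
    group, members = [], set()
    for item in sequence:
        if group and members.isdisjoint(item):
            # no shared element with the current group: emit it, start over
            yield group
            group, members = [], set()
        group.append(item)
        members.update(item)
    yield group


# Second example from the question; `paths2` is a made-up name.
paths2 = [
    ['shifter', 'barrel', 'barrel shifter'],
    ['ARM', 'barrel', 'barrel shifter'],
    ['IP power', 'IP', 'power'],
    ['ARM', 'barrel', 'shifter'],
]

contiguous = list(grouper(paths2))
for group in contiguous:
    print(group)
# The last path ends up in a third group of its own, because the
# 'IP power' path sits between it and the paths it overlaps with.
```

So on that input you get three groups rather than the two you expect.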
This assumes that each group is contiguous in the input. If the members of a group can be separated by items belonging to other groups, as in your second example, you need to process the whole list and, for each item, loop over all groups constructed so far:
def grouper(sequence):
    result = []  # will hold (members, group) tuples
    for item in sequence:
        for members, group in result:
            if members.intersection(item):  # overlap
                members.update(item)
                group.append(item)
                break
        else:  # no group found, add new
            result.append((set(item), [item]))
    return [group for members, group in result]
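One caveat with this version: an item is added to the first group it overlaps and the loop stops there, so if an item shares elements with two existing groups, those groups stay separate. If you need them merged, one possible variant is sketched below. The name `grouper_merging` and the sample inputs are my own, and this is only one way to do it, under the assumption that overlap should be treated transitively:

```python
def grouper_merging(sequence):
    result = []  # (members, group) pairs; the member sets never overlap
    for item in sequence:
        item_set = set(item)
        members, merged = set(item_set), []
        remaining, slot = [], None
        for other_members, other_group in result:
            if other_members.isdisjoint(item_set):
                remaining.append((other_members, other_group))
            else:
                # remember where the first overlapping group was,
                # so the merged group keeps that position
                if slot is None:
                    slot = len(remaining)
                members |= other_members
                merged.extend(other_group)
        if slot is None:  # no overlap at all: new group at the end
            slot = len(remaining)
        remaining.insert(slot, (members, merged + [item]))
        result = remaining
    return [group for _, group in result]


# Second example from the question (illustrative name).
paths2 = [
    ['shifter', 'barrel', 'barrel shifter'],
    ['ARM', 'barrel', 'barrel shifter'],
    ['IP power', 'IP', 'power'],
    ['ARM', 'barrel', 'shifter'],
]
output = grouper_merging(paths2)

# A made-up input where the last item bridges two earlier groups.
bridged = grouper_merging([['A'], ['B'], ['A', 'B']])
```

On the question's second example this gives the same two groups as the function above; the difference only shows up when an item bridges groups, as in the last call, where all three items end up in a single group.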