What is the most efficient/elegant way to parse a flat table into a tree?

2023-09-10 22:20:45 Author: 判命

Assume you have a flat table that stores an ordered tree hierarchy:

Id   Name         ParentId   Order
 1   'Node 1'            0      10
 2   'Node 1.1'          1      10
 3   'Node 2'            0      20
 4   'Node 1.1.1'        2      10
 5   'Node 2.1'          3      10
 6   'Node 1.2'          1      20

Here's a diagram, where we have [id] Name. Root node 0 is fictional.


                       [0] ROOT
                          /    \ 
              [1] Node 1          [3] Node 2
              /       \                   \
    [2] Node 1.1     [6] Node 1.2      [5] Node 2.1
          /          
 [4] Node 1.1.1

What minimalistic approach would you use to output that to HTML (or text, for that matter) as a correctly ordered, correctly indented tree?

Assume further you only have basic data structures (arrays and hashmaps), no fancy objects with parent/children references, no ORM, no framework, just your two hands. The table is represented as a result set, which can be accessed randomly.

Pseudocode or plain English is okay; this is purely a conceptual question.

Bonus question: Is there a fundamentally better way to store a tree structure like this in an RDBMS?

Edits and additions

To answer one commenter's (Mark Bessey's) question: A root node is not necessary, because it is never going to be displayed anyway. ParentId = 0 is the convention to express "these are top level". The Order column defines how nodes with the same parent are going to be sorted.

The "result set" I spoke of can be pictured as an array of hashmaps (to stay in that terminology). For my example it was meant to be already there. Some answers go the extra mile and construct it first, but that's okay.
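To make the setup concrete, here is a minimal sketch of one way to do it with nothing but arrays and hashmaps: group the rows by ParentId, sort each group by Order, then walk the groups depth-first from the fictional root 0. The function name `render_tree` and its parameters are made up for illustration, and the sketch assumes the rows really form a tree (no cycles, no orphaned parents).

```python
from collections import defaultdict

# The flat table from the question, pictured as an array of hashmaps.
rows = [
    {"Id": 1, "Name": "Node 1",     "ParentId": 0, "Order": 10},
    {"Id": 2, "Name": "Node 1.1",   "ParentId": 1, "Order": 10},
    {"Id": 3, "Name": "Node 2",     "ParentId": 0, "Order": 20},
    {"Id": 4, "Name": "Node 1.1.1", "ParentId": 2, "Order": 10},
    {"Id": 5, "Name": "Node 2.1",   "ParentId": 3, "Order": 10},
    {"Id": 6, "Name": "Node 1.2",   "ParentId": 1, "Order": 20},
]


def render_tree(rows, root_id=0, indent="  "):
    """Return the tree as indented text lines, siblings sorted by Order."""
    # Group rows by parent so each node's children are one lookup away.
    children = defaultdict(list)
    for row in rows:
        children[row["ParentId"]].append(row)
    for siblings in children.values():
        siblings.sort(key=lambda r: r["Order"])

    lines = []
    # An explicit stack instead of recursion, so an arbitrarily deep tree
    # does not hit the interpreter's recursion limit. Entries: (row, depth).
    stack = [(row, 0) for row in reversed(children[root_id])]
    while stack:
        row, depth = stack.pop()
        lines.append(indent * depth + row["Name"])
        # Push children in reverse so the lowest Order is popped first.
        for child in reversed(children[row["Id"]]):
            stack.append((child, depth + 1))
    return lines


print("\n".join(render_tree(rows)))
```

Each row is touched a constant number of times apart from the per-parent sort, so the cost is dominated by sorting the sibling groups. Emitting HTML instead of text would only change how `depth` is turned into markup.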

The tree can be arbitrarily deep. Each node can have N children. I did not exactly have a "millions of entries" tree in mind, though.

Don't mistake my choice of node naming ('Node 1.1.1') for something to rely on. The nodes could equally well be called 'Frank' or 'Bob', no naming structure is implied, this was merely to make it readable.

I have posted my own solution so you guys can pull it to pieces.

Recommended answer

There are several ways to store tree-structured data in a relational database. What you show in your example uses two methods:

Adjacency List (the "parent" column) and Path Enumeration (the dotted numbers in your name column).

Another solution is called Nested Sets, and it can be stored in the same table too. Read "Trees and Hierarchies in SQL for Smarties" by Joe Celko for a lot more information on these designs.

I usually prefer a design called Closure Table (aka "Adjacency Relation") for storing tree-structured data. It requires another table, but then querying trees is pretty easy.

I cover Closure Table in my presentation "Models for Hierarchical Data with SQL and PHP" and in my book "SQL Antipatterns: Avoiding the Pitfalls of Database Programming".

CREATE TABLE ClosureTable (
  ancestor_id   INT NOT NULL REFERENCES FlatTable(id),
  descendant_id INT NOT NULL REFERENCES FlatTable(id),
  PRIMARY KEY (ancestor_id, descendant_id)
);

Store all paths in the Closure Table, where there is a direct ancestry from one node to another. Include a row for each node to reference itself. For example, using the data set you showed in your question:

INSERT INTO ClosureTable (ancestor_id, descendant_id) VALUES
  (1,1), (1,2), (1,4), (1,6),
  (2,2), (2,4),
  (3,3), (3,5),
  (4,4),
  (5,5),
  (6,6);
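If the closure rows are not maintained by hand, they can be derived from the adjacency list. The following is a small sketch, not part of the original answer: `closure_rows` and `parent_of` are hypothetical names, and the mapping is simply the Id and ParentId columns from the question, with 0 meaning "top level".

```python
# Adjacency list from the question: node id -> ParentId (0 means top level).
parent_of = {1: 0, 2: 1, 3: 0, 4: 2, 5: 3, 6: 1}


def closure_rows(parent_of, root_id=0):
    """Return sorted (ancestor_id, descendant_id) pairs, self-rows included."""
    pairs = set()
    for node in parent_of:
        ancestor = node
        # Walk up from the node itself to its top-level ancestor, emitting
        # one pair per step. The first pair is the self-reference.
        while ancestor != root_id:
            pairs.add((ancestor, node))
            ancestor = parent_of[ancestor]
    return sorted(pairs)


for pair in closure_rows(parent_of):
    print(pair)
```

For the six sample nodes this yields the same eleven pairs as the INSERT statement above. The walk assumes every ParentId other than 0 refers to an existing node and that there are no cycles.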

Now you can get a tree starting at node 1 like this:

SELECT f.* 
FROM FlatTable f 
  JOIN ClosureTable a ON (f.id = a.descendant_id)
WHERE a.ancestor_id = 1;

The output (in MySQL client) looks like the following:

+----+
| id |
+----+
|  1 | 
|  2 | 
|  4 | 
|  6 | 
+----+

In other words, nodes 3 and 5 are excluded, because they're part of a separate hierarchy, not descending from node 1.
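To try the query without a MySQL server, here is a self-contained sketch that uses Python's built-in sqlite3 module instead. The FlatTable definition (just id and name) is an assumption, since the answer never shows that table's schema, and `subtree_ids` is a hypothetical helper name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal schema; the original FlatTable is not shown.
    CREATE TABLE FlatTable (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE ClosureTable (
        ancestor_id   INT NOT NULL REFERENCES FlatTable(id),
        descendant_id INT NOT NULL REFERENCES FlatTable(id),
        PRIMARY KEY (ancestor_id, descendant_id)
    );
    INSERT INTO FlatTable (id, name) VALUES
        (1, 'Node 1'), (2, 'Node 1.1'), (3, 'Node 2'),
        (4, 'Node 1.1.1'), (5, 'Node 2.1'), (6, 'Node 1.2');
    INSERT INTO ClosureTable (ancestor_id, descendant_id) VALUES
        (1,1), (1,2), (1,4), (1,6),
        (2,2), (2,4),
        (3,3), (3,5),
        (4,4),
        (5,5),
        (6,6);
""")


def subtree_ids(conn, ancestor_id):
    """Ids of the given node and all of its descendants, in id order."""
    cursor = conn.execute(
        """
        SELECT f.id
        FROM FlatTable f
          JOIN ClosureTable a ON (f.id = a.descendant_id)
        WHERE a.ancestor_id = ?
        ORDER BY f.id
        """,
        (ancestor_id,),
    )
    return [row[0] for row in cursor]


print(subtree_ids(conn, 1))  # the subtree under Node 1
print(subtree_ids(conn, 3))  # the separate hierarchy under Node 2
```

The ORDER BY is added only to make the result deterministic. The original query leaves the row order up to the database.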

Re: comment from e-satis about immediate children (or immediate parent). You can add a "path_length" column to the ClosureTable to make it easier to query specifically for an immediate child or parent (or any other distance).

INSERT INTO ClosureTable (ancestor_id, descendant_id, path_length) VALUES
  (1,1,0), (1,2,1), (1,4,2), (1,6,1),
  (2,2,0), (2,4,1),
  (3,3,0), (3,5,1),
  (4,4,0),
  (5,5,0),
  (6,6,0);

Then you can add a term to your search to query the immediate children of a given node. These are the descendants whose path_length is 1:

SELECT f.* 
FROM FlatTable f 
  JOIN ClosureTable a ON (f.id = a.descendant_id)
WHERE a.ancestor_id = 1
  AND path_length = 1;

+----+
| id |
+----+
|  2 | 
|  6 | 
+----+
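The same column works in the other direction. As a sketch (again using sqlite3 rather than MySQL, with an assumed two-column FlatTable and a hypothetical helper `ancestor_at_distance`), the immediate parent of a node is the ancestor whose path_length is 1, and other distances follow the same pattern:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Assumed minimal schema; the original FlatTable is not shown.
    CREATE TABLE FlatTable (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE ClosureTable (
        ancestor_id   INT NOT NULL REFERENCES FlatTable(id),
        descendant_id INT NOT NULL REFERENCES FlatTable(id),
        path_length   INT NOT NULL,
        PRIMARY KEY (ancestor_id, descendant_id)
    );
    INSERT INTO FlatTable (id, name) VALUES
        (1, 'Node 1'), (2, 'Node 1.1'), (3, 'Node 2'),
        (4, 'Node 1.1.1'), (5, 'Node 2.1'), (6, 'Node 1.2');
    INSERT INTO ClosureTable (ancestor_id, descendant_id, path_length) VALUES
        (1,1,0), (1,2,1), (1,4,2), (1,6,1),
        (2,2,0), (2,4,1),
        (3,3,0), (3,5,1),
        (4,4,0),
        (5,5,0),
        (6,6,0);
""")


def ancestor_at_distance(conn, node_id, distance=1):
    """Name of the ancestor `distance` steps above node_id, or None."""
    row = conn.execute(
        """
        SELECT f.name
        FROM FlatTable f
          JOIN ClosureTable a ON (f.id = a.ancestor_id)
        WHERE a.descendant_id = ?
          AND a.path_length = ?
        """,
        (node_id, distance),
    ).fetchone()
    return row[0] if row else None


print(ancestor_at_distance(conn, 4))     # immediate parent of Node 1.1.1
print(ancestor_at_distance(conn, 4, 2))  # grandparent of Node 1.1.1
print(ancestor_at_distance(conn, 1))     # top-level node: no parent row
```

Because a node has at most one ancestor at any given distance, `fetchone()` is enough here.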

Re comment from @ashraf: "How about sorting the whole tree [by name]?"

Here's an example query to return all nodes that are descendants of node 1, join them to the FlatTable that contains other node attributes such as name, and sort by the name.

SELECT f.name
FROM FlatTable f 
JOIN ClosureTable a ON (f.id = a.descendant_id)
WHERE a.ancestor_id = 1
ORDER BY f.name;

Re comment from @Nate:

SELECT f.name, GROUP_CONCAT(b.ancestor_id ORDER BY b.path_length DESC) AS breadcrumbs
FROM FlatTable f 
JOIN ClosureTable a ON (f.id = a.descendant_id) 
JOIN ClosureTable b ON (b.descendant_id = a.descendant_id) 
WHERE a.ancestor_id = 1 
GROUP BY a.descendant_id 
ORDER BY f.name;

+------------+-------------+
| name       | breadcrumbs |
+------------+-------------+
| Node 1     | 1           |
| Node 1.1   | 1,2         |
| Node 1.1.1 | 1,2,4       |
| Node 1.2   | 1,6         |
+------------+-------------+

A user suggested an edit today. SO moderators approved the edit, but I am reversing it.

The edit suggested that the ORDER BY in the last query above should be ORDER BY b.path_length, f.name, presumably to make sure the ordering matches the hierarchy. But this doesn't work, because it would order "Node 1.1.1" after "Node 1.2".

If you want the ordering to match the hierarchy in a sensible way, that is possible, but not simply by ordering by the path length. For example, see my answer to MySQL Closure Table hierarchical database - How to pull information out in the correct order.
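One way to picture a hierarchy-respecting order, sketched here in application code rather than SQL, is to sort each node by its whole breadcrumb path instead of by its own attributes. This is only an illustration and not necessarily the approach taken in the linked answer: `hierarchy_order` is a hypothetical name, and `order_of` is the Order column from the question, used as the sibling sort key.

```python
# (ancestor_id, descendant_id, path_length) rows from the answer.
closure = [
    (1, 1, 0), (1, 2, 1), (1, 4, 2), (1, 6, 1),
    (2, 2, 0), (2, 4, 1),
    (3, 3, 0), (3, 5, 1),
    (4, 4, 0),
    (5, 5, 0),
    (6, 6, 0),
]
# Sibling sort position per node id, taken from the question's Order column.
order_of = {1: 10, 2: 10, 3: 20, 4: 10, 5: 10, 6: 20}


def hierarchy_order(closure, order_of, root_id):
    """Ids in the subtree of root_id, parents first, siblings by Order."""
    in_subtree = {d for a, d, _ in closure if a == root_id}
    # Collect each node's ancestors together with their distance.
    breadcrumbs = {node: [] for node in in_subtree}
    for ancestor, descendant, length in closure:
        if descendant in in_subtree:
            breadcrumbs[descendant].append((length, ancestor))

    def sort_key(node):
        # Farthest ancestor first, so the key reads from the top down.
        path = sorted(breadcrumbs[node], reverse=True)
        return [(order_of[a], a) for _, a in path]

    return sorted(in_subtree, key=sort_key)


print(hierarchy_order(closure, order_of, 1))
```

A parent's key is a prefix of every descendant's key, so the parent sorts first and each subtree stays contiguous. That puts node 4 ("Node 1.1.1") before node 6 ("Node 1.2"), which ordering by path_length alone cannot do.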