代码改变世界

2-3-4树

2014-06-24 20:03  youxin  阅读(705)  评论(0编辑  收藏  举报

2-3-4 树计算机科学中是阶为 4 的B树

大体上同B树一样,2-3-4 树是可以用做字典的一种自平衡数据结构。它可以在O(log n)时间内查找、插入和删除,这里的 n 是树中元素的数目。

2-3-4 树在多数编程语言中实现起来相对困难,因为在树上的操作涉及大量的特殊情况。红黑树实现起来更简单一些,所以可以用它来替代。

(http://en.wikipedia.org/wiki/2%E2%80%933%E2%80%934_tree)

 In a binary tree, each node has one data item and can have up to two children. If we allow more data items and children per node, the result is a multiway tree.

  The 2, 3, and 4 in the name 2-3-4 tree refer to how many links to child nodes can potentially be contained in a given node. For non-leaf nodes, three arrangements are possible:
1. A node with one data item always has two children
2. A node with two data items always has three children
3. A node with three data items always has four children

  In a 2-3-4 tree, on the other hand, nodes with a single link are not permitted. A node with one data item must always have two links, unless it's a leaf, in which case it has no links.
(因为小于4就在一个节点中,大于4就会分裂)
(二叉树中,节点最多有两个子节点的链接。它当然可以只有一个链接,指向它的左子节点或右子节点。它的另一个链接可以是null值。然而,在2-3-4树中,不允许只有一个链接。有一个数据项的节点必须总是保持有两个链接,除非它是叶节点,在那种情况下没有链接。
有两个链接的节点称为2-节点,有3个链接的称为3-节点,有四个节点的称为4-节点,但没有成为1-节点的节点)

Insertion
  New data items are always inserted in leaves, which are on the bottom row of the tree. If items were inserted in nodes with children, then the number of children would need to be changed to maintain the structure of the tree, which stipulates that there should be one more child than data items in a node.
(新的数据项总是插在叶节点里,在树的最底层。如果插入到有子节点的节点里,子节点的编号就要发生变化以此来保持树的结构,这保证了节点的子节点比数据项多1.)

  the process begins by searching for the appropriate leaf node. If no full nodes are encountered during the search, insertion is easy. When the appropriate leaf node is reached, the new data item is simply inserted into it.

Insertion may involve moving one or two other items in a node so the keys will be
in the correct order after the new item is inserted. In this example the 23 had to be
shifted right to make room for the 18.

(3)
Node Splits
  Insertion becomes more complicated if a full node is encountered on the path down to the insertion point. When this happens, the node must be split. It's this splitting process that keeps the tree balanced. The kind of 2-3-4 tree we're discussing here is often called a top-down 2-3-4 tree because nodes are split on the way down to the insertion point.
(如果往下寻找要插入位置的路途中,节点已经满了,插入就变得复杂了。发生这种情况时,节点必须分裂(split)。正是这种分裂过程保证了树的平衡。这里讨论的2-3-4树的是一种称为自顶向下的(top-down)2-3-4树,因为是在向下找到插入点的路途中节点发生分裂。)

遇到满节点就分裂,

Let’s name the data items in the node that’s about to be split A, B, and C. Here’s
what happens in a split. (We assume the node being split is not the root; we’ll
examine splitting the root later.)
• A new, empty node is created. It’s a sibling of the node being split, and is
placed to its right.
• Data item C is moved into the new node.

•Data item B is moved into the parent of the node being split.
• Data item A remains where it is.
• The rightmost two children are disconnected from the node being split and
connected to the new node.
An example of a node split is shown in Figure 10.5. Another way of describing a
node split is to say that a 4-node has been transformed into two 2-nodes.

FIGURE 10.5 Splitting a node.
Notice that the effect of the node split is to move data up and to the right. It is this
rearrangement that keeps the tree balanced.
Here the insertion required only one node split, but more than one full node may be
encountered on the path to the insertion point. When this is the case, there will be
multiple splits.

 

Splitting the Root
When a full root is encountered at the beginning of the search for the insertion
point, the resulting split is slightly more complicated:

search for the insertion
point, the resulting split is slightly more complicated:
• A new root is created. It becomes the parent of the node being split.
• A second new node is created. It becomes a sibling of the node being split.
• Data item C is moved into the new sibling.
• Data item B is moved into the new root.
• Data item A remains where it is.
• The two rightmost children of the node being split are disconnected from it
and connected to the new right-hand node.

Figure 10.6 shows the root being split. This process creates a new root that’s at a
higher level than the old one. Thus, the overall height of the tree is increased by
one. Another way to describe splitting the root is to say that a 4-node is split into
three 2-nodes.

FIGURE 10.6 Splitting the root.
Following a node split, the search for the insertion point continues down the tree. In
Figure 10.6, the data item with a key of 41 is inserted into the appropriate leaf.

 

 

Splitting on the Way Down
  Notice that, because all full nodes are split on the way down, a split can't cause an effect that ripples back up through the tree. The parent of any node that's being split is guaranteed not to be full, and can therefore accept data item B without itself needing to be split. Of course, if this parent already had two children when its child was split, it will become full. However, that just means that it will be split when the next search encounters it.
(注意,因为所有满的节点都是在下行路途中分裂的,分裂不可能向回波及到树上面的节点。任何要分裂节点的父节点肯定不是满的,因此该节点不需要分裂就可以插入数据项B。当然,如果父节点的子节点分裂时它已经有两个子节点了,它就变满了。但是,这只是意味着下次查找碰到它时才需要分裂。)

 

http://blog.sina.com.cn/s/blog_441997d20100ehar.html

http://blog.chinaunix.net/uid-23629988-id-3152495.html?page=2

http://blog.csdn.net/v_JULY_v/article/details/6531399