Building a Decision Tree Recursively

1. What is recursion?

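A function can call other functions from inside its body. If a function calls itself, it is called a recursive function.
- PS: calling another function from inside a function is not function nesting; defining a sub-function inside a function is.
Properties of recursion:
- A recursive function must have a clear termination condition.
- Each time the recursion goes one level deeper, the problem size should be smaller than in the previous call.
- Successive calls are closely linked: each call prepares for the next (typically, the output of one call becomes the input of the next).
- Recursion is not efficient, and too many levels of recursion cause a stack overflow. Function calls are implemented with a stack: each call pushes a stack frame, and each return pops one. Because the stack size is limited, too many nested recursive calls overflow it.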
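The last point can be observed directly in Python. The following is a minimal sketch, assuming CPython, where the interpreter guards the call stack with a recursion limit and raises RecursionError instead of crashing. The helper count_down is hypothetical and exists only for this illustration.

```python
import sys


def count_down(n):
    """Recurse n levels deep and return how many levels were entered."""
    if n == 0:  # explicit termination condition
        return 0
    return 1 + count_down(n - 1)  # the problem shrinks by one each call


limit = sys.getrecursionlimit()  # commonly 1000 by default in CPython
print("recursion limit:", limit)
print("shallow call:", count_down(50))  # well within the limit

try:
    count_down(limit + 10)  # deeper than the interpreter allows
    overflowed = False
except RecursionError:
    overflowed = True
print("overflowed:", overflowed)
```

The limit can be raised with sys.setrecursionlimit, but that only moves the ceiling. For very deep inputs a loop is usually the safer choice.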

Consider an example that implements summation in two ways:

# Summation with a loop
def sum1(n):
    '''
    Return the sum of the integers from 1 to n.
    '''
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

# Summation with recursion
def sum2(n):
    '''
    Return the sum of the integers from 1 to n.
    '''
    if n > 0:
        return n + sum2(n - 1)    # the function calls itself
    else:
        return 0

print("Loop sum -->", sum1(100))
print("Recursive sum -->", sum2(100))

# Both print 5050

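Both versions produce the same sum. So what are the advantages and disadvantages of the second compared with the first?

2. What are the pros and cons of recursive functions?

Advantages of recursive functions
- The definition is simple and the logic is clear. In theory, every recursion can be rewritten as a loop, but the loop's logic is often less clear than the recursion's.
Disadvantages of recursion
- Too many nested recursive calls cause a stack overflow.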
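To make the trade-off concrete, here is a small sketch that sums a nested list in both styles. It is an illustration only: the function names and the sample data are assumptions made for this example. The recursive version mirrors the shape of the data, while the loop version has to manage its own stack explicitly.

```python
def nested_sum_recursive(items):
    """Sum all numbers in an arbitrarily nested list using recursion."""
    total = 0
    for item in items:
        if isinstance(item, list):
            total += nested_sum_recursive(item)  # recurse into the sub-list
        else:
            total += item
    return total


def nested_sum_iterative(items):
    """Same result, with an explicit stack instead of the call stack."""
    total = 0
    stack = [items]  # lists that still need to be scanned
    while stack:
        current = stack.pop()
        for item in current:
            if isinstance(item, list):
                stack.append(item)  # postpone the sub-list
            else:
                total += item
    return total


data = [1, [2, 3, [4]], [[5], 6]]
print(nested_sum_recursive(data))
print(nested_sum_iterative(data))
```

The explicit stack lives on the heap, so the loop version is not bound by the interpreter's recursion limit, at the cost of slightly more bookkeeping.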

3. Building the decision tree with a recursive function

Implement the function build_tree(rows). This is the function that actually builds our tree. Follow the steps below:
- We will use a recursive function here.
- Find the best split using the method we implemented before, and store the information gain and the question in local variables.
- Define the ending condition. If there is no gain, i.e. gain == 0, return a leaf node Leaf(rows).
- Otherwise, partition the rows at the current node with the best question (the Determine object that we got before).
- We use DFS (Depth First Search) to build the tree, so build the true_branch recursively first.
- Then build the false_branch recursively.
- Finally, return a DecisionNode object, since the starting point is also a DecisionNode.
- Notes:
  - This function might take you some time and thinking. Be patient.
  - You need to understand the logic behind our DT before you start. Talk to me if you are not feeling confident enough.
  - Look up recursive functions and depth first search if necessary.

The code is as follows:

def build_tree(rows):
    """
    Build our decision tree recursively.
    :param rows: a subset of our data set
    :return: recursively return a decision node and finally a tree
    """
    # Find the best split for this subset of the data.
    # build_tree_best_question is itself an object and can be used directly.
    build_tree_best_gain, build_tree_best_question = find_best_split(rows)
    # Ending condition: when info_gain == 0, return Leaf(rows)
    if build_tree_best_gain == 0:
        return Leaf(rows)
    # Partition the rows with the best question
    true_rows, false_rows = partition(rows, build_tree_best_question)
    # Depth first: build the true branch, then the false branch
    true_branch = build_tree(true_rows)
    false_branch = build_tree(false_rows)
    # Otherwise return a DecisionNode
    return DecisionNode(build_tree_best_question, true_branch, false_branch)
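build_tree depends on helpers that were implemented earlier and are not shown in this post. To see the recursion run end to end, the sketch below fills them in with simplified stand-ins. Everything except the shape of build_tree is an assumption made for illustration: the Question class, the use of Gini impurity as the gain measure, the classify helper, and the toy fruit data are hypothetical and may differ from the real implementation.

```python
from collections import Counter


class Question:
    """Hypothetical test: numeric values use >=, other values use ==."""
    def __init__(self, column, value):
        self.column, self.value = column, value

    def match(self, row):
        v = row[self.column]
        if isinstance(v, (int, float)):
            return v >= self.value
        return v == self.value


class Leaf:
    """Stores the label counts of the rows that reach this leaf."""
    def __init__(self, rows):
        self.predictions = Counter(row[-1] for row in rows)


class DecisionNode:
    def __init__(self, question, true_branch, false_branch):
        self.question = question
        self.true_branch = true_branch
        self.false_branch = false_branch


def partition(rows, question):
    true_rows = [r for r in rows if question.match(r)]
    false_rows = [r for r in rows if not question.match(r)]
    return true_rows, false_rows


def gini(rows):
    """Gini impurity of the labels (last column) in rows."""
    counts = Counter(r[-1] for r in rows)
    return 1 - sum((c / len(rows)) ** 2 for c in counts.values())


def find_best_split(rows):
    """Return (gain, question); gain stays 0 if no split separates the rows."""
    best_gain, best_question = 0, None
    current = gini(rows)
    for col in range(len(rows[0]) - 1):
        for val in {r[col] for r in rows}:
            question = Question(col, val)
            t, f = partition(rows, question)
            if not t or not f:  # split does not divide the data
                continue
            p = len(t) / len(rows)
            gain = current - p * gini(t) - (1 - p) * gini(f)
            if gain > best_gain:
                best_gain, best_question = gain, question
    return best_gain, best_question


def build_tree(rows):
    gain, question = find_best_split(rows)
    if gain == 0:  # termination condition
        return Leaf(rows)
    true_rows, false_rows = partition(rows, question)
    return DecisionNode(question, build_tree(true_rows), build_tree(false_rows))


def classify(row, node):
    """Walk the tree recursively until a leaf is reached."""
    if isinstance(node, Leaf):
        return node.predictions
    branch = node.true_branch if node.question.match(row) else node.false_branch
    return classify(row, branch)


training_data = [
    ["Green", 3, "Apple"],
    ["Yellow", 3, "Apple"],
    ["Red", 1, "Grape"],
    ["Red", 1, "Grape"],
    ["Yellow", 3, "Lemon"],
]
tree = build_tree(training_data)
print(classify(["Red", 1], tree))
print(classify(["Yellow", 3], tree))
```

Because a split is only accepted when both sides are non-empty, every recursive call receives strictly fewer rows than its parent, which is what guarantees that the recursion terminates.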

JAN 1.9

Posted on 2019-01-09 17:37 by 海纳百川_有容乃大
