
Decision Tree Visualization

Visualizing decision trees for machine learning in Python

Prerequisites

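  • Install Graphviz itself

    1. Download the installer (the .msi package for Windows) from https://graphviz.org/
    2. Run the installer
    3. Add the bin folder under the Graphviz installation root to the PATH environment variable
  • Install the Python graphviz package: pip install graphviz

  • Install the Python pydotplus package: pip install pydotplus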
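Once the steps above are done, a quick check can confirm that everything is visible from Python. The following is a minimal sketch, not part of the original setup: the helper name check_setup is hypothetical, and it only tests whether the dot executable is on PATH and whether the two packages can be located, not whether they work.

```python
import importlib.util
import shutil


def check_setup(packages=("graphviz", "pydotplus")):
    """Report which prerequisites are visible from this Python environment."""
    # shutil.which returns None when the executable is not on PATH
    status = {"dot": shutil.which("dot") is not None}
    for name in packages:
        # find_spec returns None when a top-level package cannot be located
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_setup().items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If dot is reported as missing on Windows, the usual cause is that the terminal was opened before PATH was changed; opening a new terminal picks up the updated value.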

Building a simple decision tree model

import pandas as pd

# Load the data (the file has no header row)
datas = pd.read_csv("../data/iris/iris.data", header=None, sep=",")

# Separate the feature columns from the target column
x = datas.iloc[:, 0:-1]
y = pd.Categorical(datas[4]).codes

# Split into training and test sets
from sklearn.model_selection import train_test_split

train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.2, random_state=42)

# Build and train the model
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier
from sklearn.decomposition import PCA

decision_tree_model = DecisionTreeClassifier(criterion="gini", random_state=42, max_depth=5, min_samples_split=10)
# Reduce the features to two principal components; fit PCA on the training set only
pca_best = PCA(n_components=2)
train_xx = pca_best.fit_transform(train_x)
test_xx = pca_best.transform(test_x)
decision_tree_model.fit(train_xx, train_y)
# Predict labels for the held-out test set
test_y_pred = decision_tree_model.predict(test_xx)
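The listing above computes test_y_pred but never inspects it. As a rough sketch of how the predictions could be checked, the block below repeats the same pipeline and scores it. It assumes scikit-learn's bundled iris dataset (load_iris) as a stand-in for ../data/iris/iris.data, and the choice of accuracy_score and confusion_matrix is an illustration rather than something the original post uses.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled iris data stands in for ../data/iris/iris.data
x, y = load_iris(return_X_y=True)
train_x, test_x, train_y, test_y = train_test_split(x, y, test_size=0.2, random_state=42)

# Same preprocessing as above: PCA fitted on the training set only
pca = PCA(n_components=2)
train_xx = pca.fit_transform(train_x)
test_xx = pca.transform(test_x)

model = DecisionTreeClassifier(criterion="gini", random_state=42, max_depth=5, min_samples_split=10)
model.fit(train_xx, train_y)
test_y_pred = model.predict(test_xx)

# Fraction of correct predictions, and per-class hits and misses
accuracy = accuracy_score(test_y, test_y_pred)
matrix = confusion_matrix(test_y, test_y_pred)
print(f"test accuracy: {accuracy:.3f}")
print(matrix)
```

Rows of the confusion matrix are true classes and columns are predicted classes, so off-diagonal entries show which classes the tree confuses.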

Building the visualization

# Method 1: export a dot file, then convert it to a PDF with Graphviz's dot command
from sklearn import tree
with open("iris.dot", "w") as f:
    # Write the trained model to the open file in dot format
    tree.export_graphviz(decision_tree_model, out_file=f)
# Then run on the command line: dot -Tpdf iris.dot -o iris.pdf

# Method 2: generate the PDF directly with pydotplus
import pydotplus
from sklearn import tree
# With out_file=None the dot source is returned as a string
dot_data = tree.export_graphviz(
    decision_tree_model,
    out_file=None,
    filled=True,
    rounded=True,
    special_characters=True,
)
graph = pydotplus.graph_from_dot_data(dot_data)
graph.write_pdf("iris.pdf")
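By default the exported nodes carry no feature or class names, which makes the rendered tree harder to read. The sketch below shows one way to label them through the feature_names and class_names parameters of export_graphviz. It is an illustration under assumptions: the bundled iris dataset replaces the CSV file, the tree is fitted on all samples for brevity, and the labels "PC1" and "PC2" are names chosen here for the two PCA components.

```python
from sklearn import tree
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier

# Assumption: the bundled iris data stands in for ../data/iris/iris.data
iris = load_iris()
features = PCA(n_components=2).fit_transform(iris.data)

model = DecisionTreeClassifier(criterion="gini", random_state=42, max_depth=5, min_samples_split=10)
model.fit(features, iris.target)

dot_data = tree.export_graphviz(
    model,
    out_file=None,
    # Hypothetical labels for the two principal components
    feature_names=["PC1", "PC2"],
    # Class labels in the order of the encoded targets
    class_names=[str(name) for name in iris.target_names],
    filled=True,
    rounded=True,
    special_characters=True,
)
# Show the start of the generated dot source
print(dot_data[:300])
```

The resulting string can be passed to pydotplus.graph_from_dot_data exactly as in Method 2 to produce a PDF with labeled nodes.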

Example result
[Image: the rendered decision tree]

posted @ 2025-10-18 16:16  Hui_Li