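Suricata Source Code Analysis: Protocol Parsing Module

Note: this analysis assumes familiarity with the flow engine. Please read the previous article, "Flow Management Engine (flow engine)", first.

Configuration initialization

The protocol parsing module's configuration is initialized at the same call level as the flow engine's: SuricataMain -> PostConfLoadedSetup (code that must run after the configuration has been loaded) -> PreRunInit (initialization shared by the main mode and the unix-socket mode) -> StreamTcpInitConfig (initializes the global stream configuration).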

StreamTcpInitConfig

This function first fills in the global configuration structure stream_config, whose type is TcpStreamCnf:

typedef struct TcpStreamCnf_ {
    /** stream tracking
     *
     * max stream mem usage
     */
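    // Maximum memory available to stream tracking; default 32MB
    SC_ATOMIC_DECLARE(uint64_t, memcap);
    // Maximum memory available to stream reassembly; default 64MB
    SC_ATOMIC_DECLARE(uint64_t, reassembly_memcap);

    // Flags that new streams are initialized with
    uint16_t stream_init_flags;

    /* coccinelle: TcpStreamCnf:flags:STREAMTCP_INIT_ */
    uint8_t flags;
    // Maximum number of SYN/ACKs that can be queued; default 5
    uint8_t max_synack_queued;

    // Number of sessions preallocated per stream thread
    uint32_t prealloc_sessions;
    // Number of segments preallocated per stream thread
    uint32_t prealloc_segments;
    // Whether picking up midstream sessions is allowed; off by default
    int midstream;
    // Whether async (one-sided) stream handling is enabled; off by default
    int async_oneside;
    // Reassembly depth; default 1MB
    uint32_t reassembly_depth;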

    uint16_t reassembly_toserver_chunk_size;
    uint16_t reassembly_toclient_chunk_size;

    bool streaming_log_api;

    StreamingBufferConfig sbcnf;
} TcpStreamCnf;

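Beyond filling in the configuration, the function also:

Calls StreamTcpReassembleInit to initialize reassembly.
Calls FlowSetProtoFreeFunc to register the TCP-specific session cleanup function with the flow engine.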
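
To make the role of the memcap fields more concrete, here is a minimal sketch of how an atomically updated memory cap can be enforced. It is not Suricata's code: the names StreamCnfSketch, StreamCnfSketchReserve and StreamCnfSketchRelease are hypothetical, C11 atomics stand in for the SC_ATOMIC macros, and treating a cap of 0 as "unlimited" is an assumption of this sketch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for TcpStreamCnf. */
typedef struct StreamCnfSketch_ {
    _Atomic uint64_t memcap;   /* upper bound in bytes; 0 means unlimited here */
    _Atomic uint64_t memuse;   /* bytes currently accounted for */
} StreamCnfSketch;

/* Set the default named in the text (32MB). A real loader would then
 * override it with the value from the configuration file. */
void StreamCnfSketchInit(StreamCnfSketch *cnf)
{
    atomic_init(&cnf->memcap, 32ULL * 1024 * 1024);
    atomic_init(&cnf->memuse, 0);
}

/* Try to account for `size` more bytes. Returns false, without changing
 * memuse, when the reservation would push usage above the cap. */
bool StreamCnfSketchReserve(StreamCnfSketch *cnf, uint64_t size)
{
    uint64_t cap = atomic_load(&cnf->memcap);
    uint64_t cur = atomic_load(&cnf->memuse);
    do {
        /* cur + size > cap, written so that it cannot overflow */
        if (cap != 0 && (size > cap || cur > cap - size))
            return false;
        /* on CAS failure `cur` is refreshed and the check is repeated */
    } while (!atomic_compare_exchange_weak(&cnf->memuse, &cur, cur + size));
    return true;
}

/* Give back bytes that were reserved earlier. */
void StreamCnfSketchRelease(StreamCnfSketch *cnf, uint64_t size)
{
    atomic_fetch_sub(&cnf->memuse, size);
}
```

A caller would reserve before allocating a session or segment and release when freeing it. The compare-and-swap loop keeps the check and the update consistent when several stream threads allocate at once.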

Module initialization

The stream module is initialized in the flow worker thread's init function: FlowWorkerThreadInit -> StreamTcpThreadInit. The steps are:

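Create a StreamTcpThread structure, stt, to serve as the module's context in this thread.
Set stt->ssn_pool_id to -1.
Register performance counters, such as the number of TCP sessions, packets with an invalid checksum, SYN packets, SYN/ACK packets, and RST packets.
Call StreamTcpReassembleInitThreadCtx to initialize the reassembly context and store it in stt->ra_ctx.
If ssn_pool is NULL, call PoolThreadInit to create a new PoolThread for it. The preallocation size is stream_config.prealloc_sessions and the pooled objects are of type TcpSession.
If it is not NULL, call PoolThreadExpand to grow the existing ssn_pool by one entry for this thread.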

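Note: Pool is Suricata's generic object pool. It avoids the cost of repeated malloc and free calls and significantly reduces heap fragmentation. PoolThread is a thin wrapper over Pool that lets several threads of the same kind share one array of Pools, with each thread owning one entry.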
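
The "init if NULL, otherwise expand" pattern and the per-thread pool array can be sketched as follows. This is a simplified illustration, not Suricata's Pool API: every name ending in Sketch is hypothetical, and the pool is reduced to a stack of preallocated fixed-size objects with a heap fallback when it runs empty.

```c
#include <stddef.h>
#include <stdlib.h>

/* One pool: a stack of preallocated, fixed-size objects. */
typedef struct PoolSketch_ {
    void **free_objs;   /* objects currently available */
    size_t free_cnt;
    size_t capacity;    /* number of slots in free_objs */
    size_t obj_size;
} PoolSketch;

/* Array of pools; each thread owns exactly one slot. */
typedef struct PoolThreadSketch_ {
    PoolSketch *pools;
    size_t size;
    size_t prealloc;
    size_t obj_size;
} PoolThreadSketch;

/* Preallocate `prealloc` objects so the hot path rarely touches the heap. */
int PoolSketchInit(PoolSketch *p, size_t prealloc, size_t obj_size)
{
    p->capacity = prealloc > 0 ? prealloc : 1;
    p->free_cnt = 0;
    p->obj_size = obj_size;
    p->free_objs = calloc(p->capacity, sizeof(void *));
    if (p->free_objs == NULL)
        return -1;
    while (p->free_cnt < prealloc) {
        void *obj = calloc(1, obj_size);
        if (obj == NULL) {              /* undo the partial preallocation */
            while (p->free_cnt > 0)
                free(p->free_objs[--p->free_cnt]);
            free(p->free_objs);
            return -1;
        }
        p->free_objs[p->free_cnt++] = obj;
    }
    return 0;
}

/* Add one pool for a new thread; returns its index, or -1 on failure. */
int PoolThreadSketchExpand(PoolThreadSketch *pt)
{
    PoolSketch *grown = realloc(pt->pools, (pt->size + 1) * sizeof(*grown));
    if (grown == NULL)
        return -1;
    pt->pools = grown;
    if (PoolSketchInit(&pt->pools[pt->size], pt->prealloc, pt->obj_size) != 0)
        return -1;
    return (int)pt->size++;
}

/* Create the shared structure; the first thread gets pool id 0. */
PoolThreadSketch *PoolThreadSketchInit(size_t prealloc, size_t obj_size)
{
    PoolThreadSketch *pt = calloc(1, sizeof(*pt));
    if (pt == NULL)
        return NULL;
    pt->prealloc = prealloc;
    pt->obj_size = obj_size;
    if (PoolThreadSketchExpand(pt) != 0) {
        free(pt->pools);
        free(pt);
        return NULL;
    }
    return pt;
}

/* Take an object from the thread's own pool, or from the heap if it is empty. */
void *PoolThreadSketchGet(PoolThreadSketch *pt, int id)
{
    PoolSketch *p = &pt->pools[id];
    if (p->free_cnt > 0)
        return p->free_objs[--p->free_cnt];
    return calloc(1, p->obj_size);
}

/* Hand an object back; it is freed only when the pool is already full. */
void PoolThreadSketchReturn(PoolThreadSketch *pt, int id, void *obj)
{
    PoolSketch *p = &pt->pools[id];
    if (p->free_cnt < p->capacity)
        p->free_objs[p->free_cnt++] = obj;
    else
        free(obj);
}
```

In this sketch a thread would store the index returned by PoolThreadSketchInit (implicitly 0) or PoolThreadSketchExpand, much as ssn_pool_id is stored in stt, and use it for every later Get and Return. Because each thread only touches its own slot, the per-object operations need no locking; only growing the array has to be serialized by the caller.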

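Module entry point and core code

Protocol parsing is also entered from the FlowWorker function, with the call stack FlowWorker -> FlowWorkerStreamTCPUpdate -> StreamTcp. It starts once flow lookup and flow update have finished, and is followed by rule detection and output. An excerpt: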

    // Handle TCP and the app-layer protocol
    if (p->flow && PKT_IS_TCP(p)) {
        SCLogDebug("packet %"PRIu64" is TCP. Direction %s", p->pcap_cnt, PKT_IS_TOSERVER(p) ? "TOSERVER" : "TOCLIENT");
        DEBUG_ASSERT_FLOW_LOCKED(p->flow);

        // If detection is disabled, the file flags must be set on the flow at the first packet
        if (detect_thread == NULL &&
                ((PKT_IS_TOSERVER(p) && (p->flowflags & FLOW_PKT_TOSERVER_FIRST)) ||
                 (PKT_IS_TOCLIENT(p) && (p->flowflags & FLOW_PKT_TOCLIENT_FIRST))))
        {
            DisableDetectFlowFileFlags(p->flow);
        }

        FlowWorkerStreamTCPUpdate(tv, fw, p, detect_thread);

FlowWorkerStreamTCPUpdate

The code, with comments:

static inline void FlowWorkerStreamTCPUpdate(ThreadVars *tv, FlowWorkerThreadData *fw,
        Packet *p, void *detect_thread)
{
    FLOWWORKER_PROFILING_START(p, PROFILE_FLOWWORKER_STREAM);
    StreamTcp(tv, p, fw->stream_thread, &fw->pq);
    FLOWWORKER_PROFILING_END(p, PROFILE_FLOWWORKER_STREAM);

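    // Check whether the flow has been flagged for a protocol change
    if (FlowChangeProto(p->flow)) {
        // Create packets in both directions so logging and detection are flushed before the switch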
        StreamTcpDetectLogFlush(tv, fw->stream_thread, p->flow, p, &fw->pq);
        AppLayerParserStateSetFlag(p->flow->alparser, APP_LAYER_PARSER_EOF_TS);
        AppLayerParserStateSetFlag(p->flow->alparser, APP_LAYER_PARSER_EOF_TC);
    }

    // Packets here can safely access p->flow because it is locked
    SCLogDebug("packet %"PRIu64": extra packets %u", p->pcap_cnt, fw->pq.len);
    Packet *x;
    // Drain the extra packets, using the dequeue variant that takes no lock
    while ((x = PacketDequeueNoLock(&fw->pq))) {
        SCLogDebug("packet %"PRIu64" extra packet %p", p->pcap_cnt, x);

        // Rule detection
        if (detect_thread != NULL) {
            FLOWWORKER_PROFILING_START(x, PROFILE_FLOWWORKER_DETECT);
            Detect(tv, x, detect_thread);
            FLOWWORKER_PROFILING_END(x, PROFILE_FLOWWORKER_DETECT);
        }

        // Output module
        OutputLoggerLog(tv, x, fw->output_thread);

        // Put these packets in the preq queue so the other thread modules see them before packet 'p'.
        PacketEnqueueNoLock(&tv->decode_pq, x);
    }
}
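
The drain loop at the end of FlowWorkerStreamTCPUpdate can be reduced to the sketch below. It is an illustration under simplifying assumptions: PacketSketch and PacketQueueSketch are hypothetical stand-ins for Packet and the packet queue, and a single optional callback replaces the Detect and OutputLoggerLog calls.

```c
#include <stddef.h>

/* Hypothetical stand-in for Packet: an id plus the queue link. */
typedef struct PacketSketch_ {
    int id;
    struct PacketSketch_ *next;
} PacketSketch;

/* Singly linked FIFO owned by one thread, so no locking is needed. */
typedef struct PacketQueueSketch_ {
    PacketSketch *top;   /* oldest packet, dequeued first */
    PacketSketch *bot;   /* newest packet */
    unsigned len;
} PacketQueueSketch;

void EnqueueNoLockSketch(PacketQueueSketch *q, PacketSketch *p)
{
    p->next = NULL;
    if (q->bot != NULL)
        q->bot->next = p;
    else
        q->top = p;
    q->bot = p;
    q->len++;
}

PacketSketch *DequeueNoLockSketch(PacketQueueSketch *q)
{
    PacketSketch *p = q->top;
    if (p == NULL)
        return NULL;
    q->top = p->next;
    if (q->top == NULL)
        q->bot = NULL;
    p->next = NULL;
    q->len--;
    return p;
}

/* Move every extra packet from `extra` to `out` in arrival order, running
 * `inspect` (detection and logging in the real code) on each one first.
 * Returns the number of packets moved. */
unsigned DrainExtraSketch(PacketQueueSketch *extra, PacketQueueSketch *out,
        void (*inspect)(PacketSketch *))
{
    unsigned n = 0;
    PacketSketch *x;
    while ((x = DequeueNoLockSketch(extra)) != NULL) {
        if (inspect != NULL)
            inspect(x);
        EnqueueNoLockSketch(out, x);
        n++;
    }
    return n;
}
```

Keeping FIFO order matters here: the extra packets are queued ahead of packet 'p', so later modules observe them in the order the stream engine generated them.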

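StreamTcp -> StreamTcpPacket

StreamTcp performs stream reassembly, protocol detection, and message parsing; most of the actual work is done by StreamTcpPacket.

First, reassembly is performed (TCP reassembly itself is not analyzed in depth here).
Once reassembly is done, AppLayerHandleTCPData is called for app-layer processing:
1. TCPProtoDetect is called to detect the protocol.
2. AppLayerParserParse is called to parse the protocol (for HTTP, for example, items such as anchors and headers).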

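Appendix: two characteristics of stream analysis

Stream analysis does not examine the packet that just arrived; it examines the stream in the opposite direction. For example, when a client-to-server packet arrives, the data analyzed is the buffered server-to-client data: the client's packet shows that the data in the opposite direction has been received, so that data can now be analyzed.
Protocol detection runs first; once the protocol is identified, the parser registered for that protocol is called to do the parsing. The detection call flow is covered in the sections that follow.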
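
The first characteristic can be illustrated with a small model. This is a sketch, not Suricata's reassembly code: SessionSketch and its functions are hypothetical, byte counts relative to the start of each stream replace real sequence numbers, and the only rule modeled is that a packet in one direction releases already buffered data of the opposite direction to the app layer.

```c
#include <stdint.h>

enum { DIR_TOSERVER = 0, DIR_TOCLIENT = 1 };

typedef struct StreamSketch_ {
    uint32_t buffered;      /* bytes received and queued in this direction */
    uint32_t app_progress;  /* bytes already handed to the app layer */
} StreamSketch;

typedef struct SessionSketch_ {
    StreamSketch dir[2];
} SessionSketch;

/* A data packet only adds to the buffer of its own direction. */
void SessionSketchOnData(SessionSketch *ssn, int pkt_dir, uint32_t len)
{
    ssn->dir[pkt_dir].buffered += len;
}

/* A packet travelling in `pkt_dir` acknowledges `acked_bytes` of the
 * opposite stream. Returns how many bytes of that opposite stream become
 * available to the app layer as a result. */
uint32_t SessionSketchOnAck(SessionSketch *ssn, int pkt_dir, uint32_t acked_bytes)
{
    int opp_dir = (pkt_dir == DIR_TOSERVER) ? DIR_TOCLIENT : DIR_TOSERVER;
    StreamSketch *opp = &ssn->dir[opp_dir];

    if (acked_bytes > opp->buffered)        /* cannot release unseen data */
        acked_bytes = opp->buffered;
    if (acked_bytes <= opp->app_progress)   /* nothing new was acknowledged */
        return 0;

    uint32_t released = acked_bytes - opp->app_progress;
    opp->app_progress = acked_bytes;
    return released;
}
```

With 100 server-to-client bytes buffered, a client packet acknowledging 60 of them releases 60 bytes of the to-client stream for parsing, while a packet in the to-client direction releases nothing from it.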

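Protocol parsing core code

Overall flow

Call stack: StreamTcp -> StreamTcpPacket -> StreamTcpStateDispatch (handles the per-state logic of each session) -> StreamTcpPacketStateEstablished (handles the Established state) -> HandleEstablishedPacketToServer / HandleEstablishedPacketToClient (handle Established-state packets in each direction) -> StreamTcpReassembleHandleSegment -> StreamTcpReassembleHandleSegmentUpdateACK (updates the app layer based on the received ACK) -> StreamTcpReassembleAppLayer (updates stream reassembly after a packet is received) -> AppLayerHandleTCPData (handles app-layer TCP data).

Inside StreamTcpReassembleAppLayer, AppLayerHandleTCPData is reached through several intermediate functions.

Note: FlowWorker checks whether the packet is TCP or UDP and calls different functions. TCP traffic (HTTP, for example) follows the chain above to AppLayerHandleTCPData, while UDP traffic (DNS, for example) goes to AppLayerHandleUdp. Both paths end in AppLayerParserParse, which does the protocol parsing.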

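Protocol detection

Parsing requires that the protocol be identified first. Once the flow's protocol is determined, it is recorded in the flow structure for the parsing functions to use. There are three detection methods. Call stack: AppLayerHandleTCPData -> TCPProtoDetect (protocol type detection) -> AppLayerProtoDetectGetProto, which tries in order:

AppLayerProtoDetectPMGetProto: pattern matching (PM) against the registered signature strings
AppLayerProtoDetectPPGetProto: probing parsers (PP), which inspect the buffered payload with probe functions registered for the flow's ports; no packets are sent
AppLayerProtoDetectPEGetProto: protocol expectations (PE), which check whether the flow is on the expectation list

The parsers registered for each app-layer protocol are stored in the global variable alp_ctx. Its type is AppLayerParserCtx, which wraps a two-dimensional array of AppLayerParserProtoCtx: the first index is the flow protocol mapping and the second is the app-layer protocol. The strings and matcher state used by detection itself are kept in a separate detection context.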

typedef struct AppLayerParserCtx_ {
    AppLayerParserProtoCtx ctxs[FLOW_PROTO_MAX][ALPROTO_MAX];
} AppLayerParserCtx;
// Per-protocol app-layer parser context
typedef struct AppLayerParserProtoCtx_
{
    /* 0 - to_server, 1 - to_client. */
    AppLayerParserFPtr Parser[2];   // the protocol's parsing functions
    ...
} AppLayerParserProtoCtx;

For the full structure, see app-layer-parser.c, line 96.

// Protocol detection entry point
AppProto AppLayerProtoDetectGetProto(AppLayerProtoDetectThreadCtx *tctx, Flow *f,
        const uint8_t *buf, uint32_t buflen, uint8_t ipproto, uint8_t flags, bool *reverse_flow)
{
    SCEnter();
    SCLogDebug("buflen %u for %s direction", buflen,
            (flags & STREAM_TOSERVER) ? "toserver" : "toclient");

    // Return value
    AppProto alproto = ALPROTO_UNKNOWN;
    // Protocol found by pattern matching (PM), kept as a fallback
    AppProto pm_alproto = ALPROTO_UNKNOWN;

    if (!FLOW_IS_PM_DONE(f, flags)) {
        AppProto pm_results[ALPROTO_MAX];
        // Detect the protocol by matching signature strings
        uint16_t pm_matches = AppLayerProtoDetectPMGetProto(
                tctx, f, buf, buflen, flags, pm_results, reverse_flow);
        if (pm_matches > 0) {
            // Take the first match as the result
            alproto = pm_results[0];

            // Rerun the probing parser for the other direction if it is still unknown
            uint8_t reverse_dir = (flags & STREAM_TOSERVER) ? STREAM_TOCLIENT : STREAM_TOSERVER;
            if (FLOW_IS_PP_DONE(f, reverse_dir)) {
                AppProto rev_alproto = (flags & STREAM_TOSERVER) ? f->alproto_tc : f->alproto_ts;
                if (rev_alproto == ALPROTO_UNKNOWN) {
                    FLOW_RESET_PP_DONE(f, reverse_dir);
                }
            }

            // HACK: if the detected protocol is dcerpc/udp, also run PP
            // to avoid misdetecting DNS as dcerpc.
            if (!(ipproto == IPPROTO_UDP && alproto == ALPROTO_DCERPC))
                goto end;

            // Remember the PM result
            pm_alproto = alproto;

            /* fall through */
            // i.e. continue with the probing parsers below
        }
    }

    if (!FLOW_IS_PP_DONE(f, flags)) {
        bool rflow = false;
        // Try the probing parsers for this flow
        alproto = AppLayerProtoDetectPPGetProto(f, buf, buflen, ipproto, flags, &rflow);
        // If a valid protocol was detected
        if (AppProtoIsValid(alproto)) {
            if (rflow) {
                *reverse_flow = true;
            }
            goto end;
        }
    }

    // Check whether the flow can be found in the expectation list
    if (!FLOW_IS_PE_DONE(f, flags)) {
        // Probing expectation (PE)
        alproto = AppLayerProtoDetectPEGetProto(f, ipproto, flags);
    }

end:
    // If alproto is not valid, fall back to the PM result and return it
    if (!AppProtoIsValid(alproto))
        alproto = pm_alproto;

    SCReturnUInt(alproto);
}
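
Stripped of its special cases, the function above is a chain of detection stages, each skipped once it has been marked done for the flow. The sketch below shows only that control flow. All names are hypothetical, the three stage functions are toy stand-ins (a "GET " prefix check, a port 53 check using the 12-byte DNS header length, and an empty expectation list), and the dcerpc/UDP fallback through pm_alproto is deliberately left out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef enum { PROTO_UNKNOWN = 0, PROTO_HTTP, PROTO_DNS } ProtoSketch;

enum { STAGE_PM = 0, STAGE_PP = 1, STAGE_PE = 2, STAGE_COUNT = 3 };

typedef struct FlowSketch_ {
    uint16_t dport;
    uint8_t done_mask;   /* bit i set: stage i is finished for this flow */
} FlowSketch;

typedef ProtoSketch (*StageFn)(const FlowSketch *f, const uint8_t *buf, uint32_t len);

/* PM stand-in: one hard-coded pattern at the start of the buffer. */
ProtoSketch StagePatternSketch(const FlowSketch *f, const uint8_t *buf, uint32_t len)
{
    (void)f;
    if (len >= 4 && memcmp(buf, "GET ", 4) == 0)
        return PROTO_HTTP;
    return PROTO_UNKNOWN;
}

/* PP stand-in: keyed on the port, needs at least a full DNS header. */
ProtoSketch StageProbeSketch(const FlowSketch *f, const uint8_t *buf, uint32_t len)
{
    (void)buf;
    return (f->dport == 53 && len >= 12) ? PROTO_DNS : PROTO_UNKNOWN;
}

/* PE stand-in: this sketch has no expectations registered. */
ProtoSketch StageExpectationSketch(const FlowSketch *f, const uint8_t *buf, uint32_t len)
{
    (void)f;
    (void)buf;
    (void)len;
    return PROTO_UNKNOWN;
}

/* Run the stages in order and stop at the first one that finds a protocol. */
ProtoSketch DetectProtoSketch(const FlowSketch *f, const uint8_t *buf, uint32_t len)
{
    static const StageFn stages[STAGE_COUNT] = {
        StagePatternSketch, StageProbeSketch, StageExpectationSketch
    };
    for (int i = 0; i < STAGE_COUNT; i++) {
        if (f->done_mask & (1u << i))
            continue;                       /* mirrors the FLOW_IS_*_DONE checks */
        ProtoSketch proto = stages[i](f, buf, len);
        if (proto != PROTO_UNKNOWN)
            return proto;
    }
    return PROTO_UNKNOWN;
}
```

The cheap, port-independent pattern stage runs first; the port-keyed probing stage and the expectation lookup are only consulted when it finds nothing or has already been marked done.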

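Pattern matching (signature strings)

Detection initialization: registering signature strings

Both the pattern-matching and the probing methods are initialized by AppLayerSetup, which is called from PostConfLoadedSetup (see the configuration initialization section above). It calls: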

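// Initialize the single-pattern and multi-pattern matchers, among other things
AppLayerProtoDetectSetup();
AppLayerParserSetup();

// Register the app-layer protocol parsers
AppLayerParserRegisterProtocolParsers();
// Add the signatures to the matcher state and compile it
AppLayerProtoDetectPrepareState();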

AppLayerSetupCounters();

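AppLayerParserRegisterProtocolParsers deserves a closer look. It calls each protocol's Register*Parsers function, and these in turn register the protocol's signature strings through AppLayerProtoDetectPMRegisterPatternCI (case-insensitive) or AppLayerProtoDetectPMRegisterPatternCS (case-sensitive). Both end up in AppLayerProtoDetectPMAddSignature, which turns the string into a signature and adds it to the pattern-matching context.

For example: RegisterHTPParsers -> HTPRegisterPatternsForProtocolDetection -> AppLayerProtoDetectPMRegisterPatternCI -> AppLayerProtoDetectPMRegisterPattern -> AppLayerProtoDetectPMAddSignature

Detection: AppLayerProtoDetectPMGetProto

The core matching function is PMGetProtoInspect. It calls the registered multi-pattern search function (see note ①) to look for the signature strings and returns the matches.
Based on the search results, AppLayerProtoDetectPMMatchSignature is called to extract the protocol id.

Note ①: the initialization function PostConfLoadedSetup calls MpmTableSetup to register the multi-pattern matcher table (two multi-pattern algorithms are available: Aho-Corasick, the default, and Hyperscan). The single-pattern matcher table is registered by SpmTableSetup (the single-pattern algorithms are Boyer-Moore and Hyperscan).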
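
The register-then-match idea can be shown with a much simpler matcher. The sketch below is not how Suricata searches: it swaps the multi-pattern matcher for a naive scan over each registered signature, and PatternCtxSketch, PatternRegisterSketch and PatternDetectSketch are hypothetical names. What it keeps is the shape of the data: each signature carries a pattern, a depth limit, a case-sensitivity flag, and the protocol id to report.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_SIGS_SKETCH 16

typedef struct PatternSigSketch_ {
    const char *pattern;
    size_t len;
    uint16_t depth;   /* the match must end within the first `depth` bytes */
    int proto;        /* protocol id reported on a match (0 = unknown) */
    int nocase;       /* non-zero: compare case-insensitively */
} PatternSigSketch;

typedef struct PatternCtxSketch_ {
    PatternSigSketch sigs[MAX_SIGS_SKETCH];
    size_t cnt;
} PatternCtxSketch;

/* Register one signature string; returns 0 on success, -1 if rejected. */
int PatternRegisterSketch(PatternCtxSketch *ctx, const char *pattern,
        uint16_t depth, int proto, int nocase)
{
    size_t len = strlen(pattern);
    if (ctx->cnt == MAX_SIGS_SKETCH || len == 0 || len > depth)
        return -1;
    ctx->sigs[ctx->cnt++] = (PatternSigSketch){ pattern, len, depth, proto, nocase };
    return 0;
}

static int BytesEqualSketch(const uint8_t *a, const char *b, size_t n, int nocase)
{
    for (size_t i = 0; i < n; i++) {
        unsigned char x = a[i];
        unsigned char y = (unsigned char)b[i];
        if (nocase) {
            x = (unsigned char)tolower(x);
            y = (unsigned char)tolower(y);
        }
        if (x != y)
            return 0;
    }
    return 1;
}

/* Return the protocol of the first signature found in buf, or 0. */
int PatternDetectSketch(const PatternCtxSketch *ctx, const uint8_t *buf, size_t buflen)
{
    for (size_t s = 0; s < ctx->cnt; s++) {
        const PatternSigSketch *sig = &ctx->sigs[s];
        size_t window = buflen < sig->depth ? buflen : sig->depth;
        if (window < sig->len)
            continue;
        for (size_t off = 0; off + sig->len <= window; off++) {
            if (BytesEqualSketch(buf + off, sig->pattern, sig->len, sig->nocase))
                return sig->proto;
        }
    }
    return 0;
}
```

A real multi-pattern matcher finds all signatures in one pass over the buffer instead of one pass per signature, which is why the patterns have to be compiled into a matcher state after registration.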

Probing parsers

Detection initialization

Taking DNP3 as the example for probing: RegisterDNP3Parsers calls AppLayerProtoDetectPPParseConfPorts to register DNP3ProbingParser as the probing function:

if (!AppLayerProtoDetectPPParseConfPorts("tcp", IPPROTO_TCP,
        proto_name, ALPROTO_DNP3, 0, sizeof(DNP3LinkHeader),
        DNP3ProbingParser, DNP3ProbingParser)) {
    return;
}

Internal call flow: AppLayerProtoDetectPPParseConfPorts -> AppLayerProtoDetectPPRegister -> AppLayerProtoDetectInsertNewProbingParser -> AppLayerProtoDetectProbingParserElementDuplicate

AppLayerParserRegisterParser registers the app-layer parsing functions by assigning the protocol's handler to Parser:

AppLayerParserRegisterParser(IPPROTO_TCP, ALPROTO_DNP3, STREAM_TOSERVER,
    DNP3ParseRequest);
AppLayerParserRegisterParser(IPPROTO_TCP, ALPROTO_DNP3, STREAM_TOCLIENT,
    DNP3ParseResponse);

int AppLayerParserRegisterParser(uint8_t ipproto, AppProto alproto,
                      uint8_t direction,
                      AppLayerParserFPtr Parser)
{
    SCEnter();

    alp_ctx.ctxs[FlowGetProtoMapping(ipproto)][alproto].
        Parser[(direction & STREAM_TOSERVER) ? 0 : 1] = Parser;

    SCReturnInt(0);
}
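
The registration above boils down to writing a function pointer into a table indexed by transport protocol, app-layer protocol, and direction, and reading it back at parse time. A minimal sketch follows; the enum values, the ParserFn signature, and all function names are hypothetical and far simpler than AppLayerParserFPtr and the real flag values.

```c
#include <stddef.h>
#include <stdint.h>

enum { FLOW_PROTO_TCP_SKETCH = 0, FLOW_PROTO_UDP_SKETCH = 1, FLOW_PROTO_MAX_SKETCH = 2 };
enum { ALPROTO_UNKNOWN_SKETCH = 0, ALPROTO_DNP3_SKETCH = 1, ALPROTO_MAX_SKETCH = 2 };
enum { DIR_TOSERVER_SKETCH = 0x01, DIR_TOCLIENT_SKETCH = 0x02 };

typedef int (*ParserFn)(const uint8_t *input, uint32_t input_len);

typedef struct ParserProtoCtxSketch_ {
    ParserFn parser[2];   /* 0 - to_server, 1 - to_client */
} ParserProtoCtxSketch;

/* Global table, zero-initialized: no parser is registered at start. */
static ParserProtoCtxSketch parser_table[FLOW_PROTO_MAX_SKETCH][ALPROTO_MAX_SKETCH];

/* Toy parsers that only report which one was called. */
int ParseRequestSketch(const uint8_t *input, uint32_t input_len)
{
    (void)input;
    (void)input_len;
    return 1;
}

int ParseResponseSketch(const uint8_t *input, uint32_t input_len)
{
    (void)input;
    (void)input_len;
    return 2;
}

/* Store the parser for one (transport, app protocol, direction) triple. */
int RegisterParserSketch(int flow_proto, int alproto, uint8_t direction, ParserFn fn)
{
    if (flow_proto < 0 || flow_proto >= FLOW_PROTO_MAX_SKETCH ||
            alproto < 0 || alproto >= ALPROTO_MAX_SKETCH)
        return -1;
    parser_table[flow_proto][alproto].parser[(direction & DIR_TOSERVER_SKETCH) ? 0 : 1] = fn;
    return 0;
}

/* Look the parser up and call it; returns -1 when none is registered. */
int DispatchParserSketch(int flow_proto, int alproto, uint8_t direction,
        const uint8_t *input, uint32_t input_len)
{
    if (flow_proto < 0 || flow_proto >= FLOW_PROTO_MAX_SKETCH ||
            alproto < 0 || alproto >= ALPROTO_MAX_SKETCH)
        return -1;
    ParserFn fn = parser_table[flow_proto][alproto].parser[(direction & DIR_TOSERVER_SKETCH) ? 0 : 1];
    if (fn == NULL)
        return -1;
    return fn(input, input_len);
}
```

Registration happens once at startup, so the per-packet cost of finding the right parser is two array indexes and one indirect call.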

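Detection: AppLayerProtoDetectPPGetProto

Get the layer 4 protocol.
Get the port information.
Get the probing functions registered for that protocol and port.
Call the probing function to determine the protocol type.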

AppProto alproto = ALPROTO_UNKNOWN;
if (flags & STREAM_TOSERVER && pe->ProbingParserTs != NULL) {
    alproto = pe->ProbingParserTs(f, flags, buf, buflen, rdir);
} else if (flags & STREAM_TOCLIENT && pe->ProbingParserTc != NULL) {
    alproto = pe->ProbingParserTc(f, flags, buf, buflen, rdir);
}
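
The four steps can be sketched as a port-keyed table of probe functions. This is an illustration with hypothetical names, not the probing-parser structures Suricata builds. The one protocol fact it relies on is that a DNP3 link-layer frame starts with the bytes 0x05 0x64; port 20000 in the usage below is the port commonly used for DNP3.

```c
#include <stddef.h>
#include <stdint.h>

enum { PP_FAILED_SKETCH = -1, PP_UNKNOWN_SKETCH = 0, PP_DNP3_SKETCH = 1 };

typedef int (*ProbeFn)(const uint8_t *buf, uint32_t len);

typedef struct ProbeEntrySketch_ {
    uint16_t port;        /* destination port the probe is registered for */
    uint32_t min_depth;   /* do not probe with fewer bytes than this */
    ProbeFn to_server;
    ProbeFn to_client;
} ProbeEntrySketch;

/* Toy DNP3 probe: check the two start bytes of the link-layer frame. */
int Dnp3ProbeSketch(const uint8_t *buf, uint32_t len)
{
    if (len < 2)
        return PP_UNKNOWN_SKETCH;   /* not enough data to decide yet */
    return (buf[0] == 0x05 && buf[1] == 0x64) ? PP_DNP3_SKETCH : PP_FAILED_SKETCH;
}

/* Find the probes registered for `dport`, pick the one for the packet's
 * direction, and return the first positive protocol id, or unknown. */
int ProbeByPortSketch(const ProbeEntrySketch *table, size_t n, uint16_t dport,
        int to_server, const uint8_t *buf, uint32_t len)
{
    for (size_t i = 0; i < n; i++) {
        if (table[i].port != dport)
            continue;
        ProbeFn fn = to_server ? table[i].to_server : table[i].to_client;
        if (fn == NULL || len < table[i].min_depth)
            continue;
        int proto = fn(buf, len);
        if (proto > 0)
            return proto;
    }
    return PP_UNKNOWN_SKETCH;
}
```

Keying on the port keeps probing cheap: only the handful of probes registered for a flow's port ever look at its payload, and each of them decides purely from the bytes already buffered.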

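Protocol parsing

(The details will be covered in the next article, on the SMTP parsing module; only the flow is described here.) All app-layer protocol parsers are also initialized in AppLayerSetup -> AppLayerParserRegisterProtocolParsers. Once the detection described above has determined the protocol, AppLayerParserParse is called to do the actual parsing.

Get the context pointer p:

AppLayerParserProtoCtx *p = &alp_ctx.ctxs[f->protomap][alproto];

Call the parsing function the protocol registered: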

// Invoke the recursive parser, but only on data. We may get empty msgs on EOF
if (input_len > 0 || (flags & STREAM_EOF)) {
    // Call the parser
    AppLayerResult res = p->Parser[direction](f, alstate, pstate,
            input, input_len,
            alp_tctx->alproto_local_storage[f->protomap][alproto],
            flags);
    if (res.status < 0) {
        goto error;
    } else if (res.status > 0) {
        DEBUG_VALIDATE_BUG_ON(res.consumed > input_len);
        DEBUG_VALIDATE_BUG_ON(res.needed + res.consumed < input_len);
        DEBUG_VALIDATE_BUG_ON(res.needed == 0);
        // "incomplete" is only supported for TCP
        DEBUG_VALIDATE_BUG_ON(f->proto != IPPROTO_TCP);

        // Put the protocol in an error state when the return code is misused.
        if (res.consumed > input_len || res.needed + res.consumed < input_len) {
            goto error;
        }

        if (f->proto == IPPROTO_TCP && f->protoctx != NULL) {
            TcpSession *ssn = f->protoctx;
            SCLogDebug("direction %d/%s", direction,
                    (flags & STREAM_TOSERVER) ? "toserver" : "toclient");
            if (direction == 0) {
                /* The parser told us how much data it needs on top of what it
                 * consumed, so we need to tell the stream engine how much we
                 * need before the next call.
                 */
                ssn->client.data_required = res.needed;
                SCLogDebug("setting data_required %u", ssn->client.data_required);

The function being called looks like this:

static int IEC104ParseRequest(Flow *f, void *state, AppLayerParserState *pstate,
    uint8_t *input, uint32_t input_len, void *local_data)
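
The result handling in the AppLayerParserParse excerpt above follows a small contract: a negative status is an error, zero means the input was handled, and a positive status means "incomplete", with consumed and needed describing how far the parser got and how much more it wants. The sketch below restates the runtime checks from that excerpt with hypothetical names (ResultSketch, HandleResultSketch); the debug-only assertions of the original are omitted.

```c
#include <stdint.h>

enum { PARSE_ERROR_SKETCH = -1, PARSE_OK_SKETCH = 0, PARSE_INCOMPLETE_SKETCH = 1 };

/* Hypothetical stand-in for the parser's result. */
typedef struct ResultSketch_ {
    int32_t status;      /* < 0 error, 0 ok, > 0 incomplete */
    uint32_t consumed;   /* bytes of the input the parser used */
    uint32_t needed;     /* bytes it needs on top of what it consumed */
} ResultSketch;

/* Validate a parser result against the input length. For an incomplete
 * result on TCP, store how much data is required before the next call. */
int HandleResultSketch(ResultSketch res, uint32_t input_len, int is_tcp,
        uint32_t *data_required)
{
    if (res.status < 0)
        return PARSE_ERROR_SKETCH;
    if (res.status == 0)
        return PARSE_OK_SKETCH;

    /* Incomplete: the parser cannot have consumed more than it was given,
     * and asking for less than what is already available is a misuse. */
    if (res.consumed > input_len ||
            (uint64_t)res.needed + res.consumed < input_len)
        return PARSE_ERROR_SKETCH;

    if (is_tcp)
        *data_required = res.needed;
    return PARSE_INCOMPLETE_SKETCH;
}
```

For a 10-byte input, a result of "consumed 4, needed 20" is a valid incomplete result, while "consumed 2, needed 3" is rejected because 5 bytes beyond the consumed part were already available.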

Posted 2022-04-22 14:18 by 6c696e