ZhangZhihui's Blog  

Apache Doris is a real-time analytical database whose architecture is built from just two types of processes: the FE (Frontend) and the BE (Backend).

Together, they form a massively parallel processing (MPP) system where the FE acts as the "brain" and the BE acts as the "muscle."


🧠 Frontend (FE)

The FE is the management layer of the cluster. It is responsible for the "control plane" tasks and does not store the actual data rows.

  • User Interface: It handles client connections and is compatible with the MySQL protocol, allowing you to connect via standard MySQL clients.

  • Query Planning: When a SQL query comes in, the FE parses it, analyzes it, and develops a distributed execution plan (telling the BEs which data to grab and how to process it).

  • Metadata Management: It stores all the information about the cluster, including database/table schemas, partition info, and the location of data replicas.

  • Node Management: It monitors the health of the BE nodes, managing their online/offline status and balancing data replicas across them.

  • High Availability: FEs operate in roles like Master (handles metadata writes), Follower (replicates metadata and can be elected Master), and Observer (replicates metadata without voting in elections, which scales read capacity).
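
To make the planning and node-management roles above more concrete, here is a minimal sketch in Python of how an FE-like component could turn its metadata into per-BE work. This is a toy model, not Doris code: the `TABLET_REPLICAS` map, the `plan_scan` function, the node names, and the "fewest assigned tablets" balancing rule are all assumptions made for illustration.

```python
# Hypothetical FE-side metadata: tablet id -> BE nodes that hold a replica.
TABLET_REPLICAS = {
    10001: ["be1", "be2", "be3"],
    10002: ["be2", "be3", "be1"],
    10003: ["be3", "be1", "be2"],
}


def plan_scan(tablet_replicas, alive_bes):
    """Pick one live replica per tablet and group the tablets by chosen BE."""
    fragments = {}
    for tablet_id, replicas in sorted(tablet_replicas.items()):
        # Node management: ignore replicas on BEs that are currently offline.
        live = [be for be in replicas if be in alive_bes]
        if not live:
            raise RuntimeError(f"no live replica for tablet {tablet_id}")
        # Toy balancing rule (an assumption): prefer the least-loaded BE.
        chosen = min(live, key=lambda be: len(fragments.get(be, [])))
        fragments.setdefault(chosen, []).append(tablet_id)
    return fragments


if __name__ == "__main__":
    print(plan_scan(TABLET_REPLICAS, {"be1", "be2", "be3"}))
    # With be2 offline, its tablets are served from the remaining replicas.
    print(plan_scan(TABLET_REPLICAS, {"be1", "be3"}))
```

The point of the sketch is that the FE never touches data rows: it only needs to know where replicas live and which BEs are healthy in order to decide who scans what.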


💪 Backend (BE)

The BE is the execution and storage layer. It handles the heavy lifting of the data processing.

  • Data Storage: It stores the physical data in a columnar format. Data is divided into shards (called Tablets) that are spread across BE nodes for parallelism, and each Tablet is kept as multiple replicas on different BE nodes for redundancy.

  • Query Execution: It receives fragments of the execution plan from the FE and executes them locally on its shards, performing tasks like filtering, aggregation, and sorting.

  • Data Ingestion: While the FE coordinates the transactions, the actual data files (from streams or batch loads) are typically written directly to the BE nodes to maximize throughput.

  • Background Maintenance and Self-Healing: BE nodes run background tasks such as Compaction (merging data versions), and they copy or move replicas when the FE decides that one needs to be repaired or migrated.
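
The storage layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `tablet_for_key` and `place_replicas` are hypothetical helpers, `zlib.crc32` stands in for whatever hash function Doris actually uses, and round-robin placement is a simplification of the FE's real replica scheduling.

```python
import zlib


def tablet_for_key(key, num_tablets):
    """Map a distribution-key value to a tablet index (toy hash bucketing)."""
    # crc32 is a stand-in hash: deterministic across runs, unlike hash().
    return zlib.crc32(key.encode("utf-8")) % num_tablets


def place_replicas(num_tablets, bes, replication):
    """Give each tablet `replication` replicas on distinct BEs, round-robin."""
    if replication > len(bes):
        raise ValueError("replication factor exceeds number of BE nodes")
    return {
        t: [bes[(t + r) % len(bes)] for r in range(replication)]
        for t in range(num_tablets)
    }


if __name__ == "__main__":
    placement = place_replicas(4, ["be1", "be2", "be3"], 2)
    for user_id in ["u-1001", "u-1002", "u-1003"]:
        tablet = tablet_for_key(user_id, 4)
        print(user_id, "-> tablet", tablet, "on", placement[tablet])
```

Hashing decides which tablet a row belongs to, and placement decides which BEs hold copies of that tablet. Because every tablet lives on more than one BE, losing a single node does not lose data.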


🔄 How They Work Together

  1. Request: A user sends a SQL query to an FE node via the MySQL protocol.

  2. Plan: The FE looks at its metadata to find where the required data is stored, creates a plan, and sends it to the relevant BE nodes.

  3. Execute: The BE nodes scan the data on their local disks, process it, and send the results back.

  4. Aggregate: The FE (or a coordinator BE) collects the results from all participating nodes and returns the final result to the user.
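
The four steps above follow a scatter-gather pattern, which the following Python sketch imitates for a query shaped like `SELECT city, SUM(amount) ... GROUP BY city`. Everything here is hypothetical: `BE_DATA`, `execute_fragment`, and `run_query` are illustrative names, threads stand in for separate BE processes, and real Doris plans have many more stages than a single partial aggregation.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical local data: each BE holds (city, amount) rows for its tablets.
BE_DATA = {
    "be1": [("Paris", 10), ("Tokyo", 5)],
    "be2": [("Paris", 7), ("Lima", 3)],
    "be3": [("Tokyo", 20)],
}


def execute_fragment(be_name):
    """Step 3: a BE scans its local rows and pre-aggregates SUM per city."""
    partial = Counter()
    for city, amount in BE_DATA[be_name]:
        partial[city] += amount
    return partial


def run_query(be_names):
    """Steps 2 and 4: fan the fragment out to the BEs, then merge results."""
    with ThreadPoolExecutor(max_workers=len(be_names)) as pool:
        partials = list(pool.map(execute_fragment, be_names))
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds the per-city sums
    return dict(total)


if __name__ == "__main__":
    print(run_query(["be1", "be2", "be3"]))
```

Each BE reduces its own rows before anything crosses the network, so the coordinator only merges small partial results rather than raw data. That is the main reason the MPP design scales with the number of BE nodes.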

 

posted on 2025-12-24 10:58  ZhangZhihuiAAA