Apache Doris is a real-time analytical database with an architecture simplified into two primary components: the FE (Frontend) and the BE (Backend).
Together, they form a massively parallel processing (MPP) system where the FE acts as the "brain" and the BE acts as the "muscle."
🧠 Frontend (FE)
The FE is the management layer of the cluster. It is responsible for the "control plane" tasks and does not store the actual data rows.
- User Interface: It handles client connections and is compatible with the MySQL protocol, so you can connect with standard MySQL clients.
- Query Planning: When a SQL query comes in, the FE parses it, analyzes it, and builds a distributed execution plan that tells the BEs which data to read and how to process it.
- Metadata Management: It stores all the information about the cluster, including database/table schemas, partition info, and the location of data replicas.
- Node Management: It monitors the health of the BE nodes, tracks their online/offline status, and balances data replicas across them.
- High Availability: FEs run in one of three roles. The Master handles metadata writes, Followers replicate metadata and can be elected Master, and Observers replicate metadata to add read capacity without taking part in elections.
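The planning and node-management points above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Doris code: the `TABLET_REPLICAS` map, the node names `be1` to `be3`, and the `plan_scan` function are all hypothetical, and a real FE planner weighs far more than liveness when choosing replicas.

```python
from collections import defaultdict

# Hypothetical FE-side metadata: tablet id -> BE nodes holding a replica.
TABLET_REPLICAS = {
    "tablet_0": ["be1", "be2", "be3"],
    "tablet_1": ["be2", "be3", "be1"],
    "tablet_2": ["be3", "be1", "be2"],
}


def plan_scan(tablet_replicas, alive_bes):
    """Pick the first live replica of each tablet and group tablets by BE."""
    fragments = defaultdict(list)
    for tablet, replicas in sorted(tablet_replicas.items()):
        live = [be for be in replicas if be in alive_bes]
        if not live:
            raise RuntimeError(f"no live replica for {tablet}")
        fragments[live[0]].append(tablet)
    return dict(fragments)


# Assume be2 has been marked offline by the FE's health checks.
plan = plan_scan(TABLET_REPLICAS, alive_bes={"be1", "be3"})
print(plan)
```

Because the FE only holds the tablet-to-replica map and never the rows themselves, routing around the offline `be2` is a pure metadata decision in this model.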
💪 Backend (BE)
The BE is the execution and storage layer. It handles the heavy lifting of the data processing.
- Data Storage: It stores the physical data in a columnar format. Data is divided into shards (called Tablets) that are spread across BE nodes for parallelism, and each Tablet is kept as multiple replicas on different BEs for redundancy.
- Query Execution: It receives fragments of the execution plan from the FE and executes them locally on its Tablets, performing tasks like filtering, aggregation, and sorting.
- Data Ingestion: While the FE coordinates the transactions, the actual data files (from streams or batch loads) are typically written directly to the BE nodes to maximize throughput.
- Self-Healing: BE nodes run background tasks like Compaction (merging data versions) and move data when the FE tells them a replica needs to be migrated.
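To make the Tablet and replica idea concrete, here is a minimal sketch of how rows might be hashed into Tablets and how replicas might be spread over BE nodes. It assumes a CRC32-modulo bucketing rule and a simple round-robin placement; both are illustrative choices, and `bucket_for` and `place_replicas` are hypothetical names rather than Doris functions.

```python
import zlib


def bucket_for(key: str, num_buckets: int) -> int:
    """Map a distribution-key value to a tablet index (assumed CRC32 modulo rule)."""
    return zlib.crc32(key.encode("utf-8")) % num_buckets


def place_replicas(num_tablets, be_nodes, replication_num):
    """Assign each tablet to `replication_num` distinct BEs, round-robin."""
    if replication_num > len(be_nodes):
        raise ValueError("not enough BE nodes for the requested replica count")
    placement = {}
    for tablet in range(num_tablets):
        placement[tablet] = [
            be_nodes[(tablet + i) % len(be_nodes)] for i in range(replication_num)
        ]
    return placement


placement = place_replicas(num_tablets=4, be_nodes=["be1", "be2", "be3"], replication_num=2)
for tablet, replicas in placement.items():
    print(tablet, replicas)

# The same key always lands in the same tablet.
print(bucket_for("user_42", 4) == bucket_for("user_42", 4))
```

Sharding gives parallel scans across BEs, while the replica count decides how many node failures a Tablet can survive; the two settings are independent in this sketch.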
🔄 How They Work Together
- Request: A user sends a SQL query to an FE node via the MySQL protocol.
- Plan: The FE consults its metadata to find where the required data is stored, creates a distributed plan, and sends plan fragments to the relevant BE nodes.
- Execute: The BE nodes scan the data on their local disks, process it, and pass their partial results on.
- Aggregate: The partial results from all participating nodes are merged, typically in a final plan fragment on one BE, and the FE returns the final result to the user.
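The four steps above follow a scatter-gather pattern, which the toy simulation below tries to capture. It is a sketch under assumptions, not how Doris is implemented: `BE_DATA`, `execute_fragment`, and `run_query` are hypothetical, and the "query" is hard-coded as a filter followed by a grouped sum.

```python
from collections import Counter

# Hypothetical rows (country, amount) stored locally on each BE.
BE_DATA = {
    "be1": [("cn", 10), ("us", 5)],
    "be2": [("cn", 7), ("de", 3)],
    "be3": [("us", 1), ("de", 4), ("cn", 2)],
}


def execute_fragment(rows, min_amount):
    """BE side: filter local rows, then pre-aggregate the sum per country."""
    partial = Counter()
    for country, amount in rows:
        if amount >= min_amount:
            partial[country] += amount
    return partial


def run_query(be_data, min_amount):
    """Coordinator side: send the fragment to every BE and merge the partial sums."""
    total = Counter()
    for rows in be_data.values():
        total.update(execute_fragment(rows, min_amount))
    return sorted(total.items())


# Roughly: SELECT country, SUM(amount) ... WHERE amount >= 2 GROUP BY country
result = run_query(BE_DATA, min_amount=2)
print(result)  # [('cn', 19), ('de', 7), ('us', 5)]
```

Each BE reduces its own rows before anything crosses the network, so the merge step only handles one small partial result per node instead of the raw data.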
