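Prometheus Storage
Local storage
Prometheus's local time series database stores data in a custom, highly efficient format on local storage.
By default, Prometheus stores the data it scrapes in a local TSDB database. The default path is data/, which is resolved relative to the directory Prometheus is started from (typically the installation directory).
On-disk layout
Ingested samples are grouped into blocks of two hours. Each two-hour block is a directory containing a chunks subdirectory with all the time series samples for that time window, a metadata file, and an index file (which indexes metric names and labels to the time series in the chunks directory). The samples in the chunks directory are grouped into one or more segment files of up to 512MB each by default. When series are deleted via the API, deletion records are stored in separate tombstone files (instead of deleting the data immediately from the chunk segments).
The current block for incoming samples is kept in memory and is not fully persisted. It is secured against crashes by a write-ahead log (WAL) that can be replayed when the Prometheus server restarts. Write-ahead log files are stored in the wal directory in 128MB segments. These files contain raw data that has not yet been compacted, so they are significantly larger than regular block files. Prometheus retains a minimum of three write-ahead log files. High-traffic servers may retain more than three WAL files in order to keep at least two hours of raw data.
A Prometheus server's data directory looks something like this: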
./data
├── 01BKGV7JBM69T2G1BGBGM6KB12    # block
│   └── meta.json                 # metadata
├── 01BKGTZQ1SYQJTR4PB43C8PD98    # block
│   ├── chunks                    # sample data
│   │   └── 000001                # segment file; each is at most 512MB, larger data is split across several files
│   ├── tombstones                # deletion records
│   ├── index                     # index file
│   └── meta.json                 # metadata
├── 01BKGTZQ1HHWHV8FBJXW1Y3W0K    # block
│   └── meta.json                 # metadata
├── 01BKGV7JC0RY8A6MACW02A2PJD    # block
│   ├── chunks                    # sample data
│   │   └── 000001
│   ├── tombstones                # deletion records
│   ├── index                     # index file
│   └── meta.json                 # metadata
├── chunks_head
│   └── 000001
└── wal
    ├── 000000002
    └── checkpoint.00000001
        └── 00000000
About blocks
Each block is a directory inside the data directory whose name (a ULID) begins with 01.
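To see which time window each block covers, the meta.json files can be read directly. The following is a minimal sketch, assuming each meta.json carries minTime and maxTime as Unix timestamps in milliseconds; the function name list_blocks and the ./data path are illustrative choices, not part of Prometheus.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def list_blocks(data_dir):
    """Return (block_name, start, end) for every subdirectory that has a meta.json."""
    blocks = []
    # Directories such as wal and chunks_head have no meta.json and are skipped.
    for meta_path in sorted(Path(data_dir).glob("*/meta.json")):
        meta = json.loads(meta_path.read_text())
        # Assumption: minTime/maxTime are Unix timestamps in milliseconds.
        start = datetime.fromtimestamp(meta["minTime"] / 1000, tz=timezone.utc)
        end = datetime.fromtimestamp(meta["maxTime"] / 1000, tz=timezone.utc)
        blocks.append((meta_path.parent.name, start, end))
    return blocks


if __name__ == "__main__":
    for name, start, end in list_blocks("./data"):
        print(f"{name}  {start.isoformat()} -> {end.isoformat()}")
```

Running it against a copy of the data directory is safer than running it against a live server's directory, even though the sketch only reads files.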
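Block characteristics
Blocks are compacted in the background: historical data is compacted and merged, and expired blocks are deleted, so the number of blocks shrinks over time. Three things happen during compaction: compaction runs periodically, small blocks are merged into larger ones, and expired blocks are cleaned up.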
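The two effects described above (expired blocks disappear, small blocks become fewer large ones) can be illustrated with a toy model. This is not Prometheus's compaction algorithm; it is a simplified sketch in which blocks are plain (min_time, max_time) tuples in milliseconds, and the function name compact and the parameters retention_ms and target_range_ms are hypothetical.

```python
HOUR_MS = 60 * 60 * 1000


def compact(blocks, now_ms, retention_ms, target_range_ms):
    """Toy model: drop expired blocks, then merge blocks sharing an aligned window."""
    cutoff = now_ms - retention_ms
    # Step 1: clean up blocks that lie entirely before the retention cutoff.
    live = [(lo, hi) for lo, hi in blocks if hi > cutoff]

    # Step 2: merge small blocks whose start falls into the same aligned window.
    merged = {}
    for lo, hi in sorted(live):
        window = lo // target_range_ms
        if window in merged:
            cur_lo, cur_hi = merged[window]
            merged[window] = (min(cur_lo, lo), max(cur_hi, hi))
        else:
            merged[window] = (lo, hi)
    return [merged[w] for w in sorted(merged)]


if __name__ == "__main__":
    two_hour_blocks = [(h * HOUR_MS, (h + 2) * HOUR_MS) for h in (0, 2, 4, 6)]
    result = compact(two_hour_blocks, now_ms=8 * HOUR_MS,
                     retention_ms=5 * HOUR_MS, target_range_ms=6 * HOUR_MS)
    print(result)
```

In this run the oldest two-hour block falls outside the five-hour retention and is dropped, and the remaining three blocks collapse into two.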
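Local storage configuration flags
--storage.tsdb.path: Where Prometheus writes its database. Defaults to data/.
--storage.tsdb.retention.time: When to remove old data. Defaults to 15d. Overrides storage.tsdb.retention if this flag is set to anything other than the default.
--storage.tsdb.retention.size: The maximum number of bytes of storage blocks to retain. The oldest data is removed first. Defaults to 0, meaning disabled. Supported units: B, KB, MB, GB, TB, PB, EB. Example: "512MB". Units are based on powers of 2, so 1KB is 1024B. Although the WAL and m-mapped chunks are counted in the total size, only persistent blocks are deleted to honor this retention. The minimum disk requirement is therefore the peak space taken by the wal (WAL and checkpoint) and chunks_head (m-mapped head chunks) directories combined, which peaks every 2 hours.
--storage.tsdb.retention: Deprecated in favor of storage.tsdb.retention.time.
--storage.tsdb.wal-compression: Enables compression of the write-ahead log (WAL). Depending on your data, you can expect the WAL size to be halved with little extra CPU load. This flag was introduced in 2.11.0 and enabled by default in 2.20.0. Note that once enabled, downgrading Prometheus to a version below 2.11.0 requires deleting the WAL.
--query.timeout=2m: Maximum time a query may run before it is aborted.
--query.max-concurrency=20: Maximum number of queries executed concurrently.
--web.read-timeout=5m: Maximum duration before a request read times out and idle connections are closed.
--web.max-connections=512: Maximum number of simultaneous connections.
--web.enable-lifecycle: Enables shutdown and configuration reload via HTTP requests.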
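The unit rule for --storage.tsdb.retention.size (powers of 2, so 1KB is 1024B) is easy to get wrong when budgeting disk space. Below is a minimal sketch of the conversion; parse_size is a hypothetical helper for illustration, not Prometheus's own parser, and it only accepts a whole number followed directly by one of the listed units.

```python
import re

# Exponent of 1024 for each unit listed for --storage.tsdb.retention.size.
UNITS = {"B": 0, "KB": 1, "MB": 2, "GB": 3, "TB": 4, "PB": 5, "EB": 6}


def parse_size(text):
    """Convert a string such as '512MB' into a number of bytes (base 1024)."""
    match = re.fullmatch(r"(\d+)(B|KB|MB|GB|TB|PB|EB)", text.strip())
    if match is None:
        raise ValueError(f"unsupported size: {text!r}")
    number, unit = match.groups()
    return int(number) * 1024 ** UNITS[unit]


if __name__ == "__main__":
    print(parse_size("1KB"))    # 1024
    print(parse_size("512MB"))  # 536870912
```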
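Remote storage
Prometheus's local storage is limited to a single node's scalability and durability. Instead of trying to solve clustered storage in Prometheus itself, Prometheus offers a set of interfaces that allow integration with remote storage systems.
Overview
Prometheus integrates with remote storage systems in three ways:
- Prometheus can write the samples it ingests to a remote URL in a standardized format.
- Prometheus can receive samples from other Prometheus servers in a standardized format.
- Prometheus can read (back) sample data from a remote URL in a standardized format.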

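Both the read and write protocols use a snappy-compressed protocol buffer encoding over HTTP. The protocols are not yet considered stable APIs and may change to use gRPC over HTTP/2 in the future, once all hops between Prometheus and the remote storage can safely be assumed to support HTTP/2.
The built-in remote write receiver can be enabled by setting the --web.enable-remote-write-receiver command line flag. When enabled, the remote write receiver endpoint is /api/v1/write.
Configuration file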
<remote_write>
# The URL of the endpoint to send samples to.
url: <string>
# Timeout for requests to the remote write endpoint.
[ remote_timeout: <duration> | default = 30s ]
# Custom HTTP headers to be sent along with each remote write request.
# Be aware that headers that are set by Prometheus itself can't be overwritten.
headers:
  [ <string>: <string> ... ]
# List of remote write relabel configurations.
write_relabel_configs:
  [ - <relabel_config> ... ]
# Name of the remote write config, which if specified must be unique among remote write configs.
# The name will be used in metrics and logging in place of a generated value to help users distinguish between
# remote write configs.
[ name: <string> ]
# Enables sending of exemplars over remote write. Note that exemplar storage itself must be enabled for exemplars to be scraped in the first place.
[ send_exemplars: <boolean> | default = false ]
# Sets the `Authorization` header on every remote write request with the
# configured username and password.
# password and password_file are mutually exclusive.
basic_auth:
  [ username: <string> ]
  [ password: <secret> ]
  [ password_file: <string> ]
# Optional `Authorization` header configuration.
authorization:
  # Sets the authentication type.
  [ type: <string> | default: Bearer ]
  # Sets the credentials. It is mutually exclusive with
  # `credentials_file`.
  [ credentials: <secret> ]
  # Sets the credentials to the credentials read from the configured file.
  # It is mutually exclusive with `credentials`.
  [ credentials_file: <filename> ]
# Optionally configures AWS's Signature Verification 4 signing process to
# sign requests. Cannot be set at the same time as basic_auth, authorization, or oauth2.
# To use the default credentials from the AWS SDK, use `sigv4: {}`.
sigv4:
  # The AWS region. If blank, the region from the default credentials chain
  # is used.
  [ region: <string> ]
  # The AWS API keys. If blank, the environment variables `AWS_ACCESS_KEY_ID`
  # and `AWS_SECRET_ACCESS_KEY` are used.
  [ access_key: <string> ]
  [ secret_key: <secret> ]
  # Named AWS profile used to authenticate.
  [ profile: <string> ]
  # AWS Role ARN, an alternative to using AWS API keys.
  [ role_arn: <string> ]
# Optional OAuth 2.0 configuration.
# Cannot be used at the same time as basic_auth, authorization, or sigv4.
oauth2:
  [ <oauth2> ]
# Configures the remote write request's TLS settings.
tls_config:
  [ <tls_config> ]
# Optional proxy URL.
[ proxy_url: <string> ]
# Configure whether HTTP requests follow HTTP 3xx redirects.
[ follow_redirects: <boolean> | default = true ]
# Configures the queue used to write to remote storage.
queue_config:
  # Number of samples to buffer per shard before we block reading of more
  # samples from the WAL. It is recommended to have enough capacity in each
  # shard to buffer several requests to keep throughput up while processing
  # occasional slow remote requests.
  [ capacity: <int> | default = 2500 ]
  # Maximum number of shards, i.e. amount of concurrency.
  [ max_shards: <int> | default = 200 ]
  # Minimum number of shards, i.e. amount of concurrency.
  [ min_shards: <int> | default = 1 ]
  # Maximum number of samples per send.
  [ max_samples_per_send: <int> | default = 500 ]
  # Maximum time a sample will wait in buffer.
  [ batch_send_deadline: <duration> | default = 5s ]
  # Initial retry delay. Gets doubled for every retry.
  [ min_backoff: <duration> | default = 30ms ]
  # Maximum retry delay.
  [ max_backoff: <duration> | default = 5s ]
  # Retry upon receiving a 429 status code from the remote-write storage.
  # This is experimental and might change in the future.
  [ retry_on_http_429: <boolean> | default = false ]
# Configures the sending of series metadata to remote storage.
# Metadata configuration is subject to change at any point
# or be removed in future releases.
metadata_config:
  # Whether metric metadata is sent to remote storage or not.
  [ send: <boolean> | default = true ]
  # How frequently metric metadata is sent to remote storage.
  [ send_interval: <duration> | default = 1m ]
  # Maximum number of samples per send.
  [ max_samples_per_send: <int> | default = 500 ]
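The retry behaviour described by min_backoff and max_backoff in queue_config (the delay starts at the minimum, doubles on every retry, and is capped at the maximum) can be sketched as follows. This is an illustration of the documented defaults only, not Prometheus's actual retry code; the function name backoff_delays is hypothetical.

```python
def backoff_delays(retries, min_backoff_ms=30, max_backoff_ms=5000):
    """Return the wait before each retry in milliseconds: doubled each time, capped."""
    delays = []
    delay = min_backoff_ms
    for _ in range(retries):
        delays.append(delay)
        # Double for the next retry, but never exceed the configured maximum.
        delay = min(delay * 2, max_backoff_ms)
    return delays


if __name__ == "__main__":
    print(backoff_delays(10))
    # [30, 60, 120, 240, 480, 960, 1920, 3840, 5000, 5000]
```

With the defaults, the cap of 5s is reached on the ninth retry, after which every further retry waits the same 5s.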
<remote_read>
# The URL of the endpoint to query from.
url: <string>
# Name of the remote read config, which if specified must be unique among remote read configs.
# The name will be used in metrics and logging in place of a generated value to help users distinguish between
# remote read configs.
[ name: <string> ]
# An optional list of equality matchers which have to be
# present in a selector to query the remote read endpoint.
required_matchers:
  [ <labelname>: <labelvalue> ... ]
# Timeout for requests to the remote read endpoint.
[ remote_timeout: <duration> | default = 1m ]
# Custom HTTP headers to be sent along with each remote read request.
# Be aware that headers that are set by Prometheus itself can't be overwritten.
headers:
  [ <string>: <string> ... ]
# Whether reads should be made for queries for time ranges that
# the local storage should have complete data for.
[ read_recent: <boolean> | default = false ]
# Sets the `Authorization` header on every remote read request with the
# configured username and password.
# password and password_file are mutually exclusive.
basic_auth:
  [ username: <string> ]
  [ password: <secret> ]
  [ password_file: <string> ]
# Optional `Authorization` header configuration.
authorization:
  # Sets the authentication type.
  [ type: <string> | default: Bearer ]
  # Sets the credentials. It is mutually exclusive with
  # `credentials_file`.
  [ credentials: <secret> ]
  # Sets the credentials to the credentials read from the configured file.
  # It is mutually exclusive with `credentials`.
  [ credentials_file: <filename> ]
# Optional OAuth 2.0 configuration.
# Cannot be used at the same time as basic_auth or authorization.
oauth2:
  [ <oauth2> ]
# Configures the remote read request's TLS settings.
tls_config:
  [ <tls_config> ]
# Optional proxy URL.
[ proxy_url: <string> ]
# Configure whether HTTP requests follow HTTP 3xx redirects.
[ follow_redirects: <boolean> | default = true ]
# Whether to use the external labels as selectors for the remote read endpoint.
[ filter_external_labels: <boolean> | default = true ]
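The effect of required_matchers in remote_read is that a query selector is only sent to the remote endpoint when it contains every listed equality matcher. The rule can be sketched like this; it is a simplified model in which a selector is a plain dict of label equality matchers, and remote_read_applies is a hypothetical name, not a Prometheus function.

```python
def remote_read_applies(selector, required_matchers):
    """True if every required label=value pair appears as an equality matcher."""
    return all(selector.get(label) == value
               for label, value in required_matchers.items())


if __name__ == "__main__":
    required = {"job": "archive"}  # hypothetical required_matchers setting
    print(remote_read_applies({"__name__": "up", "job": "archive"}, required))
    print(remote_read_applies({"__name__": "up"}, required))
```

The first selector carries job="archive" and would be forwarded to the remote endpoint; the second lacks it and would be answered from local storage only.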
