Designing a Custom Index with Redis Hash + RediSearch

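When Redis serves as an external custom-index store for MySQL, a Hash can take the place of a Set, which removes the separate "convert Set to Hash" step. A Hash covers per-tenant uniqueness checks, and Hash is also the key type that RediSearch indexes, so there is no need to maintain two structures.

The reasoning is this: the core value of a Set is guaranteeing element uniqueness, and a Hash can reproduce that guarantee through the uniqueness of its fields combined with atomic commands. The design, implementation, and comparison follow.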

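I. Core design: the Hash structure that replaces the Set

1. Structure design principles

Keep the "tenant / table / field combination" isolation of the original Set scheme, but change the storage type from Set to Hash and store each unique value combination as a Hash field:

Dimension	Hash structure design	Purpose
Hash key	`unique:{tenantId}:{tableName}:{fields}`	Identical to the original Set key; isolates tenants, tables, and field combinations (e.g. `unique:t1:customer:a,b`).
Hash field	`a=value&b=value` (e.g. `a=1&b=2`)	Stores the field-value combination. A Hash cannot hold two identical fields under one key, so the same combination cannot appear twice within one tenant / table / field combination.
Hash value	Placeholder (e.g. `1` or `exists`)	Carries no business meaning; it exists only because a Hash maps each field to a value and cannot store a field on its own.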

Example:
For the (a,b) index on the customer table of tenant t1, storing the two unique combinations a=1&b=2 and a=3&b=4 gives this Hash:

Hash key: unique:t1:customer:a,b
Hash content: {"a=1&b=2": "1", "a=3&b=4": "1"}
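
The key and field strings above can be assembled with a couple of small helpers. The following is a minimal sketch in plain Java, not the author's code: the class name `UniqueIndexKeys` is hypothetical, and URL-encoding each value is an added assumption, included so that a value containing `&` or `=` cannot be confused with the separators.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper class, for illustration only.
class UniqueIndexKeys {
    // Builds the Hash key, e.g. unique:t1:customer:a,b
    static String buildHashKey(String tenantId, String tableName, String fieldList) {
        return String.format("unique:%s:%s:%s", tenantId, tableName, fieldList);
    }

    // Builds the Hash field, e.g. a=1&b=2, in the iteration order of the map.
    // Assumption: values are URL-encoded so "&" and "=" inside a value stay unambiguous.
    static String buildField(Map<String, String> columnValues) {
        return columnValues.entrySet().stream()
            .map(e -> e.getKey() + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8))
            .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("a", "1");
        row.put("b", "2");
        System.out.println(buildHashKey("t1", "customer", "a,b")); // unique:t1:customer:a,b
        System.out.println(buildField(row));                       // a=1&b=2
    }
}
```

An insertion-ordered map is used so that the columns always appear in the same order; two writers that ordered the columns differently would otherwise produce different fields for the same row.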

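2. Why can this structure replace a Set?

For the core requirement of uniqueness checking, the two types offer equivalent operations:

Requirement	Set	Hash	Conclusion
Atomic add	`SADD key value`: adds the member and returns 1 if it is absent; returns 0 if it is already present.	`HSETNX key field value`: sets the field and returns 1 if it is absent; returns 0 and leaves it unchanged if it is present (`NX` = "not exists").	Equivalent; both check and add in one atomic command.
Existence check	`SISMEMBER key value`: tests membership in O(1).	`HEXISTS key field`: tests whether the field exists in O(1).	Equivalent; both are O(1).
Deduplication	A Set never holds duplicate members.	A Hash never holds duplicate fields under one key.	Equivalent; uniqueness comes from the structure itself.
List all values	`SMEMBERS key`: returns all members.	`HKEYS key`: returns all fields (the field-value combinations).	Functionally equivalent; both are O(N), so `SSCAN` / `HSCAN` are preferable on large keys.
Delete a value	`SREM key value`: removes the member.	`HDEL key field`: removes the field.	Functionally equivalent.

So a Hash, with fields standing in for Set members and HSETNX / HEXISTS standing in for SADD / SISMEMBER, provides the full uniqueness-checking capability, and Hash is also the key type that RediSearch can index.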
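
The return-value correspondence in the table can be illustrated without a Redis server. The sketch below is only an analogy in plain Java: `HashSet` and `HashMap` stand in for the Redis Set and Hash, and the method names `sadd` and `hsetnx` are hypothetical wrappers that mimic the reply of the command they are named after for a single member or field. On a real server each command runs atomically, which these single-threaded stand-ins do not demonstrate.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// In-memory analogy only; not a Redis client.
class SetVsHashSemantics {
    // Mimics SADD for one member: 1 if newly added, 0 if it was already present.
    static int sadd(Set<String> set, String member) {
        return set.add(member) ? 1 : 0;
    }

    // Mimics HSETNX: 1 if the field was set, 0 if it already existed (value left untouched).
    static int hsetnx(Map<String, String> hash, String field, String value) {
        return hash.putIfAbsent(field, value) == null ? 1 : 0;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        Map<String, String> hash = new HashMap<>();
        System.out.println(sadd(set, "a=1&b=2") + " " + hsetnx(hash, "a=1&b=2", "1")); // 1 1
        System.out.println(sadd(set, "a=1&b=2") + " " + hsetnx(hash, "a=1&b=2", "1")); // 0 0
        // SISMEMBER / HEXISTS analogues
        System.out.println(set.contains("a=1&b=2") + " " + hash.containsKey("a=1&b=2")); // true true
    }
}
```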

II. Key operations: uniqueness checks with a Hash

The core operations (add, check, delete) are implemented below, using Java with Spring Data Redis as the example:

1. Dependency injection (basic setup)

import java.util.ArrayList;
import java.util.List;

import org.springframework.data.redis.core.HashOperations;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Component;

@Component
public class HashUniqueIndexUtil {
    // Typed view of the Hash commands. The injected RedisTemplate must be configured with
    // String serializers (e.g. StringRedisSerializer) for keys, hash keys, and hash values.
    private final HashOperations<String, String, String> hashOps;

    public HashUniqueIndexUtil(RedisTemplate<String, String> redisTemplate) {
        this.hashOps = redisTemplate.opsForHash();
    }

    // 1. Build the Hash key (same format as the original Set key)
    private String buildHashKey(String tenantId, String tableName, String fieldList) {
        return String.format("unique:%s:%s:%s", tenantId, tableName, fieldList);
    }

    // 2. Atomically add a unique value (check + add in one command)
    public boolean addUniqueValue(String tenantId, String tableName, String fieldList, String valueStr) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HSETNX: returns true if the field was absent and has been added, false if it already existed
        return Boolean.TRUE.equals(hashOps.putIfAbsent(hashKey, valueStr, "1"));
    }

    // 3. Uniqueness check (does the value already exist?)
    public boolean isUniqueValueExists(String tenantId, String tableName, String fieldList, String valueStr) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HEXISTS: O(1) field lookup
        return Boolean.TRUE.equals(hashOps.hasKey(hashKey, valueStr));
    }

    // 4. Delete a unique value
    public void deleteUniqueValue(String tenantId, String tableName, String fieldList, String valueStr) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HDEL: remove the field
        hashOps.delete(hashKey, valueStr);
    }

    // 5. Fetch all unique values for one tenant / table / field combination
    public List<String> getAllUniqueValues(String tenantId, String tableName, String fieldList) {
        String hashKey = buildHashKey(tenantId, tableName, fieldList);
        // HKEYS: returns every field (the field-value combinations); the Hash counterpart of SMEMBERS
        return new ArrayList<>(hashOps.keys(hashKey));
    }
}

2. Calling from the service layer (Customer table example)

import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class CustomerService {
    private final HashUniqueIndexUtil hashUniqueIndexUtil;
    private final CustomerRepository customerRepository;

    // Constructor injection
    public CustomerService(HashUniqueIndexUtil hashUniqueIndexUtil, CustomerRepository customerRepository) {
        this.hashUniqueIndexUtil = hashUniqueIndexUtil;
        this.customerRepository = customerRepository;
    }

    // Create a customer (with uniqueness check)
    @Transactional
    public Customer createCustomer(String tenantId, Customer customer) {
        // 1. Build the field-value combination (e.g. a=1&b=2)
        String valueStr = String.format("a=%s&b=%s", customer.getA(), customer.getB());
        // 2. Reserve the value atomically with HSETNX. A separate "exists?" check followed by
        //    a later add would let two concurrent requests both pass the check.
        if (!hashUniqueIndexUtil.addUniqueValue(tenantId, "customer", "a,b", valueStr)) {
            throw new RuntimeException("a=" + customer.getA() + "&b=" + customer.getB()
                    + " already exists in the customer table of tenant " + tenantId);
        }
        try {
            // 3. Save to the database
            customer.setTenantId(tenantId);
            return customerRepository.save(customer);
        } catch (RuntimeException e) {
            // 4. Release the reservation if the save fails. Redis is not part of the database
            //    transaction, so a failure at commit time still needs the repair described in Section V.
            hashUniqueIndexUtil.deleteUniqueValue(tenantId, "customer", "a,b", valueStr);
            throw e;
        }
    }

    // Delete a customer (and remove the value from the Redis Hash)
    @Transactional
    public void deleteCustomer(String tenantId, Long customerId) {
        // 1. Load the customer to delete
        Customer customer = customerRepository.findByIdAndTenantId(customerId, tenantId)
                .orElseThrow(() -> new RuntimeException("Customer not found"));
        // 2. Build the field-value combination
        String valueStr = String.format("a=%s&b=%s", customer.getA(), customer.getB());
        // 3. Delete the database row first: if this throws, the index entry is still in place
        customerRepository.delete(customer);
        // 4. Remove the field from the Redis Hash
        hashUniqueIndexUtil.deleteUniqueValue(tenantId, "customer", "a,b", valueStr);
    }
}

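III. Integrating RediSearch: indexing Hashes directly

This is where the Hash scheme pays off: RediSearch indexes Hash keys directly, so the index data never has to be converted out of a Set before it can be searched.

1. Creating the RediSearch index

`FT.CREATE ... ON HASH` treats each matching Hash key as one document and indexes the values of the Hash fields named in its SCHEMA; it does not index field names. To make a combination such as a=1&b=2 searchable, it therefore has to be stored as the value of a named field (value_str below), one Hash key per combination, for example `unique:t1:customer:a,b:a=1&b=2`, alongside tenant_id, table_name, and field_list fields used for filtering. A TEXT field would be tokenized on punctuation such as = and &, so value_str is declared as TAG, which keeps the whole string as one token and supports exact and prefix matching.

Index creation command (Redis CLI):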

# Annotations (not part of the command; redis-cli expects the command on a single line):
#   ON HASH PREFIX 1 "unique:"     index only Hash keys whose name starts with "unique:"
#   value_str TAG SEPARATOR "|"    the combination string (e.g. a=1&b=2), kept as one token;
#                                  "|" is assumed never to occur in the stored values
#   tenant_id TAG, table_name TAG  exact-match filters for tenant and table
#   field_list TAG SEPARATOR "|"   field combination (e.g. a,b); the default TAG separator ","
#                                  would split it into the two tags a and b
FT.CREATE idx_unique_hash ON HASH PREFIX 1 "unique:" SCHEMA value_str TAG SEPARATOR "|" tenant_id TAG table_name TAG field_list TAG SEPARATOR "|"

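Note:
Because each searchable combination is its own Hash key holding value_str, tenant_id, table_name, and field_list as fields (e.g. value_str="a=1&b=2"), those fields must be written when a value is added, and the key removed with DEL when the value is deleted. `FT.CREATE ... ON HASH` requires RediSearch 2.0 or later.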
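
Since RediSearch indexes field values and treats each Hash key as one document, one possible layout is a Hash per value combination. The sketch below only builds the key and the field map in plain Java; the class name `SearchDocumentLayout`, the key format with the combination appended, and the exact set of fields are assumptions made for illustration, not something the index requires.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical layout helper: one Hash key (document) per unique value combination.
class SearchDocumentLayout {
    // Assumed key format: the uniqueness key with the combination appended,
    // so the document still matches the "unique:" index prefix.
    static String documentKey(String tenantId, String tableName, String fieldList, String valueStr) {
        return String.format("unique:%s:%s:%s:%s", tenantId, tableName, fieldList, valueStr);
    }

    // Field map to store in that Hash; the names match the index schema fields.
    static Map<String, String> documentFields(String tenantId, String tableName,
                                              String fieldList, String valueStr) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("value_str", valueStr);
        fields.put("tenant_id", tenantId);
        fields.put("table_name", tableName);
        fields.put("field_list", fieldList);
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(documentKey("t1", "customer", "a,b", "a=1&b=2"));
        // unique:t1:customer:a,b:a=1&b=2
        System.out.println(documentFields("t1", "customer", "a,b", "a=1&b=2"));
        // {value_str=a=1&b=2, tenant_id=t1, table_name=customer, field_list=a,b}
    }
}
```

With a client such as Jedis, the map could then be written in one call with `hset(key, fields)`.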

2. Search implementation (Java)

With the index above, a search filters on tenant, table name, and field combination, which keeps tenants isolated:

import java.util.List;
import java.util.stream.Collectors;

import org.springframework.stereotype.Component;

import redis.clients.jedis.UnifiedJedis;
import redis.clients.jedis.search.Query;
import redis.clients.jedis.search.SearchResult;

// Spring Data Redis' RedisTemplate has no RediSearch commands, so this example assumes a
// Jedis 4.x+ UnifiedJedis bean (e.g. a JedisPooled) is available for injection.
@Component
public class HashUniqueSearchUtil {
    private final UnifiedJedis jedis;

    public HashUniqueSearchUtil(UnifiedJedis jedis) {
        this.jedis = jedis;
    }

    // Backslash-escape ASCII punctuation and whitespace so they match literally in a TAG query
    private static String escapeTag(String raw) {
        return raw.replaceAll("([\\p{Punct}\\s])", "\\\\$1");
    }

    // Search unique values; a trailing "*" in keyword requests a prefix match (e.g. "a=1*")
    public List<String> searchUniqueValues(String tenantId, String tableName, String fieldList, String keyword) {
        boolean prefix = keyword.endsWith("*");
        String core = prefix ? keyword.substring(0, keyword.length() - 1) : keyword;
        // 1. Build the query: match on value_str plus tenant / table / field-combination filters.
        //    All four are assumed to be TAG fields in idx_unique_hash.
        String queryStr = String.format(
            "@value_str:{%s%s} @tenant_id:{%s} @table_name:{%s} @field_list:{%s}",
            escapeTag(core), prefix ? "*" : "",
            escapeTag(tenantId), escapeTag(tableName), escapeTag(fieldList));

        // 2. Return only value_str (i.e. a=value&b=value); page from offset 0, at most 50 results
        Query query = new Query(queryStr).returnFields("value_str").limit(0, 50);

        // 3. Run the search against the idx_unique_hash index
        SearchResult result = jedis.ftSearch("idx_unique_hash", query);

        // 4. Extract the values
        return result.getDocuments().stream()
            .map(doc -> doc.getString("value_str"))
            .collect(Collectors.toList());
    }
}

Usage example:
Search the (a,b) index on tenant t1's customer table for all unique combinations in which a starts with 1:

List<String> results = hashUniqueSearchUtil.searchUniqueValues(
    "t1", "customer", "a,b", "a=1*"
);
// Possible result: ["a=1&b=2", "a=10&b=3", "a=11&b=5"]
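
To see what query string such a call would send, the following standalone sketch builds it without touching Redis. It assumes all four index fields are TAG fields and that punctuation inside a tag value is matched literally once backslash-escaped; the class and method names (`TagQueryBuilder`, `escapeTag`, `prefixQuery`) are hypothetical.

```java
// Hypothetical query-string builder for TAG fields; no Redis connection involved.
class TagQueryBuilder {
    // Backslash-escapes every ASCII character that is not a letter, digit, or underscore.
    static String escapeTag(String raw) {
        StringBuilder sb = new StringBuilder();
        for (char c : raw.toCharArray()) {
            if (c < 128 && !Character.isLetterOrDigit(c) && c != '_') {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Prefix match on value_str, exact match on the three filter fields.
    static String prefixQuery(String tenantId, String tableName, String fieldList, String prefix) {
        return String.format(
            "@value_str:{%s*} @tenant_id:{%s} @table_name:{%s} @field_list:{%s}",
            escapeTag(prefix), escapeTag(tenantId), escapeTag(tableName), escapeTag(fieldList));
    }

    public static void main(String[] args) {
        System.out.println(prefixQuery("t1", "customer", "a,b", "a=1"));
        // @value_str:{a\=1*} @tenant_id:{t1} @table_name:{customer} @field_list:{a\,b}
    }
}
```

The escaping matters because an unescaped `=` or `,` inside the braces would be read as query syntax rather than as part of the tag value.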

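IV. Hash scheme vs. Set scheme: key advantages

Dimension	Set scheme	Hash scheme	Conclusion
Structural complexity	Needs a Set plus a Hash (for search), so the data is stored twice.	Only Hashes are maintained.	Hash is simpler and cheaper to maintain.
RediSearch integration	RediSearch cannot index a Set; the data must first be copied into Hashes.	Hash is a key type RediSearch indexes directly.	No Set-to-Hash conversion step.
Extensibility	Extra metadata (e.g. creation time) needs an additional structure.	Metadata such as create_time can be kept in the Hash itself without changing the core logic.	Hash is more extensible.
Operational consistency	The Set and the Hash must be updated together, with a risk of drift.	Only Hashes are written.	Fewer places for inconsistency to arise.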

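V. Notes

Hash field length limit:
A Hash field is a Redis string, whose maximum size is 512 MB. Field-value combinations such as a=1&b=2 are normally short, so length is not a practical concern.
Atomicity:
Always add fields with HSETNX (putIfAbsent) rather than HSET, so that "check for absence, then add" is a single atomic step and concurrent requests cannot both insert the same value.
Repairing inconsistencies:
If the database and Redis diverge after a failure (e.g. a network outage), repair them with a full comparison: iterate over all database rows for a tenant / table / field combination, build each valueStr, and add any that are missing from the Redis Hash; then remove any Hash field that has no matching database row.
RediSearch version compatibility:
RediSearch indexes the values of named Hash fields, not the field names, so the searchable string must be stored in a field such as value_str and that field indexed. `FT.CREATE ... ON HASH` requires RediSearch 2.0 or later.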
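
The full-comparison repair described above comes down to two set differences. The sketch below shows only that comparison step in plain Java, under the assumption that the database values and the Hash fields (e.g. from HKEYS or HSCAN) have already been loaded into memory; the class name `IndexReconciler` and its method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical reconciliation helper: computes what to add to and remove from the Redis Hash.
class IndexReconciler {
    // Values present in the database but missing from the Hash: candidates for HSETNX.
    static Set<String> missingInRedis(Set<String> dbValues, Set<String> redisFields) {
        Set<String> missing = new TreeSet<>(dbValues);
        missing.removeAll(redisFields);
        return missing;
    }

    // Fields present in the Hash with no matching database row: candidates for HDEL.
    static Set<String> staleInRedis(Set<String> dbValues, Set<String> redisFields) {
        Set<String> stale = new TreeSet<>(redisFields);
        stale.removeAll(dbValues);
        return stale;
    }

    public static void main(String[] args) {
        Set<String> db = Set.of("a=1&b=2", "a=3&b=4");
        Set<String> redis = Set.of("a=3&b=4", "a=9&b=9");
        System.out.println(missingInRedis(db, redis)); // [a=1&b=2]
        System.out.println(staleInRedis(db, redis));   // [a=9&b=9]
    }
}
```

For large tables the same comparison would likely need to run in batches rather than on fully loaded sets.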

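Summary

A Hash can replace a Set for storing the custom index, with clear advantages:

It removes the Set-to-Hash step, giving a simpler structure with lower maintenance cost;
It keeps the Set's atomic uniqueness check at the same O(1) cost;
Hash is the key type RediSearch indexes, so search can be added without a second structure, provided the searchable string is stored as a field value;
It is more extensible: metadata fields (e.g. tenant, table name) can be added for filtering and indexing.

This makes the Hash scheme the better choice for custom index storage in Redis, especially when RediSearch integration is planned.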

posted @ 2025-09-28 17:46 向着朝阳
