ZhangZhihui's Blog  

 

regexp_replace(zczb,'([^\\u4E00-\\u9FA5]+)','')*10000

 

1️⃣ regexp_replace function

Syntax:

regexp_replace(string, pattern, replacement)

 

  • string: The input string (zczb in your case)

  • pattern: The regular expression to match

  • replacement: The string to replace each match with

It replaces all parts of the string that match the pattern with the replacement.
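The "replace every match" behaviour can be tried outside the database. Below is a minimal Python sketch that uses `re.sub` as a stand-in for the SQL function. The helper name `regexp_replace` and the sample string are only illustrative, and Python's regex dialect is assumed to be close enough to the engine's for simple patterns like this one.

```python
import re

def regexp_replace(string: str, pattern: str, replacement: str) -> str:
    """Hypothetical stand-in for SQL regexp_replace: replace every match of pattern."""
    return re.sub(pattern, replacement, string)

# Each run of digits is a separate match, and all of them are replaced.
result = regexp_replace("a1b22c333", r"[0-9]+", "#")
print(result)
```

Backreference syntax in the replacement differs between regex dialects, so this sketch is only meant for plain replacement strings.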


2️⃣ The regular expression

([^\\u4E00-\\u9FA5]+)

Breaking it down:

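  • \\u4E00-\\u9FA5 → Unicode range for common Chinese characters (from 一 to 龥); the backslash is doubled because it has to be escaped inside the SQL string literal

  • [^ ... ] → negation, i.e., anything not in this range

  • + → one or more occurrences

  • () → capturing group (not needed here, since the replacement never refers to the captured text)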

✅ So ([^\\u4E00-\\u9FA5]+) matches any sequence of characters that are NOT Chinese characters.
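To check what the pattern actually matches, here is a small Python sketch. It assumes Python's `re` module treats the `\uXXXX` escapes and the negated character class the same way as the SQL engine does, and the sample string is made up for illustration.

```python
import re

# Same pattern as in the SQL, with single backslashes because this is a raw Python string.
NON_CHINESE = re.compile(r"([^\u4E00-\u9FA5]+)")

sample = "资本100万元RMB"  # hypothetical input value

# findall returns the text captured by the single group for every match.
matches = NON_CHINESE.findall(sample)
print(matches)
```

The digits and the Latin letters form two separate matches, because the Chinese characters between them break the run.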


3️⃣ Replacement string

  • The replacement is the empty string '', so every non-Chinese sequence is deleted → effectively only the Chinese characters are kept.
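A quick way to see the effect of the empty replacement is the Python sketch below. The helper `keep_chinese` and both inputs are hypothetical, and `re.sub` is assumed to behave like the SQL function for this pattern.

```python
import re

def keep_chinese(text: str) -> str:
    """Delete every run of characters outside U+4E00..U+9FA5."""
    return re.sub(r"[^\u4E00-\u9FA5]+", "", text)

# Digits, spaces, Latin letters and the full-width colon all fall outside the range.
mixed = keep_chinese("注册资本：100.5万元 (RMB)")

# With no Chinese characters at all, nothing is left.
no_chinese = keep_chinese("100.5 RMB")

print(repr(mixed), repr(no_chinese))
```

Note that full-width punctuation such as "：" is removed as well, since it lies outside the U+4E00 to U+9FA5 range.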


4️⃣ Multiply by 10000

*10000
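  • Note that, as written, the pattern removes everything except Chinese characters, so the digits are stripped too and what remains (for example 万元) is not numeric. Multiplying such a string by 10000 would be expected to yield NULL in engines such as Hive. For the result to be a numeric string, the character class most likely needs to drop the ^, i.e. ([\\u4E00-\\u9FA5]+), so that the Chinese characters (units such as 万 or 元) are removed and the digits are kept.

  • Multiplying by 10000 then converts a value expressed in 万 (ten thousands) into base units, which is common in financial data where figures are written in 万.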
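To see how the character class affects this step, here is a minimal Python sketch that runs the strip-then-multiply logic with and without the ^. The helper `wan_to_yuan` and the sample value are hypothetical, and returning `None` only imitates the NULL that a failed string-to-number cast is assumed to produce in the SQL engine.

```python
import re
from decimal import Decimal, InvalidOperation
from typing import Optional

def wan_to_yuan(zczb: str, pattern: str) -> Optional[Decimal]:
    """Strip matches of pattern from zczb, then scale by 10000.

    Returns None when the cleaned text is not numeric (assumed to mimic SQL NULL).
    """
    cleaned = re.sub(pattern, "", zczb)
    try:
        return Decimal(cleaned) * 10000
    except InvalidOperation:
        return None

# Pattern as written in the post: keeps only "万元", which is not a number.
as_written = wan_to_yuan("100.5万元", r"[^\u4E00-\u9FA5]+")

# Pattern without the negation: keeps "100.5", which can be scaled.
without_negation = wan_to_yuan("100.5万元", r"[\u4E00-\u9FA5]+")

print(as_written, without_negation)
```

`Decimal` is used instead of `float` here only to avoid rounding surprises with monetary amounts; the SQL engine may well use floating-point arithmetic.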

 

posted on 2025-08-26 18:11  ZhangZhihuiAAA