regexp_replace(zczb,'([^\\u4E00-\\u9FA5]+)','')*10000
1️⃣ The regexp_replace function
Syntax: regexp_replace(string, pattern, replacement)
- string: The input string (zczb in your case)
- pattern: The regular expression to match
- replacement: The string to replace each match with
It replaces all parts of the string that match the pattern with the replacement.
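As a rough illustration of the replace-all behaviour, here is a minimal Python sketch in which re.sub plays the role of regexp_replace. The sample string and the digit pattern are made up for the example, and the regex dialect of your SQL engine may differ in details.

```python
import re

# Hypothetical sample value; not taken from the real zczb column.
text = "a1b22c333"

# re.sub(pattern, replacement, string) replaces every match, not just the
# first one, mirroring regexp_replace(string, pattern, replacement).
result = re.sub(r"[0-9]+", "-", text)
print(result)  # a-b-c-
```

Note the different argument order: Python takes the pattern first, while regexp_replace takes the input string first.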
2️⃣ The regular expression
Breaking it down:
- \\u4E00-\\u9FA5 → Unicode range for Chinese characters (from 一 to 龥)
- [^ ... ] → negation, i.e., anything not in this range
- + → one or more occurrences
- () → capturing group (not strictly needed here, just for grouping)
✅ So ([^\\u4E00-\\u9FA5]+) matches any sequence of characters that are NOT Chinese characters.
3️⃣ Replacement string
- Replaces all non-Chinese sequences with an empty string → effectively keeps only Chinese characters.
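To see what the pattern matches and what survives the replacement, here is a small Python sketch using the same character class. The sample value is hypothetical, and Python's re module stands in for the SQL engine's regex support.

```python
import re

# Same character class as the SQL pattern: anything outside U+4E00..U+9FA5.
NON_CHINESE = re.compile(r"([^\u4E00-\u9FA5]+)")

# Hypothetical sample value for the zczb column.
sample = "500万元人民币"

matched = NON_CHINESE.findall(sample)  # the non-Chinese runs that get removed
kept = NON_CHINESE.sub("", sample)     # what remains after the replacement

print(matched)  # ['500']
print(kept)     # 万元人民币
```

The digits are the part that gets removed; only the Chinese characters remain.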
4️⃣ Multiply by 10000
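- Multiplying by 10000 converts a figure expressed in 万 (units of ten thousand) to the base unit, which is common in financial data.
- Note that after step 3 only Chinese characters remain, so the string is no longer numeric. In engines such as Hive, multiplying a non-numeric string typically yields NULL rather than a number.
- If zczb holds values like 500万元, the intent was probably the opposite: strip the Chinese characters (pattern [\\u4E00-\\u9FA5]+, without the ^) so that 500 is left to be multiplied.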
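The effect on the multiplication can be checked with a short Python sketch. The to_number helper is a hypothetical stand-in for a lenient SQL string-to-number cast (None in place of NULL), the sample value is invented, and the pattern without ^ is only an assumption about the intended behaviour.

```python
import re

def to_number(s):
    """Mimic a lenient SQL cast: return None when s is not numeric."""
    try:
        return float(s)
    except ValueError:
        return None

sample = "500万元"  # hypothetical zczb value

# Pattern as written: removes non-Chinese characters, keeps "万元".
keep_chinese = re.sub(r"[^\u4E00-\u9FA5]+", "", sample)
# Assumed intent: remove the Chinese characters, keep "500".
keep_digits = re.sub(r"[\u4E00-\u9FA5]+", "", sample)

as_written = to_number(keep_chinese)
scaled_as_written = None if as_written is None else as_written * 10000
scaled_intended = to_number(keep_digits) * 10000

print(scaled_as_written)  # None
print(scaled_intended)    # 5000000.0
```

If the column can also contain decimals, currency symbols, or whitespace, the cleaned string should be checked before scaling.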