ZhangZhihui's Blog  

These two SparkSQL functions both deal with transforming data into encrypted or encoded forms, but they operate very differently and serve different purposes. Let’s break it down carefully.


1️⃣ base64encrypt()

  • Purpose: This is not really encryption, it’s encoding. Base64 is just a way to represent binary data (bytes) as ASCII text.

  • Input: Any string or binary data.

  • Output: Base64-encoded string.

  • Security: None. Base64 encoding does not hide the content; it’s trivial to decode. Think of it like putting your data in a different alphabet.

  • Use Case: When you need to store or transmit binary data as text, e.g., in JSON, XML, or CSV.

Example in SparkSQL:

SELECT base64encrypt('hello') AS encoded;
-- Returns: aGVsbG8=

 

Decoding aGVsbG8= back to hello is trivial.


2️⃣ aes_encrypt()

  • Purpose: This is true encryption using the AES algorithm (Advanced Encryption Standard).

  • Input: Plain text and a secret key.

  • Output: Encrypted binary (often stored as bytes or converted to Base64).

  • Security: Strong, as long as the key is secret. Without the key, decrypting is essentially impossible.

  • Use Case: Securely storing sensitive data like passwords, credit card numbers, or personal information.

Example in SparkSQL:

SELECT aes_encrypt('hello', 'mysecretkey') AS encrypted;
-- Returns: binary data like 0x8B1A9953C4611296A...

 

To read it back:

SELECT aes_decrypt(encrypted_column, 'mysecretkey') AS decrypted;

 


✅ Key Differences

Featurebase64encrypt()aes_encrypt()
Type Encoding Encryption
Security None Strong (with secret key)
Output Text (Base64 string) Binary (or Base64 if converted)
Use Case Data transport/format conversion Data security/privacy
Reversibility Easy (decode) Only with key (decrypt)

💡 Tip: A common pattern is to AES encrypt sensitive data and then Base64 encode the encrypted bytes if you need to store or transmit them as text.

 

Let’s do a side-by-side SparkSQL example showing how aes_encrypt() and base64() (same as base64encrypt()) can work together.


🔹 Example: Encrypt, Encode, Decrypt

-- Encrypt with AES (returns binary)
SELECT aes_encrypt('hello', 'mysecretkey') AS aes_encrypted;
-- Example output: [B@4af9fcb7   (binary bytes, not human-readable)

-- Encode the encrypted binary into Base64 (text-friendly)
SELECT base64(aes_encrypt('hello', 'mysecretkey')) AS aes_encrypted_b64;
-- Example output: fV9Bv9p9q5nX8ZzA1MZVgA==   (safe to store as text)

-- Decode from Base64 back to binary
SELECT unbase64('fV9Bv9p9q5nX8ZzA1MZVgA==') AS binary_value;

-- Decrypt AES (requires the same secret key)
SELECT aes_decrypt(unbase64('fV9Bv9p9q5nX8ZzA1MZVgA=='), 'mysecretkey') AS original_value;
-- Output: hello

 


🔹 Flow Summary

  1. AES encrypt → Produces binary (secure, but not easy to store as plain text).

  2. Base64 encode → Converts binary into text for storage/transmission.

  3. Base64 decode → Convert text back into binary.

  4. AES decrypt → Recover original string (only with the correct key).


👉 So, the combo is usually:

base64(aes_encrypt(...)) → to store securely in text form
and
aes_decrypt(unbase64(...), key) → to get the original value back

 

posted on 2025-08-22 15:20  ZhangZhihuiAAA  阅读(49)  评论(0)    收藏  举报