Modern Cryptography (Cryptography): Lab 1

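Introduction

This semester I am taking a modern cryptography course. The lab assignments are fairly difficult, so I am writing this blog post to record how I solved them.

Lab 1 mainly covers classical ciphers, such as the one-time pad and the Vigenère cipher.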

T1

Problem

Let us see what goes wrong when a stream cipher key is used more than once. Below are eleven hex-encoded ciphertexts that are the result of encrypting eleven plaintexts with a stream cipher, all with the same stream cipher key. Your goal is to decrypt the last ciphertext, and submit the secret message within it as solution.

Hint: XOR the ciphertexts together, and consider what happens when a space is XORed with a character in [a-z,A-Z].

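The description is followed by 10 ciphertexts plus one target ciphertext whose plaintext must be recovered. Our instructor additionally asked for the plaintext of the fourth ciphertext.

Problem link: T1-Many time pad

Approach

As we have learned, the one-time pad is perfectly secret, but as soon as several messages are encrypted with the same key, information can leak. Suppose two plaintexts \(P_{1}\) and \(P_{2}\) are encrypted with the same key \(k\):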

\[P_{1} \oplus k = C_{1} \]

\[P_{2} \oplus k = C_{2} \]

It follows immediately that

\[P_{1} \oplus P_{2} = C_{1} \oplus C_{2} \]

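The derivation above shows that XORing two ciphertexts yields the XOR of the two plaintexts. Following the hint, note that XORing a letter with a space (0x20) flips its case: an uppercase letter becomes lowercase and a lowercase letter becomes uppercase. So if exactly one of the two plaintexts has a space at a given position and the other has a letter, the XOR at that position is an English letter; otherwise the result is most likely not a letter.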
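Both facts are easy to check with a small sketch. The key and the two messages below are made up for illustration and are not part of the assignment; `xor_bytes` is a helper name chosen here.

```python
# Toy check: the key cancels out, and a space flips the case of a letter.
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

key = bytes([0x13, 0x37, 0xC0, 0xDE, 0x42])  # made-up key stream
p1 = b"a cat"                                 # made-up plaintexts
p2 = b"hello"
c1 = xor_bytes(p1, key)
c2 = xor_bytes(p2, key)

# The key disappears: C1 xor C2 == P1 xor P2
assert xor_bytes(c1, c2) == xor_bytes(p1, p2)

# A space (0x20) XORed with a letter only toggles the case bit
assert chr(ord("e") ^ ord(" ")) == "E"
assert chr(ord("Q") ^ ord(" ")) == "q"

# Position 1 of p1 is a space, so that position of C1 xor C2 is the
# case-flipped letter from p2 ("e" becomes "E")
print(chr(xor_bytes(c1, c2)[1]))
```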

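Accordingly, we XOR each of the 10 ciphertexts with the target ciphertext, which gives the XOR of each of the 10 plaintexts with the target plaintext. For each position i we collect the i-th character of every result and count. If most of the results at that position are letters, the i-th character of the target plaintext is probably a space; otherwise it is probably a letter. In addition, English text is mostly lowercase except at the start of a sentence, and a lowercase letter in the target that meets a space in another plaintext shows up as an uppercase letter, so when an uppercase letter appears we first guess the corresponding lowercase letter as the plaintext character.

After this processing the plaintext should look as follows, where "?" marks an undetermined character: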
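The counting step can be sketched on toy data. Everything in this example is an assumption made for illustration: the key, the four plaintexts, the helper name `letter_votes`, and the "more than half" threshold. The real attack runs the same idea over the ten ciphertexts from the problem.

```python
def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

def is_letter(v):
    return 65 <= v <= 90 or 97 <= v <= 122

key = bytes(range(100, 116))  # made-up 16-byte key stream
target_plain = b"meet me at ten"
other_plains = [b"the quick brown", b"never reuse keys", b"do not repeat it"]

target_ct = xor_bytes(target_plain, key)
other_cts = [xor_bytes(p, key) for p in other_plains]

def letter_votes(target_ct, other_cts):
    """For each position, count how many XORs with the target give a letter."""
    votes = [0] * len(target_ct)
    for ct in other_cts:
        for i, v in enumerate(xor_bytes(target_ct, ct)):
            if is_letter(v):
                votes[i] += 1
    return votes

votes = letter_votes(target_ct, other_cts)
# A position where most of the other messages produce a letter is
# probably a space in the target plaintext.
space_positions = [i for i, v in enumerate(votes) if v > len(other_cts) / 2]
print(votes)
print(space_positions)  # [4, 7, 10]
```

Positions with a single vote are the places where one of the other messages has a space, which is exactly where the target letter leaks with its case flipped.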

m="Th? se??et messag? is? w?en us?n? a stre?m cip?er? never use ?he key more than once"

A reasonable guess for the plaintext is then (later verified to be correct):

m="The secret message is: when using a stream cipher, never use the key more than once"

XORing this plaintext with the target ciphertext gives the key k, and with it the other plaintexts are easy to recover as well.
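As a minimal sketch of this last step, the snippet below uses only the first four bytes of the target ciphertext and of the first ciphertext from the problem, together with the guessed plaintext prefix "The ". The helper name `xor_bytes` is an assumption of this example.

```python
def xor_bytes(a, b):
    """XOR two byte strings, truncating to the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))

target_prefix = bytes.fromhex("32510ba9")  # first 4 bytes of the target ciphertext
c1_prefix = bytes.fromhex("315c4eea")      # first 4 bytes of the first ciphertext
known_plain = b"The "                      # guessed start of the target plaintext

# key = ciphertext xor plaintext
key_prefix = xor_bytes(target_prefix, known_plain)
print(key_prefix.hex())                    # 66396e89

# The same key bytes decrypt the start of any other message
print(xor_bytes(c1_prefix, key_prefix))    # b'We c'
```

Only as many key bytes are recovered as the known plaintext is long, so messages longer than the target can only be decrypted up to that length.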

The full code is as follows:

# Target ciphertext (c) followed by the ten other ciphertexts (c_1 to c_10)
c="32510ba9babebbbefd001547a810e67149caee11d945cd7fc81a05e9f85aac650e9052ba6a8cd8257bf14d13e6f0a803b54fde9e77472dbff89d71b57bddef121336cb85ccb8f3315f4b52e301d16e9f52f904"
c_1="315c4eeaa8b5f8aaf9174145bf43e1784b8fa00dc71d885a804e5ee9fa40b16349c146fb778cdf2d3aff021dfff5b403b510d0d0455468aeb98622b137dae857553ccd8883a7bc37520e06e515d22c954eba5025b8cc57ee59418ce7dc6bc41556bdb36bbca3e8774301fbcaa3b83b220809560987815f65286764703de0f3d524400a19b159610b11ef3e"
c_2="234c02ecbbfbafa3ed18510abd11fa724fcda2018a1a8342cf064bbde548b12b07df44ba7191d9606ef4081ffde5ad46a5069d9f7f543bedb9c861bf29c7e205132eda9382b0bc2c5c4b45f919cf3a9f1cb74151f6d551f4480c82b2cb24cc5b028aa76eb7b4ab24171ab3cdadb8356f"
c_3="32510ba9a7b2bba9b8005d43a304b5714cc0bb0c8a34884dd91304b8ad40b62b07df44ba6e9d8a2368e51d04e0e7b207b70b9b8261112bacb6c866a232dfe257527dc29398f5f3251a0d47e503c66e935de81230b59b7afb5f41afa8d661cb"
c_4="32510ba9aab2a8a4fd06414fb517b5605cc0aa0dc91a8908c2064ba8ad5ea06a029056f47a8ad3306ef5021eafe1ac01a81197847a5c68a1b78769a37bc8f4575432c198ccb4ef63590256e305cd3a9544ee4160ead45aef520489e7da7d835402bca670bda8eb775200b8dabbba246b130f040d8ec6447e2c767f3d30ed81ea2e4c1404e1315a1010e7229be6636aaa"
c_5="3f561ba9adb4b6ebec54424ba317b564418fac0dd35f8c08d31a1fe9e24fe56808c213f17c81d9607cee021dafe1e001b21ade877a5e68bea88d61b93ac5ee0d562e8e9582f5ef375f0a4ae20ed86e935de81230b59b73fb4302cd95d770c65b40aaa065f2a5e33a5a0bb5dcaba43722130f042f8ec85b7c2070"
c_6="32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd2061bbde24eb76a19d84aba34d8de287be84d07e7e9a30ee714979c7e1123a8bd9822a33ecaf512472e8e8f8db3f9635c1949e640c621854eba0d79eccf52ff111284b4cc61d11902aebc66f2b2e436434eacc0aba938220b084800c2ca4e693522643573b2c4ce35050b0cf774201f0fe52ac9f26d71b6cf61a711cc229f77ace7aa88a2f19983122b11be87a59c355d25f8e4"
c_7="32510bfbacfbb9befd54415da243e1695ecabd58c519cd4bd90f1fa6ea5ba47b01c909ba7696cf606ef40c04afe1ac0aa8148dd066592ded9f8774b529c7ea125d298e8883f5e9305f4b44f915cb2bd05af51373fd9b4af511039fa2d96f83414aaaf261bda2e97b170fb5cce2a53e675c154c0d9681596934777e2275b381ce2e40582afe67650b13e72287ff2270abcf73bb028932836fbdecfecee0a3b894473c1bbeb6b4913a536ce4f9b13f1efff71ea313c8661dd9a4ce"
c_8="315c4eeaa8b5f8bffd11155ea506b56041c6a00c8a08854dd21a4bbde54ce56801d943ba708b8a3574f40c00fff9e00fa1439fd0654327a3bfc860b92f89ee04132ecb9298f5fd2d5e4b45e40ecc3b9d59e9417df7c95bba410e9aa2ca24c5474da2f276baa3ac325918b2daada43d6712150441c2e04f6565517f317da9d3"
c_9="271946f9bbb2aeadec111841a81abc300ecaa01bd8069d5cc91005e9fe4aad6e04d513e96d99de2569bc5e50eeeca709b50a8a987f4264edb6896fb537d0a716132ddc938fb0f836480e06ed0fcd6e9759f40462f9cf57f4564186a2c1778f1543efa270bda5e933421cbe88a4a52222190f471e9bd15f652b653b7071aec59a2705081ffe72651d08f822c9ed6d76e48b63ab15d0208573a7eef027"
c_10="466d06ece998b7a2fb1d464fed2ced7641ddaa3cc31c9941cf110abbf409ed39598005b3399ccfafb61d0315fca0a314be138a9f32503bedac8067f03adbf3575c3b8edc9ba7f537530541ab0f9f3cd04ff50d66f1d559ba520e89a2cb2a83"
c_list=[c_1,c_2,c_3,c_4,c_5,c_6,c_7,c_8,c_9,c_10]
x_list=[]

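# XOR two hex strings of different lengths (truncated to the shorter one).
# The result is zero-padded so leading zero bytes are not lost; the "0x"
# prefix is kept because the callers below strip it.
def strxor(a, b):
    n = min(len(a), len(b))
    return "0x" + format(int(a[:n], 16) ^ int(b[:n], 16), "0{}x".format(n))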

# Render the target ciphertext as text (non-letters become "?"); mainly used for its length
ans_c=""
for i in range(0,len(c),2):
    asc=int(c[i:i+2],16)
    if(asc>=65 and asc<=90):
        ans_c=ans_c+chr(asc)
    elif(asc>=97 and asc<=122):
        ans_c=ans_c+chr(asc)
    else:
        ans_c=ans_c+"?"

# XOR every ciphertext with the target ciphertext and store the results in x_list
for ct in c_list:
    tmp=strxor(c,ct)[2:]
    ans=""
    for i in range(0,len(tmp),2):
        asc=int(tmp[i:i+2],16)
        if(asc>=65 and asc<=90 or asc>=97 and asc<=122):
            ans=ans+chr(asc)
        else:
            ans=ans+"?"
    # Left-pad to the target length in case leading zero bytes were dropped
    if(len(ans)<len(ans_c)):
        for i in range(len(ans_c)-len(ans)):
            ans="?"+ans
    # Store the result
    x_list.append(ans)

m=""
# Print one column per line to locate likely letters and spaces
for i in range(len(ans_c)):
    m=""
    for j in range(len(x_list)):
        m=m+x_list[j][i]
    print(m)
m="Th? se??et messag? is? w?en us?n? a stre?m cip?er? never use ?he key more than once"
m="The secret message is: when using a stream cipher, never use the key more than once"

# Recover the key
h=""
for i in m:
    h=h+hex(ord(i))[2:]
k=int(c,16)^int(h,16)
print(hex(k)[2:])

# Recover the plaintext of c_4
m_4=strxor(c_4,hex(k)[2:])
ans=""
for i in range(2,len(m_4),2):
    asc=int(m_4[i:i+2],16)
    ans=ans+chr(asc)
print(ans)

T2

Problem

Write a program that allows you to "crack" ciphertexts generated using a Vigenere-like cipher, where byte-wise XOR is used instead of addition modulo 26.

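The ciphertext is (written as a Python string literal; the backslashes are line continuations and are not part of the data):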

'F96DE8C227A259C87EE1DA2AED57C93FE5DA36ED4EC87EF2C63AAE5B9A\
7EFFD673BE4ACF7BE8923CAB1ECE7AF2DA3DA44FCF7AE29235A24C963FF0DF3CA3599A\
70E5DA36BF1ECE77F8DC34BE129A6CF4D126BF5B9A7CFEDF3EB850D37CF0C63AA2509A\
76FF9227A55B9A6FE3D720A850D97AB1DD35ED5FCE6BF0D138A84CC931B1F121B44ECE\
70F6C032BD56C33FF9D320ED5CDF7AFF9226BE5BDE3FF7DD21ED56CF71F5C036A94D96\
3FF8D473A351CE3FE5DA3CB84DDB71F5C17FED51DC3FE8D732BF4D963FF3C727ED4AC8\
7EF5DB27A451D47EFD9230BF47CA6BFEC12ABE4ADF72E29224A84CDF3FF5D720A459D4\
7AF59232A35A9A7AE7D33FB85FCE7AF5923AA31EDB3FF7D33ABF52C33FF0D673A551D9\
3FFCD33DA35BC831B1F43CBF1EDF67F0DF23A15B963FE5DA36ED68D378F4DC36BF5B9A\
7AFFD121B44ECE76FEDC73BE5DD27AFCD773BA5FC93FE5DA3CB859D26BB1C63CED5CDF\
3FE2D730B84CDF3FF7DD21ED5ADF7CF0D636BE1EDB79E5D721ED57CE3FE6D320ED57D4\
69F4DC27A85A963FF3C727ED49DF3FFFDD24ED55D470E69E73AC50DE3FE5DA3ABE1EDF\
67F4C030A44DDF3FF5D73EA250C96BE3D327A84D963FE5DA32B91ED36BB1D132A31ED8\
7AB1D021A255DF71B1C436BF479A7AF0C13AA14794'

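Approach

This is a Vigenère-like cipher, except that instead of combining a plaintext letter with a key letter by addition modulo 26, each plaintext byte is XORed with a key byte to produce the ciphertext byte. The attack is the same as for Vigenère: first enumerate the key length, then brute-force every possible key byte for each position. My criterion for a correct key byte is that XORing it with the ciphertext bytes at that position yields only valid characters (letters, commas, periods, and spaces). For this ciphertext that leaves a unique key.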
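To make the cipher itself concrete, here is a minimal sketch of the byte-wise XOR variant and of why grouping by position works. The 3-byte key, the message, and the function name `repeating_key_xor` are made up for this example and are unrelated to the assignment's key.

```python
def repeating_key_xor(data: bytes, key: bytes) -> bytes:
    """XOR each byte with the key byte at the same position modulo the key length."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = bytes([0xBA, 0x1F, 0x91])  # made-up 3-byte key
plaintext = b"Cryptography is fun."
ciphertext = repeating_key_xor(plaintext, key)

# XOR is its own inverse, so encrypting twice returns the plaintext
assert repeating_key_xor(ciphertext, key) == plaintext

# Bytes at positions 0, 3, 6, ... all share key[0], so each such group
# can be attacked as a single-byte XOR cipher
column0 = ciphertext[0::3]
assert bytes(b ^ key[0] for b in column0) == plaintext[0::3]

print(ciphertext.hex().upper())
```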

The full code is as follows:

import string

ciphertext = 'F96DE8C227A259C87EE1DA2AED57C93FE5DA36ED4EC87EF2C63AAE5B9A\
7EFFD673BE4ACF7BE8923CAB1ECE7AF2DA3DA44FCF7AE29235A24C963FF0DF3CA3599A\
70E5DA36BF1ECE77F8DC34BE129A6CF4D126BF5B9A7CFEDF3EB850D37CF0C63AA2509A\
76FF9227A55B9A6FE3D720A850D97AB1DD35ED5FCE6BF0D138A84CC931B1F121B44ECE\
70F6C032BD56C33FF9D320ED5CDF7AFF9226BE5BDE3FF7DD21ED56CF71F5C036A94D96\
3FF8D473A351CE3FE5DA3CB84DDB71F5C17FED51DC3FE8D732BF4D963FF3C727ED4AC8\
7EF5DB27A451D47EFD9230BF47CA6BFEC12ABE4ADF72E29224A84CDF3FF5D720A459D4\
7AF59232A35A9A7AE7D33FB85FCE7AF5923AA31EDB3FF7D33ABF52C33FF0D673A551D9\
3FFCD33DA35BC831B1F43CBF1EDF67F0DF23A15B963FE5DA36ED68D378F4DC36BF5B9A\
7AFFD121B44ECE76FEDC73BE5DD27AFCD773BA5FC93FE5DA3CB859D26BB1C63CED5CDF\
3FE2D730B84CDF3FF7DD21ED5ADF7CF0D636BE1EDB79E5D721ED57CE3FE6D320ED57D4\
69F4DC27A85A963FF3C727ED49DF3FFFDD24ED55D470E69E73AC50DE3FE5DA3ABE1EDF\
67F4C030A44DDF3FF5D73EA250C96BE3D327A84D963FE5DA32B91ED36BB1D132A31ED8\
7AB1D021A255DF71B1C436BF479A7AF0C13AA14794'

# Convert the hex string into a list of byte values for easier computation
def hex_to_ascii(hex_text):
    ascii_list = []
    for i in range(0, len(hex_text), 2):
        ascii_list.append(int(hex_text[i:i + 2], 16))
    return ascii_list

# Try every key byte and keep those for which every decrypted character is valid
def find_possible_keys(byte_group):
    valid_chars = string.ascii_letters + ',' + '.' + ' '
    confirmed_keys = []
    # range(256) covers 0x00 to 0xFF inclusive
    for key in range(256):
        if all(chr(key ^ byte) in valid_chars for byte in byte_group):
            confirmed_keys.append(key)
    return confirmed_keys

# Enumerate the key length and the candidate key bytes for each position
cipher_bytes = hex_to_ascii(ciphertext)
actual_key_length = 0
vigenere_like_keys = []
for length in range(1, 14):
    temp_keys = []
    for index in range(0, length):
        byte_group = cipher_bytes[index::length]
        keys = find_possible_keys(byte_group)
        if not keys:
            break
        temp_keys.append(keys)
    # Accept a length only if every key position has at least one candidate
    if len(temp_keys) == length:
        actual_key_length = length
        vigenere_like_keys = temp_keys
        print(length)
        print(f"key:{temp_keys}")

# Recover the plaintext using the first candidate for each key position
decrypted_text = ''
for i in range(0, len(cipher_bytes)):
    decrypted_text = decrypted_text + chr(cipher_bytes[i] ^ vigenere_like_keys[i % actual_key_length][0])
print(decrypted_text)

T3

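Problem

Problem link: T3-the first six challenges of the cryptopals crypto challenges, Set 1

Analysis

Since the first five challenges all build up to the sixth, I analyze the sixth directly. Repeating-key XOR is in fact exactly the same as the Vigenère-like cipher in T2. Here I follow the approach suggested by the challenge itself.

  1. Enumerate the key length key_size from 2 to 40 bytes.
  2. Implement a function that computes the Hamming distance between two strings.
  3. For each key_size, take the first 4 blocks of key_size bytes, sum the Hamming distances over all pairs, divide by the number of pairs, then divide by key_size to get a normalized distance.
  4. Sort the key_size values by normalized distance; the key_size with the smallest normalized distance is the most likely key length.
  5. Split the ciphertext into n blocks of key_size bytes, then collect the i-th byte of every block into a new block. This yields key_size new blocks of length n, each encrypted with a single key byte.
  6. For each new block, enumerate the key byte, score the resulting plaintext by letter frequency, and take the key byte whose plaintext scores highest as the most likely one. This gives the key byte for each block.
  7. Concatenate these key bytes to obtain the full key, after which decryption is straightforward.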
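Steps 2 and 5 are easy to sanity-check in isolation. The sketch below is a standalone illustration, with the helper names `hamming_distance` and `transpose` chosen for this example; the test strings for the Hamming distance are the ones given in the cryptopals challenge, which states the expected distance as 37.

```python
def hamming_distance(a: bytes, b: bytes) -> int:
    """Number of differing bits between two equal-length byte strings."""
    assert len(a) == len(b)
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

# Check value stated in the challenge text
assert hamming_distance(b"this is a test", b"wokka wokka!!!") == 37

def transpose(data: bytes, key_size: int):
    """Group bytes by position modulo key_size (step 5)."""
    return [data[i::key_size] for i in range(key_size)]

# With key_size = 3, bytes 0, 3, 6 share the first key byte, and so on
print(transpose(b"abcdefgh", 3))  # [b'adg', b'beh', b'cf']
```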

The full code is as follows:

import base64
import itertools

# Letter frequency table (English letters plus space)
letter_frequency = {
    'a': 0.0651738, 'b': 0.0124248, 'c': 0.0217339,
    'd': 0.0349835, 'e': 0.1041442, 'f': 0.0197881,
    'g': 0.0158610, 'h': 0.0492888, 'i': 0.0558094,
    'j': 0.0009033, 'k': 0.0050529, 'l': 0.0331490,
    'm': 0.0202124, 'n': 0.0564513, 'o': 0.0596302,
    'p': 0.0137645, 'q': 0.0008606, 'r': 0.0497563,
    's': 0.0515760, 't': 0.0729357, 'u': 0.0225134,
    'v': 0.0082903, 'w': 0.0171272, 'x': 0.0013692,
    'y': 0.0145984, 'z': 0.0007836, ' ': 0.1918182
}

# Score a byte string by summing the frequencies of its characters
def calculate_text_score(byte_array):
    score = 0
    for byte in byte_array:
        score += letter_frequency.get(chr(byte).lower(), 0)
    return score

# XOR every byte with a single key byte
def xor_with_single_char(byte_array, key_byte):
    result = b''
    for byte in byte_array:
        result += bytes([byte ^ key_byte])
    return result

# Break single-byte XOR by frequency scoring; returns the best-scoring candidate
def brute_force_single_char_xor(ciphertext):
    results = []
    for key in range(256):
        plaintext = xor_with_single_char(ciphertext, key)
        score = calculate_text_score(plaintext)
        results.append({
            'key': key,
            'score': score,
            'plaintext': plaintext
        })
    return sorted(results, key=lambda x: x['score'], reverse=True)[0]

# XOR a byte string with a repeating key
def xor_with_repeating_key(byte_array, key):
    result = b''
    key_length = len(key)
    for i, byte in enumerate(byte_array):
        result += bytes([byte ^ key[i % key_length]])
    return result

# Compute the Hamming distance (number of differing bits) between two byte strings
def compute_hamming_distance(string1, string2):
    assert len(string1) == len(string2)
    distance = 0
    for byte1, byte2 in zip(string1, string2):
        xor_result = byte1 ^ byte2
        distance += bin(xor_result).count('1')
    return distance

def break_repeating_key_xor(ciphertext):
    key_size_distances = {}
    for key_size in range(2, 41):
        chunks = [ciphertext[i:i + key_size] for i in range(0, len(ciphertext), key_size)][:4]
        total_distance = 0
        pairs = itertools.combinations(chunks, 2)
        for chunk1, chunk2 in pairs:
            total_distance += compute_hamming_distance(chunk1, chunk2)
        average_distance = total_distance / 6
        normalized_distance = average_distance / key_size
        key_size_distances[key_size] = normalized_distance
    
    best_key_sizes = sorted(key_size_distances, key=key_size_distances.get)[:3]
    print(best_key_sizes)

    decrypted_plaintexts = []
    for size in best_key_sizes:
        key = b''
        for i in range(size):
            block = b''
            for j in range(i, len(ciphertext), size):
                block += bytes([ciphertext[j]])
            key += bytes([brute_force_single_char_xor(block)['key']])
        decrypted_plaintexts.append((xor_with_repeating_key(ciphertext, key), key))
    
    return max(decrypted_plaintexts, key=lambda x: calculate_text_score(x[0]))

# Decrypt
with open("ciphertext.txt") as file:
    encoded_data = base64.b64decode(file.read())
decrypted_result = break_repeating_key_xor(encoded_data)
print("The Key is", decrypted_result[1].decode())
print("The Length is", len(decrypted_result[1].decode()))
print(decrypted_result[0].decode().rstrip())

T4

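Problem

Crack the password based on the key-press evidence on the keyboard shown in the figure below. The SHA-1 hash of the password is 67ae1a64661ac8b4494666f58c4822408dd0a3e4
[Image: keyboard showing which keys were used to enter the password]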

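Problem link: T4-MTC3 Cracking SHA1-Hashed Passwords

Approach

Solving this rigorously is actually hard, so I used the power of "social engineering" to get the answer first, and then made some "reasonable assumptions" about the key analysis: assume the numeric keys on the right were used only to move up, down, left, and right, and assume each key on the left was pressed exactly once. With that, it is enough to enumerate all possible passwords, compute each one's hash, and compare it with the given value.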
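Under these assumptions the search space is small: each of the 8 keys contributes one of two characters (with or without Shift), and the 8 characters can appear in any order. The sketch below counts the candidates and runs the same search on a toy 3-key example; `crack`, `toy_sets`, and the toy password are invented for illustration and have nothing to do with the real answer.

```python
import hashlib
import itertools
import math

# 2 choices per key, times every ordering of the 8 chosen characters
num_keys = 8
total = 2 ** num_keys * math.factorial(num_keys)
print(total)  # 10321920

def crack(char_sets, target_hash):
    """Try every shift choice and every ordering; return the matching string or None."""
    for choice in itertools.product(*char_sets):
        for perm in itertools.permutations(choice):
            candidate = "".join(perm)
            if hashlib.sha1(candidate.encode()).hexdigest() == target_hash:
                return candidate
    return None

# Toy example with 3 keys: 2**3 * 3! = 48 candidates
toy_sets = [["a", "A"], ["b", "B"], ["2", "@"]]
toy_target = hashlib.sha1(b"B@a").hexdigest()
print(crack(toy_sets, toy_target))  # B@a
```

About ten million SHA-1 evaluations is well within reach of a plain Python loop, which is why no smarter pruning is needed here.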

The full code is as follows:

import hashlib
import itertools
import time

SHA1_HASH_TARGET = "67ae1a64661ac8b4494666f58c4822408dd0a3e4"
CHAR_SETS = [['Q', 'q'], ['W', 'w'], ['5', '%'], ['8', '('], ['=', '0'], ['I', 'i'], ['*', '+'], ['n', 'N']]

def sha1_encrypt(input_string):
    sha = hashlib.sha1(input_string.encode())
    hashed_value = sha.hexdigest()
    return hashed_value

# Brute force: pick one of the two characters for each key, then try every ordering
start_time = time.time()
for choice in itertools.product(*CHAR_SETS):
    for perm in itertools.permutations(choice, 8):
        candidate_password = "".join(perm)
        hashed_candidate = sha1_encrypt(candidate_password)
        if hashed_candidate == SHA1_HASH_TARGET:
            print("password:", candidate_password)
            end_time = time.time()
            print(f"time:{end_time - start_time}s")
            exit(0)

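Summary

Lab 1 mainly covers classical ciphers. The basic method is brute-force enumeration, and the implementation is not hard, but you do need to be fluent in converting between the various encodings.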
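As a small reference for the conversions used throughout this lab, here is a sketch using only the Python standard library. The sample value "Hello" is chosen for illustration.

```python
import base64

raw = bytes.fromhex("48656c6c6f")          # hex string -> bytes
assert raw == b"Hello"
assert raw.hex() == "48656c6c6f"           # bytes -> hex string

assert base64.b64encode(raw) == b"SGVsbG8="   # bytes -> Base64
assert base64.b64decode("SGVsbG8=") == raw    # Base64 -> bytes

# Going through int() and hex() drops leading zero bytes,
# while bytes.fromhex() keeps them
assert hex(int("00ff", 16)) == "0xff"
assert bytes.fromhex("00ff").hex() == "00ff"

# Iterating over bytes yields integers, so XOR works directly
print(bytes(x ^ y for x, y in zip(b"Hello", b"     ")))  # b'hELLO'
```

Working on `bytes` rather than on hex strings or big integers avoids the leading-zero pitfall when XORing ciphertexts.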

The problems and code are hosted on GitHub in Cryptography_assignment

posted @ 2024-10-16 21:06  acetyl_lwx