Python Challenge - ocr.py

http://www.pythonchallenge.com/pc/def/ocr.html

recognize the characters. maybe they are in the book,
but MAYBE they are in the page source.

Viewing the page source reveals the following:

<!--
find rare characters in the mess below:
-->

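The hint "find rare characters in the mess below" tells us that the clue is in the second HTML comment of the page source, the block of jumbled characters that follows the hint. Without loss of generality, the Python code can be written as follows: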

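import re
import urllib.request

# Read the page source with the urllib module (Python 3: urllib.request)
with urllib.request.urlopen("http://www.pythonchallenge.com/pc/def/ocr.html") as sock:
    source = sock.read().decode("utf-8")

# The re.S flag lets the dot (.) match any character, including newlines
data = re.findall(r'<!--(.+?)-->', source, re.S)
# data[1] is the second comment (the "mess"); keep only the letters in it
charList = re.findall(r'[a-zA-Z]', data[1])

# Join the list into a space-separated string and print it
print(' '.join(charList))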

Output:

“e q u a l i t y”
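
The regular expression above works because the rare characters happen to be letters. A more literal reading of the hint is to count how often each character occurs and keep only the rare ones. The following is a minimal sketch of that idea, not the approach used in the original solution. It runs offline on a small made-up page: the names `NOISE`, `HIDDEN`, `sample_source` and `rare_characters` are illustrative assumptions, and the threshold of one occurrence is an assumption that fits this sample.

```python
import re
from collections import Counter

# Hypothetical stand-in for the page source (not the real page content):
# a hint comment followed by a second comment holding the "mess".
NOISE = "%$@_^#)&!+]*}[({"
HIDDEN = "equality"
# One hidden letter is appended to each repetition of the noise string.
mess = "\n".join(NOISE + ch for ch in HIDDEN)
sample_source = (
    "<!--\nfind rare characters in the mess below:\n-->\n\n"
    "<!--\n" + mess + "\n-->"
)


def rare_characters(text, max_count=1):
    """Return the characters of text that occur at most max_count times,
    in their order of first appearance."""
    counts = Counter(text)
    return "".join(ch for ch in text if counts[ch] <= max_count)


# re.S lets the dot match newlines, so each multi-line comment is captured.
data = re.findall(r'<!--(.+?)-->', sample_source, re.S)
print(rare_characters(data[1]))
```

On the real page the threshold may need adjusting, so it is worth printing the `Counter` first to see which characters actually stand out.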

Next level: http://www.pythonchallenge.com/pc/def/equality.html

urllib module

posted @ 2013-10-19 15:57 zhxiang
