Component Object Model (COM), pywin32, Object Linking and Embedding (OLE), keyboard and mouse control
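The Component Object Model (COM) is Microsoft's binary-interface standard for software components. It enables inter-process communication and dynamic object creation across programming languages. COM is the basis for several Microsoft technologies and frameworks, including OLE, OLE Automation, ActiveX, COM+, DCOM, the Windows shell, DirectX, and the Windows Runtime. COM is independent of the implementation language, so an object implemented with it can be used in an environment different from the one it was developed in, even across machine boundaries. For well-authored objects, COM allows reuse without any knowledge of the internal implementation, because it forces implementers to provide well-defined interfaces that are separate from the implementation. Because languages differ in their memory-allocation semantics, COM objects manage their own creation and destruction through reference counting. Casting between the different interfaces of an object is done with the QueryInterface method.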
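The two mechanisms named above, reference counting and QueryInterface, can be illustrated with a toy model in plain Python. This is only a sketch: the class ToyComObject, its method names, and the string interface identifiers are hypothetical, and real COM uses GUIDs, vtables, and HRESULT return codes instead.

```python
# Toy model of two COM ideas: reference counting and QueryInterface.
# All names here (ToyComObject, IID_* strings) are hypothetical.

IID_IUNKNOWN = "IUnknown"
IID_IGREETER = "IGreeter"


class ToyComObject:
    """An object that manages its own lifetime through a reference count."""

    def __init__(self):
        self._ref_count = 1          # the creator holds the first reference
        self.destroyed = False
        self._interfaces = {IID_IUNKNOWN, IID_IGREETER}

    def add_ref(self):
        self._ref_count += 1
        return self._ref_count

    def release(self):
        self._ref_count -= 1
        if self._ref_count == 0:
            self.destroyed = True    # a real object would free its resources here
        return self._ref_count

    def query_interface(self, iid):
        """Return a new reference if the interface is supported, else None."""
        if iid not in self._interfaces:
            return None              # real COM reports E_NOINTERFACE instead
        self.add_ref()               # a successful query hands out a new reference
        return self

    def greet(self):
        return "hello from IGreeter"


obj = ToyComObject()                         # reference count is 1
greeter = obj.query_interface(IID_IGREETER)  # reference count is 2
print(greeter.greet())
greeter.release()                            # reference count is 1
obj.release()                                # reference count is 0
print(obj.destroyed)                         # True
```

The caller never frees the object directly; it only balances each acquired reference with a release, which is what lets languages with different memory models share the same object.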
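Object Linking and Embedding (OLE) is a technology that lets applications create compound documents containing content from different sources[3]. OLE is more than desktop application integration: it also defines and implements a mechanism that lets applications "link" to one another as software "objects" (collections of data and the functions that operate on that data). This linking mechanism and protocol is called the Component Object Model (COM). OLE can be used to create compound documents, which contain data of different types created in different source applications, so text, sound, images, tables, and applications can be combined in one document.
OLE support also introduces security problems. For example, in Outlook 2002 and later, an attacker who embeds a dangerous OLE object in an email can disguise it freely, which may deceive the user and lead to a security incident.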
mhammond/pywin32 on GitHub: Python for Windows (pywin32) Extensions
The Python for Win32 (pywin32) extensions provide access to many of the Windows APIs from Python, including COM support.
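As a rough illustration of the COM support mentioned above, the following sketch asks a COM object for the Windows directory. It assumes a Windows machine with pywin32 installed and returns None otherwise. The helper name windows_folder_via_com is hypothetical, and Scripting.FileSystemObject is used only because it is a COM class that normally ships with Windows.

```python
import sys


def windows_folder_via_com():
    """Return the Windows directory path via COM, or None if unavailable."""
    if sys.platform != "win32":
        return None                      # COM only exists on Windows
    try:
        import win32com.client           # provided by pywin32
    except ImportError:
        return None                      # pywin32 is not installed
    # Create the COM object by its ProgID and call it like a Python object.
    fso = win32com.client.Dispatch("Scripting.FileSystemObject")
    return fso.GetSpecialFolder(0).Path  # 0 selects the Windows folder


print(windows_folder_via_com())
```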
Converting a docx document to an HTML page: extracting text, images, and HTML structure from Word doc/docx files
https://pydocx.readthedocs.io/en/latest/usage.html
from pydocx import PyDocX
# Pass in a path
html = PyDocX.to_html('file.docx')
# Pass in a file object
html = PyDocX.to_html(open('file.docx', 'rb'))
# Pass in a file-like object
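from io import BytesIO  # cStringIO exists only in Python 2; a .docx file is binary
buf = BytesIO()
with open('file.docx', 'rb') as f:
    buf.write(f.read())
buf.seek(0)  # rewind so the buffer is read from the start
html = PyDocX.to_html(buf)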
https://github.com/mhammond/pywin32/blob/main/com/win32com/demos/dump_clipboard.py
import pythoncom
import win32con

formats = """CF_TEXT CF_BITMAP CF_METAFILEPICT CF_SYLK CF_DIF CF_TIFF
            CF_OEMTEXT CF_DIB CF_PALETTE CF_PENDATA CF_RIFF CF_WAVE
            CF_UNICODETEXT CF_ENHMETAFILE CF_HDROP CF_LOCALE CF_MAX
            CF_OWNERDISPLAY CF_DSPTEXT CF_DSPBITMAP CF_DSPMETAFILEPICT
            CF_DSPENHMETAFILE""".split()
# Map each numeric clipboard format value back to its CF_* name.
format_name_map = {}
for f in formats:
    val = getattr(win32con, f)
    format_name_map[val] = f

tymeds = [attr for attr in pythoncom.__dict__ if attr.startswith("TYMED_")]


def DumpClipboard():
    do = pythoncom.OleGetClipboard()
    print("Dumping all clipboard formats...")
    for fe in do.EnumFormatEtc():
        fmt, td, aspect, index, tymed = fe
        tymeds_this = [
            getattr(pythoncom, t) for t in tymeds if tymed & getattr(pythoncom, t)
        ]
        print("Clipboard format", format_name_map.get(fmt, str(fmt)))
        for t_this in tymeds_this:
            # As we are enumerating there should be no need to call
            # QueryGetData, but we do anyway!
            fetc_query = fmt, td, aspect, index, t_this
            try:
                do.QueryGetData(fetc_query)
            except pythoncom.com_error:
                print("Eeek - QGD indicated failure for tymed", t_this)
            # now actually get it.
            try:
                medium = do.GetData(fetc_query)
            except pythoncom.com_error as exc:
                print("Failed to get the clipboard data:", exc)
                continue
            if medium.tymed == pythoncom.TYMED_GDI:
                data = "GDI handle %d" % medium.data
            elif medium.tymed == pythoncom.TYMED_MFPICT:
                data = "METAFILE handle %d" % medium.data
            elif medium.tymed == pythoncom.TYMED_ENHMF:
                data = "ENHMETAFILE handle %d" % medium.data
            elif medium.tymed == pythoncom.TYMED_HGLOBAL:
                data = "%d bytes via HGLOBAL" % len(medium.data)
            elif medium.tymed == pythoncom.TYMED_FILE:
                # Format the medium's own data; the bare name `data` is not set yet here.
                data = "filename '%s'" % medium.data
            elif medium.tymed == pythoncom.TYMED_ISTREAM:
                stream = medium.data
                stream.Seek(0, 0)
                total = 0  # byte count; avoids shadowing the built-in `bytes`
                while 1:
                    chunk = stream.Read(4096)
                    if not chunk:
                        break
                    total += len(chunk)
                data = "%d bytes via IStream" % total
            elif medium.tymed == pythoncom.TYMED_ISTORAGE:
                data = "a IStorage"
            else:
                data = "*** unknown tymed!"
            print(" -> got", data)
    do = None


if __name__ == "__main__":
    DumpClipboard()
    if pythoncom._GetInterfaceCount() + pythoncom._GetGatewayCount():
        print(
            "XXX - Leaving with %d/%d COM objects alive"
            % (pythoncom._GetInterfaceCount(), pythoncom._GetGatewayCount())
        )
Keyboard and Mouse Control
The x, y coordinates used by PyAutoGUI have their 0, 0 origin in the top-left corner of the screen. The x coordinates increase going to the right (just as in mathematics) but the y coordinates increase going down (the opposite of mathematics). On a screen that is 1920 x 1080 pixels in size, coordinates 0, 0 are the top-left pixel and 1919, 1079 are the bottom-right pixel.
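The coordinate convention above can be made concrete with two small helpers that need no screen at all. Both function names are hypothetical and are not part of PyAutoGUI; the sketch only assumes the top-left origin and the 1920 x 1080 example described in the text.

```python
def math_to_screen(x, y, screen_height):
    """Convert a point with a bottom-left origin (y up) to screen coordinates (y down)."""
    return x, (screen_height - 1) - y


def clamp_to_screen(x, y, screen_width, screen_height):
    """Pull a point back inside the valid range 0..width-1, 0..height-1."""
    return (min(max(x, 0), screen_width - 1),
            min(max(y, 0), screen_height - 1))


print(math_to_screen(0, 0, 1080))              # (0, 1079): bottom-left pixel
print(clamp_to_screen(5000, -20, 1920, 1080))  # (1919, 0): top-right pixel
```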
>>> import pyautogui
>>> screenWidth, screenHeight = pyautogui.size() # Returns two integers, the width and height of the screen. (The primary monitor, in multi-monitor setups.)
>>> currentMouseX, currentMouseY = pyautogui.position() # Returns two integers, the x and y of the mouse cursor's current position.
>>> pyautogui.moveTo(100, 150) # Move the mouse to the x, y coordinates 100, 150.
>>> pyautogui.click() # Click the mouse at its current location.
>>> pyautogui.click(200, 220) # Click the mouse at the x, y coordinates 200, 220.
>>> pyautogui.move(None, 10) # Move mouse 10 pixels down, that is, move the mouse relative to its current position.
>>> pyautogui.doubleClick() # Double click the mouse.
>>> pyautogui.moveTo(500, 500, duration=2, tween=pyautogui.easeInOutQuad) # Use tweening/easing function to move mouse over 2 seconds.
>>> pyautogui.write('Hello world!', interval=0.25) # Type with quarter-second pause in between each key.
>>> pyautogui.press('esc') # Simulate pressing the Escape key.
>>> pyautogui.keyDown('shift')
>>> pyautogui.write(['left', 'left', 'left', 'left', 'left', 'left'])
>>> pyautogui.keyUp('shift')
>>> pyautogui.hotkey('ctrl', 'c')
Display Message Boxes
>>> import pyautogui
>>> pyautogui.alert('This is an alert box.')
'OK'
>>> pyautogui.confirm('Shall I proceed?')
'Cancel'
>>> pyautogui.confirm('Enter option.', buttons=['A', 'B', 'C'])
'B'
>>> pyautogui.prompt('What is your name?')
'Al'
>>> pyautogui.password('Enter password (text will be hidden)')
'swordfish'
Screenshot Functions
(PyAutoGUI uses Pillow for image-related features.)
>>> import pyautogui
>>> im1 = pyautogui.screenshot()
>>> im1.save('my_screenshot.png')
>>> im2 = pyautogui.screenshot('my_screenshot2.png')
You can also locate where an image is on the screen:
>>> import pyautogui
>>> button7location = pyautogui.locateOnScreen('button.png') # returns (left, top, width, height) of matching region
>>> button7location
(1416, 562, 50, 41)
>>> buttonx, buttony = pyautogui.center(button7location)
>>> buttonx, buttony
(1441, 582)
>>> pyautogui.click(buttonx, buttony) # clicks the center of where the button was found
The locateCenterOnScreen() function returns the center of this match region:
>>> import pyautogui
>>> buttonx, buttony = pyautogui.locateCenterOnScreen('button.png') # returns (x, y) of the center of the matching region
>>> buttonx, buttony
(1441, 582)
>>> pyautogui.click(buttonx, buttony) # clicks the center of where the button was found
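The relationship between the region and its center in the examples above can be checked by hand. The sketch below assumes the center is the top-left corner plus half the width and half the height, rounded down. The helper box_center is hypothetical and is not PyAutoGUI's own code, but it reproduces the numbers shown.

```python
def box_center(box):
    """Return the (x, y) center of a (left, top, width, height) region."""
    left, top, width, height = box
    return left + width // 2, top + height // 2


print(box_center((1416, 562, 50, 41)))  # (1441, 582)
```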
How Does PyAutoGUI Work?
The three major operating systems (Windows, macOS, and Linux) each have different ways to programmatically control the mouse and keyboard. This can often involve confusing, obscure, and deeply technical details. The job of PyAutoGUI is to hide all of this complexity behind a simple API.
- On Windows, PyAutoGUI accesses the Windows API (also called the WinAPI or win32 API) through the built-in ctypes module. The nicewin module at https://github.com/asweigart/nicewin provides a demonstration for how Windows API calls can be made through Python.
- On macOS, PyAutoGUI uses the rubicon-objc module to access the Cocoa API.
- On Linux, PyAutoGUI uses the Xlib module to access the X11 or X Window System.
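To give a feel for the Windows case in the list above, here is a minimal sketch of calling a Windows API function through ctypes. It is not PyAutoGUI's implementation. The function name cursor_position and the hand-written POINT class are assumptions for illustration, and the call is skipped on other platforms.

```python
import ctypes
import sys


class POINT(ctypes.Structure):
    """Mirror of the Win32 POINT struct: two LONG fields."""
    _fields_ = [("x", ctypes.c_long), ("y", ctypes.c_long)]


def cursor_position():
    """Return (x, y) of the mouse cursor on Windows, or None elsewhere."""
    if sys.platform != "win32":
        return None
    point = POINT()
    # GetCursorPos fills the struct and returns nonzero on success.
    if not ctypes.windll.user32.GetCursorPos(ctypes.byref(point)):
        raise ctypes.WinError()
    return point.x, point.y


print(cursor_position())
```

A library such as PyAutoGUI wraps calls of this kind so that the same Python function works on every platform.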
https://pywinauto.readthedocs.io/en/latest/
>>> from pywinauto.application import Application
>>> app = Application(backend="uia").start("notepad.exe")
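>>> app.UntitledNotepad.type_keys("%FX")  # Alt+F then X: open the File menu and choose Exit
>>> # The lines below are a separate example and assume Notepad is still open.
>>> app.UntitledNotepad.menu_select("File->SaveAs")
>>> app.SaveAs.ComboBox5.select("UTF-8")
>>> app.SaveAs.edit1.set_text("Example-utf8.txt")
>>> app.SaveAs.Save.click()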
