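CDP Automation in Practice: Controlling Electron Desktop Apps with the Chrome DevTools Protocol
The pain points of Electron automation
An Electron app is essentially "a customized Chromium browser plus a Node.js runtime". Most mainstream UI automation frameworks handle this hybrid poorly:
Problems with common approaches: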
| Approach | Problem |
|----------|---------|
| Selenium | Browser-first; driving Electron needs a ChromeDriver build that matches the app's Chromium version |
| PyAutoGUI | Based on screen coordinates; breaks as soon as resolution or scaling changes |
| WinAppDriver | Cannot reach through the Chromium render layer to the DOM |
| Spectron (deprecated) | Abandoned upstream and no longer maintained |
| Playwright | Electron support is experimental; some APIs are unavailable |
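The core tension: Electron's UI is built with web technology (HTML/CSS/JS), yet it does not run in a standard browser. What is needed is a channel that connects directly to the Chromium engine inside Electron, and that is exactly what CDP provides.
How CDP works
What is CDP
The Chrome DevTools Protocol (CDP) is the debugging protocol exposed by Chromium. The Elements, Console, and Network panels you see in Chrome DevTools (F12) all talk to the browser through CDP under the hood.
CDP architecture: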
┌──────────────────────────────────────────────────────────────┐
│ Electron app                                                 │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Chromium engine                                        │  │
│  │ ┌───────────────┐ ┌───────────────┐ ┌───────────────┐  │  │
│  │ │   Renderer    │ │   Renderer    │ │      GPU      │  │  │
│  │ │    Process    │ │    Process    │ │    Process    │  │  │
│  │ │(BrowserWindow)│ │(BrowserWindow)│ │               │  │  │
│  │ └───────────────┘ └───────────────┘ └───────────────┘  │  │
│  │                                                        │  │
│  │ ┌────────────────────────────────────────────────────┐ │  │
│  │ │ Browser process                                    │ │  │
│  │ │  ┌────────────────────────────────────────┐        │ │  │
│  │ │  │ DevTools server (CDP)                  │        │ │  │
│  │ │  │ ws://127.0.0.1:9222                    │        │ │  │
│  │ │  └────────────────────────────────────────┘        │ │  │
│  │ └────────────────────────────────────────────────────┘ │  │
│  └────────────────────────────────────────────────────────┘  │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐  │
│  │ Node.js runtime                                        │  │
│  │ File system · Network · System APIs · Native modules   │  │
│  └────────────────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘
                               ▲
                               │ WebSocket connection
                               ▼
                    ┌──────────────────────┐
                    │  Automation script   │
                    │    (Python/Node)     │
                    └──────────────────────┘
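CDP's domain model
CDP is organized by function into domains, each exposing a set of methods and events:
| Domain | Purpose | Methods commonly used in automation |
|--------|---------|-------------------------------------|
| Page | Page management | navigate, reload, captureScreenshot |
| DOM | DOM operations | getDocument, querySelector, getOuterHTML |
| Runtime | JS execution | evaluate, callFunctionOn |
| Input | Input simulation | dispatchMouseEvent, dispatchKeyEvent |
| Network | Network monitoring and interception | enable; request interception goes through Fetch.enable |
| Emulation | Device emulation | setDeviceMetricsOverride |
| Target | Target management | getTargets, attachToTarget |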
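To make the method/event split concrete, here is a minimal sketch of how CDP messages are framed as JSON over the WebSocket. A command carries a client-chosen `id`, its response echoes that `id` with either `result` or `error`, and an event has a `method` but no `id`. The helper names `build_command` and `classify_message` are illustrative only and do not come from any library.

```python
import json
from itertools import count
from typing import Optional

_ids = count(1)  # client-side counter; ids only need to be unique per connection


def build_command(method: str, params: Optional[dict] = None) -> str:
    """Serialize one CDP command such as 'Page.navigate' to a JSON string."""
    return json.dumps({"id": next(_ids), "method": method, "params": params or {}})


def classify_message(raw: str) -> tuple:
    """Classify an incoming frame as 'response', 'error', or 'event'."""
    msg = json.loads(raw)
    if "id" in msg:
        # Replies to commands echo the id and contain either result or error
        return ("error" if "error" in msg else "response", msg)
    # Frames without an id are events pushed by the browser
    return ("event", msg)


print(build_command("Page.navigate", {"url": "about:blank"}))
print(classify_message('{"method": "Page.loadEventFired", "params": {}}')[0])
```

Keeping framing separate from transport like this makes the protocol logic easy to unit-test without a running Electron app.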
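Launch configuration: opening the remote debugging port
Basic configuration
An Electron app must be started with a specific flag to open the CDP port:
# Option 1: launch directly from the command line
./your-electron-app --remote-debugging-port=9222
# Option 2: with extra debugging flags
# Caution: binding to 0.0.0.0 exposes the debugging port to the network, and
# --no-sandbox disables Chromium's sandbox; use both only on isolated test machines
./your-electron-app \
    --remote-debugging-port=9222 \
    --remote-debugging-address=0.0.0.0 \
    --no-sandbox
# Option 3: an already packaged app (Windows)
"C:\Program Files\YourApp\YourApp.exe" --remote-debugging-port=9222
# Option 4: macOS
open -a "YourApp" --args --remote-debugging-port=9222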
Verifying that the CDP port is open
# Check that the port is listening
curl http://127.0.0.1:9222/json/version
# Expected response:
{
    "Browser": "Chrome/120.0.6099.0",
    "Protocol-Version": "1.3",
    "webSocketDebuggerUrl": "ws://127.0.0.1:9222/devtools/browser/xxxx-xxxx"
}
# List all debuggable pages/windows
curl http://127.0.0.1:9222/json/list
Configuring it in Electron code (if you have access to the source)
// main.js - Electron main process
const { app, BrowserWindow } = require('electron');

// Switches must be appended before the app's 'ready' event fires
app.commandLine.appendSwitch('remote-debugging-port', '9222');
// Caution: 0.0.0.0 exposes the port to the network; omit this line to stay on localhost
app.commandLine.appendSwitch('remote-debugging-address', '0.0.0.0');

app.on('ready', () => {
    const mainWindow = new BrowserWindow({
        width: 1200,
        height: 800,
        webPreferences: {
            nodeIntegration: false,
            contextIsolation: true
        }
    });
    mainWindow.loadFile('index.html');
    // The page's WebSocket URL contains a CDP target id, which is not
    // webContents.id; read the real URL from the /json/list endpoint
    console.log('CDP targets listed at: http://127.0.0.1:9222/json/list');
});
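WebSocket connection: establishing CDP communication
Python implementation
import asyncio
import json

import aiohttp
import websockets


class CDPClient:
    """Lightweight CDP client"""

    def __init__(self, ws_url: str):
        self.ws_url = ws_url
        self.ws = None
        self.message_id = 0
        self.pending = {}  # command id -> Future resolved by the receive loop
        self._recv_task = None

    async def connect(self):
        """Open the WebSocket connection"""
        # max_size=None lifts the default frame size limit, which large
        # base64 screenshots can exceed
        self.ws = await websockets.connect(self.ws_url, max_size=None)
        # Start the receive loop; keep a reference so the task is not garbage-collected
        self._recv_task = asyncio.create_task(self._receive_loop())

    async def _receive_loop(self):
        """Continuously receive CDP responses (events have no id and are ignored here)"""
        async for message in self.ws:
            data = json.loads(message)
            future = self.pending.pop(data.get('id'), None)
            if future is not None and not future.done():
                future.set_result(data)

    async def send_command(self, method: str, params: dict = None) -> dict:
        """Send a CDP command and wait for its response"""
        self.message_id += 1
        msg_id = self.message_id  # local copy, so concurrent callers never share an id
        msg = {
            "id": msg_id,
            "method": method,
            "params": params or {}
        }
        future = asyncio.get_running_loop().create_future()
        self.pending[msg_id] = future
        await self.ws.send(json.dumps(msg))
        # Wait for the response without busy-polling
        response = await future
        if 'error' in response:
            raise RuntimeError(f"CDP error for {method}: {response['error']}")
        return response

    async def close(self):
        await self.ws.close()


# Usage example
async def main():
    # 1. Fetch the WebSocket URL
    async with aiohttp.ClientSession() as session:
        async with session.get('http://127.0.0.1:9222/json/list') as resp:
            targets = await resp.json()
    # Pick the first target of type "page"
    page_target = next((t for t in targets if t['type'] == 'page'), None)
    if page_target is None:
        raise RuntimeError("No page target found")
    ws_url = page_target['webSocketDebuggerUrl']
    # 2. Connect over CDP
    cdp = CDPClient(ws_url)
    await cdp.connect()
    # 3. Enable the required domains
    await cdp.send_command("Page.enable")
    await cdp.send_command("DOM.enable")
    await cdp.send_command("Runtime.enable")
    # 4. Read the page title
    result = await cdp.send_command("Runtime.evaluate", {
        "expression": "document.title"
    })
    print(f"Page title: {result['result']['result']['value']}")
    await cdp.close()


asyncio.run(main())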
Node.js implementation (using chrome-remote-interface)
const CDP = require('chrome-remote-interface');

async function main() {
    // Connect to the Electron app
    const client = await CDP({
        host: '127.0.0.1',
        port: 9222,
        target: (targets) => {
            // Pick the main window
            return targets.find(t => t.type === 'page' && t.url !== 'about:blank');
        }
    });
    const { Page, DOM, Runtime } = client;
    // Enable the required domains
    await Promise.all([
        Page.enable(),
        DOM.enable(),
        Runtime.enable()
    ]);
    // Read the page title
    const { result } = await Runtime.evaluate({
        expression: 'document.title'
    });
    console.log('Page title:', result.value);
    await client.close();
}

main().catch(console.error);
DOM operations: locating and interacting with elements
Querying the DOM tree
class ElementNotFoundError(Exception):
    """Raised when a selector matches no element"""


async def get_element(cdp: CDPClient, selector: str) -> dict:
    """Locate an element by CSS selector"""
    # Get the document root node
    doc = await cdp.send_command("DOM.getDocument")
    root_node_id = doc['result']['root']['nodeId']
    # Query by selector
    result = await cdp.send_command("DOM.querySelector", {
        "nodeId": root_node_id,
        "selector": selector
    })
    node_id = result['result']['nodeId']
    if node_id == 0:  # nodeId 0 means "no match"
        raise ElementNotFoundError(f"Element not found: {selector}")
    # Fetch the element's details
    element = await cdp.send_command("DOM.describeNode", {
        "nodeId": node_id,
        "depth": 0
    })
    return {
        "nodeId": node_id,
        "nodeName": element['result']['node']['nodeName'],
        # Flat list: [name1, value1, name2, value2, ...]
        "attributes": element['result']['node'].get('attributes', [])
    }


async def get_element_box(cdp: CDPClient, node_id: int) -> dict:
    """Get an element's position and size"""
    result = await cdp.send_command("DOM.getBoxModel", {
        "nodeId": node_id
    })
    model = result['result']['model']
    # Corner coordinates of the content area: [x1,y1, x2,y2, x3,y3, x4,y4]
    content = model['content']
    # Compute the center point as the mean of the four corners
    center_x = sum(content[0::2]) / 4
    center_y = sum(content[1::2]) / 4
    return {
        "center": (center_x, center_y),
        "width": model['width'],
        "height": model['height'],
        "content": content
    }
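`DOM.describeNode` reports attributes as a flat array of alternating names and values, so the `attributes` field returned by `get_element` cannot be looked up by name directly. A small sketch of turning it into a dict follows; the helper name `attrs_to_dict` is hypothetical.

```python
def attrs_to_dict(flat_attrs: list) -> dict:
    """Convert CDP's flat [name1, value1, name2, value2, ...] list into a dict."""
    if len(flat_attrs) % 2 != 0:
        raise ValueError("expected alternating attribute names and values")
    # Even positions hold names, odd positions hold the matching values
    return dict(zip(flat_attrs[0::2], flat_attrs[1::2]))


attrs = attrs_to_dict(["id", "login-button", "class", "btn primary"])
print(attrs["class"])  # btn primary
```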
Simulating user input
import asyncio
import json


async def click_element(cdp: CDPClient, selector: str):
    """Click the given element"""
    element = await get_element(cdp, selector)
    box = await get_element_box(cdp, element['nodeId'])
    x, y = box['center']
    # Simulate a mouse click (mousePressed → mouseReleased)
    for event_type in ("mousePressed", "mouseReleased"):
        await cdp.send_command("Input.dispatchMouseEvent", {
            "type": event_type,
            "x": x,
            "y": y,
            "button": "left",
            "clickCount": 1
        })


async def type_text(cdp: CDPClient, selector: str, text: str):
    """Type text into an input field"""
    # Click the input first to give it focus
    await click_element(cdp, selector)
    await asyncio.sleep(0.1)
    # Clear existing content; json.dumps turns the selector into a safely
    # quoted JS string literal, so quotes inside it cannot break the script
    await cdp.send_command("Runtime.evaluate", {
        "expression": f"""
            (function() {{
                const el = document.querySelector({json.dumps(selector)});
                el.value = '';
                el.dispatchEvent(new Event('input', {{bubbles: true}}));
            }})()
        """
    })
    # Type one character at a time
    for char in text:
        await cdp.send_command("Input.dispatchKeyEvent", {
            "type": "keyDown",
            "text": char,
            "key": char,
            "unmodifiedText": char
        })
        await cdp.send_command("Input.dispatchKeyEvent", {
            "type": "keyUp",
            "key": char
        })
        await asyncio.sleep(0.02)  # mimic human typing speed


async def scroll_to_element(cdp: CDPClient, selector: str):
    """Scroll the given element into view"""
    # 'instant' avoids having to guess how long a smooth scroll animation takes
    await cdp.send_command("Runtime.evaluate", {
        "expression": f"""
            document.querySelector({json.dumps(selector)})
                .scrollIntoView({{behavior: 'instant', block: 'center'}})
        """
    })
Screenshots and visual comparison
Page screenshots
import base64


async def take_screenshot(cdp: CDPClient, save_path: str,
                          clip: dict = None) -> str:
    """
    Capture a screenshot of the page
    clip: optional crop region {x, y, width, height, scale}
    """
    # "quality" applies only to JPEG, so it is omitted for PNG
    params = {"format": "png"}
    if clip:
        params["clip"] = clip
    result = await cdp.send_command("Page.captureScreenshot", params)
    # Decode the Base64 image data
    image_data = base64.b64decode(result['result']['data'])
    # Save to disk
    with open(save_path, 'wb') as f:
        f.write(image_data)
    return save_path


async def take_element_screenshot(cdp: CDPClient, selector: str,
                                  save_path: str) -> str:
    """Capture a screenshot of a single element"""
    element = await get_element(cdp, selector)
    box = await get_element_box(cdp, element['nodeId'])
    content = box['content']
    xs, ys = content[0::2], content[1::2]
    clip = {
        "x": min(xs),
        "y": min(ys),
        # Derive the size from the same content quad as the origin, so the
        # clip does not mix content-box and border-box measurements
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
        "scale": 1
    }
    return await take_screenshot(cdp, save_path, clip)
Screenshot comparison (regression testing)
import numpy as np
from PIL import Image, ImageChops


class ScreenshotComparator:
    """Screenshot comparator for UI regression tests"""

    def __init__(self, threshold: float = 0.01):
        """
        threshold: allowed fraction of differing pixels (0.01 = 1%)
        """
        self.threshold = threshold

    def compare(self, baseline_path: str, current_path: str) -> dict:
        """Compare two screenshots"""
        baseline = Image.open(baseline_path).convert('RGB')
        current = Image.open(current_path).convert('RGB')
        # Size check
        if baseline.size != current.size:
            return {
                "match": False,
                "reason": "size_mismatch",
                "baseline_size": baseline.size,
                "current_size": current.size
            }
        # Per-pixel difference
        diff = ImageChops.difference(baseline, current)
        diff_array = np.array(diff)
        # Count pixels that differ in any channel; cast to plain Python
        # numbers so the result dict is JSON-serializable
        non_zero_pixels = int(np.count_nonzero(diff_array.sum(axis=2)))
        total_pixels = diff_array.shape[0] * diff_array.shape[1]
        diff_ratio = non_zero_pixels / total_pixels
        # Structural similarity (simplified)
        ssim = self._calculate_ssim(
            np.array(baseline).astype(float),
            np.array(current).astype(float)
        )
        return {
            "match": diff_ratio <= self.threshold,
            "diff_ratio": round(diff_ratio, 6),
            "ssim": round(ssim, 4),
            "non_zero_pixels": non_zero_pixels,
            "total_pixels": total_pixels
        }

    def _calculate_ssim(self, img1: np.ndarray, img2: np.ndarray) -> float:
        """Compute a simplified SSIM: one global window over the whole image
        rather than the usual sliding local windows"""
        C1 = (0.01 * 255) ** 2
        C2 = (0.03 * 255) ** 2
        mu1 = img1.mean()
        mu2 = img2.mean()
        sigma1 = img1.std()
        sigma2 = img2.std()
        sigma12 = np.mean((img1 - mu1) * (img2 - mu2))
        ssim = ((2 * mu1 * mu2 + C1) * (2 * sigma12 + C2)) / \
               ((mu1**2 + mu2**2 + C1) * (sigma1**2 + sigma2**2 + C2))
        return float(ssim)
A complete automation test framework
Wrapping it in a framework
import asyncio
import subprocess
import time
from typing import Any

import aiohttp


class ElectronAutomator:
    """Automation test framework for Electron"""

    def __init__(self, app_path: str, debug_port: int = 9222):
        self.app_path = app_path
        self.debug_port = debug_port
        self.cdp = None
        self.process = None

    async def _list_targets(self) -> list:
        """Fetch the target list from the CDP HTTP endpoint"""
        url = f'http://127.0.0.1:{self.debug_port}/json/list'
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as resp:
                resp.raise_for_status()
                return await resp.json()

    async def start(self):
        """Launch the Electron app and connect over CDP"""
        # Launch the app
        self.process = subprocess.Popen([
            self.app_path,
            f'--remote-debugging-port={self.debug_port}',
            '--no-sandbox'  # disables the sandbox; keep only where the environment requires it
        ])
        # Wait until the CDP port is up and a page target exists
        page = None
        for _ in range(30):
            try:
                targets = await self._list_targets()
                page = next((t for t in targets if t['type'] == 'page'), None)
                if page is not None:
                    break
            except aiohttp.ClientError:
                pass  # port not listening yet
            await asyncio.sleep(1)
        else:
            raise TimeoutError("No CDP page target available after 30s")
        # Connect to the first page
        self.cdp = CDPClient(page['webSocketDebuggerUrl'])
        await self.cdp.connect()
        # Enable the required domains
        await self.cdp.send_command("Page.enable")
        await self.cdp.send_command("DOM.enable")
        await self.cdp.send_command("Runtime.enable")

    async def navigate(self, url: str):
        """Navigate to the given URL"""
        await self.cdp.send_command("Page.navigate", {"url": url})
        await asyncio.sleep(1)  # crude wait for the page to load

    async def wait_for_selector(self, selector: str, timeout: float = 10):
        """Wait for an element to appear"""
        deadline = time.monotonic() + timeout
        while time.monotonic() < deadline:
            try:
                return await get_element(self.cdp, selector)
            except Exception:
                pass  # element missing or page still loading; retry
            await asyncio.sleep(0.5)
        raise TimeoutError(f"Element not found: {selector}")

    async def execute_js(self, expression: str) -> Any:
        """Evaluate a JavaScript expression"""
        result = await self.cdp.send_command("Runtime.evaluate", {
            "expression": expression,
            "returnByValue": True
        })
        return result['result']['result'].get('value')

    async def stop(self):
        """Shut down the app"""
        if self.cdp:
            await self.cdp.close()
        if self.process:
            self.process.terminate()
# Usage example
async def test_login_flow():
    """Test the login flow"""
    bot = ElectronAutomator("/path/to/your-app")
    await bot.start()
    try:
        # Wait for the login page to load
        await bot.wait_for_selector('#login-form')
        # Enter the username and password
        await type_text(bot.cdp, '#username', 'testuser@example.com')
        await type_text(bot.cdp, '#password', 'securePassword123')
        # Click the login button
        await click_element(bot.cdp, '#login-button')
        # Wait for login to finish (the dashboard appears)
        await bot.wait_for_selector('.dashboard-container', timeout=15)
        # Verify the logged-in state
        title = await bot.execute_js('document.title')
        assert 'Dashboard' in title, f"Expected Dashboard, got: {title}"
        # Save a screenshot
        await take_screenshot(bot.cdp, '/tmp/login_success.png')
        print("✓ Login flow test passed")
    finally:
        await bot.stop()


asyncio.run(test_login_flow())
Common pitfalls and solutions
Pitfall 1: handling multiple windows (BrowserWindow)
Problem: an Electron app may have several BrowserWindows, so which one is the CDP connection attached to?
Solution:
┌──────────────────────────────────────────────────────────┐
│ 1. List all targets via /json/list                       │
│ 2. Pick the right window by URL or title                 │
│ 3. To drive a newly opened window, connect to its target │
│                                                          │
│ # List all windows                                       │
│ curl http://127.0.0.1:9222/json/list                     │
│                                                          │
│ # Example response:                                      │
│ [                                                        │
│   {"type":"page", "url":"file:///index.html",            │
│    "title":"Main Window", "id":"abc123"},                │
│   {"type":"page", "url":"file:///settings.html",         │
│    "title":"Settings", "id":"def456"},                   │
│   {"type":"background_page", "url":"...",                │
│    "title":"Service Worker", "id":"ghi789"}              │
│ ]                                                        │
└──────────────────────────────────────────────────────────┘
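Steps 1 and 2 can be sketched as a small selection function over the parsed `/json/list` payload. The function name `pick_target` and its matching rules (substring match on URL and title, pages only) are assumptions made for illustration.

```python
from typing import Optional


def pick_target(targets: list, url_contains: str = "",
                title_contains: str = "") -> Optional[dict]:
    """Return the first 'page' target whose URL and title contain the given substrings."""
    for target in targets:
        if target.get("type") != "page":
            continue  # skip service workers, background pages, and so on
        if (url_contains in target.get("url", "")
                and title_contains in target.get("title", "")):
            return target
    return None


targets = [
    {"type": "page", "url": "file:///index.html", "title": "Main Window", "id": "abc123"},
    {"type": "page", "url": "file:///settings.html", "title": "Settings", "id": "def456"},
    {"type": "background_page", "url": "...", "title": "Service Worker", "id": "ghi789"},
]
print(pick_target(targets, url_contains="settings.html")["id"])  # def456
```

In a real run, the chosen target's `webSocketDebuggerUrl` field would then be passed to the CDP client.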
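Pitfall 2: security restrictions around Electron's Node integration
# Problem: many Electron apps restrict webPreferences (nodeIntegration off,
# contextIsolation on), so Runtime.evaluate cannot reach Node.js APIs
# Solution: run Node operations through an IPC channel to the main process
# Note: require('electron') only works in the renderer when nodeIntegration is
# enabled; otherwise call a bridge the app exposes from its preload script.
# The main process must also handle the 'automation-execute' channel itself.
import json


async def execute_node_code(cdp: CDPClient, code: str):
    """Run Node code in the main process via IPC"""
    # Send an IPC message from the renderer process
    result = await cdp.send_command("Runtime.evaluate", {
        "expression": f"""
            (function() {{
                const {{ ipcRenderer }} = require('electron');
                return new Promise((resolve) => {{
                    // Register the listener before sending, so the reply is not missed
                    ipcRenderer.once('automation-result', (event, result) => {{
                        resolve(JSON.stringify(result));
                    }});
                    ipcRenderer.send('automation-execute', {json.dumps(code)});
                }});
            }})()
        """,
        "awaitPromise": True
    })
    return result['result']['result'].get('value')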
Pitfall 3: Shadow DOM and Web Components
# Problem: querySelector/querySelectorAll cannot cross a Shadow DOM boundary
# Solution: walk the shadow roots step by step (works for open shadow roots only)
import json


async def query_shadow_dom(cdp: CDPClient, selector_chain: list):
    """
    Query an element through Shadow DOM boundaries
    selector_chain: ['shadow::#host-element', '#inner-element']
    A 'shadow::' prefix means: find that host, then step into its shadowRoot.
    Returns the element's CDP objectId, or None if any step fails.
    """
    js_code = """
    (function() {
        const selectors = SELECTORS_PLACEHOLDER;
        let current = document;
        for (const sel of selectors) {
            if (sel.startsWith('shadow::')) {
                // Find the host, then enter its shadow root
                const host = current.querySelector(sel.slice('shadow::'.length));
                current = host ? host.shadowRoot : null;
            } else {
                current = current.querySelector(sel);
            }
            if (!current) return null;
        }
        return current;
    })()
    """.replace("SELECTORS_PLACEHOLDER", json.dumps(selector_chain))
    result = await cdp.send_command("Runtime.evaluate", {
        "expression": js_code
    })
    # A DOM node comes back as a RemoteObject; a null result has no objectId
    return result['result']['result'].get('objectId')
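Pitfall 4: file upload dialogs
# Problem: clicking an upload button opens the system file picker,
# and CDP cannot control native OS dialogs
# Solution: set the files directly with DOM.setFileInputFiles
async def upload_file(cdp: CDPClient, input_selector: str, file_paths: list):
    """
    Bypass the system dialog by setting the upload files directly
    """
    # Locate the input[type=file] element
    element = await get_element(cdp, input_selector)
    # Set the file paths through CDP (no system dialog is triggered)
    await cdp.send_command("DOM.setFileInputFiles", {
        "nodeId": element['nodeId'],
        "files": file_paths  # list of absolute paths
    })


# Usage example (await must run inside an async function)
async def upload_example(cdp: CDPClient):
    await upload_file(cdp, 'input[type="file"]', [
        '/home/user/documents/report.pdf',
        '/home/user/images/screenshot.png'
    ])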
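Pitfall 5: WebSocket reconnection and connection timeouts
import asyncio

import websockets


class CDPConnectionError(Exception):
    """Raised when a command still fails after all retries"""


class RobustCDPClient(CDPClient):
    """CDP client with automatic reconnection"""

    def __init__(self, ws_url: str, max_retries: int = 3):
        super().__init__(ws_url)
        self.max_retries = max_retries
        self._connected = False

    async def connect(self):
        await super().connect()
        self._connected = True  # without this, every command would reconnect

    async def send_command(self, method: str, params: dict = None) -> dict:
        """Send a command with retries (a retried command is sent again, so
        use this for operations that are safe to repeat)"""
        for attempt in range(self.max_retries):
            try:
                if not self._connected:
                    await self.connect()
                return await asyncio.wait_for(
                    super().send_command(method, params),
                    timeout=30.0
                )
            except (websockets.ConnectionClosed, asyncio.TimeoutError, OSError) as e:
                self._connected = False
                if attempt < self.max_retries - 1:
                    await asyncio.sleep(2 ** attempt)  # exponential backoff
                else:
                    raise CDPConnectionError(
                        f"Failed after {self.max_retries} attempts: {e}"
                    ) from e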
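Performance tips
CDP automation performance checklist:
┌──────────────────────────────────────────────────────────────┐
│ 1. Reduce round trips                                        │
│    · Batch multi-step work into one Runtime.evaluate call    │
│    · Fetch all needed element data in a single JS call       │
│                                                              │
│ 2. Avoid frequent DOM queries                                │
│    · Cache nodeIds, but refresh them after DOM changes       │
│    · XPath is not inherently faster than CSS; measure first  │
│                                                              │
│ 3. Optimize screenshots                                      │
│    · Use JPEG where lossy is fine (usually faster than PNG)  │
│    · Capture only the region of interest when possible       │
│    · Lower resolution (Emulation.setDeviceMetricsOverride)   │
│                                                              │
│ 4. Parallelize                                               │
│    · Multiple BrowserWindows can be driven in parallel       │
│    · Independent test cases can run in parallel              │
│                                                              │
│ 5. Avoid unnecessary waits                                   │
│    · Use wait_for_selector instead of fixed sleeps           │
│    · Wait for Page.loadEventFired instead of fixed delays    │
└──────────────────────────────────────────────────────────────┘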
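As a sketch of item 1, several properties of an element can be read in one `Runtime.evaluate` round trip by building a single expression that returns a plain object. The helper name `build_batch_expression` and the chosen properties are illustrative assumptions; the resulting string would be sent as `expression` together with `"returnByValue": True`.

```python
import json


def build_batch_expression(selector: str, props: list) -> str:
    """Build one JS expression that reads several properties of an element at once."""
    # json.dumps yields valid JS literals, so quotes in the selector cannot break the script
    return (
        "(() => {"
        f" const el = document.querySelector({json.dumps(selector)});"
        " if (!el) return null;"
        f" const out = {{}}; for (const p of {json.dumps(props)}) out[p] = el[p];"
        " return out; })()"
    )


expr = build_batch_expression('input[name="user"]', ["value", "disabled", "placeholder"])
print(expr)
```

One command then replaces three separate property reads, which matters most when the automation script and the app are not on the same machine.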
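CDP gives Electron automation a first-party channel into the app: it does not depend on screen coordinates or third-party frameworks, and it talks to the Chromium engine directly. The one cost is that the API is low-level, so you have to wrap some convenience methods yourself. For scenarios that need high stability and deep control (such as automated testing in CI/CD or RPA workflows), it is one of the most dependable options available today.
In real projects, consider wrapping the code above in an internal SDK, so that business-level code only deals with "which button to click, what to type, and what result to verify", while the CDP communication details stay hidden from its users. This keeps the automation stable and lowers the barrier to entry for team members.
Original link: https://wenyiblog.top/2026/06/cdp-electron-automation/
First published on 文艺技术笔记 (wenyiblog.top); please credit the source when reposting.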
