霍格沃兹测试开发学社

Hands-On Tutorial: Building an AI Assistant That Can Interact with Web Pages (A Complete Playwright MCP Project)

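Project Overview: Building an Intelligent Web Operation Assistant
In this tutorial we build a complete AI assistant that can actually interact with web pages. The assistant understands natural-language instructions and carries out complex web operations through Playwright MCP. We start from scratch and cover the whole process, from environment setup to deployment.

Project Goals
Build an AI assistant that can perform the following tasks:

Log in to websites automatically and handle authentication
Fill in complex forms and interactive elements
Extract, analyze, and structure web page data
Handle multi-step workflows
Cope with page errors and dynamic content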
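I. Project Architecture Design
Technology Stack
Backend framework: Node.js + Express
Browser automation: Playwright
AI model integration: Anthropic Claude API
Protocol layer: custom MCP (Model Context Protocol) Server
Frontend: React + Tailwind CSS
Database: SQLite (for session storage)
Task queue: Bull (for asynchronous task processing)
System Architecture
User interface (React)
↓ (HTTP/REST API)
Backend server (Express + AI routing)
↓ (MCP protocol)
Playwright MCP Server
↓ (browser control)
Chromium/Firefox instance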
II. Environment Setup and Project Initialization
Step 1: Create the project structure
mkdir ai-web-assistant
cd ai-web-assistant
mkdir -p src/{mcp,ai,routes,models,utils} public/{css,js} tests
touch package.json server.js .env.example README.md
Step 2: Define the project dependencies
Create package.json:

{
  "name": "ai-web-assistant",
  "version": "1.0.0",
  "type": "module",
  "scripts": {
    "start": "node server.js",
    "dev": "nodemon server.js",
    "test": "playwright test",
    "mcp:dev": "node src/mcp/server.js"
  },
  "dependencies": {
    "express": "^4.18.2",
    "cors": "^2.8.5",
    "dotenv": "^16.3.0",
    "playwright": "^1.40.0",
    "@anthropic-ai/sdk": "^0.7.0",
    "sqlite3": "^5.1.6",
    "bull": "^4.11.0",
    "express-rate-limit": "^7.1.0",
    "helmet": "^7.0.0"
  },
  "devDependencies": {
    "nodemon": "^3.0.0",
    "@playwright/test": "^1.40.0"
  }
}
Run npm install to install the dependencies, then npx playwright install chromium to download the browser binary. The test script uses the Playwright Test runner because the test file in Part V imports from @playwright/test.

Step 3: Environment configuration
Create a .env file:

# API configuration
ANTHROPIC_API_KEY=your_anthropic_api_key_here
PORT=3000
NODE_ENV=development

# Browser configuration
BROWSER_TYPE=chromium
HEADLESS_MODE=false
BROWSER_TIMEOUT=30000

# Database configuration
DB_PATH=./data/sessions.db

# Security configuration
SESSION_SECRET=your_session_secret_here
RATE_LIMIT_WINDOW=900000
RATE_LIMIT_MAX=100
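The server code in this tutorial reads PORT and the rate-limit values, but nothing shown parses the browser settings. As a minimal sketch, the helper below turns them into the config object that the PlaywrightMCPServer constructor accepts (browserType, headless, timeout). The name loadBrowserConfig is hypothetical and not part of the original project.

```javascript
// Hypothetical helper (an assumption for illustration, not from the original project):
// converts raw environment strings into typed browser options.
function loadBrowserConfig(env = process.env) {
  const allowed = ['chromium', 'firefox', 'webkit'];
  // Fall back to chromium for a missing or unknown browser name
  const browserType = allowed.includes(env.BROWSER_TYPE) ? env.BROWSER_TYPE : 'chromium';
  // Environment values are always strings, so "false" has to be compared explicitly
  const headless = env.HEADLESS_MODE !== 'false';
  // Reject non-numeric or non-positive timeouts and use the 30 s default instead
  const parsed = Number.parseInt(env.BROWSER_TIMEOUT ?? '', 10);
  const timeout = Number.isFinite(parsed) && parsed > 0 ? parsed : 30000;
  return { browserType, headless, timeout };
}

// Example with the values from the .env file above
console.log(loadBrowserConfig({
  BROWSER_TYPE: 'chromium',
  HEADLESS_MODE: 'false',
  BROWSER_TIMEOUT: '30000'
}));
```

The result could then be passed as new PlaywrightMCPServer(loadBrowserConfig()) after dotenv.config() has run.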
III. Core Module Implementation

1. Playwright MCP Server implementation
Create src/mcp/server.js:

import { chromium, firefox, webkit } from 'playwright';
import { EventEmitter } from 'events';

class PlaywrightMCPServer extends EventEmitter {
constructor(config = {}) {
super();
this.config = {
browserType: config.browserType || 'chromium',
headless: config.headless !== false,
timeout: config.timeout || 30000,
...config
};
this.browser = null;
this.context = null;
this.page = null;
this.isInitialized = false;
this.sessionId = null;
}

// Initialize the browser instance
async initialize(sessionId = null) {
try {
this.sessionId = sessionId || `session_${Date.now()}`;

  const browserMap = { chromium, firefox, webkit };
  const BrowserClass = browserMap[this.config.browserType] || chromium;
  
  this.browser = await BrowserClass.launch({ 
    headless: this.config.headless,
    timeout: this.config.timeout,
    args: ['--no-sandbox', '--disable-dev-shm-usage']
  });
  
  this.context = await this.browser.newContext({
    viewport: { width: 1280, height: 720 },
    userAgent: 'AI-Web-Assistant/1.0',
    acceptDownloads: true,
    ignoreHTTPSErrors: true
  });
  
  // Attach page error handling
  this.context.on('page', page => {
    page.on('pageerror', error => {
      this.emit('pageError', { sessionId: this.sessionId, error });
    });
  });
  
  this.page = await this.context.newPage();
  
  // Set the default timeouts
  this.page.setDefaultTimeout(this.config.timeout);
  this.page.setDefaultNavigationTimeout(this.config.timeout * 2);
  
  this.isInitialized = true;
  this.emit('initialized', { sessionId: this.sessionId });
  
  return { 
    success: true, 
    message: 'Playwright MCP Server initialized successfully',
    sessionId: this.sessionId
  };
} catch (error) {
  console.error('Failed to initialize Playwright:', error);
  this.emit('error', error);
  return { success: false, error: error.message };
}

}

// Tool definitions: the core of the MCP layer
getTools() {
return {
navigate: {
name: 'navigate',
description: 'Navigate to a specific URL',
parameters: {
url: {
type: 'string',
description: 'The URL to navigate to'
},
waitUntil: {
type: 'string',
description: 'When to consider navigation successful',
enum: ['load', 'domcontentloaded', 'networkidle'],
default: 'networkidle'
}
}
},
click: {
name: 'click',
description: 'Click on an element using CSS selector, XPath, or text',
parameters: {
selector: {
type: 'string',
description: 'CSS selector, XPath, or text to identify the element'
},
selectorType: {
type: 'string',
description: 'Type of selector: css, xpath, or text',
enum: ['css', 'xpath', 'text'],
default: 'css'
},
waitForNavigation: {
type: 'boolean',
description: 'Whether to wait for navigation after click',
default: false
}
}
},
fill_form: {
name: 'fill_form',
description: 'Fill a form with multiple fields',
parameters: {
fields: {
type: 'object',
description: 'Object mapping selectors to values'
}
}
},
extract_data: {
name: 'extract_data',
description: 'Extract structured data from the page',
parameters: {
schema: {
type: 'object',
description: 'Schema defining what data to extract'
}
}
},
wait_for_element: {
name: 'wait_for_element',
description: 'Wait for an element to appear',
parameters: {
selector: {
type: 'string',
description: 'CSS selector for the element'
},
state: {
type: 'string',
description: 'Element state to wait for',
enum: ['attached', 'detached', 'visible', 'hidden'],
default: 'visible'
},
timeout: {
type: 'number',
description: 'Timeout in milliseconds',
default: 10000
}
}
},
screenshot: {
name: 'screenshot',
description: 'Take a screenshot for debugging',
parameters: {
fullPage: {
type: 'boolean',
description: 'Whether to capture full page',
default: false
}
}
},
get_page_info: {
name: 'get_page_info',
description: 'Get comprehensive information about the current page'
}
};
}

// Tool execution engine
async executeTool(toolName, parameters = {}) {
if (!this.isInitialized) {
throw new Error('Playwright not initialized. Call initialize() first.');
}

try {
  let result;
  
  switch (toolName) {
    case 'navigate':
      result = await this.navigateToUrl(parameters.url, parameters.waitUntil);
      break;
      
    case 'click':
      result = await this.clickElement(parameters.selector, parameters.selectorType, parameters.waitForNavigation);
      break;
      
    case 'fill_form':
      result = await this.fillForm(parameters.fields);
      break;
      
    case 'extract_data':
      result = await this.extractData(parameters.schema);
      break;
      
    case 'wait_for_element':
      result = await this.waitForElement(parameters.selector, parameters.state, parameters.timeout);
      break;
      
    case 'screenshot':
      result = await this.takeScreenshot(parameters.fullPage);
      break;
      
    case 'get_page_info':
      result = await this.getPageInfo();
      break;
      
    default:
      throw new Error(`Unknown tool: ${toolName}`);
  }
  
  this.emit('toolExecuted', { 
    sessionId: this.sessionId, 
    toolName, 
    parameters, 
    result 
  });
  
  return { success: true, data: result };
  
} catch (error) {
  console.error(`Tool execution failed: ${toolName}`, error);
  this.emit('toolError', { 
    sessionId: this.sessionId, 
    toolName, 
    parameters, 
    error: error.message 
  });
  
  return { 
    success: false, 
    error: error.message,
    suggestion: this.getErrorSuggestion(error.message)
  };
}

}

// Concrete tool implementations
async navigateToUrl(url, waitUntil = 'networkidle') {
if (!url.startsWith('http')) {
url = 'https://' + url;
}

const response = await this.page.goto(url, { 
  waitUntil,
  timeout: this.config.timeout 
});

return {
  url: this.page.url(),
  status: response?.status(),
  title: await this.page.title(),
  finalUrl: this.page.url()
};

}

async clickElement(selector, selectorType = 'css', waitForNavigation = false) {
let element;

switch (selectorType) {
  case 'css':
    element = this.page.locator(selector);
    break;
  case 'xpath':
    element = this.page.locator(`xpath=${selector}`);
    break;
  case 'text':
    element = this.page.getByText(selector, { exact: false });
    break;
  default:
    throw new Error(`Unsupported selector type: ${selectorType}`);
}

await element.waitFor({ state: 'visible' });

if (waitForNavigation) {
  await Promise.all([
    this.page.waitForNavigation({ waitUntil: 'networkidle' }),
    element.click()
  ]);
} else {
  await element.click();
}

return {
  success: true,
  element: await this.getElementInfo(element)
};

}

async fillForm(fields) {
const results = {};

for (const [selector, value] of Object.entries(fields)) {
  try {
    const element = this.page.locator(selector);
    await element.waitFor({ state: 'visible' });
    await element.fill(value);
    results[selector] = { success: true, value };
  } catch (error) {
    results[selector] = { success: false, error: error.message };
  }
}

return results;

}

async extractData(schema) {
const data = {};

for (const [key, config] of Object.entries(schema)) {
  try {
    const { selector, type = 'text', attribute } = config;
    const element = this.page.locator(selector);
    
    switch (type) {
      case 'text':
        data[key] = await element.textContent();
        break;
      case 'attribute':
        data[key] = await element.getAttribute(attribute);
        break;
      case 'multiple':
        data[key] = await element.allTextContents();
        break;
      default:
        data[key] = await element.textContent();
    }
  } catch (error) {
    data[key] = null;
  }
}

return data;

}

async getElementInfo(element) {
try {
const boundingBox = await element.boundingBox();
const isVisible = await element.isVisible();

  return {
    visible: isVisible,
    boundingBox,
    tagName: await element.evaluate(el => el.tagName.toLowerCase())
  };
} catch (error) {
  return { error: error.message };
}

}

async takeScreenshot(fullPage = false) {
const screenshot = await this.page.screenshot({
fullPage,
type: 'png'
});

return {
  screenshot: screenshot.toString('base64'),
  type: 'png',
  fullPage
};

}

async getPageInfo() {
return {
url: this.page.url(),
title: await this.page.title(),
content: await this.page.content(),
viewport: this.page.viewportSize()
};
}

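// Error handling and suggestions
getErrorSuggestion(errorMessage) {
const suggestions = {
'timeout': 'Try a longer wait time or check the network connection',
'element not found': 'Check that the selector is correct, or wait for the element to load',
'navigation failed': 'Check that the URL is correct and the site is reachable',
'target closed': 'The browser page has been closed and needs to be re-initialized'
};

for (const [key, suggestion] of Object.entries(suggestions)) {
  if (errorMessage.toLowerCase().includes(key)) {
    return suggestion;
  }
}

return 'Check the network connection and page state, then retry';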

}

// Release resources
async cleanup() {
try {
if (this.page) {
await this.page.close();
}
if (this.context) {
await this.context.close();
}
if (this.browser) {
await this.browser.close();
}

  this.isInitialized = false;
  this.emit('cleanedUp', { sessionId: this.sessionId });
  
  return { success: true, message: 'Resources cleaned up successfully' };
} catch (error) {
  console.error('Cleanup failed:', error);
  return { success: false, error: error.message };
}

}
}

export default PlaywrightMCPServer;
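The definitions returned by getTools() describe arguments in a parameters map, whereas the Anthropic Messages API expects each tool to carry name, description, and an input_schema that is a JSON Schema object. A minimal sketch of an adapter between the two shapes follows. The names toAnthropicTools and requiredByTool are hypothetical, and which arguments count as required is an assumption, since the original definitions do not say.

```javascript
// Hypothetical adapter (not part of the original project): wraps each tool's
// `parameters` map in the JSON Schema object that the Messages API expects.
function toAnthropicTools(toolMap, requiredByTool = {}) {
  return Object.values(toolMap).map((tool) => ({
    name: tool.name,
    description: tool.description,
    input_schema: {
      type: 'object',
      // Each entry of `parameters` is already a JSON Schema fragment
      // (type, description, enum, default), so it can be reused as a property.
      properties: { ...(tool.parameters ?? {}) },
      required: requiredByTool[tool.name] ?? []
    }
  }));
}

// Example with a trimmed-down copy of two tools from getTools()
const exampleTools = toAnthropicTools(
  {
    navigate: {
      name: 'navigate',
      description: 'Navigate to a specific URL',
      parameters: { url: { type: 'string', description: 'The URL to navigate to' } }
    },
    get_page_info: {
      name: 'get_page_info',
      description: 'Get comprehensive information about the current page'
    }
  },
  { navigate: ['url'] } // assumption: url is mandatory for navigate
);
console.log(JSON.stringify(exampleTools, null, 2));
```

With such an adapter, the handler could pass toAnthropicTools(this.mcpServer.getTools(), ...) as the tools option instead of the raw definitions.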
2. AI handler module
Create src/ai/handler.js:

import Anthropic from '@anthropic-ai/sdk';
import PlaywrightMCPServer from '../mcp/server.js';

class AIHandler {
constructor(apiKey) {
this.anthropic = new Anthropic({ apiKey });
this.mcpServer = new PlaywrightMCPServer();
this.conversationHistory = new Map();
}

// Initialize a session
async initializeSession(sessionId) {
const result = await this.mcpServer.initialize(sessionId);

if (!this.conversationHistory.has(sessionId)) {
  this.conversationHistory.set(sessionId, []);
}

return result;

}

// Process a user instruction
async processInstruction(sessionId, instruction, context = {}) {
try {
const history = this.conversationHistory.get(sessionId) || [];

  // Build the system prompt
  const systemPrompt = this.buildSystemPrompt(context);
  
  // Get the available tools
  const availableTools = this.mcpServer.getTools();
  
  // Call the Claude model
  const message = await this.anthropic.messages.create({
    model: "claude-3-sonnet-20240229",
    max_tokens: 4096,
    system: systemPrompt,
    messages: [
      ...history,
      { role: "user", content: instruction }
    ],
    tools: Object.values(availableTools)
  });
  
  let finalResponse = '';
  let currentMessage = message;
  
  // Transcript for this turn; every tool round is appended so the model
  // sees all earlier tool calls and their results
  const turnMessages = [
    ...history,
    { role: "user", content: instruction }
  ];
  
  // Handle tool calls
  while (currentMessage.content.some(item => item.type === 'tool_use')) {
    const toolResults = [];
    
    for (const contentItem of currentMessage.content) {
      if (contentItem.type === 'tool_use') {
        const toolName = contentItem.name;
        const parameters = contentItem.input;
        
        // Execute the tool
        const toolResult = await this.mcpServer.executeTool(toolName, parameters);
        toolResults.push({
          type: 'tool_result',
          tool_use_id: contentItem.id,
          content: JSON.stringify(toolResult)
        });
      }
    }
    
    // Continue the conversation with the tool results
    turnMessages.push(
      { role: "assistant", content: currentMessage.content },
      { role: "user", content: toolResults }
    );
    currentMessage = await this.anthropic.messages.create({
      model: "claude-3-sonnet-20240229",
      max_tokens: 4096,
      system: systemPrompt,
      messages: turnMessages,
      tools: Object.values(availableTools)
    });
  }
  
  // Extract the final response
  const textContent = currentMessage.content.find(item => item.type === 'text');
  finalResponse = textContent ? textContent.text : 'Operation completed';
  
  // Update the conversation history
  history.push(
    { role: "user", content: instruction },
    { role: "assistant", content: currentMessage.content }
  );
  
  // Keep only the most recent 10 rounds (20 messages)
  if (history.length > 20) {
    history.splice(0, history.length - 20);
  }
  
  return {
    success: true,
    response: finalResponse,
    sessionId
  };
  
} catch (error) {
  console.error('AI processing failed:', error);
  return {
    success: false,
    error: error.message,
    sessionId
  };
}

}

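// Build the system prompt
buildSystemPrompt(context) {
return `You are a professional web operation assistant that can carry out web tasks through browser automation tools.

Your capabilities include:

  • Navigating to a given URL
  • Clicking buttons and links
  • Filling in forms and input fields
  • Extracting web page data
  • Waiting for pages to load
  • Handling complex interactions

Important guidelines:

  1. Analyze the page structure before acting
  2. Use appropriate selectors to locate elements
  3. Handle errors and exceptions that may occur
  4. Give clear feedback on each operation
  5. Break complex tasks into multiple steps

Current context: ${JSON.stringify(context)}

Operate carefully and make sure each step is carried out correctly. If you run into an error, analyze the cause and propose a solution.`;
}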

// Get the session history
getSessionHistory(sessionId) {
return this.conversationHistory.get(sessionId) || [];
}

// Clean up a session
async cleanupSession(sessionId) {
this.conversationHistory.delete(sessionId);
return await this.mcpServer.cleanup();
}
}

export default AIHandler;
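One consequence of the handler above is that AIHandler creates a single PlaywrightMCPServer in its constructor, so every session shares one browser page, and cleaning up one session closes the browser for all of them. The sketch below shows one possible way to keep a separate server per session. SessionRegistry and its methods are hypothetical names, and the server factory is injected so the class can be tried out without launching a browser.

```javascript
// Hypothetical per-session registry (an assumption, not part of the original project).
// `createServer` is any function returning an object with initialize() and cleanup(),
// for example () => new PlaywrightMCPServer().
class SessionRegistry {
  constructor(createServer) {
    this.createServer = createServer;
    this.servers = new Map();
  }

  // Return the server for a session, creating and initializing it on first use
  async acquire(sessionId) {
    let server = this.servers.get(sessionId);
    if (!server) {
      server = this.createServer();
      this.servers.set(sessionId, server);
      const result = await server.initialize(sessionId);
      if (!result.success) {
        this.servers.delete(sessionId);
        throw new Error(result.error);
      }
    }
    return server;
  }

  // Close and forget one session's server; returns false if the session is unknown
  async release(sessionId) {
    const server = this.servers.get(sessionId);
    if (!server) return false;
    this.servers.delete(sessionId);
    await server.cleanup();
    return true;
  }
}
```

Inside AIHandler, processInstruction could then call (await this.registry.acquire(sessionId)).executeTool(...) and cleanupSession could call this.registry.release(sessionId). This sketch does not guard against two concurrent first calls for the same session.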
3. Express server and routes
Create server.js:

import express from 'express';
import cors from 'cors';
import helmet from 'helmet';
import rateLimit from 'express-rate-limit';
import dotenv from 'dotenv';
import AIHandler from './src/ai/handler.js';

// Load environment variables
dotenv.config();

const app = express();
const PORT = process.env.PORT || 3000;

// Initialize the AI handler
const aiHandler = new AIHandler(process.env.ANTHROPIC_API_KEY);

// Middleware configuration
app.use(helmet());
app.use(cors());
app.use(express.json({ limit: '10mb' }));

// Rate limiting
const limiter = rateLimit({
windowMs: parseInt(process.env.RATE_LIMIT_WINDOW, 10) || 15 * 60 * 1000,
max: parseInt(process.env.RATE_LIMIT_MAX, 10) || 100,
message: 'Too many requests, please try again later'
});
app.use(limiter);

// Session store
const sessions = new Map();

// API routes

// Health check
app.get('/health', (req, res) => {
res.json({ status: 'ok', timestamp: new Date().toISOString() });
});

// Initialize a session
app.post('/api/session/init', async (req, res) => {
try {
const sessionId = req.body.sessionId || `session_${Date.now()}_${Math.random().toString(36).slice(2, 11)}`;

const result = await aiHandler.initializeSession(sessionId);

if (result.success) {
  sessions.set(sessionId, {
    createdAt: new Date(),
    lastActivity: new Date()
  });
  
  res.json({
    success: true,
    sessionId,
    message: 'Session initialized successfully'
  });
} else {
  res.status(500).json({
    success: false,
    error: result.error
  });
}

} catch (error) {
console.error('Session init error:', error);
res.status(500).json({
success: false,
error: error.message
});
}
});

// Process a user instruction
app.post('/api/instruction', async (req, res) => {
try {
const { sessionId, instruction, context = {} } = req.body;

if (!sessionId || !instruction) {
  return res.status(400).json({
    success: false,
    error: 'Missing required parameters: sessionId and instruction'
  });
}

// Update the session's last-activity time
const session = sessions.get(sessionId);
if (session) {
  session.lastActivity = new Date();
}

const result = await aiHandler.processInstruction(sessionId, instruction, context);

res.json(result);

} catch (error) {
console.error('Instruction processing error:', error);
res.status(500).json({
success: false,
error: error.message
});
}
});

// Get the session history
app.get('/api/session/:sessionId/history', (req, res) => {
const { sessionId } = req.params;
const history = aiHandler.getSessionHistory(sessionId);

res.json({
success: true,
sessionId,
history
});
});

// Clean up a session
app.delete('/api/session/:sessionId', async (req, res) => {
try {
const { sessionId } = req.params;

const result = await aiHandler.cleanupSession(sessionId);
sessions.delete(sessionId);

res.json({
  success: true,
  sessionId,
  message: 'Session cleaned up successfully'
});

} catch (error) {
console.error('Session cleanup error:', error);
res.status(500).json({
success: false,
error: error.message
});
}
});

// Session cleanup task (periodically removes expired sessions)
setInterval(() => {
const now = new Date();
const SESSION_TIMEOUT = 30 * 60 * 1000; // 30 minutes

for (const [sessionId, session] of sessions.entries()) {
if (now - session.lastActivity > SESSION_TIMEOUT) {
console.log(`Cleaning up expired session: ${sessionId}`);
aiHandler.cleanupSession(sessionId);
sessions.delete(sessionId);
}
}
}, 5 * 60 * 1000); // check every 5 minutes

// Error-handling middleware
app.use((error, req, res, next) => {
console.error('Unhandled error:', error);
res.status(500).json({
success: false,
error: 'Internal server error'
});
});

// 404 handler
app.use('*', (req, res) => {
res.status(404).json({
success: false,
error: 'Endpoint not found'
});
});

// Start the server
app.listen(PORT, () => {
console.log(`AI Web Assistant server running on port ${PORT}`);
console.log(`Environment: ${process.env.NODE_ENV}`);
});

export default app;
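The periodic cleanup in server.js compares each session's lastActivity with a 30-minute timeout inside a setInterval callback, which makes the rule hard to check in isolation. As a sketch, the same comparison can be pulled into a pure function that takes the current time as an argument. The name findExpiredSessions is hypothetical, and the function assumes the session shape used above (an object with a lastActivity Date).

```javascript
// Hypothetical helper (not part of the original project): returns the ids of
// sessions whose last activity is older than the timeout.
function findExpiredSessions(sessions, now = Date.now(), timeoutMs = 30 * 60 * 1000) {
  const expired = [];
  for (const [sessionId, session] of sessions) {
    if (now - session.lastActivity.getTime() > timeoutMs) {
      expired.push(sessionId);
    }
  }
  return expired;
}

// Example: one session idle for 45 minutes, one idle for 5 minutes
const demoNow = new Date('2025-01-01T12:00:00Z').getTime();
const demoSessions = new Map([
  ['session_old', { lastActivity: new Date(demoNow - 45 * 60 * 1000) }],
  ['session_new', { lastActivity: new Date(demoNow - 5 * 60 * 1000) }]
]);
console.log(findExpiredSessions(demoSessions, demoNow)); // [ 'session_old' ]
```

The interval callback could then loop over findExpiredSessions(sessions) and call aiHandler.cleanupSession for each id.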
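IV. Frontend Implementation
Create public/index.html:

<title>AI Web Assistant</title>

<div>
    <h1>AI Web Assistant</h1>
    <p>Automate web operations with natural-language instructions</p>
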
    <!-- Main interface -->
    <div class="bg-white rounded-lg shadow-lg overflow-hidden">
        <!-- Session controls -->
        <div class="bg-gray-800 text-white p-4 flex justify-between items-center">
            <div>
                <span id="sessionStatus" class="text-sm">Disconnected</span>
            </div>
            <div class="space-x-2">
                <button id="initSession" class="bg-green-600 hover:bg-green-700 px-4 py-2 rounded text-sm">
                    Start new session
                </button>
                <button id="clearSession" class="bg-red-600 hover:bg-red-700 px-4 py-2 rounded text-sm" disabled>
                    End session
                </button>
            </div>
        </div>

        <!-- Chat area -->
        <div class="h-96 overflow-y-auto p-4 space-y-4" id="chatMessages">
            <div class="text-center text-gray-500 py-8">
                Send an instruction to start talking to the AI assistant
            </div>
        </div>

        <!-- Input area -->
        <div class="border-t p-4">
            <div class="flex space-x-2">
                <input 
                    type="text" 
                    id="instructionInput" 
                    placeholder="Enter an instruction, e.g. open Baidu and search for the latest AI developments..." 
                    class="flex-1 border rounded-lg px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
                    disabled
                >
                <button 
                    id="sendButton" 
                    class="bg-blue-600 hover:bg-blue-700 text-white px-6 py-2 rounded-lg disabled:bg-gray-400 disabled:cursor-not-allowed"
                    disabled
                >
                    Send
                </button>
            </div>
            <div class="mt-2 text-sm text-gray-500">
                <p>Example instructions:</p>
                <div class="flex flex-wrap gap-2 mt-1">
                    <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Open the Baidu home page">Open Baidu</button>
                    <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Search for today's top news">Search news</button>
                    <button class="example-instruction text-xs bg-gray-200 hover:bg-gray-300 px-2 py-1 rounded" data-instruction="Extract all headings on the current page">Extract headings</button>
                </div>
            </div>
        </div>
    </div>

    <!-- Session information -->
    <div class="mt-4 bg-white rounded-lg shadow p-4">
        <h3 class="font-semibold mb-2">Session information</h3>
        <div class="text-sm space-y-1">
            <div>Session ID: <span id="sessionIdDisplay" class="font-mono">-</span></div>
            <div>Status: <span id="connectionStatus">Disconnected</span></div>
            <div>Messages: <span id="messageCount">0</span></div>
        </div>
    </div>
</div>

<script>
    class AIAssistant {
        constructor() {
            this.sessionId = null;
            this.isConnected = false;
            this.messageCount = 0;
            
            this.initializeElements();
            this.attachEventListeners();
        }

        initializeElements() {
            this.sessionStatus = document.getElementById('sessionStatus');
            this.sessionIdDisplay = document.getElementById('sessionIdDisplay');
            this.connectionStatus = document.getElementById('connectionStatus');
            this.messageCountDisplay = document.getElementById('messageCount');
            this.chatMessages = document.getElementById('chatMessages');
            this.instructionInput = document.getElementById('instructionInput');
            this.sendButton = document.getElementById('sendButton');
            this.initSessionBtn = document.getElementById('initSession');
            this.clearSessionBtn = document.getElementById('clearSession');
        }

        attachEventListeners() {
            this.initSessionBtn.addEventListener('click', () => this.initializeSession());
            this.clearSessionBtn.addEventListener('click', () => this.clearSession());
            this.sendButton.addEventListener('click', () => this.sendInstruction());
            this.instructionInput.addEventListener('keypress', (e) => {
                if (e.key === 'Enter') this.sendInstruction();
            });

            // Click handlers for the example instructions
            document.querySelectorAll('.example-instruction').forEach(btn => {
                btn.addEventListener('click', (e) => {
                    this.instructionInput.value = e.target.dataset.instruction;
                    this.sendInstruction();
                });
            });
        }

        async initializeSession() {
            try {
                this.showLoading('Initializing session...');
                
                const response = await fetch('/api/session/init', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({})
                });

                const data = await response.json();

                if (data.success) {
                    this.sessionId = data.sessionId;
                    this.isConnected = true;
                    this.messageCount = 0;
                    
                    this.updateUI();
                    this.addMessage('system', 'Session initialized. You can start sending instructions.');
                } else {
                    throw new Error(data.error);
                }
            } catch (error) {
                this.addMessage('error', `Initialization failed: ${error.message}`);
            } finally {
                this.hideLoading();
            }
        }

        async sendInstruction() {
            const instruction = this.instructionInput.value.trim();
            if (!instruction || !this.isConnected) return;

            // Add the user message
            this.addMessage('user', instruction);
            this.instructionInput.value = '';
            
            // Show the typing indicator
            const thinkingMessage = this.addMessage('assistant', '');
            this.showTypingIndicator(thinkingMessage);

            try {
                const response = await fetch('/api/instruction', {
                    method: 'POST',
                    headers: { 'Content-Type': 'application/json' },
                    body: JSON.stringify({
                        sessionId: this.sessionId,
                        instruction: instruction
                    })
                });

                const data = await response.json();
                
                // Remove the typing indicator
                this.removeTypingIndicator(thinkingMessage);

                if (data.success) {
                    this.addMessage('assistant', data.response);
                } else {
                    this.addMessage('error', `Operation failed: ${data.error}`);
                }
            } catch (error) {
                this.removeTypingIndicator(thinkingMessage);
                this.addMessage('error', `Network error: ${error.message}`);
            }
        }

        async clearSession() {
            if (!this.sessionId) return;

            try {
                await fetch(`/api/session/${this.sessionId}`, {
                    method: 'DELETE'
                });
            } catch (error) {
                console.error('Failed to clean up session:', error);
            }

            this.sessionId = null;
            this.isConnected = false;
            this.messageCount = 0;
            this.updateUI();
            this.clearMessages();
            this.addMessage('system', 'Session ended. Click "Start new session" to begin again.');
        }

        addMessage(role, content) {
            this.messageCount++;
            this.messageCountDisplay.textContent = this.messageCount;

            const messageDiv = document.createElement('div');
            messageDiv.className = `p-3 rounded-lg max-w-3/4 ${
                role === 'user' ? 'message-user ml-auto' : 
                role === 'error' ? 'bg-red-100 text-red-800 border border-red-200' :
                'message-assistant'
            }`;

            if (role === 'thinking') {
                messageDiv.innerHTML = '<div class="typing-indicator"><span class="typing-dot"></span><span class="typing-dot"></span><span class="typing-dot"></span></div>';
            } else {
                messageDiv.textContent = content;
            }

            this.chatMessages.appendChild(messageDiv);
            this.chatMessages.scrollTop = this.chatMessages.scrollHeight;

            return messageDiv;
        }

        showTypingIndicator(messageElement) {
            messageElement.innerHTML = '<div class="typing-indicator"><span class="typing-dot"></span><span class="typing-dot"></span><span class="typing-dot"></span></div>';
        }

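        removeTypingIndicator(messageElement) {
            // Remove the placeholder bubble entirely so no empty message is left behind
            messageElement.remove();
        }

        clearMessages() {
            this.chatMessages.innerHTML = '<div class="text-center text-gray-500 py-8">Send an instruction to start talking to the AI assistant</div>';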
        }

        showLoading(message) {
            this.initSessionBtn.disabled = true;
            this.initSessionBtn.textContent = message;
        }

        hideLoading() {
            this.initSessionBtn.disabled = false;
            this.initSessionBtn.textContent = 'Start new session';
        }

        updateUI() {
            this.sessionStatus.textContent = this.isConnected ? 'Connected' : 'Disconnected';
            this.sessionIdDisplay.textContent = this.sessionId || '-';
            this.connectionStatus.textContent = this.isConnected ? 'Active' : 'Disconnected';
            this.connectionStatus.className = this.isConnected ? 'text-green-600' : 'text-red-600';
            
            this.instructionInput.disabled = !this.isConnected;
            this.sendButton.disabled = !this.isConnected;
            this.clearSessionBtn.disabled = !this.isConnected;
        }
    }

    // Initialize the application
    document.addEventListener('DOMContentLoaded', () => {
        new AIAssistant();
    });
</script>
V. Testing and Verification
1. Create the test script
Create tests/integration.test.js:

import { test, expect } from '@playwright/test';
import AIHandler from '../src/ai/handler.js';
import dotenv from 'dotenv';

dotenv.config();

test.describe('AI Web Assistant Integration Tests', () => {
let aiHandler;
let sessionId;

test.beforeEach(async () => {
aiHandler = new AIHandler(process.env.ANTHROPIC_API_KEY);
const initResult = await aiHandler.initializeSession();
sessionId = initResult.sessionId;
});

test.afterEach(async () => {
await aiHandler.cleanupSession(sessionId);
});

test('should initialize session successfully', async () => {
expect(sessionId).toBeDefined();
expect(typeof sessionId).toBe('string');
});

test('should process simple navigation instruction', async () => {
const result = await aiHandler.processInstruction(
sessionId,
'Please open the Baidu home page https://www.baidu.com'
);

expect(result.success).toBe(true);
expect(result.response).toBeDefined();

});

test('should handle invalid instruction gracefully', async () => {
const result = await aiHandler.processInstruction(
sessionId,
'Perform an operation that does not exist'
);

// Even a problematic instruction should produce a reasonable response
expect(result.response).toBeDefined();

});
});
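The integration tests above launch a real browser and call the Claude API, so they need a valid API key and network access. For quicker checks of code that depends only on the server's interface (initialize, getTools, executeTool, cleanup), a stub could be swapped in, for example with aiHandler.mcpServer = createFakeMcpServer(...). The sketch below is one way to write such a stub. The name createFakeMcpServer and the canned-result format are assumptions, not part of the original project.

```javascript
// Hypothetical test double (not part of the original project): mimics the
// PlaywrightMCPServer interface and records every tool call without a browser.
function createFakeMcpServer(cannedResults = {}) {
  const calls = [];
  return {
    calls,
    async initialize(sessionId) {
      return { success: true, sessionId };
    },
    getTools() {
      return {};
    },
    async executeTool(toolName, parameters = {}) {
      calls.push({ toolName, parameters });
      // Mirror the real server's result envelope: { success, data } or { success, error }
      if (toolName in cannedResults) {
        return { success: true, data: cannedResults[toolName] };
      }
      return { success: false, error: `Unknown tool: ${toolName}` };
    },
    async cleanup() {
      return { success: true };
    }
  };
}
```

A test can then assert on fake.calls to confirm which tools were requested and with which arguments.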
2. Run the tests
npm test
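VI. Deployment and Running

1. Production configuration
Create ecosystem.config.cjs (the .cjs extension is needed because package.json sets "type": "module" while this file uses CommonJS module.exports):

module.exports = {
  apps: [{
    name: 'ai-web-assistant',
    script: 'server.js',
    // A single instance: sessions and browser handles live in process memory,
    // so cluster workers would not see each other's sessions
    instances: 1,
    exec_mode: 'fork',
    env: {
      NODE_ENV: 'production',
      PORT: 3000
    },
    env_production: {
      NODE_ENV: 'production'
    }
  }]
};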
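2. Docker configuration
Create a Dockerfile:

FROM node:18-alpine

WORKDIR /app

# Install Chromium and the libraries it depends on
RUN apk add --no-cache \
    chromium \
    nss \
    freetype \
    freetype-dev \
    harfbuzz \
    ca-certificates \
    ttf-freefont

# Environment variables: skip Playwright's own browser download and use the system Chromium.
# Playwright ignores the PUPPETEER_* variables, so the launch() call would also need
# executablePath set to /usr/bin/chromium-browser. Alpine is not an officially
# supported Playwright platform.
ENV PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=1

# Copy package.json and install dependencies
COPY package*.json ./
RUN npm ci --only=production

# Copy the source code
COPY . .

# Create a non-root user
RUN addgroup -g 1001 -S nodejs
RUN adduser -S nextjs -u 1001 -G nodejs
USER nextjs

EXPOSE 3000

CMD ["npm", "start"]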
3. Start the application

# Development mode
npm run dev

# Production mode
npm start
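VII. Real-World Use Cases
Scenario 1: Automated data collection
// Instruction: collect trending GitHub projects
const instruction = `Visit the GitHub Trending page (https://github.com/trending), collect today's top 5 most popular JavaScript projects, including project name, star count, and description, and return them as JSON.`;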
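Scenario 2: Automated form filling
// Instruction: register a test user
const instruction = `
Open our test registration page http://localhost:3000/register
and fill in the following information:

  • Username: testuser_$
  • Email: test${Date.now()}@example.com
  • Password: TestPassword123

Then click the register button and confirm that registration succeeded.
`;
Scenario 3: Complex workflow
// Instruction: full e-commerce flow test
const instruction = `
Run the following e-commerce shopping flow:
  1. Log in to the test e-commerce site
  2. Search for "laptop"
  3. Select the first product
  4. Add it to the cart
  5. Go to checkout
  6. Fill in test shipping information
  7. Confirm the order

Report the status after each step.
`;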
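Multi-step instructions like the one in Scenario 3 are easy to mistype when written by hand as a long template literal. As a small sketch, the numbered list can be generated from an array of steps instead. The helper name buildWorkflowInstruction is hypothetical, and the closing sentence simply mirrors the scenario above.

```javascript
// Hypothetical helper (not part of the original project): builds a numbered,
// multi-step instruction string from a goal line and a list of steps.
function buildWorkflowInstruction(goal, steps) {
  const numbered = steps.map((step, index) => `${index + 1}. ${step}`);
  return [goal, ...numbered, 'Report the status after each step.'].join('\n');
}

// Example: the first steps of the e-commerce flow
const workflowInstruction = buildWorkflowInstruction(
  'Run the following e-commerce shopping flow:',
  ['Log in to the test e-commerce site', 'Search for "laptop"', 'Select the first product']
);
console.log(workflowInstruction);
```

The resulting string can be sent as the instruction field of a POST to /api/instruction, just like the hand-written versions.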
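Summary
In this tutorial we built a working AI web operation assistant with the following characteristics:

Complete architecture: from the frontend interface to the backend service and the browser automation layer
Flexible MCP layer: supports a range of web operation tools
AI integration: uses the Claude model to understand natural-language instructions
Error handling: catches tool failures and returns suggestions for common page errors
Extensible design: new tools and features are easy to add
This project shows how modern AI techniques can be combined with browser automation to create an assistant that understands and carries out complex web operations. You can extend it further, for example by adding visual recognition, multi-browser support, or distributed task processing.

Start building your own AI web assistant and see how far automation can take you.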


Posted 2025-10-14 15:04 by 霍格沃兹测试开发学社