chore: initial code commit

This commit is contained in:
Pavel Feldman 2025-03-21 10:58:58 -07:00
parent b1d5410a1b
commit 852709c026
23 changed files with 6307 additions and 204 deletions

2
.gitignore vendored Normal file
View File

@ -0,0 +1,2 @@
lib/
node_modules/

5
.npmignore Normal file
View File

@ -0,0 +1,5 @@
**/*
README.md
LICENSE
!lib/**/*.js
!cli.js

View File

@ -186,7 +186,8 @@
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Portions Copyright (c) Microsoft Corporation.
Portions Copyright 2017 Google Inc.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.

224
README.md
View File

@ -1,3 +1,223 @@
# Repository setup required :wave:
## Playwright MCP
Please visit the website URL :point_right: for this repository to complete the setup of this repository and configure access controls.
This package is experimental and not yet ready for production use.
It is a subject to change and will not respect semver versioning.
### Example config
```js
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp",
"--headless"
]
}
}
}
```
### Running headed browser (Browser with GUI).
```js
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp"
]
}
}
}
```
### Running headed browser on Linux
When running headed browser on system w/o display or from worker processes of the IDEs,
you can run Playwright in a client-server manner. You'll run the Playwright server
from environment with the DISPLAY
```sh
npx playwright run-server
```
And then in MCP config, add following to the `env`:
```js
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp"
],
"env": {
// Use the endpoint from the output of the server above.
"PLAYWRIGHT_WS_ENDPOINT": "ws://localhost:<port>/"
}
}
}
}
```
### Tool Modes
The tools are available in two modes:
1. **Snapshot Mode** (default): Uses accessibility snapshots for better performance and reliability
2. **Vision Mode**: Uses screenshots for visual-based interactions
To use Vision Mode, add the `--vision` flag when starting the server:
```js
{
"mcpServers": {
"playwright": {
"command": "npx",
"args": [
"@playwright/mcp",
"--vision"
]
}
}
}
```
Vision Mode works best with the computer use models that are able to interact with elements using
X Y coordinate space, based on the provided screenshot.
### Snapshot Mode
The Playwright MCP provides a set of tools for browser automation. Here are all available tools:
- **browser_navigate**
- Description: Navigate to a URL
- Parameters:
- `url` (string): The URL to navigate to
- **browser_go_back**
- Description: Go back to the previous page
- Parameters: None
- **browser_go_forward**
- Description: Go forward to the next page
- Parameters: None
- **browser_click**
- Description: Perform click on a web page
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- **browser_hover**
- Description: Hover over element on page
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- **browser_drag**
- Description: Perform drag and drop between two elements
- Parameters:
- `startElement` (string): Human-readable source element description used to obtain permission to interact with the element
- `startRef` (string): Exact source element reference from the page snapshot
- `endElement` (string): Human-readable target element description used to obtain permission to interact with the element
- `endRef` (string): Exact target element reference from the page snapshot
- **browser_type**
- Description: Type text into editable element
- Parameters:
- `element` (string): Human-readable element description used to obtain permission to interact with the element
- `ref` (string): Exact target element reference from the page snapshot
- `text` (string): Text to type into the element
- `submit` (boolean): Whether to submit entered text (press Enter after)
- **browser_press_key**
- Description: Press a key on the keyboard
- Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
- **browser_snapshot**
- Description: Capture accessibility snapshot of the current page (better than screenshot)
- Parameters: None
- **browser_save_as_pdf**
- Description: Save page as PDF
- Parameters: None
- **browser_wait**
- Description: Wait for a specified time in seconds
- Parameters:
- `time` (number): The time to wait in seconds (capped at 10 seconds)
- **browser_close**
- Description: Close the page
- Parameters: None
### Vision Mode
Vision Mode provides tools for visual-based interactions using screenshots. Here are all available tools:
- **browser_navigate**
- Description: Navigate to a URL
- Parameters:
- `url` (string): The URL to navigate to
- **browser_go_back**
- Description: Go back to the previous page
- Parameters: None
- **browser_go_forward**
- Description: Go forward to the next page
- Parameters: None
- **browser_screenshot**
- Description: Capture screenshot of the current page
- Parameters: None
- **browser_move_mouse**
- Description: Move mouse to specified coordinates
- Parameters:
- `x` (number): X coordinate
- `y` (number): Y coordinate
- **browser_click**
- Description: Click at specified coordinates
- Parameters:
- `x` (number): X coordinate to click at
- `y` (number): Y coordinate to click at
- **browser_drag**
- Description: Perform drag and drop operation
- Parameters:
- `startX` (number): Start X coordinate
- `startY` (number): Start Y coordinate
- `endX` (number): End X coordinate
- `endY` (number): End Y coordinate
- **browser_type**
- Description: Type text at specified coordinates
- Parameters:
- `text` (string): Text to type
- `submit` (boolean): Whether to submit entered text (press Enter after)
- **browser_press_key**
- Description: Press a key on the keyboard
- Parameters:
- `key` (string): Name of the key to press or a character to generate, such as `ArrowLeft` or `a`
- **browser_save_as_pdf**
- Description: Save page as PDF
- Parameters: None
- **browser_wait**
- Description: Wait for a specified time in seconds
- Parameters:
- `time` (number): The time to wait in seconds (capped at 10 seconds)
- **browser_close**
- Description: Close the page
- Parameters: None

18
cli.js Executable file
View File

@ -0,0 +1,18 @@
#!/usr/bin/env node
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
require('./lib/program');

199
eslint.config.mjs Normal file
View File

@ -0,0 +1,199 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import typescriptEslint from "@typescript-eslint/eslint-plugin";
import tsParser from "@typescript-eslint/parser";
import notice from "eslint-plugin-notice";
import path from "path";
import { fileURLToPath } from "url";
import stylistic from "@stylistic/eslint-plugin";
import importRules from "eslint-plugin-import";
const __filename = fileURLToPath(import.meta.url);
const __dirname = path.dirname(__filename);
const plugins = {
"@stylistic": stylistic,
"@typescript-eslint": typescriptEslint,
notice,
import: importRules,
};
export const baseRules = {
"@typescript-eslint/no-unused-vars": [
2,
{ args: "none", caughtErrors: "none" },
],
/**
* Enforced rules
*/
// syntax preferences
"object-curly-spacing": ["error", "always"],
quotes: [
2,
"single",
{
avoidEscape: true,
allowTemplateLiterals: true,
},
],
"jsx-quotes": [2, "prefer-single"],
"no-extra-semi": 2,
"@stylistic/semi": [2],
"comma-style": [2, "last"],
"wrap-iife": [2, "inside"],
"spaced-comment": [
2,
"always",
{
markers: ["*"],
},
],
eqeqeq: [2],
"accessor-pairs": [
2,
{
getWithoutSet: false,
setWithoutGet: false,
},
],
"brace-style": [2, "1tbs", { allowSingleLine: true }],
curly: [2, "multi-or-nest", "consistent"],
"new-parens": 2,
"arrow-parens": [2, "as-needed"],
"prefer-const": 2,
"quote-props": [2, "consistent"],
"nonblock-statement-body-position": [2, "below"],
// anti-patterns
"no-var": 2,
"no-with": 2,
"no-multi-str": 2,
"no-caller": 2,
"no-implied-eval": 2,
"no-labels": 2,
"no-new-object": 2,
"no-octal-escape": 2,
"no-self-compare": 2,
"no-shadow-restricted-names": 2,
"no-cond-assign": 2,
"no-debugger": 2,
"no-dupe-keys": 2,
"no-duplicate-case": 2,
"no-empty-character-class": 2,
"no-unreachable": 2,
"no-unsafe-negation": 2,
radix: 2,
"valid-typeof": 2,
"no-implicit-globals": [2],
"no-unused-expressions": [
2,
{ allowShortCircuit: true, allowTernary: true, allowTaggedTemplates: true },
],
"no-proto": 2,
// es2015 features
"require-yield": 2,
"template-curly-spacing": [2, "never"],
// spacing details
"space-infix-ops": 2,
"space-in-parens": [2, "never"],
"array-bracket-spacing": [2, "never"],
"comma-spacing": [2, { before: false, after: true }],
"keyword-spacing": [2, "always"],
"space-before-function-paren": [
2,
{
anonymous: "never",
named: "never",
asyncArrow: "always",
},
],
"no-whitespace-before-property": 2,
"keyword-spacing": [
2,
{
overrides: {
if: { after: true },
else: { after: true },
for: { after: true },
while: { after: true },
do: { after: true },
switch: { after: true },
return: { after: true },
},
},
],
"arrow-spacing": [
2,
{
after: true,
before: true,
},
],
"@stylistic/func-call-spacing": 2,
"@stylistic/type-annotation-spacing": 2,
// file whitespace
"no-multiple-empty-lines": [2, { max: 2, maxEOF: 0 }],
"no-mixed-spaces-and-tabs": 2,
"no-trailing-spaces": 2,
"linebreak-style": [process.platform === "win32" ? 0 : 2, "unix"],
indent: [
2,
2,
{ SwitchCase: 1, CallExpression: { arguments: 2 }, MemberExpression: 2 },
],
"key-spacing": [
2,
{
beforeColon: false,
},
],
"eol-last": 2,
// copyright
"notice/notice": [
2,
{
mustMatch: "Copyright",
templateFile: path.join(__dirname, "utils", "copyright.js"),
},
],
// react
"react/react-in-jsx-scope": 0,
};
const languageOptions = {
parser: tsParser,
ecmaVersion: 9,
sourceType: "module",
};
export default [
{
ignores: ["**/*.js"],
},
{
files: ["**/*.ts", "**/*.tsx"],
plugins,
languageOptions,
rules: baseRules,
},
];

4362
package-lock.json generated Normal file

File diff suppressed because it is too large Load Diff

53
package.json Normal file
View File

@ -0,0 +1,53 @@
{
"name": "@playwright/mcp",
"version": "0.0.1",
"description": "Playwright Tools for MCP",
"repository": {
"type": "git",
"url": "git+https://github.com/microsoft/playwright-mcp.git"
},
"homepage": "https://playwright.dev",
"engines": {
"node": ">=18"
},
"author": {
"name": "Microsoft Corporation"
},
"license": "Apache-2.0",
"scripts": {
"build": "tsc",
"lint": "eslint .",
"watch": "tsc --watch"
},
"exports": {
"./servers/server": "./lib/servers/server.js",
"./servers/screenshot": "./lib/servers/screenshot.js",
"./servers/snapshot": "./lib/servers/snapshot.js",
"./tools/common": "./lib/tools/common.js",
"./tools/screenshot": "./lib/tools/screenshot.js",
"./tools/snapshot": "./lib/tools/snapshot.js",
"./package.json": "./package.json"
},
"dependencies": {
"@modelcontextprotocol/sdk": "^1.6.1",
"commander": "^13.1.0",
"playwright": "1.52.0-alpha-2025-03-21",
"zod-to-json-schema": "^3.24.4"
},
"devDependencies": {
"@eslint/eslintrc": "^3.2.0",
"@eslint/js": "^9.19.0",
"@stylistic/eslint-plugin": "^3.0.1",
"@typescript-eslint/eslint-plugin": "^8.26.1",
"@typescript-eslint/parser": "^8.26.1",
"@typescript-eslint/utils": "^8.26.1",
"@types/node": "^22.13.10",
"eslint": "^9.19.0",
"eslint-plugin-import": "^2.31.0",
"eslint-plugin-notice": "^1.0.0",
"typescript": "^5.8.2"
},
"bin": {
"mcp": "cli.js"
}
}

27
playwright.config.ts Normal file
View File

@ -0,0 +1,27 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { defineConfig } from '@playwright/test';
export default defineConfig({
testDir: './tests',
fullyParallel: true,
forbidOnly: !!process.env.CI,
retries: process.env.CI ? 2 : 0,
workers: process.env.CI ? 1 : undefined,
reporter: 'list',
projects: [{ name: 'default' }],
});

65
src/context.ts Normal file
View File

@ -0,0 +1,65 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import * as playwright from 'playwright';
export class Context {
private _launchOptions: playwright.LaunchOptions;
private _page: playwright.Page | undefined;
private _console: playwright.ConsoleMessage[] = [];
private _initializePromise: Promise<void> | undefined;
constructor(launchOptions: playwright.LaunchOptions) {
this._launchOptions = launchOptions;
}
async ensurePage(): Promise<playwright.Page> {
await this._initialize();
return this._page!;
}
async ensureConsole(): Promise<playwright.ConsoleMessage[]> {
await this._initialize();
return this._console;
}
async close() {
const page = await this.ensurePage();
await page.close();
this._initializePromise = undefined;
}
private async _initialize() {
if (this._initializePromise)
return this._initializePromise;
this._initializePromise = (async () => {
const browser = await this._createBrowser();
this._page = await browser.newPage();
this._page.on('console', event => this._console.push(event));
this._page.on('framenavigated', () => this._console.length = 0);
})();
return this._initializePromise;
}
private async _createBrowser(): Promise<playwright.Browser> {
if (process.env.PLAYWRIGHT_WS_ENDPOINT) {
const url = new URL(process.env.PLAYWRIGHT_WS_ENDPOINT);
url.searchParams.set('launch-options', JSON.stringify(this._launchOptions));
return await playwright.chromium.connect(String(url));
}
return await playwright.chromium.launch({ channel: 'chrome', ...this._launchOptions });
}
}

93
src/program.ts Normal file
View File

@ -0,0 +1,93 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { program } from 'commander';
import { Server } from './server';
import * as snapshot from './tools/snapshot';
import * as common from './tools/common';
import * as screenshot from './tools/screenshot';
import { console } from './resources/console';
import type { LaunchOptions } from './server';
import type { Tool } from './tools/tool';
import type { Resource } from './resources/resource';
const packageJSON = require('../package.json');
program
.version('Version ' + packageJSON.version)
.name(packageJSON.name)
.option('--headless', 'Run browser in headless mode, headed by default')
.option('--vision', 'Run server that uses screenshots (Aria snapshots are used by default)')
.action(async options => {
const launchOptions: LaunchOptions = {
headless: !!options.headless,
};
const tools = options.vision ? screenshotTools : snapshotTools;
const server = new Server({
name: 'Playwright',
version: packageJSON.version,
tools,
resources,
}, launchOptions);
setupExitWatchdog(server);
await server.start();
});
function setupExitWatchdog(server: Server) {
process.stdin.on('close', async () => {
setTimeout(() => process.exit(0), 15000);
await server?.stop();
process.exit(0);
});
}
const commonTools: Tool[] = [
common.pressKey,
common.wait,
common.pdf,
common.close,
];
const snapshotTools: Tool[] = [
common.navigate(true),
common.goBack(true),
common.goForward(true),
snapshot.snapshot,
snapshot.click,
snapshot.hover,
snapshot.type,
...commonTools,
];
const screenshotTools: Tool[] = [
common.navigate(false),
common.goBack(false),
common.goForward(false),
screenshot.screenshot,
screenshot.moveMouse,
screenshot.click,
screenshot.drag,
screenshot.type,
...commonTools,
];
const resources: Resource[] = [
console,
];
program.parse(process.argv);

37
src/resources/console.ts Normal file
View File

@ -0,0 +1,37 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type { Resource, ResourceResult } from './resource';
export const console: Resource = {
schema: {
uri: 'browser://console',
name: 'Page console',
mimeType: 'text/plain',
},
read: async (context, uri) => {
const result: ResourceResult[] = [];
for (const message of await context.ensureConsole()) {
result.push({
uri,
mimeType: 'text/plain',
text: `[${message.type().toUpperCase()}] ${message.text()}`,
});
}
return result;
},
};

36
src/resources/resource.ts Normal file
View File

@ -0,0 +1,36 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type { Context } from '../context';
export type ResourceSchema = {
uri: string;
name: string;
description?: string;
mimeType?: string;
};
export type ResourceResult = {
uri: string;
mimeType?: string;
text?: string;
blob?: string;
};
export type Resource = {
schema: ResourceSchema;
read: (context: Context, uri: string) => Promise<ResourceResult[]>;
};

95
src/server.ts Normal file
View File

@ -0,0 +1,95 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { Server as MCPServer } from '@modelcontextprotocol/sdk/server/index.js';
import { StdioServerTransport } from '@modelcontextprotocol/sdk/server/stdio.js';
import { CallToolRequestSchema, ListResourcesRequestSchema, ListToolsRequestSchema, ReadResourceRequestSchema } from '@modelcontextprotocol/sdk/types.js';
import * as playwright from 'playwright';
import { Context } from './context';
import type { Tool } from './tools/tool';
import type { Resource } from './resources/resource';
export type LaunchOptions = {
headless?: boolean;
};
export class Server {
private _server: MCPServer;
private _tools: Tool[];
private _page: playwright.Page | undefined;
private _context: Context;
constructor(options: { name: string, version: string, tools: Tool[], resources: Resource[] }, launchOptions: LaunchOptions) {
const { name, version, tools, resources } = options;
this._context = new Context(launchOptions);
this._server = new MCPServer({ name, version }, {
capabilities: {
tools: {},
resources: {},
}
});
this._tools = tools;
this._server.setRequestHandler(ListToolsRequestSchema, async () => {
return { tools: tools.map(tool => tool.schema) };
});
this._server.setRequestHandler(ListResourcesRequestSchema, async () => {
return { resources: resources.map(resource => resource.schema) };
});
this._server.setRequestHandler(CallToolRequestSchema, async request => {
const tool = this._tools.find(tool => tool.schema.name === request.params.name);
if (!tool) {
return {
content: [{ type: 'text', text: `Tool "${request.params.name}" not found` }],
isError: true,
};
}
try {
const result = await tool.handle(this._context, request.params.arguments);
return result;
} catch (error) {
return {
content: [{ type: 'text', text: String(error) }],
isError: true,
};
}
});
this._server.setRequestHandler(ReadResourceRequestSchema, async request => {
const resource = resources.find(resource => resource.schema.uri === request.params.uri);
if (!resource)
return { contents: [] };
const contents = await resource.read(this._context, request.params.uri);
return { contents };
});
}
async start() {
const transport = new StdioServerTransport();
await this._server.connect(transport);
}
async stop() {
await this._server.close();
await this._page?.context()?.browser()?.close();
}
}

165
src/tools/common.ts Normal file
View File

@ -0,0 +1,165 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import os from 'os';
import path from 'path';
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { captureAriaSnapshot, runAndWait } from './utils';
import type { ToolFactory, Tool } from './tool';
const navigateSchema = z.object({
url: z.string().describe('The URL to navigate to'),
});
export const navigate: ToolFactory = snapshot => ({
schema: {
name: 'browser_navigate',
description: 'Navigate to a URL',
inputSchema: zodToJsonSchema(navigateSchema),
},
handle: async (context, params) => {
const validatedParams = navigateSchema.parse(params);
const page = await context.ensurePage();
await page.goto(validatedParams.url, { waitUntil: 'domcontentloaded' });
// Cap load event to 5 seconds, the page is operational at this point.
await page.waitForLoadState('load', { timeout: 5000 }).catch(() => {});
if (snapshot)
return captureAriaSnapshot(page);
return {
content: [{
type: 'text',
text: `Navigated to ${validatedParams.url}`,
}],
};
},
});
const goBackSchema = z.object({});
export const goBack: ToolFactory = snapshot => ({
schema: {
name: 'browser_go_back',
description: 'Go back to the previous page',
inputSchema: zodToJsonSchema(goBackSchema),
},
handle: async context => {
return await runAndWait(context, 'Navigated back', async () => {
const page = await context.ensurePage();
await page.goBack();
}, snapshot);
},
});
const goForwardSchema = z.object({});
export const goForward: ToolFactory = snapshot => ({
schema: {
name: 'browser_go_forward',
description: 'Go forward to the next page',
inputSchema: zodToJsonSchema(goForwardSchema),
},
handle: async context => {
return await runAndWait(context, 'Navigated forward', async () => {
const page = await context.ensurePage();
await page.goForward();
}, snapshot);
},
});
const waitSchema = z.object({
time: z.number().describe('The time to wait in seconds'),
});
export const wait: Tool = {
schema: {
name: 'browser_wait',
description: 'Wait for a specified time in seconds',
inputSchema: zodToJsonSchema(waitSchema),
},
handle: async (context, params) => {
const validatedParams = waitSchema.parse(params);
const page = await context.ensurePage();
await page.waitForTimeout(Math.min(10000, validatedParams.time * 1000));
return {
content: [{
type: 'text',
text: `Waited for ${validatedParams.time} seconds`,
}],
};
},
};
const pressKeySchema = z.object({
key: z.string().describe('Name of the key to press or a character to generate, such as `ArrowLeft` or `a`'),
});
export const pressKey: Tool = {
schema: {
name: 'browser_press_key',
description: 'Press a key on the keyboard',
inputSchema: zodToJsonSchema(pressKeySchema),
},
handle: async (context, params) => {
const validatedParams = pressKeySchema.parse(params);
return await runAndWait(context, `Pressed key ${validatedParams.key}`, async page => {
await page.keyboard.press(validatedParams.key);
});
},
};
const pdfSchema = z.object({});
export const pdf: Tool = {
schema: {
name: 'browser_save_as_pdf',
description: 'Save page as PDF',
inputSchema: zodToJsonSchema(pdfSchema),
},
handle: async context => {
const page = await context.ensurePage();
const fileName = path.join(os.tmpdir(), `/page-${new Date().toISOString()}.pdf`);
await page.pdf({ path: fileName });
return {
content: [{
type: 'text',
text: `Saved as ${fileName}`,
}],
};
},
};
const closeSchema = z.object({});
export const close: Tool = {
schema: {
name: 'browser_close',
description: 'Close the page',
inputSchema: zodToJsonSchema(closeSchema),
},
handle: async context => {
await context.close();
return {
content: [{
type: 'text',
text: `Page closed`,
}],
};
},
};

133
src/tools/screenshot.ts Normal file
View File

@ -0,0 +1,133 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import { zodToJsonSchema } from 'zod-to-json-schema';
import { runAndWait } from './utils';
import type { Tool } from './tool';
export const screenshot: Tool = {
schema: {
name: 'browser_screenshot',
description: 'Take a screenshot of the current page',
inputSchema: zodToJsonSchema(z.object({})),
},
handle: async context => {
const page = await context.ensurePage();
const screenshot = await page.screenshot({ type: 'jpeg', quality: 50, scale: 'css' });
return {
content: [{ type: 'image', data: screenshot.toString('base64'), mimeType: 'image/jpeg' }],
};
},
};
const elementSchema = z.object({
element: z.string().describe('Human-readable element description used to obtain permission to interact with the element'),
});
const moveMouseSchema = elementSchema.extend({
x: z.number().describe('X coordinate'),
y: z.number().describe('Y coordinate'),
});
export const moveMouse: Tool = {
schema: {
name: 'browser_move_mouse',
description: 'Move mouse to a given position',
inputSchema: zodToJsonSchema(moveMouseSchema),
},
handle: async (context, params) => {
const validatedParams = moveMouseSchema.parse(params);
const page = await context.ensurePage();
await page.mouse.move(validatedParams.x, validatedParams.y);
return {
content: [{ type: 'text', text: `Moved mouse to (${validatedParams.x}, ${validatedParams.y})` }],
};
},
};
const clickSchema = elementSchema.extend({
x: z.number().describe('X coordinate'),
y: z.number().describe('Y coordinate'),
});
export const click: Tool = {
schema: {
name: 'browser_click',
description: 'Click left mouse button',
inputSchema: zodToJsonSchema(clickSchema),
},
handle: async (context, params) => {
return await runAndWait(context, 'Clicked mouse', async page => {
const validatedParams = clickSchema.parse(params);
await page.mouse.move(validatedParams.x, validatedParams.y);
await page.mouse.down();
await page.mouse.up();
});
},
};
const dragSchema = elementSchema.extend({
startX: z.number().describe('Start X coordinate'),
startY: z.number().describe('Start Y coordinate'),
endX: z.number().describe('End X coordinate'),
endY: z.number().describe('End Y coordinate'),
});
export const drag: Tool = {
schema: {
name: 'browser_drag',
description: 'Drag left mouse button',
inputSchema: zodToJsonSchema(dragSchema),
},
handle: async (context, params) => {
const validatedParams = dragSchema.parse(params);
return await runAndWait(context, `Dragged mouse from (${validatedParams.startX}, ${validatedParams.startY}) to (${validatedParams.endX}, ${validatedParams.endY})`, async page => {
await page.mouse.move(validatedParams.startX, validatedParams.startY);
await page.mouse.down();
await page.mouse.move(validatedParams.endX, validatedParams.endY);
await page.mouse.up();
});
},
};
const typeSchema = z.object({
text: z.string().describe('Text to type into the element'),
submit: z.boolean().describe('Whether to submit entered text (press Enter after)'),
});
export const type: Tool = {
schema: {
name: 'browser_type',
description: 'Type text',
inputSchema: zodToJsonSchema(typeSchema),
},
handle: async (context, params) => {
const validatedParams = typeSchema.parse(params);
return await runAndWait(context, `Typed text "${validatedParams.text}"`, async page => {
await page.keyboard.type(validatedParams.text);
if (validatedParams.submit)
await page.keyboard.press('Enter');
});
},
};

117
src/tools/snapshot.ts Normal file
View File

@ -0,0 +1,117 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { z } from 'zod';
import zodToJsonSchema from 'zod-to-json-schema';
import { captureAriaSnapshot, runAndWait } from './utils';
import type * as playwright from 'playwright';
import type { Tool } from './tool';
export const snapshot: Tool = {
schema: {
name: 'browser_snapshot',
description: 'Capture accessibility snapshot of the current page, this is better than screenshot',
inputSchema: zodToJsonSchema(z.object({})),
},
handle: async context => {
return await captureAriaSnapshot(await context.ensurePage());
},
};
const elementSchema = z.object({
element: z.string().describe('Human-readable element description used to obtain permission to interact with the element'),
ref: z.string().describe('Exact target element reference from the page snapshot'),
});
export const click: Tool = {
schema: {
name: 'browser_click',
description: 'Perform click on a web page',
inputSchema: zodToJsonSchema(elementSchema),
},
handle: async (context, params) => {
const validatedParams = elementSchema.parse(params);
return runAndWait(context, `"${validatedParams.element}" clicked`, page => refLocator(page, validatedParams.ref).click(), true);
},
};
const dragSchema = z.object({
startElement: z.string().describe('Human-readable source element description used to obtain the permission to interact with the element'),
startRef: z.string().describe('Exact source element reference from the page snapshot'),
endElement: z.string().describe('Human-readable target element description used to obtain the permission to interact with the element'),
endRef: z.string().describe('Exact target element reference from the page snapshot'),
});
export const drag: Tool = {
schema: {
name: 'browser_drag',
description: 'Perform drag and drop between two elements',
inputSchema: zodToJsonSchema(dragSchema),
},
handle: async (context, params) => {
const validatedParams = dragSchema.parse(params);
return runAndWait(context, `Dragged "${validatedParams.startElement}" to "${validatedParams.endElement}"`, async page => {
const startLocator = refLocator(page, validatedParams.startRef);
const endLocator = refLocator(page, validatedParams.endRef);
await startLocator.dragTo(endLocator);
}, true);
},
};
export const hover: Tool = {
schema: {
name: 'browser_hover',
description: 'Hover over element on page',
inputSchema: zodToJsonSchema(elementSchema),
},
handle: async (context, params) => {
const validatedParams = elementSchema.parse(params);
return runAndWait(context, `Hovered over "${validatedParams.element}"`, page => refLocator(page, validatedParams.ref).hover(), true);
},
};
const typeSchema = elementSchema.extend({
text: z.string().describe('Text to type into the element'),
submit: z.boolean().describe('Whether to submit entered text (press Enter after)'),
});
export const type: Tool = {
schema: {
name: 'browser_type',
description: 'Type text into editable element',
inputSchema: zodToJsonSchema(typeSchema),
},
handle: async (context, params) => {
const validatedParams = typeSchema.parse(params);
return await runAndWait(context, `Typed "${validatedParams.text}" into "${validatedParams.element}"`, async page => {
const locator = refLocator(page, validatedParams.ref);
await locator.fill(validatedParams.text);
if (validatedParams.submit)
await locator.press('Enter');
}, true);
},
};
function refLocator(page: playwright.Page, ref: string): playwright.Locator {
return page.locator(`aria-ref=${ref}`);
}

37
src/tools/tool.ts Normal file
View File

@ -0,0 +1,37 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type { ImageContent, TextContent } from '@modelcontextprotocol/sdk/types';
import type { JsonSchema7Type } from 'zod-to-json-schema';
import type { Context } from '../context';
export type ToolSchema = {
name: string;
description: string;
inputSchema: JsonSchema7Type;
};
export type ToolResult = {
content: (ImageContent | TextContent)[];
isError?: boolean;
};
export type Tool = {
schema: ToolSchema;
handle: (context: Context, params?: Record<string, any>) => Promise<ToolResult>;
};
export type ToolFactory = (snapshot: boolean) => Tool;

95
src/tools/utils.ts Normal file
View File

@ -0,0 +1,95 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import type * as playwright from 'playwright';
import type { ToolResult } from './tool';
import type { Context } from '../context';
async function waitForCompletion<R>(page: playwright.Page, callback: () => Promise<R>): Promise<R> {
const requests = new Set<playwright.Request>();
let frameNavigated = false;
let waitCallback: () => void = () => {};
const waitBarrier = new Promise<void>(f => { waitCallback = f; });
const requestListener = (request: playwright.Request) => requests.add(request);
const requestFinishedListener = (request: playwright.Request) => {
requests.delete(request);
if (!requests.size)
waitCallback();
};
const frameNavigateListener = (frame: playwright.Frame) => {
if (frame.parentFrame())
return;
frameNavigated = true;
dispose();
clearTimeout(timeout);
void frame.waitForLoadState('load').then(() => {
waitCallback();
});
};
const onTimeout = () => {
dispose();
waitCallback();
};
page.on('request', requestListener);
page.on('requestfinished', requestFinishedListener);
page.on('framenavigated', frameNavigateListener);
const timeout = setTimeout(onTimeout, 10000);
const dispose = () => {
page.off('request', requestListener);
page.off('requestfinished', requestFinishedListener);
page.off('framenavigated', frameNavigateListener);
clearTimeout(timeout);
};
try {
const result = await callback();
if (!requests.size && !frameNavigated)
waitCallback();
await waitBarrier;
await page.evaluate(() => new Promise(f => setTimeout(f, 1000)));
return result;
} finally {
dispose();
}
}
export async function runAndWait(context: Context, status: string, callback: (page: playwright.Page) => Promise<any>, snapshot: boolean = false): Promise<ToolResult> {
const page = await context.ensurePage();
await waitForCompletion(page, () => callback(page));
return snapshot ? captureAriaSnapshot(page, status) : {
content: [{ type: 'text', text: status }],
};
}
export async function captureAriaSnapshot(page: playwright.Page, status: string = ''): Promise<ToolResult> {
const snapshot = await page.locator('html').ariaSnapshot({ ref: true });
return {
content: [{ type: 'text', text: `${status ? `${status}\n` : ''}
- Page URL: ${page.url()}
- Page Title: ${await page.title()}
- Page Snapshot
\`\`\`yaml
${snapshot}
\`\`\`
`
}],
};
}

163
tests/basic.spec.ts Normal file
View File

@ -0,0 +1,163 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import { test, expect } from './fixtures';
test('test tool list', async ({ server }) => {
const list = await server.send({
jsonrpc: '2.0',
id: 1,
method: 'tools/list',
});
expect(list).toEqual(expect.objectContaining({
id: 1,
result: expect.objectContaining({
tools: [
expect.objectContaining({
name: 'browser_navigate',
}),
expect.objectContaining({
name: 'browser_go_back',
}),
expect.objectContaining({
name: 'browser_go_forward',
}),
expect.objectContaining({
name: 'browser_snapshot',
}),
expect.objectContaining({
name: 'browser_click',
}),
expect.objectContaining({
name: 'browser_hover',
}),
expect.objectContaining({
name: 'browser_type',
}),
expect.objectContaining({
name: 'browser_press_key',
}),
expect.objectContaining({
name: 'browser_wait',
}),
expect.objectContaining({
name: 'browser_save_as_pdf',
}),
expect.objectContaining({
name: 'browser_close',
}),
],
}),
}));
});
test('test resources list', async ({ server }) => {
const list = await server.send({
jsonrpc: '2.0',
id: 2,
method: 'resources/list',
});
expect(list).toEqual(expect.objectContaining({
id: 2,
result: expect.objectContaining({
resources: [
expect.objectContaining({
uri: 'browser://console',
mimeType: 'text/plain',
}),
],
}),
}));
});
test('test browser_navigate', async ({ server }) => {
const response = await server.send({
jsonrpc: '2.0',
id: 2,
method: 'tools/call',
params: {
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><body>Hello, world!</body></html>',
},
},
});
expect(response).toEqual(expect.objectContaining({
id: 2,
result: {
content: [{
type: 'text',
text: `
- Page URL: data:text/html,<html><title>Title</title><body>Hello, world!</body></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s1e2]: Hello, world!
\`\`\`
`,
}],
},
}));
});
test('test browser_click', async ({ server }) => {
await server.send({
jsonrpc: '2.0',
id: 2,
method: 'tools/call',
params: {
name: 'browser_navigate',
arguments: {
url: 'data:text/html,<html><title>Title</title><button>Submit</button></html>',
},
},
});
const response = await server.send({
jsonrpc: '2.0',
id: 3,
method: 'tools/call',
params: {
name: 'browser_click',
arguments: {
element: 'Submit button',
ref: 's1e4',
},
},
});
expect(response).toEqual(expect.objectContaining({
id: 3,
result: {
content: [{
type: 'text',
text: `\"Submit button\" clicked
- Page URL: data:text/html,<html><title>Title</title><button>Submit</button></html>
- Page Title: Title
- Page Snapshot
\`\`\`yaml
- document [ref=s2e2]:
- button \"Submit\" [ref=s2e4]
\`\`\`
`,
}],
},
}));
});

151
tests/fixtures.ts Normal file
View File

@ -0,0 +1,151 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
import path from 'path';
import { spawn } from 'child_process';
import EventEmitter from 'events';
import { test as baseTest, expect } from '@playwright/test';
import type { ChildProcess } from 'child_process';
export { expect } from '@playwright/test';
class MCPServer extends EventEmitter {
private _child: ChildProcess;
private _messageQueue: any[] = [];
private _messageResolvers: ((value: any) => void)[] = [];
private _buffer: string = '';
constructor(command: string, args: string[]) {
super();
this._child = spawn(command, args, {
stdio: ['pipe', 'pipe', 'pipe'],
});
this._child.stdout?.on('data', data => {
this._buffer += data.toString();
let newlineIndex: number;
while ((newlineIndex = this._buffer.indexOf('\n')) !== -1) {
const message = this._buffer.slice(0, newlineIndex).trim();
this._buffer = this._buffer.slice(newlineIndex + 1);
if (!message)
continue;
const parsed = JSON.parse(message);
if (this._messageResolvers.length > 0) {
const resolve = this._messageResolvers.shift();
resolve!(parsed);
} else {
this._messageQueue.push(parsed);
}
}
});
this._child.stderr?.on('data', data => {
throw new Error('Server stderr:', data.toString());
});
this._child.on('exit', code => {
if (code !== 0)
throw new Error(`Server exited with code ${code}`);
});
}
async send(message: any, options?: { timeout?: number }): Promise<void> {
await this.sendNoReply(message);
return this._waitForResponse(options || {});
}
async sendNoReply(message: any): Promise<void> {
const jsonMessage = JSON.stringify(message) + '\n';
await new Promise<void>((resolve, reject) => {
this._child.stdin?.write(jsonMessage, err => {
if (err)
reject(err);
else
resolve();
});
});
}
private async _waitForResponse(options: { timeout?: number }): Promise<any> {
if (this._messageQueue.length > 0)
return this._messageQueue.shift();
return new Promise((resolve, reject) => {
const timeoutId = setTimeout(() => {
reject(new Error('Timeout waiting for message'));
}, options.timeout || 5000);
this._messageResolvers.push(message => {
clearTimeout(timeoutId);
resolve(message);
});
});
}
async close(): Promise<void> {
return new Promise(resolve => {
this._child.on('exit', () => resolve());
this._child.stdin?.end();
});
}
}
export const test = baseTest.extend<{ server: MCPServer }>({
server: async ({}, use) => {
const server = new MCPServer('node', [path.join(__dirname, '../cli.js'), '--headless']);
const initialize = await server.send({
jsonrpc: '2.0',
id: 0,
method: 'initialize',
params: {
protocolVersion: '2024-11-05',
capabilities: {},
clientInfo: {
name: 'Playwright Test',
version: '0.0.0',
},
},
});
expect(initialize).toEqual(expect.objectContaining({
id: 0,
result: expect.objectContaining({
protocolVersion: '2024-11-05',
capabilities: {
tools: {},
resources: {},
},
serverInfo: expect.objectContaining({
name: 'Playwright',
version: expect.any(String),
}),
}),
}));
await server.sendNoReply({
jsonrpc: '2.0',
method: 'notifications/initialized',
});
await use(server);
await server.close();
},
});

14
tsconfig.json Normal file
View File

@ -0,0 +1,14 @@
{
"compilerOptions": {
"target": "ESNext",
"skipLibCheck": true,
"esModuleInterop": true,
"moduleResolution": "node",
"strict": true,
"module": "ESNext",
"outDir": "./lib"
},
"include": [
"src",
],
}

15
utils/copyright.js Normal file
View File

@ -0,0 +1,15 @@
/**
* Copyright (c) Microsoft Corporation.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/