Playwright MCP
This package is experimental and not yet ready for production use. It is subject to change and does not follow semantic versioning (semver).
Example config
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp",
        "--headless"
      ]
    }
  }
}
Running headed browser (browser with GUI)
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp"
      ]
    }
  }
}
Running headed browser on Linux
When running a headed browser on a system without a display, or from the worker processes of an IDE, you can run Playwright in a client-server manner. Run the Playwright server from an environment that has DISPLAY set:
npx playwright run-server
Then, in the MCP config, add the following to the env:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp"
      ],
      "env": {
        // Use the endpoint from the output of the server above.
        "PLAYWRIGHT_WS_ENDPOINT": "ws://localhost:<port>/"
      }
    }
  }
}
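As a rough sketch of the client-server setup described above, the config could be generated from the port the server reports. The helper name buildClientServerConfig and the port 3000 are hypothetical and used only for illustration; use the endpoint that npx playwright run-server actually prints.

```typescript
// Shape of one entry under "mcpServers", mirroring the config shown above.
interface McpServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper (not part of @playwright/mcp): builds the config for
// the client-server setup. `port` must be the port reported by
// `npx playwright run-server`.
function buildClientServerConfig(port: number): McpConfig {
  if (!Number.isInteger(port) || port <= 0 || port > 65535) {
    throw new Error(`Invalid port: ${port}`);
  }
  return {
    mcpServers: {
      playwright: {
        command: "npx",
        args: ["@playwright/mcp"],
        env: { PLAYWRIGHT_WS_ENDPOINT: `ws://localhost:${port}/` },
      },
    },
  };
}

// Example with an assumed port; prints JSON ready to paste into the MCP config.
console.log(JSON.stringify(buildClientServerConfig(3000), null, 2));
```

Generating the file this way keeps the endpoint in one place, which helps if the server port changes between runs.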
Tool Modes
The tools are available in two modes:
- Snapshot Mode (default): Uses accessibility snapshots for better performance and reliability
- Vision Mode: Uses screenshots for visual-based interactions
To use Vision Mode, add the --vision flag when starting the server:
{
  "mcpServers": {
    "playwright": {
      "command": "npx",
      "args": [
        "@playwright/mcp",
        "--vision"
      ]
    }
  }
}
Vision Mode works best with computer-use models that can interact with elements in X/Y coordinate space, based on the provided screenshot.
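The mode selection above comes down to which flags appear in args. A minimal sketch, assuming a hypothetical helper named buildArgs; only the --headless and --vision flags come from this document, and whether the two can be combined is an assumption not confirmed here.

```typescript
// Options covering the two flags documented above.
interface LaunchOptions {
  headless?: boolean; // adds --headless
  vision?: boolean; // adds --vision (Vision Mode); omitted means Snapshot Mode
}

// Hypothetical helper (not part of @playwright/mcp): assembles the "args"
// array for the MCP config. Combining both flags is an assumption.
function buildArgs(options: LaunchOptions = {}): string[] {
  const args = ["@playwright/mcp"];
  if (options.headless) args.push("--headless");
  if (options.vision) args.push("--vision");
  return args;
}

// Vision Mode config, equivalent to the JSON example above.
const config = {
  mcpServers: {
    playwright: { command: "npx", args: buildArgs({ vision: true }) },
  },
};
console.log(JSON.stringify(config, null, 2));
```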
Snapshot Mode
The Playwright MCP provides a set of tools for browser automation. Here are all available tools:
- browser_navigate
  - Description: Navigate to a URL
  - Parameters:
    - url (string): The URL to navigate to
- browser_go_back
  - Description: Go back to the previous page
  - Parameters: None
- browser_go_forward
  - Description: Go forward to the next page
  - Parameters: None
- browser_click
  - Description: Perform click on a web page
  - Parameters:
    - element (string): Human-readable element description used to obtain permission to interact with the element
    - ref (string): Exact target element reference from the page snapshot
- browser_hover
  - Description: Hover over element on page
  - Parameters:
    - element (string): Human-readable element description used to obtain permission to interact with the element
    - ref (string): Exact target element reference from the page snapshot
- browser_drag
  - Description: Perform drag and drop between two elements
  - Parameters:
    - startElement (string): Human-readable source element description used to obtain permission to interact with the element
    - startRef (string): Exact source element reference from the page snapshot
    - endElement (string): Human-readable target element description used to obtain permission to interact with the element
    - endRef (string): Exact target element reference from the page snapshot
- browser_type
  - Description: Type text into editable element
  - Parameters:
    - element (string): Human-readable element description used to obtain permission to interact with the element
    - ref (string): Exact target element reference from the page snapshot
    - text (string): Text to type into the element
    - submit (boolean): Whether to submit entered text (press Enter after)
- browser_press_key
  - Description: Press a key on the keyboard
  - Parameters:
    - key (string): Name of the key to press or a character to generate, such as ArrowLeft or a
- browser_snapshot
  - Description: Capture accessibility snapshot of the current page (better than screenshot)
  - Parameters: None
- browser_save_as_pdf
  - Description: Save page as PDF
  - Parameters: None
- browser_wait
  - Description: Wait for a specified time in seconds
  - Parameters:
    - time (number): The time to wait in seconds (capped at 10 seconds)
- browser_close
  - Description: Close the page
  - Parameters: None
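To illustrate how the Snapshot Mode parameters fit together, here is a minimal sketch of the requests a client might send. It assumes the client speaks MCP's JSON-RPC tools/call method; the toolCall helper, the example URL, the element description, and the REF_FROM_SNAPSHOT placeholder are all hypothetical. A real ref has to be copied from the output of browser_snapshot.

```typescript
// Argument shapes for a few Snapshot Mode tools, taken from the list above.
interface SnapshotToolArgs {
  browser_navigate: { url: string };
  browser_snapshot: Record<string, never>; // no parameters
  browser_click: { element: string; ref: string };
  browser_type: { element: string; ref: string; text: string; submit: boolean };
}

// JSON-RPC 2.0 envelope for an MCP tool call.
interface ToolCallRequest<N extends keyof SnapshotToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: N; arguments: SnapshotToolArgs[N] };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function toolCall<N extends keyof SnapshotToolArgs>(
  name: N,
  args: SnapshotToolArgs[N],
): ToolCallRequest<N> {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Typical order: navigate, take a snapshot, then act on a ref from it.
const requests = [
  toolCall("browser_navigate", { url: "https://example.com" }),
  toolCall("browser_snapshot", {}),
  toolCall("browser_click", {
    element: "Sign in button", // human-readable description
    ref: "REF_FROM_SNAPSHOT", // placeholder, copy the real ref from the snapshot
  }),
];

for (const request of requests) {
  console.log(JSON.stringify(request));
}
```

Typing the arguments per tool name lets the compiler reject a browser_click call that is missing ref, which is the most likely mistake when building these requests by hand.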
Vision Mode
Vision Mode provides tools for visual-based interactions using screenshots. Here are all available tools:
- browser_navigate
  - Description: Navigate to a URL
  - Parameters:
    - url (string): The URL to navigate to
- browser_go_back
  - Description: Go back to the previous page
  - Parameters: None
- browser_go_forward
  - Description: Go forward to the next page
  - Parameters: None
- browser_screenshot
  - Description: Capture screenshot of the current page
  - Parameters: None
- browser_move_mouse
  - Description: Move mouse to specified coordinates
  - Parameters:
    - x (number): X coordinate
    - y (number): Y coordinate
- browser_click
  - Description: Click at specified coordinates
  - Parameters:
    - x (number): X coordinate to click at
    - y (number): Y coordinate to click at
- browser_drag
  - Description: Perform drag and drop operation
  - Parameters:
    - startX (number): Start X coordinate
    - startY (number): Start Y coordinate
    - endX (number): End X coordinate
    - endY (number): End Y coordinate
- browser_type
  - Description: Type text at specified coordinates
  - Parameters:
    - text (string): Text to type
    - submit (boolean): Whether to submit entered text (press Enter after)
- browser_press_key
  - Description: Press a key on the keyboard
  - Parameters:
    - key (string): Name of the key to press or a character to generate, such as ArrowLeft or a
- browser_save_as_pdf
  - Description: Save page as PDF
  - Parameters: None
- browser_wait
  - Description: Wait for a specified time in seconds
  - Parameters:
    - time (number): The time to wait in seconds (capped at 10 seconds)
- browser_close
  - Description: Close the page
  - Parameters: None
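In Vision Mode the caller supplies coordinates, so a model's output usually has to be turned into x/y values first. The sketch below assumes the model reports a bounding box in screenshot pixels. The Box type and the centerOf, clickArgs, and dragArgs helpers are hypothetical; only the parameter names (x, y, startX, startY, endX, endY) come from the list above.

```typescript
// Hypothetical bounding box in screenshot pixel coordinates.
interface Box {
  left: number;
  top: number;
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Center of a box, rounded to whole pixels.
function centerOf(box: Box): Point {
  return {
    x: Math.round(box.left + box.width / 2),
    y: Math.round(box.top + box.height / 2),
  };
}

// Arguments for browser_click: click the middle of the element.
function clickArgs(box: Box): Point {
  return centerOf(box);
}

// Arguments for browser_drag: drag from the center of one box to another.
function dragArgs(from: Box, to: Box) {
  const start = centerOf(from);
  const end = centerOf(to);
  return { startX: start.x, startY: start.y, endX: end.x, endY: end.y };
}

const button: Box = { left: 100, top: 40, width: 80, height: 20 };
const target: Box = { left: 300, top: 200, width: 50, height: 50 };

console.log(clickArgs(button)); // { x: 140, y: 50 }
console.log(dragArgs(button, target)); // { startX: 140, startY: 50, endX: 325, endY: 225 }
```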