Browser Actions Reference
Browser-Use provides a comprehensive set of browser actions that agents can use to interact with web pages. This reference guide covers all available actions, their parameters, and examples of how to use them.
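The examples in this reference call `await agent.run()` directly, which only works where top-level `await` is available (a notebook or an async REPL, for instance). In a regular script the call has to sit inside a coroutine driven by `asyncio.run`. The sketch below shows that wrapper; `FakeAgent` is a hypothetical stand-in, not part of Browser-Use, used so the snippet runs without a browser.

```python
import asyncio


class FakeAgent:
    """Hypothetical stand-in for Agent so the sketch runs without a browser."""

    def __init__(self, task: str) -> None:
        self.task = task

    async def run(self) -> str:
        # A real agent would drive the browser here.
        await asyncio.sleep(0)
        return f"done: {self.task}"


async def main() -> str:
    agent = FakeAgent(task="Navigate to Wikipedia's homepage")
    # `await` is legal here because we are inside a coroutine.
    return await agent.run()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

With a real agent, only the body of `main` changes; the `asyncio.run(main())` entry point stays the same.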
Navigation Actions
Actions that control browser navigation and page loading.
GO_TO
Navigates to a specified URL.
Parameters:
- url (string): The URL to navigate to
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Navigate to Wikipedia's homepage",
    actions=[BrowserAction.GO_TO]
)
result = await agent.run()
BACK
Navigates back to the previous page in browser history.
Parameters: None
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Go back to the previous page after viewing an article",
    actions=[BrowserAction.BACK]
)
result = await agent.run()
FORWARD
Navigates forward in browser history.
Parameters: None
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Go forward after going back to the previous page",
    actions=[BrowserAction.FORWARD]
)
result = await agent.run()
REFRESH
Reloads the current page.
Parameters: None
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Refresh the page to check for updates",
    actions=[BrowserAction.REFRESH]
)
result = await agent.run()
Interaction Actions
Actions that interact with elements on a web page.
CLICK
Clicks on an element identified by a selector or natural language description.
Parameters:
- selector (string, optional): CSS selector for the element
- description (string, optional): Natural language description of the element
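Most element-level actions in this reference accept the same `selector` / `description` pair. The sketch below is one way such a pair could be validated before use. `ElementTarget` is a hypothetical helper, and both the "at least one is required" rule and the "selector takes precedence" rule are assumptions, not documented Browser-Use behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementTarget:
    """Hypothetical container for the two ways of identifying an element."""

    selector: Optional[str] = None
    description: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: an element-level action needs at least one locator.
        if not self.selector and not self.description:
            raise ValueError("provide a selector, a description, or both")

    def strategy(self) -> str:
        # Assumption: an explicit CSS selector wins over a description.
        return "css" if self.selector else "natural_language"
```

For instance, `ElementTarget(description="the 'Sign In' button").strategy()` returns `"natural_language"`, while adding `selector="#sign-in"` switches it to `"css"`.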
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Click on the 'Sign In' button",
    actions=[BrowserAction.CLICK]
)
result = await agent.run()
TYPE
Types text into an input field.
Parameters:
- text (string): Text to type
- selector (string, optional): CSS selector for the input element
- description (string, optional): Natural language description of the input element
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Type 'artificial intelligence' into the search box",
    actions=[BrowserAction.TYPE]
)
result = await agent.run()
SELECT
Selects an option from a dropdown menu.
Parameters:
- option (string): Option text or value to select
- selector (string, optional): CSS selector for the select element
- description (string, optional): Natural language description of the select element
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Select 'English' from the language dropdown",
    actions=[BrowserAction.SELECT]
)
result = await agent.run()
HOVER
Hovers over an element.
Parameters:
- selector (string, optional): CSS selector for the element
- description (string, optional): Natural language description of the element
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Hover over the user menu to reveal dropdown options",
    actions=[BrowserAction.HOVER]
)
result = await agent.run()
DRAG_AND_DROP
Drags an element and drops it onto another element.
Parameters:
- source_selector (string, optional): CSS selector for the source element
- target_selector (string, optional): CSS selector for the target element
- source_description (string, optional): Natural language description of the source element
- target_description (string, optional): Natural language description of the target element
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Drag the first item to the shopping cart",
    actions=[BrowserAction.DRAG_AND_DROP]
)
result = await agent.run()
Extraction Actions
Actions that extract information from web pages.
GET_TEXT
Extracts text from an element or the entire page.
Parameters:
- selector (string, optional): CSS selector for the element
- description (string, optional): Natural language description of the element
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Get the text of the main article headline",
    actions=[BrowserAction.GET_TEXT]
)
result = await agent.run()
GET_ATTRIBUTE
Retrieves an attribute value from an element.
Parameters:
- attribute (string): The attribute name (e.g., "href", "src")
- selector (string, optional): CSS selector for the element
- description (string, optional): Natural language description of the element
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Get the URL from the first image in the gallery",
    actions=[BrowserAction.GET_ATTRIBUTE]
)
result = await agent.run()
EXTRACT_DATA
Extracts structured data from a page based on a specified pattern.
Parameters:
- pattern (object): Object describing the data structure to extract
- scope (string, optional): CSS selector to limit extraction scope
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Extract all product names and prices from the search results",
    actions=[BrowserAction.EXTRACT_DATA]
)
result = await agent.run()
Page Control Actions
Actions that control the page view and state.
SCROLL
Scrolls the page in a specified direction.
Parameters:
- direction (string): Direction to scroll ("up", "down", "left", "right")
- amount (string or number, optional): Amount to scroll ("little", "half", "full", or pixel value)
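To make the `direction` / `amount` pair concrete, here is a minimal sketch of how they might be turned into a pixel offset. The function name `scroll_delta`, the default viewport size, the default amount, and the fraction assigned to "little" are all assumptions for illustration; the reference does not define them.

```python
from typing import Tuple, Union

# Assumed fractions of the viewport; the reference does not define "little".
NAMED_AMOUNTS = {"little": 0.25, "half": 0.5, "full": 1.0}
# Unit vectors: positive x scrolls right, positive y scrolls down.
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def scroll_delta(
    direction: str,
    amount: Union[str, int, float] = "half",
    viewport: Tuple[int, int] = (1280, 720),
) -> Tuple[int, int]:
    """Return a hypothetical (dx, dy) pixel offset for a scroll request."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    ux, uy = DIRECTIONS[direction]
    if isinstance(amount, str):
        if amount not in NAMED_AMOUNTS:
            raise ValueError(f"unknown amount: {amount!r}")
        # Horizontal scrolls scale with viewport width, vertical with height.
        extent = viewport[0] if ux else viewport[1]
        pixels = int(extent * NAMED_AMOUNTS[amount])
    else:
        pixels = int(amount)
    return ux * pixels, uy * pixels
```

Under these assumptions a named amount scales with the viewport, while a numeric amount is used as a pixel count unchanged.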
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Scroll down to see more content",
    actions=[BrowserAction.SCROLL]
)
result = await agent.run()
WAIT
Waits for a specified condition or duration.
Parameters:
- condition (string, optional): Condition to wait for ("element", "network", "load")
- duration (number, optional): Duration to wait in milliseconds
- selector (string, optional): CSS selector to wait for
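A condition-based wait usually amounts to polling until a check passes or a time limit expires. The sketch below illustrates that pattern with a millisecond limit, matching the unit used by `duration`. `wait_for` and its parameters are hypothetical names, not the library's implementation.

```python
import asyncio
import time
from typing import Callable


async def wait_for(
    condition: Callable[[], bool],
    timeout_ms: int = 5000,
    poll_ms: int = 50,
) -> bool:
    """Poll `condition` until it is true or `timeout_ms` elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout_ms / 1000
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        # Yield to the event loop between checks instead of busy-waiting.
        await asyncio.sleep(poll_ms / 1000)
```

Returning a boolean rather than raising on timeout leaves the caller free to retry or fall back; raising an exception is an equally reasonable design.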
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Wait for the page to fully load before proceeding",
    actions=[BrowserAction.WAIT]
)
result = await agent.run()
SCREENSHOT
Takes a screenshot of the current page or a specific element.
Parameters:
- selector (string, optional): CSS selector for the element
- description (string, optional): Natural language description of the element
- filename (string, optional): Filename to save the screenshot
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Take a screenshot of the chart showing monthly sales",
    actions=[BrowserAction.SCREENSHOT]
)
result = await agent.run()
Tab Management Actions
Actions that manage browser tabs.
NEW_TAB
Opens a new browser tab.
Parameters:
- url (string, optional): URL to open in the new tab
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Open a new tab with Google Maps",
    actions=[BrowserAction.NEW_TAB]
)
result = await agent.run()
CLOSE_TAB
Closes the current or specified tab.
Parameters:
- index (number, optional): Index of the tab to close (0-based)
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Close the current tab after completing the form",
    actions=[BrowserAction.CLOSE_TAB]
)
result = await agent.run()
SWITCH_TAB
Switches to another open tab.
Parameters:
- index (number): Index of the tab to switch to (0-based)
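Because tab indices are 0-based, the "second tab" corresponds to index 1. The short sketch below shows a bounds check that a tab lookup might perform; `resolve_tab` is a hypothetical helper, and rejecting negative indices is an assumption (plain Python list indexing would accept them).

```python
from typing import Sequence


def resolve_tab(tabs: Sequence[str], index: int) -> str:
    """Return the tab at a 0-based index, rejecting out-of-range values."""
    # Explicit range check: Python would otherwise treat -1 as "last tab".
    if not 0 <= index < len(tabs):
        raise IndexError(
            f"tab index {index} out of range for {len(tabs)} open tab(s)"
        )
    return tabs[index]
```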
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Switch to the second tab to check search results",
    actions=[BrowserAction.SWITCH_TAB]
)
result = await agent.run()
Memory Actions
Actions related to the agent's memory system.
REMEMBER
Explicitly stores information in the agent's memory.
Parameters:
- key (string): Key to store the information under
- value (any): Information to remember
- source (string, optional): Source of the information
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Remember the current price of Bitcoin for later comparison",
    actions=[BrowserAction.REMEMBER]
)
result = await agent.run()
RECALL
Retrieves information from the agent's memory.
Parameters:
- key (string, optional): Key to retrieve information from
- query (string, optional): Natural language query to search memory
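The REMEMBER and RECALL parameters suggest a keyed store that can also be searched. The sketch below models that idea with a plain dictionary. `SimpleMemory` and `MemoryEntry` are hypothetical, and the keyword match used for `query` is a deliberately naive stand-in for whatever search the real memory system performs.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class MemoryEntry:
    value: Any
    source: Optional[str] = None


class SimpleMemory:
    """Hypothetical in-process model of a remember/recall store."""

    def __init__(self) -> None:
        self._entries: Dict[str, MemoryEntry] = {}

    def remember(self, key: str, value: Any, source: Optional[str] = None) -> None:
        # A later write to the same key replaces the earlier value.
        self._entries[key] = MemoryEntry(value, source)

    def recall(self, key: Optional[str] = None, query: Optional[str] = None) -> List[Any]:
        if key is not None:
            entry = self._entries.get(key)
            return [entry.value] if entry else []
        if query is not None:
            # Naive search: match any query word against the stored keys.
            words = query.lower().split()
            return [
                entry.value
                for name, entry in self._entries.items()
                if any(word in name.lower() for word in words)
            ]
        return [entry.value for entry in self._entries.values()]
```

An exact `key` lookup returns at most one value, while a `query` may return several, which is why both paths return a list.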
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Recall the Bitcoin price we saved earlier",
    actions=[BrowserAction.RECALL]
)
result = await agent.run()
REFLECT
Analyzes and summarizes information in memory to draw conclusions.
Parameters:
- topic (string, optional): Topic to reflect on
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Reflect on the product information we've gathered to make a recommendation",
    actions=[BrowserAction.REFLECT]
)
result = await agent.run()
Authentication Actions
Actions for handling authentication and login scenarios.
LOGIN
Logs in to a website using provided credentials.
Parameters:
- url (string): URL of the login page
- username (string): Username or email
- password (string): Password
- username_selector (string, optional): CSS selector for the username field
- password_selector (string, optional): CSS selector for the password field
- submit_selector (string, optional): CSS selector for the login button
Example:
import os
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Log in to the account using the provided credentials",
    actions=[BrowserAction.LOGIN],
    auth={
        # Read credentials from the environment (variable names are illustrative)
        "username": os.environ["LOGIN_USERNAME"],
        "password": os.environ["LOGIN_PASSWORD"]
    }
)
result = await agent.run()
SECURITY NOTE
Never hardcode credentials in your code. Always use environment variables, secure vaults, or user input for handling sensitive information.
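One way to follow this advice is to load credentials from environment variables and fail early when they are missing. The helper below is a sketch: `load_credentials` and the `LOGIN_USERNAME` / `LOGIN_PASSWORD` variable names are illustrative choices, not names defined by Browser-Use.

```python
import os
from typing import Dict, Mapping


def load_credentials(
    env: Mapping[str, str] = os.environ,
    prefix: str = "LOGIN_",
) -> Dict[str, str]:
    """Build a username/password dict from environment variables."""
    names = {"username": f"{prefix}USERNAME", "password": f"{prefix}PASSWORD"}
    # Treat unset and empty variables the same way: as missing.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variable(s): " + ", ".join(missing))
    return {field: env[var] for field, var in names.items()}
```

The returned dictionary has the same shape as the `auth` argument in the LOGIN example, and the error message names only the missing variables, never their values.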
File Actions
Actions for handling file uploads and downloads.
UPLOAD_FILE
Uploads a file to a form input.
Parameters:
- file_path (string): Path to the file to upload
- selector (string, optional): CSS selector for the file input element
- description (string, optional): Natural language description of the file input
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Upload profile picture to the settings page",
    actions=[BrowserAction.UPLOAD_FILE],
    file_paths={
        "profile_image": "/path/to/profile.jpg"
    }
)
result = await agent.run()
DOWNLOAD
Downloads a file from a link.
Parameters:
- selector (string, optional): CSS selector for the download link
- description (string, optional): Natural language description of the download link
- save_path (string, optional): Path where the file should be saved
Example:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Download the PDF report from the dashboard",
    actions=[BrowserAction.DOWNLOAD],
    download_dir="/downloads"
)
result = await agent.run()
Custom Actions
You can define custom actions to extend Browser-Use's capabilities.
Creating Custom Actions
from browser_use import Agent, CustomAction
# Define a custom action to toggle dark mode on websites
toggle_dark_mode = CustomAction(
    name="TOGGLE_DARK_MODE",
    description="Toggles dark mode on websites that support it",
    # The script is wrapped in an arrow function because a bare `return` is not
    # valid at the top level of a script (assumes a Playwright-style page.evaluate)
    function=lambda page: page.evaluate("""
        () => {
            document.body.classList.toggle('dark-mode');
            return document.body.classList.contains('dark-mode') ? 'Dark mode enabled' : 'Dark mode disabled';
        }
    """)
)
# Use the custom action
agent = Agent(
    task="Enable dark mode on the website",
    actions=[toggle_dark_mode]
)
result = await agent.run()
Action Combinations
Multiple actions can be combined to form complex workflows:
from browser_use import Agent, BrowserAction
agent = Agent(
    task="Search for 'climate change news', open the top 3 results in new tabs, and summarize each article",
    actions=[
        BrowserAction.GO_TO,
        BrowserAction.TYPE,
        BrowserAction.CLICK,
        BrowserAction.NEW_TAB,
        BrowserAction.SWITCH_TAB,
        BrowserAction.GET_TEXT,
        BrowserAction.REMEMBER
    ]
)
result = await agent.run()
Action Configuration
You can configure global settings for actions:
from browser_use import Agent, BrowserAction, ActionConfig
# Configure action settings
action_config = ActionConfig(
    timeout=10000,               # 10-second timeout for actions (milliseconds)
    retry_attempts=3,            # Retry failed actions 3 times
    delay_between_actions=500,   # 500 ms delay between actions
    screenshot_on_failure=True   # Take screenshots when actions fail
)
agent = Agent(
    task="Complete the multi-step checkout process",
    actions=[BrowserAction.ALL],  # Enable all actions
    action_config=action_config
)
result = await agent.run()
Next Steps
- Learn about advanced browser configuration
- Explore memory system integration with actions
- See practical examples of action combinations