Browser Actions Reference

Browser-Use provides a set of browser actions that agents use to interact with web pages. This reference covers each available action, its parameters, and a usage example.

Navigation Actions

Actions that control browser navigation and page loading.

GO_TO

Navigates to a specified URL.

Parameters:

url (string): The URL to navigate to

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Navigate to Wikipedia's homepage",
    actions=[BrowserAction.GO_TO]
)
result = await agent.run()

BACK

Navigates back to the previous page in browser history.

Parameters: None

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Go back to the previous page after viewing an article",
    actions=[BrowserAction.BACK]
)
result = await agent.run()

FORWARD

Navigates forward in browser history.

Parameters: None

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Go forward after going back to the previous page",
    actions=[BrowserAction.FORWARD]
)
result = await agent.run()

REFRESH

Reloads the current page.

Parameters: None

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Refresh the page to check for updates",
    actions=[BrowserAction.REFRESH]
)
result = await agent.run()

Interaction Actions

Actions that interact with elements on a web page.

CLICK

Clicks on an element identified by a selector or natural language description.

Parameters:

selector (string, optional): CSS selector for the element
description (string, optional): Natural language description of the element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Click on the 'Sign In' button",
    actions=[BrowserAction.CLICK]
)
result = await agent.run()
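
Many actions in this reference accept the same pair of optional targeting parameters, `selector` and `description`. The sketch below is one way to model that pair in plain Python. It is not part of Browser-Use: `ElementTarget` and `strategy` are hypothetical names, and two rules are assumptions rather than documented behavior, namely that at least one hint must be supplied and that the CSS selector wins when both are given.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementTarget:
    """Hypothetical container for the two targeting parameters."""

    selector: Optional[str] = None
    description: Optional[str] = None

    def __post_init__(self) -> None:
        # Assumption: an action aimed at a single element needs at least one hint.
        if not self.selector and not self.description:
            raise ValueError("provide a selector, a description, or both")

    def strategy(self) -> str:
        # Assumption: a CSS selector is unambiguous, so it is preferred when both are set.
        return "selector" if self.selector else "description"


by_css = ElementTarget(selector="button#sign-in")
by_text = ElementTarget(description="the 'Sign In' button")
print(by_css.strategy(), by_text.strategy())
```

A selector is the more deterministic choice when the page markup is known; a description is useful when the markup is unknown or changes often.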

TYPE

Types text into an input field.

Parameters:

text (string): Text to type
selector (string, optional): CSS selector for the input element
description (string, optional): Natural language description of the input element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Type 'artificial intelligence' into the search box",
    actions=[BrowserAction.TYPE]
)
result = await agent.run()

SELECT

Selects an option from a dropdown menu.

Parameters:

option (string): Option text or value to select
selector (string, optional): CSS selector for the select element
description (string, optional): Natural language description of the select element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Select 'English' from the language dropdown",
    actions=[BrowserAction.SELECT]
)
result = await agent.run()

HOVER

Hovers over an element.

Parameters:

selector (string, optional): CSS selector for the element
description (string, optional): Natural language description of the element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Hover over the user menu to reveal dropdown options",
    actions=[BrowserAction.HOVER]
)
result = await agent.run()

DRAG_AND_DROP

Drags an element and drops it onto another element.

Parameters:

source_selector (string, optional): CSS selector for the source element
target_selector (string, optional): CSS selector for the target element
source_description (string, optional): Natural language description of the source element
target_description (string, optional): Natural language description of the target element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Drag the first item to the shopping cart",
    actions=[BrowserAction.DRAG_AND_DROP]
)
result = await agent.run()

Extraction Actions

Actions that extract information from web pages.

GET_TEXT

Extracts text from an element or the entire page.

Parameters:

selector (string, optional): CSS selector for the element
description (string, optional): Natural language description of the element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Get the text of the main article headline",
    actions=[BrowserAction.GET_TEXT]
)
result = await agent.run()

GET_ATTRIBUTE

Retrieves an attribute value from an element.

Parameters:

attribute (string): The attribute name (e.g., "href", "src")
selector (string, optional): CSS selector for the element
description (string, optional): Natural language description of the element

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Get the URL from the first image in the gallery",
    actions=[BrowserAction.GET_ATTRIBUTE]
)
result = await agent.run()

EXTRACT_DATA

Extracts structured data from a page based on a specified pattern.

Parameters:

pattern (object): Object describing the data structure to extract
scope (string, optional): CSS selector to limit extraction scope

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Extract all product names and prices from the search results",
    actions=[BrowserAction.EXTRACT_DATA]
)
result = await agent.run()

Page Control Actions

Actions that control the page view and state.

SCROLL

Scrolls the page in a specified direction.

Parameters:

direction (string): Direction to scroll ("up", "down", "left", "right")
amount (string or number, optional): Amount to scroll ("little", "half", "full", or pixel value)

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Scroll down to see more content",
    actions=[BrowserAction.SCROLL]
)
result = await agent.run()
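
The `amount` parameter mixes named sizes with pixel values, so it can help to see how such a value might be turned into a pixel offset. The sketch below is an illustration only. The fractions assigned to "little", "half", and "full", the default viewport size, the default amount, and the function name `scroll_delta` are all assumptions; the reference does not define them.

```python
from typing import Tuple, Union

# Assumed fractions of the viewport; the reference does not define these.
NAMED_AMOUNTS = {"little": 0.25, "half": 0.5, "full": 1.0}
# Unit vectors (x, y) in page coordinates, where y grows downward.
DIRECTIONS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}


def scroll_delta(
    direction: str,
    amount: Union[str, int, float] = "half",  # assumed default
    viewport: Tuple[int, int] = (1280, 800),  # assumed (width, height)
) -> Tuple[int, int]:
    """Return the (dx, dy) pixel offset for a scroll request."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction!r}")
    dx, dy = DIRECTIONS[direction]
    # Horizontal scrolls scale with viewport width, vertical ones with height.
    extent = viewport[0] if dx else viewport[1]
    if isinstance(amount, str):
        if amount not in NAMED_AMOUNTS:
            raise ValueError(f"unknown amount: {amount!r}")
        pixels = round(NAMED_AMOUNTS[amount] * extent)
    else:
        # Numeric amounts are treated as a pixel count.
        pixels = round(amount)
    return dx * pixels, dy * pixels


print(scroll_delta("down", "half"))
print(scroll_delta("left", 120))
```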

WAIT

Waits for a specified condition or duration.

Parameters:

condition (string, optional): Condition to wait for ("element", "network", "load")
duration (number, optional): Duration to wait in milliseconds
selector (string, optional): CSS selector to wait for

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Wait for the page to fully load before proceeding",
    actions=[BrowserAction.WAIT]
)
result = await agent.run()
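
Waiting for a condition, as opposed to a fixed duration, usually comes down to polling until a check passes or a deadline expires. The sketch below shows that pattern with the standard library only. It does not reproduce how Browser-Use implements WAIT; `wait_until`, `timeout_ms`, and `poll_ms` are hypothetical names, with durations in milliseconds to match the `duration` parameter above.

```python
import asyncio
import time
from typing import Callable


async def wait_until(
    predicate: Callable[[], bool],
    timeout_ms: int = 5000,
    poll_ms: int = 50,
) -> bool:
    """Poll `predicate` until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_ms / 1000
    while time.monotonic() < deadline:
        if predicate():
            return True
        # Yield to the event loop between checks instead of busy-waiting.
        await asyncio.sleep(poll_ms / 1000)
    # One final check so a condition met right at the deadline still counts.
    return predicate()


async def main() -> None:
    started = time.monotonic()
    # Stand-in condition: becomes true roughly 100 ms after start.
    ready = await wait_until(lambda: time.monotonic() - started > 0.1, timeout_ms=1000)
    print("condition met:", ready)


if __name__ == "__main__":
    asyncio.run(main())
```

A fixed `duration` wait corresponds to a plain `asyncio.sleep(duration / 1000)`; polling is preferable when the page may become ready sooner.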

SCREENSHOT

Takes a screenshot of the current page or a specific element.

Parameters:

selector (string, optional): CSS selector for the element
description (string, optional): Natural language description of the element
filename (string, optional): Filename to save the screenshot

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Take a screenshot of the chart showing monthly sales",
    actions=[BrowserAction.SCREENSHOT]
)
result = await agent.run()

Tab Management Actions

Actions that manage browser tabs.

NEW_TAB

Opens a new browser tab.

Parameters:

url (string, optional): URL to open in the new tab

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Open a new tab with Google Maps",
    actions=[BrowserAction.NEW_TAB]
)
result = await agent.run()

CLOSE_TAB

Closes the current or specified tab.

Parameters:

index (number, optional): Index of the tab to close (0-based)

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Close the current tab after completing the form",
    actions=[BrowserAction.CLOSE_TAB]
)
result = await agent.run()

SWITCH_TAB

Switches to another open tab.

Parameters:

index (number): Index of the tab to switch to (0-based)

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Switch to the second tab to check search results",
    actions=[BrowserAction.SWITCH_TAB]
)
result = await agent.run()
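
Both CLOSE_TAB and SWITCH_TAB take a 0-based `index`, which shifts when an earlier tab is closed. The sketch below models that bookkeeping in plain Python. `TabList` is a hypothetical class, not a Browser-Use type, and it assumes that a new tab is appended at the end and becomes the active tab.

```python
from typing import List, Optional


class TabList:
    """Hypothetical model of open tabs; stores only each tab's URL."""

    def __init__(self, first_url: str = "about:blank") -> None:
        self.urls: List[str] = [first_url]
        self.current = 0  # 0-based index of the active tab

    def _check(self, index: int) -> None:
        if not 0 <= index < len(self.urls):
            raise IndexError(f"no tab at index {index}")

    def new_tab(self, url: str = "about:blank") -> int:
        # Assumption: the new tab is appended and becomes the active tab.
        self.urls.append(url)
        self.current = len(self.urls) - 1
        return self.current

    def switch_tab(self, index: int) -> None:
        self._check(index)
        self.current = index

    def close_tab(self, index: Optional[int] = None) -> None:
        # With no index, close the current tab.
        target = self.current if index is None else index
        self._check(target)
        del self.urls[target]
        # Tabs after the closed one shift down by one position.
        if target < self.current:
            self.current -= 1
        self.current = max(0, min(self.current, len(self.urls) - 1))


tabs = TabList("https://example.com")
tabs.new_tab("https://example.org")
tabs.switch_tab(0)
tabs.close_tab(1)
print(tabs.urls, tabs.current)
```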

Memory Actions

Actions related to the agent's memory system.

REMEMBER

Explicitly stores information in the agent's memory.

Parameters:

key (string): Key to store the information under
value (any): Information to remember
source (string, optional): Source of the information

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Remember the current price of Bitcoin for later comparison",
    actions=[BrowserAction.REMEMBER]
)
result = await agent.run()

RECALL

Retrieves information from the agent's memory.

Parameters:

key (string, optional): Key to retrieve information from
query (string, optional): Natural language query to search memory

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Recall the Bitcoin price we saved earlier",
    actions=[BrowserAction.RECALL]
)
result = await agent.run()
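
REMEMBER stores a value under a key with an optional source, and RECALL reads it back by key or by query. The sketch below models that contract with a dictionary. It is a toy stand-in, not Browser-Use's memory system: `AgentMemory` and `MemoryEntry` are hypothetical names, and the natural language `query` is approximated by simple word overlap with the stored keys.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class MemoryEntry:
    value: Any
    source: Optional[str] = None


class AgentMemory:
    """Hypothetical key-value memory mirroring the REMEMBER/RECALL parameters."""

    def __init__(self) -> None:
        self._entries: Dict[str, MemoryEntry] = {}

    def remember(self, key: str, value: Any, source: Optional[str] = None) -> None:
        self._entries[key] = MemoryEntry(value, source)

    def recall(self, key: Optional[str] = None, query: Optional[str] = None) -> List[Any]:
        if key is not None:
            entry = self._entries.get(key)
            return [entry.value] if entry is not None else []
        if query is not None:
            # Crude stand-in for natural language search: match query words
            # against the words that make up each key.
            words = set(query.lower().split())
            return [
                entry.value
                for name, entry in self._entries.items()
                if words & set(name.lower().replace("_", " ").split())
            ]
        # Neither key nor query: return everything (an assumption).
        return [entry.value for entry in self._entries.values()]


memory = AgentMemory()
memory.remember("bitcoin_price", 100.0, source="placeholder value for illustration")
print(memory.recall(key="bitcoin_price"))
print(memory.recall(query="the bitcoin quote we saved"))
```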

REFLECT

Analyzes and summarizes information in memory to draw conclusions.

Parameters:

topic (string, optional): Topic to reflect on

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Reflect on the product information we've gathered to make a recommendation",
    actions=[BrowserAction.REFLECT]
)
result = await agent.run()

Authentication Actions

Actions for handling authentication and login scenarios.

LOGIN

Logs in to a website using provided credentials.

Parameters:

url (string): URL of the login page
username (string): Username or email
password (string): Password
username_selector (string, optional): CSS selector for the username field
password_selector (string, optional): CSS selector for the password field
submit_selector (string, optional): CSS selector for the login button

Example:

python

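import os

from browser_use import Agent, BrowserAction

# Read credentials from the environment instead of hardcoding them.
# SITE_USERNAME and SITE_PASSWORD are placeholder variable names.
agent = Agent(
    task="Log in to the account using the provided credentials",
    actions=[BrowserAction.LOGIN],
    auth={
        "username": os.environ["SITE_USERNAME"],
        "password": os.environ["SITE_PASSWORD"]
    }
)
result = await agent.run()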

SECURITY NOTE

Never hardcode credentials in your code. Always use environment variables, secure vaults, or user input for handling sensitive information.
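
One way to follow the note above is to read credentials from environment variables and fail early when they are missing. The sketch below uses only the standard library. The helper name `load_credentials` and the variable names `SITE_USERNAME` and `SITE_PASSWORD` are hypothetical; the returned dictionary has the same `username`/`password` keys as the `auth` argument in the LOGIN example.

```python
import os
from typing import Dict, Mapping


def load_credentials(env: Mapping[str, str] = os.environ, prefix: str = "SITE") -> Dict[str, str]:
    """Build an auth dictionary from <prefix>_USERNAME and <prefix>_PASSWORD."""
    names = {"username": f"{prefix}_USERNAME", "password": f"{prefix}_PASSWORD"}
    # Treat unset and empty variables the same way: both are unusable.
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {field: env[var] for field, var in names.items()}


# Demonstration with an explicit mapping; real code would call load_credentials()
# with no arguments so the values come from the process environment.
demo_env = {"SITE_USERNAME": "demo-user", "SITE_PASSWORD": "demo-placeholder"}
auth = load_credentials(demo_env)
print(sorted(auth))
```

Raising at startup keeps a missing secret from surfacing later as a confusing login failure, and the error message names the variables without ever printing their values.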

File Actions

Actions for handling file uploads and downloads.

UPLOAD_FILE

Uploads a file to a form input.

Parameters:

file_path (string): Path to the file to upload
selector (string, optional): CSS selector for the file input element
description (string, optional): Natural language description of the file input

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Upload profile picture to the settings page",
    actions=[BrowserAction.UPLOAD_FILE],
    file_paths={
        "profile_image": "/path/to/profile.jpg"
    }
)
result = await agent.run()

DOWNLOAD

Downloads a file from a link.

Parameters:

selector (string, optional): CSS selector for the download link
description (string, optional): Natural language description of the download link
save_path (string, optional): Path where the file should be saved

Example:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Download the PDF report from the dashboard",
    actions=[BrowserAction.DOWNLOAD],
    download_dir="/downloads"
)
result = await agent.run()

Custom Actions

You can define custom actions to extend Browser-Use's capabilities.

Creating Custom Actions

python

from browser_use import Agent, CustomAction

# Define a custom action to toggle dark mode on websites
toggle_dark_mode = CustomAction(
    name="TOGGLE_DARK_MODE",
    description="Toggles dark mode on websites that support it",
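    # Assumes `page` is a Playwright-style page object. The script is wrapped in
    # an arrow function because a bare top-level `return` is not valid JavaScript.
    function=lambda page: page.evaluate("""() => {
        document.body.classList.toggle('dark-mode');
        return document.body.classList.contains('dark-mode') ? 'Dark mode enabled' : 'Dark mode disabled';
    }""")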
)

# Use the custom action
agent = Agent(
    task="Enable dark mode on the website",
    actions=[toggle_dark_mode]
)
result = await agent.run()
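
A custom action bundles a name, a description, and a function. The sketch below shows how such a triple could be registered and then dispatched by name. It is an illustration, not Browser-Use's implementation: `ActionSpec` and `ActionRegistry` are hypothetical names, and a plain set of CSS class names stands in for the page so the example runs without a browser.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Set


@dataclass(frozen=True)
class ActionSpec:
    name: str
    description: str
    function: Callable[..., Any]


class ActionRegistry:
    """Hypothetical lookup table from action name to its implementation."""

    def __init__(self) -> None:
        self._actions: Dict[str, ActionSpec] = {}

    def register(self, spec: ActionSpec) -> None:
        if spec.name in self._actions:
            raise ValueError(f"action already registered: {spec.name}")
        self._actions[spec.name] = spec

    def describe(self) -> str:
        # The kind of summary an agent could be shown when choosing an action.
        return "\n".join(f"{s.name}: {s.description}" for s in self._actions.values())

    def dispatch(self, name: str, *args: Any, **kwargs: Any) -> Any:
        if name not in self._actions:
            raise KeyError(f"unknown action: {name}")
        return self._actions[name].function(*args, **kwargs)


def toggle_dark_mode(body_classes: Set[str]) -> str:
    # Stand-in for the page: a set of class names on <body>.
    body_classes.symmetric_difference_update({"dark-mode"})
    return "Dark mode enabled" if "dark-mode" in body_classes else "Dark mode disabled"


registry = ActionRegistry()
registry.register(ActionSpec("TOGGLE_DARK_MODE", "Toggles dark mode on websites that support it", toggle_dark_mode))
classes: Set[str] = set()
print(registry.describe())
print(registry.dispatch("TOGGLE_DARK_MODE", classes))
```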

Action Combinations

Multiple actions can be combined to form complex workflows:

python

from browser_use import Agent, BrowserAction

agent = Agent(
    task="Search for 'climate change news', open the top 3 results in new tabs, and summarize each article",
    actions=[
        BrowserAction.GO_TO,
        BrowserAction.TYPE,
        BrowserAction.CLICK,
        BrowserAction.NEW_TAB,
        BrowserAction.SWITCH_TAB,
        BrowserAction.GET_TEXT,
        BrowserAction.REMEMBER
    ]
)
result = await agent.run()

Action Configuration

You can configure global settings for actions:

python

from browser_use import Agent, BrowserAction, ActionConfig

# Configure action settings
action_config = ActionConfig(
    timeout=10000,  # 10-second timeout for actions (value in milliseconds)
    retry_attempts=3,  # Retry failed actions 3 times
    delay_between_actions=500,  # 500ms delay between actions
    screenshot_on_failure=True  # Take screenshots when actions fail
)

agent = Agent(
    task="Complete the multi-step checkout process",
    actions=[BrowserAction.ALL],  # Enable all actions
    action_config=action_config
)
result = await agent.run()
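
The settings above describe a timeout, a retry count, and a delay. The sketch below shows one plausible way those three values interact around a single action, using only `asyncio`. It is not Browser-Use's implementation. `run_with_retries` is a hypothetical helper, "retry 3 times" is read as up to three retries after the first attempt, and the pause before each retry reuses the delay value, which the configuration describes only as a delay between actions.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def run_with_retries(
    action: Callable[[], Awaitable[T]],
    timeout_ms: int = 10000,
    retry_attempts: int = 3,
    delay_ms: int = 500,
) -> T:
    """Run `action` with a per-attempt timeout, retrying on failure."""
    last_error: Optional[Exception] = None
    # One initial attempt plus `retry_attempts` retries (an assumption).
    for attempt in range(retry_attempts + 1):
        try:
            return await asyncio.wait_for(action(), timeout=timeout_ms / 1000)
        except Exception as exc:  # includes the timeout raised by wait_for
            last_error = exc
            if attempt < retry_attempts:
                await asyncio.sleep(delay_ms / 1000)
    raise RuntimeError(f"action failed after {retry_attempts + 1} attempts") from last_error


async def main() -> None:
    calls = {"n": 0}

    async def flaky_click() -> str:
        # Stand-in action that fails twice before succeeding.
        calls["n"] += 1
        if calls["n"] < 3:
            raise ValueError("element not ready")
        return "clicked"

    print(await run_with_retries(flaky_click, delay_ms=10), "after", calls["n"], "attempts")


if __name__ == "__main__":
    asyncio.run(main())
```

A `screenshot_on_failure` hook would fit in the `except` branch, where the failing state is still on screen.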

Next Steps

Learn about advanced browser configuration
Explore memory system integration with actions
See practical examples of action combinations
