Browser Actions Reference

Browser-Use provides a comprehensive set of browser actions that agents can use to interact with web pages. This reference guide covers all available actions, their parameters, and examples of how to use them.

Navigation Actions

Actions that control browser navigation and page loading.

GO_TO

Navigates to a specified URL.

Parameters:

  • url (string): The URL to navigate to

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Navigate to Wikipedia's homepage",
    actions=[BrowserAction.GO_TO]
)
result = await agent.run()
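The examples in this reference call `await agent.run()` at the top level, which works in a notebook or an async REPL but not in a plain script. The following is a minimal sketch of wrapping the call in a coroutine and driving it with `asyncio.run`. `StubAgent` is a hypothetical stand-in used only so the snippet runs without a browser; it is not the library's `Agent`.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for an agent; it only echoes its task."""

    def __init__(self, task: str) -> None:
        self.task = task

    async def run(self) -> str:
        # A real agent would drive the browser here.
        await asyncio.sleep(0)
        return f"done: {self.task}"


async def main() -> str:
    agent = StubAgent(task="Navigate to Wikipedia's homepage")
    return await agent.run()


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The same pattern should apply to the other examples: place the `Agent(...)` construction and the `await` inside `main()`.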

BACK

Navigates back to the previous page in browser history.

Parameters: None

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Go back to the previous page after viewing an article",
    actions=[BrowserAction.BACK]
)
result = await agent.run()

FORWARD

Navigates forward in browser history.

Parameters: None

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Go forward after going back to the previous page",
    actions=[BrowserAction.FORWARD]
)
result = await agent.run()

REFRESH

Reloads the current page.

Parameters: None

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Refresh the page to check for updates",
    actions=[BrowserAction.REFRESH]
)
result = await agent.run()

Interaction Actions

Actions that interact with elements on a web page.

CLICK

Clicks on an element identified by a selector or natural language description.

Parameters:

  • selector (string, optional): CSS selector for the element
  • description (string, optional): Natural language description of the element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Click on the 'Sign In' button",
    actions=[BrowserAction.CLICK]
)
result = await agent.run()
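Many actions accept either a `selector` or a `description`, and both are marked optional. The sketch below shows one plausible way to resolve the two. It assumes that a CSS selector takes precedence when both are given and that at least one must be present. Neither rule is confirmed by this reference, and `resolve_target` is a hypothetical helper rather than part of the library.

```python
from typing import Optional, Tuple


def resolve_target(
    selector: Optional[str] = None,
    description: Optional[str] = None,
) -> Tuple[str, str]:
    """Pick how to locate an element (assumed precedence: selector first)."""
    if selector:
        return ("selector", selector)
    if description:
        return ("description", description)
    raise ValueError("provide a selector or a description")


print(resolve_target(selector="button#sign-in"))
print(resolve_target(description="the 'Sign In' button"))
```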

TYPE

Types text into an input field.

Parameters:

  • text (string): Text to type
  • selector (string, optional): CSS selector for the input element
  • description (string, optional): Natural language description of the input element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Type 'artificial intelligence' into the search box",
    actions=[BrowserAction.TYPE]
)
result = await agent.run()

SELECT

Selects an option from a dropdown menu.

Parameters:

  • option (string): Option text or value to select
  • selector (string, optional): CSS selector for the select element
  • description (string, optional): Natural language description of the select element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Select 'English' from the language dropdown",
    actions=[BrowserAction.SELECT]
)
result = await agent.run()

HOVER

Hovers over an element.

Parameters:

  • selector (string, optional): CSS selector for the element
  • description (string, optional): Natural language description of the element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Hover over the user menu to reveal dropdown options",
    actions=[BrowserAction.HOVER]
)
result = await agent.run()

DRAG_AND_DROP

Drags an element and drops it onto another element.

Parameters:

  • source_selector (string, optional): CSS selector for the source element
  • target_selector (string, optional): CSS selector for the target element
  • source_description (string, optional): Natural language description of the source element
  • target_description (string, optional): Natural language description of the target element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Drag the first item to the shopping cart",
    actions=[BrowserAction.DRAG_AND_DROP]
)
result = await agent.run()

Extraction Actions

Actions that extract information from web pages.

GET_TEXT

Extracts text from an element or the entire page.

Parameters:

  • selector (string, optional): CSS selector for the element
  • description (string, optional): Natural language description of the element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Get the text of the main article headline",
    actions=[BrowserAction.GET_TEXT]
)
result = await agent.run()

GET_ATTRIBUTE

Retrieves an attribute value from an element.

Parameters:

  • attribute (string): The attribute name (e.g., "href", "src")
  • selector (string, optional): CSS selector for the element
  • description (string, optional): Natural language description of the element

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Get the URL from the first image in the gallery",
    actions=[BrowserAction.GET_ATTRIBUTE]
)
result = await agent.run()

EXTRACT_DATA

Extracts structured data from a page based on a specified pattern.

Parameters:

  • pattern (object): Object describing the data structure to extract
  • scope (string, optional): CSS selector to limit extraction scope

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Extract all product names and prices from the search results",
    actions=[BrowserAction.EXTRACT_DATA]
)
result = await agent.run()

Page Control Actions

Actions that control the page view and state.

SCROLL

Scrolls the page in a specified direction.

Parameters:

  • direction (string): Direction to scroll ("up", "down", "left", "right")
  • amount (string or number, optional): Amount to scroll ("little", "half", "full", or pixel value)

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Scroll down to see more content",
    actions=[BrowserAction.SCROLL]
)
result = await agent.run()
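To make the `direction` and `amount` parameters concrete, here is a rough sketch of how they could translate into a pixel offset. The fractions assigned to "little", "half", and "full", the default viewport size, and the `scroll_delta` helper itself are assumptions for illustration; the library may interpret these values differently.

```python
from typing import Tuple, Union

# Assumed fractions of the viewport for the named amounts.
NAMED_AMOUNTS = {"little": 0.1, "half": 0.5, "full": 1.0}


def scroll_delta(
    direction: str,
    amount: Union[str, int] = "half",
    viewport: Tuple[int, int] = (1280, 720),
) -> Tuple[int, int]:
    """Return an (x, y) pixel offset for a scroll request."""
    if direction not in ("up", "down", "left", "right"):
        raise ValueError(f"unknown direction: {direction}")
    width, height = viewport
    horizontal = direction in ("left", "right")
    extent = width if horizontal else height
    if isinstance(amount, str):
        pixels = int(extent * NAMED_AMOUNTS[amount])
    else:
        pixels = int(amount)
    # Up and left move toward the origin, so the offset is negative.
    sign = -1 if direction in ("up", "left") else 1
    return (sign * pixels, 0) if horizontal else (0, sign * pixels)


print(scroll_delta("down", "half"))
print(scroll_delta("left", 200))
```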

WAIT

Waits for a specified condition or duration.

Parameters:

  • condition (string, optional): Condition to wait for ("element", "network", "load")
  • duration (number, optional): Duration to wait in milliseconds
  • selector (string, optional): CSS selector to wait for

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Wait for the page to fully load before proceeding",
    actions=[BrowserAction.WAIT]
)
result = await agent.run()
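Waiting on a condition rather than a fixed duration usually comes down to polling until a deadline. The sketch below illustrates that idea with plain `asyncio`. `wait_for` and its `poll_ms` parameter are hypothetical names, and the library's own waiting logic may work differently.

```python
import asyncio
import time
from typing import Callable


async def wait_for(
    condition: Callable[[], bool],
    timeout_ms: int = 5000,
    poll_ms: int = 50,
) -> bool:
    """Poll `condition` until it is true or the timeout elapses."""
    deadline = time.monotonic() + timeout_ms / 1000
    while time.monotonic() < deadline:
        if condition():
            return True
        await asyncio.sleep(poll_ms / 1000)
    # One last check so a condition met right at the deadline still counts.
    return condition()
```

A caller would pass a zero-argument function, for example one that checks whether an element matching `selector` is present.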

SCREENSHOT

Takes a screenshot of the current page or a specific element.

Parameters:

  • selector (string, optional): CSS selector for the element
  • description (string, optional): Natural language description of the element
  • filename (string, optional): Filename to save the screenshot

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Take a screenshot of the chart showing monthly sales",
    actions=[BrowserAction.SCREENSHOT]
)
result = await agent.run()

Tab Management Actions

Actions that manage browser tabs.

NEW_TAB

Opens a new browser tab.

Parameters:

  • url (string, optional): URL to open in the new tab

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Open a new tab with Google Maps",
    actions=[BrowserAction.NEW_TAB]
)
result = await agent.run()

CLOSE_TAB

Closes the current or specified tab.

Parameters:

  • index (number, optional): Index of the tab to close (0-based)

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Close the current tab after completing the form",
    actions=[BrowserAction.CLOSE_TAB]
)
result = await agent.run()

SWITCH_TAB

Switches to another open tab.

Parameters:

  • index (number): Index of the tab to switch to (0-based)

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Switch to the second tab to check search results",
    actions=[BrowserAction.SWITCH_TAB]
)
result = await agent.run()
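Because tab indices are 0-based, "the second tab" corresponds to `index=1`. The toy model below makes the indexing explicit. It assumes that a newly opened tab becomes the current one and that indices shift down after a tab closes. Both behaviours are assumptions, and `TabList` is a hypothetical class rather than a library type.

```python
from typing import List, Optional


class TabList:
    """Toy model of open tabs addressed by 0-based index."""

    def __init__(self) -> None:
        self.urls: List[str] = []
        self.current: int = -1  # -1 means no tab is open

    def new_tab(self, url: str = "about:blank") -> int:
        self.urls.append(url)
        self.current = len(self.urls) - 1
        return self.current

    def switch_tab(self, index: int) -> None:
        if not 0 <= index < len(self.urls):
            raise IndexError(f"no tab at index {index}")
        self.current = index

    def close_tab(self, index: Optional[int] = None) -> None:
        # With no index, close the current tab.
        target = self.current if index is None else index
        if not 0 <= target < len(self.urls):
            raise IndexError(f"no tab at index {target}")
        del self.urls[target]
        if target < self.current:
            self.current -= 1
        elif target == self.current:
            self.current = min(target, len(self.urls) - 1)


tabs = TabList()
tabs.new_tab("https://example.com/search")
tabs.new_tab("https://example.com/results")
tabs.switch_tab(1)  # the "second tab" is index 1
```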

Memory Actions

Actions related to the agent's memory system.

REMEMBER

Explicitly stores information in the agent's memory.

Parameters:

  • key (string): Key to store the information under
  • value (any): Information to remember
  • source (string, optional): Source of the information

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Remember the current price of Bitcoin for later comparison",
    actions=[BrowserAction.REMEMBER]
)
result = await agent.run()

RECALL

Retrieves information from the agent's memory.

Parameters:

  • key (string, optional): Key to retrieve information from
  • query (string, optional): Natural language query to search memory

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Recall the Bitcoin price we saved earlier",
    actions=[BrowserAction.RECALL]
)
result = await agent.run()
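Conceptually, REMEMBER and RECALL behave like writes to and reads from a keyed store, with `query` as a looser lookup. The sketch below models that with a dictionary and a naive keyword match on keys. It is an illustration under those assumptions only: `Memory` is a hypothetical class, and the real memory system may well use a different search strategy.

```python
from typing import Any, Dict, List, Optional


class Memory:
    """Toy key-value memory; not the library's implementation."""

    def __init__(self) -> None:
        self._items: Dict[str, Dict[str, Any]] = {}

    def remember(self, key: str, value: Any, source: Optional[str] = None) -> None:
        self._items[key] = {"value": value, "source": source}

    def recall(
        self, key: Optional[str] = None, query: Optional[str] = None
    ) -> List[Any]:
        if key is not None:
            # Exact lookup by key.
            item = self._items.get(key)
            return [item["value"]] if item is not None else []
        if query is not None:
            # Naive search: any query word that appears in a key.
            words = query.lower().split()
            return [
                item["value"]
                for name, item in self._items.items()
                if any(word in name.lower() for word in words)
            ]
        return [item["value"] for item in self._items.values()]


memory = Memory()
memory.remember("bitcoin_price", 123.45, source="placeholder value")
print(memory.recall(key="bitcoin_price"))
```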

REFLECT

Analyzes and summarizes information in memory to draw conclusions.

Parameters:

  • topic (string, optional): Topic to reflect on

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Reflect on the product information we've gathered to make a recommendation",
    actions=[BrowserAction.REFLECT]
)
result = await agent.run()

Authentication Actions

Actions for handling authentication and login scenarios.

LOGIN

Logs in to a website using provided credentials.

Parameters:

  • url (string): URL of the login page
  • username (string): Username or email
  • password (string): Password
  • username_selector (string, optional): CSS selector for the username field
  • password_selector (string, optional): CSS selector for the password field
  • submit_selector (string, optional): CSS selector for the login button

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Log in to the account using the provided credentials",
    actions=[BrowserAction.LOGIN],
    auth={
        "username": "user@example.com",
        "password": "secure_password_123"
    }
)
result = await agent.run()

SECURITY NOTE

Never hardcode credentials in your code. Always use environment variables, secure vaults, or user input for handling sensitive information.
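One way to follow this advice is to read the credentials from environment variables at startup and fail early if they are missing. The sketch below uses only the standard library. The variable names `SITE_USERNAME` and `SITE_PASSWORD` and the `load_credentials` helper are placeholders chosen for illustration.

```python
import os
from typing import Dict


def load_credentials(prefix: str = "SITE") -> Dict[str, str]:
    """Read username and password from <prefix>_USERNAME / <prefix>_PASSWORD."""
    try:
        return {
            "username": os.environ[f"{prefix}_USERNAME"],
            "password": os.environ[f"{prefix}_PASSWORD"],
        }
    except KeyError as missing:
        # Fail early with a clear message instead of sending empty credentials.
        raise RuntimeError(f"missing environment variable {missing}") from None
```

The returned dictionary has the same shape as the `auth` argument in the LOGIN example, so it could be passed as `auth=load_credentials()`.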

File Actions

Actions for handling file uploads and downloads.

UPLOAD_FILE

Uploads a file to a form input.

Parameters:

  • file_path (string): Path to the file to upload
  • selector (string, optional): CSS selector for the file input element
  • description (string, optional): Natural language description of the file input

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Upload profile picture to the settings page",
    actions=[BrowserAction.UPLOAD_FILE],
    file_paths={
        "profile_image": "/path/to/profile.jpg"
    }
)
result = await agent.run()

DOWNLOAD

Downloads a file from a link.

Parameters:

  • selector (string, optional): CSS selector for the download link
  • description (string, optional): Natural language description of the download link
  • save_path (string, optional): Path where the file should be saved

Example:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Download the PDF report from the dashboard",
    actions=[BrowserAction.DOWNLOAD],
    download_dir="/downloads"
)
result = await agent.run()

Custom Actions

You can define custom actions to extend Browser-Use's capabilities.

Creating Custom Actions

python
from browser_use import Agent, CustomAction

# Define a custom action to toggle dark mode on websites
toggle_dark_mode = CustomAction(
    name="TOGGLE_DARK_MODE",
    description="Toggles dark mode on websites that support it",
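    # Wrapped in an arrow function so that `return` is valid JavaScript,
    # assuming a Playwright-style page.evaluate().
    function=lambda page: page.evaluate("""() => {
        document.body.classList.toggle('dark-mode');
        return document.body.classList.contains('dark-mode') ? 'Dark mode enabled' : 'Dark mode disabled';
    }""")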
)

# Use the custom action
agent = Agent(
    task="Enable dark mode on the website",
    actions=[toggle_dark_mode]
)
result = await agent.run()

Action Combinations

Multiple actions can be combined to form complex workflows:

python
from browser_use import Agent, BrowserAction

agent = Agent(
    task="Search for 'climate change news', open the top 3 results in new tabs, and summarize each article",
    actions=[
        BrowserAction.GO_TO,
        BrowserAction.TYPE,
        BrowserAction.CLICK,
        BrowserAction.NEW_TAB,
        BrowserAction.SWITCH_TAB,
        BrowserAction.GET_TEXT,
        BrowserAction.REMEMBER
    ]
)
result = await agent.run()

Action Configuration

You can configure global settings for actions:

python
from browser_use import Agent, BrowserAction, ActionConfig

# Configure action settings
action_config = ActionConfig(
    timeout=10000,  # 10 seconds timeout for actions
    retry_attempts=3,  # Retry failed actions 3 times
    delay_between_actions=500,  # 500ms delay between actions
    screenshot_on_failure=True  # Take screenshots when actions fail
)

agent = Agent(
    task="Complete the multi-step checkout process",
    actions=[BrowserAction.ALL],  # Enable all actions
    action_config=action_config
)
result = await agent.run()
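To illustrate what settings such as `timeout`, `retry_attempts`, and `delay_between_actions` typically control, here is a sketch of a retry loop built on `asyncio.wait_for`. It assumes that `retry_attempts` counts retries after the first try and that a timeout is treated as a failure. `run_with_retries` is a hypothetical helper, not the library's internals.

```python
import asyncio
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


async def run_with_retries(
    action: Callable[[], Awaitable[T]],
    timeout_ms: int = 10000,
    retry_attempts: int = 3,
    delay_ms: int = 500,
) -> T:
    """Run `action`, retrying on failure or timeout."""
    if retry_attempts < 0:
        raise ValueError("retry_attempts must be >= 0")
    last_error: Optional[Exception] = None
    for attempt in range(1 + retry_attempts):
        try:
            # Each attempt gets its own timeout budget.
            return await asyncio.wait_for(action(), timeout=timeout_ms / 1000)
        except Exception as error:  # includes asyncio.TimeoutError
            last_error = error
            if attempt < retry_attempts:
                await asyncio.sleep(delay_ms / 1000)
    assert last_error is not None
    raise last_error
```

With the values from the configuration above, an action would be tried up to four times in total under this reading, with a half-second pause between tries.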

Next Steps