[BUG] Fix: SeleniumScrapingTool initializes Chrome WebDriver at import time causing unwanted browser windows and memory leaks #231

Open
niketan-byte opened this issue Mar 2, 2025 · 0 comments · May be fixed by #233

Description

With crewai_tools version 0.36.0 (and possibly other versions above 0.32.0), the SeleniumScrapingTool class creates a Chrome WebDriver instance as soon as it is instantiated, even if the tool is never used. When the tool is instantiated at module level, a Chrome browser window therefore opens during application startup, as soon as the module containing it is imported.

Additionally, these browser instances are not properly closed when the application reloads (especially in development mode with hot-reloading), leading to multiple Chrome processes accumulating in memory and causing severe memory issues over time.

Steps to Reproduce

  1. Install crewai_tools version 0.36.0 (other versions above 0.32.0 may also be affected)
  2. Create a simple FastAPI application
  3. Import and instantiate SeleniumScrapingTool at module level:
    from crewai_tools import SeleniumScrapingTool
    
    # This line alone causes a Chrome browser to open during import
    selenium_scraper = SeleniumScrapingTool()
  4. Run the application with uvicorn main:app --reload
  5. Make a change to any file to trigger a reload
  6. Observe that a new Chrome window opens with each reload, while old processes remain in memory

Expected behavior

The Chrome WebDriver should only be initialized when the tool is actually used (when its _run method is called), not when the tool is instantiated. Additionally, browser instances should be properly cleaned up when no longer needed.
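
Until the tool itself behaves this way, the same effect can be approximated from the calling side by deferring instantiation to first use. The following is a minimal sketch, not part of crewai_tools: `make_lazy`, `_build_scraper`, and `get_scraper` are hypothetical names chosen for illustration, and it assumes `SeleniumScrapingTool()` can be constructed with no arguments, as in the reproduction steps.

```python
from functools import lru_cache
from typing import Callable, TypeVar

T = TypeVar("T")


def make_lazy(factory: Callable[[], T]) -> Callable[[], T]:
    """Wrap a zero-argument factory so it runs at most once, on first call."""

    @lru_cache(maxsize=1)
    def get() -> T:
        return factory()

    return get


def _build_scraper():
    # Imported and instantiated only when first requested, so importing this
    # module does not construct the tool (and therefore opens no browser).
    from crewai_tools import SeleniumScrapingTool

    return SeleniumScrapingTool()


# Module-level name that is safe to import: nothing is constructed yet.
get_scraper = make_lazy(_build_scraper)
```

A request handler would then call `get_scraper()` where it previously referenced the module-level `selenium_scraper`. This only postpones driver creation; it does not close a browser that has already been opened.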

Code snippets

The problematic code is in the __init__ method of the SeleniumScrapingTool class, where it creates a WebDriver instance immediately:

def __init__(
    self,
    website_url: Optional[str] = None,
    cookie: Optional[dict] = None,
    css_element: Optional[str] = None,
    **kwargs,
):
    super().__init__(**kwargs)
    try:
        from selenium import webdriver
        from selenium.webdriver.chrome.options import Options
        from selenium.webdriver.common.by import By
    except ImportError:
        # Import handling code...
        raise  # placeholder so the excerpt parses; the original handling is elided

    # This line creates a WebDriver instance immediately upon instantiation
    self.driver = webdriver.Chrome()
    # ...

This is particularly problematic in development environments with hot-reloading, as each reload creates a new browser instance without properly cleaning up the previous ones.

Operating System

macOS

Python Version

3.11.4

crewAI Version

0.102.0

crewAI Tools Version

0.36.0

Evidence

When running a FastAPI application with hot-reloading enabled, you can observe:

  1. Chrome browser windows automatically opening during application startup
  2. New Chrome windows opening with each reload
  3. Increasing memory usage over time as shown in Activity Monitor/Task Manager
  4. Multiple Chrome processes still running in the background even after the application is stopped
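
One rough way to put a number on observations 3 and 4 is to count Chrome-related processes before and after a few reloads. The helper below is a sketch that assumes a macOS or Linux `ps` supporting `-A -o comm=`; the function names and the `"chrome"` substring match are illustrative choices, and the count also includes any Chrome windows opened by hand.

```python
import subprocess
from typing import Iterable, List


def list_process_names() -> List[str]:
    """Return the command name of every running process, as reported by `ps`."""
    result = subprocess.run(
        ["ps", "-A", "-o", "comm="],
        capture_output=True,
        text=True,
        check=True,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def count_matching(names: Iterable[str], needle: str = "chrome") -> int:
    """Count process names that contain `needle`, ignoring case."""
    return sum(1 for name in names if needle in name.lower())


if __name__ == "__main__":
    # Run once before starting the app and again after several reloads.
    print(count_matching(list_process_names()))
```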

The issue does not occur with crewai_tools version 0.32.0, which suggests the behavior was introduced in a later release.

Possible Solution

The solution is to implement lazy initialization of the WebDriver in the SeleniumScrapingTool class:

  1. Store the WebDriver class instead of creating an instance in __init__
  2. Add a _create_driver_instance method that creates the WebDriver only when needed
  3. Modify _create_driver to use the lazy initialization
  4. Improve the close method to ensure proper cleanup of resources

This approach ensures that:

  • No browser window opens until the tool is actually used
  • The tool still works exactly the same way when it is used
  • Proper cleanup happens to prevent memory leaks
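
The four steps above can be sketched roughly as follows. This is an illustrative stand-in rather than the actual SeleniumScrapingTool or the proposed patch: the class name `LazySeleniumScraper`, the `driver_factory` parameter, and the simplified `_run` body are assumptions made so the example stays self-contained, and the real tool's `_create_driver` logic (cookies, CSS element selection) is left out.

```python
from typing import Any, Callable, Optional


class LazySeleniumScraper:
    """Illustrative stand-in showing lazy WebDriver creation (hypothetical class)."""

    def __init__(self, driver_factory: Optional[Callable[[], Any]] = None):
        # Step 1: store a factory for the driver, not a driver instance.
        self._driver_factory = driver_factory or self._default_factory
        self.driver: Optional[Any] = None

    @staticmethod
    def _default_factory() -> Any:
        # Imported here so that constructing the tool never touches Selenium.
        from selenium import webdriver

        return webdriver.Chrome()

    def _create_driver_instance(self) -> Any:
        # Step 2: create the driver on first use and reuse it afterwards.
        if self.driver is None:
            self.driver = self._driver_factory()
        return self.driver

    def _run(self, website_url: str) -> str:
        # Step 3: the browser only comes into existence inside _run.
        driver = self._create_driver_instance()
        try:
            driver.get(website_url)
            return driver.page_source
        finally:
            self.close()

    def close(self) -> None:
        # Step 4: quit() ends the browser session; calling close() twice is safe.
        if self.driver is not None:
            try:
                self.driver.quit()
            finally:
                self.driver = None
```

Injecting the factory is a design choice made here for testability: a fake driver can be passed in to check that nothing is created at construction time and that `quit()` is called after each run.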

I'm willing to submit a PR with this fix if desired.
