AI Vision Web Scraper Agent using Selenium & OpenAI
Overview
Unlock Advanced Web Scraping with this AI Agent
This workflow transforms n8n into a powerful AI-driven web scraping agent. Instead of relying on fragile CSS selectors, this agent uses Selenium to take a screenshot of a webpage and leverages OpenAI's GPT-4o vision model to 'see' and extract the exact information you need. It can intelligently find the right page to scrape via Google search, handle sites requiring a login by injecting session cookies, and is designed to be resilient against common anti-bot measures.
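As a sketch of the vision step: once the workflow has a PNG screenshot from Selenium, it can be embedded as a base64 data URL inside an OpenAI chat-completions request so GPT-4o can "read" the page. The helper below is illustrative, not the workflow's internal node configuration.

```python
import base64
import json


def build_vision_request(png_bytes: bytes, instruction: str) -> dict:
    """Build an OpenAI chat-completions payload that asks GPT-4o to
    read a page screenshot and extract the requested information."""
    data_url = "data:image/png;base64," + base64.b64encode(png_bytes).decode("ascii")
    return {
        "model": "gpt-4o",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": instruction},
                    {"type": "image_url", "image_url": {"url": data_url}},
                ],
            }
        ],
    }


payload = build_vision_request(b"\x89PNG...", "Extract the product name and price.")
print(json.dumps(payload)[:80])
```

Because the extraction prompt travels with the image, changing what you scrape means changing one sentence, not rewriting selectors.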
Key Features & Benefits
- AI-Powered Vision Extraction: Uses GPT-4o to analyze screenshots, making the scraping process resilient to changes in a website's underlying HTML structure.
- Intelligent Target Discovery: If you don't have a direct URL, the agent can perform a Google search to find the correct page based on your keywords.
- Authenticated Scraping: Seamlessly injects session cookies provided in the webhook payload to access and scrape data from behind login walls.
- Bot Evasion Techniques: Implements basic measures like overriding the browser's user agent and navigator properties to avoid common bot-detection systems.
- Flexible & Robust: Triggered via a webhook, it dynamically adapts to your data extraction requirements and includes comprehensive session management and error handling.
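To make the evasion measures concrete, here is a hedged sketch of a W3C WebDriver new-session payload of the kind the 'Create Selenium Session' node might send; the exact flags used by the workflow may differ.

```python
from typing import Optional


def build_session_payload(user_agent: str, proxy: Optional[str] = None) -> dict:
    """W3C WebDriver new-session body with basic bot-evasion flags
    (illustrative; compare with the actual node's capabilities)."""
    args = [
        "--disable-blink-features=AutomationControlled",  # hides one automation hint
        f"--user-agent={user_agent}",                     # replace the default headless UA
    ]
    if proxy:
        args.append(f"--proxy-server={proxy}")            # optional residential proxy
    return {
        "capabilities": {
            "alwaysMatch": {
                "browserName": "chrome",
                "goog:chromeOptions": {
                    "args": args,
                    "excludeSwitches": ["enable-automation"],  # drop the automation infobar switch
                },
            }
        }
    }
```

POSTing this body to `http://selenium_chrome:4444/session` opens the session; further hardening (e.g., patching `navigator.webdriver` via an injected script) can be layered on top.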
Use Cases
- Automating competitive analysis by scraping pricing, features, and reviews from competitor websites.
- Gathering B2B lead data from directories or company websites by visually extracting contact details and company information.
- Monitoring brand mentions or product listings across various e-commerce platforms without writing custom parsers.
- Aggregating market research data from articles, forums, or reports that are difficult to scrape with traditional tools.
Prerequisites
- An n8n instance (Cloud or self-hosted).
- A running Selenium container. The workflow is pre-configured for a Docker setup (see project repo: https://github.com/Touxan/n8n-ultimate-scraper).
- An OpenAI API Key with access to the `gpt-4o` model.
- (Optional) A residential proxy service for large-scale scraping.
- (Optional) The companion browser extension for easily extracting session cookies.
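Before wiring up the workflow, it is worth confirming the Selenium container is reachable. The W3C WebDriver `GET /status` endpoint reports readiness; the small parser below shows what to look for in its response.

```python
import json


def is_selenium_ready(status_body: str) -> bool:
    """Parse the JSON body returned by GET /status on the Selenium
    container (W3C WebDriver) and report whether it can take sessions."""
    return bool(json.loads(status_body).get("value", {}).get("ready"))


# Example body as returned by: curl http://selenium_chrome:4444/status
sample = '{"value": {"ready": true, "message": "Selenium Grid ready."}}'
print(is_selenium_ready(sample))
```

If `ready` is false, check the container logs before debugging anything on the n8n side.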
Setup Instructions
- Deploy the Selenium container using the Docker Compose file from the associated GitHub project: https://github.com/Touxan/n8n-ultimate-scraper.
- Import the workflow JSON into your n8n instance.
- Configure all OpenAI nodes (`OpenAI`, `OpenAI Chat Model`, `Information Extractor`) with your OpenAI API Key.
- Verify that the HTTP Request nodes pointing to Selenium (e.g., `http://selenium_chrome:4444/...`) are correct for your network. `selenium_chrome` must be the resolvable hostname of your Selenium container.
- (Optional) For scraping behind logins, install the cookie-collection browser extension from the project's GitHub.
- (Optional) To use a proxy, configure it in your Docker setup and add the `--proxy-server` argument to the 'Create Selenium Session' node's capabilities.
- Activate the workflow and trigger it via its webhook URL using a POST request with a JSON body.