[Animated data flow diagram]

AI Agent for Conversational Document Search in Supabase with OpenAI

Version: 1.0.0 | Last Updated: 2025-05-16

Integrates with:

Supabase, OpenAI, Langchain

Overview

Unlock Conversational Access to Your Documents with this AI Agent

Tired of manually sifting through document repositories? This AI Agent transforms your Supabase-stored files into an interactive knowledge base. It automates the ingestion, processing, and vectorization of your documents (PDFs, text files), allowing you to 'chat' with your data using natural language queries powered by OpenAI. Get precise, context-aware answers instantly, instead of spending hours searching.

Key Features & Benefits

  • Automated Document Pipeline: Fetches new files from Supabase storage, extracts text from PDFs and TXT files, and intelligently chunks content for optimal processing.
  • Advanced Semantic Search: Leverages OpenAI's text-embedding-3-small model to create powerful vector embeddings, enabling search based on meaning, not just keywords.
  • Supabase Vector Integration: Seamlessly stores and queries document embeddings using your Supabase project as a vector database (in a table typically named documents).
  • Conversational AI Agent: Provides a chat interface (via n8n's Chat Trigger) allowing users to ask questions in natural language and receive answers sourced directly from the ingested documents.
  • Efficient Processing: Includes logic to check against a Supabase files table to prevent re-processing of already indexed documents, saving on resources and time.
  • Langchain Powered: Utilizes Langchain components for robust AI agent construction, document loading, text splitting, and tool usage (Vector Store Tool).
  • Clear File Provenance: Embeds file identifiers as metadata during vectorization, enabling source tracking for retrieved information.

How it Works

This AI Agent operates in two main phases:

  1. Document Ingestion & Indexing: When triggered (e.g., manually or on a schedule), the workflow scans your specified Supabase storage bucket for new files. For each new file, it downloads the file, extracts the text content, splits the text into manageable chunks, generates embeddings using OpenAI, and then stores these embeddings along with metadata (such as the file ID) in your Supabase vector store. It also records the processed file in a separate Supabase files table to avoid duplicates.
  2. Conversational Q&A: An n8n Chat Trigger node exposes an endpoint for your chat application. When a user sends a query, the Langchain-powered AI Agent uses this query and an OpenAI chat model to search the vectorized documents in Supabase. It retrieves the most relevant document chunks and synthesizes an answer, effectively allowing you to 'chat with your files'.
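The ingestion phase above can be sketched in a few lines. This is a minimal illustration, not the workflow's actual code: the chunk sizes, function names, and the `embed` callback are assumptions, and the splitter below is a simplified character-window approximation of Langchain's Recursive Character Text Splitter.

```python
def split_text(text: str, chunk_size: int = 500, chunk_overlap: int = 50) -> list[str]:
    """Greedy character-based splitter: emit windows of chunk_size
    characters that overlap by chunk_overlap characters (a rough stand-in
    for Langchain's recursive splitter)."""
    if chunk_size <= chunk_overlap:
        raise ValueError("chunk_size must exceed chunk_overlap")
    chunks, start = [], 0
    step = chunk_size - chunk_overlap
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += step
    return chunks


def to_vector_rows(text: str, file_id: str, embed) -> list[dict]:
    """Build rows for the Supabase vector table. Storing file_id in the
    metadata column is what gives each chunk its provenance, so answers
    can be traced back to the source file."""
    return [
        {"content": chunk, "embedding": embed(chunk), "metadata": {"file_id": file_id}}
        for chunk in split_text(text)
    ]
```

The `embed` parameter would wrap a call to OpenAI's embeddings endpoint (text-embedding-3-small in this workflow); it is left abstract here so the sketch stays self-contained.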

Use Cases

  • B2C E-commerce: Instantly answer customer support questions by enabling your team to chat with your entire library of product specifications, FAQs, and return policy documents.
  • B2B SaaS: Equip your sales and support teams with an AI assistant that can quickly find precise information within extensive technical documentation, case studies, and internal wikis.
  • Founders & CTOs: Create a centralized, searchable knowledge hub from diverse company documents (project plans, research findings, meeting notes) to accelerate decision-making and knowledge sharing.
  • Heads of Automation: Deploy a scalable solution for internal Q&A, reducing the time employees spend searching for information and improving access to company knowledge.

Prerequisites

  • An n8n instance (Cloud or self-hosted).
  • Supabase project with Storage enabled. You'll need your Supabase URL and an API key (the anon key suffices for the client-side operations shown; use the service_role key for backend operations such as inserts or HTTP requests to private storage, if your permissions require it).
  • OpenAI API Key with access to an embedding model (workflow configured for text-embedding-3-small) and a chat model (e.g., gpt-3.5-turbo, gpt-4).
  • In Supabase:
    • A table named files to track processed files (recommended columns: name TEXT, plus storage_id as UUID or TEXT to hold the file's storage object ID).
    • A table (e.g., documents) for the vector store. The Langchain Supabase node can often create this with a default schema (content, embedding, metadata).
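The files table exists purely for deduplication: before processing a storage object, the workflow checks whether its name is already recorded. A hedged sketch of that check, with illustrative names (the actual comparison happens in the workflow's 'If' node):

```python
def select_new_files(storage_objects: list[dict], processed_names: set[str]) -> list[dict]:
    """Keep storage objects that are not yet recorded in the files table,
    and skip Supabase's folder placeholder objects so they are never
    downloaded or embedded."""
    return [
        obj for obj in storage_objects
        if obj["name"] not in processed_names
        and obj["name"] != ".emptyFolderPlaceholder"
    ]
```

In the workflow, `storage_objects` corresponds to the storage list response and `processed_names` to the names read from your files table.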

Setup Instructions

  1. Download the n8n workflow JSON file.
  2. Import the workflow into your n8n instance.
  3. Configure Credentials:
    • Create n8n credentials for Supabase (using your Supabase URL and relevant API key) and OpenAI (using your API key).
    • Update all Supabase-related nodes ('Get All files' HTTP Request, 'Get All Files' Supabase node, 'Download' HTTP Request, 'Create File record2', 'Insert into Supabase Vectorstore', 'Supabase Vector Store') to use your Supabase credential.
    • Update all OpenAI nodes ('Embeddings OpenAI', 'Embeddings OpenAI2', 'OpenAI Chat Model1', 'OpenAI Chat Model2') to use your OpenAI credential.
  4. Configure Supabase Integration Points:
    • Get All files (HTTP Request for Storage List): Modify the URL to point to your Supabase project's storage API endpoint for listing files in the desired bucket (e.g., https://YOUR_PROJECT_REF.supabase.co/storage/v1/object/list/YOUR_BUCKET_NAME). Adjust the jsonBody for prefix, limit, etc., as needed.
    • Get All Files (Supabase Table Read): Ensure this node is configured to read from your files table that tracks processed files.
    • Download (HTTP Request): Update the URL expression to correctly construct the download URL for files in your private storage bucket (e.g., https://YOUR_PROJECT_REF.supabase.co/storage/v1/object/private/{{ $json.name }}). Ensure authentication (Supabase API credential) is correctly set if accessing private files.
    • Create File record2 (Supabase Node): Ensure this node writes to your files table, mapping the correct fields (e.g., name, storage_id).
    • Insert into Supabase Vectorstore & Supabase Vector Store Nodes: Set the correct 'Table Name' (e.g., documents) for your vector store.
  5. Review File Processing Logic:
    • The 'If' node filters out already processed files and placeholder files (like '.emptyFolderPlaceholder'). Adjust conditions if your storage has different conventions.
    • The 'Switch' node routes based on file type (TXT, PDF). You can extend this for other file types and add corresponding extraction logic.
    • 'Recursive Character Text Splitter': Adjust chunkSize and chunkOverlap based on your content and chosen embedding model's context window.
  6. AI Agent Configuration:
    • 'Embeddings OpenAI' / 'Embeddings OpenAI2': Ensure the text-embedding-3-small model (or your preferred model) is selected.
    • 'OpenAI Chat Model1' / 'OpenAI Chat Model2': Select your desired chat model.
    • 'Vector Store Tool1': The description here is used by the AI agent to understand when to use this tool. Keep it descriptive.
  7. Set up Triggers:
    • Manual Trigger ('When clicking ‘Test workflow’'): Used for initiating the file ingestion process. You can replace this with a 'Schedule' trigger for regular automated runs.
    • Chat Trigger ('When chat message received'): This node exposes a webhook URL. Use this URL in your chat frontend application to send user messages to the AI Agent.
  8. Test the file ingestion part by running it manually with a few sample files in your Supabase storage bucket.
  9. Test the chat functionality by sending requests to the Chat Trigger's webhook URL.
  10. Activate the workflow for continuous operation.
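For step 4, the two Storage endpoints can be composed as follows. This is a sketch with placeholder values: `project_ref`, `bucket`, and the download path segment are assumptions you must adapt (the workflow's own example uses `private` in the download path, which depends on your bucket name and access mode).

```python
def storage_list_url(project_ref: str, bucket: str) -> str:
    """Endpoint the 'Get All files' node POSTs to; the JSON body carries
    prefix, limit, and similar list options."""
    return f"https://{project_ref}.supabase.co/storage/v1/object/list/{bucket}"


def storage_download_url(project_ref: str, path: str) -> str:
    """Endpoint pattern used by the 'Download' node. For private files,
    the request must carry your Supabase credential's auth headers."""
    return f"https://{project_ref}.supabase.co/storage/v1/object/{path}"
```

For example, `storage_list_url("YOUR_PROJECT_REF", "YOUR_BUCKET_NAME")` reproduces the URL shown in step 4, and the download helper mirrors the URL expression that substitutes `{{ $json.name }}` at runtime.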

Tags:

AI Agent, Supabase, OpenAI, Document Processing, RAG, Knowledge Base, Chatbot, Automation, Langchain

Want your own unique AI agent?

Talk to us - we know how to build custom AI agents for your specific needs.

Schedule a Consultation