
AI RAG Chatbot: Query EPUBs with Supabase, OpenAI & n8n

Version: 1.0.0 | Last Updated: 2025-05-16

Integrates with:

OpenAI, Supabase, Google Drive, Langchain

Overview

Unlock Intelligent Document Q&A with this AI Agent

This n8n workflow empowers you to build a sophisticated Retrieval Augmented Generation (RAG) AI Agent. It processes documents (specifically EPUBs in this example, loaded from Google Drive), breaks them into manageable chunks, generates vector embeddings using OpenAI, and stores them in a Supabase PostgreSQL database equipped with the pgvector extension. Users can then interact with a chat interface to ask questions, and the agent retrieves relevant document sections from Supabase to provide informed answers using an OpenAI chat model.

This agent handles the full lifecycle: initial document ingestion, embedding, vector storage, intelligent retrieval, and AI-powered question answering. It also includes examples for upserting (updating) documents in your vector store and provides guidance on Supabase setup and data deletion.

Key Features & Benefits

  • End-to-End RAG Pipeline: Automates document loading, chunking, embedding, vector storage, and AI-driven Q&A.
  • Google Drive Integration: Loads EPUB files directly from Google Drive (easily adaptable for other sources/formats).
  • OpenAI Powered: Leverages OpenAI's text-embedding-3-small for efficient embeddings and powerful chat models (e.g., GPT-3.5 Turbo, GPT-4) for natural language responses.
  • Supabase Vector Store: Utilizes Supabase with pgvector for scalable and efficient semantic search over your documents.
  • Chat Interface: Provides an interactive chat trigger for users to ask questions about the document content.
  • Document Management: Demonstrates how to insert new documents and upsert (update) existing ones in the vector store.
  • Comprehensive Setup Guidance: Includes detailed notes within the workflow for Supabase database and table preparation, including required SQL functions.
  • Customizable Knowledge Base: Easily adapt to create a chatbot for your own books, manuals, or any text-based knowledge source.
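Under the hood, the semantic search performed by the Supabase Vector Store nodes reduces to a pgvector nearest-neighbour query. A minimal sketch, assuming a documents table with a vector(1536) embedding column (the table and column names are illustrative; <=> is pgvector's cosine-distance operator, and :query_embedding stands for the vector produced by the same OpenAI embedding model used at insertion time):

```sql
-- Five most similar chunks to a query embedding (cosine distance).
select content,
       1 - (embedding <=> :query_embedding) as similarity
from documents
order by embedding <=> :query_embedding
limit 5;
```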

Managing Vector Data (Deletion)

While this workflow provides nodes for inserting and upserting documents in Supabase, n8n currently lacks a dedicated Langchain node for deleting documents from a Supabase vector store. The workflow includes a sticky note ('Sticky Note4') with guidance on performing deletions via an n8n HTTP Request node that calls the Supabase API directly, so you still have a path for complete data lifecycle management.
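As a complement to the HTTP Request approach described in the sticky note, rows can also be removed directly in the Supabase SQL Editor. A hedged sketch, assuming a documents table whose metadata column records a file_id for each source document (adjust the table name and metadata key to match your schema):

```sql
-- Delete all stored chunks that originated from one source file.
delete from documents
where metadata->>'file_id' = 'YOUR_FILE_ID';
```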

Use Cases

  • Creating a dedicated chatbot for a specific book, like the example 'How To Transform Your Life'.
  • Building an internal Q&A tool for company policies, technical documentation, or HR manuals stored as EPUBs.
  • Developing an AI assistant for students to query textbooks or research papers.
  • Offering an intelligent help system for users to find information within product guides.

Prerequisites

  • An n8n instance (Cloud or self-hosted).
  • OpenAI API Key with access to an embedding model (e.g., text-embedding-3-small) and a chat model (e.g., gpt-3.5-turbo or gpt-4).
  • Supabase account and project.
  • Supabase Database Setup:
    • The pgvector extension must be enabled in your Supabase database (Database > Extensions > Search for 'vector').
    • A table specifically configured for vector storage (e.g., Kadampa). This table requires columns like embedding VECTOR(1536) (adjust dimension for your chosen embedding model), metadata JSONB, and content TEXT. Refer to 'Sticky Note2' in the workflow for the exact ALTER TABLE SQL query.
    • The custom SQL function match_documents must be created in your Supabase SQL Editor. The required SQL is provided in 'Sticky Note2' in the workflow. Remember to replace "YOUR TABLE NAME" in the function's SQL with your actual table name.
    • Appropriate Row Level Security (RLS) policies must be configured for your table in Supabase (Authentication > Policies).
  • Google Drive credentials (if using the Google Drive node to load documents).
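For orientation, the Supabase preparation above typically amounts to SQL along the lines below. This follows the standard Supabase/Langchain vector-store template, so treat it as a sketch and defer to 'Sticky Note2' for the exact statements; the table name documents and dimension 1536 (for text-embedding-3-small) are assumptions:

```sql
-- 1. Enable the pgvector extension.
create extension if not exists vector;

-- 2. Table holding document chunks and their embeddings.
create table if not exists documents (
  id bigserial primary key,
  content text,           -- chunk text
  metadata jsonb,         -- source info (file name, page, etc.)
  embedding vector(1536)  -- must match your embedding model's dimension
);

-- 3. Similarity-search function the Supabase Vector Store nodes call.
create or replace function match_documents (
  query_embedding vector(1536),
  match_count int default null,
  filter jsonb default '{}'
) returns table (id bigint, content text, metadata jsonb, similarity float)
language plpgsql
as $$
begin
  return query
  select documents.id, documents.content, documents.metadata,
         1 - (documents.embedding <=> query_embedding) as similarity
  from documents
  where documents.metadata @> filter
  order by documents.embedding <=> query_embedding
  limit match_count;
end;
$$;
```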

Setup Instructions

  1. Prepare Supabase (Crucial): Follow the detailed instructions in 'Sticky Note2' within the imported n8n workflow. This involves:
     a. Enabling the pgvector extension.
     b. Creating or altering your target table with the specified columns (embedding, metadata, content). Ensure the VECTOR dimension matches your OpenAI embedding model (e.g., 1536 for text-embedding-3-small).
     c. Creating the match_documents SQL function using the Supabase SQL Editor, replacing placeholders with your table name.
     d. Configuring Row Level Security policies for your table.
  2. Download the n8n workflow JSON file.
  3. Import the workflow into your n8n instance.
  4. Configure n8n Credentials:
     a. Create and select your OpenAI credential in all OpenAI nodes ('Embeddings OpenAI Insertion', 'Embeddings OpenAI Retrieval', 'Embeddings OpenAI Upserting', 'OpenAI Chat Model').
     b. Create and select your Supabase credential (using API URL and Anon Key) in all Supabase Vector Store nodes ('Insert Documents', 'Retrieve by Query', 'Update Documents').
     c. If using the 'Google Drive' node, configure and select your Google Drive credential.
  5. Configure Document Insertion Path:
     a. In the 'Google Drive' node, specify the URL or File ID of the EPUB document you want to process.
     b. Verify the 'Embeddings OpenAI Insertion' node is set to your desired embedding model (e.g., text-embedding-3-small).
     c. In the 'Insert Documents' (Supabase Vector Store) node, select your Supabase credential and choose the correct table name you prepared in Step 1.
     d. To populate your vector store, manually run the workflow from the 'Google Drive' node up to 'Insert Documents'.
  6. Configure Document Retrieval & Q&A Path:
     a. Ensure the 'Embeddings OpenAI Retrieval' node uses the same OpenAI credential and embedding model as the insertion path.
     b. In the 'Retrieve by Query' (Supabase Vector Store) node, select your Supabase credential and the correct table name, and ensure the queryName parameter is match_documents.
     c. In the 'OpenAI Chat Model' node, select your OpenAI credential and preferred chat model.
     d. Customize the initialMessages in the 'When chat message received' (Chat Trigger) node if desired. The webhook URL for this trigger is your chatbot endpoint.
  7. Configure Document Upserting Path (Optional):
     a. The 'Placeholder (File/Content to Upsert)' node is for demonstration. You'll need to adapt it or connect an upstream node that provides the id of the document to update and its new content/metadata.
     b. Configure the 'Embeddings OpenAI Upserting' and 'Update Documents' (Supabase Vector Store) nodes similarly to the insertion path, ensuring the correct table and document id are specified for updates.
  8. Test the chatbot by sending a message (question) to the webhook URL provided by the 'When chat message received' node after activating the workflow.
  9. Activate the workflow for ongoing use.
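After running the insertion path (Step 5), a couple of quick queries in the Supabase SQL Editor can confirm the setup worked; documents is an assumed table name, so substitute your own:

```sql
-- pgvector should appear among the installed extensions.
select extname from pg_extension where extname = 'vector';

-- The insertion run should have produced one row per document chunk.
select count(*) from documents;
```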

Tags:

AI Agent, RAG, Supabase, OpenAI, Vector Database, Chatbot, Document Q&A, EPUB, Knowledge Base, Automation, Langchain

Want your own unique AI agent?

Talk to us - we know how to build custom AI agents for your specific needs.

Schedule a Consultation