
AI RAG Agent: Context-Aware Document Chunking from Google Drive to Pinecone with Gemini

Version: 1.0.0 | Last Updated: 2025-05-16

Integrates with:

Google Drive, OpenRouter, Google Gemini, Pinecone

Overview

Unlock Advanced Document Understanding and Retrieval with this AI Agent

This n8n workflow builds the ingestion side of a Retrieval-Augmented Generation (RAG) system by processing and vectorizing documents from Google Drive. Instead of relying on plain text splitting alone, it uses an AI Agent (an LLM such as Gemini, accessed through OpenRouter) to generate a short contextual summary for each document section. These enriched chunks are then embedded with Google Gemini and stored in Pinecone, where they serve as a knowledge base for your AI applications.

Key Features & Benefits

  • Automated Document Ingestion: Fetches specified documents directly from your Google Drive.
  • Custom Text Splitting: Uses a custom Code node to split documents into sections at a defined separator marker ([SECTIONEND]).
  • AI-Powered Contextual Chunking: For each section, an AI Agent (using OpenRouter for LLM access) generates a concise context that situates the chunk within the overall document. The added context is intended to make retrieved chunks more relevant in RAG queries.
  • High-Quality Embeddings: Uses Google Gemini's text-embedding-004 model to create vector embeddings for the context-enriched chunks.
  • Efficient Vector Storage: Inserts the generated vectors into your Pinecone index, ready for similarity search.
  • Flexible LLM Integration: Uses OpenRouter, allowing you to choose from various LLMs (including Gemini) for the context generation task.
  • Streamlined RAG Pipeline: Provides a robust foundation for building Q&A systems, intelligent search, or any application requiring deep document understanding.
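
The splitting and contextual-chunking features above can be pictured with a small standalone sketch. It is not the workflow's actual Code node: the function names, the way context and section are joined, and the placeholder context function (which stands in for the LLM call) are all assumptions made for illustration.

```python
from typing import Callable, List

# Marker named in the feature list; the workflow's full delimiter wraps it in dashes.
SECTION_MARKER = "[SECTIONEND]"


def split_sections(text: str, marker: str = SECTION_MARKER) -> List[str]:
    """Split a document on the marker and drop empty pieces."""
    return [part.strip() for part in text.split(marker) if part.strip()]


def enrich_sections(
    document: str,
    sections: List[str],
    make_context: Callable[[str, str], str],
) -> List[str]:
    """Prefix each section with a short context produced by make_context.

    Joining context and section with a blank line is an assumed layout.
    """
    enriched = []
    for section in sections:
        context = make_context(document, section)
        enriched.append(f"{context}\n\n{section}")
    return enriched


def placeholder_context(document: str, section: str) -> str:
    """Hypothetical stand-in for the LLM call: reports the section's position only."""
    position = document.find(section)
    return f"Section starting at character {position} of the document."


doc = "Intro text.[SECTIONEND]Pricing details.[SECTIONEND]"
sections = split_sections(doc)
chunks = enrich_sections(doc, sections, placeholder_context)
```

In the real workflow the context comes from the AI Agent, and the enriched text is what gets embedded and written to Pinecone.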

Use Cases

  • Building intelligent Q&A bots over specific Google Drive documents.
  • Creating context-aware semantic search for internal knowledge bases stored in Drive.
  • Powering AI assistants that need to retrieve and understand nuanced information from lengthy documents.
  • Automating the creation of highly relevant vector embeddings for custom RAG applications.
  • Improving search accuracy by enriching document chunks with AI-generated contextual summaries.

Prerequisites

  • An n8n instance (Cloud or self-hosted).
  • Google Drive credentials with access to the target document(s) (OAuth2 connection configured in n8n).
  • OpenRouter API Key.
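  • Pinecone API Key and an existing Pinecone index. The workflow is pre-configured for an index named 'context-rag-test'; substitute your own index name if it differs. The index dimension must match the embedding model's output size (768 for text-embedding-004).
  • Google Gemini API Key, created in Google AI Studio (formerly MakerSuite) and stored in n8n as a Google Gemini (PaLM) credential.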

Setup Instructions

  1. Download the n8n workflow JSON file.
  2. Import the workflow into your n8n instance.
  3. Configure the 'Get Document From Google Drive' node: select your Google Drive OAuth2 credentials and update the 'File ID' parameter to point to your desired document.
  4. Configure the 'OpenRouter Chat Model' node: select or create your OpenRouter API credentials. You can also adjust the model used within its parameters.
  5. Review the prompt in the 'AI Agent - Prepare Context' node. It's designed to take the full document context and the current chunk to generate a contextual summary.
  6. Configure the 'Embeddings Google Gemini' node: select or create your Google Gemini (PaLM) API credentials.
  7. Configure the 'Pinecone Vector Store' node: select or create your Pinecone API credentials and ensure the 'Pinecone Index' parameter matches your target index name.
  8. Check the delimiter in the 'Split Document Text Into Sections' Code node. It splits on '—---------------------------—-------------[SECTIONEND]—---------------------------—-------------'. Either place this exact string between the sections of your Google Drive document, or edit the Code node to use a delimiter of your own.
  9. Run the workflow manually via the 'When clicking ‘Test workflow’' trigger. A manual trigger only fires when you start it from the editor, so add a schedule or event-based trigger if you want the workflow to run on its own once activated.
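
Two of the setup steps lend themselves to a quick local check before running the workflow: the prompt that pairs the full document with the current chunk (step 5) and the delimiter your document must contain (step 8). The sketch below is illustrative only. The prompt wording, the template layout, and both function names are hypothetical; the real prompt is the one stored in the 'AI Agent - Prepare Context' node.

```python
# Marker from the setup steps; pass the node's full delimiter string to test an exact match.
DELIMITER_MARKER = "[SECTIONEND]"

# Hypothetical prompt wording, not the text used by the workflow node.
PROMPT_TEMPLATE = (
    "<document>\n{document}\n</document>\n\n"
    "<chunk>\n{chunk}\n</chunk>\n\n"
    "Write a short context that situates this chunk within the document."
)


def build_context_prompt(document: str, chunk: str) -> str:
    """Combine the full document and one chunk into a single prompt string."""
    return PROMPT_TEMPLATE.format(document=document, chunk=chunk)


def count_sections(document: str, marker: str = DELIMITER_MARKER) -> int:
    """Return how many non-empty sections the document would split into."""
    return sum(1 for part in document.split(marker) if part.strip())


sample = "First section.[SECTIONEND]Second section.[SECTIONEND]"
prompt = build_context_prompt(sample, "Second section.")
section_count = count_sections(sample)
```

A count of 1 for a long document suggests the delimiter is missing or does not match the Code node character for character, in which case the whole document would be treated as a single chunk.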

Tags:

AI Agent, RAG, Google Drive, Pinecone, Gemini, OpenRouter, NLP, Document Processing, Vector Database, Automation, Contextual Chunking
