AI Agent: Local LLM Performance Analyzer & Benchmarking Tool
Integrates with: LM Studio, Google Sheets (optional)
Overview
Unlock Optimized AI with this Local LLM Performance Analyzer
This AI Agent empowers you to systematically test, analyze, and compare the performance of various Large Language Models (LLMs) running locally via LM Studio. Stop guessing and start making data-driven decisions about which local LLM best suits your specific business automation needs. This agent automates the prompting, response gathering, and analysis, providing clear metrics for evaluation.
Key Features & Benefits
- Automated LLM Benchmarking: Connects directly to your LM Studio instance to retrieve a list of loaded models and iteratively tests each one based on your input prompt.
- Dynamic Prompting & Model Iteration: Sends your test prompts to multiple local LLMs fetched from LM Studio dynamically.
- AI-Driven Response Analysis: Utilizes a custom code node to calculate crucial performance and quality metrics for each LLM's response (a sketch of this calculation follows this list), including:
  - Flesch-Kincaid Readability Score
  - Word Count & Sentence Count
  - Average Sentence Length & Average Word Length
  - Response Time (latency calculation between start and end markers)
- Customizable System Prompts: Tailor system messages in the 'Add System Prompt' node to guide LLM behavior (e.g., for conciseness, specific tone, or reading level), ensuring fair comparisons across models.
- Adjustable LLM Parameters: Fine-tune settings like Temperature, Top P, and Presence Penalty directly within the 'Run Model with Dynamic Inputs' node for advanced model behavior control during tests.
- Comprehensive Reporting (Optional): Automatically logs detailed test results (prompt, model, response, all calculated metrics, timestamps) to a Google Sheet for easy review, comparison, and historical tracking.
- Streamlined Evaluation: Quickly identify the most performant, readable, or concise local LLM for your applications, saving significant manual effort in model selection.
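For illustration, here is a minimal TypeScript sketch of how such response metrics can be computed. It assumes the Flesch Reading Ease formula and a simple vowel-group syllable estimate; the workflow's actual 'Analyze LLM Response Metrics' code node may implement the details differently.

```typescript
// Minimal sketch of the response-metrics calculation (not the workflow's exact code node).
// Assumption: "Flesch-Kincaid Readability Score" refers to the Flesch Reading Ease formula.

interface ResponseMetrics {
  wordCount: number;
  sentenceCount: number;
  avgSentenceLength: number;
  avgWordLength: number;
  readabilityScore: number;
  totalTimeMs: number;
}

// Rough syllable estimate: count vowel groups per word (good enough for relative comparisons).
function countSyllables(word: string): number {
  const groups = word.toLowerCase().match(/[aeiouy]+/g);
  return Math.max(1, groups ? groups.length : 0);
}

function analyzeResponse(text: string, timeSent: Date, timeReceived: Date): ResponseMetrics {
  const words = text.split(/\s+/).filter(Boolean);
  const sentences = text.split(/[.!?]+/).filter((s) => s.trim().length > 0);
  const wordCount = words.length;
  const sentenceCount = Math.max(1, sentences.length);
  const syllables = words.reduce((sum, w) => sum + countSyllables(w), 0);

  const avgSentenceLength = wordCount / sentenceCount;
  const avgWordLength =
    words.reduce((sum, w) => sum + w.length, 0) / Math.max(1, wordCount);

  // Flesch Reading Ease: higher = easier to read (90+ is very easy, below 30 is very difficult).
  const readabilityScore =
    206.835 - 1.015 * avgSentenceLength - 84.6 * (syllables / Math.max(1, wordCount));

  return {
    wordCount,
    sentenceCount,
    avgSentenceLength,
    avgWordLength,
    readabilityScore,
    totalTimeMs: timeReceived.getTime() - timeSent.getTime(), // latency between start and end markers
  };
}
```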
Use Cases
- B2C E-commerce: Test and select the most suitable local LLM for powering responsive and easy-to-understand customer support chatbots or product description generators, enhancing customer experience while maintaining data privacy.
- B2B SaaS: Evaluate local LLMs for integrating cost-effective and performant AI features into your platform, such as automated report summarization, code generation assistance, or intelligent data analysis, without relying on external cloud APIs.
- Solopreneurs & Founders: Experiment with various local open-source LLMs for tasks like content creation, email drafting, or research, identifying models that offer the best balance of quality and speed on your own hardware.
- CTOs & Heads of Automation: Establish a standardized process for benchmarking local LLMs, ensuring that AI model selections are backed by empirical data on performance, readability, and resource efficiency for critical internal or client-facing automations.
Prerequisites
- An n8n instance (Cloud or self-hosted).
- LM Studio installed on a local or network-accessible machine.
- Desired LLMs downloaded and loaded into the LM Studio server.
- The IP address and port of the running LM Studio server (the workflow defaults to port 1234 and expects an OpenAI-compatible API endpoint at `/v1`); a quick connectivity check is sketched after this list.
- An OpenAI credential configured in n8n. This can be a dummy credential (e.g., API key set to 'none') if your LM Studio server doesn't require authentication, as the Langchain OpenAI node expects this field.
- (Optional) Google Cloud Platform project with Google Sheets API enabled and OAuth2 credentials configured in n8n if you wish to log results to Google Sheets.
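Before importing the workflow, it can help to confirm that LM Studio's OpenAI-compatible API is reachable. The following is an assumed, standalone TypeScript check (not part of the workflow); adjust the host and port to your server.

```typescript
// Standalone connectivity check (assumptions: LM Studio on localhost:1234, no API key required).
const LM_STUDIO_BASE_URL = "http://localhost:1234/v1";

async function listLoadedModels(): Promise<void> {
  const res = await fetch(`${LM_STUDIO_BASE_URL}/models`);
  if (!res.ok) {
    throw new Error(`LM Studio server not reachable: HTTP ${res.status}`);
  }
  const body = await res.json();
  // The OpenAI-compatible endpoint returns { data: [{ id: "<model name>", ... }] }.
  for (const model of body.data ?? []) {
    console.log(model.id);
  }
}

listLoadedModels().catch(console.error);
```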
Setup Instructions
- Download the n8n workflow JSON file.
- Import the workflow into your n8n instance.
- LM Studio Configuration:
  - Ensure LM Studio is installed and running.
  - Load the LLMs you want to test into LM Studio.
  - Start the LM Studio server (usually found in the "Local Server" tab). Note down its IP address and port (e.g., `http://localhost:1234` or `http://YOUR_LM_STUDIO_IP:1234`).
- Update LM Studio URLs in n8n:
  - In the 'Get Models' (HTTP Request) node: Set the `URL` parameter to `http://YOUR_LM_STUDIO_IP:PORT/v1/models` (replace `YOUR_LM_STUDIO_IP:PORT` with your actual server address and port).
  - In the 'Run Model with Dynamic Inputs' (LM Chat OpenAI) node: Go to the 'Options' tab and set the `Base URL` to `http://YOUR_LM_STUDIO_IP:PORT/v1`.
- Configure LLM Node Credentials: In the 'Run Model with Dynamic Inputs' node, select your configured OpenAI credential. If LM Studio doesn't need an API key, ensure your selected credential handles this (e.g., a dummy key like 'NA' or 'test').
- (Optional) Customize System Prompt: Modify the text in the 'Add System Prompt' (Set) node (parameter `system_prompt`) to define a consistent instruction for all tested LLMs.
- (Optional) Configure Google Sheets Logging:
  - Create a new Google Sheet.
  - Add headers in the first row: `Prompt`, `Time Sent`, `Time Received`, `Total Time Spent`, `Model`, `Response`, `Readability Score`, `Average Word Length`, `Word Count`, `Sentence Count`, `Average Sentence Length`.
  - In the 'Save Results to Google Sheets' node:
    - Select or create your Google Sheets OAuth2 credentials.
    - Enter your Google Sheet's `Document ID` (from its URL).
    - Select the correct `Sheet Name` (e.g., `Sheet1` or `gid=0`).
    - Verify the column mapping in the 'Columns' parameter matches your sheet headers.
- (Optional) Adjust LLM Parameters: Modify `Temperature`, `Top P`, and `Presence Penalty` in the 'Run Model with Dynamic Inputs' node's 'Options' tab as needed for your tests (an illustrative request showing these parameters is sketched after these steps).
- Trigger the Workflow: The workflow starts with the 'When chat message received' trigger. Open the chat interface for this workflow in n8n (usually accessible from the workflow editor by clicking the play button on the trigger or using the chat panel) and send a message. This message will be used as the initial prompt for the LLMs.
- Review the output of the 'Analyze LLM Response Metrics' node and, if configured, your Google Sheet for the comparison. Refer to the 'Sticky Note' in the workflow for guidance on interpreting readability scores.
- Activate the workflow for regular use. For distinct test runs, clear previous chat history if the chat trigger behavior carries it over (as suggested in the workflow's 'Pro Tip' sticky note).
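As a reference for the URL and parameter settings above, here is a hedged TypeScript sketch of the kind of request the 'Run Model with Dynamic Inputs' step issues for each model returned by 'Get Models'. The workflow itself uses the Langchain LM Chat OpenAI node rather than raw HTTP calls, so the base URL, dummy key, and parameter values below are assumptions for illustration only.

```typescript
// Illustrative only: one chat-completion call per model against LM Studio's OpenAI-compatible API.
// Assumptions: base URL, dummy API key 'NA', and the parameter values shown below.
const LM_STUDIO_BASE_URL = "http://localhost:1234/v1"; // replace with http://YOUR_LM_STUDIO_IP:PORT/v1

async function runPromptAgainstModel(modelId: string, systemPrompt: string, userPrompt: string) {
  const timeSent = new Date(); // start marker for the response-time metric

  const res = await fetch(`${LM_STUDIO_BASE_URL}/chat/completions`, {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: "Bearer NA", // LM Studio typically ignores the key, but OpenAI-style clients expect one
    },
    body: JSON.stringify({
      model: modelId,
      messages: [
        { role: "system", content: systemPrompt }, // from the 'Add System Prompt' node
        { role: "user", content: userPrompt }, // the chat message that triggered the workflow
      ],
      // Values mirroring the node's 'Options' tab (assumed example defaults).
      temperature: 0.7,
      top_p: 0.9,
      presence_penalty: 0,
    }),
  });

  const body = await res.json();
  const timeReceived = new Date(); // end marker for the response-time metric

  return {
    model: modelId,
    response: body.choices?.[0]?.message?.content ?? "",
    timeSent,
    timeReceived,
  };
}
```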
Want your own unique AI agent?
Talk to us - we know how to build custom AI agents for your specific needs.
Schedule a Consultation