AI Provider for llama.cpp


This plugin provides llama.cpp integration for the WordPress AI Client. It lets WordPress sites use large language models served by a local or remote llama.cpp server for text generation and other AI capabilities.

llama.cpp exposes an OpenAI-compatible API, and this provider uses that API to communicate with any GGUF model loaded into your llama.cpp server.
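As a rough illustration, the snippet below shows the kind of chat completion request the provider sends under the hood, using WordPress's wp_remote_post. The endpoint follows llama.cpp's OpenAI-compatible API; the model name and prompt are placeholders, and in practice you would call the AI Client rather than the endpoint directly.

  // Sketch only: a chat completion request against the llama.cpp server's
  // OpenAI-compatible endpoint. Model name and prompt are placeholders.
  $response = wp_remote_post( 'http://127.0.0.1:8080/v1/chat/completions', array(
      'headers' => array( 'Content-Type' => 'application/json' ),
      'body'    => wp_json_encode( array(
          'model'    => 'tinyllama-1.1b-chat', // whichever model your server has loaded
          'messages' => array(
              array( 'role' => 'user', 'content' => 'Write a one-sentence product description.' ),
          ),
      ) ),
      'timeout' => 60,
  ) );

  if ( ! is_wp_error( $response ) ) {
      $data = json_decode( wp_remote_retrieve_body( $response ), true );
      echo $data['choices'][0]['message']['content'];
  }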

Features:

  • Text generation with any llama.cpp-loaded model
  • Automatic model discovery from your llama.cpp server
  • Function calling support
  • Structured output (JSON mode) support (see the sketch after this list)
  • Settings page to configure the server URL (default: http://127.0.0.1:8080)
  • Works without an API key for local instances
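To give a sense of how structured output works, the sketch below asks the server for JSON through the OpenAI-style response_format field. It is illustrative only: the model name is a placeholder, support depends on your llama.cpp build, and the plugin normally handles this for you through the AI Client.

  // Illustrative sketch: requesting JSON output from llama.cpp's
  // OpenAI-compatible endpoint via response_format.
  $response = wp_remote_post( 'http://127.0.0.1:8080/v1/chat/completions', array(
      'headers' => array( 'Content-Type' => 'application/json' ),
      'body'    => wp_json_encode( array(
          'model'           => 'tinyllama-1.1b-chat', // placeholder model name
          'messages'        => array(
              array( 'role' => 'user', 'content' => 'Suggest a blog post title and a short excerpt as JSON.' ),
          ),
          'response_format' => array( 'type' => 'json_object' ),
      ) ),
      'timeout' => 60,
  ) );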

Requirements:

  • PHP 7.4 or higher
  • WordPress AI Client plugin must be installed and activated
  • llama.cpp server running locally or on a remote host

Getting Started

What do I need before using this plugin?

You need a running llama.cpp server (local or remote) and the WordPress AI Client plugin installed and activated.

How do I install llama.cpp?

On macOS, run brew install llama.cpp. For other platforms, see the official documentation at https://llama-cpp.com/download/

Where do I get a model?

Download a GGUF model from Hugging Face. TinyLlama 1.1B (Q4_K_M, ~636 MB) is a good starting point for testing.

How do I start the server?

Run llama-server --models-dir ~/models. The server starts on http://127.0.0.1:8080 by default.
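To confirm the server is reachable from your WordPress host, you can query its /health endpoint, which recent llama.cpp server builds expose. This is just a quick check, assuming the default address:

  // Quick reachability check against the llama.cpp server (default address assumed).
  $response = wp_remote_get( 'http://127.0.0.1:8080/health', array( 'timeout' => 5 ) );

  if ( is_wp_error( $response ) ) {
      error_log( 'llama.cpp server not reachable: ' . $response->get_error_message() );
  } elseif ( 200 === wp_remote_retrieve_response_code( $response ) ) {
      error_log( 'llama.cpp server is up.' );
  }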

How do I connect WordPress to my llama.cpp server?

Go to Settings > llama.cpp and enter your server URL. Leave it blank to use the default (http://127.0.0.1:8080).
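If you want to preview which models the provider will discover, you can query the server's OpenAI-style /v1/models endpoint yourself. The sketch below assumes the default URL and is not the plugin's own discovery code:

  // Sketch: list the models the server reports via /v1/models.
  $response = wp_remote_get( 'http://127.0.0.1:8080/v1/models', array( 'timeout' => 10 ) );

  if ( ! is_wp_error( $response ) ) {
      $data = json_decode( wp_remote_retrieve_body( $response ), true );
      foreach ( (array) ( $data['data'] ?? array() ) as $model ) {
          echo $model['id'] . "\n";
      }
  }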

For full step-by-step setup instructions, see the Getting Started Guide.