
This plugin provides llama.cpp integration for the WordPress AI Client. It lets WordPress sites use large language models served by llama.cpp for text generation and other AI capabilities.
llama.cpp exposes an OpenAI-compatible API, and this provider uses that API to communicate with any GGUF model loaded into your llama.cpp server.
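To sketch what that compatibility looks like in practice, the snippet below builds the kind of chat-completion request this provider sends. The /v1/chat/completions path is part of the OpenAI-compatible API that llama-server exposes; the prompt and model name here are placeholders, and this is an illustration, not the plugin's actual code.

```python
import json
import urllib.request

def build_chat_request(base_url, prompt, model="default"):
    """Build an OpenAI-style chat-completion request for a llama.cpp server."""
    url = base_url.rstrip("/") + "/v1/chat/completions"
    payload = {
        "model": model,  # llama-server answers for whichever GGUF model it loaded
        "messages": [{"role": "user", "content": prompt}],
    }
    data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )

# Build (but do not send) a request against the default local server.
req = build_chat_request("http://127.0.0.1:8080", "Hello from WordPress")
print(req.full_url)  # http://127.0.0.1:8080/v1/chat/completions
```

With a server running, the request could be sent with urllib.request.urlopen(req) and the JSON response parsed for the model's reply.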
Requirements:
You need a running llama.cpp server (local or remote) and the WordPress AI Client plugin installed and activated.
On macOS, run brew install llama.cpp. For other platforms, see the official docs at https://llama-cpp.com/download/
Download a GGUF model from Hugging Face. TinyLlama 1.1B (Q4_K_M, ~636 MB) is a good starting point for testing.
Run llama-server --models-dir ~/models. The server starts on http://127.0.0.1:8080 by default.
Go to Settings > llama.cpp and enter your server URL. Leave it blank to use the default (http://127.0.0.1:8080).
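The blank-means-default behavior described above can be sketched as follows. The function name and the trailing-slash normalization are illustrative assumptions, not the plugin's actual implementation; only the default URL comes from the text.

```python
# Default server URL from the settings screen above.
DEFAULT_SERVER_URL = "http://127.0.0.1:8080"

def resolve_server_url(setting):
    """Return the llama.cpp server URL to use, falling back to the default.

    An empty or whitespace-only setting means "use the default"; a trailing
    slash is stripped so endpoint paths can be appended cleanly.
    """
    setting = (setting or "").strip()
    return (setting or DEFAULT_SERVER_URL).rstrip("/")

print(resolve_server_url(""))                      # http://127.0.0.1:8080
print(resolve_server_url("http://gpu-box:8080/"))  # http://gpu-box:8080
```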
For full step-by-step setup instructions, see the Getting Started Guide.