pal-mcp-server
The power of Claude Code / GeminiCLI / CodexCLI + [Gemini / OpenAI / OpenRouter / Azure / Grok / Ollama / Custom Model / All Of The Above] working as one.
A Python MCP server that lets a single AI coding tool, such as Claude Code or Gemini CLI, call multiple AI models (Gemini, OpenAI, Grok, Ollama, and others) within the same workflow.
PAL MCP Server is a Python project that acts as a bridge between AI developer tools and multiple AI model providers. If you already use a command-line tool like Claude Code, Gemini CLI, or Codex CLI to help you write software, PAL lets that single tool call on several different AI models within the same workflow, rather than being locked to one.
The core idea is that different AI models have different strengths. One might handle very large amounts of code better, another might reason through problems more carefully, and a local model might let you work without sending data anywhere. PAL connects your chosen CLI to all of them at once, passing context back and forth so each model sees what the others said. You can ask for a code review from two different models and get a combined report, or have one model plan the work and another carry it out.
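The context-passing idea can be illustrated with a minimal sketch. This is not PAL's actual implementation: `SharedThread`, `fake_reviewer`, and `fake_second_opinion` are hypothetical names, and each "model" is a plain Python function standing in for a real provider call. The sketch only shows how a shared transcript lets a later model see an earlier model's reply.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical stand-in: a "model" is any function from a prompt to a reply.
Model = Callable[[str], str]


@dataclass
class SharedThread:
    """Accumulates every turn so later models see what earlier ones said."""

    turns: List[str] = field(default_factory=list)

    def ask(self, name: str, model: Model, request: str) -> str:
        # The prompt is the full transcript so far plus the new request.
        prompt = "\n".join(self.turns + [f"user: {request}"])
        reply = model(prompt)
        # Record both sides of the exchange for whoever is asked next.
        self.turns.append(f"user: {request}")
        self.turns.append(f"{name}: {reply}")
        return reply


def fake_reviewer(prompt: str) -> str:
    # Stand-in for a first model's code review.
    return "possible off-by-one in the loop"


def fake_second_opinion(prompt: str) -> str:
    # Stand-in for a second model; it reports whether the first review is visible.
    if "reviewer_a:" in prompt:
        return "agree with reviewer_a"
    return "no prior review visible"


thread = SharedThread()
first = thread.ask("reviewer_a", fake_reviewer, "Review parse_rows()")
second = thread.ask("reviewer_b", fake_second_opinion, "Do you agree?")
print(second)
```

Keeping the transcript in one place, rather than inside each model wrapper, is what allows a combined report: any model asked later receives everything said before it.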
A tool called clink goes a step further by letting one AI CLI launch another CLI as a subprocess. For example, Claude Code can spawn a Codex subagent to investigate something in isolation, receive a summary, and continue without filling its own context window with the details of that side investigation. This is useful for large tasks where you want to keep the main session uncluttered.
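The subprocess pattern described here can be sketched in a few lines of Python. This is an illustration under stated assumptions, not how clink itself works: `run_isolated` is a hypothetical helper, the child process is a stand-in Python one-liner rather than a real AI CLI, and the convention that the last output line is the summary is invented for the example.

```python
import subprocess
import sys
from typing import List


def run_isolated(command: List[str], max_summary_chars: int = 200) -> str:
    """Run a child process and return only its last non-empty output line."""
    result = subprocess.run(
        command,
        capture_output=True,  # keep the child's output out of the parent's stream
        text=True,
        timeout=60,
        check=True,  # raise CalledProcessError if the child exits non-zero
    )
    lines = [line for line in result.stdout.splitlines() if line.strip()]
    # Assumed convention: the child prints its summary last.
    summary = lines[-1] if lines else ""
    return summary[:max_summary_chars]


# Stand-in child: prints intermediate detail, then a one-line summary.
child = [
    sys.executable,
    "-c",
    "print('step 1: reading files');"
    "print('step 2: tracing calls');"
    "print('SUMMARY: bug is in parse_rows')",
]

summary = run_isolated(child)
print(summary)
```

The parent only ever holds the short summary string; the intermediate lines are discarded with the child process, which is the point of running the investigation in isolation.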
PAL runs as an MCP (Model Context Protocol) server; MCP is a standard protocol through which AI tools call external capabilities. Installation involves Python, a configuration file, and API keys for whichever model providers you want to use. Once it is running, your AI CLI of choice can route requests to Gemini, OpenAI, Grok, Azure, Ollama, or others without you having to switch tools.
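Because only the providers you supply keys for are usable, a setup check along the following lines can be helpful. This is a sketch, not part of PAL: `configured_providers` is a hypothetical helper, and the environment variable names in `PROVIDER_ENV_VARS` are assumptions made for illustration, so consult the project's own configuration documentation for the real names.

```python
import os
from typing import Dict, List, Mapping

# Assumed variable names, for illustration only.
PROVIDER_ENV_VARS: Dict[str, str] = {
    "gemini": "GEMINI_API_KEY",
    "openai": "OPENAI_API_KEY",
    "grok": "XAI_API_KEY",
}


def configured_providers(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the providers whose API key variable is set and non-blank."""
    return [
        name
        for name, var in PROVIDER_ENV_VARS.items()
        if env.get(var, "").strip()
    ]


if __name__ == "__main__":
    available = configured_providers()
    print("Configured providers:", ", ".join(available) or "none")
```

Passing the environment in as a mapping keeps the function easy to test with a plain dictionary instead of the real process environment.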
The project positions itself as coordination glue rather than a replacement for any particular AI tool. You stay in control of the workflow; PAL just makes it practical to bring in the right model for each part of a larger task.
Where it fits
- Let Claude Code call Gemini or GPT-4 for a second opinion on the same code without switching tools.
- Use the clink tool to have Claude Code spawn a Codex subagent for an isolated investigation and get back only the summary.
- Route different parts of a large coding task to the model best suited for each (one for planning, another for execution), all from one CLI.