235 Gemini Cli Tool Architecture
1. Overview
The Gemini CLI’s tool-use functionality is built on a robust and extensible architecture that separates tool definition, registration, and execution. The system is designed around a central ToolRegistry that manages all available tools, whether they are built-in “core” tools, dynamically discovered from the local project, or provided by remote Model Context Protocol (MCP) servers.
2. Core Components
The architecture is primarily composed of three key TypeScript files within the @google/gemini-cli-core package:
- tools.ts: Defines the foundational interfaces and abstract classes, such as DeclarativeTool (the base class for all tools), ToolInvocation (a single, validated tool call), and ToolBuilder.
- tool-registry.ts: Implements the ToolRegistry class, which acts as a central repository for all tool definitions. It handles both the programmatic registration of core tools and the dynamic discovery of custom tools.
- config.ts: Implements the Config class, which acts as a dependency injector. It is responsible for instantiating and holding the single ToolRegistry instance, which is then provided to other parts of the application.
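To make the division of responsibilities concrete, here is a minimal sketch of how the three pieces could relate. It is not the actual source: the class names, build(), execute(), and createToolRegistry() come from the description above, while registerTool(), getTool(), getToolRegistry(), the Params type, and EchoTool are assumptions introduced for illustration. execute() is synchronous here only to keep the sketch short.

```typescript
// Simplified, hypothetical shapes. The real definitions in tools.ts,
// tool-registry.ts, and config.ts are richer than this sketch.
type Params = Record<string, unknown>;

// A single, validated tool call that is ready to run.
interface ToolInvocation {
  execute(): string;
}

// Base class for all tools: carries metadata and builds validated invocations.
abstract class DeclarativeTool {
  readonly name: string;
  readonly description: string;

  constructor(name: string, description: string) {
    this.name = name;
    this.description = description;
  }

  // Validates raw arguments and returns an invocation, or throws.
  abstract build(params: Params): ToolInvocation;
}

// Central repository for tool definitions, keyed by tool name.
class ToolRegistry {
  private readonly tools = new Map<string, DeclarativeTool>();

  // registerTool() and getTool() are assumed method names.
  registerTool(tool: DeclarativeTool): void {
    this.tools.set(tool.name, tool);
  }

  getTool(name: string): DeclarativeTool | undefined {
    return this.tools.get(name);
  }
}

// Dependency injector: creates and holds the single registry instance.
class Config {
  private registry?: ToolRegistry;

  createToolRegistry(): ToolRegistry {
    this.registry = new ToolRegistry();
    return this.registry;
  }

  // getToolRegistry() is an assumed accessor name.
  getToolRegistry(): ToolRegistry {
    if (!this.registry) throw new Error("ToolRegistry not created yet");
    return this.registry;
  }
}

// Hypothetical tool used only to exercise the sketch.
class EchoTool extends DeclarativeTool {
  constructor() {
    super("echo", "Returns its 'text' argument unchanged.");
  }

  build(params: Params): ToolInvocation {
    const text = params["text"];
    if (typeof text !== "string") throw new Error("'text' must be a string");
    return { execute: () => text };
  }
}

const config = new Config();
config.createToolRegistry().registerTool(new EchoTool());
const echoed = config.getToolRegistry().getTool("echo")?.build({ text: "hi" }).execute();
```

Because other parts of the application only ever ask Config for the registry, they never need to know which concrete tool classes exist.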
3. The Tool Lifecycle
The process of using a tool from user prompt to final output follows a clear, multi-step lifecycle:
- Initialization & Registration:
  - On application startup, a central Config object is created.
  - The Config object’s createToolRegistry() method is called.
  - This method instantiates the ToolRegistry.
  - It then programmatically registers all built-in core tools (e.g., ReadFileTool, ShellTool, EditTool).
  - Finally, it calls discoverAllTools() on the registry to find and register any project-specific tools via a shell command or remote MCP servers.
- Schema Generation:
  - The GeminiClient (in core/client.ts), which manages the chat session, retrieves the populated ToolRegistry from the Config object.
  - It calls toolRegistry.getFunctionDeclarations() to get an array of all tool schemas in the format required by the Gemini API.
- API Request:
  - The user’s prompt, the conversation history, and the array of tool schemas are sent to the Gemini API.
- Tool Invocation:
  - The model processes the request and, if it decides to use a tool, responds with a functionCall containing the tool’s name and arguments.
  - The application layer (e.g., the UI’s useToolScheduler hook) receives this functionCall.
- Execution:
  - The application layer uses the Config object to get the ToolRegistry.
  - It looks up the requested tool by name in the registry to get the appropriate DeclarativeTool instance.
  - It calls the tool’s build() method with the arguments provided by the model. This step validates the parameters and returns a ToolInvocation object.
  - It calls the execute() method on the ToolInvocation object, which runs the actual tool logic.
- Response and Completion:
  - The output from the tool’s execution is formatted into a functionResponse part.
  - This functionResponse is sent back to the Gemini API in the next turn of the conversation.
  - The model uses the tool’s output to formulate its final, user-facing response.
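The lifecycle above can be traced end to end in a small, self-contained sketch. It is illustrative only: Registry, Tool, wordCountTool, fakeModel, and runToolCall are hypothetical stand-ins, the model is replaced by a stub so that no network call is made, and execute() is synchronous for brevity. Only build(), execute(), getFunctionDeclarations(), and the functionCall / functionResponse shapes follow the description.

```typescript
type Args = Record<string, unknown>;

interface FunctionDeclaration {
  name: string;
  description: string;
  parameters: {
    type: string;
    properties: Record<string, { type: string }>;
    required: string[];
  };
}

interface FunctionCall {
  name: string;
  args: Args;
}

interface FunctionResponsePart {
  functionResponse: { name: string; response: { output: string } };
}

interface Invocation {
  execute(): string;
}

interface Tool {
  schema: FunctionDeclaration;
  build(args: Args): Invocation;
}

// Hypothetical tool: counts whitespace-separated words.
const wordCountTool: Tool = {
  schema: {
    name: "word_count",
    description: "Counts whitespace-separated words in a string.",
    parameters: {
      type: "object",
      properties: { text: { type: "string" } },
      required: ["text"],
    },
  },
  build(args) {
    const text = args["text"];
    if (typeof text !== "string") throw new Error("'text' must be a string");
    return {
      execute: () => String(text.trim().split(/\s+/).filter(Boolean).length),
    };
  },
};

class Registry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.schema.name, tool);
  }

  getTool(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  getFunctionDeclarations(): FunctionDeclaration[] {
    return Array.from(this.tools.values(), (tool) => tool.schema);
  }
}

// Steps 3 and 4: a stub standing in for the Gemini API. It always asks for
// word_count and passes the prompt through as the argument.
function fakeModel(prompt: string, tools: FunctionDeclaration[]): FunctionCall {
  if (tools.length === 0) throw new Error("no tools declared");
  return { name: "word_count", args: { text: prompt } };
}

// Steps 5 and 6: look up, build (validate), execute, wrap the output.
function runToolCall(reg: Registry, call: FunctionCall): FunctionResponsePart {
  const tool = reg.getTool(call.name);
  if (!tool) throw new Error(`Unknown tool: ${call.name}`);
  const invocation = tool.build(call.args);
  const output = invocation.execute();
  return { functionResponse: { name: call.name, response: { output } } };
}

// Step 1: initialization and registration.
const registry = new Registry();
registry.register(wordCountTool);

// Step 2: schema generation.
const declarations = registry.getFunctionDeclarations();

const call = fakeModel("count these four words", declarations);
const part = runToolCall(registry, call);
```

In a real session, part would be appended to the conversation and sent back to the model on the next turn; here it simply ends up holding the tool output for inspection.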
This architecture effectively decouples the chat logic from the tool implementation, allowing for a flexible system where tools can be added, removed, or discovered without altering the core conversational flow.
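One way to picture this decoupling is a dispatch function that depends only on the registry, so that built-in and discovered tools are handled identically. The sketch below is hypothetical throughout: Registry, dispatch, registerDiscovered, the runner callback, and the JSON shape of the discovery output are assumptions for illustration, not the Gemini CLI's actual discovery protocol.

```typescript
type Args = Record<string, unknown>;

interface Tool {
  name: string;
  run(args: Args): string;
}

class Registry {
  private readonly tools = new Map<string, Tool>();

  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  unregister(name: string): boolean {
    return this.tools.delete(name);
  }

  get(name: string): Tool | undefined {
    return this.tools.get(name);
  }

  names(): string[] {
    return Array.from(this.tools.keys()).sort();
  }
}

// The conversational side sees only the registry, never concrete tools.
function dispatch(registry: Registry, name: string, args: Args): string {
  const tool = registry.get(name);
  return tool ? tool.run(args) : `error: unknown tool "${name}"`;
}

// Assumed discovery output: a JSON array of { name } objects, such as a
// project-specific discovery command might print. The runner callback stands
// in for whatever actually executes a discovered tool.
function registerDiscovered(
  registry: Registry,
  discoveryJson: string,
  runner: (name: string, args: Args) => string,
): number {
  const parsed: unknown = JSON.parse(discoveryJson);
  if (!Array.isArray(parsed)) throw new Error("expected a JSON array");
  let count = 0;
  for (const entry of parsed) {
    const name = (entry as { name?: unknown } | null)?.name;
    if (typeof name !== "string") continue; // skip malformed entries
    registry.register({ name, run: (args) => runner(name, args) });
    count += 1;
  }
  return count;
}

const registry = new Registry();

// A built-in tool, registered programmatically.
registry.register({ name: "read_file", run: () => "(file contents)" });

// A discovered tool, registered from data; dispatch() is unchanged.
const added = registerDiscovered(
  registry,
  '[{"name":"lint_project"},{"bad":true}]',
  (name, args) => `${name} ran with ${JSON.stringify(args)}`,
);
const result = dispatch(registry, "lint_project", { fix: true });
```

Adding or removing a tool changes only what the registry holds, which is the property the paragraph above describes.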