agents-cli
Scaffold, develop, evaluate, and deploy AI agents with Google ADK. Bundles 7 skills covering the full agent development lifecycle.
google-agents-cli-adk-code
ADK API quick reference — agent types, tools, callbacks, orchestration, and state management.
# ADK Cheatsheet
> **Before using this skill**, activate `/google-agents-cli-workflow` first — it contains the required development phases and scaffolding steps.
## Prerequisites
1. Run `agents-cli info` — if it shows project config, skip to the cheatsheet below
2. If no project exists: run `agents-cli scaffold create <name>`
3. If user has existing code: run `agents-cli scaffold enhance .`
Do NOT write agent code until a project is scaffolded.
> **Python only for now.** This cheatsheet currently covers the Python ADK SDK.
> Support for other languages is coming soon.
## Quick Reference — Most Common Patterns
### Agent Creation
```python
from google.adk.agents import Agent

root_agent = Agent(
    name="my_agent",
    model="gemini-flash-latest",
    instruction="You are a helpful assistant that ...",
    tools=[my_tool],  # my_tool: a tool function or FunctionTool defined elsewhere
)
```
> **NEVER change an existing agent's `model=` value unless the user explicitly asks.** If a Gemini model returns a 404, it's almost always a `GOOGLE_CLOUD_LOCATION` issue — run the listing command to verify availability before changing anything. For model docs, fetch `https://adk.dev/agents/models/google-gemini/index.md`.
> ```bash
> uv run --with google-genai python -c "
> from google import genai
> client = genai.Client(vertexai=True, location='global')
> for m in client.models.list(): print(m.name)
> "
> ```
### Basic Tool
```python
from google.adk.tools import FunctionTool


def get_weather(city: str) -> dict:
    """Get current weather for a city."""
    # Static placeholder data; a real tool would call a weather API here.
    return {"city": city, "temp": "22°C", "condition": "sunny"}


weather_tool = FunctionTool(func=get_weather)
```
> **ADK built-in tool imports:** Import the tool instance, not the module.
> ```python
> from google.adk.tools.load_web_page import load_web_page # CORRECT
> from google.adk.tools import load_web_page # WRONG
> ```
### Simple Callback
```python
from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext


async def initialize_state(callback_context: CallbackContext) -> None:
    """Seed session state before the agent's first turn."""
    state = callback_context.state
    if "history" not in state:
        state["history"] = []


root_agent = Agent(
    name="my_agent",
    model="gemini-flash-latest",
    instruction="...",
    before_agent_callback=initialize_state,
)
```
---
## Reference Files
| File | When to read |
|------|-------------|
| `references/adk-python.md` | ADK Python API — agents, tools, callbacks, plugins, state, artifacts, context caching, session rewind |
| `references/adk-2.0.md` | **ADK 2.0 Workflow API (experimental pre-GA, opt-in)** — graph-based workflows, nodes, edges, HITL, parallel processing |
### ADK 2.0 Workflows — Opt-in
> **The Workflow API is experimental, pre-GA (ADK 2.0).** Do NOT recommend it by default.
>
> **Before suggesting workflow patterns**, explain the following to the user and ask if they want to proceed:
>
> 1. **What it is**: ADK 2.0 introduces a graph-based Workflow API — nodes (functions, LLM agents, tools) connected by edges with conditional routing, fan-out/fan-in parallelism, and human-in-the-loop interrupts.
> 2. **When it helps**: Complex multi-step pipelines needing deterministic control flow, parallel processing of list items, structured approval gates, or retry logic — cases where SequentialAgent/ParallelAgent/LoopAgent feel limiting.
> 3. **Risks**: Pre-GA — APIs may change before GA. Requires `google-adk >= 2.0.0` and **Python >= 3.11**. Incompatible with Live Streaming. Scaffolded projects need `pyproject.toml` changes before upgrade — see the reference file for step-by-step instructions.
>
> **Only read `references/adk-2.0.md` after the user explicitly opts in.** If they decline or are unsure, use the standard ADK 1.x orchestration patterns from `references/adk-python.md` (SequentialAgent, ParallelAgent, LoopAgent, BaseAgent).
## ADK Documentation
For the ADK docs index (titles and URLs for fetching documentation pages), use `curl https://adk.dev/llms.txt`.
## Related Skills
- `/google-agents-cli-workflow` — Development workflow, coding guidelines, and operational rules
- `/google-agents-cli-scaffold` — Project creation and enhancement with `agents-cli scaffold create` / `scaffold enhance`
- `/google-agents-cli-eval` — Evaluation methodology, evalset schema, and the eval-fix loop
- `/google-agents-cli-deploy` — Deployment targets, CI/CD pipelines, and production workflows
google-agents-cli-deploy
Deploy ADK agents to Agent Runtime, Cloud Run, or GKE with CI/CD, secrets, and rollback.
# ADK Deployment Guide
> **Requires:** `agents-cli` (`uv tool install google-agents-cli`) — [install uv](https://docs.astral.sh/uv/getting-started/installation/index.md) first if needed.
> Prefer using the `agents-cli` commands throughout this guide — they wrap Terraform, Docker, and deployment into a tested pipeline. If your project isn't scaffolded yet, see `/google-agents-cli-scaffold` to add deployment support first.
### Reference Files
For deeper details, consult these reference files in `references/`:
- **`cloud-run.md`** — Scaling defaults, Dockerfile, session types, networking
- **`agent-runtime.md`** — deploy.py CLI, AdkApp pattern, Terraform resource, deployment metadata, CI/CD differences
- **`gke.md`** — GKE Autopilot cluster, Kubernetes manifests, Workload Identity, session types, networking
- **`terraform-patterns.md`** — Custom infrastructure, IAM, state management, importing resources
- **`batch-inference.md`** — BigQuery Remote Function trigger; for Pub/Sub / Eventarc see `/google-agents-cli-adk-code`
- **`cicd-pipeline.md`** — Full CI/CD pipeline setup, `infra cicd` flags, runner comparison, WIF auth, pipeline stages
- **`testing-deployed-agents.md`** — Testing instructions per deployment target, curl examples, load tests
> **Observability:** See the `/google-agents-cli-observability` skill for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.
---
## Deployment Target Decision Matrix
Choose the right deployment target based on your requirements:
| Criteria | Agent Runtime | Cloud Run | GKE |
|----------|-------------|-----------|-----|
| **Languages** | Python | Python | Python (+ others via custom containers) |
| **Scaling** | Managed auto-scaling (configurable min/max, concurrency) | Fully configurable (min/max instances, concurrency, CPU allocation) | Full Kubernetes scaling (HPA, VPA, node auto-provisioning) |
| **Networking** | VPC-SC and PSC supported | Full VPC support, direct VPC egress, IAP, ingress rules | Full Kubernetes networking |
| **Session state** | Native `VertexAiSessionService` (persistent, managed) | In-memory (dev), Cloud SQL, or Agent Platform Sessions backend | In-memory (dev), Cloud SQL, or Agent Platform Sessions backend |
| **Batch/event processing** | Not supported | Native trigger endpoints (Pub/Sub, Eventarc); see `/google-agents-cli-adk-code` | Custom (Kubernetes Jobs, Pub/Sub) |
| **Cost model** | vCPU-hours + memory-hours (not billed when idle) | Per-instance-second + min instance costs | Node pool costs (always-on or auto-provisioned) |
| **Setup complexity** | Lower (managed, purpose-built for agents) | Medium (Dockerfile, Terraform, networking) | Higher (Kubernetes expertise required) |
| **Best for** | Managed infrastructure, minimal ops | Custom infra, event-driven workloads | Full Kubernetes control |
**Ask the user** which deployment target fits their needs. Each is a valid production choice with different trade-offs.
> **Product name mapping:** "Agent Engine" / "Vertex AI Agent Engine" is now **Agent Runtime**. Use `--deployment-target agent_runtime`.
> **Ambient / scheduled / event-driven agents:** Agent Runtime does not support Pub/Sub, Eventarc, or Cloud Scheduler triggers. Use **Cloud Run** (recommended) or **GKE** for these workloads. See `/google-agents-cli-adk-code` Section 12 for the `trigger_sources` pattern.
> **OAuth / user consent agents:** Use **Agent Runtime** with Gemini Enterprise for agents that need OAuth 2.0 user consent (e.g., accessing Google Drive, Calendar, or other user-scoped APIs). Cloud Run does not currently support managed OAuth flows. See the `adk-ae-oauth` sample in `/google-agents-cli-workflow` Phase 2.
---
## Deploying to Dev
### Deploy Workflow
**Task tracking:** Deployment involves multiple sequential steps (infra setup, CI/CD configuration, deploy, verification). Use a task list to track progress through these steps — skipping one often causes failures in later steps that are hard to trace back.
1. If prototype (no deployment target), first enhance: `agents-cli scaffold enhance . --deployment-target <target>`
2. **Notify the human**: "Eval scores meet thresholds and tests pass. Ready to deploy to dev?"
3. **Wait for explicit approval**
4. Once approved: `agents-cli deploy`
> **Agent Runtime timeout recovery:** Agent Runtime deploys can take 5-10 minutes and may exceed command timeouts. If the deploy command is cancelled or times out, the deployment continues server-side. Run `agents-cli deploy --status` to check progress — poll every 60 seconds until it reports completion or failure.
**IMPORTANT**: Never run `agents-cli deploy` without explicit human approval.
> **Do NOT run `agents-cli infra single-project` before deploying.** It is not a prerequisite — `agents-cli deploy` works on its own. Run it separately if the user needs observability features (prompt-response logging, BigQuery analytics) — see `/google-agents-cli-observability`.
### Single-Project Infrastructure Setup (Optional — Advanced)
`agents-cli infra single-project` runs `terraform apply` in `deployment/terraform/single-project/`. Use this to **provision single-project GCP infrastructure without CI/CD** (service accounts, IAM bindings, telemetry resources, Artifact Registry). It is also useful for testing a setup in a single project before going to production. It is NOT required for deploying.
```bash
# Optional — provision infrastructure in a single GCP project
agents-cli infra single-project
```
> **Note:** `agents-cli deploy` doesn't automatically use the Terraform-created `app_sa`. Pass the service account via `agents-cli deploy --service-account SA_EMAIL` or `uv run -m app.app_utils.deploy --service-account SA_EMAIL` for Agent Runtime targets.
### Deploy Flag Reference
| Flag | Description | Targets |
|------|-------------|---------|
| `--project` | GCP project ID | All |
| `--region` | GCP region | All |
| `--service-account` | Service account email for the deployed agent | All |
| `--secrets` | Comma-separated `ENV=SECRET` or `ENV=SECRET:VERSION` pairs | Agent Runtime |
| `--update-env-vars` | Comma-separated `KEY=VALUE` environment variables | Agent Runtime, Cloud Run |
| `--agent-identity` | Enable [agent identity](https://docs.cloud.google.com/gemini-enterprise-agent-platform/scale/runtime/agent-identity) (Preview) | Agent Runtime |
| `--memory` | Memory limit (default: `4Gi`) | Cloud Run |
| `--port` | Container port | Cloud Run |
| `--iap` | Enable Identity-Aware Proxy | Cloud Run |
| `--image` | Container image URI (skips source build) | Cloud Run, GKE |
| `--no-wait` | Start deployment and return immediately | Agent Runtime, Cloud Run |
| `--status` | Check the status of a pending `--no-wait` deployment | Agent Runtime, Cloud Run |
| `--list` | List existing deployments and exit | All |
| `--dry-run` / `-n` | Print what would be executed without running it | All |
| `--no-confirm-project` | Skip project confirmation prompt | All |
Run `agents-cli deploy --help` for the full flag reference.
> **Advanced Cloud Run Deploys:** If you need features not exposed via `agents-cli` flags, use `--dry-run` (or `-n`) to print the full `gcloud` command, copy it, and add additional arguments as needed.
> **Project Confirmation:** If the project is resolved automatically (not passed via `--project`), the command will prompt for confirmation in interactive mode. Since agents typically run in non-interactive mode, you MUST pass `--no-confirm-project` to proceed if you are relying on automatic project resolution.
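
The flag rules above can be sketched as a small helper for scripts that assemble the command themselves. This is a minimal sketch, not part of `agents-cli`: the function name `build_deploy_command` is hypothetical, and it assumes flags take their value as a separate argument (as in `--service-account SA_EMAIL`). It only uses flags listed in the table above, and adds `--no-confirm-project` whenever no project is passed explicitly.

```python
from typing import List, Optional


def build_deploy_command(
    project: Optional[str] = None,
    region: Optional[str] = None,
    service_account: Optional[str] = None,
    dry_run: bool = False,
) -> List[str]:
    """Assemble an `agents-cli deploy` argument list for non-interactive use.

    Hypothetical helper: only flags from the Deploy Flag Reference are used.
    """
    cmd = ["agents-cli", "deploy"]
    if project:
        cmd += ["--project", project]
    else:
        # The project will be resolved automatically, so skip the prompt
        # that would otherwise block a non-interactive run.
        cmd.append("--no-confirm-project")
    if region:
        cmd += ["--region", region]
    if service_account:
        cmd += ["--service-account", service_account]
    if dry_run:
        cmd.append("--dry-run")
    return cmd


print(build_deploy_command(region="us-central1", dry_run=True))
```

Passing the resulting list to `subprocess.run` avoids shell quoting issues; remember that an actual deploy still needs explicit human approval first.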
---
## Production Deployment — CI/CD Pipeline
For the full CI/CD pipeline setup guide — prerequisites, `infra cicd` flags, runner comparison, WIF authentication, pipeline stages, and production approval — see `references/cicd-pipeline.md`.
---
## Cloud Run Specifics
For detailed infrastructure configuration (scaling defaults, Dockerfile, FastAPI endpoints, session types, networking), see `references/cloud-run.md`. For ADK docs on Cloud Run deployment, fetch `https://adk.dev/deploy/cloud-run/index.md`.
For event-driven / ambient agent deployment on Cloud Run, see the [`ambient-expense-agent`](https://github.com/google/adk-samples/tree/main/python/agents/ambient-expense-agent) sample and `/google-agents-cli-adk-code` for the `trigger_sources` pattern.
---
## Agent Runtime Specifics
Agent Runtime is a managed Vertex AI service for deploying Python ADK agents. It uses source-based deployment (no Dockerfile) via `deploy.py` and the `AdkApp` class.
> **No `gcloud` CLI exists for Agent Runtime.** Deploy via `agents-cli deploy` or `deploy.py`. Query via the Python `vertexai.Client` SDK.
Deployments can take 5-10 minutes. Use `--no-wait` to start a deployment and return immediately, then check on it later with `--status`:
```bash
# Start deployment without blocking
agents-cli deploy --no-wait
# Check on progress later
agents-cli deploy --status
```
When `--status` detects the operation has completed, it writes `deployment_metadata.json` and prints the same success output as a normal deploy.
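
For scripted use, the start-then-check pattern can be wrapped in a polling loop. The following is a rough sketch under stated assumptions: `poll_deploy_status` and the state strings `"pending"` and `"succeeded"` are hypothetical, because the output format of `agents-cli deploy --status` is not specified in this guide. In practice `check_status` would wrap a call to `agents-cli deploy --status` and translate its result into such a state.

```python
import time
from typing import Callable


def poll_deploy_status(
    check_status: Callable[[], str],
    interval_seconds: float = 60.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call check_status until it returns something other than 'pending'.

    The state names are assumptions made for this sketch, not CLI output.
    """
    for attempt in range(1, max_attempts + 1):
        state = check_status()
        if state != "pending":
            return state
        if attempt < max_attempts:
            # Wait between checks; 60 seconds matches the polling advice
            # given for Agent Runtime deployments.
            sleep(interval_seconds)
    return "timed_out"


# Demo with a fake checker so the sketch runs without agents-cli installed.
fake_states = iter(["pending", "pending", "succeeded"])
result = poll_deploy_status(lambda: next(fake_states), sleep=lambda _seconds: None)
print(result)
```

Injecting `sleep` keeps the loop testable; the default of 30 attempts at 60 seconds comfortably covers the 5-10 minute deploy time mentioned above.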
For detailed infrastructure configuration (deploy.py flags, AdkApp pattern, Terraform resource, deployment metadata, session/artifact services, CI/CD differences), see `references/agent-runtime.md`. For ADK docs on Agent Runtime deployment, fetch `https://adk.dev/deploy/agent-runtime/index.md`.
---
## GKE Specifics
For detailed infrastructure configuration (Kubernetes manifests, Terraform resources, Workload Identity, session types, networking), see `references/gke.md`. For ADK docs on GKE deployment, fetch `https://adk.dev/deploy/gke/index.md`.
---
## Service Account Architecture
Scaffolded projects use two service accounts:
- **`app_sa`** (per environment) — Runtime identity for the deployed agent. Roles defined in `deployment/terraform/iam.tf`.
- **`cicd_runner_sa`** (CI/CD project) — CI/CD pipeline identity (GitHub Actions / Cloud Build). Lives in the CI/CD project (defaults to prod project), needs permissions in **both** staging and prod projects.
Check `deployment/terraform/iam.tf` for exact role bindings. Cross-project permissions (Cloud Run service agents, artifact registry access) are also configured there.
**Common 403 errors:**
- "Permission denied on Cloud Run" → `cicd_runner_sa` missing deployment role in the target project
- "Cannot act as service account" → Missing `iam.serviceAccountUser` binding on `app_sa`
- "Secret access denied" → `app_sa` missing `secretmanager.secretAccessor`
- "Cloud SQL connection failed / Not authorized" → Runtime service account missing `roles/cloudsql.client`
- "Artifact Registry read denied" → Cloud Run service agent missing read access in CI/CD project
---
## Required Permissions for CI/CD Setup
- **`roles/secretmanager.admin`** granted to the Cloud Build service account (`service-<PROJECT_NUMBER>@gcp-sa-cloudbuild.iam.gserviceaccount.com`) in the CI/CD project. This allows Cloud Build to access the GitHub token stored in Secret Manager.
---
## Required APIs
The following Google Cloud APIs must be enabled in your project for the skills and deployment to work:
- **`cloudbuild.googleapis.com`** — Required for building container images and running CI/CD pipelines.
- **`secretmanager.googleapis.com`** — Required for managing secrets and API keys.
- **`run.googleapis.com`** — Required for deploying to Cloud Run.
Ensure these are enabled before running deployment or CI/CD setup commands:
```bash
gcloud services enable cloudbuild.googleapis.com secretmanager.googleapis.com run.googleapis.com --project=YOUR_PROJECT_ID
```
---
## Secret Manager (for API Credentials)
Instead of passing sensitive keys as environment variables, use GCP Secret Manager.
```bash
# Create a secret
echo -n "YOUR_API_KEY" | gcloud secrets create MY_SECRET_NAME --data-file=-
# Update an existing secret
echo -n "NEW_API_KEY" | gcloud secrets versions add MY_SECRET_NAME --data-file=-
```
**Grant access:** For Cloud Run and GKE, grant `secretmanager.secretAccessor` to `app_sa`. On GKE, read secrets through Kubernetes Secrets or directly through the Secret Manager API with Workload Identity. For Agent Runtime, grant the role to the platform-managed SA (`service-PROJECT_NUMBER@gcp-sa-aiplatform-re.iam.gserviceaccount.com`).
**Pass secrets at deploy time (Agent Runtime):**
```bash
agents-cli deploy --secrets "API_KEY=my-api-key,DB_PASS=db-password:2"
```
Format: `ENV_VAR=SECRET_ID` or `ENV_VAR=SECRET_ID:VERSION` (defaults to latest). Access in code via `os.environ.get("API_KEY")`.
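
To make the format concrete, here is a small illustrative parser for the `--secrets` string. It is a sketch of the documented format only, not the CLI's own implementation; the function name `parse_secrets` and the tuple return shape are hypothetical.

```python
from typing import Dict, Tuple


def parse_secrets(spec: str) -> Dict[str, Tuple[str, str]]:
    """Parse 'ENV=SECRET' or 'ENV=SECRET:VERSION' pairs.

    Returns {env_var: (secret_id, version)}; version defaults to 'latest'.
    """
    parsed: Dict[str, Tuple[str, str]] = {}
    for pair in spec.split(","):
        pair = pair.strip()
        if not pair:
            continue
        env_var, sep, secret_ref = pair.partition("=")
        if not sep or not env_var or not secret_ref:
            raise ValueError(f"Expected ENV_VAR=SECRET_ID[:VERSION], got {pair!r}")
        # Everything after the first ':' is treated as the version.
        secret_id, _, version = secret_ref.partition(":")
        parsed[env_var] = (secret_id, version or "latest")
    return parsed


print(parse_secrets("API_KEY=my-api-key,DB_PASS=db-password:2"))
```

With the example string from above, `API_KEY` resolves to the latest version of `my-api-key`, while `DB_PASS` is pinned to version `2` of `db-password`.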
---
## Cloud SQL Permissions (Manual Deployment)
When using Cloud SQL with Cloud Run in a **manual deployment** (e.g., adding `--add-cloudsql-instances` in non-Terraform setups), you must manually grant the `Cloud SQL Client` role to the runtime service account.
Without this, the deployment may succeed but fail at runtime with `cloudsql.instances.get` authorization errors.
```bash
gcloud projects add-iam-policy-binding YOUR_PROJECT_ID \
--member="serviceAccount:YOUR_RUNTIME_SA_EMAIL" \
--role="roles/cloudsql.client"
```
> **Note:** In full Terraform-managed setups (`infra cicd` / `infra single-project`), this role is configured and managed automatically.
---
## Observability
See the `/google-agents-cli-observability` skill for observability configuration (Cloud Trace, prompt-response logging, BigQuery Analytics, third-party integrations).
---
## Testing Your Deployed Agent
The quickest way to test a deployed agent is `agents-cli run --url <service-url> --mode <a2a|adk> "your prompt"` — it handles auth, sessions, and streaming automatically (supports Agent Runtime and Cloud Run).
For advanced testing (custom headers, session reuse, scripting, load tests), see `references/testing-deployed-agents.md`.
---
## Deploying with a UI (IAP)
IAP (Identity-Aware Proxy) secures a Cloud Run service so only authorized Google accounts can access it. Support for IAP deployment via `agents-cli deploy` is planned for a future release.
For Agent Runtime with a custom frontend, use a **decoupled deployment** — deploy the frontend separately to Cloud Run or Cloud Storage, connecting to the Agent Runtime backend API.
For more information on IAP with Cloud Run, see the [Cloud Console IAP settings](https://cloud.google.com/run/docs/securing/identity-aware-proxy-cloud-run#manage_user_or_group_access).
---
## Rollback & Recovery
The primary rollback mechanism is **git-based**: fix the issue, commit, and push to `main`. The CI/CD pipeline will automatically build and deploy the new version through staging → production.
For immediate Cloud Run rollback without a new commit, use revision traffic shifting:
```bash
gcloud run revisions list --service=SERVICE_NAME --region=REGION
gcloud run services update-traffic SERVICE_NAME \
--to-revisions=REVISION_NAME=100 --region=REGION
```
Agent Runtime doesn't support revision-based rollback — fix and redeploy via `agents-cli deploy`.
For GKE rollback, use `kubectl rollout undo`:
```bash
kubectl rollout undo deployment/DEPLOYMENT_NAME -n NAMESPACE
kubectl rollout status deployment/DEPLOYMENT_NAME -n NAMESPACE
```
---
## Custom Infrastructure (Terraform)
**CRITICAL**: When your agent requires custom infrastructure (Cloud SQL, Pub/Sub, Eventarc, BigQuery, etc.), you MUST define it in Terraform — never create resources manually via `gcloud` commands. Exception: quick experimentation is fine with `gcloud` or console, but production infrastructure must be in Terraform.
For custom infrastructure patterns, consult `references/terraform-patterns.md` for:
- Where to put custom Terraform files (single-project vs CI/CD)
- Resource examples (Pub/Sub, BigQuery, Eventarc triggers)
- IAM bindings for custom resources
- Terraform state management (remote vs local, importing resources)
- Common infrastructure patterns
---
## Troubleshooting
| Issue | Solution |
|-------|----------|
| Terraform state locked | `terraform force-unlock -force LOCK_ID` in `deployment/terraform/` |
| GitHub Actions auth failed | Re-run `terraform apply` in CI/CD terraform dir; verify WIF pool/provider |
| Cloud Build authorization pending | Use `github_actions` runner instead |
| Resource already exists | `terraform import` (see `references/terraform-patterns.md`) |
| Agent Runtime deploy timeout / hangs | Deployments take 5-10 min; check if engine was created (see Agent Runtime Specifics) |
| Secret not available | Verify `secretAccessor` granted to `app_sa` (not the default compute SA) |
| Cloud SQL connection failed / 403 | Grant `roles/cloudsql.client` to the runtime service account when using manual deployments |
| 403 on deploy | Check `deployment/terraform/iam.tf` — `cicd_runner_sa` needs deployment + SA impersonation roles in the target project |
| 403 when testing Cloud Run | Default is `--no-allow-unauthenticated`; include `Authorization: Bearer $(gcloud auth print-identity-token)` header |
| Cold starts too slow | Set `min_instance_count > 0` in Cloud Run Terraform config |
| Cloud Run 503 errors | Check resource limits (memory/CPU), increase `max_instance_count`, or check container crash logs |
| 403 right after granting IAM role | IAM propagation is not instant — wait a couple of minutes before retrying. Don't keep re-granting the same role |
| Resource seems missing but Terraform created it | Run `terraform state list` to check what Terraform actually manages. Resources created via `null_resource` + `local-exec` (e.g., BQ linked datasets) won't appear in `gcloud` CLI output |
| Deployment failed or agent not responding | Check Cloud Logging: `gcloud logging read "resource.type=cloud_run_revision AND resource.labels.service_name=SERVICE" --project=PROJECT --limit=50 --format="table(timestamp,severity,textPayload)"` for Cloud Run, or `gcloud logging read "resource.type=aiplatform.googleapis.com/ReasoningEngine" --project=PROJECT --limit=50` for Agent Runtime |
| Agent returns errors after deploy | Open Cloud Logging in Console → filter by service name (Cloud Run) or reasoning engine resource (Agent Runtime) → look for Python tracebacks or permission errors in recent log entries |
---
## Platform Registration
For registering deployed agents with Gemini Enterprise, see `/google-agents-cli-publish`.
---
## Related Skills
- `/google-agents-cli-workflow` — Development workflow, coding guidelines, and operational rules
- `/google-agents-cli-adk-code` — ADK Python API quick reference for writing agent code
- `/google-agents-cli-eval` — Evaluation methodology, evalset schema, and the eval-fix loop
- `/google-agents-cli-scaffold` — Project creation and enhancement with `agents-cli scaffold create` / `scaffold enhance`
- `/google-agents-cli-observability` — Cloud Trace, logging, BigQuery Analytics, and third-party integrations
- `/google-agents-cli-publish` — Gemini Enterprise registration
google-agents-cli-eval
Run and debug ADK agent evaluations — evalsets, LLM-as-judge, tool trajectory scoring, and the eval-fix loop.
# ADK Evaluation Guide
> **Requires:** `agents-cli` (`uv tool install google-agents-cli`) — [install uv](https://docs.astral.sh/uv/getting-started/installation/index.md) first if needed.
> **Scaffolded project?** If you used `/google-agents-cli-scaffold`, you already have `agents-cli eval run`, `tests/eval/evalsets/`, and `tests/eval/eval_config.json`. Start with `agents-cli eval run` and iterate from there.
## Reference Files
| File | Contents |
|------|----------|
| `references/criteria-guide.md` | Complete metrics reference — all 8 criteria, match types, custom metrics, judge model config |
| `references/user-simulation.md` | Dynamic conversation testing — ConversationScenario, user simulator config, compatible metrics |
| `references/builtin-tools-eval.md` | google_search and model-internal tools — trajectory behavior, metric compatibility |
| `references/multimodal-eval.md` | Multimodal inputs — evalset schema, built-in metric limitations, custom evaluator pattern |
---
## The Eval-Fix Loop
Evaluation is iterative. When a score is below threshold, diagnose the cause, fix it, rerun — don't just report the failure.
### How to iterate
1. **Start small**: Begin with 1-2 eval cases, not the full suite
2. **Run eval**: `agents-cli eval run`
3. **Read the scores** — identify what failed and why
4. **Fix the code** — adjust prompts, tool logic, instructions, or the evalset
5. **Rerun eval** — verify the fix worked
6. **Repeat steps 3-5** until the case passes
7. **Only then** add more eval cases and expand coverage
**Expect 5-10+ iterations.** This is normal — each iteration makes the agent better.
**Task tracking:** When doing 5+ eval-fix iterations, use a task list to track which cases you've fixed, which are still failing, and what you've tried. This prevents re-attempting the same fix or losing track of regression across iterations.
### Shortcuts That Waste Time
Recognize these rationalizations and push back — they always cost more time than they save:
| Shortcut | Why it fails |
|----------|-------------|
| "I'll tune the eval thresholds down to make it pass" | Lowering thresholds hides real failures. If the agent can't meet the bar, fix the agent — don't move the bar. |
| "This eval case is flaky, I'll skip it" | Flaky evals reveal non-determinism in your agent. Fix with `temperature=0`, rubric-based metrics, or more specific instructions — don't delete the signal. |
| "I just need to fix the evalset, not the agent" | If you're always adjusting expected outputs, your agent has a behavior problem. Fix the instructions or tool logic first. |
### What to fix when scores fail
| Failure | What to change |
|---------|---------------|
| `tool_trajectory_avg_score` low | Fix agent instructions (tool ordering), update evalset `tool_uses`, or switch to `IN_ORDER`/`ANY_ORDER` match type |
| `response_match_score` low | Adjust agent instruction wording, or relax the expected response |
| `final_response_match_v2` low | Refine agent instructions, or adjust expected response — this is semantic, not lexical |
| `rubric_based` score low | Refine agent instructions to address the specific rubric that failed |
| `hallucinations_v1` low | Tighten agent instructions to stay grounded in tool output |
| Agent calls wrong tools | Fix tool descriptions, agent instructions, or tool_config |
| Agent calls extra tools | Use `IN_ORDER`/`ANY_ORDER` match type, add strict stop instructions, or switch to `rubric_based_tool_use_quality_v1` |
---
## Choosing the Right Criteria
| Goal | Recommended Metric |
|------|--------------------|
| Regression testing / CI/CD (fast, deterministic) | `tool_trajectory_avg_score` + `response_match_score` |
| Semantic response correctness (flexible phrasing OK) | `final_response_match_v2` |
| Response quality without reference answer | `rubric_based_final_response_quality_v1` |
| Validate tool usage reasoning | `rubric_based_tool_use_quality_v1` |
| Detect hallucinated claims | `hallucinations_v1` |
| Safety compliance | `safety_v1` |
| Dynamic multi-turn conversations | User simulation + `hallucinations_v1` / `safety_v1` (see `references/user-simulation.md`) |
| Multimodal input (image, audio, file) | `tool_trajectory_avg_score` + custom metric for response quality (see `references/multimodal-eval.md`) |
For the complete metrics reference with config examples, match types, and custom metrics, see `references/criteria-guide.md`.
---
## Running Evaluations
```bash
# Scaffolded projects — agents-cli:
agents-cli eval run --evalset tests/eval/evalsets/my_evalset.json
# With explicit config file:
agents-cli eval run --evalset tests/eval/evalsets/my_evalset.json --config tests/eval/eval_config.json
# Run all evalsets in tests/eval/evalsets/:
agents-cli eval run --all
```
**`agents-cli eval run` options:** `--evalset PATH`, `--config PATH`, `--all`
**Compare two result files:**
```bash
agents-cli eval compare baseline.json candidate.json
```
---
## Configuration Schema (`eval_config.json`)
Both camelCase and snake_case field names are accepted (Pydantic aliases). The examples below use snake_case, matching the official ADK docs.
### Full example
```json
{
  "criteria": {
    "tool_trajectory_avg_score": {
      "threshold": 1.0,
      "match_type": "IN_ORDER"
    },
    "final_response_match_v2": {
      "threshold": 0.8,
      "judge_model_options": {
        "judge_model": "gemini-flash-latest",
        "num_samples": 5
      }
    },
    "rubric_based_final_response_quality_v1": {
      "threshold": 0.8,
      "rubrics": [
        {
          "rubric_id": "professionalism",
          "rubric_content": { "text_property": "The response must be professional and helpful." }
        },
        {
          "rubric_id": "safety",
          "rubric_content": { "text_property": "The agent must NEVER book without asking for confirmation." }
        }
      ]
    }
  }
}
```
Simple threshold shorthand is also valid: `"response_match_score": 0.8`
For custom metrics, `judge_model_options` details, and `user_simulator_config`, see `references/criteria-guide.md`.
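
When inspecting a config in a script, it can help to expand the threshold shorthand into the full object form first. The sketch below is illustrative only: `normalize_criteria` is a hypothetical helper, not an ADK function, and it assumes snake_case keys as used in the examples above.

```python
import json
from typing import Any, Dict


def normalize_criteria(config: Dict[str, Any]) -> Dict[str, Dict[str, Any]]:
    """Expand `"metric": 0.8` shorthand into `{"threshold": 0.8}` form."""
    normalized: Dict[str, Dict[str, Any]] = {}
    for metric, value in config.get("criteria", {}).items():
        if isinstance(value, dict):
            normalized[metric] = dict(value)  # already in full form
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            normalized[metric] = {"threshold": float(value)}
        else:
            raise TypeError(f"Unsupported criterion value for {metric!r}: {value!r}")
    return normalized


config = json.loads(
    """
    {
      "criteria": {
        "response_match_score": 0.8,
        "tool_trajectory_avg_score": {"threshold": 1.0, "match_type": "IN_ORDER"}
      }
    }
    """
)
thresholds = {name: c["threshold"] for name, c in normalize_criteria(config).items()}
print(thresholds)
```

After normalization every criterion exposes a `threshold` key, so a report or lint step can treat both forms the same way.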
---
## EvalSet Schema (`evalset.json`)
```json
{
  "eval_set_id": "my_eval_set",
  "name": "My Eval Set",
  "description": "Tests core capabilities",
  "eval_cases": [
    {
      "eval_id": "search_test",
      "conversation": [
        {
          "invocation_id": "inv_1",
          "user_content": { "parts": [{ "text": "Find a flight to NYC" }] },
          "final_response": {
            "role": "model",
            "parts": [{ "text": "I found a flight for $500. Want to book?" }]
          },
          "intermediate_data": {
            "tool_uses": [
              { "name": "search_flights", "args": { "destination": "NYC" } }
            ],
            "intermediate_responses": [
              ["sub_agent_name", [{ "text": "Found 3 flights to NYC." }]]
            ]
          }
        }
      ],
      "session_input": { "app_name": "my_app", "user_id": "user_1", "state": {} }
    }
  ]
}
```
**Key fields:**
- `intermediate_data.tool_uses` — expected tool call trajectory (chronological order)
- `intermediate_data.intermediate_responses` — expected sub-agent responses (for multi-agent systems)
- `session_input.state` — initial session state (overrides Python-level initialization)
- `conversation_scenario` — alternative to `conversation` for user simulation (see `references/user-simulation.md`)
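
As a quick way to review what an evalset expects, the key fields above can be read with plain dictionary access. This is a minimal sketch over the JSON structure shown in this section; `expected_tool_names` is a hypothetical helper, not part of ADK or `agents-cli`.

```python
from typing import Any, Dict, List


def expected_tool_names(evalset: Dict[str, Any]) -> Dict[str, List[List[str]]]:
    """Map each eval_id to the expected tool names of every invocation, in order."""
    result: Dict[str, List[List[str]]] = {}
    for case in evalset.get("eval_cases", []):
        per_invocation = []
        for invocation in case.get("conversation", []):
            tool_uses = invocation.get("intermediate_data", {}).get("tool_uses", [])
            per_invocation.append([tool["name"] for tool in tool_uses])
        result[case["eval_id"]] = per_invocation
    return result


evalset = {
    "eval_set_id": "my_eval_set",
    "eval_cases": [
        {
            "eval_id": "search_test",
            "conversation": [
                {
                    "invocation_id": "inv_1",
                    "intermediate_data": {
                        "tool_uses": [
                            {"name": "search_flights", "args": {"destination": "NYC"}}
                        ]
                    },
                }
            ],
        }
    ],
}
print(expected_tool_names(evalset))
```

Printing this summary before a run makes it easier to spot cases whose expected trajectory is empty or out of date.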
---
## Common Gotchas
### The Proactivity Trajectory Gap
LLMs often perform extra actions not asked for (e.g., `google_search` after `save_preferences`). This causes `tool_trajectory_avg_score` failures with `EXACT` match. Solutions:
1. **Use `IN_ORDER` or `ANY_ORDER` match type** — tolerates extra tool calls between expected ones
2. Include ALL tools the agent might call in your expected trajectory
3. Use `rubric_based_tool_use_quality_v1` instead of trajectory matching
4. Add strict stop instructions: "Stop after calling save_preferences. Do NOT search."
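
The difference between the match types can be illustrated with a simplified comparison over tool names. This is a sketch of the semantics as described in this guide, not ADK's implementation: the function `trajectory_matches` is hypothetical, it compares tool names only and ignores arguments, and it assumes `ANY_ORDER` also tolerates extra calls.

```python
from typing import List


def trajectory_matches(expected: List[str], actual: List[str], match_type: str = "EXACT") -> bool:
    """Simplified, name-only illustration of the three match types."""
    if match_type == "EXACT":
        # Same tools, same order, nothing extra.
        return actual == expected
    if match_type == "IN_ORDER":
        # Expected tools must appear in order; extra calls in between are tolerated.
        position = 0
        for name in actual:
            if position < len(expected) and name == expected[position]:
                position += 1
        return position == len(expected)
    if match_type == "ANY_ORDER":
        # Every expected tool must appear, order does not matter.
        remaining = list(actual)
        for name in expected:
            if name not in remaining:
                return False
            remaining.remove(name)
        return True
    raise ValueError(f"Unknown match type: {match_type!r}")


expected = ["save_preferences"]
actual = ["save_preferences", "google_search"]  # the agent proactively searched
for match_type in ("EXACT", "IN_ORDER", "ANY_ORDER"):
    print(match_type, trajectory_matches(expected, actual, match_type))
```

In this example only `EXACT` rejects the trajectory, because the extra `google_search` call breaks strict equality while both relaxed types still find `save_preferences`.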
### Multi-turn conversations require tool_uses for ALL turns
The `tool_trajectory_avg_score` metric evaluates each invocation separately. If you don't specify expected tool calls for intermediate turns, the evaluation will fail even if the agent called the right tools.
```json
{
  "conversation": [
    {
      "invocation_id": "inv_1",
      "user_content": { "parts": [{"text": "Find me a flight from NYC to London"}] },
      "intermediate_data": {
        "tool_uses": [
          { "name": "search_flights", "args": {"origin": "NYC", "destination": "LON"} }
        ]
      }
    },
    {
      "invocation_id": "inv_2",
      "user_content": { "parts": [{"text": "Book the first option"}] },
      "final_response": { "role": "model", "parts": [{"text": "Booking confirmed!"}] },
      "intermediate_data": {
        "tool_uses": [
          { "name": "book_flight", "args": {"flight_id": "1"} }
        ]
      }
    }
  ]
}
```
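
A simple pre-flight check can catch this before running the eval. The sketch below is a hypothetical lint helper (`invocations_missing_tool_uses` is not part of ADK) that flags invocations with no `tool_uses` key; whether a turn that legitimately calls no tools should carry an empty list is left as an assumption here.

```python
from typing import Any, Dict, List


def invocations_missing_tool_uses(conversation: List[Dict[str, Any]]) -> List[str]:
    """Return the invocation_ids that do not declare any `tool_uses` key."""
    missing = []
    for invocation in conversation:
        intermediate = invocation.get("intermediate_data") or {}
        if "tool_uses" not in intermediate:
            missing.append(invocation.get("invocation_id", "<unknown>"))
    return missing


conversation = [
    {
        "invocation_id": "inv_1",
        # No intermediate_data here: this turn would be flagged.
        "user_content": {"parts": [{"text": "Find me a flight from NYC to London"}]},
    },
    {
        "invocation_id": "inv_2",
        "user_content": {"parts": [{"text": "Book the first option"}]},
        "intermediate_data": {
            "tool_uses": [{"name": "book_flight", "args": {"flight_id": "1"}}]
        },
    },
]
print(invocations_missing_tool_uses(conversation))
```

Here only `inv_1` is reported, which is exactly the kind of intermediate turn that tends to be forgotten.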
### App name must match directory name
The `App` object's `name` parameter MUST match the directory containing your agent:
```python
from google.adk.apps import App

# CORRECT - matches the "app" directory
app = App(root_agent=root_agent, name="app")

# WRONG - causes "Session not found" errors
app = App(root_agent=root_agent, name="flight_booking_assistant")
```
### The `before_agent_callback` Pattern (State Initialization)
Always use a callback to initialize session state variables used in your instruction template. This prevents `KeyError` crashes on the first turn:
```python
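from google.adk.agents import Agent
from google.adk.agents.callback_context import CallbackContext


async def initialize_state(callback_context: CallbackContext) -> None:
    """Seed state keys referenced by the instruction template."""
    state = callback_context.state
    if "user_preferences" not in state:
        state["user_preferences"] = {}


root_agent = Agent(
    name="my_agent",
    before_agent_callback=initialize_state,
    instruction="Based on preferences: {user_preferences}...",
)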
```
### Eval-State Overrides (Type Mismatch Danger)
Be careful with `session_input.state` in your evalset. It overrides Python-level initialization:
WRONG — initializes feedback_history as a string, breaks `.append()`:
```json
"state": { "feedback_history": "" }
```
CORRECT — matches the Python type (list):
```json
"state": { "feedback_history": [] }
```
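The failure mode is easy to reproduce in plain Python. In this minimal sketch, `record_feedback` is a hypothetical stand-in for agent code that appends to a list kept in session state; a plain dict plays the role of the state object.

```python
def record_feedback(state, feedback):
    """Append feedback to the history stored in session state."""
    state.setdefault("feedback_history", [])  # Python-level default: a list
    state["feedback_history"].append(feedback)
    return state


# State seeded with the right type: append works.
print(record_feedback({"feedback_history": []}, "too verbose"))
# {'feedback_history': ['too verbose']}

# State seeded by the evalset with a string: setdefault keeps the existing
# string, and str has no .append() method.
try:
    record_feedback({"feedback_history": ""}, "too verbose")
except AttributeError as err:
    print(f"AttributeError: {err}")
```

Because the key already exists, the Python-level default never runs, so the evalset value decides the type.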
### Model thinking mode may bypass tools
Models with "thinking" enabled may skip tool calls. Use `tool_config` with `mode="ANY"` to force tool usage, or switch to a non-thinking model for predictable tool calling.
---
## Common Eval Failure Causes
| Symptom | Cause | Fix |
|---------|-------|-----|
| Missing `tool_uses` in intermediate turns | Trajectory expects match per invocation | Add expected tool calls to all turns |
| Agent mentions data not in tool output | Hallucination | Tighten agent instructions; add `hallucinations_v1` metric |
| "Session not found" error | App name mismatch | Ensure App `name` matches directory name |
| Score fluctuates between runs | Non-deterministic model | Set `temperature=0` or use rubric-based eval |
| `tool_trajectory_avg_score` always 0 | Agent uses `google_search` (model-internal) | Remove trajectory metric; see `references/builtin-tools-eval.md` |
| Trajectory fails but tools are correct | Extra tools called | Switch to `IN_ORDER`/`ANY_ORDER` match type |
| LLM judge ignores image/audio in eval | `get_text_from_content()` skips non-text parts | Use custom metric with vision-capable judge (see `references/multimodal-eval.md`) |
---
## Deep Dive: ADK Docs
For the official evaluation documentation, fetch these pages:
- **Evaluation overview**: `https://adk.dev/evaluate/index.md`
- **Criteria reference**: `https://adk.dev/evaluate/criteria/index.md`
- **User simulation**: `https://adk.dev/evaluate/user-sim/index.md`
---
## Debugging Example
User says: "tool_trajectory_avg_score is 0, what's wrong?"
1. Check if agent uses `google_search` — if so, see `references/builtin-tools-eval.md`
2. Check if using `EXACT` match and agent calls extra tools — try `IN_ORDER`
3. Compare expected `tool_uses` in evalset with actual agent behavior
4. Fix mismatch (update evalset or agent instructions)
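Step 3 can be made concrete with a small diff helper. This is a sketch under assumptions: `first_mismatch` is a hypothetical function, and both lists are assumed to use the `name`/`args` shape of `tool_uses` entries, with the actual calls copied by hand from a run.

```python
def first_mismatch(expected, actual):
    """Describe the first difference between two tool-call lists, or return None."""
    for i, (exp, act) in enumerate(zip(expected, actual)):
        if exp["name"] != act["name"]:
            return f"call {i}: expected tool {exp['name']!r}, got {act['name']!r}"
        if exp.get("args", {}) != act.get("args", {}):
            return f"call {i}: args differ for tool {exp['name']!r}"
    if len(expected) != len(actual):
        return f"expected {len(expected)} call(s), got {len(actual)}"
    return None


expected = [{"name": "search_flights", "args": {"origin": "NYC", "destination": "LON"}}]
actual = [{"name": "search_flights", "args": {"origin": "NYC", "destination": "London"}}]

print(first_mismatch(expected, actual))    # call 0: args differ for tool 'search_flights'
print(first_mismatch(expected, expected))  # None
```

An argument mismatch like this one is usually fixed in the evalset or in the tool's docstring, not by relaxing the match type.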
---
## Proving Your Work
Don't assert that eval passes — show the evidence. Concrete output prevents false confidence and catches issues early.
- **After running eval:** Paste the scores table output so the user can see exactly what passed and failed.
- **After fixing a failure:** Show before/after scores for the specific case you fixed, and confirm no other cases regressed.
- **Before declaring "eval passes":** Confirm ALL cases pass, not just the one you were working on. Run `agents-cli eval run` (or `agents-cli eval run --all`) one final time.
- **Before moving to deploy:** Show the final `agents-cli eval run` output with all cases above threshold. This is the gate — no exceptions.
---
## Related Skills
- `/google-agents-cli-workflow` — Development workflow and the spec-driven build-evaluate-deploy lifecycle
- `/google-agents-cli-adk-code` — ADK Python API quick reference for writing agent code
- `/google-agents-cli-scaffold` — Project creation and enhancement with `agents-cli scaffold create` / `scaffold enhance`
- `/google-agents-cli-deploy` — Deployment targets, CI/CD pipelines, and production workflows
- `/google-agents-cli-observability` — Cloud Trace, logging, and monitoring for debugging agent behavior
google-agents-cli-observability
# ADK Observability Guide
> **Cloud Trace** works out of the box — no infrastructure needed. **Prompt-response logging** and **BigQuery Agent Analytics** require Terraform-provisioned infrastructure (service account, GCS bucket, BigQuery dataset). Run `agents-cli infra single-project --project PROJECT_ID` to provision these resources. See `references/cloud-trace-and-logging.md` for details, env vars, and verification commands. If your project isn't scaffolded yet, see `/google-agents-cli-scaffold` first.
### Order of operations for `agent_runtime` deployments
For `deployment_target = agent_runtime`, run `agents-cli infra single-project` **before** the first `agents-cli deploy`. The Terraform module owns the entire Reasoning Engine resource (display_name, service account, deployment spec, env vars), so applying it after an SDK-based deploy creates a state mismatch — Terraform has no record of the SDK-deployed instance and cannot layer env vars onto it without taking ownership of the whole resource.
If you have already run `agents-cli deploy`, you have two options:
1. **Switch to Terraform-managed.** Delete the SDK-deployed Reasoning Engine, then run `agents-cli infra single-project` followed by `agents-cli deploy`. Sessions and any in-flight state on the previous instance are lost.
2. **Keep the SDK-deployed instance.** Skip `infra single-project` and set the observability env vars on the running instance directly via the `vertexai` client `update` API. You will also need to grant the instance's service account the IAM permissions required to emit telemetry — writing to the logs GCS bucket, BigQuery dataset access, log writer, etc. See `deployment/terraform/single-project/iam.tf` and `telemetry.tf` in your scaffolded project for the full set of bindings the Terraform module would otherwise provision. Terraform-managed env vars are not available in this mode.
### Reference Files
| File | Contents |
|------|----------|
| `references/cloud-trace-and-logging.md` | Scaffolded project details — Terraform-provisioned resources, environment variables, verification commands, enabling/disabling locally |
| `references/bigquery-agent-analytics.md` | BQ Agent Analytics plugin — enabling, key features, GCS offloading, tool provenance |
---
## Observability Tiers
Choose the right level of observability based on your needs:
| Tier | What It Does | Scope | Default State | Best For |
|------|-------------|-------|---------------|----------|
| **Cloud Trace** | Distributed tracing — execution flow, latency, errors via OpenTelemetry spans | All templates, all environments | Always enabled | Debugging latency, understanding agent execution flow |
| **Prompt-Response Logging** | GenAI interactions exported to GCS, BigQuery, and Cloud Logging | ADK agents only | Disabled locally, enabled when deployed | Auditing LLM interactions, compliance |
| **BigQuery Agent Analytics** | Structured agent events (LLM calls, tool use, outcomes) to BigQuery | ADK agents with plugin enabled | Opt-in (`--bq-analytics` at scaffold time) | Conversational analytics, custom dashboards, LLM-as-judge evals |
| **Third-Party Integrations** | External observability platforms (AgentOps, Phoenix, MLflow, etc.) | Any ADK agent | Opt-in, per-provider setup | Team collaboration, specialized visualization, prompt management |
**Ask the user** which tier(s) they need — they can be combined. Cloud Trace is always on; the others are additive.
---
## Cloud Trace
ADK uses OpenTelemetry to emit distributed traces. Every agent invocation produces spans that track the full execution flow.
### Span Hierarchy
```
invocation
└── agent_run (one per agent in the chain)
├── call_llm (model request/response)
└── execute_tool (tool execution)
```
### Setup by Deployment Type
| Deployment | Setup |
|-----------|-------|
| **Agent Runtime** | Automatic — traces are exported to Cloud Trace by default |
| **Cloud Run (scaffolded)** | Automatic — `otel_to_cloud=True` in the FastAPI app |
| **GKE (scaffolded)** | Automatic — `otel_to_cloud=True` in the FastAPI app |
| **Cloud Run / GKE (manual)** | Configure OpenTelemetry exporter in your app |
| **Local dev** | Works with `agents-cli playground`; traces visible in Cloud Console |
View traces: **Cloud Console → Trace → Trace explorer**
For detailed setup instructions (Agent Runtime CLI/SDK, Cloud Run, custom deployments), fetch `https://adk.dev/integrations/cloud-trace/index.md`.
---
## Prompt-Response Logging
Captures GenAI interactions (model name, tokens, timing) and exports to GCS (JSONL) and BigQuery (via direct log sinks and external tables). Privacy-preserving by default — only metadata is logged unless explicitly configured otherwise.
Key env var: `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` — set to `NO_CONTENT` (metadata only, default in deployed envs), `true` (full content), or `false` (disabled). Logging is disabled locally unless `LOGS_BUCKET_NAME` is set.
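The interaction between the two variables can be summarized in a short sketch. `describe_logging_mode` is a hypothetical helper, not part of ADK, and treating an unset capture variable as `NO_CONTENT` is an assumption based on the deployed default described above.

```python
CAPTURE_VAR = "OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT"


def describe_logging_mode(env):
    """Summarize the prompt-response logging behavior implied by an environment mapping."""
    if not env.get("LOGS_BUCKET_NAME"):
        return "disabled (LOGS_BUCKET_NAME not set)"
    value = env.get(CAPTURE_VAR, "NO_CONTENT")  # assumed default when unset
    if value == "NO_CONTENT":
        return "metadata only"
    if value.lower() == "true":
        return "full content"
    if value.lower() == "false":
        return "disabled"
    return f"unrecognized value: {value!r}"


print(describe_logging_mode({}))                               # disabled (LOGS_BUCKET_NAME not set)
print(describe_logging_mode({"LOGS_BUCKET_NAME": "my-logs"}))  # metadata only
print(describe_logging_mode({"LOGS_BUCKET_NAME": "my-logs", CAPTURE_VAR: "true"}))  # full content
```

Passing `os.environ` instead of a literal dict reports the mode for the current shell.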
For scaffolded project details (Terraform resources, env vars, privacy modes, enabling/disabling, verification commands), see `references/cloud-trace-and-logging.md`.
For ADK logging docs (log levels, configuration, debugging), fetch `https://adk.dev/observability/logging/index.md`.
---
## BigQuery Agent Analytics Plugin
Optional plugin that logs structured agent events to BigQuery. Enable with `--bq-analytics` at scaffold time. See `references/bigquery-agent-analytics.md` for details.
---
## Third-Party Integrations
ADK supports several third-party observability platforms. Each uses OpenTelemetry or custom instrumentation to capture agent behavior.
| Platform | Key Differentiator | Setup Complexity | Self-Hosted Option |
|----------|-------------------|-----------------|-------------------|
| **AgentOps** | Session replays, 2-line setup, replaces native telemetry | Minimal | No (SaaS) |
| **Arize AX** | Commercial platform, production monitoring, evaluation dashboards | Low | No (SaaS) |
| **Phoenix** | Open-source, custom evaluators, experiment testing | Low | Yes |
| **MLflow** | OTel traces to MLflow Tracking Server, span tree visualization | Medium (needs SQL backend) | Yes |
| **Monocle** | 1-call setup, VS Code Gantt chart visualizer | Minimal | Yes (local files) |
| **Weave** | W&B platform, team collaboration, timeline views | Low | No (SaaS) |
| **Freeplay** | Prompt management + evals + observability in one platform | Low | No (SaaS) |
**Ask the user** which platform they prefer — present the trade-offs and let them choose. For setup details, fetch the relevant ADK docs page from the Deep Dive table below.
---
## Troubleshooting
| Issue | Solution |
|-------|----------|
| No traces in Cloud Trace | Verify `otel_to_cloud=True` in FastAPI app; check service account has `cloudtrace.agent` role |
| Prompt-response data not appearing | Check `LOGS_BUCKET_NAME` is set; verify SA has `storage.objectCreator` on the bucket; check app logs for telemetry setup warnings |
| Privacy mode misconfigured | Check `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` value — use `NO_CONTENT` for metadata-only, `false` to disable |
| BigQuery Analytics not logging | Verify plugin is configured in `app/agent.py`; check `BQ_ANALYTICS_DATASET_ID` env var is set |
| Third-party integration not capturing spans | Check provider-specific env vars (API keys, endpoints); some providers (AgentOps) replace native telemetry |
| Traces missing tool spans | Tool execution spans appear under `execute_tool` — check trace explorer filters |
| High telemetry costs | Switch to `NO_CONTENT` mode; reduce BigQuery retention; disable unused tiers |
---
## Deep Dive: ADK Docs (WebFetch URLs)
For detailed documentation beyond what this skill covers, fetch these pages:
| Topic | URL |
|-------|-----|
| Observability overview | `https://adk.dev/observability/index.md` |
| Agent activity logging | `https://adk.dev/observability/logging/index.md` |
| Cloud Trace integration | `https://adk.dev/integrations/cloud-trace/index.md` |
| BigQuery Agent Analytics | `https://adk.dev/integrations/bigquery-agent-analytics/index.md` |
| AgentOps | `https://adk.dev/integrations/agentops/index.md` |
| Arize AX | `https://adk.dev/integrations/arize-ax/index.md` |
| Phoenix (Arize) | `https://adk.dev/integrations/phoenix/index.md` |
| MLflow tracing | `https://adk.dev/integrations/mlflow-tracing/index.md` |
| Monocle | `https://adk.dev/integrations/monocle/index.md` |
| W&B Weave | `https://adk.dev/integrations/weave/index.md` |
| Freeplay | `https://adk.dev/integrations/freeplay/index.md` |
---
## Related Skills
- `/google-agents-cli-deploy` — Deployment targets, CI/CD pipelines, and production workflows
- `/google-agents-cli-workflow` — Development workflow, coding guidelines, and operational rules
- `/google-agents-cli-adk-code` — ADK Python API quick reference for writing agent code
google-agents-cli-publish
Register ADK or A2A agents with Gemini Enterprise via agents-cli publish gemini-enterprise.
# Gemini Enterprise Registration
> **Requires:** A deployed agent. For Agent Runtime, `deployment_metadata.json` (created by `agents-cli deploy`) enables auto-detection. For Cloud Run or GKE, provide the agent card URL and flags directly.
## Prerequisites
1. **Agent must be deployed** — the agent must be running and reachable
2. **Gemini Enterprise app must exist** — Create one in Google Cloud Console → Gemini Enterprise → Apps before registering
3. **`deployment_metadata.json`** (Agent Runtime only) — Created automatically by `agents-cli deploy`; contains the agent runtime ID, deployment target, and A2A flag
## Required Permissions for A2A on Cloud Run
- **`roles/run.servicesInvoker`** granted to the Discovery Engine service account (`service-<PROJECT_NUMBER>@gcp-sa-discoveryengine.iam.gserviceaccount.com`) on the Cloud Run service.
---
## Registration Modes
### ADK Registration (default)
For standard ADK agents deployed to Agent Runtime. The agent is registered directly via its reasoning engine resource name.
```bash
agents-cli publish gemini-enterprise \
--agent-runtime-id projects/123456/locations/us-east1/reasoningEngines/789 \
--gemini-enterprise-app-id projects/123456/locations/global/collections/default_collection/engines/my-app \
--display-name "My Agent" \
--description "Handles customer queries" \
--tool-description "Answers questions about products"
```
### A2A Registration
For agents using the Agent-to-Agent protocol. Requires an agent card URL — the command fetches the card and registers it.
```bash
# A2A on Cloud Run
agents-cli publish gemini-enterprise \
--registration-type a2a \
--agent-card-url https://my-service-abc123.us-east1.run.app/a2a/app/.well-known/agent-card.json \
--gemini-enterprise-app-id projects/123456/locations/global/collections/default_collection/engines/my-app \
--display-name "My A2A Agent"
# A2A on Agent Runtime (card URL is auto-constructed from metadata)
agents-cli publish gemini-enterprise \
--registration-type a2a \
--gemini-enterprise-app-id projects/123456/locations/global/collections/default_collection/engines/my-app
```
---
## Programmatic Mode (CI/CD)
The command is non-interactive by default — pass all required values via flags or environment variables. This makes it safe for CI/CD pipelines.
### Via flags
```bash
agents-cli publish gemini-enterprise \
--agent-runtime-id "$AGENT_RUNTIME_ID" \
--gemini-enterprise-app-id "$GEMINI_ENTERPRISE_APP_ID" \
--display-name "Production Agent" \
--registration-type adk
```
### Via environment variables
Every flag has an env var alternative:
```bash
export AGENT_RUNTIME_ID="projects/123456/locations/us-east1/reasoningEngines/789"
export GEMINI_ENTERPRISE_APP_ID="projects/123456/locations/global/collections/default_collection/engines/my-app"
export GEMINI_DISPLAY_NAME="Production Agent"
export GEMINI_DESCRIPTION="Handles customer queries"
agents-cli publish gemini-enterprise
```
---
## Interactive Mode (`--interactive`)
Pass `--interactive` (or `-i`) to be guided through any missing values with interactive prompts. The command will list available Gemini Enterprise apps, offer to auto-detect the agent runtime ID from metadata, and prompt for display name and description.
```bash
agents-cli publish gemini-enterprise --interactive
```
---
## Complete Flag Reference
| Flag | Env Var | Description |
|------|---------|-------------|
| `--agent-runtime-id` | `AGENT_RUNTIME_ID` | Agent Runtime resource name (auto-detected from `deployment_metadata.json`) |
| `--gemini-enterprise-app-id` | `ID` or `GEMINI_ENTERPRISE_APP_ID` | Gemini Enterprise app full resource name |
| `--display-name` | `GEMINI_DISPLAY_NAME` | Display name in Gemini Enterprise |
| `--description` | `GEMINI_DESCRIPTION` | Agent description |
| `--tool-description` | `GEMINI_TOOL_DESCRIPTION` | Tool description (ADK mode only, defaults to description) |
| `--registration-type` | `REGISTRATION_TYPE` | `adk` or `a2a` (auto-detected from metadata if not set) |
| `--agent-card-url` | `AGENT_CARD_URL` | Agent card URL for A2A registration |
| `--deployment-target` | `DEPLOYMENT_TARGET` | `agent_runtime`, `cloud_run`, or `gke` (affects A2A auth method) |
| `--project-id` | `GOOGLE_CLOUD_PROJECT` | GCP project ID for billing |
| `--project-number` | `PROJECT_NUMBER` | GCP project number (used for Gemini Enterprise lookup) |
| `--authorization-id` | `GEMINI_AUTHORIZATION_ID` | OAuth authorization resource name |
| `--metadata-file` | — | Path to deployment metadata (default: `deployment_metadata.json`) |
| `--interactive` / `-i` | — | Enable interactive prompts |
---
## Auto-Detection from Metadata
When `deployment_metadata.json` exists, the command automatically:
- Reads the **agent runtime ID** (`remote_agent_runtime_id`)
- Detects the **registration type** (`is_a2a` flag)
- Constructs the **agent card URL** for A2A agents on Agent Runtime
- Determines the **deployment target** for authentication
This means that for the simplest case (ADK agent on Agent Runtime), you only need to provide the Gemini Enterprise app ID:
```bash
agents-cli publish gemini-enterprise \
--gemini-enterprise-app-id projects/123456/locations/global/collections/default_collection/engines/my-app
```
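To see what the command would pick up before running it, the metadata file can be inspected directly. The sketch below is not the CLI's own logic: `read_registration_hints` is a hypothetical helper, and it reads only the two fields named above (`remote_agent_runtime_id` and `is_a2a`); the real file may contain more.

```python
import json
import tempfile
from pathlib import Path


def read_registration_hints(metadata_path="deployment_metadata.json"):
    """Return (agent_runtime_id, registration_type), or None if the file is absent."""
    path = Path(metadata_path)
    if not path.exists():
        return None
    metadata = json.loads(path.read_text())
    runtime_id = metadata.get("remote_agent_runtime_id")
    registration_type = "a2a" if metadata.get("is_a2a") else "adk"
    return runtime_id, registration_type


# Demo with a throwaway metadata file; the resource name is a placeholder.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "deployment_metadata.json"
    demo_file.write_text(json.dumps({
        "remote_agent_runtime_id": "projects/123456/locations/us-east1/reasoningEngines/789",
        "is_a2a": False,
    }))
    hints = read_registration_hints(demo_file)
    missing = read_registration_hints(Path(tmp) / "nope.json")

print(hints)
print(missing)  # None
```

A `None` result corresponds to the case where the registration type and runtime ID must be passed explicitly.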
---
## SDK Compatibility
Agent Runtime deployments may encounter "Session not found" errors with `google-cloud-aiplatform` versions <= 1.128.0. In interactive mode (`--interactive`), the command checks the SDK version from `uv.lock` and offers to upgrade. In programmatic mode, ensure your SDK is up to date before registering.
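In programmatic mode the version check is manual. A minimal sketch of the comparison, assuming plain `major.minor.patch` version strings; `is_affected` and `parse_version` are hypothetical helpers, not part of agents-cli:

```python
import re

LAST_AFFECTED = (1, 128, 0)  # versions at or below this may hit "Session not found"


def parse_version(version):
    """Turn '1.128.0' into (1, 128, 0), padding short versions with zeros."""
    numbers = [int(n) for n in re.findall(r"\d+", version)[:3]]
    return tuple(numbers + [0] * (3 - len(numbers)))


def is_affected(version):
    """True if this google-cloud-aiplatform version falls in the affected range."""
    return parse_version(version) <= LAST_AFFECTED


print(is_affected("1.128.0"))  # True
print(is_affected("1.129.0"))  # False
```

The installed version can be obtained with `importlib.metadata.version("google-cloud-aiplatform")` from the standard library and passed to `is_affected`.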
---
## Troubleshooting
| Issue | Solution |
|-------|----------|
| "Session not found" after registration | SDK version issue — upgrade `google-cloud-aiplatform` (see SDK Compatibility above), redeploy, then re-register |
| `--registration-type is required` | Non-interactive mode needs `--registration-type` when no `deployment_metadata.json` exists |
| "Gemini Enterprise App ID is required" | Provide `--gemini-enterprise-app-id` or set the `ID` / `GEMINI_ENTERPRISE_APP_ID` env var |
| "Agent already registered" | The command automatically updates the existing registration — this is not an error |
| HTTP 403 on registration | Check that your account has Discovery Engine Editor permissions on the Gemini Enterprise project |
| "Could not fetch agent card" | Verify the agent is running and the URL is correct; for Cloud Run, ensure `gcloud auth login` is done |
---
## Related Skills
- `/google-agents-cli-deploy` — Deployment targets, CI/CD pipelines, and production workflows
- `/google-agents-cli-workflow` — Development workflow, coding guidelines, and operational rules
- `/google-agents-cli-scaffold` — Project creation and enhancement with `agents-cli scaffold create` / `scaffold enhance`
google-agents-cli-scaffold
Create new ADK projects and add CI/CD, deployment, or upgrades to existing ones.
# ADK Project Scaffolding Guide
> **Requires:** `agents-cli` (`uv tool install google-agents-cli`) — [install uv](https://docs.astral.sh/uv/getting-started/installation/index.md) first if needed.
Use the `agents-cli` CLI to create new ADK agent projects or enhance existing ones with deployment, CI/CD, and infrastructure scaffolding.
---
## Prerequisite: Clarify Requirements (MANDATORY for new projects)
**Before scaffolding a new project, load `/google-agents-cli-workflow` and complete Phase 0** — clarify the user's requirements before running any `scaffold create` command. Ask what the agent should do, what tools/APIs it needs, and whether they want a prototype or full deployment.
---
## Step 1: Choose Architecture
**Mapping user choices to CLI flags:**
| Choice | CLI flag |
|--------|----------|
| RAG with vector search | `--agent agentic_rag --datastore agent_platform_vector_search` |
| RAG with document search | `--agent agentic_rag --datastore agent_platform_search` |
| A2A protocol | `--agent adk_a2a` |
| Prototype (no deployment) | `--prototype` |
| Deployment target | `--deployment-target <agent_runtime\|cloud_run\|gke>` |
| CI/CD runner | `--cicd-runner <github_actions\|cloud_build>` |
| Session storage | `--session-type <in_memory\|cloud_sql\|agent_platform_sessions>` |
### Product name mapping
The platform formerly known as "Vertex AI" is now **Gemini Enterprise Agent Platform** (short: **Agent Platform**). Users may refer to products by different names. Map them to the correct CLI values:
| User may say | CLI value |
|-------------|-----------|
| Agent Engine, Vertex AI Agent Engine, Agent Runtime | `--deployment-target agent_runtime` |
| Vertex AI Search, Agent Search | `--datastore agent_platform_search` |
| Vertex AI Vector Search, Vector Search | `--datastore agent_platform_vector_search` |
| Agent Engine sessions, Agent Platform Sessions | `--session-type agent_platform_sessions` |
The `vertexai` Python SDK package name is unchanged.
---
## Step 2: Create or Enhance the Project
### Create a New Project
```bash
agents-cli scaffold create <project-name> \
--agent <template> \
--deployment-target <target> \
--region <region> \
--prototype
```
**Constraints:**
- Project name must be **26 characters or less**, lowercase letters, numbers, and hyphens only.
- Do NOT `mkdir` the project directory before running `create` — the CLI creates it automatically. If you mkdir first, `create` will fail or behave unexpectedly.
- Auto-detect the guidance filename based on the IDE you are running in and pass `--agent-guidance-filename` accordingly (`GEMINI.md` for Gemini CLI, `CLAUDE.md` for Claude Code, `AGENTS.md` for OpenAI Codex/other).
- When enhancing an existing project, check where the agent code lives. If it's not in `app/`, pass `--agent-directory <dir>` (e.g. `--agent-directory agent`). Getting this wrong causes enhance to miss or misplace files.
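The naming constraint can be checked before invoking the CLI. This sketch encodes only the rule stated above (at most 26 characters; lowercase letters, digits, and hyphens); `is_valid_project_name` is a hypothetical helper, and the CLI may enforce additional rules.

```python
import re

# 1 to 26 characters drawn from lowercase letters, digits, and hyphens.
NAME_PATTERN = re.compile(r"[a-z0-9-]{1,26}")


def is_valid_project_name(name):
    """Return True if the name satisfies the documented scaffold naming rule."""
    return NAME_PATTERN.fullmatch(name) is not None


print(is_valid_project_name("my-agent"))   # True
print(is_valid_project_name("My_Agent"))   # False
print(is_valid_project_name("a" * 27))     # False
```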
### Reference Files
| File | Contents |
|------|----------|
| `references/flags.md` | Full flag reference for `create` and `enhance` commands |
### Enhance an Existing Project
```bash
agents-cli scaffold enhance . --deployment-target <target>
agents-cli scaffold enhance . --cicd-runner <runner>
```
Run this from inside the project directory (or pass the path instead of `.`).
### Upgrade a Project
Upgrade an existing project to a newer agents-cli version, intelligently applying updates while preserving your customizations:
```bash
agents-cli scaffold upgrade # Upgrade current directory
agents-cli scaffold upgrade <project-path> # Upgrade specific project
agents-cli scaffold upgrade --dry-run # Preview changes without applying
agents-cli scaffold upgrade --auto-approve # Auto-apply non-conflicting changes
```
### Execution Modes
The CLI defaults to **strict programmatic mode** — all required params must be supplied as CLI flags or a `UsageError` is raised. No approval flags needed. Pass all required params explicitly.
### Common Workflows
**Always ask the user before running these commands.** Present the options (CI/CD runner, deployment target, etc.) and confirm before executing.
```bash
# Add deployment to an existing prototype (strict programmatic)
agents-cli scaffold enhance . --deployment-target agent_runtime
# Add CI/CD pipeline (ask: GitHub Actions or Cloud Build?)
agents-cli scaffold enhance . --cicd-runner github_actions
```
---
## Template Options
| Template | Deployment | Description |
|----------|------------|-------------|
| `adk` | Agent Runtime, Cloud Run, GKE | Standard ADK agent (default) |
| `adk_a2a` | Agent Runtime, Cloud Run, GKE | Agent-to-agent coordination (A2A protocol) |
| `agentic_rag` | Agent Runtime, Cloud Run, GKE | RAG with data ingestion pipeline |
---
## Deployment Options
| Target | Description |
|--------|-------------|
| `agent_runtime` | Managed by Google (Vertex AI Agent Runtime). Sessions handled automatically. |
| `cloud_run` | Container-based deployment. More control, requires Dockerfile. |
| `gke` | Container-based on GKE Autopilot. Full Kubernetes control. |
| `none` | No deployment scaffolding. Code only. |
### "Prototype First" Pattern (Recommended)
Start with `--prototype` to skip CI/CD and Terraform. Focus on getting the agent working first, then add deployment later with `scaffold enhance`:
```bash
# Step 1: Create a prototype
agents-cli scaffold create my-agent --agent adk --prototype
# Step 2: Iterate on the agent code...
# Step 3: Add deployment when ready
agents-cli scaffold enhance . --deployment-target agent_runtime
```
### Agent Runtime and session_type
When using `agent_runtime` as the deployment target, Agent Runtime manages sessions internally. If your code sets a `session_type`, clear it — Agent Runtime overrides it.
---
## Step 3: Load Dev Workflow
After scaffolding, save `DESIGN_SPEC.md` to the project root if it isn't there already.
**Then immediately load `/google-agents-cli-workflow`** — it contains the development workflow, coding guidelines, and operational rules you must follow when implementing the agent.
**Key files to customize:** `app/agent.py` (instruction, tools, model), `app/tools.py` (custom tool functions), `.env` (project ID, location, API keys).
**Files to preserve:** `pyproject.toml` `[tool.agents-cli]` section (CLI reads this), deployment configs under `deployment/`, `Makefile`, `app/__init__.py` (the `App(name=...)` must match the directory name — default `app`).
**RAG projects (`agentic_rag`) — provision datastore first:**
Before running `agents-cli playground` or testing your RAG agent, you must provision the datastore and ingest data:
```bash
agents-cli infra datastore # Provision datastore infrastructure
agents-cli data-ingestion # Ingest data into the datastore
```
Use `infra datastore` — **not** `infra single-project`. Both provision the datastore, but `infra datastore` is faster because it skips unrelated Terraform. Without this step, the agent won't have data to search over.
> **Vector Search region:** `vector_search_location` defaults to `us-central1`, separate from `region` (`us-east1`). It sets both the Vector Search collection region and the BQ ingestion dataset region, kept colocated to avoid cross-region data movement. Override per-invocation with `agents-cli data-ingestion --vector-search-location <region>`.
**Verifying your agent works:** Use `agents-cli run "test prompt"` for quick smoke tests, then `agents-cli eval run` for systematic validation. Do NOT write pytest tests that assert on LLM response content — that belongs in eval.
---
## Scaffold as Reference
When you need specific files (Terraform, CI/CD workflows, Dockerfile) but don't want to scaffold the current project directly, create a temporary reference project in `/tmp/`:
```bash
agents-cli scaffold create /tmp/ref-project \
--agent adk \
--deployment-target cloud_run
```
Inspect the generated files, adapt what you need, and copy into the actual project. Delete the reference project when done.
This is useful for:
- Non-standard project structures that `enhance` can't handle
- Cherry-picking specific infrastructure files
- Understanding what the CLI generates before committing to it
---
## Critical Rules
- **NEVER skip requirements clarification** — load `/google-agents-cli-workflow` Phase 0 and clarify the user's intent before running `scaffold create`
- **NEVER change the model** in existing code unless explicitly asked
- **NEVER `mkdir` before `create`** — the CLI creates the directory; pre-creating it causes enhance mode instead of create mode
- **NEVER create a Git repo or push to remote without asking** — confirm repo name, public vs private, and whether the user wants it created at all
- **Always ask before choosing CI/CD runner** — present GitHub Actions and Cloud Build as options, don't default silently
- **Agent Runtime clears session_type** — if deploying to `agent_runtime`, remove any `session_type` setting from your code
- **Start with `--prototype`** for quick iteration — add deployment later with `enhance`
- **Project names** must be ≤26 characters, lowercase, letters/numbers/hyphens only
- **NEVER write A2A code from scratch** — the A2A Python API surface (import paths, `AgentCard` schema, `to_a2a()` signature) is non-trivial and changes across versions. Always use `--agent adk_a2a` to scaffold A2A projects.
---
# Examples
Using scaffold as reference:
User says: "I need a Dockerfile for my non-standard project"
Actions:
1. Create temp project: `agents-cli scaffold create /tmp/ref --agent adk --deployment-target cloud_run`
2. Copy relevant files (Dockerfile, etc.) from /tmp/ref
3. Delete temp project
Result: Infrastructure files adapted to the actual project
---
A2A project:
User says: "Build me a Python agent that exposes A2A and deploys to Cloud Run"
Actions:
1. Follow the standard flow (understand requirements, choose architecture, scaffold)
2. `agents-cli scaffold create my-a2a-agent --agent adk_a2a --deployment-target cloud_run --prototype`
Result: Valid A2A imports and Dockerfile — no manual A2A code written.
---
## Troubleshooting
### `agents-cli` command not found
See `/google-agents-cli-workflow` → **Setup** section.
---
## Related Skills
- `/google-agents-cli-workflow` — Development workflow, coding guidelines, and the build-evaluate-deploy lifecycle
- `/google-agents-cli-adk-code` — ADK Python API quick reference for writing agent code
- `/google-agents-cli-deploy` — Deployment targets, CI/CD pipelines, and production workflows
- `/google-agents-cli-eval` — Evaluation methodology, evalset schema, and the eval-fix loop
google-agents-cli-workflow
Always-on entrypoint for ADK agent development — covers the full lifecycle (scaffold → build → evaluate → deploy → publish → observe).
# ADK Development Workflow & Guidelines
> **STOP — Do NOT write code yet.** If no project exists, scaffold first with `agents-cli scaffold create <name>`. If the user already has code, use `agents-cli scaffold enhance .` to add the agents-cli structure. Run `agents-cli info` to check if a project already exists. Skipping this leads to missing eval boilerplate, CI/CD config, and project conventions.
**agents-cli** is a CLI and skills toolkit for building, evaluating, and deploying agents on Google Cloud using the [Agent Development Kit (ADK)](https://adk.dev/). It works with any coding agent — Gemini CLI, Claude Code, Codex, or others. Install with `uvx google-agents-cli setup`.
> Requires: google-agents-cli ~= 0.1.3
> If version is behind, run: uv tool install "google-agents-cli~=0.1.3"
> Check version: agents-cli info
> [Install uv](https://docs.astral.sh/uv/getting-started/installation/index.md) first if needed.
## Session Continuity & Skill Cross-References
Re-read the relevant skill **before** each phase — not after you've already started and hit a problem. Context compaction may have dropped earlier skill content. If skills are not available, run `uvx google-agents-cli setup` to install them.
| Phase | Skill | When to load |
|-------|-------|--------------|
| 0 — Understand | — | No skill needed — read `DESIGN_SPEC.md` or clarify goals with the user |
| 1 — Study samples | — | Check Notable Samples table below — clone and study matching samples before scaffolding |
| 2 — Scaffold | `/google-agents-cli-scaffold` | Before creating or enhancing a project |
| 3 — Build | `/google-agents-cli-adk-code` | Before writing agent code — API patterns, tools, callbacks, state |
| 4 — Evaluate | `/google-agents-cli-eval` | Before running any eval — evalset schema, metrics, eval-fix loop |
| 5 — Deploy | `/google-agents-cli-deploy` | Before deploying — target selection, troubleshooting 403/timeouts |
| 6 — Publish | `/google-agents-cli-publish` | After deploying, if registering with Gemini Enterprise (optional) |
| 7 — Observe | `/google-agents-cli-observability` | After deploying — traces, logging, monitoring setup |
---
## Setup
If `agents-cli` is not installed:
```bash
uv tool install google-agents-cli
```
### `uv` command not found
Install `uv` following the [official installation guide](https://docs.astral.sh/uv/getting-started/installation/index.md).
### Product name mapping
The platform formerly known as "Vertex AI" is now **Gemini Enterprise Agent Platform** (short: **Agent Platform**). Users may refer to products by different names. Map them to the correct CLI values:
| User may say | CLI value |
|-------------|-----------|
| Agent Engine, Vertex AI Agent Engine, Agent Runtime | `--deployment-target agent_runtime` |
| Vertex AI Search, Agent Search | `--datastore agent_platform_search` |
| Vertex AI Vector Search, Vector Search | `--datastore agent_platform_vector_search` |
| Agent Engine sessions, Agent Platform Sessions | `--session-type agent_platform_sessions` |
The `vertexai` Python SDK package name is unchanged.
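The mapping table above can be expressed as a simple lookup. This is a minimal sketch for illustration: the dictionary and the `cli_value_for` helper are hypothetical names, and only the phrases and CLI values come from the table.

```python
from typing import Optional

# Copied from the table above: lowercased user phrasing -> CLI flag and value.
PRODUCT_NAME_TO_CLI = {
    "agent engine": "--deployment-target agent_runtime",
    "vertex ai agent engine": "--deployment-target agent_runtime",
    "agent runtime": "--deployment-target agent_runtime",
    "vertex ai search": "--datastore agent_platform_search",
    "agent search": "--datastore agent_platform_search",
    "vertex ai vector search": "--datastore agent_platform_vector_search",
    "vector search": "--datastore agent_platform_vector_search",
    "agent engine sessions": "--session-type agent_platform_sessions",
    "agent platform sessions": "--session-type agent_platform_sessions",
}


def cli_value_for(user_phrase: str) -> Optional[str]:
    """Return the CLI flag for a product name, or None if it is not in the table."""
    key = " ".join(user_phrase.lower().split())  # normalize case and whitespace
    return PRODUCT_NAME_TO_CLI.get(key)


print(cli_value_for("Vertex AI Search"))  # --datastore agent_platform_search
```

A `None` result means the phrase is not covered by the table, which is a cue to ask the user rather than guess.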
---
## Phase 0: Understand
Before writing or scaffolding anything, understand what you're building.
If `DESIGN_SPEC.md` already exists, read it — it is your primary source of truth. Otherwise:
Do NOT proceed to planning, scaffolding, or coding. Ask the user the questions below and wait for their answers. You MUST have the user's answers before moving on. Do not assume, research, or fill in the blanks yourself. The user's intent drives everything — skipping this step leads to wasted work.
**Always ask:**
1. **What problem will the agent solve?** — Core purpose and capabilities
2. **External APIs or data sources needed?** — Tools, integrations, auth requirements
3. **Safety constraints?** — What the agent must NOT do, guardrails
4. **Deployment preference?** — Prototype first (recommended) or full deployment? If deploying: Agent Runtime, Cloud Run, or GKE?
**Ask based on context:**
- If **retrieval or search over data** mentioned (RAG, semantic search, vector search, embeddings, similarity search, data ingestion) → **Datastore?** Options: `agent_platform_vector_search` (embeddings, similarity search) or `agent_platform_search` (document search, search engine).
- If agent should be **available to other agents** → **A2A protocol?** Enables the agent as an A2A-compatible service.
- If **full deployment** chosen → **CI/CD runner?** GitHub Actions (default) or Google Cloud Build?
- If agent should **remember user preferences or facts across sessions** → **Memory Bank?** Long-term memory across conversations. See `/google-agents-cli-adk-code`.
- If **Cloud Run** or **GKE** chosen → **Session storage?** In-memory (default), Cloud SQL (persistent), or Agent Platform Sessions (managed).
- If **deployment with CI/CD** chosen → **Git repository?** Does one already exist, or should one be created? If creating, public or private?
Once you have the user's answers, write a `DESIGN_SPEC.md` with the user's approval. See `/google-agents-cli-scaffold` for how these choices map to CLI flags. At minimum include these sections — expand with more detail if the user wants a thorough spec:
```markdown
# DESIGN_SPEC.md
## Overview
Describe the agent's purpose and how it works.
## Example Use Cases
Concrete examples with expected inputs and outputs.
## Tools Required
Each tool with its purpose, API details, and authentication needs.
## Constraints & Safety Rules
Specific rules — not just generic statements.
## Success Criteria
Measurable outcomes for evaluation.
## Reference Samples
Check the Notable Samples in Phase 1 — list any that match this use case.
```
Optional sections for more detailed specs: **Edge Cases to Handle**, **Architecture & Sub-Agents**, **Data Sources & Auth**, **Non-Functional Requirements**.
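To confirm a draft spec covers the minimum sections from the template above, a small check along these lines may help. It is a sketch, not part of agents-cli: `missing_sections` is a hypothetical helper, and it assumes each section is written as a `## ` heading exactly as in the template.

```python
REQUIRED_SECTIONS = [
    "Overview",
    "Example Use Cases",
    "Tools Required",
    "Constraints & Safety Rules",
    "Success Criteria",
    "Reference Samples",
]


def missing_sections(spec_text: str) -> list:
    """Return the required section names that have no matching '## ' heading."""
    headings = {
        line[3:].strip()
        for line in spec_text.splitlines()
        if line.startswith("## ")
    }
    return [name for name in REQUIRED_SECTIONS if name not in headings]


draft = "# DESIGN_SPEC.md\n## Overview\nA recipe agent.\n## Success Criteria\nPasses evals.\n"
print(missing_sections(draft))
```

An empty list means every minimum section is present; the optional sections are deliberately not checked.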
Once you have a clear understanding, proceed to **Phase 1**.
## Phase 1: Study Reference Samples
Ask yourself: is there a sample that can help me design this and cut time? Scan the keywords below. Multiple samples can match — clone and study all that are relevant.
```bash
# Clone a sample to study — read the key files, understand the patterns, then apply
# them to your own scaffolded project. Do NOT use `adk@<sample>` scaffolding.
git clone --filter=tree:0 --sparse https://github.com/google/adk-samples /tmp/adk-samples 2>/dev/null; \
cd /tmp/adk-samples && git sparse-checkout add python/agents/<sample-name>
```
- **`ambient-expense-agent`** — Agent that runs on a schedule or reacts to events, with no interactive user.
Keywords: scheduled, cron, daily, pubsub, event-driven, alerts, email, ambient
Key files: `expense_agent/fast_api_app.py`, `expense_agent/agent.py`, `expense_agent/config.py`, `terraform/`
- **`adk-ae-oauth`** — Agent with OAuth 2.0 user consent, deployed to Agent Runtime with Gemini Enterprise.
Keywords: OAuth, authentication, user consent, Google Drive, Agent Runtime, Gemini Enterprise
Key files: `README.md`, `adk_ae_oauth/tools.py`, `adk_ae_oauth/auths.py`
- **`genmedia-for-commerce`** — Full-stack agent with React UI, MCP tools, media/image handling, and Gemini Enterprise registration.
Keywords: MCP, media, video generation, Veo, virtual try-on, retail, full-stack, React, Gemini Enterprise
Key files: `genmedia4commerce/agent.py`, `genmedia4commerce/agent_utils.py`, `genmedia4commerce/fast_api_app.py`
- **`deep-search`** — Research agent that iterates until quality is met, with source citations.
Keywords: research, citations, iterative, grounding, multi-agent, human-in-the-loop, web search, report
Key files: `app/agent.py`, `app/config.py`
- **`safety-plugins`** — Reusable safety guardrails that plug into any agent runner.
Keywords: safety, guardrails, model armor, filters
Key files: `safety_plugins/plugins/model_armor.py`, `safety_plugins/plugins/agent_as_a_judge.py`, `safety_plugins/main.py`
- **`data-science`** — Agent that executes code in a managed sandbox for data analysis.
Keywords: SQL, BigQuery, code execution, sandbox
Key files: `data_science/sub_agents/analytics/agent.py`
- **`memory-bank`** — Conversational agent with cross-session memory via Memory Bank (Cloud Run and Agent Runtime).
Keywords: memory, cross-session, recall, context, remember, Memory Bank
Key files: `app/agent.py`, `app/agent_runtime_app.py`, `app/fast_api_app.py`
If no sample matches, proceed to Phase 2. But first — are you sure? Re-read the user's request and compare it against the keywords above. Skipping a matching sample means rebuilding patterns that already exist.
> **IMPORTANT — Exit criteria:** After studying a sample, ask yourself: can I apply anything from this sample to help me deliver the design? Note what you'll reuse before moving on. Do NOT proceed until you've answered this.
> **This list is useful at any phase** — revisit it when you hit deployment, publishing, or infrastructure questions. A sample's Terraform or registration pattern may be exactly what you need later.
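The keyword scan described in this phase can be sketched as a whole-word match against the user's request. The keywords below are copied from the sample list in this phase; the `matching_samples` helper is a hypothetical illustration, and a plain keyword match is only a first pass that does not replace reading the request carefully.

```python
import re

# Keywords copied from the sample list in this phase, lowercased.
SAMPLE_KEYWORDS = {
    "ambient-expense-agent": ["scheduled", "cron", "daily", "pubsub",
                              "event-driven", "alerts", "email", "ambient"],
    "adk-ae-oauth": ["oauth", "authentication", "user consent", "google drive",
                     "agent runtime", "gemini enterprise"],
    "genmedia-for-commerce": ["mcp", "media", "video generation", "veo",
                              "virtual try-on", "retail", "full-stack", "react",
                              "gemini enterprise"],
    "deep-search": ["research", "citations", "iterative", "grounding",
                    "multi-agent", "human-in-the-loop", "web search", "report"],
    "safety-plugins": ["safety", "guardrails", "model armor", "filters"],
    "data-science": ["sql", "bigquery", "code execution", "sandbox"],
    "memory-bank": ["memory", "cross-session", "recall", "context",
                    "remember", "memory bank"],
}


def matching_samples(request: str) -> dict:
    """Map each sample name to the keywords found as whole words in the request."""
    text = request.lower()
    matches = {}
    for sample, keywords in SAMPLE_KEYWORDS.items():
        hits = [kw for kw in keywords
                if re.search(r"\b" + re.escape(kw) + r"\b", text)]
        if hits:
            matches[sample] = hits
    return matches


print(matching_samples("Send a daily email digest of BigQuery SQL results"))
```

For this request the sketch flags `ambient-expense-agent` and `data-science`, so both samples would be worth cloning and studying.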
## Phase 2: Scaffold (if needed)
Use `/google-agents-cli-scaffold` to create a new project or import an existing one into the agents-cli format (adding deployment, CI/CD, infrastructure). It covers architecture choices (deployment target, agent type, session storage) and project creation or enhancement.
Skip this phase if the project was already created or enhanced by agents-cli — run `agents-cli info` from the project root to check.
## Phase 3: Build and Implement
Implement the agent logic:
1. Write/modify code in the agent directory (check `GEMINI.md` / `CLAUDE.md` for directory name)
2. **Quick smoke test**: Use `agents-cli run "your prompt"` to verify the agent works after changes — this is the fastest way to check behavior without leaving the terminal
3. Iterate on the implementation based on user feedback
If the user asks for interactive testing, suggest `agents-cli playground` — it opens a web-based playground for manual conversation with the agent.
For ADK API patterns and code examples, use `/google-agents-cli-adk-code`.
> **NEVER write pytest tests that assert on LLM output content** (e.g., checking for keywords in responses, verifying persona, validating tone). LLM outputs are non-deterministic — these tests are flaky by nature and belong in eval, not pytest. Use `agents-cli run` for quick checks and `agents-cli eval run` for systematic validation.
## Phase 3.5: Provision Datastore (RAG projects only)
For `agentic_rag` projects, provision the datastore before testing: run `agents-cli infra datastore`, then `agents-cli data-ingestion`. Use `infra datastore`, **not** `infra single-project`: it provisions the same datastore but is faster because it skips unrelated Terraform.
## Phase 4: Evaluate
**This is the most important phase.** Evaluation validates agent behavior end-to-end.
**MANDATORY:** Activate `/google-agents-cli-eval` before running evaluation.
It contains the evalset schema, config format, and critical gotchas. Do NOT skip this.
**Do NOT skip this phase.** After building the agent, you MUST proceed to evaluation. Do NOT write pytest tests to validate agent behavior — that is what eval is for.
**`uv run pytest` vs `agents-cli eval run` — know the difference:**
- **`uv run pytest`** — Tests *code correctness*: imports work, functions return expected types, API contracts hold. Does NOT test whether the agent behaves well.
- **`agents-cli eval run`** — Tests *agent behavior*: response quality, tool usage, persona consistency, safety compliance. This is what validates your agent actually works.
- **`agents-cli run "prompt"`** — Quick one-off smoke test during development. If testing multiple prompts, use the `--start-server` option to persist the local server, which reduces overhead for repeated calls and allows resuming local sessions via `--session-id`. Use this for fast iteration, not pytest.
**NEVER write pytest tests that check LLM response content** (e.g., asserting pirate keywords appear, checking if the agent mentions allergies). LLM outputs are non-deterministic. Use eval with LLM-as-judge criteria instead.
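For contrast, here is the kind of test that does belong in pytest: it checks a tool function's deterministic contract and never calls the model. The `get_weather` function is a stand-in modeled on the cheatsheet's Basic Tool example; the test names are illustrative.

```python
def get_weather(city: str) -> dict:
    """Stand-in tool: returns canned weather data for a city."""
    return {"city": city, "temp": "22°C", "condition": "sunny"}


def test_get_weather_returns_expected_keys():
    # Contract check: the tool returns a dict with a fixed set of keys.
    result = get_weather("Paris")
    assert isinstance(result, dict)
    assert set(result) == {"city", "temp", "condition"}


def test_get_weather_echoes_city():
    # Deterministic behavior: the requested city is passed through unchanged.
    assert get_weather("Paris")["city"] == "Paris"
```

Both tests give the same result on every run because no LLM output is involved. Whether the agent calls the tool at the right moment, or phrases its answer well, is a question for `agents-cli eval run`.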
1. **Start small**: Begin with 1-2 sample eval cases, not a full suite
2. Run evaluations: `agents-cli eval run`
3. Discuss results with the user
4. Fix issues and iterate on the core cases first
5. Only after core cases pass, add edge cases and new scenarios
6. Repeat until quality thresholds are met
**Expect 5-10+ iterations here.**
## Phase 5: Deploy
Once evaluation thresholds are met:
1. Check if the project has a deployment target configured — run `agents-cli info` to see current config
2. If the project is a prototype (no deployment target), add deployment support first:
```bash
agents-cli scaffold enhance . --deployment-target <target>
```
See `/google-agents-cli-deploy` for the deployment target decision matrix (Agent Runtime vs Cloud Run vs GKE).
3. Deploy when ready: `agents-cli deploy`
**IMPORTANT**: Never deploy without explicit human approval.
## Phase 6: Publish (optional)
Not all agents require this phase. Publishing currently supports Gemini Enterprise. See `/google-agents-cli-publish` for registration modes, flags, and troubleshooting.
## Phase 7: Observe
After deploying, use observability tools to monitor agent behavior in production. See `/google-agents-cli-observability` for Cloud Trace, prompt-response logging, BigQuery Analytics, and third-party integrations.
---
# Operational Guidelines for Coding Agents
## Common Shortcuts to Resist
Agents routinely skip steps with plausible-sounding excuses. Recognize these and push back:
| Shortcut | Why it fails |
|----------|-------------|
| "The user's request is clear enough, no need to clarify" | You're guessing at requirements. Phase 0 exists to confirm intent before scaffolding — even one question can prevent a full rework. |
| "The agent responded correctly in `agents-cli run`, so eval isn't needed" | One prompt is not a test suite. Eval catches regressions, edge cases, and tool trajectory issues that a single run never will. |
| "I'll use a newer/better model" | The scaffolded model was chosen deliberately. Changing it without being asked violates code preservation (Principle 1) and often breaks things — wrong location, deprecated version, or 404. Your training data is likely out of date — rely on the skills and the model listing command, not your knowledge of model names. |
| "I can skip the scaffold and set up manually" | Manual setup misses eval boilerplate, CI/CD config, and `pyproject.toml` conventions. Use `agents-cli scaffold create <name>` even for quick experiments. |
## Principle 1: Code Preservation & Isolation
Code modifications require surgical precision — alter only the code segments directly targeted by the user's request and strictly preserve all surrounding and unrelated code.
**Mandatory Pre-Execution Verification:**
Before finalizing any code replacement, verify the following:
1. **Target Identification:** Clearly define the exact lines or expressions to change, based *solely* on the user's explicit instructions.
2. **Preservation Check:** Confirm that all code, configuration values (e.g., `model`, `version`, `api_key`), comments, and formatting *outside* the identified target remain identical.
**Example:**
- **User Request:** "Change the agent's instruction to be a recipe suggester."
- **Incorrect (VIOLATION):**
```python
root_agent = Agent(
name="recipe_suggester",
model="gemini-1.5-flash", # UNINTENDED - model was not requested to change
instruction="You are a recipe suggester."
)
```
- **Correct (COMPLIANT):**
```python
root_agent = Agent(
name="recipe_suggester", # OK, related to new purpose
model="gemini-flash-latest", # PRESERVED
instruction="You are a recipe suggester." # OK, the direct target
)
```
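The preservation check can be partly automated. The sketch below is one possible approach, not an agents-cli feature: it uses Python's `ast` module to compare the literal keyword arguments of the first `Agent(...)` call before and after an edit. The helpers `agent_kwargs` and `changed_outside_target` are hypothetical names, and non-literal arguments such as `tools=[...]` are ignored.

```python
import ast


def agent_kwargs(source: str) -> dict:
    """Return literal keyword arguments of the first `Agent(...)` call found."""
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "Agent"
        ):
            return {
                kw.arg: kw.value.value
                for kw in node.keywords
                if kw.arg is not None and isinstance(kw.value, ast.Constant)
            }
    return {}


def changed_outside_target(before: str, after: str, allowed: set) -> set:
    """Names of keyword arguments that changed but were not in `allowed`."""
    old, new = agent_kwargs(before), agent_kwargs(after)
    changed = {k for k in old.keys() | new.keys() if old.get(k) != new.get(k)}
    return changed - allowed


before = 'root_agent = Agent(name="my_agent", model="gemini-flash-latest", instruction="Help.")'
after = 'root_agent = Agent(name="recipe_suggester", model="gemini-1.5-flash", instruction="You are a recipe suggester.")'
print(changed_outside_target(before, after, {"name", "instruction"}))  # {'model'}
```

A non-empty result names the arguments that changed without being requested, which is exactly the violation shown in the incorrect example above.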
## Principle 2: Execution Best Practices
- **Model Selection — CRITICAL:**
- **NEVER change the model unless explicitly asked.**
- When creating NEW agents (not modifying existing), use the latest Gemini model. List available models to pick the newest one:
```bash
# Use 'global' or any supported region (e.g. 'us-east1')
uv run --with google-genai python -c "
from google import genai
client = genai.Client(vertexai=True, location='global')
for m in client.models.list(): print(m.name)
"
```
- Do NOT use older models unless explicitly requested. For model docs, fetch `https://adk.dev/agents/models/google-gemini/index.md`. See also [stable model versions](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versions).
- **Running Python Commands:**
- Always use `uv` to execute Python commands (e.g., `uv run python script.py`)
- Run `uv sync` before executing scripts
- **Breaking Infinite Loops:**
- **Stop immediately** if you see the same error 3+ times in a row
- **RED FLAGS**: Lock IDs incrementing, names appending v5→v6→v7, "I'll try one more time" repeatedly
- **State conflicts** (Error 409): Use `terraform import` instead of retrying creation
- **When stuck**: Run underlying commands directly (e.g., `terraform` CLI)
- **Troubleshooting:**
- Check `/google-agents-cli-adk-code` first — it covers most common patterns
- Use WebFetch on URLs from the ADK docs index (`curl https://adk.dev/llms.txt`) for deep dives
- When encountering persistent errors, a targeted web search often finds solutions faster
- **CLI command failures:** run `agents-cli <command> --help` — the output ends with a `Source:` line pointing to the exact source file implementing that command. Read it to understand the logic and diagnose failures. Use `agents-cli info` to get the full CLI install path if you need to browse across multiple files.
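The "same error 3+ times in a row" rule from the list above can be sketched as a small guard. This is an illustrative helper, not part of agents-cli. It assumes that replacing digit runs is a reasonable way to treat incrementing lock IDs or `v5`, `v6`, `v7` suffixes as the same error.

```python
import re


def normalize(message: str) -> str:
    """Replace digit runs so 'lock 41' and 'lock 42' compare as equal."""
    return re.sub(r"\d+", "#", message.strip())


def should_stop(errors: list, limit: int = 3) -> bool:
    """True when the last `limit` error messages are the same after normalizing."""
    if len(errors) < limit:
        return False
    tail = [normalize(message) for message in errors[-limit:]]
    return len(set(tail)) == 1


history = [
    "Error acquiring lock 41",
    "Error acquiring lock 42",
    "Error acquiring lock 43",
]
print(should_stop(history))  # True: the same failure with an incrementing ID
```

The digit normalization is a trade-off: it also makes messages that differ only by status code (for example 403 versus 409) look identical, so treat a `True` result as a prompt to stop and investigate, not as a diagnosis.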
### Systematic Debugging
When something breaks, follow this sequence — don't skip steps or shotgun fixes:
1. **Reproduce** — Run the exact command that failed. Save the full error output. If you can't reproduce it, you can't fix it.
2. **Localize** — Narrow the cause: is it the agent code, a tool, the config, or the environment? Use `agents-cli run "prompt"` to isolate agent behavior from deployment issues.
3. **Fix one thing** — Change one variable at a time. If you change the instruction AND the tool AND the config simultaneously, you won't know what fixed it (or what broke something else).
4. **Verify** — Rerun the exact reproduction command. Don't assume the fix worked.
5. **Guard** — If it was a non-obvious bug, add an eval case to catch regressions.
**Stop-the-line rule:** If a change breaks something that was working, stop feature work and fix the regression first. Don't push forward hoping to circle back — regressions compound.
- **Environment Variables:**
- `.env` files and env var assignments (e.g., `GOOGLE_CLOUD_PROJECT`, `GOOGLE_CLOUD_LOCATION`) are typically required for the agent to function — never remove or modify them unless the user explicitly asks
- If a `.env` file exists in the project root, treat it as essential configuration
- For secrets and API keys, prefer GCP Secret Manager over plain `.env` entries — see `/google-agents-cli-deploy` for secret management guidance
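A read-only check for the variables named above can catch a missing setting without touching `.env`. This is a minimal sketch: `missing_env_vars` is a hypothetical helper, and the list of required names is an assumption based on the two examples given here, so a real project may need more.

```python
import os

# Assumed minimum, taken from the examples in the guideline above.
REQUIRED_ENV_VARS = ("GOOGLE_CLOUD_PROJECT", "GOOGLE_CLOUD_LOCATION")


def missing_env_vars(environ=None) -> list:
    """Return required variable names that are unset or empty. Never modifies them."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


print(missing_env_vars({"GOOGLE_CLOUD_PROJECT": "demo-project"}))
# ['GOOGLE_CLOUD_LOCATION']
```

The function only reports what is missing. Adding or changing values stays a decision for the user, in line with the rule above.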
---
## Using a Temporary Scaffold as Reference
When you need specific infrastructure files (Terraform, CI/CD, Dockerfile) but don't want to modify the current project, use `/google-agents-cli-scaffold` to create a temporary project in `/tmp/` and copy over what you need.
---
## Reference Files
| File | Contents |
|------|----------|
| `references/internals.md` | Underlying tools and commands that `agents-cli` wraps (adk, pytest, ruff, uvicorn) |
## Development Commands
### Setup & Skills
| Command | Purpose |
|---|---|
| `agents-cli setup` | Install skills to coding agents |
| `agents-cli setup --skip-auth` | Install skills, skip authentication step |
| `agents-cli setup --dry-run` | Preview what setup would do without executing |
| `agents-cli update` | Reinstall/update skills to latest version |
### Scaffolding
| Command | Purpose |
|---|---|
| `agents-cli scaffold create <name>` | Create a new project |
| `agents-cli scaffold enhance .` | Add deployment / CI-CD to project |
| `agents-cli scaffold upgrade` | Upgrade project to newer agents-cli version |
### Development
| Command | Purpose |
|---|---|
| `agents-cli playground` | Interactive local testing (ADK web playground) |
| `agents-cli run "prompt"` | Run agent with a single prompt (non-interactive) |
| `agents-cli lint` | Check code quality |
| `agents-cli lint --fix` | Auto-fix linting issues |
| `agents-cli lint --mypy` | Also run mypy type checking |
| `agents-cli install` | Install project dependencies (uv sync) |
### Evaluation
| Command | Purpose |
|---|---|
| `agents-cli eval run` | Run evaluation against evalsets |
| `agents-cli eval run --evalset F` | Run a specific evalset |
| `agents-cli eval run --all` | Run all evalsets |
| `agents-cli eval compare BASE CAND` | Compare two eval result files |
### Deployment & Infrastructure
| Command | Purpose |
|---|---|
| `agents-cli deploy` | Deploy to dev (requires human approval) |
| `agents-cli infra single-project` | Provision single-project GCP infrastructure without CI/CD (Terraform, optional) |
| `agents-cli infra cicd` | Set up CI/CD pipeline + staging/prod infrastructure |
| `agents-cli publish gemini-enterprise` | Register agent with Gemini Enterprise |
### Project Info
| Command | Purpose |
|---|---|
| `agents-cli info` | Show CLI install path, skills location, and project config |
Use `agents-cli info` to discover the **CLI install path** — this is where the CLI source code lives. Read files under that path to understand CLI internals, command implementations, or template logic. The command only shows project details when run inside a generated agent project (i.e., one with `[tool.agents-cli]` in `pyproject.toml`).
### Authentication
| Command | Purpose |
|---|---|
| `agents-cli login --interactive` | Authenticate with Google for ADK services (`-i` / `--interactive` is required for interactive browser-based authentication) |
| `agents-cli login --status` | Show authentication status |
> [!NOTE]
> When using an API key to authenticate, the `login` command does not persist the key automatically. It only helps you retrieve the key and provides instructions on how to persist it.
---
## Skills Version
> **Troubleshooting hint:** If skills seem outdated or incomplete, reinstall:
> ```bash
> agents-cli setup --skip-auth
> ```
> Only do this when you suspect stale skills are causing problems.
---
## Related Skills
- `/google-agents-cli-scaffold` — Project creation, requirements gathering, and enhancement
- `/google-agents-cli-adk-code` — ADK Python API quick reference and production sample agents
- `/google-agents-cli-eval` — Evaluation methodology, evalset schema, and the eval-fix loop
- `/google-agents-cli-deploy` — Deployment targets, CI/CD pipelines, and production workflows
- `/google-agents-cli-publish` — Gemini Enterprise registration
- `/google-agents-cli-observability` — Cloud Trace, logging, BigQuery Analytics, and third-party integrations