RubixKube

RubixKube Site Reliability Intelligence for modern infrastructure: investigate incidents, review evidence-backed RCAs, and track remediation across Kubernetes, AWS, GCP, Linux VMs, and hybrid platforms.

MCP

rubixkube

MCP server: rubixkube

{
  "type": "http",
  "url": "https://mcp.rubixkube.ai/mcp"
}
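For clients that talk to the server directly rather than through an editor integration, the sketch below shows how a tool call might be wrapped in the JSON-RPC 2.0 envelope that MCP uses for `tools/call`. It only builds the request body and opens no connection. The helper name `build_tool_call` and the request `id` are illustrative, and session setup (the `initialize` handshake, any authentication headers) is assumed to be handled by your MCP client.

```python
import json

MCP_URL = "https://mcp.rubixkube.ai/mcp"  # taken from the server config above


def build_tool_call(name, arguments=None, request_id=1):
    """Build a JSON-RPC 2.0 body for an MCP `tools/call` request.

    Hypothetical helper: a real MCP client library normally does this for you.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


body = build_tool_call("active_issues", {"severity": "critical"})
print(json.dumps(body, indent=2))
```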
Skill

active-issues

Check currently active infra issues

## When to Use This Skill

Load when the user asks:
- "What needs attention?" / "What's broken?"
- "Any critical alerts?" / "Show me high severity issues"
- "What's wrong in namespace/project X?" / "Issues in environment Y?"
- "What are the open issues?"

## Tool: `active_issues`

Returns open issues grouped by severity (critical → high → medium → low). Each result includes an `insight_id` for deep investigation.

Parameters (all optional):
- `cluster_id` (str) — scope to one connected environment. The API name is `cluster_id`, but it can represent Kubernetes, cloud, or VM environments.
- `namespace` (str) — scope to one namespace, cloud project/account namespace, or logical grouping when present
- `severity` (str) — filter to `critical`, `high`, `medium`, or `low`

```
active_issues()
active_issues(severity="critical")
active_issues(cluster_id="prod-cluster", namespace="payments")
active_issues(cluster_id="prod-cluster", severity="high")
```
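Because every parameter is optional and `severity` accepts only four values, it can help to validate arguments before the call. The following is a minimal sketch of that idea; `active_issues_args` is a hypothetical helper, not part of the RubixKube API, and it returns only the arguments that were actually set.

```python
ALLOWED_SEVERITIES = ("critical", "high", "medium", "low")


def active_issues_args(cluster_id=None, namespace=None, severity=None):
    """Return only the arguments that were set, rejecting unknown severities."""
    if severity is not None and severity not in ALLOWED_SEVERITIES:
        raise ValueError(
            f"severity must be one of {ALLOWED_SEVERITIES}, got {severity!r}"
        )
    candidate = {
        "cluster_id": cluster_id,
        "namespace": namespace,
        "severity": severity,
    }
    # Drop unset parameters so the tool applies its own defaults.
    return {key: value for key, value in candidate.items() if value is not None}


print(active_issues_args(cluster_id="prod-cluster", severity="high"))
```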

## Reading the Output

Each issue line shows:
- `[insight_id]`: pass this to `investigate` for the root cause, RCA, and linked actions
- Severity label
- Message (truncated to 70 chars)
- Namespace/project and environment location
- "RCA available" or "No RCA": "RCA available" means a full analysis already exists

## Chaining

- See an issue worth investigating? Pass its `insight_id` to the `investigate` skill.
- Want the full RCA report directly? Use the `rca-report` skill with the `report_id` from the `investigate` output.
- Looking for what to fix? Use the `pending-actions` skill to see linked remediation tasks.

Skill

cluster-health

## When to Use This Skill

Load when the user asks:
- "How's prod doing?" / "Check staging"
- "What's the status of environment X?"
- "Show me the [cluster/project/account] environment"
- "Is [prod/staging/GCP project/AWS account] healthy?"

## Tool: `cluster_health`

Requires `cluster_id`. In RubixKube this identifier may represent a Kubernetes cluster, cloud project/account, VM environment, or other connected environment. If the user says "prod" or "staging" without an ID, call `platform_status` first to get the environment list and IDs.

```
cluster_health(cluster_id="prod-cluster-id")
```

## Getting the cluster_id

If unknown:
1. Call `platform_status()` — returns connected environments with IDs and names
2. Match the user's name ("prod", "staging", project/account name) to an environment
3. Call `cluster_health(cluster_id="<matched id>")`
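The matching step above can be sketched as a small lookup. The shape of the environment list (dicts with `id` and `name` keys) and the sample IDs are assumptions made for illustration; the real `platform_status()` output may be structured differently. The helper refuses to guess when a name matches zero or several environments.

```python
def resolve_cluster_id(user_name, environments):
    """Match a user-supplied name to exactly one environment and return its ID.

    `environments` is an assumed shape: a list of {"id": ..., "name": ...} dicts.
    """
    wanted = user_name.strip().lower()
    exact = [env for env in environments if env["name"].lower() == wanted]
    partial = [env for env in environments if wanted in env["name"].lower()]
    matches = exact or partial  # prefer an exact name match over a substring
    if len(matches) != 1:
        names = [env["name"] for env in matches]
        raise LookupError(
            f"{user_name!r} matched {len(matches)} environments: {names}"
        )
    return matches[0]["id"]


# Illustrative data only; real IDs come from platform_status().
environments = [
    {"id": "env-001", "name": "prod-us-east"},
    {"id": "env-002", "name": "staging"},
]
cluster_id = resolve_cluster_id("prod", environments)
```

Raising on an ambiguous match keeps the caller from running `cluster_health` against the wrong environment; asking the user to disambiguate is the safer follow-up.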

## Reading the Output

The report includes:
- **Cluster metadata** — status, version, type, region, resource totals
- **Nodes** — name, status, roles (up to 10 shown)
- **Resources** — workloads, cloud resources, VMs, services, and active namespaces/projects where available
- **Active issues** — severity breakdown (critical/high/medium/low counts)
- **Recent RCAs** — last 5 RCA reports with title and confidence score

## Chaining

- See active issues in the environment? Use `active_issues(cluster_id=...)` for the full list.
- Want to investigate a specific issue? Use the `investigate` skill with an `insight_id`.
- Recent RCA with a known ID? Use the `rca-report` skill for the complete analysis.

Skill

infra-status

Check infra status

## When to Use This Skill

Load when the user asks:
- "What's going on in prod?" / "How's the platform?"
- "Anything broken?" / "Is everything healthy?"
- "What happened in the last [N] hours?"
- "Post-deploy check" / "Morning standup summary"
- "Give me an overview of the environment"

## Tools

### `platform_status`
Single-shot dashboard — call this first for any overview request.

Returns: connected environments with health status, active issue counts by severity, total RCA reports, open action count, and active namespaces/projects where available.

```
platform_status()
```

No parameters required. Always available.

### `recent_activity`
Timeline of events — use when the user wants to know *what changed*, not just current state.

Parameters:
- `hours` (int, default 2) — lookback window. Increase for "what happened today?" (8h) or "this week" (168h).
- `cluster_id` (str, optional) — scope to a specific connected environment. The API name is `cluster_id`, but RubixKube environments can include Kubernetes clusters, cloud projects/accounts, and VM estates.

```
recent_activity(hours=2)
recent_activity(hours=8, cluster_id="prod-cluster")
```
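One way to pick the lookback window is a small phrase table that falls back to the default of 2 hours. This is a sketch: the table holds only the two examples given in the parameter notes, and `recent_activity_args` is a hypothetical helper name.

```python
LOOKBACK_HOURS = {
    "today": 8,        # "what happened today?"
    "this week": 168,  # 7 days * 24 hours
}
DEFAULT_HOURS = 2


def recent_activity_args(phrase=None, cluster_id=None):
    """Translate a loose time phrase into `recent_activity` arguments."""
    key = (phrase or "").strip().lower()
    args = {"hours": LOOKBACK_HOURS.get(key, DEFAULT_HOURS)}
    if cluster_id is not None:
        args["cluster_id"] = cluster_id
    return args


print(recent_activity_args("this week", cluster_id="prod-cluster"))
```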

## Chaining

- Got active issues from `platform_status`? Use the `active-issues` skill for a filtered list.
- See an environment that looks degraded? Use the `cluster-health` skill with its `cluster_id`.
- User wants to dig into a specific incident? Use the `investigate` skill with an `insight_id`.

Skill

investigate

Investigate an incident

## When to Use This Skill

Load when the user asks:
- "Why is [service/pod/cloud resource/VM/namespace] broken?"
- "Investigate this error / crash / OOM"
- "What's the root cause of [issue]?"
- "Look into insight [ID]"
- "Debug the payment service" / "What's wrong with checkout?"

## Tool: `investigate`

Two modes — prefer `insight_id` when you have it:

### Mode 1: Direct lookup by ID (preferred)
Fetches the insight, its RCA report, and all linked remediation actions in a single call.

```
investigate(insight_id="<id from active_issues output>")
```

### Mode 2: Search by text
Searches insight messages, namespaces, types, and affected resources. Returns matching insights with their IDs.

```
investigate(search="payment service")
investigate(search="OOMKilled")
investigate(search="crashloopbackoff")
```

Use search when the user describes a symptom without an ID. Once you find the relevant insight, call again with its `insight_id` for full details.
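The search-then-lookup flow can be sketched as two calls to the same tool. Everything around the tool name is an assumption here: `call_tool` stands for whatever function your MCP client exposes, `pick` is a caller-supplied chooser, and the search result is assumed to be a list of dicts carrying an `insight_id`. The fake client exists only to show the call order.

```python
def investigate_args(insight_id=None, search=None):
    """Pick the investigate mode, preferring the direct ID lookup."""
    if insight_id:
        return {"insight_id": insight_id}
    if search:
        return {"search": search}
    raise ValueError("provide an insight_id or a search string")


def investigate_symptom(call_tool, symptom, pick):
    """Search by symptom, then fetch full details for the chosen insight."""
    results = call_tool("investigate", investigate_args(search=symptom))
    chosen = pick(results)
    return call_tool(
        "investigate", investigate_args(insight_id=chosen["insight_id"])
    )


# Stand-in for a real MCP client, used only to show the call order.
calls = []


def fake_call_tool(name, args):
    calls.append((name, args))
    if "search" in args:
        return [{"insight_id": "example-id", "message": "OOMKilled in payments"}]
    return {"insight_id": args["insight_id"]}


report = investigate_symptom(
    fake_call_tool, "OOMKilled", pick=lambda results: results[0]
)
```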

## Reading the Output

The investigation report contains:
- **Insight** — severity, status, namespace/project, environment, affected resources, first/last seen
- **Root Cause Analysis** — root cause, contributing factors, impact description, remediation steps, rollback instructions (if RCA exists)
- **Related Actions** — open remediation tasks linked to this issue, with their `action_id`s

If the output shows "Root Cause Analysis: not yet generated", the RCA is still in progress. Check back later, or look at the insight details for initial signals.

## Chaining

- Found a `report_id` in the output? Use the `rca-report` skill for the complete RCA document.
- See action items you want details on? Use the `pending-actions` skill.
- Want environment-wide context? Use the `cluster-health` skill with the `cluster_id` from the insight.

Skill

pending-actions

Tasks remaining to be picked up by owners

## When to Use This Skill

Load when the user asks:
- "What should I work on?" / "What's on my plate?"
- "Show me open action items" / "What's pending?"
- "Actions for environment X" / "High priority items"
- "What's being worked on?" / "Remediation progress"

## Tool: `pending_actions`

Returns open actions grouped by priority (high → medium → low). Each action shows its linked insight or RCA ID so you can trace back to the originating incident.

Parameters (all optional):
- `cluster_id` (str) — scope to one connected environment. The API name is `cluster_id`, but it can represent Kubernetes, cloud, or VM environments.
- `priority` (str) — `high`, `medium`, or `low`
- `status` (str) — filter by status: `todo`, `assigned`, `in_progress`, `waiting`, `review`

```
pending_actions()
pending_actions(priority="high")
pending_actions(cluster_id="prod-cluster")
pending_actions(status="in_progress")
pending_actions(cluster_id="prod-cluster", priority="high")
```

## Reading the Output

Each action shows:
- `[action_id]` — unique identifier
- Title — what needs to be done
- Status — todo / assigned / in_progress / waiting / review
- Cluster (if scoped)
- "From: INSIGHT-[id]" or "From: RCA-[id]" — the originating incident
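The origin marker is enough to decide which tool to call next. The sketch below assumes the marker appears literally as `From: INSIGHT-<id>` or `From: RCA-<id>` and that the part after the prefix is the ID the follow-up tool expects; both points should be checked against real output. `next_call_for` is a hypothetical helper.

```python
import re

# Assumed marker format, based on the output description above.
ORIGIN_PATTERN = re.compile(r"From:\s*(INSIGHT|RCA)-(\S+)")


def next_call_for(action_line):
    """Map an action's origin marker to the follow-up tool call.

    Returns (tool_name, arguments), or None when no origin marker is found.
    """
    match = ORIGIN_PATTERN.search(action_line)
    if match is None:
        return None
    kind, linked_id = match.groups()
    if kind == "INSIGHT":
        return ("investigate", {"insight_id": linked_id})
    return ("rca_details", {"report_id": linked_id})


print(next_call_for("Raise memory limit From: INSIGHT-abc123"))
```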

## Chaining

- Want full context on the linked incident? Use `investigate(insight_id=...)` with the linked insight ID.
- Need the complete RCA before acting? Use the `rca-report` skill with the linked `report_id`.
- Not sure which environment to filter by? Call `platform_status` first to get environment IDs.

Skill

rca-report

Show Root Cause Analysis of an incident

## When to Use This Skill

Load when the user asks:
- "Show me the RCA" / "Give me the full root cause analysis"
- "What was the investigation for [incident]?"
- "Remediation steps for [issue]"
- "Show me the rollback instructions"
- "What evidence did the system find?"

## Tool: `rca_details`

Requires `report_id`. Get it from:
- `investigate(insight_id=...)` output: the RCA section shows the `report_id`
- `active_issues` output: issues marked "RCA available" have a linked report; pass the issue's `insight_id` to `investigate` to get its `report_id`
- `cluster_health` output: the recent RCAs section shows `report_id`s for a connected environment

```
rca_details(report_id="<id>")
```

**Never guess an ID.** Always get `report_id` from a prior tool call.
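One way to enforce this rule in client code is to record every `report_id` that appears in tool output and refuse any other value. The class below is a minimal sketch of that guard; `KnownReportIds` and the placeholder ID are illustrative and not part of the RubixKube API.

```python
class KnownReportIds:
    """Remember report IDs seen in tool output and refuse any others."""

    def __init__(self):
        self._seen = set()

    def record(self, report_id):
        """Call this whenever a prior tool result contains a report_id."""
        self._seen.add(report_id)

    def rca_details_args(self, report_id):
        """Return arguments for rca_details only for a previously seen ID."""
        if report_id not in self._seen:
            raise LookupError(
                f"report_id {report_id!r} was not returned by a prior tool call"
            )
        return {"report_id": report_id}


known = KnownReportIds()
known.record("example-report-id")  # placeholder, not a real ID
args = known.rca_details_args("example-report-id")
```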

## Reading the Output

The full RCA document includes:
- **Metadata** — severity, confidence %, completeness %, environment/cluster, namespace/project, timestamp
- **Root Cause** — the definitive finding
- **Contributing Factors** — secondary causes that enabled or worsened the incident
- **Impact** — what was affected and how severely
- **Affected Services** — list of services involved
- **Timeline** — chronological sequence of events
- **Remediation Steps** — numbered, ordered steps to resolve the issue
- **Rollback** — how to revert if remediation causes new problems
- **Evidence** — data sources and observations that support the analysis
- **Action Items** — recommended follow-up tasks with priority and due dates
- **Full Markdown Report** — complete narrative (if available, first 3000 chars shown)

## Chaining

- Have action items from this RCA? Use the `pending-actions` skill to see their current status.
- Want to investigate a related issue? Use the `investigate` skill with an `insight_id`.

Source: https://github.com/rubixkube-io/rubixkube-for-ai