lsai-xmp4.public

Name: lsai-xmp4.public
Author: cursor.directory
The lsai-xmp4.public plugin for Cursor.
1 Skill
# xmp4 — Code Intelligence for Third-Party Libraries

xmp4 is an MCP server that answers semantic code questions over pre-indexed
OSS libraries. Reach for it **first** when the user wants to understand how
an unfamiliar library is actually used in practice — it is far cheaper and
more focused than grepping source archives or reading READMEs.

The primary value is this: instead of guessing from signatures or docs, you
can instantly find the real tests and call sites that exercise a symbol,
then read them. That is where idiomatic usage patterns actually live.

## Golden path — follow this

Whenever a user question can be answered by reading a library's source,
tests, or call graph, run this 5-step flow. It is the fastest and cheapest
route, and it is what xmp4 is designed around:

```
1. xmp4_projects(language=?, query=?)      # discover the project id
2. xmp4_search(project, query=<leaf>)      # leaf name only (e.g. `send`, NOT `Session.send`)
                                           # → you get file_path + canonical display_name
3. xmp4_info(project, symbol_name, file_path, docs="summary")
4. xmp4_tests_for(project, symbol_name, file_path)
   → xmp4_view(project, file_path=<test file>, from_line=N, to_line=N+80)
5. If you need more context: xmp4_usages / xmp4_callers / xmp4_outline
```
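
The flow above can be sketched as a plan of tool calls. This is a minimal sketch rather than a client implementation: `golden_path_plan` is a hypothetical helper, it only builds `(tool_name, arguments)` pairs, and how those pairs are sent depends on your MCP client. The argument names are the ones shown in the steps above; in practice `file_path` comes from the search result and the test file and line come from the `xmp4_tests_for` result.

```python
from typing import Any

ToolCall = tuple[str, dict[str, Any]]


def golden_path_plan(project: str, leaf: str, file_path: str,
                     test_file: str, test_line: int) -> list[ToolCall]:
    """Return steps 2-4 of the flow as (tool_name, arguments) pairs.

    Hypothetical helper: it assumes the project id is already known (step 1)
    and that file_path / test_file / test_line were read from earlier results.
    """
    return [
        # Step 2: leaf name only, never a dotted name.
        ("xmp4_search", {"project": project, "query": leaf}),
        # Step 3: carry file_path through every follow-up call.
        ("xmp4_info", {"project": project, "symbol_name": leaf,
                       "file_path": file_path, "docs": "summary"}),
        # Step 4: find the tests, then read a tight window of one of them.
        ("xmp4_tests_for", {"project": project, "symbol_name": leaf,
                            "file_path": file_path}),
        ("xmp4_view", {"project": project, "file_path": test_file,
                       "from_line": test_line, "to_line": test_line + 80}),
    ]
```

Keeping the plan as plain data makes it easy to check that `file_path` is present on every call that accepts it before anything is sent to the shared server.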

Step 4 is the killer feature. `xmp4_tests_for` filters SCIP callers by
per-language test-file patterns (e.g. `*Tests.cs`, `test_*.py`,
`*.spec.ts`) and typically returns 3–20 real tests — you then `xmp4_view`
the interesting ones to see concrete usage.

**When Step 4 returns 0 tests**: do not assume the library has no tests.
Two common causes:
- **Python only**: scip-python under-counts cross-file references (see
  the caveat under "Indexer quality matrix"). Fallback:
  `xmp4_grep(project, pattern="<Name>", file_path="<known test file>")`
  or `xmp4_outline` on a likely test file name like `tests/test_*.py`.
- **Wrong symbol disambiguation**: you may have resolved to a different
  overload. Re-check `xmp4_info` output — if the file_path it resolved
  is not the one you expected, call info with a more specific
  `file_path`.

**When Step 2 search does not show your symbol in the first 50 hits**:
re-read rule 4 below. Use `xmp4_outline(project, file_path)` on the
file where you expect the symbol — it is guaranteed exhaustive within
that file.

## The rules that actually matter

These are the rules that change whether you get an answer in 2 calls vs. 10.

**1. Always pass `file_path`.** On
info/source/usages/callers/callees/tests_for/hierarchy, omitting
`file_path` forces a cross-file disambiguation walk that is roughly
80× slower (~4 ms → ~300 ms). Get
the path from `xmp4_search` and carry it through every follow-up call.
Many symbols share names — without `file_path` the resolver picks the
first match, which is often wrong.

**2. Method names for callers/callees, not class names.** Callers and
callees operate on functions and methods. Asking for callers of `Flask`
(a class) gives instantiation sites, not method calls on instances. If
you want to know who invokes a class's methods, resolve a specific
method first.

**3. Search on leaf names, not dotted.** `xmp4_search` is a
**substring match on the symbol's `display_name`** (= the leaf, e.g.
`send`, not `Session.send`). Dotted-name resolution is a feature of
info/usages/callers/callees/source/hierarchy/tests_for — NOT of search.
Passing `Session.send` to search returns zero results; passing `send`
returns the real hits (then disambiguate by `file_path`). Copy the
canonical `display_name` + `file_path` from search results verbatim into
subsequent calls.
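
A toy model shows why a dotted query comes back empty. This is a sketch assuming a case-sensitive substring match over a list of leaf `display_name` values; the names in the list are invented, and `leaf_name` and `search` are hypothetical helpers, not part of xmp4.

```python
def leaf_name(symbol_name: str) -> str:
    """Return the part after the rightmost dot: 'Session.send' -> 'send'."""
    return symbol_name.rsplit(".", 1)[-1]


def search(display_names: list[str], query: str) -> list[str]:
    """Substring match on the leaf display_name, as rule 3 describes."""
    return [name for name in display_names if query in name]


# Invented display names; no leaf ever contains the parent and a dot.
names = ["send", "send_handshake", "resend"]

print(search(names, "Session.send"))             # []
print(search(names, leaf_name("Session.send")))  # ['send', 'send_handshake', 'resend']
```

Stripping the parent is only safe for plain dotted names. Display names that themselves contain dots should be copied verbatim from the search results instead.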

**4. Search ranking is substring by file position, not exactness — the
canonical symbol may be missing from the first page.** Example: in
tokio, `xmp4_search("spawn", max_results=50)` returns 50 substring
matches (`spawn_local_on`, `spawn_blocking`, `spawn_pending_tasks`, …)
and does NOT include the canonical `Function spawn` at
`tokio/src/task/spawn.rs:174`. Fallback: when you expect a symbol in a
known file, call `xmp4_outline(project, file_path)` instead of iterating
search. Outline lists every symbol in that file with kind + line — it
cannot miss an exact match.
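
The effect described in rule 4 can be reproduced with a toy index. This is a sketch under assumptions: the 60 near-match symbols and their file path are invented, and the search is modeled as "collect substring matches in index order, stop at `max_results`", which is one plausible reading of the ranking described above rather than the server's actual code.

```python
from typing import NamedTuple


class Symbol(NamedTuple):
    display_name: str
    file_path: str
    line: int


def substring_search(index: list[Symbol], query: str,
                     max_results: int = 50) -> list[Symbol]:
    """Collect substring matches in index order and stop at the ceiling."""
    hits: list[Symbol] = []
    for symbol in index:
        if len(hits) >= max_results:
            break
        if query in symbol.display_name:
            hits.append(symbol)
    return hits


def outline(index: list[Symbol], file_path: str) -> list[Symbol]:
    """List every symbol of one file, so an exact name cannot be crowded out."""
    return [s for s in index if s.file_path == file_path]


# Toy index: 60 invented near-matches sit ahead of the exact symbol.
toy_index = [Symbol(f"spawn_helper_{i}", "tokio/src/runtime/a.rs", i + 1)
             for i in range(60)]
toy_index.append(Symbol("spawn", "tokio/src/task/spawn.rs", 174))

first_page = substring_search(toy_index, "spawn")
print(any(s.display_name == "spawn" for s in first_page))   # False
print([s.display_name
       for s in outline(toy_index, "tokio/src/task/spawn.rs")])  # ['spawn']
```

Raising the ceiling would eventually reach the exact match, but the outline call gets there in one step when the file is already known.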

**5. `xmp4_outline` is the most reliable first look.** When you do not
know a file's structure, outline it. You will see every symbol with its
kind and line — the quickest way to orient yourself before drilling
deeper. It is also the fallback for rule 4.

**6. Prefer view over source.** When you already know the line range,
`xmp4_view(project, file_path, from_line, to_line)` reads the raw file
directly — fast and cheap. Use `xmp4_source` only when you want the
SCIP next-def heuristic to carve out a specific symbol body, and keep
in mind that it occasionally truncates on CSharp/Java. If `xmp4_source`
returns a body that starts mid-statement or is unexpectedly short,
switch to `xmp4_view` with `from_line` taken from the `xmp4_search`
result.
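
A small helper can keep view windows tight. This is a sketch: `view_args` is a hypothetical name, the 80-line default follows the examples in this document, the 500-line limit is the hard cap stated for `xmp4_view`, and treating `from_line` as 1-based is an assumption. Whether `to_line` is inclusive is not specified here, so the helper simply mirrors the `from_line+80` convention.

```python
from typing import Any

VIEW_HARD_CAP = 500  # lines per xmp4_view call, as stated in this document


def view_args(project: str, file_path: str, from_line: int,
              span: int = 80) -> dict[str, Any]:
    """Build xmp4_view arguments for a bounded window starting at from_line."""
    if from_line < 1:
        raise ValueError("from_line is assumed to be 1-based")
    if not 1 <= span <= VIEW_HARD_CAP:
        raise ValueError(f"span must be between 1 and {VIEW_HARD_CAP}")
    return {
        "project": project,
        "file_path": file_path,
        "from_line": from_line,
        "to_line": from_line + span,
    }
```

For example, `view_args("tokio", "tokio/src/task/spawn.rs", 174)` asks for lines 174 to 254, where `"tokio"` stands in for whatever project id `xmp4_projects` returned.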

## Cost budget — preserve the shared server

The xmp4 server is shared with many clients and RAM is the bottleneck.
Knowing which tools are cheap and which are expensive keeps it healthy
and keeps your requests fast.

**Cheap — free to use at will.** Everything SCIP-backed reads in-memory
metadata loaded at startup; a query is a hash/tree lookup in
microseconds with no transient allocations. This covers: `xmp4_projects`,
`xmp4_search`, `xmp4_info`, `xmp4_usages`, `xmp4_callers`, `xmp4_callees`,
`xmp4_tests_for`, `xmp4_outline`, `xmp4_hierarchy`, `xmp4_symbol_at`,
`xmp4_deps`, `xmp4_server`, `xmp4_guide`.

**Moderate — bounded I/O.** `xmp4_view` reads a bounded line range from
disk (hard cap 500 lines per call). `xmp4_source` is slightly heavier
because it walks SCIP ranges to compute the body. Keep view ranges
tight (40–80 lines) unless there is a specific reason to grab more.

**Heavy — do not use by default.** `xmp4_grep` runs ripgrep internals.
Even in single-file mode it is 2–3× more expensive than a SCIP lookup;
in multi-file walk mode it is ~200× more expensive and mmap-heavy.
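
The three tiers can be kept as a lookup table on the client side. This is a sketch: the tier labels are copied from this section, while `COST_TIER` and `cheapest_first` are hypothetical names used only to illustrate ordering candidate tools by cost.

```python
COST_TIER = {
    # Cheap: SCIP-backed, in-memory lookups.
    "xmp4_projects": "cheap", "xmp4_search": "cheap", "xmp4_info": "cheap",
    "xmp4_usages": "cheap", "xmp4_callers": "cheap", "xmp4_callees": "cheap",
    "xmp4_tests_for": "cheap", "xmp4_outline": "cheap",
    "xmp4_hierarchy": "cheap", "xmp4_symbol_at": "cheap",
    "xmp4_deps": "cheap", "xmp4_server": "cheap", "xmp4_guide": "cheap",
    # Moderate: bounded disk I/O.
    "xmp4_view": "moderate", "xmp4_source": "moderate",
    # Heavy: ripgrep internals.
    "xmp4_grep": "heavy",
}

_RANK = {"cheap": 0, "moderate": 1, "heavy": 2}


def cheapest_first(tools: list[str]) -> list[str]:
    """Order candidate tools from cheapest to most expensive tier."""
    return sorted(tools, key=lambda tool: _RANK[COST_TIER[tool]])


print(cheapest_first(["xmp4_grep", "xmp4_view", "xmp4_search"]))
# ['xmp4_search', 'xmp4_view', 'xmp4_grep']
```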

## Grep policy — single file, last resort

`xmp4_grep` has two tiers:

- **Free tier.** `file_path` (single file, repo-relative) is required.
  Every call scopes to exactly one file.
- **Premium tier.** Multi-file walks via `file_glob` are available only
  when the server is configured with `XMP4_PREMIUM_GREP_WALK=true`.
  Requesting `file_glob` without premium returns an explicit error.

Use grep **only** for text that is provably not a SCIP symbol: config
keys, URL or connection-string literals, environment variable names in
code, macro-generated identifiers absent from the index, comments, or
docstring content.

**Never use grep to find tests that use a method** — `xmp4_tests_for` is
the cheap, SCIP-backed way to get that list. Likewise never grep to find
callers or usages — those have dedicated SCIP tools that are far
cheaper and return structured results.

Never grep in a loop. One targeted single-file call, then navigate with
`xmp4_view`. If you find yourself wanting to grep across many files,
that is almost always a sign that a SCIP tool (`xmp4_search`,
`xmp4_tests_for`, `xmp4_usages`) is the right answer.
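
The tier rules can be checked on the client before a grep call is sent. This is a sketch: `check_grep_args` is a hypothetical helper, the returned messages are invented wording rather than the server's actual error text, and the path checks mirror the rejection of absolute paths and `..` segments listed under the error variants.

```python
from pathlib import PurePosixPath
from typing import Optional


def check_grep_args(file_path: Optional[str], file_glob: Optional[str],
                    premium: bool) -> Optional[str]:
    """Return None when the arguments look acceptable, else a reason string."""
    if file_glob is not None and not premium:
        return "file_glob requires the premium tier"
    if file_path is None:
        if file_glob is None:
            return "file_path is required"
        return None  # premium multi-file walk
    path = PurePosixPath(file_path)
    if path.is_absolute():
        return "absolute paths are rejected"
    if ".." in path.parts:
        return "'..' segments are rejected"
    return None
```

The check treats paths as POSIX-style and repo-relative, which is an assumption. It is a convenience for catching mistakes early, not a replacement for the server's own validation.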

## Language kind matrix

The `kind` field in `xmp4_search` results is what the SCIP indexer
assigned for the target language. The tier-1 languages have dependable,
consistent kind tagging:

| Language   | Class-like kind             | Method/function kind    |
|------------|-----------------------------|-------------------------|
| CSharp     | Type                        | Method                  |
| TypeScript | Type (interface), Class     | Method, Function        |
| Python     | Type (class)                | Method, Function        |
| Java       | Class, Interface            | Method                  |
| Rust       | Struct, Trait, Enum         | Function, Method        |
| PHP        | Type (class)                | Method, Function        |

Tier-2 quick notes: Go emits Type/Method (often Symbol when scip-go
cannot infer); JavaScript emits Symbol with scip-node quirks (quoted
`display_name` like `'animate.leave'`); C++ mixes Symbol entries; Dart
emits Class/Method. When you filter with `xmp4_search(kind=...)`, pick
the kind listed for your language in the table above.

## Dotted-name resolution and its limits

You can pass `Parent.method` to `xmp4_info`, `xmp4_usages`,
`xmp4_callers`, `xmp4_callees`, `xmp4_source`, `xmp4_hierarchy`,
`xmp4_tests_for`. The resolver splits on the rightmost `.`, looks up
`method` in the name index, and filters candidates whose `symbol_id`
contains `Parent` as a descriptor segment.

Three known limitations to recognize. If you hit one of these, do not
keep retrying the same query; switch approach:

- **CSharp explicit interface implementations.** Methods declared as
  `void IDisposable.Dispose()` have `display_name`
  `System.IDisposable.Dispose`, not `Dispose`. Asking `MyClass.Dispose`
  may return `symbol_not_found`. Workaround: search for the qualified
  name (`System.IDisposable.Dispose`) and pass it directly.
- **Multi-level nesting (3+ segments)** currently fails because the
  matcher checks literal `Outer.Inner` against SCIP's `Outer#Inner#`
  form. Workaround: use `Inner.method` (2 segments) and disambiguate
  with `file_path`.
- **Inherited members are not walked.** `Flask.logger` returns
  `symbol_not_found` because `logger` is defined on base class `App`.
  Workaround: query `App.logger` directly, or use
  `xmp4_hierarchy(Flask)` to discover bases first.
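
The resolver behavior and two of its limits can be modeled in a few lines. This is a sketch under assumptions: the symbol ids are simplified toy strings (real SCIP symbols carry more fields), descriptor segments are approximated by splitting on `/` and `#`, and `resolve` is a hypothetical stand-in for the server's matcher.

```python
import re


def descriptor_segments(symbol_id: str) -> list[str]:
    """Split a toy symbol id into its path and type segments."""
    return [seg for seg in re.split(r"[/#]", symbol_id) if seg]


def resolve(dotted_name: str, name_index: dict[str, list[str]]) -> list[str]:
    """Split on the rightmost '.', look up the leaf, filter by parent segment."""
    parent, _, leaf = dotted_name.rpartition(".")
    candidates = name_index.get(leaf, [])
    if not parent:
        return list(candidates)
    return [sid for sid in candidates if parent in descriptor_segments(sid)]


# Toy name index keyed by leaf name; the ids are illustrative only.
index = {
    "method": ["pkg/Outer#Inner#method()."],
    "logger": ["flask/App#logger."],
}

print(resolve("Inner.method", index))        # ['pkg/Outer#Inner#method().']
print(resolve("Outer.Inner.method", index))  # [] since 'Outer.Inner' is no segment
print(resolve("Flask.logger", index))        # [] since logger lives on App
print(resolve("App.logger", index))          # ['flask/App#logger.']
```

An empty list here corresponds to `symbol_not_found`. In both failing cases the fix is a different query, not a retry.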

## Indexer quality matrix (tier-1)

| Language   | xmp4_source body                                               | xmp4_callers / usages                                               | xmp4_hierarchy.base | xmp4_info signature                       | xmp4_deps                 |
|------------|----------------------------------------------------------------|---------------------------------------------------------------------|---------------------|-------------------------------------------|---------------------------|
| Python     | Full body                                                      | **Under-counts cross-file refs** (scip-python quirk — see caveat below) | Yes                 | May be sparse (e.g. `def send` with no args / no docstring even when body is full) | `name version (pip)`      |
| TypeScript | Full body when source is indexed; **some libraries (e.g. axios) ship only `.d.ts`/`.d.cts` — you get the type surface, not the implementation** | Reliable | Empty               | Good (TS type signatures)                 | `name version (npm)`      |
| Rust       | Full body                                                      | Reliable                                                            | Empty               | Good (function sig + `///` rustdoc)       | `name version (cargo)`    |
| Java       | Mostly full; some methods fall back when SCIP enclosing_range is absent | Reliable                                                          | Empty               | Good                                      | Sparse `. . (maven)`      |
| CSharp     | Full body via next-def heuristic; rare cases truncate          | Reliable                                                            | Yes (when present)  | Good                                      | Sparse `. . (nuget)`      |
| PHP        | Bounded                                                        | Reliable                                                            | Empty               | Good                                      | `name version (composer)` |

If `xmp4_source` returns a body that looks truncated or starts
mid-statement, jump immediately to `xmp4_view(project, file_path,
from_line=<line from xmp4_search>, to_line=from_line+80)`. Do not
iterate `xmp4_source` with different symbol names hoping to get a
cleaner cut — the next-def boundary is what it is.

### Python usages caveat (important)

`xmp4_usages`, `xmp4_callers`, and `xmp4_tests_for` **under-count
cross-file references** on Python projects due to upstream
`scip-python` indexer behavior. Exact-file refs are reliable;
cross-module gaps are not an xmp4 bug.

Symptoms:
- A clearly public method (e.g. `requests.Session.send`) shows only 1–2
  usages, all in its own file.
- `xmp4_tests_for` returns `0 tests found` on a symbol that obviously
  has dozens of tests.

Fallback when you hit this on Python:
1. `xmp4_outline(project, file_path="<path/to/test_file.py>")` on a
   plausible test file to see what it contains.
2. `xmp4_grep(project, pattern="\\.send\\(", file_path="tests/test_requests.py")`:
   a single-file grep in a likely test file. It stays within the free
   tier, costs only 2–3× a SCIP lookup, and catches text matches the
   SCIP index missed.
3. If you already know the library structure (from earlier
   `xmp4_projects`/`xmp4_search`), try `xmp4_view` on a test file you
   recognize.

Do NOT interpret "0 tests found" as "no tests exist" on Python — the
index is the one under-reporting, not the library.
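
The first two fallback steps can be sketched as a pair of tool calls. This is a minimal sketch: `python_fallback_calls` is a hypothetical helper that only builds `(tool_name, arguments)` pairs, and the test file path has to come from your own knowledge of the project. The grep pattern is produced with `re.escape`, which yields the same `\.send\(` form used in the example above.

```python
import re
from typing import Any

ToolCall = tuple[str, dict[str, Any]]


def python_fallback_calls(project: str, method: str,
                          test_file: str) -> list[ToolCall]:
    """Outline a likely test file, then grep that single file for call sites."""
    pattern = re.escape(f".{method}(")  # e.g. '.send(' becomes '\.send\('
    return [
        ("xmp4_outline", {"project": project, "file_path": test_file}),
        ("xmp4_grep", {"project": project, "pattern": pattern,
                       "file_path": test_file}),
    ]
```

Both calls are scoped to one file, so the fallback stays within the free tier and never turns into a multi-file walk.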

### TypeScript type-only index caveat

Some TypeScript libraries are indexed from their **`.d.ts` / `.d.cts`
type declaration files only** — the runtime implementation is not
indexed. `xmp4_search`, `xmp4_outline`, and `xmp4_source` on those
projects will show the type surface (interfaces, type aliases,
declared methods) but not function bodies. `xmp4_info` still works
(signatures come from the `.d.ts`). Check your search results: if all
hits point to `.d.ts` / `.d.cts`, that is the whole index, and
`xmp4_source` / `xmp4_tests_for` will be empty or shallow.
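
The check described above is easy to automate over the `file_path` values of a search result. This is a sketch: `is_type_only_index` is a hypothetical helper, and it assumes you have already collected the paths from the hits.

```python
TYPE_DECLARATION_SUFFIXES = (".d.ts", ".d.cts")


def is_type_only_index(hit_paths: list[str]) -> bool:
    """True when there are hits and every one is a type declaration file."""
    return bool(hit_paths) and all(
        path.endswith(TYPE_DECLARATION_SUFFIXES) for path in hit_paths
    )
```

An empty result returns `False` on purpose: no hits says nothing about how the project was indexed.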

## Pagination

Every list-returning tool accepts:

- `page` (1-based, default 1). `page=0` returns `invalid page (1-based)`.
- `page_size` (default 20, max 100). `0` or `>100` returns an error.

Every response begins with a header line — including empty results (do
not compare output against the literal `No results.`):

```
N <item>(s) found (page X/Y)
```

For 0 hits the line is `0 results found (page 1/1)` (or `0 match(es)`,
`0 dep(s)`, etc., with the per-tool noun) followed by `No results.`.
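
Parsing the header line is more robust than comparing against a fixed string. This is a sketch assuming the header always has the `N <item>(s) found (page X/Y)` shape shown above with a single-word noun; `parse_header` and `PageHeader` are hypothetical names.

```python
import re
from typing import NamedTuple, Optional

HEADER_RE = re.compile(r"^(\d+) (\S+) found \(page (\d+)/(\d+)\)$")


class PageHeader(NamedTuple):
    count: int
    noun: str
    page: int
    pages: int


def parse_header(response: str) -> Optional[PageHeader]:
    """Parse the first line of a list response; None if it is not a header."""
    lines = response.splitlines()
    if not lines:
        return None
    match = HEADER_RE.match(lines[0].strip())
    if match is None:
        return None
    count, noun, page, pages = match.groups()
    return PageHeader(int(count), noun, int(page), int(pages))


header = parse_header("0 results found (page 1/1)\nNo results.")
print(header.count == 0)  # True: an empty result, not an error
```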

### `max_results` vs `page_size`

- `max_results` (xmp4_search only, default 50) is an absolute ceiling on
  candidates collected from the full SCIP index.
- `page_size` (default 20, max 100) paginates within that ceiling.

To scan beyond the first 50 hits for a substring-heavy query, raise
`max_results` first, then page through.
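
The interaction between the two limits comes down to a small calculation. This is a sketch: `page_count` is a hypothetical helper, and it assumes that the number of reachable hits is the smaller of the total matches and `max_results`, and that an empty result still reports one page (as the `page 1/1` header suggests).

```python
import math


def page_count(total_matches: int, max_results: int = 50,
               page_size: int = 20) -> int:
    """Pages available once max_results has capped the candidate list."""
    if not 1 <= page_size <= 100:
        raise ValueError("page_size must be between 1 and 100")
    reachable = min(total_matches, max_results)
    return max(1, math.ceil(reachable / page_size))


print(page_count(130))                   # 3: only 50 of 130 hits are reachable
print(page_count(130, max_results=200))  # 7: all 130 hits, 20 per page
```

With the defaults, paging past page 3 cannot reveal hit number 51; only a larger `max_results` can.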

## `docs` parameter on `xmp4_info`

- `docs="none"` (default): header only; if docs exist, appends
  `[docs available: call with docs=summary|full]`.
- `docs="summary"`: first paragraph of each doc block.
- `docs="full"`: every doc block, unbounded.

## `output_format`

Accepts `"compact"` (default) or `"verbose"`. Verbose adds documentation
lines and richer per-entry detail. `xmp4_source` and `xmp4_deps` ignore
this parameter (their output shape is fixed).

## When source reads return `source base path does not exist`

`xmp4_source` and `xmp4_view` read actual file content from a
per-project `SourceBasePath` on the server data volume. The other tools
(info, outline, search, callers, usages, hierarchy) still work without
source on disk — they read SCIP metadata only.

If you see:

```
Error: invalid_param: source base path does not exist: <path>
```

the index was published but the source tree is not materialized yet.
That is a server-side data-pipeline state, not a client mistake. Use
info/outline/search/callers/usages for that project until source is
re-materialized; do not retry `xmp4_source` hoping for a different
answer.

## Tool reference

| Tool             | Purpose                                                | Cost     | Paginated    |
|------------------|--------------------------------------------------------|----------|--------------|
| xmp4_projects    | Find project by name/language                          | cheap    | yes          |
| xmp4_search      | Find symbols by name                                   | cheap    | yes          |
| xmp4_info        | Symbol details + docs                                  | cheap    | no           |
| xmp4_usages      | All references to a symbol                             | cheap    | yes          |
| xmp4_callers     | Who calls this method (optional `depth=1..=5`)         | cheap    | yes          |
| xmp4_callees     | What this method calls (optional `depth=1..=5`)        | cheap    | yes          |
| xmp4_outline     | All symbols in a file                                  | cheap    | yes          |
| xmp4_hierarchy   | Inheritance chain (derived list paginates)             | cheap    | derived only |
| xmp4_source      | Extract symbol source (SCIP next-def heuristic)        | moderate | no           |
| xmp4_view        | Raw file excerpt by line range (cap 500 lines)         | moderate | no           |
| xmp4_grep        | Text search. Free: `file_path` required. Premium: multi-file walk via `file_glob`. | heavy    | yes          |
| xmp4_symbol_at   | Position→symbol lookup (LSP-style)                     | cheap    | no           |
| xmp4_tests_for   | Tests exercising a symbol (callers × test-file pattern)| cheap    | yes          |
| xmp4_deps        | External dependencies                                  | cheap    | yes          |
| xmp4_server      | Server version + stats                                 | cheap    | no           |
| xmp4_guide       | Skill pointer + minimal cheatsheet                     | cheap    | no           |

## Error variants

- `invalid_param`: malformed `page`/`page_size`/other, missing source
  base path, missing `file_path` in free-tier grep, path traversal in
  `file_path` (rejected: absolute paths and `..` segments),
  `file_glob` without premium tier.
- `invalid_kind`: unknown SCIP kind filter; message lists valid values.
- `symbol_not_found`: no candidate resolved for the symbol name.
- `repo_not_found`: project identifier not in the index.
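
Error responses can be classified before deciding what to do next. This is a sketch under one assumption: every variant is rendered as `Error: <variant>: <message>`, the shape of the one error this document quotes in full. `parse_error` and `source_not_materialized` are hypothetical helpers.

```python
from typing import Optional

KNOWN_VARIANTS = frozenset(
    {"invalid_param", "invalid_kind", "symbol_not_found", "repo_not_found"}
)


def parse_error(response: str) -> Optional[tuple[str, str]]:
    """Return (variant, message) for a known error response, else None."""
    prefix = "Error: "
    if not response.startswith(prefix):
        return None
    variant, _, message = response[len(prefix):].partition(": ")
    if variant not in KNOWN_VARIANTS:
        return None
    return variant, message.strip()


def source_not_materialized(response: str) -> bool:
    """True for the 'source base path does not exist' server-side state."""
    parsed = parse_error(response)
    return (parsed is not None
            and parsed[0] == "invalid_param"
            and parsed[1].startswith("source base path does not exist"))
```

When `source_not_materialized` is true, retrying `xmp4_source` or `xmp4_view` will not help; the metadata-only tools remain usable for that project.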

## Common mistakes — watch for these

- **Grepping as a default search.** Grep is the slowest tool. Whenever
  the target is a SCIP symbol, prefer `xmp4_search` / `xmp4_tests_for` /
  `xmp4_usages`. Grep is for text that is NOT indexed.
- **Passing a dotted name to `xmp4_search`.** Search is substring match
  on the leaf `display_name`. Pass `send`, not `Session.send` — dotted
  resolution is only available on info/usages/callers/etc.
- **Trusting `xmp4_search` to always surface the canonical symbol in
  the first page.** Ranking is position-in-file, not exactness. If a
  common leaf name (`spawn`, `send`, `read`, `run`, …) should be in a
  specific file, skip search and call `xmp4_outline(project, file_path)`.
- **Omitting `file_path`** on
  info/usages/callers/source/tests_for/hierarchy. The resolver
  fast-path is ~80× faster when `file_path` is supplied.
- **Concluding "no tests exist" from `xmp4_tests_for = 0` on Python.**
  scip-python under-counts cross-file references. Fallback to
  `xmp4_outline` of a known test file or a single-file `xmp4_grep` in
  `tests/*`.
- **Calling `xmp4_callers("ClassName")`** instead of a method name.
  Callers/callees work on functions and methods.
- **Treating `0 results found (page 1/1)` as an error string.** It is
  the normal empty-result header.
- **Asking `MyClass.Method`** for an inherited member or explicit
  interface impl — see dotted-name limitations above.
- **Relying on `xmp4_repos`** (deprecated since v0.6.0 — use
  `xmp4_projects`).
- **Grepping `tests/**` to find usage examples.** `xmp4_tests_for` is
  the SCIP-backed way; grep walks every file.

## Quick reference card

```
# Discover
xmp4_projects(language=?, query=?)

# Resolve
xmp4_search(project, query) → file_path + display_name
xmp4_info(project, symbol_name, file_path, docs="summary")

# Navigate
xmp4_outline(project, file_path)
xmp4_callers / callees / usages / hierarchy (project, symbol_name, file_path)

# Read
xmp4_source(project, symbol_name, file_path)
xmp4_view(project, file_path, from_line, to_line)

# Tests-driven exploration (primary use case)
xmp4_tests_for(project, symbol_name, file_path)
→ xmp4_view(project, file_path=<test file>, from_line=N, to_line=N+80)

# Text grep — last resort, single file only in free tier
xmp4_grep(project, pattern, file_path)
```

## Versioning and re-fetch

When `xmp4_guide` on the server reports a skill version higher than the
one you saved locally, re-fetch the skill file. The URL is stable; the
query string `?v={version}` acts as the cache key.

Authoritative source: <https://example4.ai/xmp4-skill.md>.
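
The re-fetch decision can be sketched as a version comparison plus a URL builder. This is a sketch under assumptions: versions are taken to be dotted integers with an optional leading `v` (the form `v0.6.0` appears in this document), and `needs_refetch` and `refetch_url` are hypothetical names. How the server version is read from the `xmp4_guide` output is left out.

```python
from urllib.parse import urlencode

SKILL_URL = "https://example4.ai/xmp4-skill.md"


def version_key(version: str) -> tuple[int, ...]:
    """Turn 'v0.10.0' into (0, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def needs_refetch(local_version: str, server_version: str) -> bool:
    """True when the server reports a newer skill version than the local copy."""
    return version_key(server_version) > version_key(local_version)


def refetch_url(version: str) -> str:
    """Build the skill URL with the version as the cache key."""
    return f"{SKILL_URL}?{urlencode({'v': version})}"


print(needs_refetch("0.6.0", "0.10.0"))  # True (a plain string compare would say False)
print(refetch_url("0.10.0"))  # https://example4.ai/xmp4-skill.md?v=0.10.0
```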