econometrics
econometrics plugin for Cursor
checker
Phase 8 parallel checker. Launch three independent subagents simultaneously
# Checker — Phase 8 Parallel Orchestrator
## Role
You are the **Phase 8 Orchestrator**. Your sole job is to:
1. Read the shared baseline context
2. Launch **three subagents in parallel** (in a single message)
3. Wait for all three to finish
4. Synthesize their outputs into a unified Phase 8 report
You do **not** run any econometric analysis yourself. You delegate everything.
---
## Step 1: Read Baseline Context
Before launching subagents, read the shared context file:
```
phase8/context.json
```
If this file does not exist, ask the user to provide:
- `data_path`: path to the cleaned dataset
- `depvar`: dependent variable name
- `treatment`: main independent variable / treatment indicator
- `controls`: list of control variables
- `fe`: fixed effects specification (e.g., `["id", "year"]`)
- `cluster`: clustering level for standard errors
- `method`: estimation method (OLS / IV / DID / RDD / Panel FE)
- `baseline_coef`: point estimate from Phase 6
- `baseline_se`: standard error from Phase 6
- `N`: sample size
Then write this information to `phase8/context.json` before proceeding.
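A minimal sketch of this step, assuming the fields listed above map one-to-one onto JSON keys. The helper names `missing_keys` and `write_context` and all example values are hypothetical, not part of the plugin:

```python
import json
import os

# Keys follow the field list above; this constant is an illustrative helper.
REQUIRED_KEYS = [
    "data_path", "depvar", "treatment", "controls", "fe",
    "cluster", "method", "baseline_coef", "baseline_se", "N",
]


def missing_keys(context):
    """Return the required keys that are absent from the context dict."""
    return [k for k in REQUIRED_KEYS if k not in context]


def write_context(context, path="phase8/context.json"):
    """Validate the baseline context, then write it as JSON."""
    missing = missing_keys(context)
    if missing:
        raise ValueError(f"context is missing keys: {missing}")
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump(context, f, indent=2, ensure_ascii=False)
    return path


# Hypothetical example values, for illustration only
example = {
    "data_path": "data/clean/project_clean.csv",
    "depvar": "log_wage",
    "treatment": "policy_dummy",
    "controls": ["age", "education"],
    "fe": ["id", "year"],
    "cluster": "id",
    "method": "Panel FE",
    "baseline_coef": 0.10,
    "baseline_se": 0.03,
    "N": 12000,
}
print(missing_keys(example))
```

Validating before writing means the three subagents never start from a partial specification.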
---
## Step 2: Prepare Output Directories
```bash
mkdir -p phase8/robustness
mkdir -p phase8/heterogeneity
mkdir -p phase8/mechanism
```
---
## Step 3: Launch Three Subagents in Parallel
**CRITICAL**: All three Agent tool calls must be issued in a **single message**.
Do not wait for one to finish before launching the others.
### Agent A — Robustness Checker
```
subagent_type: general-purpose
description: "Run robustness checks"
prompt: |
You are a robustness check specialist for an econometric analysis.
## Your Task
Read `phase8/context.json` to get the model specification.
Run a comprehensive set of robustness checks appropriate for the method specified.
Save ALL outputs to `phase8/robustness/`.
## Required Outputs
1. `phase8/robustness/robustness_checks.py` (or .R / .do depending on context)
— executable script that runs all checks
2. `phase8/robustness/table_robustness.tex`
— LaTeX robustness table (landscape, threeparttable format)
3. `phase8/robustness/summary.md`
— plain-language summary: which specs hold, which deviate, overall verdict
## Checks to Run (adapt to method in context.json)
### For OLS / Panel FE:
- R1: Main specification (baseline, for reference)
- R2: Alternative SE clustering (entity / state / two-way)
- R3: Winsorize DV at p1/p99
- R4: Add/remove control variables (Oster δ-test)
- R5: Alternative sample period or subperiod
- R6: Placebo treatment (randomized assignment, effect should be ~0)
### For DID:
- R1: Main TWFE
- R2: Callaway-Sant'Anna estimator
- R3: Sun-Abraham estimator
- R4: Placebo treatment dates (1–2 years earlier)
- R5: Alternative control group
- R6: Wild cluster bootstrap (if clusters < 30)
### For IV:
- R1: Main 2SLS
- R2: LIML
- R3: Alternative instrument set
- R4: Control set sensitivity
- R5: Placebo instrument (no first stage expected)
### For RDD:
- R1: Optimal bandwidth (rdrobust default)
- R2: 50% bandwidth
- R3: 150% bandwidth
- R4: Polynomial order p=2
- R5: Donut-hole (exclude ±1 unit from cutoff)
- R6: Placebo cutoffs
## LaTeX Table Format
Use landscape orientation. Structure:
| Spec | Description | β̂ | SE | p-val | N | R² |
Include Oster δ as a spanning footnote row (not a regular column).
## Summary Requirements (summary.md)
- State verdict: does baseline result hold across all specs?
- Flag any specs where the coefficient changes meaningfully (>20% change)
- Note if significance is lost in any spec and explain why
```
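The >20% rule in the summary requirements above can be applied mechanically. A minimal sketch, assuming each specification reports one coefficient that is directly comparable to the baseline; the function name and the numbers are made up for illustration:

```python
def flag_deviations(baseline_coef, specs, threshold=0.20):
    """Return (label, relative_change) for specs deviating more than threshold."""
    if baseline_coef == 0:
        raise ValueError("baseline_coef must be non-zero for a relative change")
    flagged = []
    for label, coef in specs.items():
        rel_change = abs(coef - baseline_coef) / abs(baseline_coef)
        if rel_change > threshold:
            flagged.append((label, round(rel_change, 3)))
    return flagged


# Made-up coefficients for three robustness specifications
specs = {"R2": 0.095, "R3": 0.130, "R5": 0.070}
print(flag_deviations(0.100, specs))
```

A flagged spec is a prompt for explanation in `summary.md`, not by itself evidence that the baseline fails.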
---
### Agent B — Heterogeneity Analyst
```
subagent_type: general-purpose
description: "Run heterogeneity analysis"
prompt: |
You are a heterogeneity analysis specialist for an econometric analysis.
## Your Task
Read `phase8/context.json` to get the model specification.
Identify the most economically meaningful dimensions of heterogeneity.
Save ALL outputs to `phase8/heterogeneity/`.
## Required Outputs
1. `phase8/heterogeneity/heterogeneity_analysis.py` (or .R / .do)
— executable script
2. `phase8/heterogeneity/table_heterogeneity.tex`
— LaTeX table with subgroup coefficients side by side
3. `phase8/heterogeneity/coef_plot.pdf`
— coefficient plot (coefplot style) showing estimates + 95% CI across groups
4. `phase8/heterogeneity/summary.md`
— economic interpretation of heterogeneity patterns
## Analysis to Run
### Step 1: Identify Heterogeneity Dimensions
Based on the research context, choose 3–4 theoretically motivated split dimensions.
Examples: above/below median income, urban/rural, pre/post policy change,
high/low exposure group, young/old, treated early/late.
### Step 2: Subgroup Regressions
For each dimension, run the main specification separately on each subgroup.
Report coefficient, SE, N for each group.
Test if coefficients are statistically different across groups
(use Seemingly Unrelated Regression / interaction test).
### Step 3: Interaction Term Regression
Add treatment × subgroup_indicator to the main spec.
Interpret the interaction coefficient as differential effect.
### Step 4: Coefficient Plot
Plot all subgroup estimates with 95% CI.
Use econometrics:figure skill style guidelines:
- White background, minimal grid
- AER-style fonts (serif, 11pt)
- Horizontal dot-whisker layout
- Save as PDF
## Summary Requirements (summary.md)
- Which subgroup drives the main effect?
- Are heterogeneous effects statistically significant?
- What is the economic interpretation?
- Does heterogeneity support or challenge the proposed mechanism?
```
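For the cross-group comparison in Step 2 of the prompt above, one simple option is a z-test on the difference between two subgroup coefficients. This is a sketch under the assumption that the two subsamples are independent (non-overlapping groups, no shared clusters); when that fails, the interaction regression in Step 3 is the safer route. The function name is hypothetical:

```python
import math


def coef_diff_test(b1, se1, b2, se2):
    """z statistic and two-sided normal p-value for H0: b1 == b2."""
    z = (b1 - b2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    return z, p


# Made-up subgroup estimates
z, p = coef_diff_test(0.30, 0.05, 0.10, 0.05)
print(f"z = {z:.3f}, p = {p:.4f}")
```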
---
### Agent C — Mechanism Tester
```
subagent_type: general-purpose
description: "Run mechanism tests"
prompt: |
You are a mechanism analysis specialist for an econometric analysis.
## Your Task
Read `phase8/context.json` to get the model specification.
Design and run tests that illuminate the causal channels through which
the treatment affects the outcome.
Save ALL outputs to `phase8/mechanism/`.
## Required Outputs
1. `phase8/mechanism/mechanism_tests.py` (or .R / .do)
— executable script
2. `phase8/mechanism/table_mechanism.tex`
— LaTeX table with mechanism regression results
3. `phase8/mechanism/summary.md`
— economic narrative of the transmission channel
## Analysis to Run
### Step 1: Identify Plausible Channels
Based on the research context, hypothesize 2–3 mechanisms through which
treatment → outcome. For each mechanism, identify a measurable
intermediate variable (mediator M).
### Step 2: Three-Equation Mediation
For each mediator M:
- Equation 1: outcome ~ treatment + controls + FE (total effect)
- Equation 2: M ~ treatment + controls + FE (first stage of channel)
- Equation 3: outcome ~ treatment + M + controls + FE (direct effect)
Report how much the treatment coefficient shrinks when M is included.
This is the indirect effect (treatment → M → outcome).
### Step 3: Sobel / Bootstrap Mediation Test
Test H0: indirect effect = 0
Use bootstrap SE for the product of coefficients (β₂ × β₃, where β₂ is the treatment coefficient in Equation 2 and β₃ is the coefficient on M in Equation 3).
Report: indirect effect, 95% bootstrap CI, proportion of total effect mediated.
### Step 4: Intermediate Outcome Regressions
Run the main specification with each intermediate outcome as the DV.
If the treatment affects intermediate outcomes consistent with the
proposed mechanism, this supports the channel.
### Step 5: Ruling Out Competing Channels
For each alternative mechanism, test whether the data are inconsistent
with that channel (e.g., treatment should NOT affect variable X if
the proposed channel is correct but the alternative is wrong).
## Summary Requirements (summary.md)
- Which channel accounts for what share of the total effect?
- Is the mediation statistically significant?
- Can competing channels be ruled out?
- Write 1–2 sentences suitable for the paper's Results section
```
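The product-of-coefficients bootstrap in Steps 2 and 3 of the prompt above can be sketched as follows. This is a simplified illustration on simulated data: it omits controls and fixed effects and resamples individual observations, whereas clustered data would call for resampling whole clusters. All names are hypothetical.

```python
import numpy as np


def ols(y, X):
    """OLS coefficients via least squares; X must already include a constant."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def indirect_effect(y, d, m):
    """Product of coefficients: (treatment -> M) times (M -> outcome given treatment)."""
    const = np.ones_like(d)
    a = ols(m, np.column_stack([const, d]))[1]     # Equation 2
    b = ols(y, np.column_stack([const, d, m]))[2]  # Equation 3, coefficient on M
    return a * b


def bootstrap_ci(y, d, m, n_boot=500, seed=0):
    """Percentile 95% CI for the indirect effect."""
    rng = np.random.default_rng(seed)
    n = len(y)
    draws = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        draws[i] = indirect_effect(y[idx], d[idx], m[idx])
    return np.percentile(draws, [2.5, 97.5])


# Simulated data with a true indirect effect of 0.5 * 0.8 = 0.4
rng = np.random.default_rng(42)
n = 2000
d = rng.integers(0, 2, size=n).astype(float)
m = 0.5 * d + rng.normal(size=n)
y = 0.3 * d + 0.8 * m + rng.normal(size=n)

est = indirect_effect(y, d, m)
lo, hi = bootstrap_ci(y, d, m)
total = ols(y, np.column_stack([np.ones(n), d]))[1]  # Equation 1
print(f"indirect = {est:.3f}, 95% CI = [{lo:.3f}, {hi:.3f}], "
      f"share mediated = {est / total:.1%}")
```

In a linear model estimated on the same sample, the drop in the treatment coefficient between Equation 1 and Equation 3 equals this product exactly, which is why the two descriptions of the indirect effect coincide.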
---
## Step 4: Synthesize Results
After all three subagents complete, read their `summary.md` files and produce:
### `phase8/phase8_report.md`
```markdown
# Phase 8 Report: Robustness, Heterogeneity & Mechanism
## 8.1 Robustness Summary
[paste robustness/summary.md content, edited for flow]
## 8.2 Heterogeneity Summary
[paste heterogeneity/summary.md content, edited for flow]
## 8.3 Mechanism Summary
[paste mechanism/summary.md content, edited for flow]
## 8.4 Overall Assessment
- Does the main result survive all robustness checks? [YES/NO + explanation]
- Who is the effect concentrated in? [key heterogeneity finding]
- What is the primary transmission channel? [mechanism finding]
- Any concerns requiring attention before Phase 9? [flag issues]
```
---
## Step 5: Report to User
Present the Phase 8 report and explicitly ask:
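> "Phase 8 is complete. Below is the consolidated report for the three checks. Please confirm:
> 1. Are the robustness results in line with your expectations?
> 2. Are the heterogeneity findings consistent with your theoretical predictions?
> 3. Do the mechanism tests support your research hypothesis?
> Once you confirm, we will move on to Phase 9 (writing the full paper)."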
---
## Error Handling
If any subagent fails or produces incomplete output:
- Do **not** block the entire Phase 8
- Report the failure in the synthesis
- Offer to re-run only the failed track
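One way to detect an incomplete track before synthesis is to check the required output files. A minimal sketch, assuming the file names from the three subagent prompts; the analysis scripts are left out because their extension varies (`.py` / `.R` / `.do`), and the function name is hypothetical:

```python
import os

# File names taken from the "Required Outputs" lists of the three subagents
REQUIRED_OUTPUTS = {
    "robustness": ["table_robustness.tex", "summary.md"],
    "heterogeneity": ["table_heterogeneity.tex", "coef_plot.pdf", "summary.md"],
    "mechanism": ["table_mechanism.tex", "summary.md"],
}


def find_incomplete_tracks(root="phase8"):
    """Map each track to its required files that are missing or empty."""
    incomplete = {}
    for track, files in REQUIRED_OUTPUTS.items():
        missing = []
        for name in files:
            path = os.path.join(root, track, name)
            if not os.path.isfile(path) or os.path.getsize(path) == 0:
                missing.append(name)
        if missing:
            incomplete[track] = missing
    return incomplete


print(find_incomplete_tracks("phase8"))
```

Tracks that appear in the result can be reported in the synthesis and offered for a re-run without touching the others.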
## File Conflict Prevention
Each subagent writes exclusively to its own subdirectory.
After creating the directories in Step 2, the orchestrator only reads from the subagent directories and never writes to them; the Step 4 report goes to `phase8/phase8_report.md`.
Subagents must **copy** the dataset to their own subdirectory if they need to modify it.
beamer-ppt
Create Beamer-style academic PPTX presentations using python-pptx. Produces publication-quality .pptx files with navy-blue Metropolis theme (16:9, frame title bars, progress bar) for conference talks, job market presentations, and seminar slides. Called by /present command.
# Beamer-ppt-Creator
## Purpose
This skill generates professional academic **PPTX** presentations that faithfully replicate the visual style of LaTeX Beamer (Metropolis theme). Output is a `.pptx` file that can be opened, edited, and presented directly in PowerPoint or LibreOffice Impress — no LaTeX installation required.
## When to Use
- Called by `/present` command to produce the final `slides/slides.pptx`
- Preparing conference, seminar, or job market slides
- Converting a completed economics paper into a slide deck
## Design Principles
- **One idea per slide** — split if content overflows
- **Minimum 20pt** for body text; 24pt for frame titles
- **Consistent palette** — navy blue primary, one accent color only
- **Figures over tables** — embed PNG images at ≥ 200 DPI
- **Last slide = Takeaways**, never "Questions?"
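Overflow can be checked before a slide is built. A minimal sketch, assuming the limits stated under Common Pitfalls in this skill (max 5 bullets per slide, max 10 words per bullet) and assuming sub-bullets count toward the bullet limit; the function name is hypothetical:

```python
MAX_BULLETS = 5   # per slide
MAX_WORDS = 10    # per bullet


def lint_bullets(bullets):
    """Check a list of (indent_level, text) tuples against the slide limits."""
    problems = []
    if len(bullets) > MAX_BULLETS:
        problems.append(
            f"{len(bullets)} bullets (max {MAX_BULLETS}): split the slide")
    for _, text in bullets:
        n_words = len(text.split())
        if n_words > MAX_WORDS:
            problems.append(
                f"{n_words} words (max {MAX_WORDS}): {text[:30]}...")
    return problems


bullets = [(0, "Tariffs raised import prices"),
           (1, "Pass-through was close to complete")]
print(lint_bullets(bullets))
```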
---
## Implementation
This skill executes Python code using `python-pptx`. Always install dependencies first:
```bash
pip install python-pptx pdf2image --break-system-packages
apt-get install -y poppler-utils 2>/dev/null || true
```
### Color Palettes by Theme
| Theme | Title Bar `bg` | Accent | Slide `bg` |
|-------|---------------|--------|-----------|
| **A. Metropolis** (default) | `RGB(0, 35, 82)` navy | `RGB(180, 30, 30)` red | `RGB(245, 245, 245)` light gray |
| **B. Minimal** (job market) | `RGB(0, 35, 82)` navy | `RGB(0, 35, 82)` navy | `RGB(255, 255, 255)` white |
| **C. Madrid** (traditional) | `RGB(31, 73, 125)` dark blue | `RGB(189, 152, 44)` gold | `RGB(255, 255, 255)` white |
### Core Helper Functions
```python
from pptx import Presentation
from pptx.util import Inches, Pt, Emu
from pptx.dml.color import RGBColor
from pptx.enum.text import PP_ALIGN
import os
# ── Presentation setup ───────────────────────────────────────────
prs = Presentation()
prs.slide_width = Inches(13.33) # 16:9 widescreen (Beamer aspectratio=169)
prs.slide_height = Inches(7.5)
# ── Color definitions (Metropolis theme) ─────────────────────────
NAVY = RGBColor(0, 35, 82)
RED = RGBColor(180, 30, 30)
LGRAY = RGBColor(245, 245, 245)
WHITE = RGBColor(255, 255, 255)
BLACK = RGBColor(30, 30, 30)
MGRAY = RGBColor(100, 100, 100)
def add_bg(slide, prs, color):
    """Full-slide background rectangle."""
    shape = slide.shapes.add_shape(
        1, 0, 0, prs.slide_width, prs.slide_height)  # 1 = MSO_SHAPE.RECTANGLE
    shape.fill.solid()
    shape.fill.fore_color.rgb = color
    shape.line.fill.background()
    return shape


def add_frame_title(slide, prs, text, bg=NAVY, fg=WHITE):
    """Navy title bar (1.1 in tall), mimicking Beamer \\frametitle."""
    bar = slide.shapes.add_shape(
        1, 0, 0, prs.slide_width, Inches(1.1))
    bar.fill.solid()
    bar.fill.fore_color.rgb = bg
    bar.line.fill.background()
    tf = bar.text_frame
    tf.word_wrap = False
    tf.margin_left = Inches(0.3)
    tf.margin_top = Inches(0.22)
    p = tf.paragraphs[0]
    p.text = text
    p.font.bold = True
    p.font.size = Pt(24)
    p.font.color.rgb = fg
    p.alignment = PP_ALIGN.LEFT


def add_progress_bar(slide, prs, current, total, color=NAVY):
    """Metropolis-style thin progress bar at bottom."""
    h = Inches(0.055)
    top = prs.slide_height - h
    # Gray track
    track = slide.shapes.add_shape(
        1, 0, top, prs.slide_width, h)
    track.fill.solid()
    track.fill.fore_color.rgb = RGBColor(200, 200, 200)
    track.line.fill.background()
    # Filled portion
    filled_w = int(prs.slide_width * current / max(total, 1))
    if filled_w > 0:
        bar = slide.shapes.add_shape(1, 0, top, filled_w, h)
        bar.fill.solid()
        bar.fill.fore_color.rgb = color
        bar.line.fill.background()


def add_speaker_notes(slide, notes_text):
    """Add speaker notes to a slide."""
    slide.notes_slide.notes_text_frame.text = notes_text
```
### Slide Factory Functions
```python
# ── 1. Title slide ───────────────────────────────────────────────
def make_title_slide(prs, title, subtitle, author, institute, date_line):
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_bg(slide, prs, NAVY)

    def _tb(left, top, w, h):
        tb = slide.shapes.add_textbox(
            Inches(left), Inches(top), Inches(w), Inches(h))
        tb.text_frame.word_wrap = True
        return tb.text_frame

    # Paper title
    tf = _tb(1, 1.7, 11.33, 2.0)
    p = tf.paragraphs[0]
    p.text = title; p.font.bold = True
    p.font.size = Pt(34); p.font.color.rgb = WHITE
    p.alignment = PP_ALIGN.CENTER
    # Subtitle
    if subtitle:
        p2 = tf.add_paragraph()
        p2.text = subtitle; p2.font.size = Pt(20)
        p2.font.color.rgb = LGRAY; p2.alignment = PP_ALIGN.CENTER
    # Author + institute
    tf2 = _tb(1, 4.3, 11.33, 1.4)
    p3 = tf2.paragraphs[0]
    p3.text = author; p3.font.size = Pt(18)
    p3.font.color.rgb = WHITE; p3.alignment = PP_ALIGN.CENTER
    p4 = tf2.add_paragraph()
    p4.text = institute; p4.font.size = Pt(15)
    p4.font.color.rgb = LGRAY; p4.alignment = PP_ALIGN.CENTER
    # Date / conference
    tf3 = _tb(1, 6.1, 11.33, 0.8)
    p5 = tf3.paragraphs[0]
    p5.text = date_line; p5.font.size = Pt(13)
    p5.font.color.rgb = LGRAY; p5.alignment = PP_ALIGN.CENTER
    return slide

# ── 2. Content slide (bullet list) ──────────────────────────────
def make_content_slide(prs, title, bullets,
                       current=None, total=None, bg=LGRAY):
    """
    bullets: list of (indent_level, text) tuples.
    indent_level 0 = top-level bullet, 1 = sub-bullet.
    """
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_bg(slide, prs, bg)
    add_frame_title(slide, prs, title)
    tb = slide.shapes.add_textbox(
        Inches(0.5), Inches(1.3), Inches(12.33), Inches(5.8))
    tf = tb.text_frame; tf.word_wrap = True
    for i, (lvl, text) in enumerate(bullets):
        p = tf.paragraphs[i] if i == 0 else tf.add_paragraph()
        p.text = text; p.level = lvl
        p.font.size = Pt(20 if lvl == 0 else 17)
        p.font.color.rgb = BLACK
        p.space_before = Pt(8 if lvl == 0 else 4)
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide

# ── 3. Figure slide ──────────────────────────────────────────────
def make_figure_slide(prs, title, img_path, caption="",
                      current=None, total=None):
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_bg(slide, prs, LGRAY)
    add_frame_title(slide, prs, title)
    slide.shapes.add_picture(
        img_path,
        left=Inches(1.17), top=Inches(1.3),
        width=Inches(11.0), height=Inches(5.2))
    if caption:
        cap = slide.shapes.add_textbox(
            Inches(0.5), Inches(6.6), Inches(12.33), Inches(0.7))
        cap.text_frame.paragraphs[0].text = caption
        cap.text_frame.paragraphs[0].font.size = Pt(11)
        cap.text_frame.paragraphs[0].font.color.rgb = MGRAY
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide

# ── 4. Regression table slide ────────────────────────────────────
def make_table_slide(prs, title, headers, rows,
                     footnote="", highlight_last_col=True,
                     current=None, total=None):
    """
    headers: list of str (first col is row label).
    rows: list of lists of str.
    Last column is treated as the preferred specification (bolded).
    """
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_bg(slide, prs, LGRAY)
    add_frame_title(slide, prs, title)
    nc = len(headers); nr = len(rows) + 1
    tbl = slide.shapes.add_table(
        nr, nc,
        Inches(0.5), Inches(1.4),
        Inches(12.33), Inches(4.5)).table
    # Header row: navy background, white bold text
    for j, h in enumerate(headers):
        c = tbl.cell(0, j)
        c.text = h
        c.text_frame.paragraphs[0].font.bold = True
        c.text_frame.paragraphs[0].font.size = Pt(14)
        c.text_frame.paragraphs[0].font.color.rgb = WHITE
        c.fill.solid(); c.fill.fore_color.rgb = NAVY
    # Data rows
    for i, row in enumerate(rows):
        for j, val in enumerate(row):
            c = tbl.cell(i + 1, j)
            c.text = str(val)
            c.text_frame.paragraphs[0].font.size = Pt(13)
            if highlight_last_col and j == nc - 1:
                c.text_frame.paragraphs[0].font.bold = True
    if footnote:
        fn = slide.shapes.add_textbox(
            Inches(0.5), Inches(6.0), Inches(12.33), Inches(1.2))
        fn.text_frame.paragraphs[0].text = footnote
        fn.text_frame.paragraphs[0].font.size = Pt(10)
        fn.text_frame.paragraphs[0].font.color.rgb = MGRAY
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide

# ── 5. Two-column slide ──────────────────────────────────────────
def make_two_col_slide(prs, title, left_bullets, right_bullets,
                       current=None, total=None):
    """Two-column layout (e.g. Robustness slide)."""
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_bg(slide, prs, LGRAY)
    add_frame_title(slide, prs, title)
    for col_bullets, left_offset in [(left_bullets, 0.4),
                                     (right_bullets, 6.9)]:
        tb = slide.shapes.add_textbox(
            Inches(left_offset), Inches(1.35),
            Inches(5.8), Inches(5.8))
        tf = tb.text_frame; tf.word_wrap = True
        for i, (lvl, text) in enumerate(col_bullets):
            p = tf.paragraphs[i] if i == 0 else tf.add_paragraph()
            p.text = text; p.level = lvl
            p.font.size = Pt(18 if lvl == 0 else 15)
            p.font.color.rgb = BLACK
            p.space_before = Pt(6 if lvl == 0 else 3)
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide
```
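`make_figure_slide` passes both `width` and `height` to `add_picture`, which stretches any figure whose aspect ratio differs from the 11.0 × 5.2 in box. A minimal sketch of an aspect-preserving fit, assuming the same box as in that function; `fit_box` is a hypothetical helper, not part of python-pptx:

```python
def fit_box(img_w, img_h, box_w=11.0, box_h=5.2, box_left=1.17, box_top=1.3):
    """Scale an image to fit inside the box, keep its aspect ratio, center it.

    img_w / img_h can be in any unit (e.g. pixels); box values are in inches.
    Returns (left, top, width, height) in inches.
    """
    scale = min(box_w / img_w, box_h / img_h)
    w, h = img_w * scale, img_h * scale
    left = box_left + (box_w - w) / 2
    top = box_top + (box_h - h) / 2
    return left, top, w, h


# A 16:9 image is height-bound in this box; a very wide one is width-bound
print(fit_box(1600, 900))
print(fit_box(2200, 520))
```

The four values could then be wrapped in `Inches(...)` and passed to `add_picture` in place of the fixed box; the pixel dimensions would have to come from the image file, for example via Pillow.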
### PDF → PNG Conversion (for figures from /plot)
```python
import os
import subprocess


def pdf_to_png(pdf_path, dpi=200):
    """Convert a PDF figure to PNG for embedding in PPTX."""
    # splitext strips only the final extension, unlike str.replace(".pdf", "")
    png_base = os.path.splitext(pdf_path)[0]
    try:
        subprocess.run(
            ["pdftoppm", "-r", str(dpi), "-png", "-singlefile",
             pdf_path, png_base],
            check=True, capture_output=True)
        return png_base + ".png"
    except (subprocess.CalledProcessError, FileNotFoundError):
        # Fallback: pdf2image (it also relies on poppler being installed)
        from pdf2image import convert_from_path
        imgs = convert_from_path(pdf_path, dpi=dpi)
        png_path = png_base + ".png"
        imgs[0].save(png_path, "PNG")
        return png_path
```
### Save, Export PDF & Verify
```python
import os
import subprocess


def save_and_verify(prs, output_path, export_pdf=True):
    """Save PPTX, optionally export PDF via LibreOffice, then verify."""
    # dirname is "" for a bare filename, and os.makedirs("") raises
    os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
    prs.save(output_path)
    # ── Verify PPTX ──────────────────────────────────────────────
    check = Presentation(output_path)
    n = len(check.slides)
    assert n > 0, "PPTX is empty; check slide generation."
    print(f"✅ PPTX saved : {output_path}")
    print(f"   {n} slides | {os.path.getsize(output_path) // 1024} KB")
    # ── Export PDF ───────────────────────────────────────────────
    pdf_path = None
    if export_pdf:
        pdf_path = _pptx_to_pdf(output_path)
    return output_path, pdf_path


def _pptx_to_pdf(pptx_path):
    """Convert PPTX → PDF using LibreOffice headless."""
    out_dir = os.path.dirname(pptx_path) or "."
    try:
        result = subprocess.run(
            ["libreoffice", "--headless", "--convert-to", "pdf",
             "--outdir", out_dir, pptx_path],
            capture_output=True, text=True, timeout=120
        )
        # Swap only the final extension
        pdf_path = os.path.splitext(pptx_path)[0] + ".pdf"
        if os.path.exists(pdf_path):
            print(f"✅ PDF exported: {pdf_path}")
            print(f"   {os.path.getsize(pdf_path) // 1024} KB")
            return pdf_path
        else:
            print(f"⚠️ LibreOffice conversion failed: {result.stderr.strip()}")
            print("   → Open slides.pptx in PowerPoint and export manually.")
            return None
    except FileNotFoundError:
        print("⚠️ LibreOffice not found. Install with:")
        print("   apt-get install -y libreoffice  # Ubuntu/Debian")
        print("   brew install --cask libreoffice  # macOS")
        print("   → You can also export PDF from PowerPoint / LibreOffice Impress.")
        return None
    except subprocess.TimeoutExpired:
        print("⚠️ LibreOffice timed out. Try running manually:")
        print(f"   libreoffice --headless --convert-to pdf {pptx_path}")
        return None
```
---
## Slide Structure by Presentation Type
| Slide Section | 15-min conf (≤15) | 45-min seminar (≤30) | Job market (≤20) |
|---------------|:-----------------:|:--------------------:|:----------------:|
| Title | 1 | 1 | 1 |
| Motivation | 1–2 | 2–3 | 2–3 |
| This Paper | 1 | 1 | 1 |
| Related Lit | — | 1–2 | 1–2 |
| Data | 1 | 2 | 2 |
| Identification | 2 | 3–4 | 3 |
| Main Results | 3 | 5–7 | 4–5 |
| Robustness | 1 | 2–3 | 2 |
| Heterogeneity | — | 2–3 | 1–2 |
| Takeaways | 1 | 1 | 1 |
---
## Best Practices
1. **One message per slide** — split if content overflows
2. **Use figures over tables** — embed PNG at ≥ 200 DPI
3. **Bold the preferred specification** column in regression tables
4. **Add speaker notes** to every key slide via `add_speaker_notes()`
5. **Prepare appendix slides** for anticipated Q&A
6. **Timing**: budget 1.5 min/slide; final slide must be Takeaways
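The 1.5 min/slide budget can be checked against a talk slot with simple arithmetic. A minimal sketch; the function name is hypothetical:

```python
import math

MINUTES_PER_SLIDE = 1.5  # budget stated in the list above


def check_timing(n_slides, talk_minutes):
    """Compare a deck against the per-slide time budget."""
    needed = n_slides * MINUTES_PER_SLIDE
    return {
        "needed_minutes": needed,
        "max_slides": math.floor(talk_minutes / MINUTES_PER_SLIDE),
        "fits": needed <= talk_minutes,
    }


print(check_timing(15, 15))
print(check_timing(30, 45))
```

Under this budget a 15-minute slot holds 10 slides, fewer than the ≤15 cap in the structure table, so that cap is best read as an upper bound and not a target.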
## Common Pitfalls
- ❌ Too much text (max 5 bullets per slide, max 10 words per bullet)
- ❌ Tables with more than 4 columns
- ❌ Ending with "Thank you / Questions?" — use Takeaways instead
- ❌ Embedding low-resolution images (< 150 DPI looks blurry on projectors)
- ❌ Skipping the "This Paper" preview slide
data-pipeline
End-to-end data pipeline for empirical research: fetch economic data from APIs (FRED, World Bank, IMF, BLS, OECD, Yahoo Finance), clean and transform raw data, construct strategy-specific variables, and validate panel structure. Use when asked to fetch data, download data, clean data, merge datasets, prepare analysis-ready data.
# Data-Pipeline
## Purpose
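This skill is an integrated data pipeline, called by the `/data` command, that handles the complete data workflow:
1. **Data acquisition** (Part 1): automatically download raw data from multiple APIs
2. **Data cleaning and transformation** (Part 2): merging, deduplication, missing-value handling, outlier detection, variable construction
3. **Quality assurance** (Part 3): validate the panel structure, run a quality audit, and output an analysis-ready dataset
Output: `data/clean/[project_name]_clean.[dta|parquet|csv]` + `data/clean/cleaning_log.md`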
---
## When to Use
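- Called automatically by the `/data` command (the main use case; invoked in its Step 1 and Step 2 respectively)
- Standalone trigger: the user needs the full fetch-and-clean workflow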
---
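## Part 1: Data Acquisition
### Step 0: API Key Setup Check
**Check the required API keys before executing any code.**
1. Read the `[plugin_root]/.env` file (same directory as `CLAUDE.md`, i.e. the plugin root)
2. Check `FRED_API_KEY` and `BLS_API_KEY`
**If `FRED_API_KEY` is missing or empty:**
- Tell the user: *"A free FRED API key is required. Please visit https://fred.stlouisfed.org/docs/api/api_key.html (takes about 1 minute). Paste the key and I will save it."*
- Wait for input, then append `FRED_API_KEY=<value>` to `.env`
**If `BLS_API_KEY` is missing:**
- Tell the user that this key is optional but raises the rate limit; it is available at https://www.bls.gov/developers/
**If `.env` already exists and the keys are set:**
- Load it silently and inject the keys into all code via `python-dotenv`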
```python
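from dotenv import find_dotenv, load_dotenv
# usecwd=True starts the search at the current working directory and walks up
# through parent directories; a bare load_dotenv() in a script starts from
# the script's own directory instead
load_dotenv(find_dotenv(usecwd=True))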
```
> **⚠️ `.env` format requirement:** standard `KEY=VALUE` format (one per line), **not JSON**. If the file is in JSON format, convert it first.
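The key check and the format rule above can be sketched with the standard library alone. This is an illustration under the assumption that a JSON-formatted file is a flat object of string values; the function names are hypothetical and the sketch does not replace `python-dotenv` for loading:

```python
import json

REQUIRED = ["FRED_API_KEY"]
OPTIONAL = ["BLS_API_KEY"]


def parse_env_text(text):
    """Parse KEY=VALUE lines; convert a flat JSON object if one is found."""
    stripped = text.strip()
    if stripped.startswith("{"):
        # JSON-formatted file: convert it to plain key/value pairs
        return {k: str(v) for k, v in json.loads(stripped).items()}
    env = {}
    for line in stripped.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def check_keys(env):
    """Return (missing required keys, missing optional keys); empty counts as missing."""
    missing_required = [k for k in REQUIRED if not env.get(k)]
    missing_optional = [k for k in OPTIONAL if not env.get(k)]
    return missing_required, missing_optional


env = parse_env_text("FRED_API_KEY=abc123\n# comment\nBLS_API_KEY=\n")
print(check_keys(env))
```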
---
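### Step 1: Receive the Calling Context
**This skill receives structured context from `/data`; there is no need to ask the user again.**
Parameters received:
| Parameter | Description | Example |
|------|------|------|
| `target_datasets` | List of target dataset names | `["FRED macro", "World Bank"]` |
| `variables` | Variable list | `Y=gdp_growth, Z=tariff_rate` |
| `time_range` | Start and end years, and frequency | `2000–2023, annual` |
| `geo_scope` | Geographic scope | `31 Chinese provinces` / `United States (national)` |
| `identification_strategy` | Identification strategy | `DiD` / `RDD` / `IV` / `Panel FE` |
| `output_format` | Output format | `csv` (default) / `dta` / `parquet` |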
---
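### Step 2: Choose the Appropriate Data Source
Based on the dataset names and geographic scope passed in, choose the corresponding API:
| Data type | Best source | Access |
|---------|----------|------|
| US macro time series | FRED | `fredapi` |
| Global development indicators | World Bank | `wbdata` |
| US labor market | BLS API v2 | `requests` (requires chunking) |
| Cross-country macro/finance | IMF WEO | `imf-reader` |
| Financial asset prices | Yahoo Finance | `yfinance` |
> **Sources that cannot be fetched automatically (manual download required):**
> - **OECD**: the data endpoints are behind Cloudflare protection and cannot be reached from the sandbox. Prefer IMF WEO as a substitute.
> - **National Bureau of Statistics of China (NBS)**: the API is restricted to mainland China IP addresses. Download from https://data.stats.gov.cn and then follow the manual upload workflow.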
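The chunking noted for the BLS API in the table above can be sketched as follows. The per-request limits used here (20 years, 50 series) are assumptions to verify against the current BLS documentation, and the helper names are hypothetical:

```python
def year_chunks(start_year, end_year, max_span=20):
    """Split [start_year, end_year] into inclusive windows of at most max_span years."""
    chunks = []
    lo = start_year
    while lo <= end_year:
        hi = min(lo + max_span - 1, end_year)
        chunks.append((lo, hi))
        lo = hi + 1
    return chunks


def series_batches(series_ids, max_per_request=50):
    """Split a list of series IDs into batches of at most max_per_request."""
    return [series_ids[i:i + max_per_request]
            for i in range(0, len(series_ids), max_per_request)]


print(year_chunks(1980, 2023))
print([len(b) for b in series_batches([f"S{i}" for i in range(120)])])
```

Each (batch, window) pair would then become one request, with the results concatenated afterwards.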
---
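### Step 3: Generate and Run the Fetch Code
The generated script must include:
**① Basic elements**
- API keys injected via environment variables (`python-dotenv`), never hard-coded
- Error handling for failed API requests (catch exceptions, print the error message)
- Bilingual (Chinese and English) comments describing each series ID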
**② Output path convention**
```python
import os
from datetime import date

# to_csv does not create missing directories, so make sure data/raw exists
os.makedirs("data/raw", exist_ok=True)
filename = f"data/raw/{source_name}_{date.today().strftime('%Y%m%d')}.csv"
df.to_csv(filename, index=False)
```
**③ Write to `data/raw/data_log.md`**
```python
log_entry = f"""
## {dataset_name} fetch record
- Fetch date: {date.today()}
- Data source: {source_name}
- Raw file path: {filename}
- Variables: {df.shape[1]}, observations: {df.shape[0]}
- Frequency: {frequency}
- Merge key: {merge_key}
"""
os.makedirs("data/raw", exist_ok=True)
with open("data/raw/data_log.md", "a", encoding="utf-8") as f:
    f.write(log_entry)
```
**④ Key variable check**
```python
required_vars = [Y_var, D_var, Z_var] + control_vars
available = df.columns.tolist()
found = [v for v in required_vars if v in available]
missing = [v for v in required_vars if v not in available]
print("=== Variable availability check ===")
for v in found:
    print(f"✅ {v}")
for v in missing:
    print(f"⚠️ {v}: must be handled in the cleaning stage")
```
---
## Part 2: Data Cleaning & Transformation
### Step 0: Receive the Calling Context
Parameters received (from `identification-memo.md`):
| Parameter | Description | Example |
|------|------|------|
| `identification_strategy` | Identification strategy | `DiD` / `RDD` / `IV` |
| `Y_var` | Outcome variable | `log_gdp_growth` |
| `D_var` | Treatment variable | `policy_dummy` |
| `Z_var` | Identifying variable | `tariff_rate_1990` |
| `control_vars` | List of covariates | `["log_gdp", "population"]` |
| `id_var` | Panel unit identifier | `province_code` |
| `time_var` | Time variable | `year` |
| `input_files` | Raw data file paths | `["data/raw/fred_*.csv"]` |
| `merge_plan` | Multi-source merge plan | Main file + auxiliary files |
---
### Step 1: Multi-Source Merge (conditional)
**Trigger:** `input_files` contains ≥2 files and `merge_plan` has been confirmed
#### 1.1 Run the Merge
```python
import pandas as pd

df = pd.read_csv("data/raw/[main file]")
n_before = len(df)
for aux_file, merge_key, how in merge_plan:
    aux = pd.read_csv(f"data/raw/{aux_file}")
    df = df.merge(aux, on=merge_key, how=how, suffixes=("", "_aux"))
    print(f"Merged {aux_file}: {len(df):,} rows ({how} join on {merge_key})")
```
#### 1.2 Merge Quality Report
```python
n_after = len(df)
# Row-count ratio after vs. before merging; with how="left" it never drops
# below 100%, so also inspect the missing rates of the merged-in columns
match_rate = n_after / n_before * 100
print(f"Match rate: {match_rate:.1f}%")
# merge_key is the key of the last merge in the loop above
dups = df.duplicated(subset=merge_key).sum()
if dups > 0:
    print(f"⚠️ Found {dups} duplicate rows")
for v in [Y_var, D_var, Z_var]:
    miss_rate = df[v].isna().mean() * 100
    print(f"{v} missing rate: {miss_rate:.1f}%")
```
If the match rate is below 70%, report it to the user and explain why.
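A row-count ratio cannot reveal unmatched keys in a left join, so one alternative is to measure the match rate with the `indicator` option of `DataFrame.merge`. A minimal sketch with toy data, assuming a single merge key that must be unique in the auxiliary table; the function name is hypothetical:

```python
import pandas as pd


def merge_with_report(left, right, key, min_match=0.70):
    """Left-join right onto left; report the share of left rows that matched."""
    if not right[key].is_unique:
        raise ValueError(f"merge key '{key}' is not unique in the right table")
    merged = left.merge(right, on=key, how="left", indicator=True)
    match_rate = float((merged["_merge"] == "both").mean())
    merged = merged.drop(columns="_merge")
    return merged, match_rate, match_rate >= min_match


# Toy data: 2 of the 4 main-file rows find a partner
left = pd.DataFrame({"id": [1, 2, 3, 4], "y": [1.0, 2.0, 3.0, 4.0]})
right = pd.DataFrame({"id": [1, 2, 5], "x": [10, 20, 50]})
merged, rate, ok = merge_with_report(left, right, "id")
print(f"match rate: {rate:.0%}, passes 70% threshold: {ok}")
```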
---
### Step 2: General Cleaning Pipeline
Run the steps in the order below, **recording the sample size and the change in key metrics in `cleaning_log.md` at every step**.
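Per-step logging is easier to keep consistent with a small recorder object. A minimal sketch; the class name and the table layout are hypothetical and not the format the skill prescribes for `cleaning_log.md`:

```python
class SampleLog:
    """Track the sample size after each cleaning step."""

    def __init__(self, n_start):
        self.rows = [("Raw data", "", n_start)]

    def record(self, step, action, n_now):
        self.rows.append((step, action, n_now))

    def to_markdown(self):
        lines = ["| Step | Action | N | Change |",
                 "|------|--------|---|--------|"]
        prev = None
        for step, action, n in self.rows:
            delta = "" if prev is None else f"{n - prev:+,}"
            lines.append(f"| {step} | {action} | {n:,} | {delta} |")
            prev = n
        return "\n".join(lines)


log = SampleLog(1000)
log.record("Drop duplicates", "(id, year)", 990)
log.record("Drop missing Y/D", "outcome/treatment", 950)
print(log.to_markdown())
```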
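#### 2.1 Duplicate Detection and Handling
```python
import os

dup_mask = df.duplicated(subset=[id_var, time_var], keep=False)
n_dups = dup_mask.sum()
print(f"Duplicate observations: {n_dups} rows")
if n_dups > 0:
    # Keep an audit copy of every duplicated row before dropping
    os.makedirs("data/clean", exist_ok=True)
    df_dups = df[dup_mask]
    df_dups.to_csv("data/clean/duplicates_removed.csv", index=False)
    df = df.drop_duplicates(subset=[id_var, time_var], keep="first")
```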
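#### 2.2 Missing-Value Codes: Detection and Handling
```python
import numpy as np

MISSING_CODES = [-99, -88, -77, -9, 9999, 99999, -9999]
# np.nan keeps numeric columns numeric. The codes are replaced in every
# column, so check first that none of them is a legitimate value
df = df.replace(MISSING_CODES, np.nan)
miss_report = df.isnull().mean().sort_values(ascending=False)
miss_high = miss_report[miss_report > 0.05]
print("Variables with missing rate > 5%:")
for var, rate in miss_high.items():
    print(f"  {var}: {rate:.1%}")
```
**Missing-value handling decisions**:
| Case | Recommended handling |
|------|---------|
| Y / D missing | Report explicitly and drop; never impute |
| Z missing | Report explicitly; missingness may be non-random |
| Covariate < 5% | Mean/median imputation may be considered |
| Covariate 5–20% | Multiple imputation or sensitivity analysis recommended |
| Covariate > 20% | Look for a substitute variable |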
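The decision table can be encoded as a small lookup so that every variable gets a consistent recommendation. A minimal sketch; the function name and role labels are hypothetical, and the returned strings paraphrase the table:

```python
def missing_action(role, miss_rate):
    """Recommend a handling rule; role is 'Y', 'D', 'Z' or 'control', rate in [0, 1]."""
    if miss_rate == 0:
        return "none"
    if role in ("Y", "D"):
        return "report and drop rows; do not impute"
    if role == "Z":
        return "report; check whether missingness is non-random"
    if miss_rate < 0.05:
        return "mean/median imputation may be considered"
    if miss_rate <= 0.20:
        return "multiple imputation or sensitivity analysis"
    return "look for a substitute variable"


for role, rate in [("Y", 0.02), ("control", 0.03), ("control", 0.12), ("control", 0.35)]:
    print(role, rate, "->", missing_action(role, rate))
```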
#### 2.3 Type Correction and Naming Normalization
```python
# Normalize column names to snake_case
df.columns = (
    df.columns
    .str.strip()
    .str.lower()
    .str.replace(r"\s+", "_", regex=True)
    .str.replace(r"[^a-z0-9_]", "", regex=True)
)
# Fix the dtypes of key variables
df[time_var] = pd.to_numeric(df[time_var], errors="coerce").astype("Int64")
df[id_var] = df[id_var].astype(str).str.strip()
for col in numeric_vars:
    df[col] = pd.to_numeric(df[col], errors="coerce")
```
#### 2.4 Outlier Detection and Handling
```python
def winsorize(series, lo=0.01, hi=0.99):
    """Winsorize a continuous variable at the lo/hi quantiles."""
    lo_val = series.quantile(lo)
    hi_val = series.quantile(hi)
    return series.clip(lo_val, hi_val)

# Winsorize continuous covariates only; never winsorize Y / D / Z automatically
for col in continuous_controls:
    n_outliers = ((df[col] < df[col].quantile(0.01)) |
                  (df[col] > df[col].quantile(0.99))).sum()
    print(f"{col}: {n_outliers} observations outside [p1, p99]")
    df[f"{col}_w"] = winsorize(df[col])  # keep the original variable
```
> ⚠️ **Economics note**: winsorizing calls for economic judgment; it is not a mandatory default step.
#### 2.5 Variable Labels (for Stata Output)
```stata
label variable `Y_var' "Outcome variable: [description]"
label variable `D_var' "Treatment variable: [description]"
label variable `Z_var' "Identifying variable: [description]"
```
---
### Step 3: Strategy-Specific Variable Construction
Based on the `identification_strategy` passed in, the corresponding variable-construction module is **triggered automatically**.
---
#### Strategy A: Difference-in-Differences (DiD / Event Study)
```stata
* Core DiD variables
gen treated = (group_var == "treated_group")
gen post = (year >= policy_year)
gen treated_post = treated * post
gen event_time = year - policy_year
* Staggered treatment timing
bysort id: egen first_treated_year = min(cond(treated_post == 1, year, .))
gen relative_time = year - first_treated_year
```
**Python version:**
```python
df["treated"] = (df["group_var"] == "treated_group").astype(int)
df["post"] = (df[time_var] >= policy_year).astype(int)
df["treated_post"] = df["treated"] * df["post"]
df["event_time"] = df[time_var] - policy_year
first_treated = (df[df["treated_post"] == 1]
.groupby(id_var)[time_var].min()
.rename("first_treated_year"))
df = df.merge(first_treated, on=id_var, how="left")
df["relative_time"] = df[time_var] - df["first_treated_year"]
```
---
#### Strategy B: Regression Discontinuity (RDD)
```stata
gen running_centered = running_var - cutoff_value
gen above_cutoff = (running_var >= cutoff_value)
gen running_above = running_centered * above_cutoff
```
**Python version:**
```python
df["running_centered"] = df["running_var"] - cutoff_value
df["above_cutoff"] = (df["running_var"] >= cutoff_value).astype(int)
df["running_above"] = df["running_centered"] * df["above_cutoff"]
print(f"Observations at or above the cutoff: {df['above_cutoff'].sum()}")
```
---
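#### Strategy C: Instrumental Variables (IV)
```python
# Pattern 1: historical value as the instrument
df["Z_historical"] = df["D_var_baseyear"]
# Pattern 2: Bartik (shift-share) instrument.
# Assumes long data with one row per unit x industry x period. The key columns
# are passed as Series because Series.groupby("name") only matches index levels.
df["Z_bartik"] = (
    (df["industry_share_base"] * df["national_growth"])
    .groupby([df[id_var], df[time_var]])
    .transform("sum")
)
# Pattern 3: exogenous policy x time variation
df["Z_policy_interact"] = df["policy_var"] * df["exogenous_shock"]
# Preliminary check: correlation between Z and D
corr = df[[Z_var, D_var]].corr().iloc[0, 1]
print(f"Z-D correlation: {corr:.3f}")
if abs(corr) < 0.1:
    print("⚠️ Weak correlation; the first-stage F may be < 10")
```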
---
#### Strategy D: Panel Fixed Effects (Panel FE)
```python
# Verify that (id, time) uniquely identifies each row
assert not df.duplicated(subset=[id_var, time_var]).any()
# Panel balance check
counts = df.groupby(id_var)[time_var].count()
is_balanced = (counts == counts.max()).all()
print(f"Panel balance: {'balanced' if is_balanced else 'unbalanced'}")
# Within / between variance decomposition
df["Y_mean_i"] = df.groupby(id_var)[Y_var].transform("mean")
df["Y_within"] = df[Y_var] - df["Y_mean_i"]
var_within = df["Y_within"].var()
var_between = df["Y_mean_i"].var()
print(f"Within variance share: {var_within / (var_within + var_between):.1%}")
```
---
#### Strategy E: Synthetic Control
```python
assert id_var in df.columns and time_var in df.columns
df["is_treated_unit"] = (df[id_var] == treated_unit_id).astype(int)
df["is_donor"] = (1 - df["is_treated_unit"])
df["pre_treatment"] = (df[time_var] < treatment_year).astype(int)
df["post_treatment"] = (df[time_var] >= treatment_year).astype(int)
n_donors = df[df["is_donor"] == 1][id_var].nunique()
T_pre = df[df["pre_treatment"] == 1][time_var].nunique()
print(f"Donor pool size: {n_donors}, pre-treatment periods: {T_pre}")
if T_pre < 5:
    print("⚠️ Short pre-treatment window; match quality may be limited")
```
---
### Step 4: Panel Structure Validation (conditional)
**Trigger:** `identification_strategy` is DiD, Panel FE, or Synthetic Control
```python
print("=== Panel structure validation ===")
print(f"Total observations: {len(df):,}")
print(f"Units: {df[id_var].nunique()}")
print(f"Time periods: {df[time_var].nunique()}")
print(f"Time range: {df[time_var].min()} – {df[time_var].max()}")
obs_per_unit = df.groupby(id_var).size()
print(f"Observations per unit: min={obs_per_unit.min()}, max={obs_per_unit.max()}")
if obs_per_unit.min() == obs_per_unit.max():
    print("✅ Balanced panel")
else:
    print("⚠️ Unbalanced panel")
```
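When the panel turns out to be unbalanced, it helps to know which unit-period cells are missing. A minimal sketch using only the standard library, assuming "complete" means every unit appears in every period observed anywhere in the data; the function name is hypothetical:

```python
def find_panel_gaps(pairs):
    """Given (unit, period) pairs, return {unit: [missing periods]}."""
    pairs = set(pairs)
    periods = sorted({t for _, t in pairs})
    units = sorted({u for u, _ in pairs})
    gaps = {}
    for u in units:
        missing = [t for t in periods if (u, t) not in pairs]
        if missing:
            gaps[u] = missing
    return gaps


# Toy panel: unit B has no observation for 2001
obs = [("A", 2000), ("A", 2001), ("A", 2002), ("B", 2000), ("B", 2002)]
print(find_panel_gaps(obs))
```

With a DataFrame, the pairs could be supplied as `zip(df[id_var], df[time_var])`.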
---
## Part 3: Quality Assurance
### Step 1: Final Audit of Key Variables
```python
print("\n=== Final status of key variables ===")
audit_vars = [Y_var, D_var, Z_var] + control_vars
for v in audit_vars:
    if v not in df.columns:
        print(f"❌ {v}: not found!")
        continue
    miss = df[v].isna().mean()
    flag = "✅" if miss < 0.05 else "⚠️ "
    if df[v].dtype == object:
        # Non-numeric column: applying the :.4g format to "N/A" would raise
        print(f"{flag} {v}: missing={miss:.1%}, mean=N/A, std=N/A")
    else:
        print(f"{flag} {v}: missing={miss:.1%}, "
              f"mean={df[v].mean():.4g}, std={df[v].std():.4g}")
```
---
### Step 2: Save the Cleaned Dataset
```python
import os
from datetime import date

os.makedirs("data/clean", exist_ok=True)
output_path_parquet = f"data/clean/{project_name}_clean.parquet"
# to_parquet needs a parquet engine (pyarrow or fastparquet) to be installed
df.to_parquet(output_path_parquet, index=False)
print(f"✅ Saved: {output_path_parquet} ({len(df):,} rows × {df.shape[1]} columns)")
# Stata format, if needed
try:
    import pyreadstat
    output_path_dta = f"data/clean/{project_name}_clean.dta"
    pyreadstat.write_dta(df, output_path_dta)
    print(f"✅ Saved: {output_path_dta}")
except ImportError:
    print("(install pyreadstat if you need .dta output)")
```
---
### Step 3: Write `cleaning_log.md`
Append an entry automatically after every cleaning run:
```python
log = f"""
## Data Cleaning Log: {project_name}
- Cleaning date: {date.today()}
- Identification strategy: {identification_strategy}
- Input files: {input_files}
- Output file: data/clean/{project_name}_clean.parquet
### Sample Size by Step
| Step | Operation | Sample size |
|------|-----------|-------------|
| Raw data (after merge) | - | {n_raw:,} |
| Drop duplicate observations | ({id_var}, {time_var}) | {n_after_dedup:,} |
| Drop missing Y/D | outcome/treatment variables | {n_after_ymiss:,} |
| Final sample | - | {len(df):,} |
### Missing Rates of Key Variables (after cleaning)
| Variable | Role | Missing rate |
|----------|------|--------------|
{chr(10).join(f"| {v} | {role} | {df[v].isna().mean():.1%} |" for v, role in zip([Y_var, D_var, Z_var], ["Y", "D", "Z"]) if v in df.columns)}
### Strategy-Specific Variables
{chr(10).join(f"- `{v}`: constructed" for v in strategy_vars_built)}
### Data Quality Notes
{quality_notes}
"""
os.makedirs("data/clean", exist_ok=True)
with open("data/clean/cleaning_log.md", "a", encoding="utf-8") as f:
    f.write(log)
print("✅ Appended to data/clean/cleaning_log.md")
```
---
## Common Pitfalls
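| ❌ Common mistake | ✅ Correct practice |
|------------|------------|
| Not checking merge-key uniqueness before merging | `assert df[key].is_unique` |
| Winsorizing Y / D automatically | Winsorize only with an economic justification |
| Filling missing Y / D with the global mean | Drop observations with missing Y/D; do not impute |
| Using one global policy year in DiD and ignoring staggered adoption | Use each unit's own `first_treated_year` |
| Overwriting the raw data | `data/raw/` is read-only; save to `data/clean/` |
| Not recording sample-size changes | Record every step in `cleaning_log.md` |
| Not checking the match rate after merging | Report to the user if it falls below 70% |
| Skipping the density test for RDD | Run the McCrary test during EDA |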
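
The two merge-related rows can be sketched in pandas as follows. This is a minimal illustration, not the skill's own implementation: the frames `firms` and `policy`, the key `firm_id`, and the helper name `merge_with_checks` are hypothetical, and the 70% threshold is the one stated in the table.

```python
import pandas as pd

# Hypothetical example frames; column names are illustrative only.
firms = pd.DataFrame({"firm_id": [1, 2, 3, 4], "y": [1.0, 2.0, 3.0, 4.0]})
policy = pd.DataFrame({"firm_id": [1, 2, 3], "treated": [1, 0, 1]})

def merge_with_checks(left, right, key, min_match_rate=0.70):
    """Left-merge on `key`, checking key uniqueness and the match rate."""
    # The right-hand key must be unique, otherwise the merge duplicates rows.
    assert right[key].is_unique, f"{key} is not unique in the right table"
    merged = left.merge(right, on=key, how="left",
                        validate="many_to_one", indicator=True)
    # Share of left-hand rows that found a partner in the right table.
    match_rate = (merged["_merge"] == "both").mean()
    if match_rate < min_match_rate:
        print(f"⚠️ Match rate {match_rate:.1%} is below {min_match_rate:.0%}")
    return merged.drop(columns="_merge"), match_rate

merged, match_rate = merge_with_checks(firms, policy, "firm_id")
print(f"Match rate: {match_rate:.1%}")
```

Using `validate="many_to_one"` makes pandas raise an error if the key assumption is violated, so the check still runs even when assertions are disabled.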
---
## Requirements
```bash
pip install pandas pyarrow pyreadstat python-dotenv fredapi wbdata requests yfinance imf-reader
```
## Related Skills & Commands
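- **`/data`**: the parent command that invokes this skill
- **`results-analysis`**: takes this skill's output and produces descriptive statistics and exploratory analysis
- **`did-analysis` / `rdd-analysis` / `iv-estimation` / `panel-data`**: use the variables constructed by this skill
did-analysis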
# Difference-in-Differences (DID) Skill
This skill guides complete DID analysis: from assumption validation and model specification to staggered treatment designs and event study regressions. Designed for policy evaluation and natural experiment settings.
## Core DID Logic
DID compares the change in outcomes for a treatment group before and after treatment to the change for a control group over the same period.
**DID Estimator** = (Ȳ_treat,post − Ȳ_treat,pre) − (Ȳ_ctrl,post − Ȳ_ctrl,pre)
**Key Assumption (Parallel Trends)**: In the absence of treatment, the treatment group's outcome would have evolved in parallel with the control group.
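
As a worked illustration of the estimator, the sketch below computes the 2×2 DID directly from the four cell means. The toy data and the helper name `did_2x2` are hypothetical; the column names `treat`, `post`, and `y` follow the code templates in this skill.

```python
import pandas as pd

# Toy data (hypothetical): two groups, two periods, two observations per cell.
df_toy = pd.DataFrame({
    "treat": [1, 1, 1, 1, 0, 0, 0, 0],
    "post":  [0, 0, 1, 1, 0, 0, 1, 1],
    "y":     [10.0, 12.0, 17.0, 19.0, 8.0, 10.0, 11.0, 13.0],
})

def did_2x2(df, y="y", treat="treat", post="post"):
    """Difference between the treated and control pre/post mean changes."""
    means = df.groupby([treat, post])[y].mean()
    change_treat = means.loc[(1, 1)] - means.loc[(1, 0)]
    change_ctrl = means.loc[(0, 1)] - means.loc[(0, 0)]
    return change_treat - change_ctrl

did = did_2x2(df_toy)
print(f"DID estimate: {did:.1f}")
```

Here the treated mean rises from 11 to 18 and the control mean from 9 to 12, so the estimate is (18 − 11) − (12 − 9) = 4. In a saturated 2×2 regression this equals the coefficient on the interaction term.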
## DID Workflow
1. **Design check**: Confirm treatment/control assignment and timing
2. **Parallel trends**: Test with pre-treatment event study regression
3. **Baseline regression**: 2×2 DID or TWFE regression
4. **Staggered design check**: If adoption dates vary, use robust estimators
5. **Robustness**: Placebo treatment, alternative control groups, Callaway-Sant'Anna estimator
## Basic 2×2 DID Model
```
Y_it = β₀ + β₁·Treat_i + β₂·Post_t + β₃·(Treat_i × Post_t) + ε_it
β₃ = DID estimate (ATT)
```
### Code Templates
```python
# Python — 2×2 DID with TWFE
import statsmodels.formula.api as smf
# Simple 2x2
model = smf.ols('y ~ treat + post + treat_post', data=df).fit(cov_type='HC3')
# TWFE with entity and time FE (preferred)
from linearmodels.panel import PanelOLS
df_panel = df.set_index(['entity_id', 'year'])
twfe = PanelOLS(df_panel['y'], df_panel[['treat_post']],
entity_effects=True, time_effects=True)
result = twfe.fit(cov_type='clustered', cluster_entity=True)
print(result.summary)
```
```r
# R — TWFE
library(plm); library(lmtest); library(sandwich)
panel_df <- pdata.frame(df, index = c("entity_id", "year"))
twfe <- plm(y ~ treat_post, data = panel_df, model = "within", effect = "twoways")
coeftest(twfe, vcov = vcovHC(twfe, cluster = "group"))
```
```stata
* Stata — TWFE with clustered SE
xtset entity_id year
xtreg y treat_post i.year, fe cluster(entity_id)
* Or equivalently:
reghdfe y treat_post, absorb(entity_id year) cluster(entity_id)
```
## Parallel Trends: Event Study Regression
Replace the single `treat_post` dummy with relative-time dummies to visualize pre-trends:
```stata
* Stata — event study
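* Stata factor variables cannot take negative values, so shift rel_time to start at 0
summarize rel_time, meanonly
local shift = -r(min)
gen rel_shift = rel_time + `shift'
local ref = `shift' - 1
* `ref' is the shifted value of the reference period t = -1
reghdfe y ib`ref'.rel_shift, absorb(entity_id year) cluster(entity_id)
* With vertical, xline() is measured in coefficient positions (1, 2, ...), not event time
coefplot, vertical yline(0) ///
title("Event Study: Pre/Post Treatment Effects") ///
xlabel(, angle(45))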
```
```r
# R — event study
library(fixest)
es_model <- feols(y ~ i(rel_time, treat, ref = -1) | entity_id + year,
data = df, cluster = ~entity_id)
iplot(es_model, xlab = "Periods relative to treatment")
```
**Interpreting the event study plot:**
- Pre-treatment coefficients ≈ 0 → consistent with parallel trends (supporting evidence, not proof, since the assumption concerns unobserved counterfactual trends)
- Pre-trend test: joint F-test for all pre-treatment coefficients = 0
- Post-treatment coefficients show dynamic treatment effects
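
The joint pre-trend test can be sketched from the estimated pre-period coefficients and their covariance matrix. This is a minimal illustration using the chi-squared (Wald) form of the test; dividing the statistic by the number of restrictions gives the F-statistic form. The helper name `pretrend_wald_test` and the numbers are hypothetical, and in practice the covariance matrix should be the clustered one from the event study regression.

```python
import numpy as np
from scipy import stats

def pretrend_wald_test(pre_coefs, pre_vcov):
    """Wald test of H0: all pre-treatment coefficients equal zero."""
    b = np.asarray(pre_coefs, dtype=float)
    V = np.asarray(pre_vcov, dtype=float)
    # W = b' V^{-1} b is asymptotically chi-squared with len(b) degrees of freedom
    wald = float(b @ np.linalg.solve(V, b))
    pval = float(stats.chi2.sf(wald, df=len(b)))
    return wald, pval

# Hypothetical estimates for t = -4, -3, -2 (t = -1 is the omitted reference).
pre_coefs = [0.02, -0.01, 0.03]
# Diagonal covariance for simplicity: squared standard errors, no covariances.
pre_vcov = np.diag([0.05, 0.04, 0.05]) ** 2
wald, pval = pretrend_wald_test(pre_coefs, pre_vcov)
print(f"Wald = {wald:.4f}, p = {pval:.3f}")
```

The resulting p-value is what the Python `plot_event_study` function accepts as `pretrend_pval`.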
## Staggered DID
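When units adopt treatment at different times, standard TWFE can be biased if treatment effects vary across cohorts or over time, because already-treated units implicitly serve as controls for later adopters. Heterogeneity-robust alternatives include the Callaway-Sant'Anna and Sun-Abraham estimators.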
```r
# R — Callaway-Sant'Anna estimator (did package)
library(did)
cs_result <- att_gt(yname = "y",
gname = "cohort_year", # year of first treatment (0 if never treated)
idname = "entity_id",
tname = "year",
xformla = ~x1 + x2,
data = df)
# Aggregate to average ATT
aggte(cs_result, type = "simple") # Overall ATT
aggte(cs_result, type = "dynamic") # Dynamic effects
ggdid(cs_result)
```
```r
# R — Sun-Abraham (fixest)
library(fixest)
sa_model <- feols(y ~ sunab(cohort_year, year) | entity_id + year,
data = df, cluster = ~entity_id)
iplot(sa_model)
```
```stata
* Stata — Callaway-Sant'Anna (csdid from SSC)
csdid y x1 x2, ivar(entity_id) time(year) gvar(cohort_year)
csdid_plot
```
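
The estimators above expect a cohort variable (`cohort_year`) and, for event studies, a relative-time variable (`rel_time`). A minimal sketch of building both from a 0/1 treatment indicator follows. The panel, the indicator name `treated`, and the helper `add_cohort_vars` are hypothetical. Never-treated units are coded `cohort_year = 0`, the convention noted for `gname` in the `did` example above; other estimators may expect a different coding.

```python
import numpy as np
import pandas as pd

# Hypothetical panel: unit 1 treated from 2003, unit 2 from 2004, unit 3 never.
panel = pd.DataFrame({
    "entity_id": np.repeat([1, 2, 3], 4),
    "year": np.tile([2002, 2003, 2004, 2005], 3),
    "treated": [0, 1, 1, 1,  0, 0, 1, 1,  0, 0, 0, 0],
})

def add_cohort_vars(df, id_var="entity_id", time_var="year", d_var="treated"):
    """Add cohort_year (0 if never treated) and rel_time (NaN if never treated)."""
    out = df.copy()
    # First period in which each unit is observed as treated.
    first = out.loc[out[d_var] == 1].groupby(id_var)[time_var].min()
    out["first_treated_year"] = out[id_var].map(first)  # NaN if never treated
    out["cohort_year"] = out["first_treated_year"].fillna(0).astype(int)
    out["rel_time"] = out[time_var] - out["first_treated_year"]
    return out

panel = add_cohort_vars(panel)
print(panel.groupby("entity_id")["cohort_year"].first())
```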
## Robustness Checks for DID
1. **Placebo treatment dates**: assign fake treatment 1–2 periods before actual treatment
2. **Placebo treatment groups**: run DID using only control units with a fake treatment
3. **Alternative control groups**: restrict to more comparable controls
4. **Continuous treatment intensity**: use dose-response DID
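
The placebo-date check (item 1) can be sketched as a 2×2 DID run on pre-treatment data only, with a fake treatment date. The data and the helper name `placebo_did` are hypothetical; a placebo estimate close to zero is what parallel pre-trends would produce.

```python
import pandas as pd

def placebo_did(df, true_year, placebo_year, y="y", treat="treat", time="year"):
    """2x2 DID on pre-treatment data only, using a fake treatment date."""
    assert placebo_year < true_year, "placebo date must precede actual treatment"
    pre = df[df[time] < true_year].copy()  # drop all truly treated periods
    pre["placebo_post"] = (pre[time] >= placebo_year).astype(int)
    means = pre.groupby([treat, "placebo_post"])[y].mean()
    return ((means.loc[(1, 1)] - means.loc[(1, 0)])
            - (means.loc[(0, 1)] - means.loc[(0, 0)]))

# Hypothetical data: common linear trend, true effect of 5 starting in 2004.
rows = []
for year in range(2000, 2006):
    trend = 1.0 * (year - 2000)
    rows.append({"treat": 0, "year": year, "y": 10 + trend})
    rows.append({"treat": 1, "year": year,
                 "y": 12 + trend + (5 if year >= 2004 else 0)})
df_placebo = pd.DataFrame(rows)

estimate = placebo_did(df_placebo, true_year=2004, placebo_year=2002)
print(f"Placebo DID estimate: {estimate:.3f}")
```

Because both groups share the same trend before 2004 in this toy example, the placebo estimate is zero. A clearly non-zero placebo estimate in real data would cast doubt on parallel trends.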
## Reporting Standards
- Report event study plot as Figure (essential for credibility)
- State the parallel trends assumption and supporting evidence
- Report DID coefficient with clustered SE (cluster at entity level)
- Discuss potential violations: anticipation effects, Ashenfelter's dip, spillovers
- For staggered designs, always use CS or SA estimators and explain why
See `references/did-reference.md` for heterogeneous treatment effects, triple-difference models, synthetic control comparison, Borusyak-Jaravel-Spiess imputation estimator, de Chaisemartin-D'Haultfoeuille estimator, and Roth (2022) pre-trends power analysis.
## Common Pitfalls
- **Using TWFE with staggered treatment and heterogeneous effects**: Standard TWFE is biased — use Callaway-Sant'Anna, Sun-Abraham, or Borusyak-Jaravel-Spiess
- **Clustering below the level of treatment assignment**: If treatment varies at the state level, cluster at the state level, not the individual level
- **Failing to reject pre-trends ≠ parallel trends hold**: Low power is common; use Roth (2022) power analysis to assess
- **Ignoring anticipation effects**: If agents anticipate treatment, pre-treatment coefficients may be non-zero even with parallel trends
- **Not showing the event study plot**: Reviewers expect to see pre-trends visually; always include the event study figure
figure
Called by /plot to generate and upgrade econometric figures to top-journal standards.
# Publication-Quality Figures
This skill generates figure code that meets the formatting standards of top economics journals (AER, QJE, ReStud, Econometrica, JPE). It covers the most common econometric figure types with precise control over fonts, colors, dimensions, and export formats.
## Step 0: Workflow Interface
When called by `/plot` or another upstream command, this skill accepts a structured context block. **Do not re-ask the user for information already present in the context.**
```
strategy: DiD | RDD | IV | PanelFE | SC | OLS
figure_type: event_study | parallel_trends | rdd_binscatter | mccrary |
iv_scatter | sc_gap | sc_placebo | coefplot | binscatter |
density | timeseries | multipanel
software: python | r | stata
data_path: path to clean dataset (e.g., data/clean/analysis.parquet)
Y_var: outcome variable name
D_var: treatment variable name
Z_var: instrument variable name (IV only)
running_var: running variable name (RDD only)
cutoff_value: numeric cutoff (RDD only)
time_var: time/period variable name
id_var: unit identifier variable name
treatment_timing: first treatment period (DiD/Event Study)
fig_num: integer, for file naming (e.g., 3 → fig03_...)
project_name: string
```
When called standalone (not via `/plot`), prompt the user for `strategy`, `figure_type`, and `software` before proceeding.
**Strategy → required figures routing (execute only figures matching `figure_type`):**
| `strategy` | Required | Recommended |
|---|---|---|
| `DiD` | `parallel_trends`, `event_study` | `coefplot` (robustness specs) |
| `RDD` | `rdd_binscatter`, `mccrary` | `coefplot` (bandwidth sensitivity) |
| `IV` | `iv_scatter` (first stage + exclusion) | `binscatter` (Z–Y reduced form) |
| `PanelFE` | `coefplot` | `parallel_trends`, `binscatter` |
| `SC` | `sc_gap`, `sc_placebo` | `parallel_trends` (pre-period fit) |
| `OLS` | `binscatter` | `coefplot`, `density` |
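
One way to encode the routing table is a plain dictionary lookup. The sketch below copies the table's entries; the names `FIGURE_ROUTING` and `classify_figure` are hypothetical and not part of the skill.

```python
# Required and recommended figure types per strategy, copied from the table above.
FIGURE_ROUTING = {
    "DiD":     {"required": ["parallel_trends", "event_study"],
                "recommended": ["coefplot"]},
    "RDD":     {"required": ["rdd_binscatter", "mccrary"],
                "recommended": ["coefplot"]},
    "IV":      {"required": ["iv_scatter"],
                "recommended": ["binscatter"]},
    "PanelFE": {"required": ["coefplot"],
                "recommended": ["parallel_trends", "binscatter"]},
    "SC":      {"required": ["sc_gap", "sc_placebo"],
                "recommended": ["parallel_trends"]},
    "OLS":     {"required": ["binscatter"],
                "recommended": ["coefplot", "density"]},
}

def classify_figure(strategy, figure_type):
    """Return 'required', 'recommended', or 'other' for a requested figure."""
    if strategy not in FIGURE_ROUTING:
        raise ValueError(f"Unknown strategy: {strategy}")
    for level in ("required", "recommended"):
        if figure_type in FIGURE_ROUTING[strategy][level]:
            return level
    return "other"

print(classify_figure("DiD", "event_study"))
print(classify_figure("RDD", "coefplot"))
```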
**Output convention (always dual-format):**
Every figure must be saved in two files:
- `figures/figNN_name.pdf` — vector, for journal submission (`\includegraphics{}`)
- `figures/figNN_name.png` — 300 DPI raster, for draft preview and slides
## Journal Requirements Summary
| Requirement | AER / QJE / ReStud Standard |
|-------------|----------------------------|
| File format | PDF (vector) or EPS; PNG at 300+ DPI for raster |
| Dimensions | Width: 3.4in (single column) or 7in (full page); height ≤ 9in |
| Font | Matching journal body font; minimum 8pt for labels |
| Colors | Must be readable in grayscale; avoid red-green pairs |
| Line width | ≥ 0.5pt for data lines; ≥ 0.75pt for axes |
| Legend | Inside plot area or below; no box border preferred |
| Notes | Figure notes below, starting with "Notes:" |
| Numbering | "Figure 1:", "Figure 2:" etc. in caption |
## Base Setup: Journal-Ready Defaults
### Python (matplotlib)
```python
# Python — journal-quality defaults
import matplotlib.pyplot as plt
import matplotlib as mpl
import numpy as np
# AER/QJE style defaults
plt.rcParams.update({
'figure.figsize': (7, 4.5),
'figure.dpi': 300,
'font.family': 'serif',
'font.serif': ['Times New Roman', 'Computer Modern Roman'],
'font.size': 11,
'axes.labelsize': 12,
'axes.titlesize': 13,
'xtick.labelsize': 10,
'ytick.labelsize': 10,
'legend.fontsize': 10,
'axes.linewidth': 0.8,
'lines.linewidth': 1.5,
'lines.markersize': 5,
'axes.spines.top': False,
'axes.spines.right': False,
'savefig.bbox': 'tight',
'savefig.pad_inches': 0.05,
})
# Grayscale-safe palette: differentiate series with LINESTYLE + MARKER first,
# color second. This ensures figures are readable when printed in black & white.
# Rule: never rely on color alone to distinguish two series.
COLORS = [
'#000000', # black — primary series / treated
'#555555', # dark gray — secondary series / control / synthetic
'#999999', # medium gray — tertiary series
'#cccccc', # light gray — reference / placebo / background
]
MARKERS = ['o', 's', '^', 'D'] # circle, square, triangle, diamond
LINESTYLES = ['-', '--', '-.', ':'] # solid, dashed, dash-dot, dotted
# Color for fills / shaded CI bands — always use low alpha so gray stays readable
FILL_ALPHA = 0.15
# For figures that genuinely benefit from color (slides, online appendix),
# use this color-enhanced palette as an explicit opt-in:
COLORS_COLOR = ['#2c3e50', '#2980b9', '#27ae60', '#e67e22'] # dark-blue-green-orange
```
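
Some plotting functions later in this skill call `save_fig(fig, fig_num, name)`, which is not defined in this section. A minimal sketch consistent with the dual-format output convention is shown below. The signature, the `out_dir` and `dpi` parameters, and the `figNN_name` stem are assumptions inferred from that convention rather than a confirmed implementation; the demo writes to a temporary directory instead of `figures/`.

```python
import os
import tempfile
import matplotlib
matplotlib.use("Agg")  # headless backend for scripted runs; omit in interactive sessions
import matplotlib.pyplot as plt

def save_fig(fig, fig_num, name, out_dir="figures", dpi=300):
    """Save `fig` as <out_dir>/figNN_name.pdf and <out_dir>/figNN_name.png."""
    os.makedirs(out_dir, exist_ok=True)
    stem = os.path.join(out_dir, f"fig{int(fig_num):02d}_{name}")
    fig.savefig(f"{stem}.pdf")           # vector, for journal submission
    fig.savefig(f"{stem}.png", dpi=dpi)  # raster, for draft preview and slides
    return f"{stem}.pdf", f"{stem}.png"

# Demo: write a trivial figure to a temporary directory.
demo_dir = tempfile.mkdtemp()
fig, ax = plt.subplots(figsize=(7, 4.5))
ax.plot([0, 1, 2], [0, 1, 4], color="black")
pdf_path, png_path = save_fig(fig, 3, "demo", out_dir=demo_dir)
plt.close(fig)
print(os.path.basename(pdf_path), os.path.basename(png_path))
```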
### R (ggplot2)
```r
# R — journal-quality ggplot2 theme
library(ggplot2)
library(scales)
theme_econ <- function(base_size = 11, base_family = "serif") {
theme_minimal(base_size = base_size, base_family = base_family) %+replace%
theme(
# Clean axes
panel.grid.major = element_line(color = "grey90", linewidth = 0.3),
panel.grid.minor = element_blank(),
axis.line = element_line(color = "black", linewidth = 0.5),
axis.ticks = element_line(color = "black", linewidth = 0.3),
# Text
axis.title = element_text(size = rel(1.1)),
axis.text = element_text(size = rel(0.9), color = "black"),
plot.title = element_text(size = rel(1.2), face = "bold", hjust = 0),
plot.subtitle = element_text(size = rel(0.95), color = "grey30"),
plot.caption = element_text(size = rel(0.8), hjust = 0, color = "grey40"),
# Legend
legend.position = "bottom",
legend.title = element_blank(),
legend.background = element_blank(),
legend.key = element_blank(),
# Margins
plot.margin = margin(10, 15, 10, 10)
)
}
# Grayscale-safe palette: differentiate with linetype + shape, color secondary
# Never rely on color alone to distinguish series (must survive B&W printing)
econ_gray <- c("#000000", "#555555", "#999999", "#cccccc")
econ_ltype <- c("solid", "dashed", "dotdash", "dotted")
econ_shape <- c(16, 15, 17, 18) # circle, square, triangle, diamond
# Color opt-in for slides / online appendix only:
econ_colors <- c("#2c3e50", "#2980b9", "#27ae60", "#e67e22")
# Export function
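save_econ_fig <- function(plot, filename, width = 7, height = 4.5) {
  # Vector PDF with embedded fonts, for journal submission
  ggsave(filename, plot, width = width, height = height, device = cairo_pdf)
  # 300 DPI PNG next to it, for draft preview and slides (dual-format convention)
  ggsave(paste0(tools::file_path_sans_ext(filename), ".png"), plot,
         width = width, height = height, dpi = 300)
}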
```
### Stata
```stata
* Stata — journal-quality graph scheme
set scheme s2color
graph set window fontface "Times New Roman"
* Global graph options for consistency
global graph_opts ///
graphregion(color(white) margin(small)) ///
plotregion(margin(medium)) ///
ylabel(, angle(horizontal) nogrid labsize(small)) ///
xlabel(, labsize(small)) ///
legend(region(lcolor(none)) size(small) rows(1) position(6))
* Export as PDF
graph export "figure.pdf", as(pdf) replace
```
## Common Econometric Figure Types
### 1. Event Study / Dynamic Treatment Effects
```python
# Python — event study plot
import pandas as pd
def plot_event_study(coefs, ses, periods, ref_period=-1,
                     treatment_time=0, title="Event Study",
                     ylabel="Coefficient Estimate",
                     fig_num=1, cluster_desc="entity level",
                     pretrend_pval=None):
    """
    Event study plot with dual confidence intervals (standard in AER/QJE).
    - 95% CI: thin errorbar lines (outer bound)
    - 90% CI: thick errorbar lines (inner bound, emphasizes economic significance)
    - Pre-period shaded in light gray to visually separate pre/post
    - Reference period (ref_period) plotted as hollow circle at zero
    """
    import numpy as np
    fig, ax = plt.subplots(figsize=(7, 4.5))
    coefs = np.array(coefs)
    ses = np.array(ses)
    periods = np.array(periods)
    # Mask out reference period (normalized to 0 by construction)
    mask = periods != ref_period
    # Dual CI bounds
    ci95_lo = coefs - 1.96 * ses
    ci95_hi = coefs + 1.96 * ses
    ci90_lo = coefs - 1.645 * ses
    ci90_hi = coefs + 1.645 * ses
    # Pre-treatment shading
    pre_mask = periods < treatment_time
    if pre_mask.any():
        ax.axvspan(periods[pre_mask].min() - 0.5,
                   treatment_time - 0.5,
                   alpha=0.06, color='grey', zorder=0)
    # 95% CI: thin outer lines
    ax.errorbar(periods[mask], coefs[mask],
                yerr=np.array([coefs[mask] - ci95_lo[mask],
                               ci95_hi[mask] - coefs[mask]]),
                fmt='none', color=COLORS[0], linewidth=0.8,
                capsize=3, capthick=0.8, zorder=3, label='95% CI')
    # 90% CI: thick inner lines
    ax.errorbar(periods[mask], coefs[mask],
                yerr=np.array([coefs[mask] - ci90_lo[mask],
                               ci90_hi[mask] - coefs[mask]]),
                fmt='none', color=COLORS[0], linewidth=2.0,
                capsize=0, zorder=4, label='90% CI')
    # Point estimates (filled) + reference period (hollow at zero)
    ax.plot(periods[mask], coefs[mask], 'o', color=COLORS[0],
            markersize=5, linewidth=0, zorder=5)
    ax.plot([ref_period], [0], 'o', color='white',
            markeredgecolor=COLORS[0], markersize=5, zorder=5)
    # Zero line and treatment cutoff
    ax.axhline(y=0, color='black', linewidth=0.5)
    ax.axvline(x=treatment_time - 0.5, color=COLORS[1],
               linewidth=0.8, linestyle='--')
    ax.text(treatment_time - 0.5, ax.get_ylim()[1],
            'Treatment', ha='right', va='top',
            fontsize=9, color=COLORS[1])
    # Pre-trend annotation
    if pretrend_pval is not None:
        ax.text(0.02, 0.97,
                f"Pre-trend joint test: p = {pretrend_pval:.3f}",
                transform=ax.transAxes, fontsize=9,
                va='top', ha='left', color='black')
    ax.set_xlabel("Periods Relative to Treatment")
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    ax.legend(frameon=False, fontsize=9, loc='lower right')
    notes = (f"Notes: 90% (thick) and 95% (thin) confidence intervals shown. "
             f"Standard errors clustered at the {cluster_desc}. "
             f"Reference period: t = {ref_period} (normalized to zero).")
    fig.text(0, -0.05, notes, ha='left', fontsize=8, color='#444444',
             wrap=True, transform=ax.transAxes)
    plt.tight_layout()
    save_fig(fig, fig_num, "event_study")
    return fig
```
```r
# R — event study with fixest
library(fixest)
es <- feols(y ~ i(rel_time, treat, ref = -1) | entity_id + year,
data = df, cluster = ~entity_id)
# fixest iplot (quick)
iplot(es, xlab = "Periods Relative to Treatment",
ylab = "Coefficient Estimate",
main = "Event Study: Dynamic Treatment Effects")
# ggplot2 version (full control)
library(broom)
library(dplyr)  # provides %>%, filter(), mutate(), bind_rows(), tibble()
es_df <- tidy(es, conf.int = TRUE) %>%
filter(grepl("rel_time", term)) %>%
mutate(period = as.numeric(gsub(".*::(-?\\d+):.*", "\\1", term)))
# Add reference period
es_df <- bind_rows(es_df,
tibble(period = -1, estimate = 0, conf.low = 0, conf.high = 0))
ggplot(es_df, aes(x = period, y = estimate)) +
geom_ribbon(aes(ymin = conf.low, ymax = conf.high),
fill = econ_colors[1], alpha = 0.15) +
geom_point(color = econ_colors[1], size = 2) +
geom_line(color = econ_colors[1], linewidth = 0.8) +
geom_hline(yintercept = 0, linewidth = 0.5) +
geom_vline(xintercept = -0.5, linetype = "dashed", color = "grey50") +
labs(x = "Periods Relative to Treatment",
y = "Coefficient Estimate",
title = "Event Study: Dynamic Treatment Effects",
caption = "Notes: 95% confidence intervals shown. Standard errors clustered at entity level.") +
theme_econ()
save_econ_fig(last_plot(), "event_study.pdf")
```
```stata
* Stata — event study plot
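* Stata factor variables cannot take negative values, so shift rel_time to start at 0
summarize rel_time, meanonly
local shift = -r(min)
gen rel_shift = rel_time + `shift'
local ref = `shift' - 1
* `ref' is the shifted value of the reference period t = -1
reghdfe y ib`ref'.rel_shift, absorb(entity_id year) cluster(entity_id)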
coefplot, vertical drop(_cons) ///
yline(0, lcolor(black) lwidth(thin)) ///
xline(4.5, lcolor(gs8) lpattern(dash)) ///
ciopts(recast(rcap) lcolor(navy) lwidth(thin)) ///
mcolor(navy) msymbol(circle) ///
ytitle("Coefficient Estimate") ///
xtitle("Periods Relative to Treatment") ///
title("Event Study: Dynamic Treatment Effects") ///
note("Notes: 95% CIs shown. SEs clustered at entity level.") ///
$graph_opts
graph export "event_study.pdf", as(pdf) replace
```
### 2. Coefficient Plot (Multiple Models)
```python
# Python — coefficient plot comparing specifications
def plot_coefplot(models, model_names, var_names, var_labels=None):
    fig, ax = plt.subplots(figsize=(7, 0.6 * len(var_names) + 1.5))
    var_labels = var_labels or var_names
    n_models = len(models)
    offsets = np.linspace(-0.15 * (n_models-1), 0.15 * (n_models-1), n_models)
    for j, (coefs, ses, name) in enumerate(zip(
            [m['coefs'] for m in models],
            [m['ses'] for m in models],
            model_names)):
        y_pos = np.arange(len(var_names)) + offsets[j]
        ax.errorbar(coefs, y_pos, xerr=1.96 * np.array(ses),
                    fmt='o', color=COLORS[j], markersize=5,
                    capsize=3, linewidth=1.2, label=name)
    ax.axvline(x=0, color='black', linewidth=0.5)
    ax.set_yticks(range(len(var_names)))
    ax.set_yticklabels(var_labels)
    ax.set_xlabel("Coefficient Estimate")
    ax.legend(loc='lower right', frameon=False)
    ax.invert_yaxis()
    plt.tight_layout()
    plt.savefig("coefplot.pdf")
```
```r
# R — coefficient plot with modelsummary/modelplot
library(modelsummary)
modelplot(list("OLS" = m1, "IV" = m2, "FE" = m3),
coef_map = c("x1" = "Treatment", "x2" = "Income", "x3" = "Education"),
color = "model") +
geom_vline(xintercept = 0, linetype = "dashed") +
labs(x = "Coefficient Estimate", y = "") +
scale_color_manual(values = econ_colors[1:3]) +
theme_econ()
```
### 3. Binned Scatter Plot (binscatter)
```python
# Python — binned scatter with linear fit
def plot_binscatter(x, y, n_bins=20, controls=None, xlabel="X", ylabel="Y",
                    title="", residualize=False):
    import statsmodels.api as sm
    if residualize and controls is not None:
        # Residualize both x and y on controls
        x = sm.OLS(x, sm.add_constant(controls)).fit().resid
        y = sm.OLS(y, sm.add_constant(controls)).fit().resid
    # Create bins
    bins = pd.qcut(x, n_bins, duplicates='drop')
    bin_means = (pd.DataFrame({'x': x, 'y': y, 'bin': bins})
                 .groupby('bin', observed=True).mean())
    fig, ax = plt.subplots(figsize=(7, 4.5))
    ax.scatter(bin_means['x'], bin_means['y'], color=COLORS[0],
               s=40, zorder=5, edgecolors='white', linewidth=0.5)
    # Linear fit through the bin means
    z = np.polyfit(bin_means['x'], bin_means['y'], 1)
    x_line = np.linspace(bin_means['x'].min(), bin_means['x'].max(), 100)
    ax.plot(x_line, np.polyval(z, x_line), color=COLORS[1],
            linewidth=1.2, linestyle='--')
    # Slope is from the fit on bin means; the p-value is from OLS on the unbinned data.
    # np.asarray() makes positional indexing work for both Series and array results.
    raw_fit = sm.OLS(y, sm.add_constant(x)).fit()
    slope, pval = z[0], np.asarray(raw_fit.pvalues)[1]
    ax.annotate(f"Slope = {slope:.3f} (p = {pval:.3f})",
                xy=(0.05, 0.95), xycoords='axes fraction',
                fontsize=10, va='top')
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    plt.tight_layout()
    plt.savefig("binscatter.pdf")
```
```r
# R — binscatter with binsreg (Cattaneo et al.)
library(binsreg)
binsreg(y = df$y, x = df$x, w = ~ x1 + x2, # controls
data = df, dots = c(0, 0), line = c(3, 3),
ci = c(3, 3), cb = c(3, 3),
title = "Binned Scatter: Y vs X",
x.label = "X Variable", y.label = "Y Variable")
# ggplot2 manual version
library(dplyr)
df_binned <- df %>%
mutate(bin = ntile(x, 20)) %>%
group_by(bin) %>%
summarize(x = mean(x), y = mean(y))
ggplot(df_binned, aes(x = x, y = y)) +
geom_point(color = econ_colors[1], size = 3) +
geom_smooth(method = "lm", se = FALSE, color = econ_colors[2],
linetype = "dashed", linewidth = 0.8) +
labs(x = "X Variable", y = "Y Variable") +
theme_econ()
```
```stata
* Stata — binscatter
ssc install binscatter
binscatter y x, controls(x1 x2) nquantiles(20) ///
lcolor(navy) mcolor(navy) ///
ytitle("Y Variable") xtitle("X Variable") ///
title("Binned Scatter: Y vs X") ///
$graph_opts
graph export "binscatter.pdf", as(pdf) replace
```
### 4. RDD Visualization
```python
# Python — RDD plot with local polynomial fit
def plot_rdd(running_var, outcome, cutoff, bandwidth=None,
             n_bins=20, title="Regression Discontinuity"):
    fig, ax = plt.subplots(figsize=(7, 4.5))
    # Binned means (separate for each side)
    for side, mask, color in [("Control", running_var < cutoff, COLORS[0]),
                              ("Treated", running_var >= cutoff, COLORS[1])]:
        x_side = running_var[mask]
        y_side = outcome[mask]
        bins = pd.qcut(x_side, min(n_bins, len(x_side)//5), duplicates='drop')
        bm = pd.DataFrame({'x': x_side, 'y': y_side, 'bin': bins}).groupby('bin').mean()
        ax.scatter(bm['x'], bm['y'], color=color, s=35, zorder=5,
                   edgecolors='white', linewidth=0.5)
        # Linear fit on each side using all observations on that side
        # (a simple visual aid, not a bandwidth-restricted local polynomial)
        z = np.polyfit(x_side, y_side, 1)
        x_fit = np.linspace(x_side.min(), x_side.max(), 200)
        ax.plot(x_fit, np.polyval(z, x_fit), color=color, linewidth=1.5)
    # Cutoff line
    ax.axvline(x=cutoff, color='grey', linewidth=1, linestyle='--')
    ax.set_xlabel("Running Variable")
    ax.set_ylabel("Outcome")
    ax.set_title(title)
    if bandwidth:
        ax.axvspan(cutoff - bandwidth, cutoff + bandwidth,
                   alpha=0.05, color='grey')
    plt.tight_layout()
    plt.savefig("rdd_plot.pdf")
```
```r
# R — RDD plot (rdrobust)
library(rdrobust)
rdplot(y = df$y, x = df$running_var, c = cutoff,
title = "Regression Discontinuity",
x.label = "Running Variable", y.label = "Outcome",
col.dots = econ_colors[1:2], col.lines = econ_colors[1:2])
# ggplot2 version
ggplot(df, aes(x = running_var, y = y)) +
geom_point(aes(color = running_var >= cutoff), alpha = 0.15, size = 1) +
geom_smooth(data = filter(df, running_var < cutoff),
method = "lm", formula = y ~ poly(x, 1),
color = econ_colors[1], fill = econ_colors[1], alpha = 0.1) +
geom_smooth(data = filter(df, running_var >= cutoff),
method = "lm", formula = y ~ poly(x, 1),
color = econ_colors[2], fill = econ_colors[2], alpha = 0.1) +
geom_vline(xintercept = cutoff, linetype = "dashed", color = "grey40") +
scale_color_manual(values = econ_colors[1:2], guide = "none") +
labs(x = "Running Variable", y = "Outcome",
title = "Regression Discontinuity Design") +
theme_econ()
```
```stata
* Stata — RDD plot
rdplot y running_var, c(cutoff) ///
graph_options(title("Regression Discontinuity") ///
ytitle("Outcome") xtitle("Running Variable") ///
$graph_opts)
graph export "rdd_plot.pdf", as(pdf) replace
```
### 5. Kernel Density / Distribution Comparison
```python
# Python — overlapping density plot
from scipy.stats import gaussian_kde
def plot_density(groups, labels, xlabel="Value", title=""):
    fig, ax = plt.subplots(figsize=(7, 4.5))
    for i, (data, label) in enumerate(zip(groups, labels)):
        kde = gaussian_kde(data, bw_method='silverman')
        x_grid = np.linspace(data.min() - data.std(), data.max() + data.std(), 300)
        ax.plot(x_grid, kde(x_grid), color=COLORS[i], linewidth=1.5, label=label)
        ax.fill_between(x_grid, kde(x_grid), alpha=0.1, color=COLORS[i])
    ax.set_xlabel(xlabel)
    ax.set_ylabel("Density")
    ax.set_title(title)
    ax.legend(frameon=False)
    plt.tight_layout()
    plt.savefig("density.pdf")
```
```r
# R — density comparison
ggplot(df, aes(x = outcome, fill = group, color = group)) +
geom_density(alpha = 0.15, linewidth = 0.8) +
scale_fill_manual(values = econ_colors[1:2]) +
scale_color_manual(values = econ_colors[1:2]) +
labs(x = "Outcome", y = "Density",
title = "Distribution by Group") +
theme_econ()
```
```stata
* Stata — density comparison
twoway (kdensity outcome if group == 0, lcolor(navy) lwidth(medthick)) ///
(kdensity outcome if group == 1, lcolor(cranberry) lwidth(medthick) ///
lpattern(dash)), ///
legend(label(1 "Control") label(2 "Treatment")) ///
ytitle("Density") xtitle("Outcome") ///
title("Distribution by Group") ///
$graph_opts
graph export "density.pdf", as(pdf) replace
```
### 6. Time Series / Trend Plot
```python
# Python — time series with shaded recession bars
def plot_timeseries(dates, series_dict, recessions=None,
                    ylabel="", title=""):
    fig, ax = plt.subplots(figsize=(7, 4.5))
    for i, (label, values) in enumerate(series_dict.items()):
        ax.plot(dates, values, color=COLORS[i], linewidth=1.5,
                linestyle=LINESTYLES[i], label=label)
    if recessions:
        for start, end in recessions:
            ax.axvspan(start, end, alpha=0.08, color='grey')
    ax.set_ylabel(ylabel)
    ax.set_title(title)
    ax.legend(frameon=False, loc='best')
    fig.autofmt_xdate()
    plt.tight_layout()
    plt.savefig("timeseries.pdf")
```
```r
# R — time series with recession shading
library(ggplot2)
ggplot(df, aes(x = date, y = value)) +
geom_rect(data = recessions,
aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf),
inherit.aes = FALSE, fill = "grey", alpha = 0.1) +
geom_line(aes(color = series, linetype = series), linewidth = 0.8) +
scale_color_manual(values = econ_colors) +
labs(x = "", y = "Value", title = "Time Series Comparison") +
theme_econ()
```
### 7. Synthetic Control Gap Plot
```python
# Python — treated vs synthetic control + gap
def plot_synth(years, treated, synthetic, treatment_year, title=""):
    fig, axes = plt.subplots(1, 2, figsize=(12, 4.5))
    # Panel A: Levels
    ax = axes[0]
    ax.plot(years, treated, color=COLORS[0], linewidth=1.5, label="Treated")
    ax.plot(years, synthetic, color=COLORS[1], linewidth=1.5,
            linestyle='--', label="Synthetic Control")
    ax.axvline(x=treatment_year, color='grey', linewidth=0.8, linestyle='--')
    ax.set_title("(a) Treated vs. Synthetic Control")
    ax.set_ylabel("Outcome")
    ax.legend(frameon=False)
    # Panel B: Gap
    ax = axes[1]
    gap = np.array(treated) - np.array(synthetic)
    ax.plot(years, gap, color=COLORS[0], linewidth=1.5)
    ax.fill_between(years, 0, gap, where=np.array(years) >= treatment_year,
                    alpha=0.15, color=COLORS[0])
    ax.axhline(y=0, color='black', linewidth=0.5)
    ax.axvline(x=treatment_year, color='grey', linewidth=0.8, linestyle='--')
    ax.set_title("(b) Treatment Effect (Gap)")
    ax.set_ylabel("Treated − Synthetic")
    for ax in axes:
        ax.set_xlabel("Year")
    plt.tight_layout()
    plt.savefig("synth_plot.pdf")
```
### 8. McCrary Density Test Plot (RDD — Manipulation Check)
The McCrary plot is a **mandatory** diagnostic for every RDD paper. It visualizes the density of the running variable around the cutoff and flags sorting/manipulation. Always pair with the formal `rddensity` test p-value.
```python
# Python — McCrary density plot using rddensity output
# The formal test is not computed here: obtain the p-value from an rddensity
# implementation and pass it in as rddensity_pval.
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import gaussian_kde
def plot_mccrary(running_var, cutoff, bandwidth=None,
                 n_bins=30, fig_num=2, rddensity_pval=None):
    """
    McCrary (2008) / Cattaneo et al. density continuity test plot.
    Single panel: histogram plus separate KDE fits on each side of the cutoff.
    Annotates with rddensity p-value if provided.
    """
    fig, ax = plt.subplots(figsize=(7, 4.5))
    rv = np.array(running_var)
    left = rv[rv < cutoff]
    right = rv[rv >= cutoff]
    # Histogram (same bin width, separate colors)
    bins = np.linspace(rv.min(), rv.max(), n_bins + 1)
    ax.hist(left, bins=bins, color=COLORS[1], alpha=0.35,
            linewidth=0.4, edgecolor='white', label='Below cutoff')
    ax.hist(right, bins=bins, color=COLORS[0], alpha=0.55,
            linewidth=0.4, edgecolor='white', label='Above cutoff')
    # KDE fits: separate for each side
    x_left = np.linspace(left.min(), cutoff, 200)
    x_right = np.linspace(cutoff, right.max(), 200)
    kde_l = gaussian_kde(left, bw_method='silverman')
    kde_r = gaussian_kde(right, bw_method='silverman')
    # Scale each KDE to its own histogram counts: each density integrates to 1
    # over its own side, so multiply by that side's sample size times the bin width
    bin_width = bins[1] - bins[0]
    ax.plot(x_left, kde_l(x_left) * len(left) * bin_width, color=COLORS[1],
            linewidth=1.8, linestyle='--')
    ax.plot(x_right, kde_r(x_right) * len(right) * bin_width, color=COLORS[0],
            linewidth=1.8, linestyle='-')
    # Cutoff line
    ax.axvline(x=cutoff, color='black', linewidth=1.0, linestyle='-')
    # Bandwidth shading (optional)
    if bandwidth is not None:
        ax.axvspan(cutoff - bandwidth, cutoff + bandwidth,
                   alpha=0.04, color='grey')
    # Annotation
    pval_text = (f"rddensity p-value = {rddensity_pval:.3f}"
                 if rddensity_pval is not None else "")
    interp = (" → No evidence of manipulation"
              if rddensity_pval is not None and rddensity_pval > 0.1
              else (" → Possible manipulation" if rddensity_pval is not None else ""))
    if pval_text:
        ax.text(0.98, 0.97, pval_text + interp,
                transform=ax.transAxes, fontsize=9,
                va='top', ha='right', color='black')
    ax.set_xlabel("Running Variable (centered at cutoff)")
    ax.set_ylabel("Count")
    ax.set_title("Density Continuity Test (McCrary)")
    ax.legend(frameon=False, fontsize=9)
    notes = ("Notes: Histogram of running variable around the cutoff. "
             "Dashed (solid) line shows kernel density estimate below (above) cutoff. "
             "Cattaneo et al. (2018) rddensity test for density discontinuity.")
    fig.text(0, -0.05, notes, ha='left', fontsize=8, color='#444444',
             wrap=True, transform=ax.transAxes)
    plt.tight_layout()
    save_fig(fig, fig_num, "mccrary_density")
    return fig
```
```r
# R — McCrary plot using rddensity
library(rddensity)  # also provides rdplotdensity(); there is no separate package to load
rdd_test <- rddensity(X = df$running_var, c = cutoff)
summary(rdd_test) # p-value for density discontinuity
rdplotdensity(rdd_test, df$running_var,
xlabel = "Running Variable (centered at cutoff)",
ylabel = "Density",
title = "Density Continuity Test (McCrary)")
```
```stata
* Stata — McCrary test (DCdensity) + rddensity
ssc install rddensity
ssc install lpdensity
* Visual + test
rddensity running_var, c(cutoff) plot ///
graph_options(title("Density Continuity Test") ///
xtitle("Running Variable") ytitle("Density") $graph_opts)
graph export "figures/fig02_mccrary.pdf", as(pdf) replace
```
### 9. IV First-Stage and Exclusion Restriction Scatter
Two side-by-side scatter plots are standard in IV papers: (a) Z → D (first stage, which must show a strong relationship) and (b) Z → Y (reduced form). The exclusion restriction, that Z affects Y only through D, cannot be tested directly, so panel (b) is a plausibility check rather than proof.
```python
# Python — IV diagnostic scatter (2-panel)
def plot_iv_scatter(Z, D, Y, Z_label="Instrument (Z)",
                    D_label="Treatment (D)", Y_label="Outcome (Y)",
                    fig_num=3, fstat=None):
    """
    Panel (a): first stage, Z vs D binned scatter + OLS fit line + F-stat annotation
    Panel (b): reduced form, Z vs Y binned scatter + OLS fit line
    If |corr(Z,Y)| > |corr(Z,D)| × 0.8, annotate panel (b) with a red warning.
    """
    import statsmodels.api as sm
    from scipy import stats
    Z, D, Y = np.array(Z), np.array(D), np.array(Y)
    fig, axes = plt.subplots(1, 2, figsize=(12, 4.5))
    for ax, x, y, xlabel, ylabel, title_suffix in [
        (axes[0], Z, D, Z_label, D_label, "First Stage (Z → D)"),
        (axes[1], Z, Y, Z_label, Y_label, "Reduced Form (Z → Y)"),
    ]:
        # Binned scatter (30 quantile bins) for readability
        bins = pd.qcut(x, 30, duplicates='drop')
        bm = (pd.DataFrame({'x': x, 'y': y, 'bin': bins})
              .groupby('bin', observed=True).mean())
        ax.scatter(bm['x'], bm['y'], color=COLORS[0], s=30, zorder=5,
                   edgecolors='white', linewidth=0.4)
        # OLS fit line on the raw (unbinned) data; no CI band is drawn here
        fit = sm.OLS(y, sm.add_constant(x)).fit()
        x_line = np.linspace(x.min(), x.max(), 200)
        y_hat = fit.params[0] + fit.params[1] * x_line
        ax.plot(x_line, y_hat, color=COLORS[0], linewidth=1.5)
        r, p = stats.pearsonr(x, y)
        ax.text(0.05, 0.95, f"ρ = {r:.3f} (p = {p:.3f})",
                transform=ax.transAxes, fontsize=9, va='top')
        ax.set_xlabel(xlabel)
        ax.set_ylabel(ylabel)
        ax.set_title(f"({chr(97 + list(axes).index(ax))}) {title_suffix}")
    # First-stage F-stat annotation (red when below the F = 10 rule of thumb)
    if fstat is not None:
        color = 'black' if fstat >= 10 else 'red'
        axes[0].text(0.05, 0.82, f"First-stage F = {fstat:.1f}",
                     transform=axes[0].transAxes, fontsize=9,
                     va='top', color=color)
    # Exclusion restriction warning (heuristic only, not a formal test)
    r_ZD = abs(np.corrcoef(Z, D)[0, 1])
    r_ZY = abs(np.corrcoef(Z, Y)[0, 1])
    if r_ZY > r_ZD * 0.8:
        axes[1].text(0.05, 0.72,
                     "⚠ |corr(Z,Y)| > 0.8×|corr(Z,D)|\nExclusion may be weak",
                     transform=axes[1].transAxes, fontsize=8,
                     va='top', color='red')
    notes = ("Notes: Binned scatter plots (30 bins). Panel (a) shows the first-stage "
             "relationship between the instrument and treatment. Panel (b) shows the "
             "reduced-form relationship; large correlation relative to Panel (a) may "
             "indicate violation of the exclusion restriction.")
    fig.text(0, -0.06, notes, ha='left', fontsize=8, color='#444444',
             wrap=True, transform=axes[0].transAxes)
    plt.tight_layout()
    save_fig(fig, fig_num, "iv_scatter")
    return fig
```
```r
# R — IV scatter (2-panel with ggplot2 + patchwork)
library(ggplot2); library(patchwork); library(dplyr)
bin_scatter <- function(data, x_var, y_var, n_bins = 30) {
data %>%
mutate(bin = ntile(.data[[x_var]], n_bins)) %>%
group_by(bin) %>%
summarise(x = mean(.data[[x_var]]), y = mean(.data[[y_var]]))
}
p_first <- ggplot(bin_scatter(df, "Z", "D"), aes(x, y)) +
geom_point(size = 2, color = econ_gray[1]) +
geom_smooth(method = "lm", se = TRUE, color = econ_gray[1],
fill = econ_gray[3], alpha = 0.15, linewidth = 0.9) +
labs(x = "Instrument (Z)", y = "Treatment (D)",
title = "(a) First Stage: Z → D") + theme_econ()
p_reduced <- ggplot(bin_scatter(df, "Z", "Y"), aes(x, y)) +
geom_point(size = 2, color = econ_gray[2]) +
geom_smooth(method = "lm", se = TRUE, color = econ_gray[2],
fill = econ_gray[3], alpha = 0.15, linewidth = 0.9,
linetype = "dashed") +
labs(x = "Instrument (Z)", y = "Outcome (Y)",
title = "(b) Reduced Form: Z → Y") + theme_econ()
combined <- p_first | p_reduced
save_econ_fig(combined, fig_num = 3, name = "iv_scatter", width = 12, height = 4.5)
```
### 10. Synthetic Control Placebo Plot
The placebo (permutation) plot is **mandatory** for SC papers. It runs the same SC procedure for every donor unit, plots all resulting gaps, and shows whether the treated unit's post-treatment gap is extreme relative to the placebo distribution.
```python
# Python — SC placebo (in-space permutation) plot
def plot_sc_placebo(years, treated_gap, placebo_gaps,
                    treatment_year, mspe_ratio_threshold=5,
                    fig_num=4):
    """
    Placebo inference plot for synthetic control.
    - Gray lines: donor unit gaps (all placebo runs)
    - Optionally exclude high pre-MSPE placebos (ratio > threshold)
    - Black solid line: treated unit gap
    - Post-treatment p-value = rank of treated gap / total gaps
    """
    years = np.array(years)
    treated_gap = np.array(treated_gap)
    # MSPE filter: drop placebos with pre-period MSPE >> treated unit
    pre_mask = years < treatment_year
    treated_pre_mspe = np.mean(treated_gap[pre_mask] ** 2)
    filtered_gaps = []
    for gap in placebo_gaps:
        gap = np.array(gap)
        donor_pre_mspe = np.mean(gap[pre_mask] ** 2)
        if donor_pre_mspe <= mspe_ratio_threshold * treated_pre_mspe:
            filtered_gaps.append(gap)
    fig, ax = plt.subplots(figsize=(7, 4.5))
    # Donor gaps (light gray)
    for gap in filtered_gaps:
        ax.plot(years, gap, color=COLORS[2], linewidth=0.6,
                alpha=0.5, zorder=1)
    # Treated gap (black, thick)
    ax.plot(years, treated_gap, color=COLORS[0],
            linewidth=2.0, zorder=5, label='Treated unit')
    # Zero line and treatment cutoff
    ax.axhline(y=0, color='black', linewidth=0.5)
    ax.axvline(x=treatment_year, color=COLORS[1],
               linewidth=0.8, linestyle='--')
    # Post-treatment p-value (rank of treated among treated + retained placebos)
    post_mask = years >= treatment_year
    treated_post = np.mean(np.abs(treated_gap[post_mask]))
    donor_posts = [np.mean(np.abs(g[post_mask])) for g in filtered_gaps]
    rank = sum(d >= treated_post for d in donor_posts) + 1
    pval = rank / (len(filtered_gaps) + 1)
    ax.text(0.98, 0.97,
            f"Permutation p = {pval:.3f} (rank {rank}/{len(filtered_gaps)+1})",
            transform=ax.transAxes, fontsize=9,
            va='top', ha='right', color='black')
    ax.set_xlabel("Year")
    ax.set_ylabel("Gap (Treated − Synthetic)")
    ax.set_title("Placebo Test: In-Space Permutation")
    ax.legend(frameon=False, fontsize=9)
    notes = (f"Notes: Each gray line shows the gap between a donor unit and its "
             f"synthetic control (placebo runs). Donor units with pre-treatment MSPE "
             f"more than {mspe_ratio_threshold}× the treated unit are excluded "
             f"({len(placebo_gaps) - len(filtered_gaps)} units removed). "
             f"Permutation p-value = fraction of placebos with post-treatment gap "
             f"≥ treated unit.")
    fig.text(0, -0.06, notes, ha='left', fontsize=8, color='#444444',
             wrap=True, transform=ax.transAxes)
    plt.tight_layout()
    save_fig(fig, fig_num, "sc_placebo")
    return fig
```
```r
# R — SC placebo plot with ggplot2
library(ggplot2); library(dplyr); library(tidyr)
# placebo_df: columns = year, unit_id, gap
# treated_df: columns = year, gap
ggplot() +
# Donor gaps
geom_line(data = placebo_df,
aes(x = year, y = gap, group = unit_id),
color = econ_gray[3], linewidth = 0.5, alpha = 0.5) +
# Treated gap
geom_line(data = treated_df,
aes(x = year, y = gap),
color = econ_gray[1], linewidth = 1.8) +
geom_hline(yintercept = 0, linewidth = 0.5) +
geom_vline(xintercept = treatment_year,
linetype = "dashed", color = econ_gray[2]) +
labs(x = "Year", y = "Gap (Treated − Synthetic)",
title = "Placebo Test: In-Space Permutation",
caption = paste0("Notes: Gray lines = donor placebo gaps. ",
"Black = treated unit. Permutation p = ", pval, ".")) +
theme_econ()
save_econ_fig(last_plot(), fig_num = 4, name = "sc_placebo")
```
### 11. Multi-Panel Figures
```python
# Python — multi-panel layout
fig, axes = plt.subplots(2, 2, figsize=(12, 9))
# `datasets` (list of DataFrames with columns x, y) and `panel_titles` are assumed to exist
for i, (ax, data, title) in enumerate(zip(
        axes.flat, datasets, panel_titles)):
    ax.scatter(data['x'], data['y'], color=COLORS[0], s=20, alpha=0.5)
    ax.set_title(f"({chr(97+i)}) {title}", fontsize=11)
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
plt.tight_layout()
plt.savefig("multipanel.pdf")
```
```r
# R — multi-panel with patchwork
library(patchwork)
p1 <- ggplot(df, aes(x, y1)) + geom_point(size = 1, alpha = 0.3) +
labs(title = "(a) Panel A") + theme_econ()
p2 <- ggplot(df, aes(x, y2)) + geom_point(size = 1, alpha = 0.3) +
labs(title = "(b) Panel B") + theme_econ()
p3 <- ggplot(df, aes(x, y3)) + geom_line() +
labs(title = "(c) Panel C") + theme_econ()
p4 <- ggplot(df, aes(x, y4)) + geom_line() +
labs(title = "(d) Panel D") + theme_econ()
combined <- (p1 | p2) / (p3 | p4)
save_econ_fig(combined, "multipanel.pdf", width = 12, height = 9)
```
```stata
* Stata — multi-panel with graph combine
graph combine panel_a panel_b panel_c panel_d, ///
rows(2) cols(2) ///
title("Figure 1: Main Results") ///
$graph_opts
graph export "multipanel.pdf", as(pdf) replace
```
## Standardized Figure Notes Format
Every figure **must** include a Notes line directly below the figure. Notes supply the information a reader needs to evaluate the figure without consulting the text. Use this template:
```
Notes: [What the figure shows — figure type and main variable]. [Sample description:
unit, time period, N]. [CI level and SE clustering]. [Key methodological detail
(bandwidth, bin count, reference period, MSPE filter, etc.)].
[Any data transformation or restriction].
```
**Figure-type specific templates:**
| Figure | Required Notes Content |
|---|---|
| Event study | CI level (90%/95%), SE cluster unit, reference period, pre-trend p-value |
| Parallel trends | Group definitions (treated/control), sample period, smoothing method if any |
| RDD binscatter | Bin count, bandwidth (if restricted), polynomial order, cutoff value |
| McCrary density | Test statistic name (rddensity), p-value, bandwidth |
| IV scatter | Bin count, first-stage F-stat, exclusion restriction caveat |
| SC gap | Donor pool size, weight method (Abadie et al.), pre-period RMSPE |
| SC placebo | Number of placebos, MSPE filter threshold, permutation p-value formula |
| Coefplot | SE type, specs included, omitted baseline |
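
The template can also be assembled programmatically so that no component is forgotten. The following is only a minimal sketch: `build_figure_notes` and its parameter names are hypothetical and not part of the plugin, and the example strings are illustrative.

```python
def build_figure_notes(what, sample, inference="", method_detail="", restrictions=""):
    """Join the template's components into one 'Notes: ...' string.

    Empty components are skipped; each kept component ends with one period.
    """
    parts = [what, sample, inference, method_detail, restrictions]
    sentences = [p.strip().rstrip(".") + "." for p in parts if p and p.strip()]
    return "Notes: " + " ".join(sentences)


notes = build_figure_notes(
    what="Event-study coefficients for log wages",
    sample="Firm-year panel, 2000-2015",
    inference="95% confidence intervals; standard errors clustered by firm",
    method_detail="Reference period t = -1",
)
print(notes)
```

The resulting string can be passed to `fig.text(...)` in the same way as the hand-written `notes` variables in the plotting functions.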
**LaTeX figure environment with Notes:**
```latex
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{figures/fig03_event_study}
\caption{Event Study: Effect of Policy X on Outcome Y}
\label{fig:event_study}
\begin{minipage}{\textwidth}
\footnotesize
\textit{Notes:} 90\% (thick) and 95\% (thin) confidence intervals shown.
Standard errors clustered at the firm level. Reference period: $t = -1$
(normalized to zero). Pre-trend joint test: $p = 0.42$.
Sample: manufacturing firms, 2000--2015 ($N = 8{,}320$ firm-year observations).
\end{minipage}
\end{figure}
```
## Formatting Checklist
Before submitting to a journal, verify:
- [ ] **Vector format**: Exported as PDF (not PNG/JPG) for line plots
- [ ] **Readable in grayscale**: Print in B&W to check
- [ ] **Font consistency**: Same font family as paper body text
- [ ] **Axis labels**: Descriptive, with units (e.g., "Income (1000 USD)")
- [ ] **No chartjunk**: Remove gridlines, borders, and unnecessary decoration
- [ ] **Proper aspect ratio**: Not stretched or compressed
- [ ] **Legend placement**: Inside plot if space allows; below otherwise
- [ ] **Panel labels**: (a), (b), (c) for multi-panel figures
- [ ] **Notes below figure**: Data source, sample, key definitions
- [ ] **CI/SE shown**: For any estimated quantities (coefficients, treatment effects)
- [ ] **Reference lines**: Zero line for coefficient plots; cutoff for RDD; treatment date for event study
## Common Pitfalls
- **Using default matplotlib/ggplot themes**: They look unprofessional — always customize
- **Raster exports for line plots**: Use PDF/EPS, not PNG, for any plot with lines or text
- **Too many colors**: Limit to 3–4 distinguishable colors; use linestyle for additional series
- **Tiny axis labels**: Minimum 8pt after scaling to final size in the paper
- **Missing confidence intervals**: Never show point estimates without uncertainty
- **3D plots**: Almost never appropriate in economics — use 2D alternatives
- **Pie charts**: Never use in academic economics papers
## Output File Management
Consistent figure naming and directory layout prevents the most common assembly problem: the paper's `\includegraphics{}` calls pointing to files that don't exist or are scattered across subdirectories (`data/bartik/`, `data/results/`, `output/`, etc.).
### Standard Convention
Save **all** figures to a single `figures/` directory at the project root. Use zero-padded sequential prefixes:
```
figures/
fig01_aaei_distribution.pdf
fig02_parallel_trends.pdf
fig03_event_study_wages.pdf
fig04_event_study_employment.pdf
fig05_heterogeneity_wage_group.pdf
...
```
The numeric prefix (`fig01_`, `fig02_`, ...) makes insertion order explicit and survives alphabetical sorting. The descriptive suffix means you can identify the figure without opening it.
### Python Helper
Add this to any figure-generating script to enforce the convention:
```python
import os, matplotlib.pyplot as plt
FIGURES_DIR = "figures"
os.makedirs(FIGURES_DIR, exist_ok=True)
def save_fig(fig, fig_num: int, name: str, formats=("pdf", "png")):
    """
    Save to figures/figNN_name.{pdf,png} with consistent naming.
    Default is dual-format: PDF (vector, submission) + PNG 300 DPI (draft).
    Pass formats=("pdf",) only if PNG is explicitly not needed.
    """
    paths = []
    for fmt in formats:
        path = os.path.join(FIGURES_DIR, f"fig{fig_num:02d}_{name}.{fmt}")
        dpi = 300 if fmt == "png" else None  # PDF is vector; dpi irrelevant
        fig.savefig(path, bbox_inches='tight', dpi=dpi)
        print(f"Saved: {path}")
        paths.append(path)
    return paths
# Usage:
fig, ax = plt.subplots(figsize=(7, 4.5))
# ... plotting code ...
save_fig(fig, fig_num=3, name="event_study_wages")
# → figures/fig03_event_study_wages.pdf (vector, for paper)
# → figures/fig03_event_study_wages.png (300 DPI, for draft / slides)
```
### R Helper
```r
FIGURES_DIR <- "figures"
dir.create(FIGURES_DIR, showWarnings = FALSE, recursive = TRUE)
save_econ_fig <- function(plot, fig_num, name, width = 7, height = 4.5) {
# Always dual-format: PDF (vector, submission) + PNG 300 DPI (draft/slides)
pdf_path <- file.path(FIGURES_DIR, sprintf("fig%02d_%s.pdf", fig_num, name))
png_path <- file.path(FIGURES_DIR, sprintf("fig%02d_%s.png", fig_num, name))
ggsave(pdf_path, plot, width = width, height = height, device = cairo_pdf)
ggsave(png_path, plot, width = width, height = height, dpi = 300, device = "png")
cat("Saved:", pdf_path, "\n")
cat("Saved:", png_path, "\n")
invisible(list(pdf = pdf_path, png = png_path))
}
# Usage:
p <- ggplot(...) + theme_econ()
save_econ_fig(p, fig_num = 3, name = "event_study_wages")
# → figures/fig03_event_study_wages.pdf (vector, for paper)
# → figures/fig03_event_study_wages.png (300 DPI, for draft / slides)
```
### In the LaTeX Paper
```latex
% In preamble — point to figures/ once:
\graphicspath{{../figures/}}
% In the paper body — no path needed in each call:
\begin{figure}[htbp]
\centering
\includegraphics[width=0.9\textwidth]{fig03_event_study_wages}
\caption{Event Study: Effect of AI Exposure on Log Wages}
\label{fig:event_study}
\end{figure}
```
When multiple scripts generate figures (e.g., main analysis, robustness, heterogeneity), add a comment block at the top of each script listing which figure numbers it produces. This prevents two scripts from overwriting the same file.
iv-estimation
# Instrumental Variables & Treatment Effects Skill
This skill covers IV/2SLS estimation and propensity score matching (PSM) for causal inference when treatment is endogenous. It helps identify valid instruments, run 2SLS, test instrument validity, and implement PSM.
## When to Use IV vs PSM
| Method | Use When |
|--------|----------|
| **IV / 2SLS** | Treatment is endogenous; a valid instrument exists |
| **PSM** | Selection on observables assumption is credible; rich covariate data |
| **OLS + controls** | Selection on observables, limited instruments |
## IV / 2SLS Framework
### Conditions for a Valid Instrument Z for endogenous X
1. **Relevance**: Cov(Z, X) ≠ 0 — Z must be correlated with the endogenous regressor
2. **Exclusion restriction**: Cov(Z, ε) = 0 — Z affects Y only through X (cannot be tested directly)
3. **Independence**: Z is as-good-as-randomly assigned (exogenous)
### Two-Stage Least Squares Procedure
**Stage 1**: Regress endogenous X on instruments Z and exogenous controls W
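- X = γ₀ + γ₁Z + γ₂W + v, with fitted values X̂ = γ̂₀ + γ̂₁Z + γ̂₂W
- Check that the first-stage F-statistic exceeds 10 (Staiger-Stock rule of thumb); ideally it should exceed 16.38, the Stock-Yogo critical value for 10% maximal size with one endogenous regressor and one instrument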
**Stage 2**: Regress Y on predicted X̂ and controls W
- Y = β₀ + β₁X̂ + β₂W + ε
- SE must be corrected for the two-stage estimation (done automatically by software)
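
To make the two stages concrete, here is a minimal pure-Python sketch on simulated data with one instrument and no controls. The data-generating process (true β = 2, first-stage slope 0.8, a confounder `u`) is an assumption made up for illustration. It shows only point estimates and the first-stage F; standard errors from a hand-rolled second stage would be wrong, so use the packaged estimators for inference.

```python
import random

def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n=5000, beta=2.0, seed=0):
    """Simulated data: u confounds X and Y; Z shifts X but not Y directly."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)
        u = rng.gauss(0, 1)                       # unobserved confounder
        xi = 0.8 * zi + u + rng.gauss(0, 1)       # first stage (assumed slope 0.8)
        yi = beta * xi + 2.0 * u + rng.gauss(0, 1)
        z.append(zi); x.append(xi); y.append(yi)
    return z, x, y

z, x, y = simulate()
n = len(z)

# Stage 1: regress X on Z, keep fitted values, compute F (= t^2 with one instrument)
gamma1 = cov(z, x) / cov(z, z)
gamma0 = mean(x) - gamma1 * mean(z)
x_hat = [gamma0 + gamma1 * zi for zi in z]
resid1 = [xi - xh for xi, xh in zip(x, x_hat)]
sigma2_v = sum(r * r for r in resid1) / (n - 2)
se_gamma1 = (sigma2_v / (cov(z, z) * (n - 1))) ** 0.5
first_stage_F = (gamma1 / se_gamma1) ** 2

# Stage 2: regress Y on fitted X-hat; compare with naive OLS of Y on X
beta_2sls = cov(x_hat, y) / cov(x_hat, x_hat)
beta_ols = cov(x, y) / cov(x, x)

print(f"first-stage F = {first_stage_F:.0f}")
print(f"OLS = {beta_ols:.2f}, 2SLS = {beta_2sls:.2f} (true beta = 2.00)")
```

In this simulation OLS is biased upward because `u` raises both X and Y, while the 2SLS estimate should land close to the true coefficient.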
### Quick Code Templates
```python
# Python (linearmodels)
from linearmodels.iv import IV2SLS
# Formula: dependent ~ exogenous [endogenous ~ instruments]
model = IV2SLS.from_formula(
'y ~ 1 + w1 + w2 + [x_endog ~ z1 + z2]', data=df
)
result = model.fit(cov_type='robust')
print(result.summary)
# First-stage diagnostics
print(result.first_stage.diagnostics)
# Check: partial F-stat, Shea partial R²
```
```r
# R (AER)
library(AER)
iv_model <- ivreg(y ~ x_endog + w1 + w2 | z1 + z2 + w1 + w2, data = df)
summary(iv_model, diagnostics = TRUE)
# Shows: weak instruments F-test, Wu-Hausman endogeneity test, Sargan overID test
```
```stata
* Stata
ivregress 2sls y w1 w2 (x_endog = z1 z2), robust first
estat firststage // First-stage diagnostics
estat endogenous // Wu-Hausman test
estat overid // Sargan-Hansen overidentification test
```
## Key Diagnostic Tests
| Test | Null Hypothesis | Interpretation |
|------|-----------------|----------------|
| **First-stage F-stat** | Instruments have no explanatory power for X in the first stage | F > 10 → instruments are relevant (rule of thumb) |
| **Wu-Hausman** | X is exogenous (OLS consistent) | p < 0.05 → endogeneity detected, prefer IV |
| **Sargan-Hansen** | All instruments valid (overID only) | p > 0.05 → no evidence against instrument validity |
| **Anderson-Rubin** | β = β₀ (test stays valid with weak instruments) | Use for inference when the F-stat is borderline |
## Propensity Score Matching (PSM)
### Assumptions
1. **Conditional independence** (unconfoundedness): Treatment T ⊥ Y(0), Y(1) | X
2. **Common support** (overlap): 0 < P(T=1|X) < 1 for all X
### PSM Procedure
```python
# Python
from sklearn.linear_model import LogisticRegression
import numpy as np
import pandas as pd  # needed for pd.concat in Step 3
# Step 1: Estimate propensity scores
lr = LogisticRegression(max_iter=1000)
lr.fit(df[covariates], df['treatment'])
df['pscore'] = lr.predict_proba(df[covariates])[:, 1]
# Step 2: Check common support
import matplotlib.pyplot as plt
df.groupby('treatment')['pscore'].plot.hist(alpha=0.5, bins=30)
# Step 3: Match (nearest neighbor, 1:1 WITH replacement: the same control can be matched to several treated units)
treated = df[df['treatment'] == 1].copy()
control = df[df['treatment'] == 0].copy()
from sklearn.neighbors import NearestNeighbors
nn = NearestNeighbors(n_neighbors=1)
nn.fit(control[['pscore']])
distances, indices = nn.kneighbors(treated[['pscore']])
matched_control = control.iloc[indices.flatten()].copy()
matched_df = pd.concat([treated, matched_control])
# Step 4: Estimate ATT
att = matched_df.groupby('treatment')['y'].mean().diff().iloc[-1]
print(f"ATT: {att:.4f}")
```
```r
# R (MatchIt)
library(MatchIt)
library(lmtest)    # coeftest()
library(sandwich)  # vcovCL()
match_out <- matchit(treatment ~ x1 + x2 + x3, data = df,
method = "nearest", ratio = 1, replace = FALSE)
summary(match_out)
# Covariate balance
plot(match_out, type = "jitter")
plot(summary(match_out))
# Estimate ATT
matched_data <- match.data(match_out)
att_model <- lm(y ~ treatment, data = matched_data, weights = weights)
coeftest(att_model, vcov = vcovCL(att_model, ~subclass))
```
```stata
* Stata (psmatch2 from SSC)
psmatch2 treatment x1 x2 x3, outcome(y) neighbor(1) common
pstest x1 x2 x3
```
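
A `NearestNeighbors` query on the control pool lets the same control be picked for several treated units. If matching without replacement is wanted in Python, one option is a greedy pass like the sketch below. This is an illustrative, assumption-laden example: `match_without_replacement` is a hypothetical helper, the caliper of 0.05 is arbitrary, and greedy matching depends on the order of the treated units.

```python
def match_without_replacement(treated_ps, control_ps, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on the propensity score.

    Returns a list of (treated_index, control_index) pairs. Each control
    is used at most once; treated units with no unused control within
    `caliper` stay unmatched.
    """
    available = set(range(len(control_ps)))
    pairs = []
    for t_idx, ps in enumerate(treated_ps):
        if not available:
            break
        best = min(available, key=lambda c: abs(control_ps[c] - ps))
        if abs(control_ps[best] - ps) <= caliper:
            pairs.append((t_idx, best))
            available.remove(best)
    return pairs


treated_ps = [0.62, 0.60, 0.90]   # made-up propensity scores
control_ps = [0.61, 0.58, 0.30]
pairs = match_without_replacement(treated_ps, control_ps)
print(pairs)
```

The third treated unit has no control within the caliper, so it is left unmatched, which is also a simple way to enforce common support.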
## Reporting IV Results
1. **Always show first-stage results** with F-statistic
2. **Report OLS alongside IV** to illustrate endogeneity bias direction
3. **State the exclusion restriction** argument explicitly — this cannot be statistically tested
4. **Interpret LATE not ATE**: IV estimates are local to compliers (those induced by instrument)
5. **Overidentification test**: report Sargan p-value when instruments > endogenous regressors
For weak-instrument robust inference (Anderson-Rubin confidence sets, LIML), control function approach, shift-share (Bartik) instruments, judge/examiner designs, and sensitivity analysis for PSM, see `references/iv-reference.md`.
## Common Pitfalls
- **Using 2SLS with weak instruments without robust inference**: When F < 10, use LIML or Anderson-Rubin confidence sets instead of 2SLS
- **Not arguing for exclusion restriction**: The exclusion restriction cannot be tested statistically — you must make a convincing argument
- **Confusing LATE with ATE**: IV estimates the local average treatment effect for compliers, not the population average
- **Clustering SE at the wrong level in Bartik IV**: With shift-share instruments, inference should account for the exposure shares structure
- **Over-identifying without caution**: Adding more instruments improves efficiency but only if all are valid — a significant Sargan test means at least one instrument is invalid
- **Using PSM without checking common support**: If treated and control propensity score distributions barely overlap, matching is unreliable
literature-review
Search, summarize, and synthesize economics literature. Find research gaps and position your contribution.
# Literature-Review
## Purpose
This skill helps economists conduct rigorous literature reviews: actively searching academic databases, building a reference list, summarizing individual papers, synthesizing views and evidence across the literature, identifying research gaps, and positioning the user's contribution.
## When to Use
- Finish Phase 1 of the workflow (Research Question Scoping) and hand off to Phase 2
- Start a literature review for a new project
- Draft the Related Literature section of a paper
- Find prior work to cite in an introduction
- Respond to a referee's request to engage more with the literature
---
## Step 0: OpenAlex API Key Setup
**Before starting any search, Claude must run this check every time the skill is invoked.**
### 0.1 Read the Config File
Read the file `[plugin_root]/.env`.
The file should contain:
```
OPENALEX_API_KEY=your-api-key-here
```
### 0.2 Decision Tree
**If the key is absent / still a placeholder:**
Use AskUserQuestion to prompt the user:
> *"Please follow these steps to get an OpenAlex API key, then paste it here:*
> *① Go to [https://openalex.org/settings/api](https://openalex.org/settings/api) → sign up / log in*
> *② Generate an API key on the account settings page → Copy*
> *③ Paste the API key here"*
- **If the user provides a key:** Append `OPENALEX_API_KEY=<value>` to `[plugin_root]/.env` (standard `KEY=VALUE` format, one entry per line, **not JSON**). If the key already exists in the file, replace that line in place. Confirm: *"✅ API key saved; you will not be asked again."* Then proceed.
- **If the user skips:** Warn: *"⚠️ No API key configured. OpenAlex requests will be heavily rate-limited and search results may be incomplete."* Then continue (omit `&api_key=...` from all URLs).
**If the key is already a valid non-placeholder string:** proceed silently — no prompt needed.
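
The read, detect-placeholder, and replace-or-append logic of this decision tree can be sketched as follows. This is a hypothetical illustration, not the plugin's implementation: `read_env_key`, `upsert_env_key`, and the `PLACEHOLDERS` set are assumed names, and the demo writes to a temporary file rather than a real `.env`.

```python
import os
import tempfile

PLACEHOLDERS = {"", "your-api-key-here"}  # values treated as "not configured"

def read_env_key(path, name):
    """Return the value for `name` in a KEY=VALUE file, or None if absent or placeholder."""
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, sep, value = line.strip().partition("=")
            if sep and key == name:
                value = value.strip()
                return None if value in PLACEHOLDERS else value
    return None

def upsert_env_key(path, name, value):
    """Replace the `name=` line in place if present; otherwise append it."""
    lines = []
    if os.path.exists(path):
        with open(path, encoding="utf-8") as f:
            lines = f.read().splitlines()
    new_line = f"{name}={value}"
    for i, line in enumerate(lines):
        if line.partition("=")[0].strip() == name:
            lines[i] = new_line
            break
    else:
        lines.append(new_line)
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")

# Demo on a temporary file so no real .env is touched
demo_path = os.path.join(tempfile.mkdtemp(), ".env")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("OTHER=1\nOPENALEX_API_KEY=your-api-key-here\n")

before = read_env_key(demo_path, "OPENALEX_API_KEY")   # placeholder counts as missing
upsert_env_key(demo_path, "OPENALEX_API_KEY", "abc123")
after = read_env_key(demo_path, "OPENALEX_API_KEY")
print(before, after)
```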
---
## Step 1: Confirm the Research Question
This skill is **Phase 2** of the empirical research workflow. The confirmed research question from Phase 1 is the essential input. Claude must follow this decision tree:
### 1.1 Check for a Confirmed Research Question
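Scan the working directory for the `research-question.md` document produced by the `question` skill. If it exists, it contains a block like:
```
Research question: …
Study population: …
Identification level: Level [A/B/C]
Data source (preliminary): …
Hypotheses: H₀ / H₁ / expected direction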
```
Then ask **only the two supplementary questions**:
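> *"The literature search needs two more pieces of information:*
> *① Time range: all years / after 2000 / the last ten years?*
> *② Are there 2–3 seminal papers (authors + title, or DOI) to use as search anchors? If not, feel free to skip."*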
Once the user answers (or skips ②), proceed directly to **Step 2**.
### 1.2 Back to /question command
**If no research-question.md document is found** (user jumped directly to literature review without going through Phase 1):
Run the `/question` command immediately:
> *"Before searching the literature, the research question needs to be pinned down. I will run the /question command to help you complete this step."*
Run the full `/question` workflow. After the user confirms the research question, return to this skill at **Step 1.1** (the block will now be present) and continue.
---
## Step 2: Generate a Comprehensive Keyword Matrix
Before searching, expand the user's topic into a **full keyword matrix**. Do not rely on a single phrase.
### 2.1 Generate Four Layers × Multiple Synonyms
| Layer | Primary terms | Synonyms / variants |
|-------|--------------|---------------------|
| **Core concept** | e.g. "minimum wage" | "wage floor", "wage policy", "statutory wage" |
| **Outcome** | e.g. "employment" | "jobs", "labor demand", "hours worked", "unemployment" |
| **Method** | e.g. "difference-in-differences" | "DiD", "diff-in-diff", "natural experiment", "quasi-experiment" |
| **Context** | e.g. "United States" | "US", "OECD", "developing countries", "low-income countries" |
### 2.2 Generate JEL Code Candidates
Map the research question to **3–6 relevant JEL codes** (e.g., J31 = Wage Level and Structure; J23 = Labor Demand; C21 = Cross-Sectional Models). These could be used in EconLit and NBER searches.
### 2.3 Construct Boolean Query Variants
Generate **at least 5 distinct query strings** covering different angles:
```
Query 1 (main): ("minimum wage" OR "wage floor") AND (employment OR "labor demand")
Query 2 (method): ("minimum wage") AND ("difference-in-differences" OR "DiD" OR "natural experiment")
Query 3 (mechanism): ("minimum wage") AND (prices OR "profit margins" OR "labor productivity")
Query 4 (subgroup): ("minimum wage") AND ("low-skilled" OR "teen employment" OR "small business")
Query 5 (recent): ("minimum wage") AND (employment) after:2018
```
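
One way to turn the keyword matrix into Boolean strings mechanically is sketched below. The helper names `or_group` and `build_queries` and the dictionary layout are assumptions for illustration; the terms are the minimum-wage examples from the table.

```python
def or_group(terms):
    """Quote multi-word terms and join them with OR inside parentheses."""
    quoted = [f'"{t}"' if " " in t else t for t in terms]
    return "(" + " OR ".join(quoted) + ")"

def build_queries(matrix):
    """Combine the core-concept group with every other layer using AND."""
    core = or_group(matrix["core"])
    return {
        layer: f"{core} AND {or_group(terms)}"
        for layer, terms in matrix.items()
        if layer != "core"
    }

matrix = {
    "core": ["minimum wage", "wage floor"],
    "outcome": ["employment", "labor demand"],
    "method": ["difference-in-differences", "DiD", "natural experiment"],
}
queries = build_queries(matrix)
for layer, q in queries.items():
    print(f"{layer}: {q}")
```

The `outcome` entry reproduces Query 1; adding a `context` or mechanism layer to `matrix` yields further variants without retyping the core terms.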
---
## Step 3: Execute Multi-Round Active Search
Claude must **actively execute these searches** autonomously.
Primary search engine: OpenAlex API via WebFetch.
Supplementary & optional sources: Semantic Scholar, NBER, SSRN and arXiv.
Search rules:
1. Use primary search engine for all rounds.
2. Use supplementary sources only for the following purposes:
- More Coverage: OpenAlex results are clearly thin or missing a literature branch
- Working paper capture: recent NBER / SSRN papers are likely relevant
3. If OpenAlex coverage is already strong and balanced, supplementary search may be skipped entirely.
### **OpenAlex API**
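All OpenAlex requests use the base URL `https://api.openalex.org`. Append `&api_key=YOUR_KEY` (from `.env`) to every request. Without a key, requests are heavily rate-limited but still workable for moderate searches.
#### ⚠️ Handling Restricted Network Access (when WebFetch is blocked in the Cowork sandbox)
**If the first OpenAlex WebFetch request returns a network error (connection timeout, 403, DNS resolution failure, etc.), follow the steps below strictly in order. Do not jump straight to the web-search fallback:**
**Step A: Immediately generate a local script and ask the user to run it on their own machine**
Tell the user:
> *"The Cowork sandbox cannot reach the OpenAlex API (network restricted). Please run the following Python script in your local terminal to fetch the literature data, then paste the output here or save it to a file and upload it."*
>
> **How to run:**
> ```bash
> pip install requests
> python fetch_openalex.py
> ```

Then generate the complete local script `fetch_openalex.py` (written to the workspace):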
```python
"""
OpenAlex literature fetch script: run in a local terminal
Output file: openalex_results.json
"""
import requests, json, time
API_KEY = "YOUR_OPENALEX_API_KEY"  # replace with your API key (or set to "" to run without one)
BASE = "https://api.openalex.org"
HEADERS = {"User-Agent": "mailto:your@email.com"}
def fetch(url):
    if API_KEY:
        url += f"&api_key={API_KEY}" if "?" in url else f"?api_key={API_KEY}"
    r = requests.get(url, headers=HEADERS, timeout=30)
    r.raise_for_status()
    time.sleep(0.5)  # avoid hitting the rate limit
    return r.json()
queries = [
    # ── Queries filled in from the research question (Claude replaces the placeholders when generating the script) ──
    f"{BASE}/works?search=KEYWORD_1&filter=type:article&sort=cited_by_count:desc&per_page=25",
    f"{BASE}/works?search=KEYWORD_2&filter=type:article&sort=cited_by_count:desc&per_page=25",
    f"{BASE}/works?search=KEYWORD_3&filter=type:article,publication_year:>2018&sort=cited_by_count:desc&per_page=25",
]
results = {}
for i, url in enumerate(queries, 1):
    print(f"Query {i}/{len(queries)}: {url[:80]}...")
    try:
        data = fetch(url)
        results[f"query_{i}"] = data.get("results", [])
        print(f" → fetched {len(results[f'query_{i}'])} works")
    except Exception as e:
        print(f" ✗ failed: {e}")
        results[f"query_{i}"] = []
with open("openalex_results.json", "w", encoding="utf-8") as f:
    json.dump(results, f, ensure_ascii=False, indent=2)
print("\n✅ Results saved to openalex_results.json")
print(f" Total fetched: {sum(len(v) for v in results.values())} work records")
```
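> **The placeholders `KEYWORD_1/2/3` in the script are replaced by Claude, at generation time, with the actual keyword query strings from Step 2.**
**Step B: Wait for the user to upload or paste the results**
Wait for the user to run the script and return the results. Accept any of the following formats:
- Upload the `openalex_results.json` file
- Paste the JSON text into the conversation
- Paste the plain-text terminal output
Once the data arrives, parse it as usual and continue with the Round 1 candidate screening.
**Step C: Enable the fallback only when the user explicitly says they cannot run the script locally**
Only if the user replies "cannot run it" or "no Python environment" may the following fallbacks be used (in priority order):
1. Search Google Scholar / academic search via `WebSearch` (degraded, limited coverage)
2. Ask the user to open `https://openalex.org/works?search=...` in a browser manually and paste the page content
**⛔ Strictly prohibited:** Switching to the web-search fallback before asking the user to try running the script locally.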
**Important — Two-step author lookup:** Never filter by author *name* directly (names are ambiguous). Always first search for the author's OpenAlex ID, then filter works by that ID.
**Pattern 1 — Works keyword search**
```
https://api.openalex.org/works?search=minimum+wage+employment+difference-in-differences&filter=type:article&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
```
Key `filter` modifiers (combine with comma = AND):
- `type:article` — journal articles only
- `publication_year:2015-2024` — year range
- `publication_year:>2018` — after a year
- `is_oa:true` — open access only
- `topics.id:T10325` — by OpenAlex topic ID
- `authorships.institutions.country_code:US` — by country
Key `sort` options:
- `cited_by_count:desc` — most cited first
- `publication_year:desc` — most recent first
- `relevance_score:desc` — best text match first (default when `search=` used)
**Pattern 2 — Author ID lookup (step 1 of 2)**
```
https://api.openalex.org/authors?search=David+Card&per_page=5&api_key=YOUR_KEY
```
Parse the response: pick the top result's `id` field (e.g., `https://openalex.org/A5023888391`). Extract the short ID: `A5023888391`.
**Pattern 3 — Works by a specific author (step 2 of 2)**
```
https://api.openalex.org/works?filter=authorships.author.id:A5023888391&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
```
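
Patterns 2 and 3 can be chained as in the sketch below. The helper names are hypothetical, the `response` dictionary is a hand-written stand-in for the parsed JSON of a Pattern 2 call, and no network request is made here.

```python
from urllib.parse import urlencode

BASE = "https://api.openalex.org"

def author_search_url(name, api_key=None, per_page=5):
    """Step 1: URL that searches authors by name."""
    params = {"search": name, "per_page": per_page}
    if api_key:
        params["api_key"] = api_key
    return f"{BASE}/authors?{urlencode(params)}"

def short_id(openalex_url):
    """'https://openalex.org/A5023888391' -> 'A5023888391'."""
    return openalex_url.rstrip("/").rsplit("/", 1)[-1]

def works_by_author_url(author_id, api_key=None, per_page=25):
    """Step 2: URL that lists works filtered by the author's OpenAlex ID."""
    params = {
        "filter": f"authorships.author.id:{author_id}",
        "sort": "cited_by_count:desc",
        "per_page": per_page,
    }
    if api_key:
        params["api_key"] = api_key
    return f"{BASE}/works?{urlencode(params, safe=':')}"

# Stand-in for the parsed response of the step-1 request
response = {"results": [{"id": "https://openalex.org/A5023888391",
                         "display_name": "David Card"}]}

step1 = author_search_url("David Card")
author_id = short_id(response["results"][0]["id"])
step2 = works_by_author_url(author_id)
print(step1)
print(step2)
```

Before trusting the top hit, it is worth checking `display_name` (and affiliation, if returned) against the intended author, since common names return several candidates.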
**Pattern 4 — Retrieve a single work by DOI or OpenAlex ID**
```
https://api.openalex.org/works/doi:10.2307/2118030?api_key=YOUR_KEY
https://api.openalex.org/works/W2170494700?api_key=YOUR_KEY
```
**Pattern 5 — Forward citations (papers that cite a given work)**
```
https://api.openalex.org/works?filter=cites:W2170494700&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
```
**Pattern 6 — Backward references (papers cited by a given work)**
Retrieve the work directly (Pattern 4) and read its `referenced_works` array. Then batch-fetch metadata:
```
https://api.openalex.org/works?filter=openalex_id:W111|W222|W333&per_page=50&api_key=YOUR_KEY
```
(Use pipe `|` to OR up to 100 IDs in one request.)
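
When a work cites more papers than fit in one request, the `referenced_works` list has to be split into batches. A minimal sketch, assuming a hypothetical helper `batch_reference_urls` and a default batch size of 50 so that each batch fits on a single page with `per_page=50`:

```python
def batch_reference_urls(referenced_works, batch_size=50):
    """Build one Pattern 6 URL per batch of referenced work IDs.

    `referenced_works` holds full OpenAlex URLs such as
    'https://openalex.org/W2170494700'; only the short ID is sent.
    """
    ids = [w.rstrip("/").rsplit("/", 1)[-1] for w in referenced_works]
    urls = []
    for start in range(0, len(ids), batch_size):
        chunk = ids[start:start + batch_size]
        urls.append(
            "https://api.openalex.org/works?filter=openalex_id:"
            + "|".join(chunk)
            + f"&per_page={batch_size}"
        )
    return urls


# 120 made-up IDs for illustration
refs = [f"https://openalex.org/W{i}" for i in range(1, 121)]
urls = batch_reference_urls(refs)
print(len(urls))
print(urls[-1][:90])
```

Append `&api_key=YOUR_KEY` to each URL as with the other patterns.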
**Pattern 7 — Search with Boolean operators**
```
https://api.openalex.org/works?search=(minimum+wage+OR+wage+floor)+AND+(employment+OR+labor+demand)&filter=type:article,publication_year:2000-2024&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
```
Use `AND`, `OR`, `NOT` (uppercase).
---
### **Round 1: Broad Sweep (Target: 25+ candidate papers)**
#### Primary Source: OpenAlex API
Run **at least 4–5 distinct queries**. For each query variant:
1. **Keyword search** (Pattern 1) with `sort=cited_by_count:desc` → capture top 25 results
2. **Author lookup** (Patterns 2+3) for 2–3 key authors identified in results → capture their top works
3. **Combined filter** adding `publication_year:>2018` → capture recent papers
Example sequence:
```
# Query 1: Main search
https://api.openalex.org/works?search=minimum+wage+employment&filter=type:article&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
# Query 2: Method-focused
https://api.openalex.org/works?search=minimum+wage+"difference-in-differences"&filter=type:article&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
# Query 3: Mechanism
https://api.openalex.org/works?search=minimum+wage+price+pass-through+profit+mechanism&filter=type:article&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
# Query 4: Recent papers
https://api.openalex.org/works?search=minimum+wage+employment&filter=type:article,publication_year:>2018&sort=cited_by_count:desc&per_page=25&api_key=YOUR_KEY
# Query 5: Author lookup → Card
https://api.openalex.org/authors?search=David+Card&per_page=3&api_key=YOUR_KEY
# → get author ID, then:
https://api.openalex.org/works?filter=authorships.author.id:AUTHOR_ID&sort=cited_by_count:desc&per_page=20&api_key=YOUR_KEY
```
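
Because the same paper often appears in several queries, the result lists should be merged and de-duplicated before counting toward the 25+ target. A minimal sketch, assuming each item is a work object from the `results` array with its `id` and `cited_by_count` fields; the helper name and the sample records are made up:

```python
def merge_candidates(result_lists):
    """Merge several OpenAlex `results` lists, keeping one entry per work ID."""
    seen = {}
    for results in result_lists:
        for work in results:
            seen.setdefault(work["id"], work)  # first occurrence wins
    return sorted(seen.values(),
                  key=lambda w: w.get("cited_by_count", 0), reverse=True)


query_1 = [
    {"id": "https://openalex.org/W1", "display_name": "Paper A", "cited_by_count": 120},
    {"id": "https://openalex.org/W2", "display_name": "Paper B", "cited_by_count": 45},
]
query_2 = [
    {"id": "https://openalex.org/W2", "display_name": "Paper B", "cited_by_count": 45},
    {"id": "https://openalex.org/W3", "display_name": "Paper C", "cited_by_count": 300},
]
candidates = merge_candidates([query_1, query_2])
print([w["display_name"] for w in candidates])
```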
#### Supplementary Sources
**Semantic Scholar (for more coverage)**
**API Key Check (one-time per session):**
Read `.env` and look for `SEMANTIC_SCHOLAR_API_KEY`.
- **If the key is absent or still a placeholder:** Use `AskUserQuestion` to prompt:
> *"Semantic Scholar can supplement OpenAlex (to catch missed papers). Configuring an API key gives a higher request rate:*
> *① Go to [https://www.semanticscholar.org/product/api](https://www.semanticscholar.org/product/api) → request an API key (or search Xianyu for "Semantic Scholar API" and buy one cheaply)*
> *② Paste the key here; if you have no key yet, press Enter to skip (the public endpoint will be used, with rate limits)."*
- If user provides key: append `SEMANTIC_SCHOLAR_API_KEY=<value>` to `.env` (standard `KEY=VALUE` format, **not JSON**; replace existing line if present). Set flag `SS_AUTH = True`. Confirm: *"✅ Semantic Scholar API key saved."*
- If user skips: set flag `SS_AUTH = False`. Note: *"⚠️ The Semantic Scholar public endpoint will be used; the request rate is limited."*
- **If a valid key already exists in `.env`:** Set `SS_AUTH = True`, proceed silently.
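The key check and the "replace existing line if present" rule can be sketched as two small text helpers. This is an illustration under stated assumptions: the function names, the set of placeholder values, and the `OPENALEX_API_KEY` line in the sample are hypothetical, and the helpers operate on the file's text so that reading and writing `.env` stays with the caller.

```python
# Values treated as "still a placeholder" (an assumption; extend as needed)
PLACEHOLDERS = {"", "YOUR_KEY", "YOUR_SEMANTIC_SCHOLAR_API_KEY"}

def read_env_key(env_text, name):
    """Return the value of `name` in KEY=VALUE text, or None if absent or a placeholder."""
    for line in env_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == name:
            value = value.strip().strip('"').strip("'")
            return None if value in PLACEHOLDERS else value
    return None

def upsert_env_key(env_text, name, value):
    """Replace an existing `name=` line, or append one if none exists."""
    lines = env_text.splitlines()
    for i, line in enumerate(lines):
        if line.partition("=")[0].strip() == name:
            lines[i] = f"{name}={value}"
            break
    else:
        lines.append(f"{name}={value}")
    return "\n".join(lines) + "\n"

# Sample .env content with a placeholder key (illustrative values only)
env_text = "OPENALEX_API_KEY=abc123\nSEMANTIC_SCHOLAR_API_KEY=YOUR_KEY\n"
SS_AUTH = read_env_key(env_text, "SEMANTIC_SCHOLAR_API_KEY") is not None
updated = upsert_env_key(env_text, "SEMANTIC_SCHOLAR_API_KEY", "s2-demo-key")
print(SS_AUTH)
print(updated)
```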
**Branch A — With API Key (`SS_AUTH = True`): use Python via Bash**
Authentication is via HTTP header `x-api-key`, which WebFetch cannot set. Use `requests` in a Bash script instead.
```python
import requests, json, time
SS_KEY = "YOUR_SEMANTIC_SCHOLAR_API_KEY" # loaded from .env
headers = {"x-api-key": SS_KEY}
BASE = "https://api.semanticscholar.org/graph/v1"
FIELDS = "title,year,authors,abstract,citationCount,externalIds,openAccessPdf,venue,publicationTypes"
# --- Task 1: Keyword search (cross-validation & coverage) ---
params = {
    "query": "minimum wage employment difference-in-differences",  # adapt to topic
    "fields": FIELDS,
    "year": "2000-2024",
    "sort": "citationCount:desc",  # field:order format; alternative: "publicationDate:desc"
    "limit": 100,  # max per call; use "token" field in response for pagination
}
r = requests.get(f"{BASE}/paper/search/bulk", headers=headers, params=params)
results = r.json() # keys: "total", "data" (list), "token" (next page)
papers = results.get("data", [])
# Pagination: if results["token"] exists and more papers needed:
# params["token"] = results["token"]; repeat the call
time.sleep(1) # respect 1 req/sec rate limit with key
# --- Task 2: Cross-validate a specific paper by DOI ---
doi = "10.2307/2118030" # replace with actual DOI from OpenAlex results
r2 = requests.get(
f"{BASE}/paper/DOI:{doi}",
headers=headers,
params={"fields": FIELDS},
)
paper = r2.json() # single paper object
time.sleep(1)
# --- Task 3: Cross-validate by Semantic Scholar paper ID ---
# paperId can be found in search results or via DOI lookup
ss_id = "paper_id_here"
r3 = requests.get(
f"{BASE}/paper/{ss_id}",
headers=headers,
params={"fields": FIELDS},
)
```
**Key response fields:**
| Field | Description |
|-------|-------------|
| `paperId` | Semantic Scholar internal ID |
| `externalIds.DOI` | DOI — use to match against OpenAlex entries |
| `externalIds.ArXiv` | arXiv ID (if preprint available) |
| `citationCount` | Total citations in Semantic Scholar's index |
| `abstract` | Full abstract text (no reconstruction needed, unlike OpenAlex) |
| `venue` | Journal or conference name |
| `openAccessPdf.url` | Direct PDF link if open access |
| `publicationTypes` | e.g., `["JournalArticle"]`, `["Preprint"]` |
---
**Branch B — Without API Key (`SS_AUTH = False`): use WebFetch**
The public Semantic Scholar API works without authentication but shares a global rate limit. Use WebFetch for individual calls and space them out.
```
# Keyword search (rate-limited)
https://api.semanticscholar.org/graph/v1/paper/search/bulk?query=minimum+wage+employment+difference-in-differences&fields=title,year,authors,abstract,citationCount,externalIds,venue&sort=citationCount:desc
# Cross-validate a specific paper by DOI
https://api.semanticscholar.org/graph/v1/paper/DOI:10.2307/2118030?fields=title,year,authors,citationCount,abstract,externalIds,venue
# Cross-validate by arXiv ID
https://api.semanticscholar.org/graph/v1/paper/ARXIV:2301.05345?fields=title,year,authors,citationCount,abstract,externalIds
```
> ⚠️ **Rate limit discipline (no key):** Issue at most **5–8 WebFetch calls** to Semantic Scholar per round. Space calls at least 3–5 seconds apart. If a `429` status is returned, pause and retry after 30 seconds.
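The pause-and-retry rule for `429` responses could be wrapped in a small helper like the one below. It is a sketch only: `fetch` is a hypothetical callable returning `(status, body)` (for example a thin wrapper around whatever HTTP tool is in use), and the `sleep` parameter exists so the wait can be stubbed out in a dry run.

```python
import time

def fetch_with_backoff(fetch, url, max_retries=3, retry_wait=30.0, sleep=time.sleep):
    """Call fetch(url) -> (status, body); on HTTP 429, wait retry_wait seconds and retry."""
    for attempt in range(max_retries + 1):
        status, body = fetch(url)
        if status != 429:
            return status, body
        if attempt < max_retries:
            sleep(retry_wait)
    return status, body  # still rate-limited after all retries

# Dry run with a fake fetcher: first call is rate-limited, second succeeds
responses = iter([(429, ""), (200, '{"data": []}')])
waits = []  # records the waits instead of actually sleeping
result = fetch_with_backoff(
    lambda url: next(responses),
    "https://api.semanticscholar.org/graph/v1/paper/search/bulk?query=minimum+wage",
    sleep=waits.append,
)
print(result, waits)
```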
---
**What to do with Semantic Scholar results in Round 1:**
1. **Coverage gap check** — run 1–2 keyword searches using queries that diverge from the OpenAlex ones (e.g., different phrasing). Flag any paper in the top 20 Semantic Scholar results that has a citation count > 20 and is *absent* from the OpenAlex candidate list.
2. **Abstract retrieval** — if OpenAlex `abstract_inverted_index` is missing for a Core candidate, use Semantic Scholar to retrieve the full plain-text abstract directly.
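A coverage gap check of this kind reduces to a DOI comparison between the two result sets. The sketch below assumes OpenAlex records carry `doi` as a full `https://doi.org/...` URL and Semantic Scholar records carry `externalIds.DOI` as a bare DOI; the function names and the sample records are hypothetical.

```python
def normalize_doi(doi):
    """Lower-case a DOI and strip resolver prefixes so both sources compare equal."""
    if not doi:
        return None
    doi = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/", "doi:"):
        if doi.startswith(prefix):
            doi = doi[len(prefix):]
    return doi

def coverage_gaps(ss_papers, openalex_works, top_n=20, min_citations=20):
    """Return top-N Semantic Scholar papers with > min_citations that OpenAlex missed."""
    known = {normalize_doi(work.get("doi")) for work in openalex_works}
    known.discard(None)
    gaps = []
    for paper in ss_papers[:top_n]:
        doi = normalize_doi((paper.get("externalIds") or {}).get("DOI"))
        # Papers without a DOI cannot be matched and are flagged for manual checking
        if (paper.get("citationCount") or 0) > min_citations and doi not in known:
            gaps.append(paper)
    return gaps

# Illustrative records only (titles and the example DOIs for B and C are made up)
openalex_works = [{"doi": "https://doi.org/10.2307/2118030"}]
ss_papers = [
    {"title": "A", "citationCount": 500, "externalIds": {"DOI": "10.2307/2118030"}},
    {"title": "B", "citationCount": 45, "externalIds": {"DOI": "10.1000/example.b"}},
    {"title": "C", "citationCount": 3, "externalIds": {"DOI": "10.1000/example.c"}},
]
gaps = coverage_gaps(ss_papers, openalex_works)
print([paper["title"] for paper in gaps])
```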
**arXiv (for latest economics preprints and working papers)**
**What to do with arXiv results in Round 1:**
1. **Recency capture** — collect papers from the last 12 months not yet published in journals. These often represent the methodological frontier.
2. **Preprint–publication link** — for papers already in OpenAlex, check `externalIds.ArXiv` (via Semantic Scholar) or search arXiv ID to confirm if the working paper version has additional material.
**Search patterns:**
```
# Economics preprints — keyword search, sorted by newest first
https://export.arxiv.org/api/query?search_query=all:minimum+wage+employment&start=0&max_results=25&sortBy=submittedDate&sortOrder=descending
# Restrict to econ.* categories (econ.EM Econometrics, econ.GN General Economics, econ.TH Theoretical Economics)
https://export.arxiv.org/api/query?search_query=cat:econ.*+AND+all:minimum+wage+employment&start=0&max_results=25&sortBy=submittedDate&sortOrder=descending
# Retrieve most-relevant results (default relevance ranking)
https://export.arxiv.org/api/query?search_query=all:minimum+wage+"difference-in-differences"&start=0&max_results=25
# Fetch a specific paper by arXiv ID
https://export.arxiv.org/api/query?id_list=2301.05345
```
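The arXiv API returns an Atom XML feed rather than JSON, so results need a parsing step before they can join the candidate list. The following sketch uses only the standard library; the function name and the sample feed (including its ID, title, and authors) are hypothetical placeholders, and only the standard Atom elements `id`, `title`, `published`, and `author/name` are read.

```python
import xml.etree.ElementTree as ET

ATOM = {"atom": "http://www.w3.org/2005/Atom"}

def parse_arxiv_feed(xml_text):
    """Extract ID, title, date, and authors from an arXiv Atom feed."""
    root = ET.fromstring(xml_text)
    papers = []
    for entry in root.findall("atom:entry", ATOM):
        entry_id = entry.findtext("atom:id", default="", namespaces=ATOM)
        title = entry.findtext("atom:title", default="", namespaces=ATOM)
        published = entry.findtext("atom:published", default="", namespaces=ATOM)
        papers.append({
            "arxiv_id": entry_id.rsplit("/abs/", 1)[-1],
            "title": " ".join(title.split()),  # collapse line breaks inside titles
            "published": published[:10],       # keep YYYY-MM-DD
            "authors": [a.findtext("atom:name", default="", namespaces=ATOM)
                        for a in entry.findall("atom:author", ATOM)],
        })
    return papers

# Hand-written sample feed with placeholder metadata (not a real paper)
sample_feed = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/1234.56789v1</id>
    <published>2023-01-13T00:00:00Z</published>
    <title>An illustrative
      preprint title</title>
    <author><name>A. Author</name></author>
    <author><name>B. Author</name></author>
  </entry>
</feed>"""
papers = parse_arxiv_feed(sample_feed)
print(papers)
```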
---
**NBER (for latest working papers)**
**What to do with NBER results in Round 1:**
- Recency capture: collect NBER working papers from the last 12 months that are absent from OpenAlex.
NBER is one of the most influential working paper series in economics. It does not have a structured public search API, so use **WebSearch** for discovery and **WebFetch** for individual paper pages.
**Step 1 — Discover via WebSearch:**
```
site:nber.org/papers "minimum wage" employment 2023 2024
site:nber.org/papers "minimum wage" inequality recent
```
**Step 2 — Fetch individual paper metadata via WebFetch:**
```
# NBER paper page (contains title, authors, abstract, publication date, JEL codes)
https://www.nber.org/papers/w31234
# NBER also exposes a JSON metadata endpoint — faster than HTML parsing:
https://www.nber.org/api/v1/working_page_listing/contentType/working_paper/_id/w31234
```
---
**SSRN (for latest working papers)**
**What to do with SSRN results in Round 1:**
- Recency capture: collect SSRN working papers from the last 12 months that are absent from OpenAlex.
- Law-and-economics coverage: prioritize SSRN when the topic sits at the intersection of law and economics.
SSRN is a preprint repository especially strong in law, finance, and applied economics. It does not offer a structured public API; use **WebSearch** as the primary discovery method.
**Step 1 — Discover via WebSearch:**
```
site:papers.ssrn.com "minimum wage" employment effects 2023 2024
site:papers.ssrn.com "minimum wage" causal inference difference-in-differences
site:papers.ssrn.com "minimum wage" inequality labour market 2024
```
**Step 2 — Fetch individual paper metadata via WebFetch:**
```
# SSRN abstract page — contains title, authors, abstract, date, download count
https://papers.ssrn.com/sol3/papers.cfm?abstract_id=4567890
```
> ⚠️ SSRN pages can be slow or return partial content. If WebFetch returns incomplete HTML, limit to 2–3 SSRN fetches per round and rely on WebSearch snippets for basic metadata.
---
Record every returned paper in a running list: Author(s), Year, Title, Source, OpenAlex ID (if available), Citation Count.
---
### Round 2: Citation Network Expansion (Target: +5–10 papers)
All searches in this round use **OpenAlex only**.
#### Seed Paper Selection
Do **not** use citation count alone. Select 5 seed papers as follows:
- **Top 3** from the Round 1 candidate list ranked by `cited_by_count` — these are the seminal works whose citation network is most productive to explore.
- **Top 2** most recently published candidates from Round 1 — these anchor the frontier and surface newer work not yet highly cited.
If fewer than 5 papers exist in Round 1, use all of them.
For each seed paper, crawl its citation network using OpenAlex.
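The seed selection rule can be expressed as a short function. This is a sketch under assumptions: candidates are dicts with `id`, `cited_by_count`, and `publication_year` keys, the two "most recent" seeds are drawn from papers not already chosen by citation count, and ties in year are broken by citation count (the tie-break is not specified in the rule itself).

```python
def select_seeds(candidates, n_cited=3, n_recent=2):
    """Pick the top-cited papers plus the most recent remaining ones as seeds."""
    if len(candidates) <= n_cited + n_recent:
        return list(candidates)  # too few candidates: use all of them
    by_citations = sorted(candidates, key=lambda p: p["cited_by_count"], reverse=True)
    seeds = by_citations[:n_cited]
    chosen = {p["id"] for p in seeds}
    remaining = [p for p in candidates if p["id"] not in chosen]
    # Most recent first; citation count breaks ties within a year (assumption)
    by_recency = sorted(remaining,
                        key=lambda p: (p["publication_year"], p["cited_by_count"]),
                        reverse=True)
    return seeds + by_recency[:n_recent]

# Illustrative candidate list (IDs and counts are made up)
candidates = [
    {"id": "W1", "publication_year": 1994, "cited_by_count": 8000},
    {"id": "W2", "publication_year": 2000, "cited_by_count": 1200},
    {"id": "W3", "publication_year": 2010, "cited_by_count": 900},
    {"id": "W4", "publication_year": 2019, "cited_by_count": 300},
    {"id": "W5", "publication_year": 2023, "cited_by_count": 12},
    {"id": "W6", "publication_year": 2024, "cited_by_count": 4},
    {"id": "W7", "publication_year": 2015, "cited_by_count": 150},
]
seed_ids = [p["id"] for p in select_seeds(candidates)]
print(seed_ids)
```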
#### Forward Citations
For each seed paper, retrieve works that cite it, sorted by citation count:
```
https://api.openalex.org/works?filter=cites:SEED_OPENALEX_ID&sort=cited_by_count:desc&per_page=25&select=id,title,authorships,publication_year,cited_by_count,primary_location&api_key=YOUR_KEY
```
Collect the top 10 citing works per seed paper.
#### Backward References
```
# Step 1: retrieve the seed paper's reference list
https://api.openalex.org/works/SEED_OPENALEX_ID?select=id,referenced_works&api_key=YOUR_KEY
# Step 2: batch-fetch metadata for up to 50 references
https://api.openalex.org/works?filter=openalex_id:W111|W222|W333&select=id,title,authorships,publication_year,cited_by_count,primary_location&per_page=50&api_key=YOUR_KEY
```
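Because a seed paper may cite more than 50 works, the reference list usually has to be split into batches before Step 2. A minimal sketch, assuming `referenced_works` holds full OpenAlex URLs such as `https://openalex.org/W123` (the helper names and sample IDs are hypothetical):

```python
SELECT = "id,title,authorships,publication_year,cited_by_count,primary_location"

def reference_batches(referenced_works, batch_size=50):
    """Yield pipe-joined OpenAlex IDs, at most batch_size per batch."""
    ids = [ref.rsplit("/", 1)[-1] for ref in referenced_works]  # keep only the W... part
    for start in range(0, len(ids), batch_size):
        yield "|".join(ids[start:start + batch_size])

def batch_url(id_filter, api_key="YOUR_KEY"):
    """Build the Step 2 batch-fetch URL for one group of IDs."""
    return ("https://api.openalex.org/works"
            f"?filter=openalex_id:{id_filter}"
            f"&select={SELECT}&per_page=50&api_key={api_key}")

# Illustrative reference list with 120 made-up IDs
referenced_works = [f"https://openalex.org/W{n}" for n in range(1, 121)]
batches = list(reference_batches(referenced_works))
batch_urls = [batch_url(b) for b in batches]
print(len(batches), batch_urls[0][:80])
```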
#### Inclusion Threshold
Add a paper to the candidate list if it meets **at least one** of:
- `cited_by_count > 5`, **or**
- Published within the last 3 years
#### Deduplication and Stopping Rule
**Deduplication:** after each seed paper's expansion, remove any paper whose OpenAlex ID already appears in the Round 1 candidate list. Work only with net-new entries.
**Stopping rule:** if across any 3 consecutive seed papers, forward citation expansion yields fewer than 2 new papers meeting the inclusion threshold, stop early — further expansion has diminishing returns.
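The inclusion threshold, deduplication, and stopping rule can be combined into one loop over seed papers. The sketch below is one reading of the rules, with assumptions labeled: "within the last 3 years" is taken as `publication_year >= current_year - 3`, the stopping rule is read as three consecutive seeds each yielding fewer than 2 qualifying new papers, and papers already added from an earlier seed are also treated as duplicates. All names and sample records are hypothetical.

```python
def meets_threshold(paper, current_year, min_citations=5, recent_years=3):
    """Inclusion threshold: enough citations, or recently published."""
    return (paper["cited_by_count"] > min_citations
            or paper["publication_year"] >= current_year - recent_years)

def expand_candidates(seed_results, known_ids, current_year, min_new=2, patience=3):
    """Collect net-new qualifying papers; stop after `patience` low-yield seeds in a row."""
    known = set(known_ids)
    added, low_streak = [], 0
    for seed_id, citing_works in seed_results:
        new = []
        for paper in citing_works:
            if paper["id"] not in known and meets_threshold(paper, current_year):
                known.add(paper["id"])  # also deduplicates across seeds (assumption)
                new.append(paper)
        added.extend(new)
        low_streak = low_streak + 1 if len(new) < min_new else 0
        if low_streak >= patience:
            break  # diminishing returns: stop early
    return added

def paper(pid, cites, year):
    return {"id": pid, "cited_by_count": cites, "publication_year": year}

# Illustrative forward-citation results per seed (all values made up)
seed_results = [
    ("S1", [paper("W1", 900, 1994), paper("W10", 40, 2012), paper("W11", 2, 2024)]),
    ("S2", [paper("W10", 40, 2012), paper("W12", 1, 2010)]),
    ("S3", [paper("W13", 8, 2001)]),
    ("S4", []),
    ("S5", [paper("W14", 100, 2005), paper("W15", 100, 2006)]),
]
added = expand_candidates(seed_results, known_ids={"W1"}, current_year=2025)
print([p["id"] for p in added])
```

In this dry run S2, S3, and S4 each yield fewer than 2 new papers, so S5 is never expanded.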
---
#### Update the Candidate List
Append all qualifying net-new papers to the running list using the same fields as Round 1:
`Author(s) | Year | Title | Source | OpenAlex ID | Citation Count`
---
## Step 4: Triage and Organize Papers
### 4.1 Fetch Abstracts for Triage
For each candidate paper, retrieve the abstract if not already fetched.
**Via OpenAlex (preferred):**
```
https://api.openalex.org/works/W2170494700?select=id,title,abstract_inverted_index,authorships,publication_year,cited_by_count,primary_location&api_key=YOUR_KEY
```
Note: OpenAlex stores abstracts as `abstract_inverted_index` (a word→position map). Reconstruct the abstract by sorting positions and joining words. Example Python reconstruction:
```python
inv = work["abstract_inverted_index"] # {"The": [0], "effect": [1], ...}
abstract = " ".join(w for w, _ in sorted(
((w, p) for w, positions in inv.items() for p in positions),
key=lambda x: x[1]
))
```
**Via Semantic Scholar (fallback):**
```
https://api.semanticscholar.org/graph/v1/paper/[PAPER_ID]?fields=title,abstract,authors,year,citationCount,venue
```
Assign each paper to one category:
- **Core** (directly relevant, must cite and engage)
- **Background** (motivates the topic, cite briefly)
- **Methodological** (uses a technique you adopt)
- **Contradictory** (finds different results — must address)
- **Excluded** (off-topic, weak identification, or low-quality)
After triage, proceed to 4.2 for Core papers and 4.3 for all other retained categories.
---
### 4.2 Full-Text Acquisition for Core Papers
For papers classified as **Core**, go beyond the abstract and acquire the full PDF. Limit this to a maximum of **10 papers** to keep the review tractable.
#### Priority Order for PDF Sources
Work through these sources in order until a PDF is obtained:
**① OpenAlex open-access URL (fastest)**
Retrieve the paper's open-access status from OpenAlex:
```
https://api.openalex.org/works/OPENALEX_ID?select=id,title,open_access,primary_location,locations&api_key=YOUR_KEY
```
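Check `open_access.is_oa`. If `true`, try `open_access.oa_url` first. It may resolve to a landing page rather than a PDF; in that case check the `pdf_url` field of `primary_location` or of the entries in `locations`.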
**② Semantic Scholar open-access PDF**
If OpenAlex has no OA URL, check Semantic Scholar:
```
# With API key (SS_AUTH = True)
GET https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}?fields=openAccessPdf,externalIds
Header: x-api-key: YOUR_SS_KEY
# Without API key (SS_AUTH = False) — WebFetch:
https://api.semanticscholar.org/graph/v1/paper/DOI:{doi}?fields=openAccessPdf,externalIds
```
Use `openAccessPdf.url` if present.
**③ Source-specific direct URLs**
If `externalIds` reveal a known working paper source, use the known URL pattern:
| Source | URL pattern |
|--------|-------------|
| arXiv | `https://arxiv.org/pdf/{arxiv_id}` |
| NBER | `https://www.nber.org/system/files/working_papers/w{NNNN}/w{NNNN}.pdf` |
| SSRN | Fetch abstract page → find download link in HTML |
**④ WebFetch the DOI landing page**
As a last open-access check, fetch the DOI resolver page and look for an OA download link:
```
https://doi.org/{doi}
```
Some publishers and working paper series (AEA, NBER, IZA) serve open-access PDFs directly from the DOI redirect.
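The priority order above can be captured as an ordered list of candidate URLs to try in turn. This is a sketch with hypothetical names: the caller is assumed to have already extracted the OpenAlex OA URL, the Semantic Scholar `openAccessPdf.url`, and any arXiv or NBER identifiers; SSRN is omitted because it requires fetching the abstract page first.

```python
def candidate_pdf_urls(oa_url=None, ss_pdf_url=None, arxiv_id=None, nber_id=None, doi=None):
    """Return (source, url) pairs in the order they should be tried."""
    urls = []
    if oa_url:
        urls.append(("openalex", oa_url))
    if ss_pdf_url:
        urls.append(("semantic_scholar", ss_pdf_url))
    if arxiv_id:
        urls.append(("arxiv", f"https://arxiv.org/pdf/{arxiv_id}"))
    if nber_id:
        number = nber_id.lower().lstrip("w")  # accept "w31234" or "31234"
        urls.append(("nber",
                     f"https://www.nber.org/system/files/working_papers/w{number}/w{number}.pdf"))
    if doi:
        urls.append(("doi_landing_page", f"https://doi.org/{doi}"))
    return urls

# Identifiers reused from the examples in this document
pdf_candidates = candidate_pdf_urls(arxiv_id="2301.05345", nber_id="w31234",
                                    doi="10.2307/2118030")
for source, url in pdf_candidates:
    print(source, url)
```

An empty result means no open-access route was found and the paper should be handled as paywalled.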
#### Downloading and Reading the PDF
Once a PDF URL is identified, download it using Python via Bash and then read it with the `Read` tool:
```python
import os
import requests
pdf_url = "https://arxiv.org/pdf/2301.05345"  # replace with actual URL
paper_slug = "card_krueger_1994"  # short identifier for filename
save_path = f"/sessions/lucid-nifty-goodall/{paper_slug}.pdf"  # adjust the directory to the current session
os.makedirs(os.path.dirname(save_path), exist_ok=True)
r = requests.get(pdf_url, timeout=60, headers={"User-Agent": "Mozilla/5.0"})
# Only save when the server actually returned a PDF, not an HTML landing page
if r.status_code == 200 and "pdf" in r.headers.get("Content-Type", ""):
    with open(save_path, "wb") as f:
        f.write(r.content)
    print(f"Saved: {save_path}")
else:
    print(f"Failed: HTTP {r.status_code}, Content-Type: {r.headers.get('Content-Type')}")
```
Then read with the `Read` tool. For these core papers, read each in sections — Introduction + Conclusion first, then Data, then Empirical Strategy, then Results.
#### When PDF Is Unavailable (Paywalled)
If no open-access version is found after all steps above:
1. Note `[PAYWALLED]` in the candidate list for this paper.
2. Proceed with the abstract-only summary template (Section 4.3B).
3. Flag the paper for the user with a note: *"Full text unavailable — please upload the PDF manually if you have institutional access."*
---
### 4.3 Paper Summary Templates
Use the appropriate template based on paper category and whether full text was obtained.
#### 4.3A Full-Text Summary — Core Papers
Use this template when the full PDF has been read. It is the most detailed version and forms the primary input to Step 5.
```markdown
## [Author(s)] ([Year])
**Title:** [Full title]
**Published in:** [Journal / Working Paper Series, Volume/Number]
**JEL Codes:** [e.g., J31, C21]
**Citation count:** [N]
**OpenAlex ID:** [W...]
**Full text read:** Yes / Abstract only (paywalled)
### Research Question
[One sentence stating the causal question.]
### Data
- **Source:** [Dataset name, e.g., CPS, NLSY, administrative records]
- **Period:** [Years covered]
- **Sample:** [N observations; unit of analysis, e.g., county-month, individual]
- **Geography:** [Country / region / states]
- **Key variables:** [Outcome, treatment, main controls — with exact names if possible]
- **Sample restrictions:** [Any non-obvious exclusions applied]
### Identification Strategy
- **Method:** [DiD / IV / RDD / RCT / Matching / OLS]
- **Source of variation:** [What drives the treatment? e.g., state-level minimum wage changes]
- **Key assumption:** [e.g., parallel trends, exclusion restriction, continuity at threshold]
- **Assumption test(s):** [What did the authors test? What did they find?]
- **First-stage / relevance:** [For IV: F-statistic; for RDD: bandwidth, density test result]
### Main Findings
1. [Key result with coefficient magnitude + units + statistical significance, e.g., "A $1 MW increase reduces teen employment by 0.7 pp (se=0.2), significant at 5%"]
2. [Second key result]
3. [Third key result or primary heterogeneity finding]
### Robustness Checks
| Check | Result | Location |
|-------|--------|----------|
| [e.g., Alternative control group] | [Consistent / Attenuated / Null] | [Table A1] |
| [e.g., Placebo test] | [...] | [...] |
### Limitations
- **External validity:** [Main concern about generalisability]
- **Internal validity:** [Main remaining threat, e.g., parallel trends not perfectly supported]
- **Data:** [Key measurement or coverage limitation]
### Relevance to Your Project
[2–3 sentences: How does this paper shape your identification strategy, data choices, or expected results?]
```
---
#### 4.3B Abstract-Only Summary — non-Core / Paywalled Core
Use this shorter template for non-Core papers or Core papers where full text is unavailable.
```markdown
## [Author(s)] ([Year])
**Title:** [Full title]
**Published in:** [Journal / Working Paper Series]
**Category:** Background / Methodological / Contradictory / Core (paywalled)
**Citation count:** [N] | **OpenAlex ID:** [W...]
**Research Question:** [One sentence]
**Method:** [DiD / IV / RDD / OLS / other]
**Key finding:** [One sentence with magnitude if available]
**Relevance:** [One sentence — why is this paper in the list?]
```
---
## Step 5: Narrative Synthesis
After summarizing the relevant papers, write a synthesis. **Do not just list papers.** Economics papers use narrative paragraphs. Follow this structure:
**Paragraph 1 — Establish the big picture:**
> "A large literature examines the effect of X on Y. Early work using OLS found [result], but identification concerns motivated subsequent quasi-experimental research (Author A, Year; Author B, Year). Taken together, this literature establishes that [broad conclusion]."
**Paragraph 2 — Highlight consensus:**
> "There is now broad agreement that [specific finding]. [Author A (Year)] shows [result] using [method] in [context]. [Author B (Year)] confirms this using [different method/data], finding [comparable result]."
**Paragraph 3 — Highlight disagreements (must engage, not ignore):**
> "[Author C (Year)], however, finds [contradictory result]. This discrepancy may reflect [methodological difference / data difference / context difference]. We return to this issue in Section X."
**Paragraph 4 — Identify gaps that motivate your paper:**
> "Despite this progress, two questions remain unanswered. First, [Gap 1]. Second, [Gap 2]. Our paper contributes by [your approach]."
**Research Gaps Taxonomy:**
| Gap type | Example |
|----------|---------|
| **Geographic** | Existing evidence is US-only; no developing country evidence |
| **Temporal** | Studies focus on short-run effects; long-run unknown |
| **Subgroup** | Effects on high-skill workers unstudied |
| **Mechanism** | What drives the effect? (price, profit, productivity?) |
| **Methodological** | All existing papers use OLS; no credible IV evidence |
| **Data** | All use survey data; no administrative records |
| **Policy counterfactual** | Effects at lower/higher magnitudes unknown |
**Paragraph 5 — Position your contribution:**
> "This paper makes [N] contributions to the literature on [topic]. First, we [contribution 1] — while [Author A (Year)] studies [X], our paper is the first to [Y]. Second, we use [data/method] which allows us to [identify/measure] [Z], addressing [limitation in prior work]. Third, our findings [extend/challenge/reconcile] the evidence from [Author B (Year)] by showing [how/why]."
---
## Step 6: Deliver a Full Reference List
After completing the search and synthesis, produce a full BibTeX reference list for all Core and Background papers cited.
Retrieve structured metadata via OpenAlex:
```
https://api.openalex.org/works/W2170494700?select=id,doi,title,authorships,publication_year,primary_location,biblio&api_key=YOUR_KEY
```
The `doi` field, `primary_location.source.display_name` (journal name), `biblio.volume`, `biblio.issue`, and `biblio.first_page` / `biblio.last_page` provide everything needed for BibTeX.
**BibTeX format:**
```bibtex
@article{CardKrueger1994,
author = {Card, David and Krueger, Alan B.},
title = {Minimum Wages and Employment: A Case Study of the Fast-Food
Industry in New Jersey and Pennsylvania},
journal = {American Economic Review},
year = {1994},
volume = {84},
number = {4},
pages = {772--793},
doi = {10.2307/2118030}
}
```
Use `@unpublished` for working papers not yet published in a journal.
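Turning the OpenAlex metadata into a BibTeX entry is mostly field mapping. The sketch below assumes the record has the fields requested above; the helper names are hypothetical, the "Last, First" conversion naively splits on the final space (so compound surnames need manual correction), and the citation key is built from the first two surnames plus the year.

```python
def last_first(name):
    """Convert 'Alan B. Krueger' to 'Krueger, Alan B.' (naive split on the last space)."""
    parts = name.rsplit(" ", 1)
    return f"{parts[1]}, {parts[0]}" if len(parts) == 2 else name

def to_bibtex(work):
    """Format an OpenAlex work record as a BibTeX @article entry."""
    names = [a["author"]["display_name"] for a in work["authorships"]]
    key = "".join(n.rsplit(" ", 1)[-1] for n in names[:2]) + str(work["publication_year"])
    biblio = work.get("biblio") or {}
    source = (work.get("primary_location") or {}).get("source") or {}
    pages = "--".join(p for p in (biblio.get("first_page"), biblio.get("last_page")) if p)
    fields = {
        "author": " and ".join(last_first(n) for n in names),
        "title": work["title"],
        "journal": source.get("display_name"),
        "year": work["publication_year"],
        "volume": biblio.get("volume"),
        "number": biblio.get("issue"),
        "pages": pages or None,
        "doi": (work.get("doi") or "").replace("https://doi.org/", "") or None,
    }
    body = ",\n".join(f"  {k} = {{{v}}}" for k, v in fields.items() if v)
    return f"@article{{{key},\n{body}\n}}"

# Record shaped like an OpenAlex response, filled with the example entry's values
work = {
    "doi": "https://doi.org/10.2307/2118030",
    "title": "Minimum Wages and Employment: A Case Study of the Fast-Food "
             "Industry in New Jersey and Pennsylvania",
    "publication_year": 1994,
    "authorships": [{"author": {"display_name": "David Card"}},
                    {"author": {"display_name": "Alan B. Krueger"}}],
    "primary_location": {"source": {"display_name": "American Economic Review"}},
    "biblio": {"volume": "84", "issue": "4", "first_page": "772", "last_page": "793"},
}
entry = to_bibtex(work)
print(entry)
```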
---
## Final Output: literature-review-report.md
After finishing all steps above, write the final literature review report and save it as `literature-review-report.md` in the working directory.
**Example format of literature review report:**
```markdown
# Literature Review Report
**Topic:** Effects of Minimum Wage on Employment
**Source:** OpenAlex (primary), Semantic Scholar / NBER / arXiv / SSRN (supplementary)
**Date Range:** 1990-2026
**Papers reviewed:** 52 total → 12 Core, 9 Background, 6 Methodological, 4 Contradictory, 21 Excluded
---
## Core Papers
### Card and Krueger (1994)
**Title:** Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania
**Published in:** *American Economic Review*, 84(4), 772–793
**JEL Codes:** J31, J38
**Citation count:** 8,214 (OpenAlex) | 9,102 (Semantic Scholar)
**OpenAlex ID:** W2133060252
**Full text read:** Yes (NBER PDF)
**Research Question:**
**Data**
**Identification Strategy**
**Main Findings**
**Limitations**
**Relevance to Your Project**
---
## Non-Core Papers
### Neumark and Wascher (2000)
**Title:** Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania: Comment
**Published in:** *American Economic Review*, 90(5), 1362–1396
**Category:** Contradictory
**Citation count:** 1,247 | **OpenAlex ID:** W2048872744
**Full text read:** Yes (user uploaded)
**Research Question:**
**Method:**
**Key finding:**
**Relevance:**
---
## Narrative Synthesis
A large empirical literature examines the effect of minimum wages on employment......
The modern quasi-experimental literature, launched by Card and Krueger (1994), challenged this consensus using a difference-in-differences design......
Despite this progress, three questions remain underexplored......
This paper makes [N] contributions to the literature on [topic]. First...
---
## References
Autor, D. H., Manning, A., & Smith, C. L. (2016). The contribution of the minimum wage to US wage inequality over three decades: A reassessment. *American Economic Journal: Applied Economics*, 8(1), 58–99. https://doi.org/10.1257/app.20140073
```
---
## Common Pitfalls
- ❌ Presenting a search framework instead of actually running searches
- ❌ Stopping after one round of search queries; always complete both rounds (broad sweep, then citation network expansion)
- ❌ Filtering by author name directly (use two-step ID lookup)
- ❌ Only citing papers that support your argument
- ❌ Ignoring contradictory findings
- ❌ Confusing correlation with causation when describing OLS results
- ❌ Citing papers you have not read (mischaracterizing findings)
- ❌ Vague gap identification ("more research is needed"); be specific
ml-causal
# Machine Learning for Causal Inference Skill
This skill covers modern ML-based causal inference methods: Causal Forests (GRF) for heterogeneous treatment effects, Double/Debiased Machine Learning (DML) for partially linear models, and LASSO-based variable selection. These methods combine the flexibility of ML with the rigor of econometric identification.
## When to Use ML Causal Methods
| Goal | Method |
|------|--------|
| Estimate average treatment effect with many controls | Double ML (DML) |
| Discover treatment effect heterogeneity | Causal Forest (GRF) |
| Variable selection for high-dimensional controls | Post-LASSO |
| Best linear predictor of CATE | BLP analysis |
| Subgroup with largest/smallest effects | CLAN analysis |
**Key principle**: ML is used for **nuisance parameter estimation** (predicting Y and D), not for identifying causal effects directly. Identification still requires valid research design (RCT, IV, DID, etc.).
## Double/Debiased Machine Learning (DML)
Chernozhukov, Chetverikov, Demirer, Duflo, Hansen, Newey & Robins (2018)
### Partially Linear Model
```
Y = θ·D + g(X) + ε (structural equation)
D = m(X) + v (treatment equation)
θ = causal parameter of interest
g(X), m(X) = unknown nuisance functions estimated by ML
```
### DML Procedure
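1. **Cross-fitting**: Split the sample into K folds (typically K=5)
2. **Nuisance estimation**: For each fold k, fit the ML learners on the other K−1 folds and predict on fold k, giving out-of-fold estimates ℓ̂(X) of E[Y|X] and m̂(X) of E[D|X]. Under the model above, E[Y|X] = θ·m(X) + g(X), so the outcome learner targets ℓ(X) rather than g(X) itself
3. **Residualize**: Compute Ỹ = Y − ℓ̂(X) and D̃ = D − m̂(X)
4. **Final estimation**: Regress Ỹ on D̃ to obtain θ̂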
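To make the four steps concrete, here is a from-scratch sketch of cross-fitted partialling-out on simulated data. It is illustrative only and not a substitute for the packages below: a linear least-squares fit stands in for the ML nuisance learners (an assumption chosen to keep the example dependency-light), the outcome learner predicts E[Y|X], and the simulated coefficients are arbitrary.

```python
import numpy as np

def ols_fit_predict(X_train, y_train, X_test):
    """Linear least-squares learner standing in for an ML nuisance model (assumption)."""
    A = np.column_stack([np.ones(len(X_train)), X_train])
    beta, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return np.column_stack([np.ones(len(X_test)), X_test]) @ beta

def dml_plr(Y, D, X, n_folds=5, seed=0):
    """Cross-fitted partialling-out estimate of theta with a robust standard error."""
    n = len(Y)
    folds = np.random.default_rng(seed).permutation(n) % n_folds  # step 1: K folds
    Y_res, D_res = np.empty(n), np.empty(n)
    for k in range(n_folds):
        test, train = folds == k, folds != k
        # steps 2-3: fit on the other folds, residualize on the held-out fold
        Y_res[test] = Y[test] - ols_fit_predict(X[train], Y[train], X[test])
        D_res[test] = D[test] - ols_fit_predict(X[train], D[train], X[test])
    theta = (D_res @ Y_res) / (D_res @ D_res)  # step 4: residual-on-residual regression
    eps = Y_res - theta * D_res
    se = np.sqrt(np.mean(D_res**2 * eps**2) / n) / np.mean(D_res**2)
    return theta, se

# Simulated data with a true effect of 0.5 (all coefficients are arbitrary)
rng = np.random.default_rng(42)
n = 2000
X = rng.normal(size=(n, 3))
D = X @ np.array([0.5, -0.3, 0.2]) + rng.normal(size=n)
Y = 0.5 * D + X @ np.array([1.0, 0.5, -0.5]) + rng.normal(size=n)
theta_hat, se_hat = dml_plr(Y, D, X)
print(f"theta_hat = {theta_hat:.3f} (SE = {se_hat:.3f})")
```

Swapping `ols_fit_predict` for a random forest or boosted model changes only the nuisance step; the cross-fitting and final regression stay the same.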
### R — DoubleML
```r
# R — DoubleML package
library(DoubleML)
library(mlr3)
library(mlr3learners)
# Define data
dml_data <- DoubleMLData$new(
data = df,
y_col = "outcome",
d_cols = "treatment",
x_cols = c("x1", "x2", "x3", "x4", "x5")
)
# Choose ML methods for nuisance estimation
ml_g <- lrn("regr.ranger", num.trees = 500) # for E[Y|X]
ml_m <- lrn("classif.ranger", num.trees = 500) # for E[D|X]
# Fit DML (partially linear model)
dml_plr <- DoubleMLPLR$new(dml_data, ml_g, ml_m,
n_folds = 5, n_rep = 10)
dml_plr$fit()
print(dml_plr)
# Reports: coefficient, SE, t-stat, p-value, CI
# Interactive regression model (binary treatment; ATE allowing heterogeneous effects)
dml_irm <- DoubleMLIRM$new(dml_data, ml_g, ml_m,
n_folds = 5, score = "ATE")
dml_irm$fit()
print(dml_irm)
```
### Python — DoubleML
```python
# Python — DoubleML
import doubleml as dml
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
# Define data
dml_data = dml.DoubleMLData(
df, y_col='outcome', d_cols='treatment',
x_cols=['x1', 'x2', 'x3', 'x4', 'x5']
)
# Nuisance learners
ml_g = RandomForestRegressor(n_estimators=500, max_depth=5)
ml_m = RandomForestClassifier(n_estimators=500, max_depth=5)
# Partially Linear Regression
dml_plr = dml.DoubleMLPLR(dml_data, ml_g, ml_m,
n_folds=5, n_rep=10)
dml_plr.fit()
print(dml_plr.summary)
# Confidence interval
print(dml_plr.confint())
```
### Stata — ddml
```stata
* Stata — ddml (Ahrens et al. 2024)
ssc install ddml
ssc install pystacked
* Partially linear model with cross-fitting
ddml init partial, kfolds(5) reps(10)
ddml E[Y|X]: pystacked outcome x1 x2 x3 x4 x5, ///
    type(reg) methods(rf gradboost lassocv)
ddml E[D|X]: pystacked treatment x1 x2 x3 x4 x5, ///
    type(class) methods(rf gradboost lassocv)
ddml crossfit
ddml estimate, robust
```
## Causal Forest (Generalized Random Forest)
Athey, Tibshirani & Wager (2019)
Estimates Conditional Average Treatment Effects (CATE): τ(x) = E[Y(1) − Y(0) | X = x]
### R — grf
```r
# R — Generalized Random Forest
library(grf)
# Prepare data
X <- as.matrix(df[, c("x1", "x2", "x3", "x4", "x5")])
Y <- df$outcome
W <- df$treatment
# Fit causal forest
cf <- causal_forest(X, Y, W,
num.trees = 4000,
honesty = TRUE, # honest estimation
tune.parameters = "all") # auto-tune
# Average treatment effect (ATE)
ate <- average_treatment_effect(cf, target.sample = "all")
cat("ATE:", ate["estimate"], "SE:", ate["std.err"], "\n")
# ATT
att <- average_treatment_effect(cf, target.sample = "treated")
cat("ATT:", att["estimate"], "SE:", att["std.err"], "\n")
# Individual-level CATE predictions
cate <- predict(cf, estimate.variance = TRUE)
df$cate_hat <- cate$predictions
df$cate_se <- sqrt(cate$variance.estimates)
# Variable importance
varimp <- variable_importance(cf)
names(varimp) <- colnames(X)
sort(varimp, decreasing = TRUE)
```
### Python — econml / grf
```python
# Python — EconML (Microsoft)
from econml.dml import CausalForestDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
X = df[['x1', 'x2', 'x3', 'x4', 'x5']].values
# Fit causal forest via DML
cf = CausalForestDML(
    model_y=RandomForestRegressor(n_estimators=500),
    model_t=RandomForestClassifier(n_estimators=500),
    discrete_treatment=True,  # needed when model_t is a classifier (binary treatment)
    n_estimators=4000,
    cv=5,
    random_state=42
)
cf.fit(Y=df['outcome'].values,
       T=df['treatment'].values,
       X=X)
# ATE (average of the estimated CATE over the rows of X)
ate = cf.ate_inference(X=X)
print(f"ATE: {ate.mean_point:.4f} (SE: {ate.stderr_mean:.4f})")
# CATE predictions
cate = cf.effect(X)
df['cate_hat'] = cate
```
## BLP Analysis (Best Linear Predictor)
Tests whether CATE varies with observables. From Chernozhukov, Demirer, Duflo & Fernandez-Val (2020).
```r
# R — BLP of CATE
library(grf)
# After fitting causal_forest cf:
blp <- best_linear_projection(cf, A = X)
print(blp)
# Interpretation:
# - Intercept: average effect
# - Coefficients: how CATE varies with each covariate
# - If all coefficients ≈ 0 → homogeneous treatment effect
```
## CLAN Analysis (Classification Analysis)
Identifies subgroups with highest/lowest treatment effects.
```r
# Sorted Group Average Treatment Effects (GATES)
# Split sample by predicted CATE quartiles
df$cate_quartile <- cut(df$cate_hat,
breaks = quantile(df$cate_hat, c(0, 0.25, 0.5, 0.75, 1)),
labels = c("Q1 (lowest)", "Q2", "Q3", "Q4 (highest)"),
include.lowest = TRUE)
# GATES regression
library(fixest)
library(dplyr)  # provides %>%, group_by(), summarise(), across() used below
gates <- feols(outcome ~ i(cate_quartile, treatment),
data = df, vcov = "HC1")
summary(gates)
# Significant difference between Q4 and Q1 → heterogeneity exists
# CLAN: compare characteristics across CATE quartiles
clan_table <- df %>%
group_by(cate_quartile) %>%
summarise(across(c(x1, x2, x3, age, income), mean))
print(clan_table)
```
## AIPW / Augmented IPW (Doubly Robust Estimator)
Combines outcome regression and propensity score weighting. **Doubly robust**: consistent if *either* the outcome model or the propensity score model is correctly specified (but not necessarily both). Particularly natural for binary treatment and binary/continuous outcomes.
**Estimator**:
```
τ_AIPW = E[ μ₁(X) − μ₀(X) + D(Y − μ₁(X))/e(X) − (1−D)(Y − μ₀(X))/(1−e(X)) ]
```
where μ_d(X) = E[Y|D=d, X] and e(X) = P(D=1|X) are estimated by ML.
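The estimator is just the sample mean of a per-observation score, which the following standard-library sketch computes from already-estimated nuisance predictions. The function name and the toy inputs are hypothetical; in practice `mu1`, `mu0`, and `e` would come from cross-fitted ML models, and the standard error shown is the usual sample standard deviation of the scores divided by √n.

```python
from statistics import fmean

def aipw_ate(y, d, mu1, mu0, e):
    """AIPW point estimate and standard error from per-observation nuisance predictions."""
    scores = [
        m1 - m0 + di * (yi - m1) / ei - (1 - di) * (yi - m0) / (1 - ei)
        for yi, di, m1, m0, ei in zip(y, d, mu1, mu0, e)
    ]
    n = len(scores)
    ate = fmean(scores)
    variance = sum((s - ate) ** 2 for s in scores) / (n - 1)
    return ate, (variance / n) ** 0.5

# Toy inputs (made up): four units, constant nuisance predictions
y = [3, 1, 4, 2]
d = [1, 0, 1, 0]
mu1 = [3.0, 3.0, 3.0, 3.0]   # predicted outcome under treatment
mu0 = [1.0, 1.0, 1.0, 1.0]   # predicted outcome under control
e = [0.5, 0.5, 0.5, 0.5]     # propensity scores
ate, se = aipw_ate(y, d, mu1, mu0, e)
print(f"AIPW ATE: {ate:.4f} (SE: {se:.4f})")
```

The per-unit scores here are 2, 2, 4, and 0: the outcome-model difference μ₁ − μ₀ = 2, corrected by the inverse-probability-weighted residual for whichever arm was observed.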
**Difference from DML**: AIPW is more natural for binary treatment/outcome; DML is better suited for continuous treatment or partially linear structural models. Both use cross-fitting.
```python
# Python — AIPW / DR Learner (EconML)
from econml.dr import DRLearner
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.linear_model import LogisticRegressionCV
import numpy as np
# Data preparation
Y = df['outcome'].values
T = df['treatment'].values # binary: 0/1
X = df[['x1', 'x2', 'x3']].values
# DRLearner implements AIPW with cross-fitting for CATE estimation
dr_learner = DRLearner(
model_propensity=LogisticRegressionCV(cv=5), # propensity score e(X)
model_regression=RandomForestRegressor(n_estimators=200), # outcome μ_d(X)
model_final=RandomForestRegressor(n_estimators=200), # CATE model
cv=5,
random_state=42
)
dr_learner.fit(Y, T, X=X)
# ATE via AIPW
ate = dr_learner.ate_inference(X=X)
print(f"AIPW ATE: {ate.mean_point:.4f} (SE: {ate.stderr_mean:.4f})")
print(f"95% CI: {ate.conf_int_mean()}")
# CATE predictions
cate = dr_learner.effect(X)
```
```r
# R — AIPW using AIPW package
# install.packages("AIPW")
library(AIPW)
library(SuperLearner)
aipw_obj <- AIPW$new(
Y = df$outcome,
A = df$treatment,
W = df[, c("x1", "x2", "x3")],
Q.SL.library = c("SL.ranger", "SL.glm"), # outcome model
g.SL.library = c("SL.ranger", "SL.glm"), # propensity model
k_split = 5, # cross-fitting folds
verbose = FALSE
)
aipw_obj$fit()
aipw_obj$summary()
# Reports: ATE, RR, OR with 95% CI
```
## Meta-Learners for CATE Estimation
Meta-learners are general frameworks for estimating CATE that wrap any base ML model. They differ in how they use the treatment variable.
| Learner | Approach | Best When |
|---------|----------|-----------|
| **S-Learner** | Single model: fit μ(X, D), then CATE = μ(X,1) − μ(X,0) | Simple baseline; may shrink CATE to zero if D is weak signal |
| **T-Learner** | Two separate models: μ₁(X) and μ₀(X) | Unequal sample sizes; less regularization shrinkage on treatment |
| **X-Learner** | Imputes counterfactuals, then fits CATE on imputed residuals | Unbalanced treatment (very few treated or controls) |
| **R-Learner** | Residualizes Y and D, then fits CATE on residuals | High confounding; closely related to DML |
```python
# Python — Meta-Learners via EconML
from econml.metalearners import TLearner, SLearner, XLearner
from econml.dml import LinearDML
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
import numpy as np
Y = df['outcome'].values
T = df['treatment'].values
X = df[['x1', 'x2', 'x3']].values
# --- T-Learner ---
t_learner = TLearner(models=RandomForestRegressor(n_estimators=500))
t_learner.fit(Y, T, X=X)
cate_t = t_learner.effect(X)
print(f"T-Learner ATE: {np.mean(cate_t):.4f}")
# --- S-Learner ---
s_learner = SLearner(overall_model=RandomForestRegressor(n_estimators=500))
s_learner.fit(Y, T, X=X)
cate_s = s_learner.effect(X)
# --- X-Learner ---
x_learner = XLearner(
models=RandomForestRegressor(n_estimators=500),
propensity_model=RandomForestClassifier(n_estimators=500)
)
x_learner.fit(Y, T, X=X)
cate_x = x_learner.effect(X)
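# --- R-Learner (approximated by LinearDML: residualize Y and T, linear final stage) ---
# For a non-parametric final stage, econml.dml.NonParamDML is the closer analogue.
r_learner = LinearDML(
    model_y=RandomForestRegressor(n_estimators=500),
    model_t=RandomForestClassifier(n_estimators=500),
    discrete_treatment=True,  # binary treatment, so model_t is fit as a classifier
    cv=5, random_state=42
)
r_learner.fit(Y, T, X=X, W=None)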
cate_r = r_learner.effect(X)
# Compare ATE across meta-learners
print(f"S: {np.mean(cate_s):.4f} | T: {np.mean(cate_t):.4f} | "
f"X: {np.mean(cate_x):.4f} | R: {np.mean(cate_r):.4f}")
```
```r
# R — X-Learner using grf building blocks
library(grf)
# Step 1: T-Learner stage
X_mat <- as.matrix(df[, c("x1", "x2", "x3")])
Y <- df$outcome; W <- df$treatment
rf1 <- regression_forest(X_mat[W==1, ], Y[W==1]) # treated
rf0 <- regression_forest(X_mat[W==0, ], Y[W==0]) # control
# Step 2: Impute counterfactuals
mu1 <- predict(rf1, X_mat)$predictions
mu0 <- predict(rf0, X_mat)$predictions
# Step 3: X-Learner imputed effects
D1 <- Y[W==1] - predict(rf0, X_mat[W==1,])$predictions # treated: Y(1) - mu0
D0 <- predict(rf1, X_mat[W==0,])$predictions - Y[W==0] # control: mu1 - Y(0)
# Step 4: Fit CATE models on imputed effects
tau1 <- regression_forest(X_mat[W==1,], D1)
tau0 <- regression_forest(X_mat[W==0,], D0)
# Step 5: Combine using propensity score
e_hat <- regression_forest(X_mat, W)$predictions # propensity
cate_x <- e_hat * predict(tau0, X_mat)$predictions +
(1 - e_hat) * predict(tau1, X_mat)$predictions
cat("X-Learner ATE:", mean(cate_x), "\n")
```
## LASSO for Variable Selection
### Post-LASSO (Belloni, Chernozhukov & Hansen 2014)
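Post-double-selection: use LASSO to select the controls that predict the outcome and, separately, the controls that predict the treatment; then run OLS of the outcome on the treatment and the union of the selected controls. Selecting on the outcome equation alone can drop confounders that strongly predict treatment.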
```r
# R — Post-LASSO
library(hdm)
# Post-double-selection LASSO for ATE
pds <- rlassoEffect(x = X, y = Y, d = W, method = "double selection")
summary(pds)
# Reports: coefficient, SE, t-stat, CI
```
```python
# Python — Post-LASSO
import numpy as np
from sklearn.linear_model import LassoCV
import statsmodels.api as sm
# Assumes NumPy arrays: Y (outcome), W (treatment), X (candidate controls)
# Step 1: LASSO on Y ~ X to select controls
lasso_y = LassoCV(cv=5).fit(X, Y)
selected_y = np.where(lasso_y.coef_ != 0)[0]
# Step 2: LASSO on D ~ X to select controls
lasso_d = LassoCV(cv=5).fit(X, W)
selected_d = np.where(lasso_d.coef_ != 0)[0]
# Step 3: Union of selected variables
selected = np.union1d(selected_y, selected_d)
# Step 4: OLS with selected controls
X_selected = sm.add_constant(np.column_stack([W, X[:, selected]]))
ols_result = sm.OLS(Y, X_selected).fit(cov_type='HC1')
print(f"Post-LASSO ATE: {ols_result.params[1]:.4f} (SE: {ols_result.bse[1]:.4f})")
```
```stata
* Stata — Post-double-selection LASSO
ssc install lassopack
ssc install pdslasso
* pdslasso: post-double-selection (depends on lassopack)
pdslasso outcome treatment (x1-x50), robust
```
## Diagnostics and Validation
### Calibration Test for Causal Forest
```r
# Test forest calibration: does the forest detect heterogeneity?
calibration <- test_calibration(cf)
print(calibration)
# Row 1 (mean.forest.prediction): coefficient near 1 → the average prediction is well calibrated
# Row 2 (differential.forest.prediction): coefficient near 1 and significantly > 0 → heterogeneity is detected and well calibrated
```
### Cross-Validated Performance
```r
# Out-of-bag predictions (built into grf)
oob_predictions <- predict(cf)$predictions # uses OOB by default
cor(oob_predictions, df$true_cate) # if true CATE known (simulation)
```
## Reporting Standards
1. **Method description**: State the ML method used for nuisance estimation (RF, LASSO, boosting)
2. **Cross-fitting**: Report number of folds (K) and repetitions
3. **ATE with CI**: Report point estimate, SE, 95% CI
4. **Heterogeneity evidence**: BLP table, GATES plot, variable importance
5. **Robustness**: Compare DML with different ML methods; compare GRF with traditional subgroup analysis
**Key sentence template (DML)**:
> "We estimate the treatment effect using Double Machine Learning (Chernozhukov et al. 2018) with random forests for nuisance estimation, 5-fold cross-fitting, and 10 repetitions. The estimated ATE is [β] (SE = [se], 95% CI: [lb, ub])."
**Key sentence template (Causal Forest)**:
> "We estimate heterogeneous treatment effects using a causal forest (Athey et al. 2019) with [N] trees and honest splitting. The calibration test confirms significant heterogeneity (p = [p]). Units in the top quartile of predicted CATE have an estimated effect of [β_Q4] compared to [β_Q1] in the bottom quartile."
## Common Pitfalls
- **Using ML for identification**: ML estimates nuisance parameters, not causal effects. You still need exogenous variation (RCT, IV, etc.)
- **Overfitting CATE**: Always use honest estimation (separate splitting and estimation samples)
- **Interpreting variable importance causally**: Variable importance in GRF shows predictive power for heterogeneity, not causal mediation
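- **Ignoring cross-fitting**: Fitting the nuisance models and the final stage on the same observations lets overfitting bias leak into the DML estimate; always use K-fold cross-fitting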
See `references/ml-causal-reference.md` for IV-based causal forests, DML with IV, and simulation studies.
ols-regression
# OLS Regression Skill
This skill provides comprehensive guidance for OLS regression and linear models in empirical research. It covers model specification, assumption testing, diagnostic checks, and result interpretation, with code examples in Python, R, and Stata.
## Core Workflow
When assisting with OLS regression, follow this sequence:
1. **Clarify the research question and data** — understand dependent variable, key regressors, and sample
2. **Specify the model** — choose functional form, control variables, fixed effects if needed
3. **Run the regression** — provide code in the user's preferred language
4. **Check assumptions** — run diagnostics systematically (see references)
5. **Interpret and report** — explain coefficients, significance, fit, and caveats
## Key Concepts
### Model Specification
- Write the regression equation explicitly: Y = β₀ + β₁X₁ + ... + βₖXₖ + ε
- Consider log transformations for skewed variables or elasticity interpretation
- Include relevant controls to reduce omitted variable bias
- Watch for irrelevant variables inflating standard errors
### The Gauss-Markov Assumptions
1. Linearity in parameters
2. Random sampling
3. No perfect multicollinearity
4. Zero conditional mean of errors: E(ε|X) = 0
5. Homoskedasticity: Var(ε|X) = σ²
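6. (Not part of Gauss-Markov; needed only for exact finite-sample inference) Normally distributed errors
Violation of assumption 5 (heteroskedasticity) does not bias OLS but invalidates the default standard errors; use robust SE. Violation of assumption 4 (endogeneity) biases the estimates themselves; recommend IV methods or another identification strategy.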
### Standard Error Options
- **Default OLS SE**: valid only under homoskedasticity
- **HC robust SE (White)**: use when heteroskedasticity is suspected; a safe default for independent cross-sectional observations in moderately large samples (prefer HC3 in small samples)
- **Clustered SE**: use when observations are grouped (e.g., by firm, region, year)
- **Newey-West SE**: use for time series with autocorrelation
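To make the gap between the default and heteroskedasticity-robust options concrete, here is a minimal NumPy sketch that computes both by hand. The simulated data, the seed, and the variance function are illustrative assumptions, not taken from any real dataset.

```python
import numpy as np

rng = np.random.default_rng(0)             # illustrative seed
n = 2000
x = rng.uniform(-2, 2, size=n)
# Error s.d. grows with x^2, so homoskedasticity fails by construction
eps = rng.normal(size=n) * (0.5 + x ** 2)
y = 1.0 + 2.0 * x + eps                    # true slope = 2

X = np.column_stack([np.ones(n), x])       # design matrix with intercept
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y
resid = y - X @ beta

# Default OLS SE: sigma^2 (X'X)^-1, valid only under homoskedasticity
sigma2 = resid @ resid / (n - k)
se_classical = np.sqrt(np.diag(sigma2 * XtX_inv))

# HC1 SE: (X'X)^-1 [sum_i e_i^2 x_i x_i'] (X'X)^-1, scaled by n / (n - k)
meat = (X * resid[:, None] ** 2).T @ X
V_hc1 = XtX_inv @ meat @ XtX_inv * n / (n - k)
se_hc1 = np.sqrt(np.diag(V_hc1))

print("beta:        ", beta.round(3))
print("classical SE:", se_classical.round(4))
print("HC1 SE:      ", se_hc1.round(4))
```

In this design the slope estimate stays close to its true value, while the default SE understates the sampling variability that the HC1 formula picks up.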
## Quick Code Templates
### Python (statsmodels)
```python
import statsmodels.api as sm
import statsmodels.formula.api as smf
# With robust standard errors
model = smf.ols('y ~ x1 + x2 + x3', data=df).fit(cov_type='HC3')
print(model.summary())
```
### R
```r
library(lmtest)
library(sandwich)
model <- lm(y ~ x1 + x2 + x3, data = df)
coeftest(model, vcov = vcovHC(model, type = "HC3"))
```
### Stata
```stata
* The robust option reports HC1; vce(hc3) matches the HC3 choice in the Python and R templates
reg y x1 x2 x3, vce(hc3)
```
## Diagnostics Checklist
Run all diagnostics after fitting. See `references/ols-reference.md` for full test details.
| Issue | Test | Quick Fix |
|-------|------|-----------|
| Heteroskedasticity | Breusch-Pagan, White test | Robust SE |
| Autocorrelation | Durbin-Watson, Breusch-Godfrey | Newey-West SE |
| Multicollinearity | VIF > 10 | Drop/combine variables |
| Non-normality of errors | Jarque-Bera | Check outliers; large N mitigates |
| Functional-form misspecification (omitted nonlinear terms) | Ramsey RESET | Respecify model (add polynomial or interaction terms) |
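As one worked instance of the checklist, the VIF rule can be sketched in plain NumPy. The helper name `vif` and the simulated regressors are assumptions made for illustration; statsmodels offers an equivalent built-in if you prefer a library call.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (X excludes the intercept)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = np.empty(k)
    for j in range(k):
        target = X[:, j]
        # Regress column j on an intercept plus all other columns
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        out[j] = 1.0 / (1.0 - r2)
    return out

rng = np.random.default_rng(1)                 # illustrative seed
x1 = rng.normal(size=500)
x2 = x1 + rng.normal(scale=0.1, size=500)      # nearly collinear with x1
x3 = rng.normal(size=500)                      # unrelated regressor
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs.round(1))
```

Columns flagged by the VIF > 10 rule are candidates for dropping or combining, as the table suggests.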
## Reporting Standards (Academic)
- Report coefficients with standard errors in parentheses (or t-stats)
- Use asterisks for significance: * p<0.10, ** p<0.05, *** p<0.01
- Always state which standard errors are used (robust, clustered, etc.)
- Report R², adjusted R², N, and F-statistic
- Describe the identification strategy and potential endogeneity concerns
For detailed test formulas, code, and extended examples, see `references/ols-reference.md`.
## Common Pitfalls
- **Claiming causality without identification**: OLS with controls does not establish causality — use IV, DID, or RDD for causal claims
- **Using default SE with clustered data**: Always cluster SE at the group level when observations are grouped
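- **Including "bad controls"**: Don't control for post-treatment variables. Conditioning on a mediator removes part of the effect you want to estimate (overcontrol bias), and conditioning on a common outcome of the treatment and other causes induces collider bias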
- **Log-transforming variables with zeros**: ln(0) is undefined; use asinh(x) or ln(x+1) with appropriate interpretation
- **Reporting R² as evidence of a good model**: High R² does not mean the model is correctly specified or causal
panel-data
# Panel Data Models Skill
This skill covers panel data econometrics: pooled OLS, fixed effects (FE), random effects (RE), and two-way FE models. It guides model selection, assumption testing, and interpretation for longitudinal/panel datasets.
## Key Terminology
- **Panel dataset**: observations on N units (individuals, firms, countries) over T time periods
- **Balanced panel**: every unit observed in every period
- **Unbalanced panel**: some unit-period observations missing
- **Unobserved heterogeneity (αᵢ)**: time-invariant unit-specific factors (e.g., firm culture, individual ability)
## Model Selection Framework
```
Start
├─ Is unobserved heterogeneity correlated with regressors?
│ ├─ YES → Fixed Effects (FE)
│ └─ NO → Random Effects (RE) — test with Hausman
│
├─ Are time effects important?
│ ├─ YES → Two-Way FE (entity + time dummies)
│ └─ NO → One-Way FE
│
└─ Need to estimate effect of time-invariant variables?
├─ YES → Random Effects or Mundlak/Correlated RE
└─ NO → Fixed Effects preferred
```
### Hausman Test Decision Rule
- H₀: RE is consistent (αᵢ uncorrelated with X)
- H₁: FE is consistent but RE is not (αᵢ correlated with X)
- **p < 0.05**: Use Fixed Effects
- **p ≥ 0.05**: Fail to reject H₀; RE is consistent and more efficient than FE
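The decision rule can be sketched as a small function that takes the FE and RE estimates of the time-varying coefficients and their covariance matrices. The function name `hausman` and all numbers below are hypothetical; the classical test assumes conventional (non-robust) covariance matrices, and a pseudo-inverse is used because the covariance difference need not be positive definite in finite samples.

```python
import numpy as np
from scipy import stats

def hausman(b_fe, b_re, V_fe, V_re):
    """Hausman statistic for the coefficients common to FE and RE.

    b_fe, b_re: coefficient vectors in the same order (no constant).
    V_fe, V_re: the matching covariance matrices.
    """
    diff = np.asarray(b_fe, float) - np.asarray(b_re, float)
    V_diff = np.asarray(V_fe, float) - np.asarray(V_re, float)
    stat = float(diff @ np.linalg.pinv(V_diff) @ diff)
    dof = diff.size
    p_value = float(stats.chi2.sf(stat, dof))
    return stat, dof, p_value

# Hypothetical estimates, for illustration only
b_fe = [0.50, -0.20]
b_re = [0.35, -0.15]
V_fe = [[0.0040, 0.0], [0.0, 0.0025]]
V_re = [[0.0020, 0.0], [0.0, 0.0015]]

stat, dof, p = hausman(b_fe, b_re, V_fe, V_re)
print(f"H = {stat:.2f}, df = {dof}, p = {p:.4f}")
print("Use Fixed Effects" if p < 0.05 else "Random Effects not rejected")
```

With fitted models, the inputs would be the coefficient vectors and covariance matrices of the two results, restricted to the regressors that appear in both.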
### Mundlak / Correlated Random Effects (CRE)
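Use when you need to estimate the effects of time-invariant variables but cannot defend the RE assumption that αᵢ is uncorrelated with the regressors. CRE adds the unit-level means of the time-varying regressors to the RE equation. The coefficients on the time-varying regressors then match the FE (within) estimates, while the time-invariant variables remain identified, although their coefficients are only as credible as the assumption that they are uncorrelated with the remaining unit effect.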
```r
# R — Mundlak CRE approach
library(plm); library(dplyr)
# Compute entity means of time-varying regressors
panel_df_cre <- df %>%
group_by(entity_id) %>%
mutate(x1_mean = mean(x1),
x2_mean = mean(x2)) %>%
ungroup()
panel_cre <- pdata.frame(panel_df_cre, index = c("entity_id", "time_var"))
# RE model augmented with group means (Mundlak approach):
cre_model <- plm(y ~ x1 + x2 + time_invariant_var + x1_mean + x2_mean,
data = panel_cre,
model = "random")
summary(cre_model)
# Coefficients on x1_mean, x2_mean test the correlation between αᵢ and X
# (equivalent to Hausman test; joint significance = prefer FE)
# Coefficient on time_invariant_var is identified via between variation
```
```stata
* Stata — Mundlak CRE
* Step 1: compute entity means
bysort entity_id: egen x1_mean = mean(x1)
bysort entity_id: egen x2_mean = mean(x2)
* Step 2: RE model with group means added
xtreg y x1 x2 time_invariant_var x1_mean x2_mean, re
* Test joint significance of group means (Mundlak test):
testparm x1_mean x2_mean
* p < 0.05 → group means matter → prefer FE for time-varying regressors
```
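A NumPy sketch of the same idea on simulated data follows. It uses pooled OLS with unit means rather than RE GLS, which is an assumption made to keep the example dependency-free; in that version the coefficient on `x` reproduces the within estimator exactly. All names and parameter values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)                         # illustrative seed
n_units, n_periods = 200, 5
ids = np.repeat(np.arange(n_units), n_periods)

alpha = rng.normal(size=n_units)                       # unobserved unit effect
x = 0.8 * alpha[ids] + rng.normal(size=ids.size)       # x is correlated with alpha
y = 1.5 * x + alpha[ids] + rng.normal(size=ids.size)   # true slope = 1.5

def group_mean(v, ids):
    """Unit-level mean of v, broadcast back to observation level."""
    return (np.bincount(ids, weights=v) / np.bincount(ids))[ids]

x_bar, y_bar = group_mean(x, ids), group_mean(y, ids)

# Within (FE) estimator: demean by unit, then OLS
xd, yd = x - x_bar, y - y_bar
beta_fe = (xd @ yd) / (xd @ xd)

# Mundlak regression: y on [1, x, x_bar]
Z = np.column_stack([np.ones_like(x), x, x_bar])
coef = np.linalg.lstsq(Z, y, rcond=None)[0]
beta_cre, gamma = coef[1], coef[2]

# Naive pooled OLS that ignores alpha, for comparison
P = np.column_stack([np.ones_like(x), x])
beta_pooled = np.linalg.lstsq(P, y, rcond=None)[0][1]

print(f"FE: {beta_fe:.3f} | CRE: {beta_cre:.3f} | pooled: {beta_pooled:.3f}")
print(f"coefficient on x_bar: {gamma:.3f}")
```

A coefficient on `x_bar` that is clearly different from zero is the sketch's analogue of the Mundlak test rejecting: the unit effect is correlated with `x`, and pooled or RE estimates without the means would be biased.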
## Quick Code Templates
### Fixed Effects
```python
# Python (linearmodels)
from linearmodels.panel import PanelOLS
import pandas as pd
# Set multi-index: entity and time
df = df.set_index(['entity_id', 'time_var'])
model = PanelOLS(df['y'], df[['x1', 'x2']], entity_effects=True,
time_effects=True) # Two-way FE
result = model.fit(cov_type='clustered', cluster_entity=True)
print(result.summary)
```
```r
# R (plm)
library(plm)
panel_df <- pdata.frame(df, index = c("entity_id", "time_var"))
# One-way FE
fe_model <- plm(y ~ x1 + x2, data = panel_df, model = "within")
# Two-way FE
twfe_model <- plm(y ~ x1 + x2, data = panel_df, model = "within",
effect = "twoways")
# Clustered SE
library(lmtest); library(sandwich)
coeftest(fe_model, vcov = vcovHC(fe_model, cluster = "group"))
```
```stata
* Stata — Two-way FE with clustered SE
xtset entity_id time_var
xtreg y x1 x2 i.time_var, fe cluster(entity_id)
```
### Random Effects
```python
from linearmodels.panel import RandomEffects
re_model = RandomEffects(df['y'], df[['x1', 'x2']])
re_result = re_model.fit()
print(re_result.summary)
```
```r
re_model <- plm(y ~ x1 + x2, data = panel_df, model = "random")
summary(re_model)
```
```stata
xtreg y x1 x2, re
```
### Hausman Test
```python
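from linearmodels.panel import compare
# fe_result: the fitted PanelOLS model (entity effects); re_result: the fitted RandomEffects model
# compare() tabulates the two sets of estimates side by side; it does not run a Hausman test
print(compare({'FE': fe_result, 'RE': re_result}))
# For a formal test, compute the Hausman statistic from the coefficient and covariance
# differences, or run the Mundlak (CRE) test of the unit means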
```
```r
phtest(fe_model, re_model)
# p < 0.05 → prefer Fixed Effects
```
```stata
hausman fe_estimates re_estimates
```
## Dynamic Panels and Arellano-Bond GMM
Use when the model includes a lagged dependent variable (Yᵢₜ₋₁) in short-T panels. The within (FE) estimator is biased in this case (Nickell 1981 bias). Arellano-Bond uses lagged levels as instruments for the differenced equation.
**When to use**: Short T (T < 10), panel includes lagged DV, suspicion of endogenous regressors.
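The size of the Nickell bias in a short panel can be seen in a small simulation. This is a sketch under assumed values (500 units, 6 periods, true persistence 0.5, illustrative seed), not a substitute for the GMM estimators shown in this section.

```python
import numpy as np

rng = np.random.default_rng(3)             # illustrative seed
n_units, n_periods, rho = 500, 6, 0.5      # short T, true persistence 0.5
burn = 20                                  # burn-in so the series is near stationary

alpha = rng.normal(size=n_units)           # unit effects
y = np.zeros((n_units, n_periods + burn))
for t in range(1, n_periods + burn):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(size=n_units)
y = y[:, burn:]                            # keep the last n_periods columns

y_lag, y_cur = y[:, :-1], y[:, 1:]

# Within estimator: demean each unit's series, then OLS of y_t on y_{t-1}
yl = y_lag - y_lag.mean(axis=1, keepdims=True)
yc = y_cur - y_cur.mean(axis=1, keepdims=True)
rho_fe = (yl * yc).sum() / (yl ** 2).sum()

# Pooled OLS that ignores the unit effects, for comparison
yl_p = y_lag - y_lag.mean()
yc_p = y_cur - y_cur.mean()
rho_pooled = (yl_p * yc_p).sum() / (yl_p ** 2).sum()

print(f"true rho: {rho:.2f} | FE: {rho_fe:.3f} | pooled OLS: {rho_pooled:.3f}")
```

The within estimate falls well below the true value while pooled OLS overshoots it, which is why the two are often used as informal lower and upper bounds for a credible GMM estimate.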
```r
# R — Arellano-Bond GMM (plm package)
library(plm)
# Difference GMM (Arellano-Bond 1991)
ab <- pgmm(
y ~ lag(y, 1) + x1 + x2 | lag(y, 2:4), # instruments: lags 2-4 of y
data = panel_df,
effect = "individual",
model = "twosteps" # two-step is asymptotically efficient
)
summary(ab, robust = TRUE)
# Key diagnostics:
# AR(1): should be significant (differencing induces MA(1))
# AR(2): should be insignificant (no serial correlation in levels)
# Sargan test of overidentifying restrictions (as reported by pgmm): p > 0.05 → no evidence against instrument validity
```
```python
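# Python: Arellano-Bond GMM (pydynpd)
# Note: linearmodels does not implement Arellano-Bond. One option is the
# third-party pydynpd package (pip install pydynpd); check its documentation
# for the exact command syntax supported by your installed version.
from pydynpd import regression
# entity_id and time_var must be ordinary columns (call df.reset_index() if they are the index)
# Command layout: model | instruments | options; "nolevel" requests difference GMM
command_str = "y L1.y x1 x2 | gmm(y, 2:4) iv(x1 x2) | nolevel"
results = regression.abond(command_str, df, ["entity_id", "time_var"])
# abond() reports the coefficient table with the AR(1), AR(2), and Hansen tests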
```
```stata
* Stata — Arellano-Bond with xtabond2 (preferred)
ssc install xtabond2
xtset entity_id time_var
* Difference GMM (instruments: lags 2-4 of y, as in the R example):
xtabond2 y L.y x1 x2, gmm(y, lag(2 4)) iv(x1 x2) ///
twostep robust noleveleq
* System GMM (adds level equation with lagged differences as instruments):
xtabond2 y L.y x1 x2, gmm(y, lag(2 4)) iv(x1 x2) twostep robust
* Diagnostics reported automatically:
* - AR(1), AR(2) tests
* - Hansen J test of overidentifying restrictions
```
**Interpretation rules**:
- AR(1) significant, AR(2) insignificant → no second-order serial correlation in levels ✓
- Hansen J p > 0.05 → instruments jointly valid ✓
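- Instrument proliferation: when the instrument count approaches or exceeds the number of units (N), the Hansen test loses power, and p-values near 1 are a warning sign. Restrict the lag range or collapse the instrument set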
## Standard Errors for Panel Data
| Situation | Recommended SE |
|-----------|---------------|
| Serial correlation within entities | Cluster by entity |
| Cross-sectional dependence | Driscoll-Kraay SE |
| Both serial + cross-sectional | Two-way clustering |
| Heteroskedasticity only | HC robust SE |
### Driscoll-Kraay and Two-Way Clustering Code
**Driscoll-Kraay SE**: Robust to cross-sectional dependence and serial correlation. Preferred for macro panels (small N, large T).
```r
# R: Driscoll-Kraay SE (vcovSCC from the plm package)
library(plm); library(sandwich); library(lmtest)
fe_model <- plm(y ~ x1 + x2, data = panel_df, model = "within")
# Driscoll-Kraay SE (robust to cross-sectional and serial dependence):
coeftest(fe_model, vcov = vcovSCC(fe_model, type = "HC1", maxlag = 4))
# maxlag: number of lags for serial correlation (typically T^0.25)
```
```stata
* Stata — Driscoll-Kraay SE
xtscc y x1 x2, fe lag(4)
* lag(4) = bandwidth parameter; use T^0.25 as a rule of thumb
```
**Two-way clustering**: Clusters at both the entity and the time level. Use when errors are correlated within an entity over time and also across entities within the same period.
```r
# R: two-way clustering for plm models (vcovDC from the plm package)
library(plm); library(lmtest)
# V_twoway = V_entity + V_time - V_entity×time
# plm models use plm's own estimators here rather than sandwich::vcovCL()
vcov_entity <- vcovHC(fe_model, cluster = "group")   # entity only
vcov_time <- vcovHC(fe_model, cluster = "time")      # time only
vcov_both <- vcovDC(fe_model, type = "HC1")          # entity and time
coeftest(fe_model, vcov = vcov_both)
```
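The combination formula V_twoway = V_entity + V_time - V_entity×time can be written out directly. The sketch below is a plain NumPy version for pooled OLS on simulated data, with no small-sample correction; the helper names `cluster_vcov` and `twoway_vcov` and the data-generating process are assumptions for illustration.

```python
import numpy as np

def cluster_vcov(X, resid, labels):
    """Cluster-robust sandwich (X'X)^-1 [sum_g s_g s_g'] (X'X)^-1."""
    XtX_inv = np.linalg.inv(X.T @ X)
    scores = X * resid[:, None]
    _, codes = np.unique(labels, return_inverse=True)
    codes = codes.ravel()
    S = np.zeros((codes.max() + 1, X.shape[1]))
    np.add.at(S, codes, scores)                # sum the scores within each cluster
    return XtX_inv @ (S.T @ S) @ XtX_inv

def twoway_vcov(X, resid, g1, g2):
    """V_g1 + V_g2 - V_(g1 and g2 intersection)."""
    inter = np.array([f"{a}_{b}" for a, b in zip(g1, g2)])
    return (cluster_vcov(X, resid, g1) + cluster_vcov(X, resid, g2)
            - cluster_vcov(X, resid, inter))

rng = np.random.default_rng(4)                 # illustrative seed
n_e, n_t = 50, 20
ent = np.repeat(np.arange(n_e), n_t)
tim = np.tile(np.arange(n_t), n_e)
# Regressor and error both carry entity and time components
x = rng.normal(size=n_e)[ent] + rng.normal(size=n_t)[tim] + rng.normal(size=ent.size)
u = rng.normal(size=n_e)[ent] + rng.normal(size=n_t)[tim] + rng.normal(size=ent.size)
y = 1.0 + 0.5 * x + u

X = np.column_stack([np.ones_like(x), x])
beta = np.linalg.lstsq(X, y, rcond=None)[0]
resid = y - X @ beta

V_ent = cluster_vcov(X, resid, ent)
V_time = cluster_vcov(X, resid, tim)
V_two = twoway_vcov(X, resid, ent, tim)
print("slope SE, entity :", np.sqrt(V_ent[1, 1]).round(4))
print("slope SE, time   :", np.sqrt(V_time[1, 1]).round(4))
print("slope SE, two-way:", np.sqrt(V_two[1, 1]).round(4))
```

Because each entity-period cell holds a single observation here, the intersection term reduces to the ordinary heteroskedasticity-robust (HC0) matrix. The combined matrix is not guaranteed to be positive semi-definite in small samples, so check the diagonal before taking square roots.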
```stata
* Stata — Two-way clustering
xtreg y x1 x2 i.time_var, fe vce(cluster entity_id) // cluster by entity only
* For two-way clustering (entity AND time):
reghdfe y x1 x2, absorb(entity_id time_var) vce(cluster entity_id time_var)
```
## Interpreting Fixed Effects Results
- FE coefficients identify **within-unit** variation only
- Cannot estimate effect of time-invariant variables (absorbed by unit FEs)
- Two-way FE removes time-invariant unit heterogeneity and period shocks common to all units; it does not remove unit-specific trends
- Always report whether entity FE, time FE, or both are included
## Reporting Standards
- State panel dimensions: N = [units], T = [periods], total obs
- Report whether SE are clustered (at entity level is standard)
- Specify which effects are included (entity, time, or both)
- Report F-test for joint significance of fixed effects
- Include Hausman test result when choosing FE over RE
For first-difference (FD) estimators and panel models with limited dependent variables (conditional logit, Poisson FE, Cox stratified hazard), see [`references/panel-ldv-advanced.md`](references/panel-ldv-advanced.md).
## Common Pitfalls
- **Using RE when FE is appropriate**: If Hausman test rejects, RE is inconsistent — always test
- **Clustering at the wrong level**: Cluster SE at the level of treatment variation, not the individual level
- **Nickell bias**: Including lagged DV in short-T panels with FE is biased — use Arellano-Bond GMM
- **Ignoring cross-sectional dependence**: In macro panels (small N, large T), standard FE SE are invalid — use Driscoll-Kraay
- **Interpreting FE coefficients as between-unit effects**: FE estimates are purely within-unit; they cannot speak to cross-unit differences
paper-writing
Draft economics papers with proper structure and academic style
# Paper Writing Skill
## Role in Workflow
**This skill is the authoritative LaTeX template library and writing convention reference for the econometrics plugin.** It is designed to be called by the `/write` command (Phase 9) rather than invoked independently.
**Division of labor with `/write` command:**
| Responsibility | Owner |
|---------------|-------|
| Read upstream files (model-spec.md, results-memo.md, etc.) | `/write` command |
| Extract research context and build writing brief | `/write` command |
| Confirm writing scope, language, and target journal with user | `/write` command |
| Provide section-by-section content rules tied to upstream data | `/write` command |
| Provide LaTeX templates (preamble, section skeletons) | **This skill** |
| Provide causal language calibration rules | **This skill** |
| Provide writing conventions, common pitfalls, compile workflow | **This skill** |
| Assemble paper.tex, compile to PDF, export DOCX | `/write` command |
---
## Narrative Writing Principles
**An empirical economics paper is a causal story.** Every section must advance the narrative; no section should be a standalone island of results.
### The Narrative Arc
```
Introduction → Literature → Data → Strategy → Results
↑ ↓
(question) (answer)
↑ ↓
Conclusion ← Discussion ← Mechanisms ← Robustness
(so what) (credibility)
```
**Each section must:**
1. **Open** with one sentence bridging from the previous section
2. **Close** with one sentence previewing the next section (except Conclusion)
### Figure and Table Integration Rule
Every figure or table reference in the text must accomplish three things in the same paragraph:
| Step | What to do | Example |
|------|-----------|---------|
| **① Preview** | Tell the reader what they're about to see | *"Table 2 presents estimates of equation (1)."* |
| **② Extract** | Give the key number from the table/figure | *"The coefficient on D is 0.12 (s.e. = 0.04)..."* |
| **③ Interpret** | State what this means for the narrative | *"...implying a 15% increase relative to the mean."* |
**Never write**: "See Table X." or "Figure Y presents the results." without extracting and interpreting the number.
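The preview, extract, interpret pattern can be sketched as a small helper that turns one estimate into the three-part sentence. The function name `describe_estimate` is hypothetical, and the outcome mean of 0.80 is an assumed value chosen so that the output matches the 15% figure in the example column; the percentage reading only makes sense when the outcome is measured in levels.

```python
def describe_estimate(table_label, coef, se, outcome_mean, treat="D"):
    """Build a preview / extract / interpret sentence from one estimate."""
    pct = 100.0 * abs(coef) / abs(outcome_mean)
    direction = "increase" if coef > 0 else "decrease"
    return (
        f"{table_label} presents estimates of equation (1). "          # preview
        f"The coefficient on {treat} is {coef:.2f} (s.e. = {se:.2f}), "  # extract
        f"implying a {pct:.0f}% {direction} relative to the mean."      # interpret
    )

sentence = describe_estimate("Table 2", coef=0.12, se=0.04, outcome_mean=0.80)
print(sentence)
```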
### Main Text vs. Appendix — Placement Decision
**Keep in main text** (direct support for the main causal story):
- Summary statistics table (Table 1)
- Main regression table (Table 2)
- Core identification figure (event study, RDD binscatter, IV first-stage)
- Primary heterogeneity table (if it is a core contribution)
**Move to appendix** (supporting/supplementary material):
- Full robustness table (multiple specification columns)
- Balance/pre-treatment tests (DiD, IV)
- Variable definitions table
- Additional heterogeneity subgroups
- Sample construction flowchart
- Supplementary mechanism figures
When referencing appendix items in the main text, use: *"Appendix Table A1 reports..."* or *"(see Appendix Figure A1)"*.
---
## LaTeX Preamble (Authoritative)
This is the canonical preamble for economics journal submissions. The `/write` command uses this directly — there is no separate preamble in the command file.
```latex
\documentclass[12pt]{article}
\usepackage{amsmath,amssymb}
\usepackage[margin=1.25in]{geometry}
\usepackage{setspace}
\usepackage{booktabs,caption,threeparttable,makecell}
\usepackage{pdflscape} % landscape pages for wide tables (\begin{landscape})
\usepackage{graphicx}
\usepackage{hyperref}
\usepackage{natbib}
\usepackage[title,titletoc]{appendix} % appendix environment with lettered sections
\usepackage{subcaption}
% \usepackage{microtype} % enable if TeX distribution supports it
\setstretch{1.5} % 1.5x body spacing -- standard for journal submissions
\captionsetup{labelfont=bf, labelsep=period, font=small}
\hypersetup{colorlinks=true, linkcolor=black, citecolor=black, urlcolor=blue}
\graphicspath{{../figures/}} % relative to paper/ -- one level up to reach figures/
\bibliographystyle{aer} % AER author-year; use chicago for other journals
```
### Front Matter (First Page)
The global `\setstretch{1.5}` inflates the title block unless overridden. Wrap front matter in `\begin{spacing}{1.0}`.
```latex
\begin{document}
\thispagestyle{empty}
\begin{spacing}{1.0}
\begin{center}
{\large\bfseries [PAPER TITLE]}\par
\vspace{0.7em}
{\normalsize\bfseries [AUTHOR NAME]}\par
{\small\textit{[AFFILIATION]} \quad \href{mailto:[EMAIL]}{[EMAIL]}}\par
{\small [MONTH YEAR]}\par
\end{center}
\vspace{0.5em}
\input{sections/abstract}
\vspace{0.4em}
\noindent\textbf{JEL Codes:} [J23, J31, O33]
\quad\textbf{Keywords:} [keyword1, keyword2, keyword3]
\vspace{0.3em}
\noindent\footnotesize\textit{I thank [acknowledgements]. All errors are my own.}
\normalsize
\end{spacing}
\clearpage
% NO \tableofcontents -- not standard in economics journal submissions
\input{sections/introduction}
\input{sections/literature}
\input{sections/data}
\input{sections/strategy}
\input{sections/results}
\input{sections/robustness}
\input{sections/heterogeneity}
\input{sections/discussion}
\input{sections/conclusion}
\bibliography{references}
% ── Appendix ─────────────────────────────────────────────────────────────────
\begin{appendices}
% Reset counters for appendix tables and figures
\renewcommand{\thetable}{A\arabic{table}}
\renewcommand{\thefigure}{A\arabic{figure}}
\setcounter{table}{0}
\setcounter{figure}{0}
\input{sections/appendix} % contains all supplementary tables and figures
\end{appendices}
\end{document}
```
### Key Front-Matter Conventions
- **No `titlepage` environment** -- it forces the abstract onto a separate page
- **No `\tableofcontents`** -- not standard in economics submissions
- **`\begin{spacing}{1.0}` for title block** -- prevents 1.5x stretch from inflating spacing
- **`\thispagestyle{empty}`** -- suppresses page number on cover page
- **Title font: `\large\bfseries`** -- `\LARGE` combined with 1.5x stretch uses too much vertical space
- **`\citet{}` vs `\citep{}`**: use `\citet{Author2020}` for "Author (2020) show that..." and `\citep{Author2020}` for parenthetical "(Author 2020)"
---
## Causal Language Calibration
**This table governs language intensity across ALL sections.** The causal language level is determined by `/write` from `results-memo.md §4` and passed to this skill. Apply consistently throughout the paper -- do not mix levels within a single draft.
| Identification credibility | Assumption framing | Verb choices |
|---------------------------|--------------------|--------------|
| **High** -- all key assumptions pass empirical tests | "The causal effect of D on Y..." | "causes", "increases", "reduces", "leads to" |
| **Medium** -- some assumptions not fully testable or show mild violations | "The effect of D on Y, as estimated by [strategy]..." | "is associated with", "predicts", "is related to" |
| **Low** -- identification assumptions not plausibly satisfied | "We document a correlation between D and Y..." | "is correlated with", "co-moves with", "we observe" |
**Cross-section enforcement**: once the level is set, apply it uniformly in the Abstract, Introduction, Results, and Conclusion. The Empirical Strategy section always describes assumptions in the formal/neutral register regardless of the level.
---
## Section Templates
### 1. Abstract
**Format**: 150-200 words, 5-6 sentences, fixed structure.
```latex
\begin{abstract}
\noindent
[RESEARCH QUESTION IN ONE SENTENCE].
Using [DATA SOURCE] covering [N] [UNITS] over [PERIOD],
we exploit [IDENTIFICATION STRATEGY] to identify
the [causal/estimated] effect of [D] on [Y].
We find that [D] [CAUSAL VERB] [Y] by [MAGNITUDE] [UNITS/PERCENT/SD]
([SIGNIFICANCE STATEMENT, e.g., significant at the 1\% level]).
[HETEROGENEITY OR ROBUSTNESS SENTENCE --
"This effect is concentrated among [SUBGROUP] and robust to [CHECKS]."]
Our results [CONTRIBUTION VERB: inform/challenge/extend]
[POLICY/THEORY IMPLICATION].
\end{abstract}
```
**Checklist**:
- Contains the main coefficient magnitude (not just direction)
- Names the identification strategy explicitly
- Causal verb consistent with the calibration table above
- Does NOT merely report statistical significance without magnitude
---
### 2. Introduction
Top-5 journal introductions are typically 4-6 pages (approx. 1,500-2,000 words). Follow the five-paragraph structure.
```latex
\section{Introduction}
\label{sec:introduction}
% -- Paragraph 1: Hook + Research Question + Main Finding (numbers required) --
% Rule: by the end of paragraph 1, the reader must know (1) what question,
% (2) what finding, (3) the magnitude. Never open with "This paper examines..."
[HOOK SENTENCE -- a striking fact, statistic, or policy puzzle].
This paper [provides causal evidence / documents] that [D] [CAUSAL VERB] [Y]:
a [UNIT CHANGE] in [D] [CAUSAL VERB] [Y] by [MAGNITUDE]
([SIGNIFICANCE]; s.e.\ = [SE]).
% -- Paragraph 2: Why is this hard to estimate? (endogeneity threats) --
Estimating the [causal / true] effect of [D] on [Y] is challenging
for [two/three] reasons.
First, [OVB THREAT -- name the omitted variable and its bias direction].
Second, [REVERSE CAUSALITY THREAT].
[Third, [MEASUREMENT ERROR / SELECTION CONCERN].]
Prior work using [OLS / cross-sectional comparisons] is likely to
[overstate/understate] the effect because [DIRECTION AND SOURCE OF BIAS].
% -- Paragraph 3: Identification strategy and data --
We address these challenges by exploiting [SOURCE OF EXOGENOUS VARIATION].
[EXPLAIN WHY THIS VARIATION IS PLAUSIBLY EXOGENOUS -- one to two sentences].
Our data come from [DATA SOURCE], covering [N UNITS] over [PERIOD],
yielding [OBS] observations.
% -- Paragraph 4: Summary of findings (magnitudes + heterogeneity + robustness) --
Our main finding is that [RESTATE MAIN RESULT WITH COEFFICIENT].
[ECONOMIC SIGNIFICANCE -- relative to mean or SD].
[HETEROGENEITY -- one sentence if applicable].
The results are robust to [ROBUSTNESS SUMMARY -- one sentence].
% -- Paragraph 5: Contribution + Roadmap --
% Rule: no more than 3 contribution statements; roadmap no more than 5 sentences.
This paper contributes to [STRAND 1] by [CONTRIBUTION 1].
[It also contributes to [STRAND 2] by [CONTRIBUTION 2].]
Unlike \citet{CLOSEST PAPER}, we [KEY DISTINCTION].
The remainder of the paper proceeds as follows.
Section~\ref{sec:literature} reviews related literature,
and Section~\ref{sec:data} describes the data.
Section~\ref{sec:strategy} presents the empirical strategy.
Section~\ref{sec:results} reports main results,
and Section~\ref{sec:robustness} presents robustness checks.
Section~\ref{sec:conclusion} concludes.
```
**Introduction prohibitions**:
- "This paper examines..." as the opening sentence
- Reporting significance only, no magnitude
- More than 3 contribution statements
- Roadmap paragraph longer than 5 sentences
- Any equation or technical notation in the introduction
---
### 3. Literature Review
```latex
\section{Related Literature}
\label{sec:literature}
% Organize by research STRAND (thematic), NOT chronologically.
% Typical: 2-4 strands, 1-3 paragraphs each.
% -- Strand 1: Core empirical literature --
This paper builds on [NUMBER] strands of literature.
The first strand examines [BROAD TOPIC].
\citet{Seminal1} and \citet{Seminal2} established [FOUNDATIONAL FINDING].
More recently, \citet{Recent1} show [KEY FINDING] using [DATA/METHOD],
and \citet{Recent2} document [COMPLEMENTARY RESULT] in [CONTEXT].
Unlike these papers, we [KEY DISTINCTION -- method, context, or mechanism].
% -- Strand 2: Identification approach --
The second strand uses [SIMILAR IDENTIFICATION STRATEGY] to study related questions.
\citet{Author3} exploit [INSTRUMENT/EVENT] to identify [EFFECT], finding [RESULT].
Our approach is closest to \citet{Author4}, who [BRIEF DESCRIPTION OF THEIR DESIGN].
We extend their work by [HOW WE DIFFER].
% -- Strand 3: Mechanism or theoretical background (if applicable) --
% -- Explicit contribution positioning --
This paper contributes to these literatures in two ways.
First, [CONTRIBUTION 1: the new empirical evidence you provide].
Second, [CONTRIBUTION 2: new context, method, or mechanism].
```
---
### 4. Data
```latex
\section{Data}
\label{sec:data}
\subsection{Data Sources and Sample Construction}
Our primary data come from [DATA SOURCE], which covers
[UNIT OF OBSERVATION] over the period [START YEAR]--[END YEAR].
[ONE TO TWO SENTENCES describing the data content and collection method].
We construct our analysis sample by [SAMPLE SELECTION STEPS].
We exclude [EXCLUSION CRITERIA AND REASON],
resulting in a final sample of [N] [UNITS] and [OBS] observations.
\subsection{Variable Definitions}
Our main outcome variable is [Y], defined as [PRECISE DEFINITION].
[TREATMENT VARIABLE D] takes value one if [CONDITION] and zero otherwise.
[INSTRUMENT Z] measures [DEFINITION -- for IV papers].
Variable definitions follow [REFERENCE IF STANDARD DEFINITION].
\subsection{Descriptive Statistics}
Table~\ref{tab:sumstats} reports summary statistics for the main analysis sample.
The average [Y] is [MEAN] (standard deviation [SD]).
[D PERCENT]\% of [UNITS] are [TREATED / ABOVE THRESHOLD].
[KEY OBSERVATION ABOUT DATA DISTRIBUTION OR NOTABLE PATTERN].
[IF DiD/IV: Table~\ref{tab:balance} reports pre-treatment balance.
The treatment and control groups are similar on [KEY COVARIATES],
which supports the comparability of the two groups but does not by itself establish [PARALLEL TRENDS / THE EXCLUSION RESTRICTION].]
```
---
### 5. Empirical Strategy
**This section receives the most scrutiny from referees.** Draw directly from `model-spec.md` and `identification-memo.md`; do not re-derive the model.
```latex
\section{Empirical Strategy}
\label{sec:strategy}
\subsection{Baseline Specification}
% Use the equation exactly as specified in model-spec.md Section 2
Our baseline specification is:
\begin{equation}
[MAIN EQUATION FROM model-spec.md]
\label{eq:baseline}
\end{equation}
\noindent where $[Y_{it}]$ is [OUTCOME DEFINITION],
$[D_{it}]$ is [TREATMENT DEFINITION],
$\mathbf{X}_{it}$ is a vector of controls including [CONTROL LIST],
$[FE NOTATION]$ are [ENTITY/TIME] fixed effects,
and $\varepsilon_{it}$ is the error term.
$[PARAMETER]$ is the coefficient of interest; it measures
[WHAT THE PARAMETER IDENTIFIES -- ATE/ATT/LATE and the population it applies to].
Standard errors are clustered at the [LEVEL] level to account for
[SERIAL/SPATIAL CORRELATION -- state the economic reason explicitly].
\subsection{Identification}
% State each assumption formally and in plain language.
% Always preview where you test it.
The key identifying assumption is [ASSUMPTION IN PLAIN LANGUAGE].
Formally:
\begin{equation*}
[FORMAL STATEMENT -- e.g., parallel trends, relevance, exclusion restriction]
\end{equation*}
This assumption would be violated if [SPECIFIC, CONCRETE THREAT].
We provide evidence supporting this assumption in
Section~\ref{sec:robustness}, where we show [BRIEF PREVIEW OF TEST RESULT].
% FOR IV -- add exclusion restriction paragraph:
% The exclusion restriction requires that [Z] affects [Y] only through [D].
% This could be violated if [SPECIFIC VIOLATION SCENARIO].
% We argue this is unlikely because [ECONOMIC ARGUMENT].
% FOR RDD -- add continuity / no-manipulation paragraph
% FOR DiD -- add parallel trends + no-anticipation paragraph
```
---
### 6. Results
```latex
\section{Results}
\label{sec:results}
% Rule: Lead with the number. Never open with "Table X shows our results."
\subsection{Main Results}
[MAIN COEFFICIENT AS FIRST CLAUSE -- e.g., "A one-unit increase in D
[CAUSAL VERB] Y by [VALUE] (s.e.\ = [SE], [SIGNIFICANCE])."]
Table~\ref{tab:main} reports estimates of equation~\eqref{eq:baseline}.
Column~(1) presents the baseline specification without controls.
Adding [CONTROL SET] in column~(2) [leaves the estimate stable at /
reduces it to] [VALUE], [INTERPRETATION OF CHANGE].
Our preferred specification in column~([N]) includes [FULL CONTROLS AND FE]
and yields [FINAL ESTIMATE] ([SE], [SIGNIFICANCE]).
% Economic significance -- mandatory
To assess economic significance, note that the sample mean of [Y] is [MEAN].
Our estimate implies that [D] [CAUSAL VERB] [Y] by
[MAGNITUDE PERCENT]\% relative to the mean,
or approximately [SD-UNITS] standard deviations.
[COMPARISON: This is [larger/comparable/smaller] than
\citet{CLOSEST PAPER}'s estimate of [THEIR VALUE] using [THEIR METHOD].]
% Event study / dynamic effects (DiD papers)
[Figure~\ref{fig:eventstudy} plots event-study coefficients $\hat{\beta}_k$
for $k \in [-K, K]$ relative to the treatment date.
Pre-treatment coefficients are small and indistinguishable from zero
(joint $F$-test: $p = $ [P-VALUE]), supporting parallel trends.
Post-treatment coefficients are [positive/negative] and [growing/stable/declining],
consistent with [INTERPRETATION].]
% IMPORTANT: Tables are always in separate files under tables/
% The paper.tex structure handles \input{../tables/table_main.tex}
% Never write \begin{tabular}...\end{tabular} inline in section files
```
---
### 7. Robustness
```latex
\section{Robustness}
\label{sec:robustness}
% Open with the conclusion, then present evidence
Table~\ref{tab:robustness} presents a battery of robustness checks.
Our main finding is stable: the coefficient on [D] ranges from [MIN] to [MAX]
across all specifications, remaining statistically significant at the
[\%] level in [N of M] cases.
\subsection{Inference Robustness}
[Column~(1) replicates the preferred specification with [ALTERNATIVE SE TYPE].
The coefficient is [VALUE] ([SE]), nearly identical to the baseline.]
\subsection{Sample Robustness}
[Column~(2) excludes [GROUP / OUTLIER CRITERION].
Column~(3) restricts the sample to [SUBSAMPLE].
Estimates range from [MIN] to [MAX], consistent with the baseline.]
\subsection{Specification Robustness}
[Column~([N]) uses [ALTERNATIVE FUNCTIONAL FORM / CONTROL SET / BANDWIDTH].
The coefficient is [VALUE], [MAGNITUDE CHANGE AND INTERPRETATION].]
\subsection{Identification Checks}
% DiD: Pre-trend test
[Figure~\ref{fig:eventstudy} shows no differential pre-trends
(joint $F$-test: $p = $ [P]).]
% DiD/IV: Placebo
[Assigning treatment [one year earlier / to untreated units] yields
$\hat{\beta} = $ [NEAR-ZERO VALUE] ([SE]), confirming our result is not
driven by pre-existing trends.]
% RDD: McCrary density test
[The density of [RUNNING VARIABLE] is continuous at the threshold
($p = $ [P]), ruling out sorting around the cutoff.]
% IV: First-stage strength
[The first-stage $F$-statistic on [Z] is [F], well above the
conventional rule-of-thumb threshold of 10, which mitigates weak-instrument concerns.]
% Template for any coefficient that moves materially:
% In column ([N]), [WHAT CHANGES]. The estimate [increases/decreases] to [VALUE],
% reflecting [ECONOMIC REASON]. This does not challenge the main conclusion
% because [ARGUMENT].
```
---
### 8. Heterogeneity and Mechanisms
```latex
\section{Heterogeneity and Mechanisms}
\label{sec:heterogeneity}
\subsection{Heterogeneous Treatment Effects}
% Always: state the theory first, then show the data
[THEORETICAL REASON why heterogeneity is expected along [DIMENSION]].
If [MECHANISM], we would expect the effect to be larger among [SUBGROUP].
Table~\ref{tab:heterogeneity} reports treatment effects by [DIMENSION].
The effect is [X times larger / present only / absent] for [SUBGROUP 1]
(column~([A]): $\hat{\beta} = $ [VALUE], s.e.\ = [SE])
relative to [SUBGROUP 2]
(column~([B]): $\hat{\beta} = $ [VALUE], s.e.\ = [SE]).
[The interaction term is statistically significant ($p = $ [P]).]
[SUBGROUP 1] may respond more because [ECONOMIC REASON].
\subsection{Mechanisms}
% Language intensity follows the causal calibration table:
% Direct mechanism evidence (externally identified) -> "provides direct evidence for"
% Suggestive / descriptive test -> "is consistent with"
% Ruling out alternatives -> "rules out" / "cannot be explained by"
Our results [ARE CONSISTENT WITH / PROVIDE DIRECT EVIDENCE FOR] [MAIN MECHANISM].
[We examine [INTERMEDIATE OUTCOME Z], which should [INCREASE/DECREASE]
if [MECHANISM] is operative.
Table~\ref{tab:mechanisms} shows that [D] [VERB] [Z] by [MAGNITUDE] ([SIGNIFICANCE]),
consistent with [MECHANISM].]
An alternative explanation is [COMPETING MECHANISM].
However, this predicts [TESTABLE IMPLICATION], which we do not observe in [EVIDENCE].
[We therefore rule out / cannot rule out] [ALTERNATIVE].
```
---
### 9. Discussion
```latex
\section{Discussion}
\label{sec:discussion}
\subsection{External Validity}
Our estimates apply most directly to [POPULATION / CONTEXT / TIME PERIOD].
Several factors may limit generalizability.
First, [LIMITATION 1 -- e.g., single country, specific sector].
Whether these findings generalize to [OTHER CONTEXT] is an open question.
Second, [LIMITATION 2 -- e.g., identification relies on a particular event].
Third, [LIMITATION 3 -- data quality, partial compliance, SUTVA].
\subsection{Policy Implications}
Our findings suggest that [POLICY INTERVENTION] could [EFFECT ON OUTCOME].
[BACK-OF-ENVELOPE: scaling up by [FACTOR] implies [AGGREGATE EFFECT].]
Policymakers should be cautious because [CAVEAT --
general equilibrium effects, targeting, political economy, compliance].
```
---
### 10. Conclusion
**Top-5 conclusions: 1-2 pages maximum. Do not introduce new results.**
```latex
\section{Conclusion}
\label{sec:conclusion}
% 6 elements, approx. 6-8 sentences total
% (1) Question + method (1 sentence)
This paper examined [RESEARCH QUESTION] using [IDENTIFICATION STRATEGY]
and [DATA SOURCE].
% (2) Main findings with magnitudes (2-3 sentences)
We find that [D] [CAUSAL VERB] [Y] by [MAGNITUDE] ([SIGNIFICANCE]).
[HETEROGENEITY FINDING IF APPLICABLE].
[MECHANISM FINDING IF APPLICABLE].
% (3) Robustness summary (1 sentence)
These results are robust to [ROBUSTNESS CHECKS SUMMARY].
% (4) Policy / theory implications (2-3 sentences)
For policy, our findings suggest [IMPLICATION].
For theory, they [SUPPORT / CHALLENGE / EXTEND] [THEORETICAL MECHANISM/VIEW].
% (5) Limitations (1-2 sentences)
Our analysis has limitations. [MOST IMPORTANT LIMITATION -- honest but brief].
% (6) Future directions (1-2 sentences)
Future work could [DIRECTION 1 -- natural next step].
[DIRECTION 2 -- broader question opened by this paper].
```
**Conclusion prohibitions**:
- Any result not already presented in the body of the paper
- Section-by-section recap ("In Section 2 we showed...")
- "More research is needed" without specifics
- Exceeding 2 pages
---
## Compile Workflow (Authoritative)
### Step 1 — LaTeX → PDF
Always run **four commands** in this exact sequence. Skipping BibTeX leaves all `\cite{}` commands as `[?]` in the final PDF.
```bash
cd [workspace]/paper/
pdflatex -interaction=nonstopmode paper.tex # pass 1: build .aux file
bibtex paper # resolve citations -- produces .bbl
pdflatex -interaction=nonstopmode paper.tex # pass 2: embed bibliography
pdflatex -interaction=nonstopmode paper.tex # pass 3: fix all cross-references
# Inspect the log
grep -i "overfull\|undefined\|missing\|error" paper.log
grep "Overfull .hbox" paper.log # >10pt = visible overflow, fix it
grep "File.*not found" paper.log # missing package
grep "Citation.*undefined" paper.log # missing .bib entry
```
**Auto-fix for the four most common errors:**
| Error | Cause | Fix |
|-------|-------|-----|
| `File 'siunitx.sty' not found` | Table used `S` columns | Switch to `c` or `D{.}{.}{-1}` (dcolumn) |
| `Unicode character U+XXXX` | Python wrote symbols directly into .tex | Replace with LaTeX macros: `$\geq$`, `$\rightarrow$` |
| `Overfull \hbox (>10pt)` | Table or text overflows margin | Wrap in `\begin{landscape}...\end{landscape}` + `\footnotesize` |
| `Citation 'key' undefined` | Entry missing from references.bib | Add BibTeX entry and rerun bibtex |
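The log inspection above can also be scripted. The following is a minimal sketch, not part of the plugin: the helper name `scan_latex_log` and the sample log text are hypothetical, and the regular expressions assume the usual pdfLaTeX log wording (names quoted with a backtick and an apostrophe, widths reported as "pt too wide").

```python
import re

# Hypothetical helper: collect the problems that the grep commands above look for.
OVERFULL_RE = re.compile(r"Overfull \\hbox \(([\d.]+)pt too wide\)")
CITATION_RE = re.compile(r"Citation `([^']+)'.*undefined")
MISSING_FILE_RE = re.compile(r"File `([^']+)' not found")


def scan_latex_log(log_text: str, overfull_limit_pt: float = 10.0) -> dict:
    """Return overfull boxes wider than the limit, undefined citations, and missing files."""
    widths = [float(w) for w in OVERFULL_RE.findall(log_text)]
    return {
        "overfull_over_limit": [w for w in widths if w > overfull_limit_pt],
        "undefined_citations": sorted(set(CITATION_RE.findall(log_text))),
        "missing_files": sorted(set(MISSING_FILE_RE.findall(log_text))),
    }


# Illustrative log excerpt (made up for this example)
sample_log = (
    "Overfull \\hbox (12.5pt too wide) in paragraph at lines 3--4\n"
    "Overfull \\hbox (2.3pt too wide) in paragraph at lines 9--9\n"
    "LaTeX Warning: Citation `smith2020' on page 1 undefined on input line 5.\n"
    "! LaTeX Error: File `siunitx.sty' not found.\n"
)
report = scan_latex_log(sample_log)
print(report)
```

In practice the text would come from `open("paper.log", errors="replace").read()`; only boxes above the 10pt limit are reported, matching the "minor, acceptable" treatment of small overflows.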
### Step 2 — LaTeX → DOCX (via pandoc)
After a successful PDF compile, export DOCX for co-author review and journal submission systems that require Word format.
```bash
cd [workspace]/paper/
# Preferred: pandoc with citeproc (preserves math, citations, cross-refs)
pandoc paper.tex \
--bibliography=references.bib \
--citeproc \
-o paper.docx
# Verify the output exists and has non-trivial size
ls -lh paper.docx
```
If pandoc is not installed:
```bash
# Ubuntu/Debian
apt-get install -y pandoc 2>/dev/null || true
# macOS
brew install pandoc 2>/dev/null || true
# Check version
pandoc --version | head -1
```
Fallback (lower fidelity, no pandoc needed):
```bash
libreoffice --headless --infilter="writer_pdf_import" --convert-to docx paper.pdf --outdir .
```
**Pandoc known limitations** — inform the user if these apply:
- Complex LaTeX tables may render as plain text in DOCX; the PDF remains the authoritative version
- `\begin{landscape}` pages convert to normal portrait pages in DOCX (landscape formatting is PDF-only)
- `\citet{}`/`\citep{}` resolve correctly only when `--citeproc` and `--bibliography` are passed
### Compile Confirmation
```
─────────────────────────────────────────────────────
✅ pdflatex pass 1-3 + bibtex: no errors
✅ No undefined references
✅ No missing citations
⚠️ Overfull \hbox (2.3pt) — minor, acceptable
✅ DOCX exported via pandoc
─────────────────────────────────────────────────────
Output: paper/paper.pdf ([N] pages)
paper/paper.docx
```
---
## Version Management
Every time a draft is created or revised, the `/write` command saves a versioned copy. This skill does not write files directly -- it provides templates; the command handles file output.
```
paper_v1.0_YYYYMMDD.tex # initial draft
paper_v1.1_YYYYMMDD.tex # first revision
paper_v2.0_YYYYMMDD.tex # major rewrite
```
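The naming pattern above can be generated programmatically. This is a small sketch under the assumption that the version is tracked as a major/minor pair; the function name `versioned_filename` is illustrative and is not the `/write` command's actual implementation.

```python
from datetime import date


def versioned_filename(major: int, minor: int, on: date, stem: str = "paper") -> str:
    """Build a name of the form paper_v<major>.<minor>_<YYYYMMDD>.tex."""
    return f"{stem}_v{major}.{minor}_{on:%Y%m%d}.tex"


# Example: first revision saved on 5 March 2024
example_name = versioned_filename(1, 1, date(2024, 3, 5))
print(example_name)
```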
If called in Mode B (standalone) to revise a single section, always confirm: "Should I overwrite the existing file or save as a new version?" Default to saving a new version.
---
## Writing Tips
**Introductions**
- First sentence must contain a substantive claim or striking fact -- never "This paper examines..."
- State the main result with a number by the end of paragraph 1
- Limit contribution statements to 3 or fewer; more signals overselling to reviewers
**Results**
- Lead every results paragraph with the coefficient value, not "Table X shows..."
- Economic significance is mandatory: interpret magnitude relative to mean or SD
- Guide the reader column by column; explain why estimates move across specifications
**Conclusions**
- Synthesize, do not summarize -- add interpretive value, not repetition
- Be honest about limitations; reviewers will find them regardless
- End on the contribution, not a hedge
---
## Landscape Table Template (Wide Regression Tables)
Use this template whenever a table has **more than 5 columns** or produces an `Overfull \hbox` warning wider than 5pt at standard page width. The table is rotated onto its own landscape page and set as a float page (`[p]`).
```latex
% ── Wide regression table on a dedicated landscape page ──────────────────────
\begin{landscape}
\begin{table}[p]
\caption{[Table Title — e.g., Robustness Checks]}
\label{tab:robustness}
\footnotesize % slightly smaller font to fit more columns
\begin{threeparttable}
\begin{tabular}{l *{6}{c}} % 1 label col + 6 numeric cols (adjust as needed)
\toprule
& (1) & (2) & (3) & (4) & (5) & (6) \\
& Baseline & Alt. SE & Excl. Outliers & Log Y & Alt. Controls & Preferred \\
\midrule
[D var] & [β₁]*** & [β₂]*** & [β₃]*** & [β₄]** & [β₅]*** & \textbf{[β₆]***} \\
& ([SE₁]) & ([SE₂]) & ([SE₃]) & ([SE₄]) & ([SE₅]) & \textbf{([SE₆])} \\
\addlinespace
Controls & No & No & Yes & Yes & Alt & Yes \\
FE & No & Yes & Yes & Yes & Yes & Yes \\
$N$ & [N] & [N] & [N] & [N] & [N] & [N] \\
$R^2$ & [r] & [r] & [r] & [r] & [r] & [r] \\
\bottomrule
\end{tabular}
\begin{tablenotes}[flushleft]
\footnotesize
\item \textit{Notes:} [Standard error clustering. Significance: *\,p$<$0.1, **\,p$<$0.05, ***\,p$<$0.01.]
\end{tablenotes}
\end{threeparttable}
\end{table}
\end{landscape}
% ─────────────────────────────────────────────────────────────────────────────
```
**Rules for landscape tables:**
- Always use `[p]` float specifier (dedicated page, no surrounding text)
- Use `\footnotesize` inside the table body (10pt in a 12pt document; smaller at other base sizes). The size change is scoped to the `table` environment, so `\normalsize` resumes automatically afterwards
- `\begin{landscape}` requires `pdflscape` (already in the authoritative preamble)
- In the main text, reference the table normally: *"Table~\ref{tab:robustness} presents..."*
- DOCX export: landscape formatting is PDF-only; inform the user the DOCX version will show the table in portrait
---
## Appendix Template
```latex
% sections/appendix.tex
% Called via \input{sections/appendix} inside \begin{appendices}...\end{appendices}
% Counter resets (A1, A2, ...) are handled in paper.tex
\section{Additional Robustness Checks}
\label{sec:appendix_robustness}
[Brief narrative sentence explaining what each appendix table shows and why
it belongs in the appendix rather than the main text.]
% Wide robustness table — also use landscape here if > 5 cols
\input{../tables/table_robustness_appendix}
\section{Pre-Treatment Balance}
\label{sec:appendix_balance}
Table~\ref{tab:balance} reports balance statistics for the treatment and control groups
prior to the policy change. [One sentence summarizing the balance result.]
\input{../tables/table_balance}
\section{Variable Definitions}
\label{sec:appendix_variables}
\input{../tables/table_variable_definitions}
\section{Supplementary Figures}
\label{sec:appendix_figures}
[Brief sentence describing each supplementary figure.]
\begin{figure}[htbp]
\centering
\includegraphics[width=0.85\textwidth]{figA1_supplementary}
\caption{[Caption]}
\label{fig:appendix_A1}
\end{figure}
```
---
## Common Pitfalls
- Burying the main result in the middle of the paper
- Using "significant" without specifying statistical vs. economic
- Over-claiming causality when identification is weak (see Causal Language Calibration above)
- Literature review organized chronologically rather than by research strand
- Conclusion that recaps the paper section by section
- Inline `\begin{tabular}` in section .tex files -- tables go in `tables/`, referenced via `\input`
- `\graphicspath` pointing to wrong directory (must be `../figures/` relative to `paper/`)
- Compiling without bibtex (produces `[?]` for all citations)
rdd-analysis
# Regression Discontinuity Design (RDD) Skill
This skill covers sharp and fuzzy RDD: identification assumptions, bandwidth selection, local polynomial estimation, validity tests, and reporting standards for academic papers.
## Core Logic
RDD exploits a known threshold in a continuous "running variable" (X) that determines treatment assignment. Units just above and below the cutoff (c) are comparable on all dimensions except treatment.
**Sharp RDD**: Treatment perfectly determined by crossing cutoff
- T_i = 1 if X_i ≥ c, T_i = 0 if X_i < c
- Estimand: Average treatment effect at the cutoff (τ_SRD)
**Fuzzy RDD**: Crossing cutoff increases probability of treatment (like an instrument)
- Use when there's non-compliance around the cutoff
- Estimand: LATE at the cutoff (τ_FRD = reduced form / first stage)
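The fuzzy estimand is a ratio of two jumps at the cutoff. The sketch below only illustrates that arithmetic; the function name `fuzzy_rd_wald` and the numbers are hypothetical, and real estimates of both jumps should come from `rdrobust` rather than from hand-entered values.

```python
def fuzzy_rd_wald(jump_in_outcome: float, jump_in_treatment: float) -> float:
    """Reduced-form jump in Y divided by the first-stage jump in treatment take-up."""
    if abs(jump_in_treatment) < 1e-12:
        raise ValueError("No first-stage jump: the fuzzy RDD estimand is undefined.")
    return jump_in_outcome / jump_in_treatment


# Illustrative numbers only: Y jumps by 0.5 and take-up jumps by 0.25 at the cutoff
tau_frd = fuzzy_rd_wald(0.5, 0.25)
print(tau_frd)
```

A sharp design is the special case in which take-up jumps by exactly one, so the ratio reduces to the jump in the outcome.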
## RDD Assumptions
1. **Continuity of conditional expectation**: E[Y(0)|X] and E[Y(1)|X] are continuous at X = c
- Means: units cannot precisely manipulate running variable to select into treatment
2. **No other discontinuities**: Nothing else changes discontinuously at the cutoff
3. **Bandwidth continuity**: Observations near cutoff are locally valid comparisons
## Complete RDD Workflow
### Step 1: Visualize the Discontinuity
Always plot the raw data with binned means before any regression.
```python
# Python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Bin the running variable and compute bin-level means
df['bin'] = pd.cut(df['running_var'], bins=50)
bin_means = df.groupby('bin', observed=True)[['running_var', 'y']].mean().reset_index()
plt.figure(figsize=(10, 6))
plt.scatter(bin_means['running_var'], bin_means['y'], s=30, color='steelblue')
plt.axvline(x=cutoff, color='red', linestyle='--', label='Cutoff')
plt.xlabel('Running Variable'); plt.ylabel('Outcome')
plt.title('RDD: Binned Scatter Plot')
plt.legend(); plt.show()
```
```r
# R (rdplot from rdrobust)
library(rdrobust)
rdplot(y = df$y, x = df$running_var, c = cutoff,
title = "RDD Binned Scatter", x.label = "Running Variable",
y.label = "Outcome")
```
```stata
rdplot y running_var, c(cutoff) graph_options(title("RDD Visualization"))
```
### Step 2: Bandwidth Selection
**Default**: Use Imbens-Kalyanaraman (IK) or Calonico-Cattaneo-Titiunik (CCT) optimal bandwidth.
```python
from rdrobust import rdrobust
result = rdrobust(df['y'], df['running_var'], c=cutoff)
print(result)  # the printed output includes the selected bandwidths
```
```r
rdbwselect(y = df$y, x = df$running_var, c = cutoff)
```
```stata
rdbwselect y running_var, c(cutoff) all
```
### Step 3: Main RDD Estimate
```python
# Python (rdrobust) — triangular kernel, local linear
result = rdrobust(y=df['y'], x=df['running_var'], c=cutoff,
kernel='triangular', p=1)
print(result)  # printing the result object shows the estimation summary
```
```r
# R
main_rdd <- rdrobust(y = df$y, x = df$running_var, c = cutoff,
kernel = "triangular", p = 1)
summary(main_rdd)
```
```stata
rdrobust y running_var, c(cutoff) kernel(triangular) p(1)
```
### Step 4: Validity Tests (All Required)
#### 4a. Density/Manipulation Test (McCrary Test)
H₀: No discontinuity in density of running variable at cutoff
```python
# rddensity ships as its own Python package (pip install rddensity), separate from rdrobust
from rddensity import rddensity
density_test = rddensity(df['running_var'], c=cutoff)
print(density_test)
```
```r
library(rddensity)
rdd_density <- rddensity(df$running_var, c = cutoff)
summary(rdd_density)
rdplotdensity(rdd_density, df$running_var)
```
```stata
rddensity running_var, c(cutoff)
```
**Interpretation**: p > 0.05 → fail to reject continuity of the density at the cutoff; no statistical evidence of bunching, so manipulation is unlikely ✓
#### 4b. Covariate Balance (Placebo Outcome Tests)
Run RDD on pre-determined covariates — should find no discontinuity.
```r
for (cov in c("age", "income_pre", "gender")) {
res <- rdrobust(y = df[[cov]], x = df$running_var, c = cutoff)
cat(cov, ": coef =", res$coef[1], ", p =", res$pv[3], "\n")
}
```
#### 4c. Placebo Cutoff Tests
Run RDD at fake cutoffs above and below actual cutoff — should find no effects.
```r
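for (fake_c in c(cutoff - 5, cutoff + 5)) {
  # Use only the side of the true cutoff on which the placebo cutoff lies,
  # so the real discontinuity cannot contaminate the placebo estimate
  df_sub <- if (fake_c < cutoff) df[df$running_var < cutoff, ] else df[df$running_var >= cutoff, ]
  res <- rdrobust(df_sub$y, df_sub$running_var, c = fake_c)
  cat("Placebo c =", fake_c, ": coef =", res$coef[1], "\n")
}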
```
#### 4d. Bandwidth Sensitivity
Report estimates at 50%, 75%, 125%, 150% of optimal bandwidth.
```r
bw_opt <- rdbwselect(df$y, df$running_var, c = cutoff)$bws[1,1]
for (mult in c(0.5, 0.75, 1, 1.25, 1.5)) {
res <- rdrobust(df$y, df$running_var, c = cutoff, h = bw_opt * mult)
cat("BW =", round(bw_opt*mult,2), ": coef =", round(res$coef[1],3),
", p =", round(res$pv[3],3), "\n")
}
```
## Fuzzy RDD
```r
# R — fuzzy RDD (uses crossing as instrument for actual treatment)
fuzzy_rdd <- rdrobust(y = df$y, x = df$running_var, c = cutoff,
fuzzy = df$actual_treatment)
summary(fuzzy_rdd)
```
```stata
rdrobust y running_var, c(cutoff) fuzzy(actual_treatment)
```
## Reporting Standards
Report in this order:
1. **Binned scatter plot** showing discontinuity visually
2. **Main estimate** with optimal CCT bandwidth, triangular kernel, local linear
3. **Sensitivity table**: estimates across bandwidth multiples and polynomial orders
4. **Validity tests**: density test (McCrary), covariate balance, placebo cutoffs
5. **Sample size**: N total, N within bandwidth (left and right)
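For the sample-size item, the counts on each side of the cutoff can be cross-checked by hand. A minimal sketch, assuming a symmetric bandwidth `h` and the convention that units at or above the cutoff are treated; the function name `count_within_bandwidth` is illustrative, and the effective sample sizes reported by `rdrobust` remain the numbers to cite.

```python
def count_within_bandwidth(running_values, cutoff, h):
    """Count observations in [cutoff - h, cutoff) and in [cutoff, cutoff + h]."""
    n_left = sum(1 for x in running_values if cutoff - h <= x < cutoff)
    n_right = sum(1 for x in running_values if cutoff <= x <= cutoff + h)
    return n_left, n_right


# Illustrative data: cutoff at 5, bandwidth of 2
n_left, n_right = count_within_bandwidth([1, 4, 5, 6, 9], cutoff=5, h=2)
print(n_left, n_right)
```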
**Key sentence template for papers**:
> "We estimate the RDD using a local linear regression with a triangular kernel and the CCT optimal bandwidth (h = [X]). The point estimate at the cutoff is [β] (SE = [se], p = [p])."
See `references/rdd-reference.md` for geographic RDD, kink designs, donut-hole robustness, discrete running variable RDD, and multi-cutoff/multi-score RDD.
## Common Pitfalls
- **Using global polynomial regression**: High-order global polynomials (e.g., 5th degree) overfit and produce misleading results — always use local linear or local quadratic
- **Not showing the binned scatter plot**: The visual discontinuity is crucial for credibility — always include it
- **Ignoring manipulation**: A failed McCrary test means your RDD is fundamentally compromised — address the sorting concern
- **Reporting only one bandwidth**: Show sensitivity across 50%–150% of optimal bandwidth to demonstrate robustness
- **Using RDD far from cutoff**: RDD estimates are valid only at the cutoff; do not extrapolate to units far from the threshold
results-analysis
Comprehensive results analysis for empirical research: generate publication-quality descriptive statistics and balance tables, interpret regression coefficients with economic magnitude and effect sizes, assess identification assumption diagnostics, and produce structured results memos. Use when asked to create summary statistics, Table 1, balance tests, interpret results, assess economic significance, or write results narratives.
# Results-Analysis
## Purpose
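This skill is an integrated results-analysis tool that covers the full workflow from exploratory data analysis (EDA) to the interpretation of regression results:
1. **Descriptive statistics and EDA** (Part 1): generates Table 1, Table 2, diagnostic plots, and a missing-value report
2. **Results interpretation and economic significance** (Part 2): coefficient magnitude conversion, effect-size calculation, comparison with the literature, and assessment of identification credibility
3. **Output generation** (Part 3): `results-memo.md` plus every table in both .tex and .csv formats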
---
## When to Use
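- Phase 4 EDA stage (invoked by `/data` Step 3)
- Phase 7 results interpretation (invoked after `/code` Phase 6 finishes executing)
- Standalone use: the user already has data or regression results and needs summary tables or a results interpretation quickly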
---
## Part 1: Descriptive Statistics & EDA
### Step 0: Environment Setup
**This skill normally receives structured context passed in by `/data` or `/code`.**
Parameters received:
| Parameter | Description | Example |
|------|------|------|
| `clean_data_path` | Path to the cleaned data | `data/clean/china_trade_*.parquet` |
| `identification_strategy` | Identification strategy | `DiD` / `RDD` / `IV` / `Panel FE` |
| `Y_var` | Outcome variable | `log_gdp_growth` |
| `D_var` | Treatment variable | `policy_dummy` |
| `Z_var` | Identifying variable (if applicable) | `tariff_rate_1990` |
| `control_vars` | List of covariates | `["log_gdp", "population"]` |
| `id_var` | Panel unit identifier | `province_code` |
| `time_var` | Time variable | `year` |
| `treatment_timing` | Policy implementation date (DiD only) | `2003` |
| `cutoff_value` | Cutoff threshold (RDD only) | `50.0` |
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl
import os
os.makedirs("tables", exist_ok=True)
os.makedirs("figures", exist_ok=True)
mpl.rcParams.update({
    "figure.dpi": 300,
    "font.size": 11,
    "axes.spines.top": False,
    "axes.spines.right": False,
    "axes.grid": True,
    "grid.alpha": 0.3,
})
df = pd.read_parquet(clean_data_path)
print(f"Sample: {len(df):,} rows × {df.shape[1]} columns")
```
---
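### Step 1: Table 1 (Descriptive Statistics)
Generate descriptive statistics for the full sample, with **dual-format output** (`.tex` for the paper, `.csv` for verification).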
```python
# ─────────────────────────────────────────────
# Table 1: descriptive statistics
# ─────────────────────────────────────────────
analysis_vars = [Y_var, D_var] + ([Z_var] if Z_var else []) + control_vars
analysis_vars = [v for v in analysis_vars if v in df.columns]
stats_dict = {}
for v in analysis_vars:
    s = df[v].dropna()
    stats_dict[v] = {
        "N": len(s),
        "Mean": s.mean(),
        "SD": s.std(),
        "P25": s.quantile(0.25),
        "Median": s.median(),
        "P75": s.quantile(0.75),
        "Min": s.min(),
        "Max": s.max(),
    }
table1 = pd.DataFrame(stats_dict).T.round(3)
table1.to_csv("tables/table1_descriptive.csv")
print("✅ Table 1 data saved (.csv)")
# LaTeX formatting is handled centrally by the `table` skill (invoked during the /plot stage)
```
---
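### Step 2: Table 2 (Balance Test)
**Trigger condition:** a treatment variable `D_var` exists
**Key principle: the balance test must be restricted to the pre-treatment sample; never use the full sample.**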
```python
# ─────────────────────────────────────────────
# Table 2: treatment vs. control balance test
# ─────────────────────────────────────────────
from scipy import stats as scipy_stats
# Restrict to the pre-treatment sample
# (assumes D_var identifies the treated group in every period, not a treated×post dummy)
if treatment_timing:
    df_pre = df[df[time_var] < treatment_timing].copy()
    print(f"Pre-treatment sample: {len(df_pre):,} rows")
else:
    df_pre = df.copy()
treated_group = df_pre[df_pre[D_var] == 1]
control_group = df_pre[df_pre[D_var] == 0]
balance_rows = {}
for v in control_vars:
    if v not in df_pre.columns:
        continue
    t_vals = treated_group[v].dropna()
    c_vals = control_group[v].dropna()
    mean_t = t_vals.mean()
    mean_c = c_vals.mean()
    diff = mean_t - mean_c
    t_stat, p_val = scipy_stats.ttest_ind(t_vals, c_vals, equal_var=False)
    # Normalized difference: scale by the square root of the average of the two group variances
    norm_diff = diff / np.sqrt((t_vals.var() + c_vals.var()) / 2)
    balance_rows[v] = {
        "Treatment Mean": round(mean_t, 3),
        "Control Mean": round(mean_c, 3),
        "Difference": round(diff, 3),
        "t-stat": round(t_stat, 2),
        "p-value": round(p_val, 3),
        "Norm. Diff.": round(norm_diff, 3),
        "Balanced": "✅" if abs(norm_diff) < 0.25 else "⚠️",
    }
table2 = pd.DataFrame(balance_rows).T
table2.to_csv("tables/table2_balance.csv")
print("✅ Table 2 data saved (.csv)")
# LaTeX formatting is handled centrally by the `table` skill (invoked during the /plot stage)
unbalanced = table2[table2["Balanced"] == "⚠️"].index.tolist()
if unbalanced:
    print(f"⚠️ Variables with |normalized difference| >= 0.25: {unbalanced}")
```
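The 0.25 rule of thumb is usually stated for the normalized difference that divides by the square root of the average of the two group variances. The following is a small worked sketch of that formula using only the standard library; the function name `normalized_difference` and the toy data are illustrative.

```python
from math import sqrt
from statistics import mean, variance


def normalized_difference(treated, control):
    """(mean_T - mean_C) / sqrt((var_T + var_C) / 2), using sample variances."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd


# Toy data: means 4 and 3, both sample variances equal to 4
nd = normalized_difference([2, 4, 6], [1, 3, 5])
print(round(nd, 3))
```

Unlike the t-statistic, this measure does not grow mechanically with the sample size, which is why it is the one compared against a fixed threshold.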
---
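### Step 3: Diagnostic Plots for the Identifying Variation
Based on `identification_strategy`, the matching diagnostic plot is **triggered automatically**.
#### Strategy A: DiD (treatment vs. control yearly mean trends)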
```python
# ─────────────────────────────────────────────
# DiD: visual check of pre-treatment parallel trends
# ─────────────────────────────────────────────
trend = (df.groupby([time_var, D_var])[Y_var]
           .mean()
           .reset_index()
           .rename(columns={D_var: "group"}))
fig, ax = plt.subplots(figsize=(8, 4.5))
for g, label, color, ls in [(1, "Treatment", "#C0392B", "-"),
                            (0, "Control", "#2C3E50", "--")]:
    sub = trend[trend["group"] == g]
    ax.plot(sub[time_var], sub[Y_var], color=color, ls=ls,
            lw=2, marker="o", ms=4, label=label)
if treatment_timing:
    ax.axvline(treatment_timing, color="gray", ls=":", lw=1.5,
               label=f"Policy ({treatment_timing})")
ax.set_xlabel("Year")
ax.set_ylabel(f"Mean {Y_var}")
ax.set_title("Treatment vs Control: Pre-treatment Trend")
ax.legend()
plt.tight_layout()
plt.savefig("figures/eda_did_trend.png", dpi=300)
plt.close()
# Compare pre-treatment trend slopes
pre = df[df[time_var] < treatment_timing] if treatment_timing else df
for g, label in [(1, "Treatment"), (0, "Control")]:
    sub = pre[pre[D_var] == g].groupby(time_var)[Y_var].mean()
    if len(sub) >= 2:
        slope = np.polyfit(sub.index, sub.values, 1)[0]
        print(f"  {label}: annual trend slope = {slope:.4f}")
```
#### Strategy B: RDD (running-variable distribution + scatter plot near the cutoff)
```python
# ─────────────────────────────────────────────
# RDD: running-variable distribution + visual density check near the cutoff
# ─────────────────────────────────────────────
# Assumes df already contains `running_centered` (running variable minus cutoff)
# and the indicator `above_cutoff`
running_centered = df["running_centered"]
fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))
bandwidth_plot = running_centered.std() * 3
mask = running_centered.abs() <= bandwidth_plot
# Left panel: histogram
ax = axes[0]
ax.hist(running_centered[mask & (running_centered < 0)],
        bins=30, color="#2C3E50", alpha=0.7, label="Below cutoff")
ax.hist(running_centered[mask & (running_centered >= 0)],
        bins=30, color="#C0392B", alpha=0.7, label="Above cutoff")
ax.axvline(0, color="black", lw=1.5, ls="--")
ax.set_xlabel("Running Variable (centered)")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of Running Variable")
ax.legend()
# Right panel: scatter of Y against the running variable with a linear fit on each side
ax = axes[1]
x = running_centered[mask]
y = df.loc[mask, Y_var]
ax.scatter(x, y, alpha=0.2, s=8, color="steelblue")
for side_mask, color in [(x < 0, "#2C3E50"), (x >= 0, "#C0392B")]:
    if side_mask.sum() > 5:
        coef = np.polyfit(x[side_mask], y[side_mask], 1)
        x_line = np.linspace(x[side_mask].min(), x[side_mask].max(), 100)
        ax.plot(x_line, np.polyval(coef, x_line), color=color, lw=2)
ax.axvline(0, color="black", lw=1.5, ls="--")
ax.set_xlabel("Running Variable (centered)")
ax.set_ylabel(Y_var)
ax.set_title("Y vs Running Variable Near Cutoff")
plt.suptitle("RDD Diagnostic Plots", fontsize=13, fontweight="bold")
plt.tight_layout()
plt.savefig("figures/eda_rdd.png", dpi=300)
plt.close()
n_above = (df["above_cutoff"] == 1).sum()
n_below = (df["above_cutoff"] == 0).sum()
print(f"Above cutoff: {n_above}, below: {n_below} (ratio {n_above/n_below:.2f})")
```
#### Strategy C: IV (correlation between the instrument and the treatment)
```python
# ─────────────────────────────────────────────
# IV: scatter of Z against D + preliminary first-stage check
# ─────────────────────────────────────────────
fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))
ax = axes[0]
ax.scatter(df[Z_var], df[D_var], alpha=0.3, s=10, color="steelblue")
# Drop missing values jointly so Z and D stay aligned row by row
first_stage = df[[Z_var, D_var]].dropna()
coef = np.polyfit(first_stage[Z_var], first_stage[D_var], 1)
x_line = np.linspace(df[Z_var].min(), df[Z_var].max(), 100)
ax.plot(x_line, np.polyval(coef, x_line), color="#C0392B", lw=2)
ax.set_xlabel(f"Instrument: {Z_var}")
ax.set_ylabel(f"Treatment: {D_var}")
ax.set_title("First Stage: Z vs D")
corr = df[[Z_var, D_var]].corr().iloc[0, 1]
ax.text(0.05, 0.92, f"ρ = {corr:.3f}", transform=ax.transAxes,
        fontsize=11, color="#C0392B")
ax = axes[1]
ax.scatter(df[Z_var], df[Y_var], alpha=0.3, s=10, color="#2C3E50")
corr_zy = df[[Z_var, Y_var]].corr().iloc[0, 1]
ax.set_xlabel(f"Instrument: {Z_var}")
ax.set_ylabel(f"Outcome: {Y_var}")
ax.set_title("Reduced Form: Z vs Y")
ax.text(0.05, 0.92, f"ρ = {corr_zy:.3f}", transform=ax.transAxes,
        fontsize=11, color="#2C3E50")
plt.suptitle("IV Diagnostic Plots", fontsize=13, fontweight="bold")
plt.tight_layout()
plt.savefig("figures/eda_iv.png", dpi=300)
plt.close()
if abs(corr) < 0.1:
    print(f"⚠️ Z-D correlation is only {corr:.3f}; risk that the first-stage F falls below 10")
```
#### Strategy D: Panel FE (within / between variance decomposition)
```python
# ─────────────────────────────────────────────
# Panel FE: variance decomposition
# ─────────────────────────────────────────────
fig, axes = plt.subplots(1, 2, figsize=(11, 4.5))
for ax, var, title in [(axes[0], Y_var, "Outcome Y"),
                       (axes[1], D_var, "Treatment D")]:
    unit_means = df.groupby(id_var)[var].mean()
    df_temp = df.copy()
    df_temp["unit_mean"] = df_temp.groupby(id_var)[var].transform("mean")
    df_temp["within"] = df_temp[var] - df_temp["unit_mean"]
    var_between = unit_means.var()
    var_within = df_temp["within"].var()
    total = var_between + var_within
    bars = ax.bar(["Between", "Within"],
                  [var_between / total * 100, var_within / total * 100],
                  color=["#2C3E50", "#C0392B"], alpha=0.8, width=0.5)
    ax.bar_label(bars, fmt="%.1f%%", padding=3)
    ax.set_ylabel("Share of Total Variance (%)")
    ax.set_title(f"Variance Decomposition: {title}")
    ax.set_ylim(0, 110)
    if var_within / total < 0.1:
        ax.text(0.5, 0.6, "⚠️ Low within\nFE weak",
                transform=ax.transAxes, ha="center", color="#C0392B", fontsize=9)
plt.suptitle("Panel FE: Within vs Between Variance", fontsize=12, fontweight="bold")
plt.tight_layout()
plt.savefig("figures/eda_panel_fe.png", dpi=300)
plt.close()
```
---
### Step 4: Missing-Value Report
```python
miss = df.isnull().mean().sort_values(ascending=False)
miss = miss[miss > 0]
miss_report = pd.DataFrame({
    "N_Missing": df.isnull().sum()[miss.index],
    "Pct_Missing": (miss * 100).round(2),
})
miss_report.to_csv("tables/missing_report.csv")
if not miss_report.empty:
    print("=== Missing-Value Report ===")
    print(miss_report.to_string())
    key_vars = [v for v in [Y_var, D_var, Z_var] if v and v in miss_report.index]
    if key_vars:
        print("\n⚠️ Missing values in key variables:")
        for v in key_vars:
            pct = miss_report.loc[v, "Pct_Missing"]
            print(f"  {v}: {pct:.1f}%")
```
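The Common Pitfalls table asks for a preliminary MCAR/MAR/MNAR judgment on key variables. One rough way to start, sketched below on toy data, is to compare observed covariates between rows where a variable is missing and rows where it is observed. The helper `missingness_balance` and the toy columns `age` and `income` are hypothetical names used only for illustration. A significant difference argues against MCAR, but this check cannot separate MAR from MNAR.

```python
import numpy as np
import pandas as pd
from scipy import stats


def missingness_balance(df, target, covariates):
    """Compare covariate means between rows where `target` is missing vs. observed."""
    is_missing = df[target].isnull()
    rows = []
    for cov in covariates:
        a = df.loc[is_missing, cov].dropna()
        b = df.loc[~is_missing, cov].dropna()
        if len(a) < 2 or len(b) < 2:
            continue  # not enough data for a two-sample test
        _, p_val = stats.ttest_ind(a, b, equal_var=False)  # Welch t-test
        rows.append({"covariate": cov,
                     "mean_missing": a.mean(),
                     "mean_observed": b.mean(),
                     "p_value": p_val})
    return pd.DataFrame(rows)


# Toy data (hypothetical): income is missing whenever age > 50, so missingness is not MCAR
rng = np.random.default_rng(0)
demo = pd.DataFrame({"age": rng.normal(40, 10, 500),
                     "income": rng.normal(50, 5, 500)})
demo.loc[demo["age"] > 50, "income"] = np.nan
balance = missingness_balance(demo, "income", ["age"])
print(balance)
```

A small p-value for a covariate suggests that missingness depends on it, so listwise deletion may change the estimation sample in a non-random way.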
---
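## Part 2: Results Interpretation (Findings and Economic Significance)
### Step 0: Read Input Files
Read the following files in order:
```
model-spec.md # main equation, status of identifying assumptions
tables/table_main.csv # main regression results
tables/table1_descriptive.csv # descriptive statistics
data-report.md # sample information
literature-review-report.md # estimates from prior literature (optional)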
```
---
### Step 1: Extract the Point Estimate and Confidence Interval
```python
import pandas as pd

results = pd.read_csv("tables/table_main.csv", index_col=0)
beta = results.loc[D_var, "coef"] # point estimate
se = results.loc[D_var, "se"] # standard error
p_value = results.loc[D_var, "pvalue"] # p-value
ci_lo = results.loc[D_var, "ci_low"] # lower bound of the 95% CI
ci_hi = results.loc[D_var, "ci_high"] # upper bound of the 95% CI
n_obs = results.loc["N", "coef"]
print(f"β̂ = {beta:.4f} (SE = {se:.4f}, p = {p_value:.3f})")
print(f"95% CI: [{ci_lo:.4f}, {ci_hi:.4f}]")
# Read descriptive statistics for magnitude conversions
desc = pd.read_csv("tables/table1_descriptive.csv", index_col=0)
Y_mean = desc.loc[Y_var, "Mean"]
Y_sd = desc.loc[Y_var, "SD"]
D_mean = desc.loc[D_var, "Mean"]
```
---
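### Step 2: Interpreting Coefficient Magnitude
**Magnitude interpretation is the core of results analysis.**
Choose the interpretation framework that matches how the variables are transformed:
| Y transform | D transform | Coefficient meaning | Recommended wording |
|--------|--------|---------|---------|
| Level | Dummy | Absolute change in Y when D=1 | "[Treatment] raises Y by β units" |
| log Y | Dummy | Percentage change in Y when D=1 | "[Treatment] raises Y by about 100β%" (approximation for small β; exact value is 100(e^β − 1)%) |
| log Y | log D | Elasticity | "A 1% increase in D changes Y by β%" |
| Standardized z-score | Any | Measured in standard deviations of Y | "Effect size = β SD" |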
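
For the log Y / dummy row, the "100β%" reading is a first-order approximation that drifts as β grows. The short sketch below compares it with the exact conversion 100(e^β − 1)%. The function name `pct_effect_log_dummy` and the two example coefficients are illustrative assumptions, not values from any regression.

```python
import math


def pct_effect_log_dummy(beta):
    """Exact percentage change in Y implied by a dummy coefficient in a log-Y model."""
    return 100 * (math.exp(beta) - 1)


# Hypothetical coefficients: one small, one large
for b in (0.05, 0.30):
    approx = 100 * b
    exact = pct_effect_log_dummy(b)
    print(f"beta = {b:.2f}: approx {approx:.1f}%, exact {exact:.1f}%")
# beta = 0.05: approx 5.0%, exact 5.1%
# beta = 0.30: approx 30.0%, exact 35.0%
```

At β = 0.05 the two readings nearly coincide, while at β = 0.30 the approximation understates the effect by about five percentage points, so the exact form is the safer one to report for larger coefficients.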
#### Effect Size Calculation
```python
# Effect size in SD units of Y (analogous to Cohen's d when D is binary)
effect_size_sd = abs(beta) / Y_sd
print(f"Effect size: {effect_size_sd:.3f} σ_Y")
if effect_size_sd < 0.1:
    magnitude = "negligible (< 0.1σ)"
elif effect_size_sd < 0.3:
    magnitude = "small (0.1–0.3σ)"
elif effect_size_sd < 0.5:
    magnitude = "moderate (0.3–0.5σ)"
else:
    magnitude = "large (≥ 0.5σ)"
print(f"Magnitude: {magnitude}")
# Effect as a percentage of the mean of Y
pct_of_mean = beta / Y_mean * 100
print(f"Effect relative to the mean of Y: {pct_of_mean:.1f}%")
```
---
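### Step 3: Statistical vs. Economic Significance
**Discuss both together; never draw conclusions from the p-value alone.**
```python
def sig_stars(p):
    if p < 0.01: return "***"
    if p < 0.05: return "**"
    if p < 0.10: return "*"
    return " (not significant)"
stars = sig_stars(p_value)
print(f"β̂ = {beta:.4f}{stars}, p = {p_value:.3f}")
print(f"95% CI: [{ci_lo:.4f}, {ci_hi:.4f}]")
# Diagnostic messages (the CI-width cutoffs 0.2 and 0.1 are in coefficient units; rescale them to the outcome)
if p_value < 0.05 and effect_size_sd < 0.05:
    print("⚠️ Statistically significant but economically negligible (large-sample effect)")
elif p_value >= 0.05 and (ci_hi - ci_lo) > 0.2:
    print("⚠️ Imprecise estimate; a substantive effect cannot be ruled out")
elif p_value >= 0.05 and (ci_hi - ci_lo) <= 0.1:
    print("✅ Precise estimate; large effects can reasonably be ruled out")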
```
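**Framework for judging economic significance:**
```
Economic significance ⟺
1. Effect size ≥ 0.1σ_Y
2. Effect ≥ 5% of the mean
3. Confidence interval entirely on one side of zero
4. Consistent with the IV/OLS estimates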
```
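
The first three criteria can be checked mechanically. The sketch below is one possible encoding, assuming that "economically significant" means all of them hold; criterion 4 needs a second estimate and is left to the analyst. The function name `economic_significance` and the example numbers are hypothetical.

```python
def economic_significance(beta, y_sd, y_mean, ci_lo, ci_hi,
                          min_sd=0.1, min_pct=5.0):
    """Evaluate criteria 1-3 of the framework and report each check separately."""
    checks = {
        "effect_size_ok": abs(beta) / y_sd >= min_sd,                 # criterion 1
        "pct_of_mean_ok": abs(beta) / abs(y_mean) * 100 >= min_pct,   # criterion 2
        "ci_same_sign": ci_lo * ci_hi > 0,                            # criterion 3
    }
    checks["all_pass"] = all(checks.values())
    return checks


# Hypothetical inputs: beta = 0.8, SD(Y) = 4, mean(Y) = 10, 95% CI = [0.2, 1.4]
result = economic_significance(beta=0.8, y_sd=4.0, y_mean=10.0, ci_lo=0.2, ci_hi=1.4)
print(result)
```

Returning the individual checks rather than a single flag makes it easier to say in the memo which criterion failed.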
---
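### Step 4: Comparison with Prior Literature
Extract estimates from comparable studies in `literature-review-report.md`:
```python
# Summary table of estimates from the literature
literature_comparison = {
    "[Author, Year]": {
        "strategy": "[identification strategy]",
        "sample": "[region/period]",
        "beta": "[estimate]",
        "comment": "this paper is higher / lower / consistent"
    }
}
# Checklist of possible sources of discrepancy
print("""
Checklist: sources of discrepancy
□ Different identification strategy (OLS vs. IV/DiD) → direction of attenuation bias
□ Sample differences (country/period) → heterogeneity
□ Different variable definitions → measurement-error attenuation
□ LATE vs. ATE
□ Short-run vs. long-run effects
□ Differences in policy intensity
""")
print(f"""
Positioning the marginal contribution:
Compared with [literature], this paper uses [a more credible identification strategy];
the estimate is {beta:.4f}, [above / below / similar to] the median estimate in the literature.
The difference mainly stems from [reason].
""")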
```
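
Whether an estimate is "higher" or "lower" than a published one is easier to defend with a formal comparison. The sketch below uses a simple z-test for the difference between two estimates. It assumes the two estimates come from independent samples, which is an assumption to verify for each comparison. The function name `compare_estimates` and all numbers are hypothetical.

```python
import math

from scipy import stats


def compare_estimates(beta_own, se_own, beta_lit, se_lit):
    """z-test for the difference between two independent coefficient estimates."""
    z = (beta_own - beta_lit) / math.sqrt(se_own ** 2 + se_lit ** 2)
    p = 2 * stats.norm.sf(abs(z))  # two-sided p-value
    return z, p


# Hypothetical values: own estimate 0.12 (SE 0.03) vs. a literature estimate 0.20 (SE 0.05)
z, p = compare_estimates(0.12, 0.03, 0.20, 0.05)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A large p-value here means the gap between the two estimates is within sampling noise, in which case "consistent with" is the more accurate wording for the comparison table.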
---
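### Step 5: Assessing Identification Credibility
Carry the status of each identifying assumption in `model-spec.md` into the interpretation:
```python
credibility_assessment = """
Identification Credibility Assessment
══════════════════════════════════════════
Strategy: [strategy name]
Target parameter: [ATE / ATT / LATE]
Assumption status:
✅ [Assumption 1]: [evidence that the diagnostic passed]
⚠️ [Assumption 2]: [what looks doubtful and how it is addressed]
❌ [Assumption 3]: [what the violation implies]
Overall credibility: High / Medium / Low
══════════════════════════════════════════
"""
# Definitions of the credibility levels
credibility_levels = {
    "High": "All core assumptions ✅ → causal language is appropriate",
    "Medium": "Some core assumptions ⚠️ but none ❌ → cautious language",
    "Low": "At least one ❌ assumption → state limitations explicitly and downgrade to correlational claims"
}
# Core assumption by strategy, for quick diagnosis
core_assumptions = {
    "DiD": "Parallel trends",
    "RDD": "No manipulation of the running variable",
    "IV": "Exclusion restriction",
    "Panel FE": "Strict exogeneity",
    "Synthetic Control": "Good pre-treatment fit"
}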
```
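
The three credibility levels follow a simple rule: any failed assumption gives the lowest level, any doubtful one gives the middle level, and otherwise the highest. A minimal sketch of that rule is shown below. The status labels `"pass"`, `"doubt"`, and `"fail"`, the English level names, and the function name `credibility_level` are illustrative choices rather than part of the skill's file formats.

```python
def credibility_level(statuses):
    """Map assumption statuses ("pass" / "doubt" / "fail") to an overall level."""
    values = set(statuses.values())
    unknown = values - {"pass", "doubt", "fail"}
    if unknown:
        raise ValueError(f"Unknown status values: {sorted(unknown)}")
    if "fail" in values:
        return "Low"      # at least one assumption is violated
    if "doubt" in values:
        return "Medium"   # no violation, but at least one assumption is doubtful
    return "High"         # every assumption passed


# Hypothetical DiD example
did_status = {"parallel trends": "doubt", "no anticipation": "pass"}
print(credibility_level(did_status))
```

The returned level can then be used to pick the language strength (causal, cautious, or correlational) recorded in the memo.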
---
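### Step 6: External Validity and Limitations
#### Scope of the LATE (for IV / Fuzzy RDD)
```python
late_interpretation = """
Interpreting the LATE
────────────────────────────────────────
This paper estimates the treatment effect for compliers, that is,
"units whose treatment status changes because of the [instrument/threshold]".
Compliers may not be representative of:
□ Always-takers
□ Never-takers
□ Other groups outside the sample
Policy implication: the results apply directly to [the marginal group the policy is likely to affect]
────────────────────────────────────────
"""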
```
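#### Sample Representativeness
```python
representativeness_check = """
Sample Representativeness Check
- Geographic scope: do the results apply only to a particular country/region?
- Time span: can effects from a particular political and economic setting be generalized?
- Industry/group: is there selective entry into the sample?
- Data limitations: the direction of bias from measurement error
"""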
```
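#### Residual Endogeneity Risks
```python
residual_endogeneity = """
Residual Endogeneity Check
□ Other policies/events occurring at the same time (confounders)
□ Sample selection
□ Measurement error
□ SUTVA violations (spillovers between units)
□ General equilibrium effects (feedback from large-scale policies)
"""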
```
---
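## Part 3: Output Generation
### Step 1: Generate `results-memo.md`
Combine all analyses from Part 2 and write the memo to the working directory:
```markdown
# Results Memo
**Project:** [research question in one sentence]
**Version:** v1.0
**Date:** [YYYY-MM-DD]
**Equation:** model-spec.md, equation [eq:label]
---
## 1. Core Estimates
| | Main specification | With controls | Full sample |
|--|--------|-----------|--------|
| β̂ (D_var) | [value]*** | [value]*** | [value]** |
| SE | ([value]) | ([value]) | ([value]) |
| N | [N] | [N] | [N] |
**Interpretation of the main coefficient:**
[Follow the Step 2 format: absolute magnitude, effect size, percentage of the mean]
## 2. Statistical vs. Economic Significance
- **Statistical significance:** [significant / marginally significant / not significant], p = [value]
- **95% CI:** [[lo], [hi]]
- **Effect size:** [value] σ_Y ([negligible / small / moderate / large])
- **Relative to the mean:** [value]% of the mean of Y
- **Economic significance:** [one-sentence conclusion]
## 3. Comparison with Prior Literature
[Table + 2–3 sentences on the sources of differences and the contribution]
## 4. Identification Credibility
Overall credibility: **[High / Medium / Low]**
[List each assumption and its status]
Recommended language: [causal / cautious / correlational]
## 5. External Validity and Limitations
[LATE scope + sample representativeness + residual endogeneity threats]
## 6. Phase 8 Robustness-Check Priorities
1. **[Highest priority]**: [check name]. Reason: [...]
2. **[Second priority]**: [check name]. Reason: [...]
3. **[Optional]**: [check name]. Reason: [...]
## 7. Writing Recommendations
- **Core sentence for the Results section:** [wording ready for the paper, with the coefficient and confidence interval]
- **Strength of language:** [causal / cautious causal / correlational]
- **Caveats the Discussion must mention:** [identification limitations]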
```
---
## Common Pitfalls
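| ❌ Common mistake | ✅ Correct practice |
|------------|------------|
| Running balance tests on the full sample | **Use only pre-treatment observations** |
| Reporting only p-values | Also report standardized differences and effect sizes |
| EDA plots without the policy date marked | DiD trend plots must mark `treatment_timing` |
| Saving tables only as .tex | Output both formats: .tex + .csv |
| Declaring "no effect" because the main coefficient is insignificant | With a narrow CI, a precisely estimated zero is itself an important finding |
| Sidestepping a coefficient whose sign contradicts theory | Check coding direction, mediators, and sample selection, and report what you find |
| Missing-value report lists counts only | Give a preliminary MCAR/MAR/MNAR judgment for key variables |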
---
## Requirements
```bash
pip install pandas numpy matplotlib scipy
```
## Related Skills & Commands
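- **`/data`**: the parent command that invokes this skill (Phase 4 Step 3)
- **`data-pipeline`**: the preceding stage, which supplies this skill's input data
- **`/model` & `/code`**: Phase 5–6, which run the regressions and produce `table_main.csv`
- **`/robustness`**: Phase 8, which uses the `results-memo.md` generated by this skill to guide robustness checks
- **`/write`**: Phase 9, which directly uses the result tables and `results-memo.md` generated by this skill
scrapling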
Scrape web pages using Scrapling with anti-bot bypass (like Cloudflare Turnstile), stealth headless browsing, spiders framework, adaptive scraping, and JavaScript rendering. Use when asked to scrape, crawl, or extract data from websites; web_fetch fails; the site has anti-bot protections; write Python code to scrape/crawl; or write spiders.
# Scrapling
Scrapling is an adaptive Web Scraping framework that handles everything from a single request to a full-scale crawl.
Its parser learns from website changes and automatically relocates your elements when pages update. Its fetchers bypass anti-bot systems like Cloudflare Turnstile out of the box. And its spider framework lets you scale up to concurrent, multi-session crawls with pause/resume and automatic proxy rotation — all in a few lines of Python. One library, zero compromises.
Blazing fast crawls with real-time stats and streaming. Built by Web Scrapers for Web Scrapers and regular users, there's something for everyone.
**Requires: Python 3.10+**
**This is the official skill for the scrapling library by the library author.**
## Setup (once)
Create a Python virtual environment by any available means (for example, `venv`), then run this inside it:
`pip install "scrapling[all]>=0.4.2"`
Then do this to download all the browsers' dependencies:
```bash
scrapling install --force
```
Make note of the `scrapling` binary path and use it instead of `scrapling` from now on with all commands (if `scrapling` is not on `$PATH`).
### Docker
If the user doesn't have Python or doesn't want to use it, the Docker image is another option. It supports only the CLI commands, so Python code for scrapling can't be written this way:
```bash
docker pull pyd4vinci/scrapling
```
or
```bash
docker pull ghcr.io/d4vinci/scrapling:latest
```
## CLI Usage
The `scrapling extract` command group lets you download and extract content from websites directly without writing any code.
```bash
Usage: scrapling extract [OPTIONS] COMMAND [ARGS]...
Commands:
get Perform a GET request and save the content to a file.
post Perform a POST request and save the content to a file.
put Perform a PUT request and save the content to a file.
delete Perform a DELETE request and save the content to a file.
fetch Use a browser to fetch content with browser automation and flexible options.
stealthy-fetch Use a stealthy browser to fetch content with advanced stealth features.
```
### Usage pattern
- Choose your output format by changing the file extension. Here are some examples for the `scrapling extract get` command:
  - Convert the HTML content to Markdown, then save it to the file (great for documentation): `scrapling extract get "https://blog.example.com" article.md`
  - Save the HTML content as it is to the file: `scrapling extract get "https://example.com" page.html`
  - Save a clean version of the text content of the webpage to the file: `scrapling extract get "https://example.com" content.txt`
- Output to a temp file, read it back, then clean up.
- All commands can use CSS selectors to extract specific parts of the page through `--css-selector` or `-s`.
Which command to use generally:
- Use **`get`** with simple websites, blogs, or news articles.
- Use **`fetch`** with modern web apps, or sites with dynamic content.
- Use **`stealthy-fetch`** with protected sites, Cloudflare, or anti-bot systems.
> When unsure, start with `get`. If it fails or returns empty content, escalate to `fetch`, then `stealthy-fetch`. The speed of `fetch` and `stealthy-fetch` is nearly the same, so you are not sacrificing anything.
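
The escalation order and the temp-file pattern described above can be sketched as a small wrapper around the CLI. The helper names `build_command` and `extract_with_escalation` are hypothetical, and the sketch assumes that the CLI exits with a non-zero status on failure. Only the documented `scrapling extract <command> URL FILE` form and the `--css-selector` option are used.

```python
import subprocess
import tempfile
from pathlib import Path

# Escalation order: plain HTTP first, then a browser, then the stealthy browser
COMMANDS = ("get", "fetch", "stealthy-fetch")


def build_command(command, url, out_path, css_selector=None, binary="scrapling"):
    """Build the argument list for one `scrapling extract` call."""
    args = [binary, "extract", command, url, str(out_path)]
    if css_selector:
        args += ["--css-selector", css_selector]
    return args


def extract_with_escalation(url, css_selector=None, binary="scrapling"):
    """Try each command in turn and return (command, markdown) for the first non-empty result."""
    with tempfile.TemporaryDirectory() as tmp:  # the temp file is removed when the block exits
        out_path = Path(tmp) / "page.md"
        for command in COMMANDS:
            proc = subprocess.run(
                build_command(command, url, out_path, css_selector, binary),
                capture_output=True, text=True,
            )
            # Assumption: a non-zero exit code means the command failed
            if proc.returncode == 0 and out_path.exists():
                text = out_path.read_text(encoding="utf-8").strip()
                if text:
                    return command, text
    return None, ""


print(build_command("get", "https://example.com", "page.md", css_selector="article"))
```

If `scrapling` is not on `$PATH`, pass the full binary path noted during setup as `binary`.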
#### Key options (requests)
Those options are shared between the 4 HTTP request commands:
| Option | Input type | Description |
|:-------------------------------------------|:----------:|:-----------------------------------------------------------------------------------------------------------------------------------------------|
| -H, --headers | TEXT | HTTP headers in format "Key: Value" (can be used multiple times) |
| --cookies | TEXT | Cookies string in format "name1=value1; name2=value2" |
| --timeout | INTEGER | Request timeout in seconds (default: 30) |
| --proxy | TEXT | Proxy URL in format "http://username:password@host:port" |
| -s, --css-selector | TEXT | CSS selector to extract specific content from the page. It returns all matches. |
| -p, --params | TEXT | Query parameters in format "key=value" (can be used multiple times) |
| --follow-redirects / --no-follow-redirects | None | Whether to follow redirects (default: True) |
| --verify / --no-verify | None | Whether to verify SSL certificates (default: True) |
| --impersonate | TEXT | Browser to impersonate. Can be a single browser (e.g., Chrome) or a comma-separated list for random selection (e.g., Chrome, Firefox, Safari). |
| --stealthy-headers / --no-stealthy-headers | None | Use stealthy browser headers (default: True) |
Options shared between `post` and `put` only:
| Option | Input type | Description |
|:-----------|:----------:|:----------------------------------------------------------------------------------------|
| -d, --data | TEXT | Form data to include in the request body (as string, ex: "param1=value1&param2=value2") |
| -j, --json | TEXT | JSON data to include in the request body (as string) |
Examples:
```bash
# Basic download
scrapling extract get "https://news.site.com" news.md
# Download with custom timeout
scrapling extract get "https://example.com" content.txt --timeout 60
# Extract only specific content using CSS selectors
scrapling extract get "https://blog.example.com" articles.md --css-selector "article"
# Send a request with cookies
scrapling extract get "https://scrapling.requestcatcher.com" content.md --cookies "session=abc123; user=john"
# Add user agent
scrapling extract get "https://api.site.com" data.json -H "User-Agent: MyBot 1.0"
# Add multiple headers
scrapling extract get "https://site.com" page.html -H "Accept: text/html" -H "Accept-Language: en-US"
```
#### Key options (browsers)
Both (`fetch` / `stealthy-fetch`) share options:
| Option | Input type | Description |
|:-----------------------------------------|:----------:|:---------------------------------------------------------------------------------------------------------------------------------------------------------|
| --headless / --no-headless | None | Run browser in headless mode (default: True) |
| --disable-resources / --enable-resources | None | Drop unnecessary resources for speed boost (default: False) |
| --network-idle / --no-network-idle | None | Wait for network idle (default: False) |
| --real-chrome / --no-real-chrome | None | If you have a Chrome browser installed on your device, enable this, and the Fetcher will launch an instance of your browser and use it. (default: False) |
| --timeout | INTEGER | Timeout in milliseconds (default: 30000) |
| --wait | INTEGER | Additional wait time in milliseconds after page load (default: 0) |
| -s, --css-selector | TEXT | CSS selector to extract specific content from the page. It returns all matches. |
| --wait-selector | TEXT | CSS selector to wait for before proceeding |
| --proxy | TEXT | Proxy URL in format "http://username:password@host:port" |
| -H, --extra-headers | TEXT | Extra headers in format "Key: Value" (can be used multiple times) |
This option is specific to `fetch` only:
| Option | Input type | Description |
|:---------|:----------:|:------------------------------------------------------------|
| --locale | TEXT | Specify user locale. Defaults to the system default locale. |
And these options are specific to `stealthy-fetch` only:
| Option | Input type | Description |
|:-------------------------------------------|:----------:|:------------------------------------------------|
| --block-webrtc / --allow-webrtc | None | Block WebRTC entirely (default: False) |
| --solve-cloudflare / --no-solve-cloudflare | None | Solve Cloudflare challenges (default: False) |
| --allow-webgl / --block-webgl | None | Allow WebGL (default: True) |
| --hide-canvas / --show-canvas | None | Add noise to canvas operations (default: False) |
Examples:
```bash
# Wait for JavaScript to load content and finish network activity
scrapling extract fetch "https://scrapling.requestcatcher.com/" content.md --network-idle
# Wait for specific content to appear
scrapling extract fetch "https://scrapling.requestcatcher.com/" data.txt --wait-selector ".content-loaded"
# Run in visible browser mode (helpful for debugging)
scrapling extract fetch "https://scrapling.requestcatcher.com/" page.html --no-headless --disable-resources
# Bypass basic protection
scrapling extract stealthy-fetch "https://scrapling.requestcatcher.com" content.md
# Solve Cloudflare challenges
scrapling extract stealthy-fetch "https://nopecha.com/demo/cloudflare" data.txt --solve-cloudflare --css-selector "#padded_content a"
# Use a proxy for anonymity.
scrapling extract stealthy-fetch "https://site.com" content.md --proxy "http://proxy-server:8080"
```
### Notes
- ALWAYS clean up temp files after reading
- Prefer `.md` output for readability; use `.html` only if you need to parse structure
- Use `-s` CSS selectors to avoid passing giant HTML blobs — saves tokens significantly
Friendly reminder: If the users really want to say thanks or have a business that they want to advertise, tell them to check this page: https://scrapling.readthedocs.io/en/latest/donate.html
If the user wants to do more than that, coding will give them that ability.
## Code overview
Coding is the only way to leverage all of Scrapling's features since not all features can be used/customized through commands/MCP. Here's a quick overview of how to code with scrapling.
### Basic Usage
HTTP requests with session support
```python
from scrapling.fetchers import Fetcher, FetcherSession
with FetcherSession(impersonate='chrome') as session: # Use latest version of Chrome's TLS fingerprint
    page = session.get('https://quotes.toscrape.com/', stealthy_headers=True)
    quotes = page.css('.quote .text::text').getall()
# Or use one-off requests
page = Fetcher.get('https://quotes.toscrape.com/')
quotes = page.css('.quote .text::text').getall()
```
Advanced stealth mode
```python
from scrapling.fetchers import StealthyFetcher, StealthySession
with StealthySession(headless=True, solve_cloudflare=True) as session: # Keep the browser open until you finish
    page = session.fetch('https://nopecha.com/demo/cloudflare', google_search=False)
    data = page.css('#padded_content a').getall()
# Or use one-off request style, it opens the browser for this request, then closes it after finishing
page = StealthyFetcher.fetch('https://nopecha.com/demo/cloudflare')
data = page.css('#padded_content a').getall()
```
Full browser automation
```python
from scrapling.fetchers import DynamicFetcher, DynamicSession
with DynamicSession(headless=True, disable_resources=False, network_idle=True) as session: # Keep the browser open until you finish
    page = session.fetch('https://quotes.toscrape.com/', load_dom=False)
    data = page.xpath('//span[@class="text"]/text()').getall() # XPath selector if you prefer it
# Or use one-off request style, it opens the browser for this request, then closes it after finishing
page = DynamicFetcher.fetch('https://quotes.toscrape.com/')
data = page.css('.quote .text::text').getall()
```
### Spiders
Build full crawlers with concurrent requests, multiple session types, and pause/resume:
```python
from scrapling.spiders import Spider, Request, Response
class QuotesSpider(Spider):
    name = "quotes"
    start_urls = ["https://quotes.toscrape.com/"]
    concurrent_requests = 10
    async def parse(self, response: Response):
        for quote in response.css('.quote'):
            yield {
                "text": quote.css('.text::text').get(),
                "author": quote.css('.author::text').get(),
            }
        next_page = response.css('.next a')
        if next_page:
            yield response.follow(next_page[0].attrib['href'])
result = QuotesSpider().start()
print(f"Scraped {len(result.items)} quotes")
result.items.to_json("quotes.json")
```
Use multiple session types in a single spider:
```python
from scrapling.spiders import Spider, Request, Response
from scrapling.fetchers import FetcherSession, AsyncStealthySession
class MultiSessionSpider(Spider):
    name = "multi"
    start_urls = ["https://example.com/"]
    def configure_sessions(self, manager):
        manager.add("fast", FetcherSession(impersonate="chrome"))
        manager.add("stealth", AsyncStealthySession(headless=True), lazy=True)
    async def parse(self, response: Response):
        for link in response.css('a::attr(href)').getall():
            # Route protected pages through the stealth session
            if "protected" in link:
                yield Request(link, sid="stealth")
            else:
                yield Request(link, sid="fast", callback=self.parse) # explicit callback
```
Pause and resume long crawls with checkpoints by running the spider like this:
```python
QuotesSpider(crawldir="./crawl_data").start()
```
Press Ctrl+C to pause gracefully — progress is saved automatically. Later, when you start the spider again, pass the same `crawldir`, and it will resume from where it stopped.
### Advanced Parsing & Navigation
```python
from scrapling.fetchers import Fetcher
# Rich element selection and navigation
page = Fetcher.get('https://quotes.toscrape.com/')
# Get quotes with multiple selection methods
quotes = page.css('.quote') # CSS selector
quotes = page.xpath('//div[@class="quote"]') # XPath
quotes = page.find_all('div', {'class': 'quote'}) # BeautifulSoup-style
# Same as
quotes = page.find_all('div', class_='quote')
quotes = page.find_all(['div'], class_='quote')
quotes = page.find_all(class_='quote') # and so on...
# Find element by text content
quotes = page.find_by_text('quote', tag='div')
# Advanced navigation
quote_text = page.css('.quote')[0].css('.text::text').get()
quote_text = page.css('.quote').css('.text::text').getall() # Chained selectors
first_quote = page.css('.quote')[0]
author = first_quote.next_sibling.css('.author::text')
parent_container = first_quote.parent
# Element relationships and similarity
similar_elements = first_quote.find_similar()
below_elements = first_quote.below_elements()
```
If you don't want to fetch websites, you can use the parser directly, as shown below:
```python
from scrapling.parser import Selector
page = Selector("<html>...</html>")
```
And it works precisely the same way!
### Async Session Management Examples
```python
import asyncio
from scrapling.fetchers import FetcherSession, AsyncStealthySession, AsyncDynamicSession
async with FetcherSession(http3=True) as session: # `FetcherSession` is context-aware and can work in both sync/async patterns
    page1 = session.get('https://quotes.toscrape.com/')
    page2 = session.get('https://quotes.toscrape.com/', impersonate='firefox135')
# Async session usage
async with AsyncStealthySession(max_pages=2) as session:
    tasks = []
    urls = ['https://example.com/page1', 'https://example.com/page2']
    for url in urls:
        task = session.fetch(url)
        tasks.append(task)
    print(session.get_pool_stats()) # Optional - The status of the browser tabs pool (busy/free/error)
    results = await asyncio.gather(*tasks)
    print(session.get_pool_stats())
```
## References
You already had a good glimpse of what the library can do. Use the references below to dig deeper when needed
- `references/mcp-server.md` — MCP server tools and capabilities
- `references/parsing` — Everything you need for parsing HTML
- `references/fetching` — Everything you need to fetch websites and session persistence
- `references/spiders` — Everything you need to write spiders, proxy rotation, and advanced features. It follows a Scrapy-like format
- `references/migrating_from_beautifulsoup.md` — A quick API comparison between scrapling and Beautifulsoup
- `https://github.com/D4Vinci/Scrapling/tree/main/docs` — Full official docs in Markdown for quick access (use only if current references do not look up-to-date).
This skill encapsulates almost all the published documentation in Markdown, so don't check external sources or search online without the user's permission.
## Guardrails (Always)
- Only scrape content you're authorized to access.
- Respect robots.txt and ToS.
- Add delays (download_delay) for large crawls.
- Don't bypass paywalls or authentication without permission.
- Never scrape personal/sensitive data.
stata
Comprehensive Stata reference for writing correct .do files. Covers syntax, options, gotchas, and idiomatic patterns. Use this skill whenever the user asks you to write, debug, or explain Stata code. Generates ready-to-run .do files for the user to execute manually.
# Stata Skill
You have access to comprehensive Stata reference files. **Do not load all files.**
Read only the 1-3 files relevant to the user's current task using the routing table below.
> **Execution model:** This skill generates Stata code and saves it as `.do` files for the user to run manually in Stata. Claude does **not** execute Stata code directly — there is no MCP server or CLI integration. After generating a `.do` file, always tell the user exactly which file to open or run and what output to expect.
---
## Critical Gotchas
These are Stata-specific pitfalls that lead to silent bugs. Internalize these before writing any code.
### Missing Values Sort to +Infinity
Stata's `.` (and `.a`-`.z`) are **greater than all numbers**.
```stata
* WRONG — includes observations where income is missing!
gen high_income = (income > 50000)
* RIGHT
gen high_income = (income > 50000) if !missing(income)
* WRONG — missing ages appear in this list
list if age > 60
* RIGHT
list if age > 60 & !missing(age)
```
### `=` vs `==`
`=` is assignment; `==` is comparison. Mixing them up is a syntax error or silent bug.
```stata
* WRONG — syntax error
gen employed = 1 if status = 1
* RIGHT
gen employed = 1 if status == 1
```
### Local Macro Syntax
Locals use `` `name' `` (backtick + single-quote). Globals use `$name` or `${name}`.
Forgetting the closing quote is the #1 macro bug.
```stata
local controls "age education income"
regress wage `controls' // correct
regress wage `controls // WRONG — missing closing quote
regress wage 'controls' // WRONG — wrong quote characters
```
### `by` Requires Prior Sort (Use `bysort`)
```stata
* WRONG — error if data not sorted by id
by id: gen first = (_n == 1)
* RIGHT — bysort sorts automatically
bysort id: gen first = (_n == 1)
* Also RIGHT — explicit sort
sort id
by id: gen first = (_n == 1)
```
### Factor Variable Notation (`i.` and `c.`)
Use `i.` for categorical, `c.` for continuous. Omitting `i.` treats categories as continuous.
```stata
* WRONG — treats race as continuous (e.g., race=3 has 3x effect of race=1)
regress wage race education
* RIGHT — creates dummies automatically
regress wage i.race education
* Interactions
regress wage i.race##c.education // full interaction
regress wage i.race#c.education // interaction only (no main effects)
```
### `generate` vs `replace`
`generate` creates new variables; `replace` modifies existing ones. Using `generate` on an existing variable name is an error.
```stata
gen x = 1
gen x = 2 // ERROR: x already defined
replace x = 2 // correct
```
### String Comparison Is Case-Sensitive
```stata
* May miss "Male", "MALE", etc.
keep if gender == "male"
* Safer
keep if lower(gender) == "male"
```
### `merge` Always Check `_merge`
Never skip `tab _merge`: it costs nothing, and when `assert` fails it is the only output that shows how many observations failed to match and from which side.
```stata
merge 1:1 id using other.dta
tab _merge // ALWAYS tab before assert
assert _merge == 3 // on failure reports only "assertion is false"
drop _merge
```
### `preserve` / `restore` + `tempfile` for Collapse-Merge-Back
The standard pattern for computing group stats and merging them onto the original data:
```stata
tempfile stats
preserve
collapse (mean) avg_x=x, by(group)
save `stats'
restore
merge m:1 group using `stats'
tab _merge
assert _merge == 3
drop _merge
```
For simple group means, `bysort group: egen avg_x = mean(x)` avoids the round-trip entirely.
### Weights Are Not Interchangeable
- `fweight` — frequency weights (replication)
- `aweight` — analytic/regression weights (inverse variance)
- `pweight` — probability/sampling weights (survey data, implies robust SE)
- `iweight` — importance weights (rarely used)
### `capture` Swallows Errors
```stata
capture some_command
if _rc != 0 {
    di as error "Failed with code: " _rc
    exit _rc
}
```
### Line Continuation Uses `///`
```stata
regress y x1 x2 x3 ///
x4 x5 x6, ///
vce(robust)
```
### Stored Results: `r()` vs `e()` vs `s()`
- `r()` — r-class commands (summarize, tabulate, etc.)
- `e()` — e-class commands (estimation: regress, logit, etc.)
- `s()` — s-class commands (parsing)
A new estimation command **overwrites** previous `e()` results. Store them first:
```stata
regress y x1 x2
estimates store model1
```
---
## Delivering .do Files to the User
Claude's role is to **write correct, well-commented Stata code** and save it as a `.do` file. The user runs it manually in Stata.
### Workflow
```
1. Understand the research task and identify the relevant reference files (see routing table)
2. Write the complete .do file with inline comments explaining each step
3. Save the file to the workspace folder so the user can open it directly
4. Tell the user:
   - The file name and location
   - Any required packages to install first (see packages.md)
   - What the script produces (log, tables, graphs, .dta files, etc.)
   - Any paths or variable names they may need to adjust
```
### .do File Template
Every generated .do file should follow this structure:
```stata
/*==============================================================
Project : [Project name]
Task : [What this script does]
Author : Generated by Claude
Date : [Date]
Stata : 16+ recommended
==============================================================*/
* ---------- Preamble ----------
clear all
set more off
version 16
* Set working directory — user should adjust this path
* cd "/path/to/your/data"
* ---------- Required packages ----------
* Install if not already installed:
* ssc install reghdfe, replace
* ssc install estout, replace
* ---------- [Step 1: ...] ----------
// ... code with comments ...
* ---------- [Step 2: ...] ----------
// ...
* ---------- Output ----------
// Tables saved to: results/
// Figures saved to: figures/
// Log: results/analysis.log
log using "results/analysis.log", replace text
// ... estimation code ...
log close
```
### After Delivering the File
Once the `.do` file is saved, tell the user:
> "Open `[filename].do` in Stata (or run `do [filename].do` from the Stata command line). The script will produce [describe outputs]. If you encounter any errors, paste the error message here and I'll fix the code."
Do **not** attempt to execute the `.do` file yourself or simulate output. Wait for the user to run it and share results if they want further analysis.
---
## Routing Table
Read only the guide(s) relevant to the user's task. All paths are relative to this SKILL.md file.
| Guide | Topics |
|-------|--------|
| `references/basics.md` | Syntax, variables, operators, macros, loops, programming, data import/export, Mata, workflow best practices, external tools integration |
| `references/methods.md` | OLS, panel data, DiD, IV/GMM, RDD, treatment effects, matching, time series, MLE, bootstrap, survey methods, sample selection, nonparametric |
| `references/data-mgmt.md` | Data management (generate, merge, reshape), string/date/math functions, descriptive statistics, missing data, Mata data access & matrix operations, specialized models (LDV, survival, SEM, spatial, ML) |
| `references/output.md` | Graphics (twoway, schemes, coefplot), regression tables (esttab, outreg2, asdoc), summary tables (tabout), export to LaTeX/Word/Excel, graph export best practices |
| `packages.md` | Community packages: reghdfe, ivreg2, xtabond2, csdid, rdrobust, synth, psmatch2, binsreg, gtools, winsor2, diagnostics, and package management |
---
## Common Patterns
### Regression Table Workflow
```stata
* Estimate models
eststo clear
eststo: regress y x1 x2, vce(robust)
eststo: regress y x1 x2 x3, vce(robust)
eststo: regress y x1 x2 x3 x4, vce(cluster id)
* Export table
esttab using "results.tex", replace ///
se star(* 0.10 ** 0.05 *** 0.01) ///
label booktabs ///
title("Main Results") ///
mtitles("(1)" "(2)" "(3)")
```
### Panel Data Setup
```stata
xtset panelid timevar // declare panel structure
xtdescribe // check balance
xtsum outcome // within/between variation
* Fixed effects
xtreg y x1 x2, fe vce(cluster panelid)
* Or with reghdfe (preferred for multiple FE)
reghdfe y x1 x2, absorb(panelid timevar) vce(cluster panelid)
```
### Difference-in-Differences
```stata
* Classic 2x2 DiD
gen post = (year >= treatment_year)
gen treat_post = treated * post
regress y treated post treat_post, vce(cluster id)
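* Event study (uniform timing; must interact with treatment group)
* Factor variables cannot take negative values, so shift rel_time to start at 0
summarize rel_time, meanonly
local shift = -r(min)
gen rel_shift = rel_time + `shift'
local base = `shift' - 1 // shifted value of rel_time == -1 (reference period)
reghdfe y ib`base'.rel_shift#1.treated, absorb(id year) vce(cluster id)
testparm *.rel_shift#1.treated // pre-trend test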
* Modern staggered DiD (Callaway & Sant'Anna)
csdid y x1 x2, ivar(id) time(year) gvar(first_treat) agg(event)
csdid_plot
```
### Graph Export
```stata
* Publication-quality scatter with fit line
twoway (scatter y x, mcolor(navy%50) msize(small)) ///
(lfit y x, lcolor(cranberry) lwidth(medthick)), ///
title("Title Here") ///
xtitle("X Label") ytitle("Y Label") ///
legend(off) scheme(s2color)
graph export "figure1.pdf", replace as(pdf)
graph export "figure1.png", replace as(png) width(2400)
```
### Data Cleaning Pipeline
```stata
* Load and inspect
import delimited "raw_data.csv", clear varnames(1)
describe
codebook, compact
* Clean
rename *, lower // lowercase all varnames
destring income, replace force // convert string to numeric
replace income = . if income < 0
* Label
label variable income "Annual household income (USD)"
label define yesno 0 "No" 1 "Yes"
label values employed yesno
* Save
compress
save "clean_data.dta", replace
```
### Multiple Imputation
```stata
mi set mlong
mi register imputed income education
mi impute chained (regress) income (ologit) education = age i.gender, add(20) rseed(12345)
mi estimate: regress wage income education age i.gender
```
---
## Help Improve This Skill
If you produce Stata code with a significant error — wrong syntax, incorrect command usage, or a gotcha you failed to catch — and the issue seems to stem from a gap in these reference files rather than a one-off mistake, consider suggesting to the user that they report it so the documentation can be improved.
**When to raise this:** Only after you've already corrected the error and the user has working code. Frame it as optional: *"I made an error with [X] that I think comes from a gap in the Stata skill documentation. Would you like me to note this for future improvement?"*
If the user agrees, help them draft a clear description of the gap and the correct behavior.
synthetic-control
Econometrics skill for Synthetic Control Method (SCM). Activates when the user asks about "synthetic control", "SCM", "placebo test", "synthetic DID", "合成控制", "安慰剂检验", "合成反事实", "合成DID".
# Synthetic Control Method (SCM) Skill
This skill guides complete synthetic control analyses: donor pool construction, weight optimization, gap estimation, placebo-based inference, and extensions including augmented SCM and synthetic DID. Designed for policy evaluation with few treated units.
## When to Use Synthetic Control
| Situation | Method |
|-----------|--------|
| Single treated unit, many potential controls | Classic SCM |
| Few treated units | Multi-unit SCM or Synthetic DID |
| Treatment at aggregate level (state, country) | Classic SCM |
| Want DID-like inference with SCM weighting | Synthetic DID (Arkhangelsky et al. 2021) |
**Key advantage over DID**: SCM constructs a data-driven counterfactual rather than assuming parallel trends for all control units equally.
## Core Logic
SCM constructs a weighted combination of untreated ("donor") units that best approximates the treated unit's pre-treatment characteristics and outcome trajectory.
**Estimand**: τ_t = Y₁ₜ − Ŷ₁ₜ^(SC) for post-treatment periods t > T₀
**Key Assumptions**:
1. No interference between units (SUTVA)
2. No anticipation of treatment
3. Treated unit can be well-approximated by donor pool in pre-treatment period
4. The data-generating process is stable across pre/post periods
## SCM Workflow
1. **Define units and treatment**: Identify treated unit, donor pool, treatment date
2. **Select predictors**: Choose pre-treatment covariates and outcome lags
3. **Optimize weights**: Minimize pre-treatment MSPE between treated and synthetic control
4. **Estimate effects**: Gap = treated outcome − synthetic control outcome
5. **Inference**: Run placebo tests (in-space and in-time)
6. **Report**: Gap plot, placebo plot, MSPE ratios
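Steps 4 and 6 of the workflow can be sketched in a few lines once donor weights are available. The example below takes the weights as given (it does not optimize them) and computes the synthetic path, the gap, the pre-treatment MSPE, and the average post-treatment effect; the function name `synthetic_gap` and the toy series are hypothetical.

```python
def synthetic_gap(treated, donors, weights, n_pre):
    """treated: outcomes by period; donors: one outcome series per donor;
    weights: donor weights (non-negative, summing to 1); n_pre: number of pre-periods."""
    if min(weights) < 0 or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    n_periods = len(treated)
    # Synthetic control = weighted average of donor outcomes in each period
    synthetic = [sum(w * d[t] for w, d in zip(weights, donors)) for t in range(n_periods)]
    gap = [y - s for y, s in zip(treated, synthetic)]
    pre_mspe = sum(g ** 2 for g in gap[:n_pre]) / n_pre
    avg_effect = sum(gap[n_pre:]) / (n_periods - n_pre)
    return gap, pre_mspe, avg_effect

# Toy example (hypothetical): 3 pre-periods, 2 post-periods, 2 donors
treated = [1.0, 2.0, 3.0, 6.0, 7.0]
donors = [[0.0, 1.0, 2.0, 3.0, 4.0],
          [2.0, 3.0, 4.0, 5.0, 6.0]]
gap, pre_mspe, avg_effect = synthetic_gap(treated, donors, [0.5, 0.5], n_pre=3)
print(gap, pre_mspe, avg_effect)
```

In real applications the weights come from `synth`, `augsynth`, or a similar optimizer; the point here is only that the reported effect is the post-treatment gap and that its credibility rests on the pre-treatment MSPE.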
## Code Templates
### R — Classic SCM (Synth package)
```r
# R — Abadie-Diamond-Hainmueller SCM
library(Synth)
dataprep_out <- dataprep(
foo = df,
predictors = c("gdp_pc", "trade_share", "inflation"),
predictors.op = "mean",
time.predictors.prior = 1980:1999,
special.predictors = list(
list("outcome", 1995:1999, "mean"), # pre-treatment outcome lags
list("outcome", 1990, "mean"),
list("outcome", 1985, "mean")
),
dependent = "outcome",
unit.variable = "unit_id",
unit.names.variable = "unit_name",
time.variable = "year",
treatment.identifier = 1, # treated unit ID
controls.identifier = 2:20, # donor pool IDs
time.optimize.ssr = 1980:1999, # pre-treatment period
time.plot = 1980:2010 # full plot range
)
synth_out <- synth(dataprep_out)
# View donor weights (non-zero weights = selected donors)
synth_tab <- synth.tab(synth.res = synth_out, dataprep.res = dataprep_out) # avoid shadowing synth.tab()
print(synth_tab$tab.w) # unit weights
print(synth_tab$tab.v) # predictor weights
# Path plot: treated vs synthetic control
path.plot(synth_out, dataprep_out,
Ylab = "Outcome", Xlab = "Year",
Main = "Treated vs Synthetic Control")
abline(v = 2000, lty = 2, col = "red")
# Gap (treatment effect) plot
gaps.plot(synth_out, dataprep_out,
Ylab = "Gap (Treated − Synthetic)", Xlab = "Year",
Main = "Treatment Effect Over Time")
abline(v = 2000, lty = 2, col = "red")
abline(h = 0, lty = 3)
```
### R — Augmented SCM (augsynth)
```r
# R — Augmented SCM (Ridge-augmented, Ben-Michael et al. 2021)
library(augsynth)
asyn <- augsynth(outcome ~ treatment,
unit = unit_id, time = year,
data = df,
progfunc = "Ridge", # augmentation method
scm = TRUE) # include SCM weights
summary(asyn)
plot(asyn)
```
**Ridge Augmentation** (`progfunc = "Ridge"`): When the pre-treatment fit of standard SCM is imperfect, Ridge augmentation adds a bias-correction term. A ridge regression fitted on the donor units predicts post-treatment outcomes from pre-treatment outcomes, and the correction is the difference between that prediction for the treated unit and for its synthetic control. The correction is therefore close to zero when SCM already balances the pre-treatment outcomes and grows with the remaining imbalance, so the estimator stays near standard SCM when pre-treatment fit is excellent and adds robustness when it is not. Other `progfunc` options include `"GSYN"` (generalized synthetic control), `"MCP"` (matrix completion), and `"None"` (standard SCM only).
### Python — SparseSC
```python
# Python — SparseSC (penalized synthetic control)
import SparseSC
import numpy as np
# Reshape data to (N_units × T_periods) matrix
Y = df.pivot(index='unit_id', columns='year', values='outcome').values
T0 = 20 # number of pre-treatment periods
# Fit sparse synthetic control
sc = SparseSC.fit(
features=Y[:, :T0], # pre-treatment outcomes
targets=Y[:, T0:], # post-treatment outcomes
treated_units=[0] # index of treated unit
)
# Treatment effect estimate
treated_actual = Y[0, T0:]
synthetic_control = sc.predict(Y[0:1, :T0])[0]
effect = treated_actual - synthetic_control
print(f"Average post-treatment effect: {np.mean(effect):.4f}")
```
### Stata — synth and synth_runner
```stata
* Stata — Classic SCM
ssc install synth
ssc install synth_runner
tsset unit_id year
synth outcome gdp_pc trade_share inflation ///
outcome(1995) outcome(1990) outcome(1985), ///
trunit(1) trperiod(2000) ///
fig keep(synth_results) replace
* Plot results from the dataset saved by keep(); it holds _time, _Y_treated, _Y_synthetic
use synth_results, clear
twoway (line _Y_treated _time, lcolor(black) lwidth(medium)) ///
(line _Y_synthetic _time, lcolor(red) lpattern(dash)), ///
xline(2000, lpattern(dash)) ///
legend(label(1 "Treated") label(2 "Synthetic Control")) ///
title("Synthetic Control Estimate")
```
## Inference: Placebo Tests
SCM does not have standard errors in the traditional sense. Inference is based on placebo (permutation) tests.
### In-Space Placebo Test
Iteratively apply SCM to every control unit as if it were treated. If the treated unit's effect is unusually large relative to placebos, the effect is credible.
```r
# R — in-space placebo (Synth)
library(Synth)
placebo_gaps <- list()
all_units <- unique(df$unit_id)
for (u in all_units) {
controls <- setdiff(all_units, u)
dp <- dataprep(foo = df, predictors = c("gdp_pc", "trade_share"),
predictors.op = "mean", time.predictors.prior = 1980:1999,
special.predictors = list(list("outcome", 1995:1999, "mean")),
dependent = "outcome", unit.variable = "unit_id",
time.variable = "year", treatment.identifier = u,
controls.identifier = controls,
time.optimize.ssr = 1980:1999, time.plot = 1980:2010)
so <- synth(dp, Sigf.ipop = 3)
placebo_gaps[[as.character(u)]] <- dp$Y1plot - (dp$Y0plot %*% so$solution.w)
}
# Plot all gaps; treated unit should stand out
plot(1980:2010, placebo_gaps[["1"]], type = "l", lwd = 2, col = "black",
ylim = range(unlist(placebo_gaps)), ylab = "Gap", xlab = "Year")
for (u in setdiff(names(placebo_gaps), "1")) { # every unit except the treated one
lines(1980:2010, placebo_gaps[[u]], col = "grey70")
}
abline(v = 2000, lty = 2); abline(h = 0, lty = 3)
legend("topleft", c("Treated", "Placebos"), col = c("black","grey70"), lwd = c(2,1))
```
```stata
* Stata — in-space placebo with synth_runner
synth_runner outcome gdp_pc trade_share inflation ///
outcome(1995) outcome(1990) outcome(1985), ///
trunit(1) trperiod(2000) gen_vars
effect_graphs
single_treatment_graphs
```
### MSPE Ratios
Calculate the post/pre MSPE ratio for each unit. A credible effect puts the treated unit's ratio at or near the top of the ranking.
```r
# Rank by post/pre MSPE ratio
mspe_ratios <- sapply(names(placebo_gaps), function(u) {
gap <- placebo_gaps[[u]]
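pre <- gap[1:20] # pre-treatment periods (1980-1999)
post <- gap[21:31] # post-treatment periods (2000-2010)
mean(post^2) / mean(pre^2) # ratio of mean squared errors, not of sums
})
# p-value: fraction of all units (treated included) with ratio ≥ treated
p_value <- mean(mspe_ratios >= mspe_ratios["1"])
cat("MSPE ratio rank p-value:", p_value, "\n")
# The smallest attainable p-value is 1 / (number of units):
# with 20 units, p = 0.05 means the treated unit ranks first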
```
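The same rank-based inference can be sketched in plain Python to make the granularity of the p-value explicit. The helper names (`mspe_ratio`, `rank_p_value`) and the toy gaps below are hypothetical; a unit with a pre-treatment MSPE of exactly zero would need separate handling.

```python
def mspe_ratio(gap, n_pre):
    """Post/pre ratio of mean squared prediction errors for one unit's gap series."""
    pre, post = gap[:n_pre], gap[n_pre:]
    pre_mspe = sum(g * g for g in pre) / len(pre)
    post_mspe = sum(g * g for g in post) / len(post)
    return post_mspe / pre_mspe

def rank_p_value(ratios, treated_id):
    """Share of units (treated included) whose ratio is at least the treated unit's."""
    treated_ratio = ratios[treated_id]
    return sum(1 for r in ratios.values() if r >= treated_ratio) / len(ratios)

# Toy gaps (hypothetical): 3 pre-periods followed by 2 post-periods
gaps = {
    "treated": [0.1, -0.1, 0.1, 2.0, 2.0],
    "donor_a": [0.5, -0.5, 0.5, 0.6, -0.4],
    "donor_b": [0.2, 0.2, -0.2, 0.3, 0.1],
    "donor_c": [1.0, -1.0, 1.0, 1.5, -1.5],
}
ratios = {unit: mspe_ratio(g, n_pre=3) for unit, g in gaps.items()}
p_value = rank_p_value(ratios, "treated")
min_p = 1 / len(ratios)  # smallest p-value this donor pool can deliver
print(p_value, min_p)
```

With four units the treated unit ranks first yet the p-value cannot fall below 0.25, which is why the size of the donor pool should be reported next to the rank.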
### In-Time Placebo
Apply SCM with a fake treatment date in the pre-treatment period. Effect should be near zero.
```r
# Use earlier fake treatment date
dp_placebo <- dataprep(foo = df, ...,
time.optimize.ssr = 1980:1989, # shorter pre-period
time.plot = 1980:1999) # only pre-treatment
so_placebo <- synth(dp_placebo)
# Gap should be ≈ 0 if model is well-specified
```
## Donor Pool Construction
| Guideline | Rationale |
|-----------|-----------|
| Exclude units affected by similar treatment | Avoids contamination |
| Include only structurally similar units | Improves fit quality |
| Use pre-treatment outcome lags as predictors | Most powerful predictors |
| Drop donors with zero weight and large pre-MSPE | Focus on contributing donors |
| Leave-one-out: iteratively drop each donor | Check weight sensitivity |
### Covariate Balance Check
After constructing the synthetic control, verify that predictors are balanced between the treated unit and its synthetic counterpart:
```r
# R — predictor balance check after synth()
synth_tab <- synth.tab(synth_out, dataprep_out)
# tab.pred: treated, synthetic control, and sample average for each predictor
print(synth_tab$tab.pred)
# Look for rows where treated and synthetic values are close (small gap)
# Large gaps in key predictors signal poor donor pool or predictor choice
# Compute the raw predictor gaps (treated minus synthetic):
pred_balance <- synth_tab$tab.pred[, 1:2] # treated vs synthetic
pred_gaps <- pred_balance[, 1] - pred_balance[, 2]
cat("Predictor balance (treated - synthetic):\n")
print(round(pred_gaps, 4))
```
```python
# Python — predictor balance after SparseSC or manual SCM
import numpy as np
import pandas as pd
# Treated unit predictor values (pre-treatment means)
treated_pred = X_treated.mean(axis=0)
synthetic_pred = (weights @ X_donors).flatten() # weights × donor covariates
balance_df = pd.DataFrame({
'Predictor': predictor_names,
'Treated': treated_pred,
'Synthetic': synthetic_pred,
'Gap': treated_pred - synthetic_pred,
'Pct_Gap': 100 * (treated_pred - synthetic_pred) / np.abs(treated_pred)
})
print(balance_df.to_string(index=False))
# Flag predictors where |Pct_Gap| > 5% — consider adjusting donor pool
```
```stata
* Stata — predictor balance displayed automatically after synth
synth outcome gdp_pc trade_share inflation ///
outcome(1995) outcome(1990) outcome(1985), ///
trunit(1) trperiod(2000)
* The output table "Predictor Balance" compares treated vs synthetic vs sample avg
* Verify that treated and synthetic columns are close for key predictors
```
## Synthetic DID (Arkhangelsky et al. 2021)
Combines SCM weighting with DID estimation; works with multiple treated units.
```r
# R — Synthetic DID
library(synthdid)
# Data must be a balanced panel in matrix form
setup <- panel.matrices(df, unit = "unit_id", time = "year",
outcome = "outcome", treatment = "treated")
sdid <- synthdid_estimate(setup$Y, setup$N0, setup$T0)
se <- sqrt(vcov(sdid, method = "placebo"))
cat("SDID estimate:", sdid, "SE:", se, "\n")
plot(sdid)
```
```python
# Python — synthdid
# pip install synthdid
from synthdid.model import SynthDID
model = SynthDID(df, unit='unit_id', time='year',
outcome='outcome', treatment='treated')
model.fit()
print(f"SDID ATT: {model.att():.4f}")
model.plot()
```
## Pre-Treatment Fit Diagnostics
Good pre-treatment fit is the foundation of SCM credibility. A poorly fitting synthetic control cannot be trusted as a counterfactual.
**RMSPE benchmark**: Pre-treatment RMSPE should generally be **< 5% of the treated unit's pre-treatment outcome mean**. Higher values suggest the synthetic control is unreliable; reconsider predictor selection or donor pool composition.
```r
# R — compute and assess pre-treatment RMSPE
# After synth() and dataprep():
gaps <- dataprep_out$Y1plot - (dataprep_out$Y0plot %*% synth_out$solution.w)
pre_periods <- which(as.numeric(rownames(gaps)) < treatment_year)
post_periods <- which(as.numeric(rownames(gaps)) >= treatment_year)
pre_rmspe <- sqrt(mean(gaps[pre_periods]^2))
post_rmspe <- sqrt(mean(gaps[post_periods]^2))
outcome_mean <- mean(dataprep_out$Y1plot[pre_periods])
cat(sprintf("Pre-treatment RMSPE: %.4f (%.1f%% of outcome mean)\n",
pre_rmspe, 100 * pre_rmspe / outcome_mean))
cat(sprintf("Post-treatment RMSPE: %.4f\n", post_rmspe))
cat(sprintf("Post/Pre MSPE ratio: %.2f\n", (post_rmspe/pre_rmspe)^2))
# Rule of thumb: if pre-RMSPE > 5% of outcome mean, reconsider the design
if (pre_rmspe / outcome_mean > 0.05) {
warning("Pre-treatment RMSPE exceeds 5% of outcome mean — fit may be poor.")
}
```
```python
# Python — pre-treatment fit visualization
import matplotlib.pyplot as plt
import numpy as np
years = np.array(all_years)
treated = Y_actual # actual treated unit outcomes
synthetic = Y_synthetic # synthetic control outcomes
treatment_year = 2000
fig, axes = plt.subplots(1, 2, figsize=(12, 4))
# Panel A: level plot
axes[0].plot(years, treated, 'k-', linewidth=2, label='Treated')
axes[0].plot(years, synthetic, 'r--', linewidth=1.5, label='Synthetic Control')
axes[0].axvline(treatment_year, color='gray', linestyle=':', linewidth=1)
axes[0].set_title('Treated vs Synthetic Control'); axes[0].legend()
# Panel B: gap plot with pre-period RMSPE annotation
gap = treated - synthetic
axes[1].plot(years, gap, 'b-', linewidth=2, label='Gap (τ̂)')
axes[1].axvline(treatment_year, color='gray', linestyle=':', linewidth=1)
axes[1].axhline(0, color='black', linestyle='-', linewidth=0.5)
pre_mask = years < treatment_year
pre_rmspe = np.sqrt(np.mean(gap[pre_mask]**2))
axes[1].set_title(f'Treatment Effect (Pre-RMSPE = {pre_rmspe:.3f})')
axes[1].legend()
plt.tight_layout()
plt.savefig('synthetic_control_fit.pdf', bbox_inches='tight')
```
```stata
* Stata: pre-treatment fit assessment
* After synth estimation, the pre-treatment RMSPE is stored in e(RMSPE)
matrix list e(RMSPE)
* Period-by-period gaps from the stored outcome paths:
matrix gaps = e(Y_treated) - e(Y_synthetic)
* synth_runner automates the placebo runs and the p-values:
synth_runner outcome gdp_pc trade_share outcome(1995) outcome(1990), ///
trunit(1) trperiod(2000) gen_vars
* gen_vars adds pre_rmspe and post_rmspe as variables in the dataset
ereturn list // rank-based p-values are returned in e(), e.g. e(pval_joint_post)
```
## Reporting Standards
1. **Pre-treatment fit**: Show treated vs synthetic control plot with clear pre-treatment match
2. **Gap plot**: Show treatment effect over time with vertical treatment line
3. **Donor weights table**: Report which units contribute to synthetic control and their weights
4. **Predictor balance table**: Compare treated, synthetic, and sample average
5. **Placebo plot**: In-space placebos with treated unit highlighted
6. **MSPE ratio**: Report rank-based p-value (e.g., "treated unit ranks 1st of 20 units")
7. **Robustness**: Leave-one-out donor test, in-time placebo, alternative predictor sets
**Key sentence template**:
> "We construct a synthetic [unit] using a weighted combination of [N] donor [units] from the [donor pool description]. The synthetic [unit] closely tracks the treated [unit] in the pre-treatment period ([year range], pre-treatment MSPE = [value]). The estimated effect is [magnitude] ([% change]), with the treated unit ranking [1st/2nd] out of [N] units in post/pre-MSPE ratio (p = [value])."
## Common Pitfalls
- **Poor pre-treatment fit**: If pre-MSPE is large, the synthetic control is unreliable — reconsider predictor set or donor pool
- **Overfitting to noise**: Using too many outcome lags can overfit; use 3–5 evenly spaced lags
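- **Extrapolation and interpolation bias**: SCM weights are non-negative and sum to one, so the treated unit must lie inside the convex hull of the donors; even inside the hull, averaging donors that differ sharply from the treated unit can introduce interpolation bias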
- **Cherry-picking donors**: Always report full donor pool; justify exclusions
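A quick screen for the convex hull pitfall is to check whether each predictor of the treated unit falls inside the range spanned by the donors. This is a necessary condition only (a unit can pass it and still lie outside the hull), and the function name `outside_donor_range` and the toy values are hypothetical.

```python
def outside_donor_range(treated, donors):
    """treated: dict predictor -> value; donors: list of such dicts.
    Returns the predictors for which the treated value lies outside the donor range."""
    flagged = {}
    for name, value in treated.items():
        donor_values = [d[name] for d in donors]
        low, high = min(donor_values), max(donor_values)
        if not low <= value <= high:
            flagged[name] = (value, low, high)
    return flagged

# Toy predictors (hypothetical values)
treated_unit = {"gdp_pc": 50.0, "inflation": 3.0}
donor_pool = [
    {"gdp_pc": 20.0, "inflation": 2.0},
    {"gdp_pc": 40.0, "inflation": 5.0},
]
flagged = outside_donor_range(treated_unit, donor_pool)
print(flagged)  # gdp_pc exceeds every donor, so no convex combination can match it
```

Any flagged predictor means the synthetic control must extrapolate, which is a signal to revisit the donor pool or consider an augmented estimator.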
See `references/synthetic-control-reference.md` for multi-unit SCM extensions, staggered adoption SCM, and sensitivity analysis.
table
Called by /plot to upgrade regression and summary tables to top-journal standards.
# LaTeX Table Formatting Skill
This skill generates publication-quality regression tables, summary statistics tables, and multi-panel layouts for economics journals. Covers the major table-making tools: `pyfixest` (Python, primary), `esttab/estout` (Stata), `modelsummary/fixest::etable` (R), and `stargazer` (R/Python).
## Step 0: Workflow Interface
When called by `/plot` or another upstream command, this skill accepts a structured context block. **Do not re-ask the user for information already present in the context.**
```
table_type: table1_descriptive | table2_balance | table3_main | table_robustness | table_heterogeneity
strategy: DiD | RDD | IV | PanelFE | SC | OLS
model_results: path to regression output .csv (e.g., tables/table3_main_results.csv)
software: python | r | stata
cluster_var: variable name or "none"
fe_list: e.g., ["unit FE", "time FE"] or []
sample_desc: e.g., "Firm-year panel, 2000–2015, N=12,450"
project_name: string
```
When called standalone (not via `/plot`), prompt the user for `table_type`, `strategy`, and `software` before proceeding.
**Software → tool routing (execute only the matching section; skip all others):**
| `software` | Primary tool | Section to execute |
|---|---|---|
| `python` | `pyfixest.etable()` | [Python — pyfixest.etable()](#python--pyfixestetable-primary) |
| `r` | `modelsummary` + `fixest::etable` | [R — modelsummary](#r--modelsummary) / [R — fixest::etable](#r--fixestetable) |
| `stata` | `esttab/estout` | [Stata — esttab/estout](#stata--esttabestout) |
**Routing rules:**
1. Generate code **only** for the `software` specified in the caller context. Do not generate parallel versions in other languages unless the user explicitly requests them.
2. For `software = r`: if the upstream `/code` command used `fixest::feols`, prefer `fixest::etable()` (native, zero conversion); use `modelsummary` when the user needs `.docx` or `.html` output in addition to `.tex`.
3. For `software = python`: `pyfixest` objects (`feols`, `fepois`, `iv2sls`) pass directly to `etable()`. Do not use `statsmodels` or `pystout` unless `pyfixest` is unavailable.
4. For `software = stata`: `reghdfe` → `eststo` → `esttab` is the standard chain. Always check that `estout` is installed (and run `ssc install estout` if it is not) at the top of the do-file.
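The routing rules above amount to a small dispatch function. The sketch below is one way to encode them; the function name `route_table_tool` and its two boolean arguments are hypothetical, and the fallback to `modelsummary` for R models not estimated with `feols` is an assumption rather than something the rules state.

```python
def route_table_tool(software, used_feols=False, needs_docx_or_html=False):
    """Return the table tool implied by the routing rules for a given context."""
    software = software.lower()
    if software == "python":
        return "pyfixest.etable"
    if software == "stata":
        return "esttab"
    if software == "r":
        if needs_docx_or_html:
            return "modelsummary"  # rule 2: extra output formats
        # Assumption: without feols upstream, modelsummary is the default
        return "fixest::etable" if used_feols else "modelsummary"
    raise ValueError(f"unknown software: {software!r}")

print(route_table_tool("r", used_feols=True))
print(route_table_tool("r", used_feols=True, needs_docx_or_html=True))
```

Raising on an unknown `software` value keeps the skill from silently generating code in the wrong language.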
**Output convention (always dual-format):**
Every table must be saved in two files:
- `tables/tableN_name.tex` — body-only file for `\input{}` in the paper
- `tables/tableN_name.csv` — raw numbers for manual verification
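A minimal sketch of the dual-format convention, assuming the table has already been rendered to strings: one helper writes a booktabs `tabular` body and a CSV with the same cells. The name `save_dual` is hypothetical, the demo writes to a temporary directory instead of `tables/`, and cell contents are written as-is (no LaTeX escaping).

```python
import csv
import tempfile
from pathlib import Path

def save_dual(header, rows, stem, tables_dir="tables"):
    """Write <stem>.tex (tabular body) and <stem>.csv with identical cells."""
    out = Path(tables_dir)
    out.mkdir(parents=True, exist_ok=True)
    tex_path, csv_path = out / f"{stem}.tex", out / f"{stem}.csv"
    lines = [r"\begin{tabular}{l" + "c" * (len(header) - 1) + "}", r"\toprule",
             " & ".join(header) + r" \\", r"\midrule"]
    lines += [" & ".join(row) + r" \\" for row in rows]
    lines += [r"\bottomrule", r"\end{tabular}"]
    tex_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    with csv_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return tex_path, csv_path

# Demo with hypothetical numbers, written to a temporary directory
demo_dir = tempfile.mkdtemp()
header = ["", "(1)", "(2)"]
rows = [["Treatment", "0.123***", "0.098**"], ["", "(0.045)", "(0.041)"]]
tex_path, csv_path = save_dual(header, rows, "table3_main_results", demo_dir)
print(tex_path.name, csv_path.name)
```

Writing both files from the same `rows` object guarantees that the CSV used for verification cannot drift from the numbers in the paper.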
**Table type → template routing:**
| `table_type` | Template | Key Features |
|---|---|---|
| `table1_descriptive` | Descriptive stats | Variable groups, mean/sd/min/max/N, \multicolumn section headers |
| `table2_balance` | Balance test | Pre-treatment filter enforced, normalized diff column, ✅/⚠️ flag |
| `table3_main` | Main regression | Multi-column, FE rows, controls Yes/No, clustered SE note |
| `table_robustness` | Robustness | Panel layout (Panel A/B/C), same dep. var across specs |
| `table_heterogeneity` | Heterogeneity | Subgroup headers, interaction terms, p-value of difference |
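For the `table2_balance` template, the normalized difference column can be computed as below. This is a sketch under stated assumptions: the function names are hypothetical, and the 0.25 flag threshold is a commonly used rule of thumb, not a value specified by this skill.

```python
from math import sqrt
from statistics import mean, variance

def normalized_diff(treated, control):
    """Difference in means scaled by the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return (mean(treated) - mean(control)) / pooled_sd

def balance_flag(nd, threshold=0.25):
    """Assumed rule of thumb: flag covariates with |normalized diff| above 0.25."""
    return "✅" if abs(nd) <= threshold else "⚠️"

# Hypothetical pre-treatment covariate values
treated_vals = [10.0, 12.0, 14.0]
control_vals = [9.0, 11.0, 13.0]
nd = normalized_diff(treated_vals, control_vals)
print(f"{nd:.3f} {balance_flag(nd)}")
```

Unlike a t-statistic, the normalized difference does not grow mechanically with sample size, which makes it a more stable balance diagnostic for large panels.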
## Quick Decision: Which Tool to Use
| Tool | Language | Best For |
|------|----------|----------|
| `pyfixest.etable()` | Python ⚡ | Primary tool; works with feols/fepois/iv2sls from /code |
| `esttab/estout` | Stata | Most flexible; Stata-native workflows |
| `modelsummary` | R | Modern, clean API; many output formats |
| `fixest::etable` | R | Fast tables from `fixest` regressions |
| `stargazer` | R | Classic; widely used in econ |
## Regression Tables
### Stata — esttab/estout
```stata
* Stata — multi-model regression table
capture which esttab // install estout only if it is missing
if _rc ssc install estout
* Run models
eststo clear
eststo m1: reg y x1, robust
eststo m2: reg y x1 x2, robust
eststo m3: reg y x1 x2 x3, robust
eststo m4: reghdfe y x1 x2 x3, absorb(fe_var) cluster(cluster_var)
* Export to LaTeX
esttab m1 m2 m3 m4 using "results.tex", replace ///
b(3) se(3) /// // 3 decimal places
star(* 0.10 ** 0.05 *** 0.01) /// // significance stars
title("Main Results") ///
mtitles("OLS" "OLS" "OLS" "FE") /// // column titles
label /// // use variable labels
keep(x1 x2 x3) /// // show only key vars
order(x1 x2 x3) ///
stats(N r2 r2_a, fmt(%9.0fc %9.3f %9.3f) ///
labels("Observations" "R-squared" "Adj. R-squared")) ///
addnotes("Robust standard errors in parentheses." ///
"*** p<0.01, ** p<0.05, * p<0.1") ///
booktabs /// // professional formatting
fragment // no \begin{table} wrapper
* --- CSV dual-output (always run alongside LaTeX export) ---
* Export coefficient table as CSV for manual verification
esttab m1 m2 m3 m4 using "tables/table3_main_results.csv", replace ///
b(3) se(3) star(* 0.10 ** 0.05 *** 0.01) ///
csv plain noobs
* Multi-panel table: write Panel A, then append Panel B to the same file
esttab m1 m2 using "panel_table.tex", replace booktabs fragment ///
prehead("\begin{table}[htbp]" "\centering" "\caption{Results}" ///
"\begin{tabular}{lcc}" "\toprule" ///
"& \multicolumn{2}{c}{\textit{Panel A: Full Sample}} \\" ///
"\cmidrule(lr){2-3}")
esttab m3 m4 using "panel_table.tex", append booktabs fragment ///
prehead("\midrule" ///
"& \multicolumn{2}{c}{\textit{Panel B: Subsample}} \\" ///
"\cmidrule(lr){2-3}") ///
postfoot("\bottomrule" "\end{tabular}" ///
"\begin{tablenotes}" "\small" ///
"\item Standard errors in parentheses." ///
"\end{tablenotes}" "\end{table}")
```
### R — modelsummary
> **Workflow note:** `modelsummary` accepts `fixest::feols` objects directly — no conversion needed. If `/code` used `feols`, pass those objects straight into `modelsummary()` or `fixest::etable()`. Prefer `fixest::etable()` for speed; use `modelsummary()` when you need `.docx` or `.html` output alongside `.tex`.
```r
# R — modelsummary (modern, flexible)
library(modelsummary)
library(tibble) # provides tribble(), used for add_rows below
m1 <- lm(y ~ x1, data = df)
m2 <- lm(y ~ x1 + x2, data = df)
m3 <- lm(y ~ x1 + x2 + x3, data = df)
# LaTeX output
modelsummary(
list("(1)" = m1, "(2)" = m2, "(3)" = m3),
coef_map = c("x1" = "Treatment",
"x2" = "Control 1",
"x3" = "Control 2"),
gof_map = c("nobs", "r.squared", "adj.r.squared"),
stars = c('*' = .1, '**' = .05, '***' = .01),
title = "Main Results",
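vcov = "robust", # heteroskedasticity-robust SEs, so the note below is accurate
notes = "Robust standard errors in parentheses.",
output = "tables/table3_main_results.tex"
)
# --- CSV dual-output ---
modelsummary(
list("(1)" = m1, "(2)" = m2, "(3)" = m3),
vcov = "robust", # keep the CSV consistent with the LaTeX table
output = "tables/table3_main_results.csv" # same call, different extension
)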
# Add fixed effects indicators
modelsummary(
list("(1)" = m1, "(2)" = m2, "(3)" = m3),
add_rows = tribble(
~term, ~"(1)", ~"(2)", ~"(3)",
"Year FE", "No", "Yes", "Yes",
"Industry FE", "No", "No", "Yes"
),
output = "results.tex"
)
```
### R — fixest::etable
```r
# R — etable (fast, built into fixest)
library(fixest)
m1 <- feols(y ~ x1, data = df, vcov = "HC1")
m2 <- feols(y ~ x1 + x2 | year, data = df, vcov = ~cluster_var)
m3 <- feols(y ~ x1 + x2 | year + industry, data = df, vcov = ~cluster_var)
etable(m1, m2, m3,
tex = TRUE,
file = "results.tex",
dict = c(x1 = "Treatment", x2 = "Control"),
order = c("Treatment", "Control"),
drop = "Intercept",
fixef.group = list("Year FE" = "year",
"Industry FE" = "industry"),
style.tex = style.tex("aer"), # AER journal style
title = "Main Results",
notes = "Clustered standard errors in parentheses.")
```
### R — stargazer
```r
# R — stargazer (classic); expects lm/glm-type objects such as the lm() models above, not fixest objects
library(stargazer)
stargazer(m1, m2, m3,
type = "latex",
out = "results.tex",
title = "Main Results",
dep.var.labels = "Outcome Variable",
covariate.labels = c("Treatment", "Control 1", "Control 2"),
keep = c("x1", "x2", "x3"),
add.lines = list(
c("Year FE", "No", "Yes", "Yes"),
c("Industry FE", "No", "No", "Yes")
),
omit.stat = c("f", "ser"),
notes = "Robust standard errors in parentheses.",
notes.align = "l",
star.cutoffs = c(0.1, 0.05, 0.01))
```
### Python — pyfixest.etable() (Primary)
`pyfixest` is the standard Python estimation tool used by the `/code` command. Its `etable()` function can render the same fitted models as booktabs-ready LaTeX or as a tidy DataFrame, so dual-format output takes only one extra call.
```python
# Python — pyfixest regression table (primary workflow)
import pyfixest as pf
import pandas as pd
from pathlib import Path
TABLES_DIR = Path("tables")
TABLES_DIR.mkdir(exist_ok=True)
# Run models (mirrors /code output)
m1 = pf.feols("y ~ x1", data=df, vcov="HC3")
m2 = pf.feols("y ~ x1 + x2 | unit_id", data=df, vcov={"CRV1": "unit_id"})
m3 = pf.feols("y ~ x1 + x2 | unit_id + year", data=df, vcov={"CRV1": "unit_id"})
m4 = pf.feols("y ~ x1 + x2 | unit_id + year", data=df,
              vcov={"CRV1": "unit_id + year"}) # two-way clustered by unit and year
# --- LaTeX output (.tex) ---
etable_tex = pf.etable(
    [m1, m2, m3, m4],
    type="tex",                       # LaTeX output; same keyword as type="df" below
    digits=3,
    signif_code=[0.01, 0.05, 0.1],    # *** 0.01, ** 0.05, * 0.10
    coef_fmt="b \n (se)",             # coefficient with SE below
    labels={                          # rename variables
        "x1": "Treatment",
        "x2": "Log Income",
    },
    keep=["x1", "x2"],                # show only key vars (also drops the intercept)
    show_fe=False,                    # FE and controls indicators are added manually below
    # Keyword names follow recent pyfixest releases; confirm with help(pf.etable)
)
# Add FE / controls indicator rows
fe_block = r"""
\midrule
Unit FE & No & Yes & Yes & Yes \\
Year FE & No & No & Yes & Yes \\
Controls & No & No & No & Yes \\
"""
# Inject FE block before \bottomrule, then wrap in full table environment
def build_table_tex(etable_str, fe_block, caption, label, notes):
    """Wrap pyfixest etable fragment in a complete booktabs table."""
    # The etable LaTeX output is a tabular fragment; inject FE rows + wrap
    body = etable_str.replace(r"\bottomrule", fe_block + r"\bottomrule")
    return (
        r"\begin{table}[htbp]" + "\n"
        r"\centering" + "\n"
        rf"\caption{{{caption}}}" + "\n"
        rf"\label{{{label}}}" + "\n"
        + body + "\n"
        r"\begin{tablenotes}[flushleft]\footnotesize" + "\n"
        rf"\item \textit{{Notes:}} {notes}" + "\n"
        r"\end{tablenotes}" + "\n"
        r"\end{table}"
    )
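sample_desc = "Firm-year panel, 2000--2015, N=12,450"  # taken from the caller context
notes_str = (
    "Robust standard errors in parentheses, clustered at the unit level "
    "(columns 2--4). All regressions include the controls listed in row "
    "\\textit{Controls}. "
    f"Sample: {sample_desc}. "  # f-string, so the placeholder is actually filled in
    "*** p$<$0.01, ** p$<$0.05, * p$<$0.10."
)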
full_tex = build_table_tex(
etable_tex, fe_block,
caption="Main Results: Effect of Treatment on Outcome",
label="tab:main",
notes=notes_str
)
tex_path = TABLES_DIR / "table3_main_results.tex"
tex_path.write_text(full_tex)
print(f"Saved: {tex_path}")
# --- CSV output (.csv) — always save for verification ---
coef_df = pf.etable([m1, m2, m3, m4], digits=3, type="df")
csv_path = TABLES_DIR / "table3_main_results.csv"
coef_df.to_csv(csv_path, index=False)
print(f"Saved: {csv_path}")
```
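**FE indicator pattern:** Controls indicator rows are not generated by `etable()` and must be added manually. Recent `pyfixest` releases can print fixed-effects rows themselves (`show_fe=True`), so set `show_fe=False` when injecting the manual `fe_block` to avoid duplicate rows. The `fe_block` pattern above is the standard template.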
**IV / 2SLS tables with pyfixest:**
```python
# First stage + Second stage in one table
first = pf.feols("d ~ z + x1 | unit_id + year", data=df, vcov={"CRV1": "unit_id"})
# The second stage carries the same control (x1) as the first stage
second = pf.feols("y ~ x1 | unit_id + year | d ~ z", data=df, vcov={"CRV1": "unit_id"})
reduced = pf.feols("y ~ z + x1 | unit_id + year", data=df, vcov={"CRV1": "unit_id"})
iv_tex = pf.etable([first, second, reduced], digits=3,
                   labels={"d": "Treatment (D)", "z": "Instrument (Z)"},
                   type="tex")
# Add first-stage F-stat row manually:
second.IV_Diag()                  # computes first-stage diagnostics on the IV fit
fstat = second._f_stat_1st_stage  # private attribute; verify against your pyfixest version
iv_tex = iv_tex.replace(
r"\bottomrule",
rf"First-stage F & {fstat:.1f} & & \\" + "\n" + r"\bottomrule"
)
```
## Journal-Specific Styles
### AER (American Economic Review)
```r
# fixest style
etable(m1, m2, m3, style.tex = style.tex("aer"), tex = TRUE)
```
Key conventions: booktabs rules, no vertical lines, significance noted in footnote not with stars (AER discourages stars).
### QJE / ReStud / Econometrica
```stata
* Stata — clean academic style
esttab m1 m2 m3 using "results.tex", replace ///
b(3) se(3) star(* 0.10 ** 0.05 *** 0.01) ///
booktabs fragment ///
alignment(D{.}{.}{-1}) ///
prehead("\begin{table}[htbp]" "\centering" ///
"\caption{Title Here}\label{tab:main}" ///
"\begin{tabular}{l*{3}{D{.}{.}{-1}}}" "\toprule") ///
postfoot("\bottomrule" "\end{tabular}" ///
"\begin{tablenotes}[flushleft]\footnotesize" ///
"\item \textit{Notes:} Standard errors in parentheses." ///
" *** p$<$0.01, ** p$<$0.05, * p$<$0.1" ///
"\end{tablenotes}" "\end{table}")
```
## Multi-Panel and Complex Layouts
### Side-by-Side Panels
```stata
* Panel A: OLS, Panel B: IV
esttab m_ols1 m_ols2 using "table.tex", replace booktabs fragment ///
prehead("\begin{table}[htbp]\centering" ///
"\caption{OLS and IV Estimates}" ///
"\begin{tabular}{lcc}\toprule" ///
"& \multicolumn{2}{c}{\textit{Panel A: OLS}} \\" ///
"\cmidrule(lr){2-3}")
esttab m_iv1 m_iv2 using "table.tex", append booktabs fragment ///
prehead("\midrule" ///
"& \multicolumn{2}{c}{\textit{Panel B: IV/2SLS}} \\" ///
"\cmidrule(lr){2-3}") ///
postfoot("\bottomrule\end{tabular}\end{table}")
```
### Interaction Effects Table
```r
# R — interaction table
library(modelsummary)
m_interaction <- lm(y ~ x1 * group, data = df)
modelsummary(m_interaction,
coef_rename = c("x1" = "Treatment",
"group" = "Group",
"x1:group" = "Treatment × Group"),
output = "interaction.tex")
```
## Tips for Clean Tables
| Tip | Details |
|-----|---------|
| Use `booktabs` | `\toprule`, `\midrule`, `\bottomrule` instead of `\hline` |
| No vertical lines | Standard in economics journals |
| Align decimals | Use `dcolumn` package with `D{.}{.}{-1}` column type |
| Stars in notes | Clearly state significance levels in table notes |
| Variable labels | Use descriptive names, not variable codes |
| Fixed effects rows | Show Yes/No indicators for FE inclusions |
| Consistent decimals | 3 decimals for coefficients/SE; 0 for N |
| Notes placement | Below the table, left-aligned, smaller font |
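The decimal and star conventions in the table above can be enforced with a small formatter so that every cell is produced the same way. The helper names (`stars`, `format_cell`) are hypothetical; the cutoffs follow the `*** p<0.01, ** p<0.05, * p<0.1` convention used throughout this skill.

```python
def stars(p):
    """Significance stars: *** p<0.01, ** p<0.05, * p<0.1."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

def format_cell(coef, se, p, digits=3):
    """Return (coefficient with stars, SE in parentheses) as strings."""
    return f"{coef:.{digits}f}{stars(p)}", f"({se:.{digits}f})"

coef_cell, se_cell = format_cell(0.12345, 0.04567, 0.008)
n_cell = f"{12450:,}"  # 0 decimals and a thousands separator for N
print(coef_cell, se_cell, n_cell)
```

Routing every coefficient through one function avoids tables that mix two and three decimals or apply different star cutoffs across panels.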
## Standardized Notes Format
Every table **must** include a Notes line below the bottom rule. Notes are not optional: referees and editors rely on them to identify the standard errors, fixed effects, and sample behind each column. Use this template, filling in the bracketed fields:
```
Notes: [SE type] in parentheses[, clustered at the [unit] level]. [FE description.]
[Sample description: unit, time period, N.] [Any variable transformation or
winsorization.] *** p<0.01, ** p<0.05, * p<0.1.
```
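The template can be filled programmatically from the Step 0 context fields (`cluster_var`, `fe_list`, `sample_desc`). The sketch below is one possible assembly; the function name `build_notes` and its argument order are hypothetical.

```python
def build_notes(se_type, cluster_var="none", fe_list=(), sample_desc="", extra=""):
    """Assemble a Notes line from the caller context fields."""
    se_part = f"{se_type} in parentheses"
    if cluster_var != "none":
        se_part += f", clustered at the {cluster_var} level"
    parts = [se_part + "."]
    if fe_list:
        parts.append("Regressions include " + ", ".join(fe_list) + ".")
    if sample_desc:
        parts.append(f"Sample: {sample_desc}.")
    if extra:
        parts.append(extra)  # e.g. winsorization or variable transformations
    parts.append("*** p<0.01, ** p<0.05, * p<0.1.")
    return "Notes: " + " ".join(parts)

notes = build_notes(
    "Standard errors",
    cluster_var="firm",
    fe_list=["unit FE", "time FE"],
    sample_desc="Firm-year panel, 2000-2015, N=12,450",
)
print(notes)
```

Because the clustering and fixed-effects clauses are only emitted when the context supplies them, the note cannot claim clustering or fixed effects that the regression did not use.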
**Strategy-specific Notes templates:**
| Strategy | Notes Template |
|---|---|
| DiD / TWFE | "Standard errors double-clustered by unit and year in parentheses. Regressions include unit and year fixed effects. Sample: [panel, years, N]." |
| Event Study | "Standard errors clustered at the [unit] level. Reference period: t = −1. Coefficients plot relative effects; pre-period joint test p-value = [p]." |
| RDD | "Bias-corrected robust standard errors (rdrobust) in parentheses. Bandwidth h = [h] selected by MSE-optimal selector. Local linear polynomial." |
| IV / 2SLS | "Two-stage least squares. Standard errors clustered at the [unit] level. First-stage F-statistic = [F] (Stock-Yogo critical value for 10% maximal IV size with one endogenous regressor and one instrument = 16.38)." |
| Panel FE | "Standard errors clustered at the [unit] level. Hausman test p-value = [p] (favors FE over RE). Regressions include [FE list]." |
| OLS | "Heteroskedasticity-robust standard errors (HC3) in parentheses." |
**LaTeX Notes implementation:**
```latex
% Recommended: threeparttable + tablenotes (available in standard TeX distributions)
% Load the package once in the preamble, not inside the table: \usepackage{threeparttable}
\begin{table}[htbp]
\centering
\caption{...}
\begin{threeparttable}
\begin{tabular}{...}
...
\end{tabular}
\begin{tablenotes}[flushleft]
\footnotesize
\item \textit{Notes:} Standard errors double-clustered by firm and year in
parentheses. Regressions include firm and year fixed effects. Sample covers
manufacturing firms, 2000--2015 ($N = 12{,}450$ firm-year observations).
Revenue is winsorized at the 1st and 99th percentiles.
*** $p<0.01$, ** $p<0.05$, * $p<0.1$.
\end{tablenotes}
\end{threeparttable}
\end{table}
```
**Avoid:** bare `\footnote{}` for table notes (misplaces the note); `tabular` note rows (breaks alignment); notes that only say "Standard errors in parentheses" with no clustering or FE information.
## Common Pitfalls
- **Too many decimals**: 3 is standard for coefficients; more is noise
- **Missing clustering info**: Always state what SE are clustered on
- **Forgetting FE indicators**: Reviewers need to know which FE are included
- **Stars without notes**: Always define significance levels
- **Cramming too many models**: 4–6 columns is typical maximum
- **No CSV dual-output**: Always save `.csv` alongside `.tex` for verification
## LaTeX Integration: Paper-Ready Output
When the output will be `\input{}`-ed into a compiled paper (rather than compiled standalone), three things consistently cause failures. Address them upfront.
### 1. Body-Only Files — No Document Wrapper
Tools like `esttab`, `stargazer`, and manual Python scripts often emit a standalone `.tex` file with `\documentclass...\begin{document}...\end{document}`. This breaks `\input{}` in the parent paper because LaTeX cannot nest document environments.
Always generate two versions: the full standalone file for spot-checking, and a **body-only** file stripped of the document wrapper for inclusion in the paper.
```python
# Python — strip wrapper and save body-only file
import re
def save_body_only(tex_path):
    r"""Strip the \documentclass ... \end{document} wrapper; keep only the table content."""
    with open(tex_path) as f:
        txt = f.read()
    # Non-greedy match of everything between \begin{document} and \end{document}
    m = re.search(r'\\begin\{document\}(.*?)\\end\{document\}', txt, re.DOTALL)
    body = m.group(1).strip() if m else txt  # no wrapper found: keep the file as-is
    body_path = tex_path.replace('.tex', '_body.tex')
    with open(body_path, 'w') as f:
        f.write(body)
    return body_path
```
In the parent paper, include as:
```latex
\input{tables/table2_main_results_body} % no .tex extension needed
```
Make sure each body file contains the full `\begin{table}...\end{table}` block, not just the `\begin{tabular}` fragment. Without the `table` float, `\caption` fails at compile time (`\caption outside float`) and the table cannot be positioned or referenced as a float.
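Both failure modes described above (a leftover document wrapper, a missing `table` float) can be caught before the paper is compiled. The following is a minimal sketch; `check_body_file` is a hypothetical helper introduced here for illustration and only inspects the text of the file.

```python
import re

def check_body_file(tex):
    """Return a list of problems found in body-only LaTeX source (empty list = OK)."""
    problems = []
    # A body file must not carry its own document wrapper.
    if re.search(r"\\documentclass|\\(begin|end)\{document\}", tex):
        problems.append("document wrapper still present")
    # The table float must be present and balanced.
    # The pattern matches table and table*, but not tabular or tablenotes.
    n_begin = len(re.findall(r"\\begin\{table\*?\}", tex))
    n_end = len(re.findall(r"\\end\{table\*?\}", tex))
    if n_begin == 0:
        problems.append("missing \\begin{table} float")
    elif n_begin != n_end:
        problems.append("unbalanced table environment")
    return problems

good = "\\begin{table}\n\\begin{tabular}{lc}\na & b\n\\end{tabular}\n\\end{table}"
print(check_body_file(good))
```

Running the check on every `*_body.tex` file right after generation keeps the failure close to the script that caused it.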
### 2. Avoid siunitx by Default
The `siunitx` package (used for the `S` decimal-aligned column type) may be missing from minimal TeX installations, in which case compilation stops with `! LaTeX Error: File 'siunitx.sty' not found`. Prefer standard column types:
```latex
% Instead of: \begin{tabular}{l S S S} (requires siunitx)
% Use: \begin{tabular}{l c c c} (always works)
% For strict decimal alignment without siunitx, use the dcolumn package:
\usepackage{dcolumn} % ships with every standard TeX distro
\begin{tabular}{l D{.}{.}{-1} D{.}{.}{-1}}
```
For most robustness and heterogeneity tables, `c` columns are sufficient — the numbers are clearly readable without strict decimal alignment.
Also avoid Unicode characters in Python-generated `.tex` files. Characters like `≥`, `→`, `≤` typed directly can break compilation. Always use their LaTeX equivalents: `$\geq$`, `$\rightarrow$`, `$\leq$`.
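The substitution can be automated when table cells or notes are built in Python. The sketch below is illustrative only: `latex_safe` and `leftover_non_ascii` are hypothetical helper names, the mapping covers just the three symbols named above, and it assumes the text is not already inside math mode (where the added `$` delimiters would be wrong).

```python
# Unicode characters and their LaTeX equivalents (the three pairs named in the text).
UNICODE_TO_LATEX = {
    "\u2265": r"$\geq$",         # greater-than-or-equal sign
    "\u2192": r"$\rightarrow$",  # rightwards arrow
    "\u2264": r"$\leq$",         # less-than-or-equal sign
}

def latex_safe(text):
    """Replace known Unicode symbols with LaTeX commands."""
    for char, replacement in UNICODE_TO_LATEX.items():
        text = text.replace(char, replacement)
    return text

def leftover_non_ascii(text):
    """List any remaining non-ASCII characters so they can be handled by hand."""
    return sorted({c for c in text if ord(c) > 127})

note = latex_safe("Firms with N \u2265 30")
print(note)
print(leftover_non_ascii(note))
```

Calling `leftover_non_ascii` on the final string is a cheap way to catch symbols that the mapping does not yet cover.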
### 3. Overflow Prevention for Wide Tables
A table with 6 or more columns, or with a text description column, will almost certainly overflow the page width in portrait mode. Apply these fixes together:
```latex
% Rule: >=6 columns → wrap in landscape; text description column → use p{Xcm} not l
\usepackage{pdflscape} % add to preamble
% In the body file:
\begin{landscape}
\begin{table}[ht]
\centering
\caption{...}
\begin{threeparttable}
{\footnotesize\setlength{\tabcolsep}{4pt} % shrink font + column padding
\begin{tabular}{p{4.5cm} c c c c c c} % p{} for text col, c for data cols
...
\end{tabular}}
\begin{tablenotes}[flushleft]\small
\item \textit{Notes}: ...
\end{tablenotes}
\end{threeparttable}
\end{table}
\end{landscape}
```
**Quick reference for portrait mode** (US letter with 1in margins, ~16.5cm text width):
| Columns | First column | Approach |
|---------|-------------|----------|
| 3-4 | `l` | Portrait, no special treatment needed |
| 5-6 | `p{4.5cm}` + `{\footnotesize\setlength{\tabcolsep}{4pt}}` | Portrait, tight |
| 7+ | `p{Xcm}` + `\footnotesize` | Landscape always |
Keep `\begin{tablenotes}` text concise: a long inline math expression that cannot line-break (e.g., a full regression formula) will produce an `Overfull \hbox` even when the table itself fits. Summarize the spec in plain language in the note and put the equation in the methods section instead.
time-series
# Time Series Analysis Skill
This skill provides guidance for univariate and multivariate time series analysis in empirical economics. It covers stationarity testing, ARIMA modeling, VAR/VECM systems, cointegration, and Granger causality.
## Analysis Workflow
### Step 1: Exploratory Inspection
- Plot the series (level, first difference, log)
- Examine ACF and PACF plots to identify dependence structure
- Check for obvious trends, seasonality, structural breaks
### Step 2: Stationarity Testing
Always test for unit roots before modeling. Preferred approach:
1. ADF test (H₀: unit root present)
2. KPSS test (H₀: series is stationary)
3. If ADF rejects AND KPSS fails to reject → stationary (I(0))
4. If both suggest non-stationary → take first difference, retest
**ADF vs KPSS Conflict Resolution**: When ADF fails to reject (suggests unit root) but KPSS also fails to reject (suggests stationary), the tests disagree. Recommended approach: (a) check for structural breaks — a break can make a stationary series look like it has a unit root; (b) use Zivot-Andrews test to allow for one structural break; (c) examine the series visually and consider economic theory. When in doubt, err on the side of differencing to avoid spurious regressions.
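The decision rules above can be written as a small function of the two p-values. This is a sketch under stated assumptions: `classify_stationarity` is a hypothetical helper, the 5% level is a default rather than a recommendation from the text, and the case where both tests reject (not discussed above) is also labeled a conflict here.

```python
def classify_stationarity(adf_pvalue, kpss_pvalue, alpha=0.05):
    """Combine ADF and KPSS p-values into one of three labels."""
    adf_rejects_unit_root = adf_pvalue < alpha        # ADF H0: unit root
    kpss_rejects_stationarity = kpss_pvalue < alpha   # KPSS H0: stationary
    if adf_rejects_unit_root and not kpss_rejects_stationarity:
        return "stationary"   # both tests point to I(0)
    if not adf_rejects_unit_root and kpss_rejects_stationarity:
        return "difference"   # both point to a unit root: difference and retest
    # The tests disagree: check for breaks, inspect the plot, consult theory.
    return "conflict"

print(classify_stationarity(adf_pvalue=0.01, kpss_pvalue=0.10))
print(classify_stationarity(adf_pvalue=0.40, kpss_pvalue=0.10))
```

Note that the KPSS p-value reported by `statsmodels` is interpolated from a table and bounded between 0.01 and 0.10, so it should be read as a range rather than an exact value.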
### Step 3: Model Selection
- **Univariate**: ARIMA(p,d,q) based on AIC/BIC and ACF/PACF
- **Multivariate, no cointegration**: VAR in differences
- **Multivariate, with cointegration**: VECM (VAR in error-correction form)
### Step 4: Estimation and Diagnostics
- Check residual white noise (Ljung-Box test)
- Test for parameter stability (CUSUM)
- Impulse response functions for structural interpretation
### Step 5: Forecasting (if applicable)
- Report RMSE and MAE on holdout sample
- State forecast horizon and confidence intervals
## Quick Code Templates
### Stationarity Tests
```python
# Python
from statsmodels.tsa.stattools import adfuller, kpss
# ADF test
adf_result = adfuller(series, autolag='AIC')
print(f"ADF stat: {adf_result[0]:.4f}, p-value: {adf_result[1]:.4f}")
print(f"Critical values: {adf_result[4]}")
# KPSS test
kpss_result = kpss(series, regression='c', nlags='auto')
print(f"KPSS stat: {kpss_result[0]:.4f}, p-value: {kpss_result[1]:.4f}")
```
```r
# R
library(tseries); library(urca)
adf.test(series)
kpss.test(series, null = "Level")
```
```stata
* Stata
dfuller series, lags(4) regress
kpss series
```
### ARIMA Model
```python
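# Python — manual fit plus a small AIC grid search
from statsmodels.tsa.arima.model import ARIMA
import itertools
# Manual
model = ARIMA(series, order=(1, 1, 1)).fit()
print(model.summary())
# Order selection: grid over p, q in 0..2 with d fixed at 1; keep the lowest AIC
candidates = itertools.product(range(3), [1], range(3))
best_order = min(candidates, key=lambda o: ARIMA(series, order=o).fit().aic)
print(f"Lowest-AIC order: {best_order}")
# Forecast
forecast = model.forecast(steps=12)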
```
```r
# R
library(forecast)
model <- auto.arima(series, ic = "aic")
summary(model)
forecast(model, h = 12)
```
```stata
* Stata
arima series, arima(1,1,1)
predict yhat, xb
```
### VAR Model
```python
# Python
from statsmodels.tsa.api import VAR
model = VAR(data_matrix)
lag_order = model.select_order(maxlags=8)
print(lag_order.summary())
results = model.fit(lag_order.aic)
print(results.summary())
# Granger causality: does y2 help predict y1?
gc = results.test_causality('y1', ['y2'], kind='f')
print(gc.summary())
# Impulse response
irf = results.irf(periods=10)
irf.plot(orth=True)
```
```r
# R
library(vars)
VARselect(data_matrix, lag.max = 8, type = "const")
var_model <- VAR(data_matrix, p = 2, type = "const")
causality(var_model, cause = "y2")
irf(var_model, impulse = "y2", response = "y1", n.ahead = 10)
```
## Key Decision Rules
| Finding | Implication |
|---------|-------------|
| All series I(0) | Estimate VAR in levels |
| All series I(1), no cointegration | Estimate VAR in first differences |
| All series I(1), cointegration found | Estimate VECM |
| Mixed orders of integration | Standard VAR/VECM does not apply; consider the ARDL bounds test (valid for a mix of I(0) and I(1), not I(2)) |
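The table can be encoded as a lookup so that the modeling choice is made explicitly in code. This is only a sketch: `recommend_model` is a hypothetical helper, and the handling of I(2) series (outside the table) is an assumption added here.

```python
def recommend_model(orders, cointegrated=False):
    """Map integration orders (one int per series) to the table's recommendation."""
    orders = set(orders)
    if not orders:
        raise ValueError("at least one series is required")
    if max(orders) > 1:
        # Not covered by the table above; flagged rather than guessed.
        return "not covered: an I(2) or higher series is present"
    if orders == {0}:
        return "VAR in levels"
    if orders == {1}:
        return "VECM" if cointegrated else "VAR in first differences"
    return "ARDL bounds test"  # mix of I(0) and I(1)

print(recommend_model([1, 1], cointegrated=True))
print(recommend_model([0, 1]))
```

The integration orders would come from the unit-root tests in Step 2 and the `cointegrated` flag from a Johansen or Engle-Granger test.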
## ARIMA Order Selection Guide
- **AR(p)**: PACF cuts off after lag p, ACF decays
- **MA(q)**: ACF cuts off after lag q, PACF decays
- **ARMA(p,q)**: Both ACF and PACF decay gradually
- Use AIC for forecasting, BIC for parsimony/inference
- Always check residual ACF for remaining autocorrelation
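The AIC/BIC rule of thumb above follows from how each criterion penalizes model size. As a reminder of the standard definitions, with $k$ estimated parameters, $n$ observations, and maximized likelihood $\hat{L}$:

```latex
\begin{align}
\mathrm{AIC} &= 2k - 2\ln\hat{L} \\
\mathrm{BIC} &= k\ln n - 2\ln\hat{L}
\end{align}
```

Because $\ln n > 2$ once $n \geq 8$, BIC charges more per parameter than AIC in any realistic sample and therefore tends to select the smaller ARIMA order.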
### Residual Diagnostics (Ljung-Box White Noise Test)
```python
# Python — Ljung-Box test for residual autocorrelation
from statsmodels.stats.diagnostic import acorr_ljungbox
# After fitting ARIMA model:
residuals = model.resid
lb_result = acorr_ljungbox(residuals, lags=[10, 20], return_df=True)
print(lb_result)
# If p-value > 0.05 at all lags → residuals are white noise (good)
```
```r
# R — Ljung-Box test
Box.test(residuals(model), lag = 10, type = "Ljung-Box")
# p > 0.05 → no remaining autocorrelation
```
```stata
* Stata — Ljung-Box (portmanteau) test
* After arima estimation, generate the residuals first:
predict residuals, residuals
wntestq residuals, lags(10)
* Q-stat with p > 0.05 → white noise residuals
```
### ARCH/GARCH Models (Volatility Clustering)
Use when residuals exhibit time-varying variance (heteroskedasticity that clusters over time). Common in financial and macroeconomic data.
```python
# Python — GARCH(1,1) using arch library
from arch import arch_model
# First, fit mean model (e.g., AR(1)) and extract residuals
# Then model the conditional variance:
garch_model = arch_model(series, vol='GARCH', p=1, q=1, dist='normal')
garch_result = garch_model.fit(disp='off')
print(garch_result.summary())
# Extract conditional volatility
cond_vol = garch_result.conditional_volatility
# ARCH LM test for remaining ARCH effects, run on the standardized residuals:
from statsmodels.stats.diagnostic import het_arch
lm_stat, lm_pval, _, _ = het_arch(garch_result.std_resid)
print(f"ARCH LM test p-value: {lm_pval:.4f}") # should be > 0.05
```
```r
# R — GARCH(1,1) using rugarch
library(rugarch)
spec <- ugarchspec(
variance.model = list(model = "sGARCH", garchOrder = c(1, 1)),
mean.model = list(armaOrder = c(1, 0), include.mean = TRUE),
distribution.model = "norm"
)
fit <- ugarchfit(spec = spec, data = series)
show(fit)
# Test for remaining ARCH effects
# ArchTest(residuals(fit), lags = 10) # from FinTS package
```
```stata
* Stata — ARCH/GARCH
* First run the mean model by OLS (estat archlm is a postestimation command for regress; data must be tsset):
regress series L.series
* ARCH LM test:
estat archlm, lags(1 2 5)
* Fit GARCH(1,1) with an AR(1) mean:
arch series, arch(1) garch(1) ar(1)
```
For cointegration tests (Johansen, Engle-Granger), VECM specification, and structural break tests, see `references/time-series-reference.md`.
## Common Pitfalls
- **Regressing non-stationary series on each other**: Produces spurious regression — always test for unit roots first
- **Using VAR in levels when series are I(1) without cointegration**: Leads to invalid inference — difference the data or use VECM
- **Wrong lag length**: Too few lags → autocorrelated residuals; too many → overfitting. Use information criteria
- **Confusing Granger causality with true causality**: Granger causality is about predictive content, not causal mechanisms
- **Ignoring structural breaks**: A break can mimic a unit root — use Zivot-Andrews test, which allows for one endogenous break under the alternative:
```r
# R — Zivot-Andrews test (allows break under stationarity alternative)
library(urca)
za <- ur.za(series, model = "both", lag = 4)
summary(za)
# If test statistic < critical value → reject unit root even with break
# za@teststat gives statistic; za@cval gives 1%, 5%, 10% critical values
```
```python
# Python — Zivot-Andrews test
from statsmodels.tsa.stattools import zivot_andrews
# statsmodels returns (statistic, p-value, critical values, lags used, break index)
za_stat, pval, cvdict, baselag, bpindex = zivot_andrews(series, maxlag=4, regression='ct')
print(f"ZA stat: {za_stat:.4f}, p-value: {pval:.4f}, break at index: {bpindex}")
# pval < 0.05 → series is stationary with one structural break
```
analyze
Identification Strategy Analysis. Synthesizes research question and literature review to diagnose endogeneity threats, evaluate feasible identification strategies, select the optimal one, and produce an Identification Strategy Memo.
# /analyze — Identification Strategy Analysis
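## Step 0: Read Upstream Output Files
> 📎 **See [`shared/context-reader.md`](../shared/context-reader.md)**
> Files required for this phase: `research-question.md` (required), `literature-review-report.md` (required).
After reading them, output a research-background confirmation:
```
═══════════════════════════════════════════════
Phase 3 Research Background
═══════════════════════════════════════════════
Research question: [effect of X on Y]
Study population: [unit, period, region]
Identification level (Phase 1 preliminary call): Level [A/B/C]
Data source (preliminary): [dataset name]
Dominant method in the literature: [DiD / IV / OLS / ...]
Representative estimates: [e.g., elasticity of about -0.1 to -0.3]
Known endogeneity threats: [OVB: business cycle; reverse causality: ...]
Identification gap: [existing studies all use OLS; quasi-experimental evidence is lacking]
═══════════════════════════════════════════════
```
Once the user confirms this is correct, proceed to Step 1.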
---
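## Step 1: Build the Causal Framework
### 1.1 Define Potential Outcomes and the Treatment Variable
State the causal question explicitly using the **Potential Outcomes Framework**:
```
Target parameter:
- ATE (Average Treatment Effect): average treatment effect across all units
- ATT (Average Treatment Effect on the Treated): average effect for treated units
- LATE (Local Average Treatment Effect): effect for the complier subgroup (IV only)
- ATU (Average Treatment Effect on the Untreated): average effect for untreated units
Which parameter matters most for this study? (If the user does not specify, infer it from the research question and explain why.)
Treatment variable: [D = 1 if treated, 0 otherwise; or a continuous treatment]
Outcome variable: [Y]
Core hypothesis: [Y(1) - Y(0) ≠ 0, i.e., a treatment effect exists]
```
### 1.2 Draw a Directed Acyclic Graph (DAG)
Describe the causal paths among the key variables with a **text DAG**:
```
[Notation]
→ denotes causal direction
↔ denotes association (possibly through a common cause)
U denotes an unobserved confounder
Example (effect of the minimum wage on employment):
Minimum wage (D) → Employment rate (Y)
Economic conditions (U) ↔ Minimum wage (D)   # confounding path 1 (lobbying and the business cycle)
Economic conditions (U) → Employment rate (Y)
Minimum wage (D) → Labor demand (M) → Employment rate (Y)   # possible mediating mechanism
```
**Requirements:**
- Mark every **known confounding path** (open backdoor paths)
- Mark **potential mediators** (when studying mechanisms, avoid controlling for mediators)
- Mark **colliders** (conditioning on a collider opens a spurious path)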
---
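## Step 2: Diagnose Endogeneity Threats
Check each known source of endogeneity systematically. **The three categories in 2.1–2.3 must always be covered; 2.4 applies when the sample is not random:**
### 2.1 Omitted Variable Bias
**Checklist:**
| Candidate omitted variable | Direction of correlation with treatment D | Direction of correlation with outcome Y | Direction of bias (over/underestimate) | Severity |
|------------|---------------------|---------------------|---------------------|---------|
| [List the 2–4 most likely OVB sources] | + / - | + / - | Upward / downward | High / medium / low |
**Severity criteria:**
- **High**: strong theoretical or empirical grounds to expect a large correlation, and it cannot be controlled for with the available data
- **Medium**: some correlation, but partly absorbed by control variables or panel FE
- **Low**: present in theory, but the magnitude is likely negligible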
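The bias-direction column of the checklist can be filled in mechanically for the simplest case. The sketch below assumes a linear model with a single omitted variable and no other controls, where the sign of the bias equals the product of the two correlation signs; `ovb_direction` is a hypothetical helper name.

```python
def ovb_direction(sign_with_d, sign_with_y):
    """Sign of the OLS bias from one omitted variable.

    sign_with_d: '+' or '-', correlation of the omitted variable with treatment D
    sign_with_y: '+' or '-', effect of the omitted variable on outcome Y
    """
    for s in (sign_with_d, sign_with_y):
        if s not in ("+", "-"):
            raise ValueError("signs must be '+' or '-'")
    # Same signs -> positive product -> coefficient biased upward.
    return "upward" if sign_with_d == sign_with_y else "downward"

print(ovb_direction("+", "+"))
print(ovb_direction("+", "-"))
```

"Upward" means the estimate is pushed in the positive direction; whether that is an overestimate of the effect's magnitude depends on the sign of the true effect. With several omitted variables or additional controls, the net direction needs case-by-case reasoning.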
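### 2.2 Reverse Causality
Answer explicitly:
> "Could Y in turn affect D?"
If so, describe the reverse-causality path and state the direction of the resulting bias in the OLS estimator.
### 2.3 Measurement Error
Check whether the key variables (treatment D and outcome Y) suffer from:
- **Classical measurement error** (random noise): in D, it biases the treatment coefficient toward zero (attenuation bias)
- **Non-classical measurement error**: the direction of bias is indeterminate and must be analyzed case by case
- **Data-construction problems**: e.g., systematic omissions in administrative data, social-desirability bias in self-reported data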
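The attenuation result for classical measurement error can be stated compactly. The following is the textbook expression under simplifying assumptions made here for illustration: a single regressor, and an error $u$ in the observed treatment that has mean zero and is independent of both the true treatment $D^{*}$ and the regression error $\varepsilon$.

```latex
\begin{align}
D &= D^{*} + u, \qquad Y = \alpha + \beta D^{*} + \varepsilon \\
\operatorname{plim}\,\hat{\beta}_{\mathrm{OLS}} &= \beta \cdot \frac{\sigma^{2}_{D^{*}}}{\sigma^{2}_{D^{*}} + \sigma^{2}_{u}}
\end{align}
```

The ratio lies between 0 and 1, so the estimate shrinks toward zero as the noise variance $\sigma^{2}_{u}$ grows relative to the signal variance $\sigma^{2}_{D^{*}}$. Classical error in $Y$ alone does not bias $\hat{\beta}$; it only inflates the standard errors.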
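### 2.4 Sample Selection Bias
If the sample is not random, check for:
- **Heckman-type selection** (outcomes are observed only for those who select into participation)
- **Attrition bias**
- **Denominator definition problems** in the outcome variable (e.g., which population forms the denominator of the "employment rate")
---
### 2.5 Endogeneity Threat Assessment
Produce an endogeneity threat assessment report:
```
Endogeneity Threat Report
─────────────────────────────────────────────────────
Primary threats (must be addressed):
1. [The most serious source of endogeneity: one sentence on the mechanism and direction of bias]
2. [Second most serious...]
Secondary threats (control for or discuss):
3. [...]
Negligible threats (briefly explain why they can be ignored):
4. [...]
─────────────────────────────────────────────────────
Conclusion: the net bias of the OLS estimator in this study is expected to be [overestimate/underestimate/indeterminate].
```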
---
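## Step 3: Identification Strategy Evaluation Matrix
Given the research question, the endogeneity threats, and what is known about data availability, systematically evaluate **all candidate identification strategies**.
### 3.1 Candidate Strategy Screening
| Strategy | Core assumption | Data requirements | Precedent in the literature | Feasibility rating |
|------|----------|------------|---------|-----------|
| **Instrumental variables (IV/2SLS)** | Exclusion restriction + relevance | A valid instrument | Yes/No/Partial | ★★★★☆ |
| **Difference-in-differences (DiD)** | Parallel trends | Treated + control groups, quasi-experimental variation | Yes/No/Partial | ★★★★☆ |
| **Regression discontinuity (RDD)** | Continuity | Continuous running variable + threshold rule | Yes/No/Partial | ★★★☆☆ |
| **Synthetic control (SCM)** | Pre-treatment fit | Multiple potential control units + long panel | Yes/No/Partial | ★★★☆☆ |
| **Panel fixed effects (Panel FE)** | Strict exogeneity | Unit-level panel data | Yes/No/Partial | ★★★★☆ |
| **ML-based causal inference (DML/CF)** | Unconfoundedness (conditional independence) | High-dimensional controls are observed | Yes/No/Partial | ★★☆☆☆ |
| **OLS + rich controls** | No relevant omitted variables | Cross-section or panel | Yes/No/Partial | ★★☆☆☆ |
**Feasibility rating scale (1–5 stars):**
- 5★: data fully meet the requirements, assumptions are well documented, ample precedent
- 4★: data largely meet the requirements, assumptions are reasonably defensible, similar precedent exists
- 3★: data partly meet the requirements, assumptions are strong but a testing plan exists
- 2★: data are barely usable, assumptions are weakly supported and need extensive defense
- 1★: data do not meet the requirements, or the assumptions are not credible in this study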
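### 3.2 In-Depth Evaluation of the Top Two Candidates
For the **2 strategies** with the highest feasibility ratings, elaborate each as follows:
**Strategy A: [name]**
```
Applicability: [why this strategy suits this study]
Core assumptions:
Assumption 1 (testable): [formal statement]
→ Test plan: [specific test]
Assumption 2 (untestable but open to discussion): [formal statement]
→ Supporting arguments: [theory or indirect evidence]
Data requirements:
- Must have: [...]
- Ideally have: [...]
- Fallback: [...]
Main weaknesses:
- [What is the biggest vulnerability? Where are reviewers most likely to attack?]
Precedent in the literature:
- [Cite papers from the Phase 2 literature review that used this method]
Type of parameter estimated: [ATE / ATT / LATE]
```
**Strategy B: [name]**
(Same format as above)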
---
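## Step 4: Strategy Selection and Defense
### 4.1 Selected Strategy
Based on the Step 3 evaluation, select the best strategy and give a **three-layer defense**:
**Layer 1 — Economic theory**
> "[Selected strategy] suits this study because [the exogenous variation in X comes from ..., which satisfies the ... assumption]."
**Layer 2 — Literature convention**
> "The literature on [similar questions] commonly uses [method] (e.g., Author, Year; Author, Year). This paper follows that convention and improves on it in the following respects: [...]."
**Layer 3 — Data availability**
> "The structure of the data in this study [specifics: e.g., panel dimensions, policy timing, threshold rule] meets the core data requirements of the strategy."
### 4.2 Reasons for Rejecting Other Strategies
For each excluded candidate, give a brief reason (to pre-empt reviewer objections):
| Excluded strategy | Reason |
|----------|---------|
| [Strategy X] | [Specific reason, e.g., no valid instrument / treatment is self-selected and does not meet RDD conditions / ...] |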
---
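## Step 5: Formalize the Identifying Assumptions and Make Them Testable
This is the core output of Phase 3. For **each core assumption** of the selected strategy:
### 5.1 Required Format
Each assumption must include:
```
Assumption name: [e.g., parallel trends]
Formal statement: [mathematical or logical form]
E[Y₀(t) | D=1] - E[Y₀(t) | D=0] = constant, ∀ t < T (all pre-treatment periods)
Intuition: [one sentence in non-technical language]
Testability: directly testable / indirectly supported / untestable
Tests:
- Main test: [specific method, e.g., event study plot; test whether the pre-treatment coefficients are jointly zero]
- Auxiliary tests: [e.g., placebo test, fake control group test]
- Expected result: [what figure or statistic should appear if the assumption holds]
- Corresponding skill: [which skill runs the test, e.g., did-analysis / iv-estimation]
Consequence of violation: [if the assumption is rejected, what is the direction of bias in the estimator?]
```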
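### 5.2 Standard Assumption Lists by Strategy
Populate the core assumptions below automatically according to the selected strategy:
**If DiD is selected:**
1. Parallel Trends
2. No Anticipation
3. Stable Unit Treatment Value Assumption (SUTVA)
4. Exogeneity of treatment variation (the policy is not endogenous to the outcome)
**If IV/2SLS is selected:**
1. Relevance: first-stage F statistic > 10 (the Staiger-Stock rule of thumb; compare against Stock-Yogo critical values for a formal weak-instrument test)
2. Exclusion Restriction: the instrument affects the outcome only through the treatment
3. Independence: the instrument is unrelated to unobserved confounders
4. Monotonicity: required for the LATE interpretation
**If RDD is selected:**
1. Continuity at Cutoff
2. No Precise Manipulation: McCrary density test
3. No Compound Treatment: no other discontinuities at the threshold
4. Local Linearity
**If synthetic control is selected:**
1. Pre-treatment Fit
2. No Spillovers to control units
3. Factor Model Stability
**If Panel FE is selected:**
1. Strict Exogeneity: the treatment is uncorrelated with past, current, and future error terms
2. Unit fixed effects suffice to control for time-invariant confounders
3. No dynamic panel bias (when T is small)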
---
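## Step 6: Data Requirements Specification
Based on the selected identification strategy, produce a precise data requirements specification that feeds the `data-pipeline` skill in Phase 4:
```
Data Requirements Specification
══════════════════════════════════════════════════════
Research question: [restate]
Selected identification strategy: [restate]
══════════════════════════════════════════════════════
[Core dataset]
Dataset name: [e.g., China Industrial Enterprise Database (中国工业企业数据库)]
Data source: [institution / URL / how to apply]
Coverage: [years] × [geographic units] × [unit type]
Key variables:
- Outcome variable Y: [exact variable name/definition]
- Treatment variable D: [exact variable name/definition]
- Identifying variable Z: [instrument / running variable / policy timing]
- Required covariates: [list]
- Optional covariates (for robustness): [list]
Unit level: [individual / firm / county / province / country]
Panel structure: yes/no; T = [time span]
[Data limitation warnings]
- [Known issue 1, e.g., missing years / proxy error / sample representativeness]
- [Known issue 2]
[Auxiliary datasets, if any]
Auxiliary dataset name: [...]
Purpose: [instrument construction / placebo tests / mechanism analysis]
══════════════════════════════════════════════════════
```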
---
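## Step 7: Produce the Identification Strategy Memo
Integrate all of the analysis above into an `identification-memo.md` document and save it to the working directory:
**Example Output Format:**
```markdown
# Identification Strategy Memo
**Research Question:**
**Version:** v1.0
**Date:** [YYYY-MM-DD]
---
## 1. Causal Framework
### 1.1 Target Parameter
[ATE / ATT / LATE, with rationale]
### 1.2 Causal Graph (DAG)
[Text DAG]
---
## 2. Endogeneity Threats
### 2.1 Primary Threats
[Threat list from Step 2]
### 2.2 Expected Direction of OLS Bias
[Overestimate / underestimate / indeterminate, with reasoning]
---
## 3. Identification Strategy
### 3.1 Selected Strategy: [strategy name]
**Strategy overview**
[2–3 sentences describing the identification logic in non-technical language]
**Source of identifying variation**
[Specifics: what varies exogenously, and why is that credible?]
**Three-layer defense**
[From Step 4.1]
### 3.2 Excluded Strategies
[Table from Step 4.2]
---
## 4. Core Assumptions and Testing Plan
| Assumption | Formal statement | Testability | Test | Timing |
|------|----------|---------|---------|---------|
| [Assumption 1] | [...] | Testable | [...] | Phase 5, before modeling |
| [Assumption 2] | [...] | Indirectly supported | [...] | Phase 8 robustness |
[Detailed write-up of each assumption, from Step 5]
---
## 5. Data Requirements Specification
[From Step 6]
```
## Handoff
After completing all of the steps above, tell the user that the identification strategy analysis is complete and ask whether to continue to the next stage: data preparation and exploratory analysis.
code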
Phase 6 Code Generation & Execution. Reads identification-memo.md, data-report.md, and model-spec.md, asks user to select software (Python / R / Stata), dispatches to the appropriate estimation skill, generates a reproducible analysis script with main regression, diagnostics, and output export, then executes and verifies results.
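# /code — Code Execution and Reproduction
## Positioning
`/code` is the **sixth phase** of the empirical research workflow. It takes `model-spec.md` from Phase 5 (`/model`) and turns the formal econometric model into executable, reproducible analysis code. Core flow:
1. Read the upstream documents and extract every parameter needed for estimation
2. The user selects the target statistical software (Python / R / Stata)
3. Based on identification strategy × software, invoke the matching **estimation skill** to generate code
4. Execute the code and verify the results
5. Write standardized `.py` / `.R` / `.do` scripts to `code/` and result tables to `tables/`
---
## Step 0: Read Upstream Outputs
> 📎 **See [`shared/context-reader.md`](../shared/context-reader.md)**
> Files required for this phase: `identification-memo.md` (required), `data-report.md` (required), `model-spec.md` (required).
Extract the following information for code generation:
| Source | What to extract |
|------|---------|
| `model-spec.md` | Identification strategy type, main equation (LaTeX converted to a code description), names of the Y/D/Z/control variables, FE levels, SE type and cluster variable, target parameter |
| `data-report.md` | Path to the cleaned data (`data/clean/[project_name]_clean.[dta\|parquet]`), sample size, panel structure, data-quality warnings |
| `identification-memo.md` | Testable forms of the identifying assumptions (these determine which diagnostic code is generated) |
After extraction, confirm with the user:
> **Phase 6 Launch Confirmation**
>
> Identification strategy: **[strategy]**
> Main equation: $[LaTeX summary]$
> Data path: `data/clean/[file name]`
> Standard errors: **[type, cluster variable]**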
---
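## Step 1: Software Selection
Present the software options to the user:
> *"Please choose the statistical software for the generated code:"*
>
> - **A. Python** ⚡ Recommended — the code is **executed automatically in the sandbox** after generation, with results visible immediately
> - **B. R** — generates `.R` scripts, which **must be run manually on a local R installation**
> - **C. Stata** — generates `.do` scripts, which **must be run manually on a local Stata installation**
>
Wait for the user's confirmation, then proceed to Step 2.
> **Note**: With option B or C, the code files are written to `code/` but Claude does not execute them in the sandbox. The user downloads the scripts, runs them locally, and can upload the results for further analysis.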
---
## Step 2: Strategy × Software Dispatch
Based on `identification_strategy` × the chosen software, **invoke the matching estimation skill** and pass in the full context extracted in Step 0.
### 2.1 Dispatch Table
| Identification strategy | Estimation skill | Core Python packages | Core R packages | Core Stata commands |
|---------|-------------|-------------|---------|--------------|
| DiD / TWFE | `did-analysis` | `linearmodels`, `pyfixest` | `fixest`, `did` | `reghdfe`, `eventstudyinteract` |
| Event Study (staggered DiD) | `did-analysis` | `pyfixest`, `doubleml` | `fixest` (including its `sunab()` function), `did` | `reghdfe`, `did_imputation`, `csdid` |
| Sharp / Fuzzy RDD | `rdd-analysis` | `rdrobust` | `rdrobust`, `rddensity` | `rdrobust`, `rddensity` |
| 2SLS / IV | `iv-estimation` | `linearmodels` (`IV2SLS`) | `ivreg`, `AER` | `ivregress`, `ivreg2`, `ranktest` |
| Panel fixed effects | `panel-data` | `linearmodels` (`PanelOLS`) | `fixest`, `plm` | `reghdfe`, `xtreg` |
| Synthetic control | `synthetic-control` | `pysynth`, `synth_runner` | `Synth`, `SCtools` | `synth`, `synth_runner` |
| OLS / cross-sectional regression | `ols-regression` | `statsmodels`, `linearmodels` | `lm`, `estimatr` | `regress`, `reghdfe` |
| Double machine learning | `ml-causal` | `doubleml`, `econml` | `DoubleML` (R package) | No native support; Python is recommended |
| Time series / VAR | `time-series` | `statsmodels` | `vars`, `urca` | `var`, `vecm`, `arima` |
### 2.2 Skill Invocation Format
> *"Invoke `[estimation skill name]` to generate [software] code. Parameters passed:*
> - Strategy: [identification strategy]
> - Main equation: [text description]
> - Dependent variable: `[Y_var]`
> - Treatment variable: `[D_var]`
> - Identifying variable: `[Z_var (if any)]`
> - Control variables: `[control_vars list]`
> - Unit / time variables: `[id_var]` / `[time_var]`
> - FE levels: [unit + time / unit only / province × year ...]
> - SE type: [cluster level]
> - Data path: `data/clean/[file name]`"*
### 2.3 Stata-Specific Handling
When Stata is selected, in addition to the strategy's estimation skill, **also invoke the `stata` skill** so that the do-file follows best practice (standard header, global path macros, log file, `assert` checks, version declaration).
The `stata` skill provides:
- The standard structure of `00_master.do`
- Installation commands for `reghdfe` / `ivreg2` / `rdrobust`
- Result-export code for `estout` / `coefplot`
- Notes on Stata version compatibility
---
## Step 3: Code Quality Standards
Whatever software is chosen, the generated code must meet every requirement below:
### 3.1 File Header
Every script must begin with:
```python
# ============================================================
# Project: [research question in one sentence]
# Phase: 6 — Main Estimation
# Strategy: [identification strategy]
# Software: Python [version] / R [version] / Stata [version]
# Author: [project_name]
# Date: [YYYY-MM-DD]
# Input: data/clean/[file name]
# Output: tables/[result tables], figures/[coefficient plots]
# Model: [text description of the main equation, matching model-spec.md §2]
# ============================================================
```
### 3.2 Paths and Environment
```python
# Python
import os
from pathlib import Path
ROOT = Path(__file__).parent.parent # project root
DATA = ROOT / "data" / "clean"
OUT_TABLES = ROOT / "tables"
OUT_FIGURES = ROOT / "figures"
OUT_TABLES.mkdir(exist_ok=True)
OUT_FIGURES.mkdir(exist_ok=True)
data_path = DATA / "[project_name]_clean.parquet"
```
```r
# R
root <- here::here() # requires the here package
data_path <- file.path(root, "data", "clean", "[project_name]_clean.rds")
out_tables <- file.path(root, "tables"); dir.create(out_tables, showWarnings=FALSE)
out_figures <- file.path(root, "figures"); dir.create(out_figures, showWarnings=FALSE)
```
```stata
* Stata
version 17
clear all
set more off
cap log close
global root "[absolute path to workspace]"
global data "$root/data/clean"
global tables "$root/tables"
global figures "$root/figures"
cap mkdir "$tables"
cap mkdir "$figures"
* The log directory must exist before the log is opened
cap mkdir "$root/logs"
log using "$root/logs/main_estimation_`c(current_date)'.log", replace
```
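### 3.3 Data Loading + Sample Confirmation
```python
# Load the data and report basic dimensions
import pandas as pd
df = pd.read_parquet(data_path)
print(f"Sample: {len(df):,} rows × {df.shape[1]} columns")
print(f"Time range: {df[time_var].min()} – {df[time_var].max()}")
print(f"Units: {df[id_var].nunique()}")
# Cross-check against the sample size recorded in data-report.md
assert len(df) == EXPECTED_N, f"Sample size mismatch: expected {EXPECTED_N}, got {len(df)}"
```
### 3.4 Main Regression (from the estimation skill)
- Follow the equation specification in `model-spec.md` §2 strictly; do not add or drop variables
- Use the SE type and cluster level specified in §4
- In code comments, note which symbol in the equation each parameter corresponds to (e.g., `# β: ATT estimate`)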
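### 3.5 Diagnostic Code (executing the diagnostic plan in `model-spec.md §6`)
The **specification decisions** for diagnostics (what to test, pass criteria, failure handling) were made in Phase 5 (`/model` Step 3.5) and written to `model-spec.md §6`. The sole job of this step is to **execute that specification**:
1. Read `model-spec.md §6.1` (identification-assumption diagnostics) and `§6.2` (statistical-assumption diagnostics)
2. Generate `code/02_diagnostics.[ext]`, coding exactly the tests, pass criteria, and failure-handling logic specified in §6
3. Do not add or drop tests, or change pass criteria, at the coding stage
**Code generation rules:**
- Each row of §6.1 → generate the matching identification diagnostic code (`rddensity`, pre-trend F test, Hausman, etc.)
- Each row of §6.2 → generate the matching statistical diagnostic code (VIF, BP, BG/Wooldridge, RESET, Cook's D)
- If a §6 row is marked "terminate on failure" → add `assert` (Python) / `stop()` (R) / `error` (Stata) so that a failure halts execution and prints advice on which phase to return to
- If a §6 row is marked "continue" → the code logs a WARN and continues without halting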
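The two failure-handling rules above ("terminate on failure" versus "continue") can share one small wrapper in the generated Python diagnostics. This is a minimal sketch: `apply_gate` and its argument names are hypothetical, and the real pass criteria would come from `model-spec.md §6`.

```python
import warnings

def apply_gate(test_name, passed, on_fail="stop", advice=""):
    """Apply the failure-handling rule of one diagnostic row.

    on_fail="stop":     halt the script when the test fails
    on_fail="continue": record a WARN and keep going
    """
    if passed:
        return "PASS"
    if on_fail == "stop":
        # Terminate on failure and tell the user where to go back to.
        raise AssertionError(f"{test_name} failed. {advice}")
    if on_fail == "continue":
        warnings.warn(f"WARN: {test_name} failed; continuing. {advice}")
        return "WARN"
    raise ValueError("on_fail must be 'stop' or 'continue'")

# Example with made-up p-values, for illustration only
status_bp = apply_gate("Breusch-Pagan", passed=0.01 > 0.05, on_fail="continue",
                       advice="Robust or clustered SE are required.")
status_vif = apply_gate("VIF", passed=True)
print(status_bp, status_vif)
```

Collecting the returned statuses in a dictionary gives the rows needed later for the diagnostic report.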
**Diagnostic packages by software:**
| Test | Python | R | Stata |
|------|--------|---|-------|
| VIF | `statsmodels.variance_inflation_factor` | `car::vif` | `estat vif` |
| Heteroskedasticity | `statsmodels.het_breuschpagan` | `lmtest::bptest` | `estat hettest` |
| Serial correlation (cross-sectional model) | `statsmodels.acorr_breusch_godfrey` | `lmtest::bgtest` | `estat bgodfrey` |
| Serial correlation (panel) | `linearmodels` Wooldridge | `plm::pwartest` | `xtserial` |
| Cross-sectional dependence | `linearmodels` Pesaran CD | `plm::pcdtest` | `xtcsd` |
| RESET | `statsmodels.linear_reset` | `lmtest::resettest` | `estat ovtest` |
| Cook's D | `statsmodels.OLSInfluence` | `cooks.distance` | `predict cooksd` |
| McCrary | `rddensity` Python port | `rddensity::rddensity` | `rddensity` |
| Joint pre-trend test | `pyfixest.wald_test` | `fixest::wald` | `test` (joint F) |
| First-stage F | `pyfixest` / `linearmodels` | `AER::ivreg` diagnostics | `estat firststage` |
| Hausman | `linearmodels.compare` | `plm::phtest` | `hausman` |
### 3.6 Result Output
> 📎 **For output format standards see [`shared/output-standards.md`](../shared/output-standards.md)**
All model results must be exported in both of the following forms:
**Regression coefficient table (`.tex` + `.csv`):**
```python
# Python: use pyfixest, or statsmodels' summary_col
from pyfixest.summarize import etable
etable([model_main, model_control1, model_control2],
       type="tex",
       file=str(OUT_TABLES / "table_main.tex"))
```
```r
# R: use modelsummary, or fixest's etable
library(modelsummary)
modelsummary(list("Baseline"=m1, "Controls"=m2, "Full"=m3),
  stars=c("*"=0.1, "**"=0.05, "***"=0.01),
  gof_omit="AIC|BIC|Log",
  output=file.path(out_tables, "table_main.tex"))
```
```stata
* Stata: use esttab (estout package)
esttab m1 m2 m3 using "$tables/table_main.tex", ///
b(3) se(3) star(* 0.10 ** 0.05 *** 0.01) ///
booktabs replace label ///
mtitles("Baseline" "Controls" "Full") ///
keep(`D_var') stats(N r2, fmt(0 3) labels("Obs." "R-squared"))
```
**Coefficient plot (`.png`):**
```python
# Python: matplotlib + regression results
import matplotlib.pyplot as plt
fig, ax = plt.subplots(figsize=(7, 4))
coef = result.params[D_var]
ci_lo, ci_hi = result.conf_int().loc[D_var]
ax.errorbar(0, coef, yerr=[[coef - ci_lo], [ci_hi - coef]],
            fmt="o", ms=8, capsize=5, color="#C0392B")
ax.axhline(0, ls="--", lw=1, color="gray")
ax.set_title(f"Main Estimate: Effect of {D_var} on {Y_var}")
plt.savefig(OUT_FIGURES / "coef_main.png", dpi=150, bbox_inches="tight")
```
```stata
* Stata: coefplot package
coefplot m1 m2 m3, keep(`D_var') ///
xline(0) msymbol(D) ///
title("Main Estimates") ///
xtitle("Coefficient (95% CI)") ///
graphregion(color(white))
graph export "$figures/coef_main.png", replace
```
### 3.7 Reproducibility Footer
At the end of every script:
```python
# Python
import sys, platform
print(f"\n{'='*50}")
print(f"Python {sys.version}")
print(f"Platform: {platform.platform()}")
print(f"Date: {pd.Timestamp.now()}")
import importlib
for pkg in ["pandas", "numpy", "statsmodels", "linearmodels", "pyfixest"]:
    try:
        v = importlib.import_module(pkg).__version__
        print(f" {pkg}: {v}")
    except Exception:
        pass  # package not installed or exposes no __version__
```
```r
# R
sessionInfo()
```
```stata
* Stata
which reghdfe
which rdrobust
di "Stata version: `c(stata_version)'"
di "Date: `c(current_date)'"
log close
```
---
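## Step 4: Code File Structure
Save all code files to `code/`, named with numeric prefixes, and keep each one runnable on its own:
```
code/
├── 00_master.[py|R|do] # Master script: runs 01–04 in order
├── 01_main_estimation.[py|R|do] # Main regression (main equation in model-spec.md §2)
├── 02_diagnostics.[py|R|do] # Identification-assumption diagnostics (all tests listed in §3.5)
├── 03_event_study.[py|R|do] # Event study / dynamic effects (DiD/RDD only)
└── 04_output_tables.[py|R|do] # Result tables and coefficient plots
```
> **Note**: If the main regression, diagnostics, and output logic fit within 100 lines, they may be merged into a single file `01_main_estimation.[py|R|do]`. Complex designs (staggered DiD, fuzzy RDD, multiple robustness checks) should be split across files.
**`00_master` template:**
```python
# 00_master.py — run all analysis scripts in order
import subprocess, sys
scripts = [
    "code/01_main_estimation.py",
    "code/02_diagnostics.py",
    "code/03_event_study.py",
    "code/04_output_tables.py",
]
for s in scripts:
    print(f"\n{'='*50}\nRunning: {s}\n{'='*50}")
    # check=True raises CalledProcessError and stops the pipeline if a script fails
    result = subprocess.run([sys.executable, s], check=True)
```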
```stata
* 00_master.do
do "$root/code/01_main_estimation.do"
do "$root/code/02_diagnostics.do"
do "$root/code/03_event_study.do"
do "$root/code/04_output_tables.do"
```
---
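## Step 5: Execution and Verification
### 5.1 Execute the Code
**Python — automatic sandbox execution**
After generating the code, Claude installs the dependencies and runs it directly in the sandbox; no user action is needed:
```bash
pip install pyfixest linearmodels rdrobust doubleml pyreadstat --quiet
python code/00_master.py
```
When execution finishes, the result tables (`tables/`) and figures (`figures/`) are available immediately. Claude presents the key numbers in the conversation and checks them.
---
**R — script generated, run locally by the user**
Claude writes the `.R` scripts to `code/` and **does not execute them in the sandbox**. The user downloads them and runs them in a local R environment:
```r
# In local R: install the dependencies first (first run only)
install.packages(c("fixest", "rdrobust", "ivreg", "modelsummary",
"here", "AER", "did", "Synth"),
repos = "https://cloud.r-project.org")
# Run the master script
source("[workspace]/code/00_master.R")
```
> *"The R scripts have been written to `code/`. Please run `00_master.R` in a local R (≥ 4.1) environment. If you upload the results (tables / figures), I can continue to help check and interpret them."*
---
**Stata — script generated, run locally by the user**
Claude writes the `.do` scripts to `code/` and **does not execute them in the sandbox**. The user downloads them and runs them in local Stata:
> *"The Stata do-files have been written to `code/`. Please run the following in local Stata (≥ 15):*
> ```stata
> do "[workspace]/code/00_master.do"
> ```
> *Before the first run, install the required packages (run in the Stata command window while online):*
> ```stata
> ssc install reghdfe
> ssc install ftools
> ssc install ivreg2 // IV strategy
> ssc install rdrobust // RDD strategy
> ssc install synth // synthetic control
> ssc install eventstudyinteract // Event Study
> ssc install csdid // Callaway-Sant'Anna
> ssc install estout // result export
> ssc install coefplot // coefficient plots
> ```
> *If you upload the results (logs / tables / figures), I can continue to help check and interpret them."*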
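### 5.2 Diagnostic Report Generation and Result Verification
After the code runs, consolidate the two layers of diagnostics, generate a structured report, and verify the key numbers.
**Generate `diagnostic_report.md` (saved to the project root):**
```
═══════════════════════════════════════════════════════
MODEL DIAGNOSTIC REPORT
═══════════════════════════════════════════════════════
Project: [project_name]
Strategy: [identification strategy]
Model: [Y_var] = f([D_var], [controls]) + [FE levels]
N: [sample size]
Date: [YYYY-MM-DD]
═══════════════════════════════════════════════════════
[Layer 1: Identification-assumption diagnostics]
────────────────────────────────────────────────────────
Test Statistic Conclusion
────────────────────────────────────────────────────────
[Identification tests for the strategy] [value] ✅ PASS / ⚠️ WARN / ❌ FAIL
────────────────────────────────────────────────────────
[Layer 2: Statistical-assumption diagnostics]
────────────────────────────────────────────────────────
VIF (max) [value] ✅ / ❌
Breusch-Pagan p = [value] ✅ / ⚠️ (robust SE already used)
Breusch-Godfrey p = [value] ✅ / ❌
RESET p = [value] ✅ / ❌
Cook's D (max) [value] ✅ / ⚠️
────────────────────────────────────────────────────────
[Diagnostic conclusions and recommendations]
────────────────────────────────────────────────────────
❌ Identification assumption failed
→ Stop immediately and return to Phase [4/5] to fix: [details]
⚠️ Heteroskedasticity (clustered SE already used; acceptable)
✅ No multicollinearity problem
✅ No serial correlation
⚠️ Cook's D above threshold ([N] high-influence observations)
→ Added to the Phase 8 robustness checks: re-estimate after dropping high-influence observations
════════════════════════════════════════════════════════
```
**Result verification checklist (confirm item by item):**
```
=== Phase 6 Verification Checklist ===
□ Is the sign of the main coefficient consistent with the EDA stage?
□ Does the sample size match the N recorded in data-report.md?
□ Does the SE type match model-spec.md §4 (correct cluster level)?
□ [Do all Layer 1 identification diagnostics pass?]
□ DiD: joint test of pre-trend coefficients, p > 0.1
□ RDD: McCrary test p > 0.1; no significant jumps in covariates
□ IV: first-stage F > 10; J-test p > 0.05 (when overidentified)
□ FE: Hausman test supports the FE choice
□ SC: pre-treatment RMSPE of the treated unit lies in the middle of the donor distribution
□ [No blocking issues in the Layer 2 statistical diagnostics?]
□ VIF < 10 (all control variables)
□ Heteroskedasticity: if BP fails, confirm that robust/clustered SE are used
□ RESET: p > 0.05, or economic theory supports the current functional form
□ Cook's D: high-influence observations are listed and carried into the robustness checks
□ Result tables exported to tables/ (.tex + .csv)?
□ Coefficient plots exported to figures/ (.pdf + .png)?
□ diagnostic_report.md saved to the project root?
```
**If an identification assumption fails (❌)**: stop immediately, do not enter Phase 7, and tell the user explicitly which phase to return to and why.
**If statistical assumptions raise a ⚠️**: continue, but write the issue automatically into the "recommendations" section of diagnostic_report.md. When the Phase 8 `/robustness` command reads the report, it automatically adds the corresponding robustness checks.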
---
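## Step 6: Phase Confirmation and Handoff
Present a phase summary to the user:
> **Phase 6 Output Summary**
>
> ✅ Main regression code: `code/01_main_estimation.[ext]`
> ✅ Diagnostic code: `code/02_diagnostics.[ext]`
> ✅ Event study code: `code/03_event_study.[ext]` (if applicable)
> ✅ Output script: `code/04_output_tables.[ext]`
> ✅ Master script: `code/00_master.[ext]`
> ✅ Main regression table: `tables/table_main.tex` + `.csv`
> ✅ Coefficient plot: `figures/coef_main.png`
>
> ---
>
> **Key results summary:**
> - Main coefficient $\hat{\beta}$ = [value] (SE = [value], p = [value])
> - Identification-assumption diagnostics: [passed / in doubt, with details]
> - Sample size: N = [value]
>
> Once the code and results pass verification, automatically invoke the `results-analysis` skill and move to **Phase 7 (results analysis and interpretation)**, which produces `results-memo.md`.
data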
Phase 4 Data Preparation Pipeline. Data fetch, data clean and exploratory analysis. Produce a data report in the end.
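# /data — Data Preparation and Exploratory Analysis
## Positioning
`/data` is the fourth phase of the empirical research workflow. It takes the `identification-memo.md` produced by Phase 3 (`/analyze`), runs the full pipeline from data acquisition to exploratory analysis, and produces `data-report.md`, delivering an analysis-ready dataset for the econometric modeling that follows.
---
## Step 0: Read the Phase 3 Output
Read `identification-memo.md` from the working directory:
```text
Read: [workspace]/identification-memo.md
```
If the file does not exist, stop immediately and prompt:
> *"`identification-memo.md` was not found. Please complete Phase 3 and confirm the identification strategy and data requirements specification before running `/data`."*
If the file exists, extract and organize the following information:
| Information | Use |
| ----------------------------------- | ---------------------------- |
| Research question | Header of the data report and sample definition |
| Selected identification strategy | Determines variable construction and EDA focus |
| Target parameter (ATE / ATT / LATE, etc.) | Determines sample definition and how variables are interpreted |
| Core dataset | Defines the main data acquisition task |
| Outcome variable Y | Key-variable audit |
| Treatment variable D | Key-variable audit |
| Identifying variable Z / cutoff / policy timing | Construction of the identifying variable |
| Required covariates | Check that the variables are present |
| Panel structure | Unit level, time range, geographic scope |
| Known data limitation warnings | Focus of the Phase 4 risk checks |
After extraction, confirm with the user:
> *"I have read `identification-memo.md`. The data requirements are as follows:*
> *Core dataset: [dataset name]*
> *Identification strategy: [strategy name]*
> *Key variables: Y=[outcome variable], D=[treatment variable], Z=[identifying variable]*
> *Please confirm, or let me know of any adjustments."*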
---
## Step 1: 数据获取
### 1.0:数据来源确认(多选)
根据 Step 0 中提取的数据集名称,向用户确认数据来源构成。**支持同时选择多个来源。**
向用户呈现以下确认请求:
> *"根据识别策略分析,本研究需要以下数据集:[从 identification-memo.md 提取的数据集列表]*
>
> 请确认每个数据集的获取方式(可多选):
>
> - **[ ] A. 公开数据**:可通过 API 或网络免费下载(如 FRED、World Bank)
> - **[ ] B. 商业数据库**:需机构订阅账号(如 CSMAR、WIND、Compustat、WRDS)
> - **[ ] C. 自备数据**:已持有文件(调查数据、行政数据、手工整理数据等)
>
> 请告知哪些数据集走哪个渠道,或直接说明来源情况。"*
收到用户回复后:
1. 整理所有数据来源任务清单(每个数据集 → 对应分支标记)
2. 若涉及多个来源,告知用户处理顺序:**先 A(自动获取),再 B(引导下载),再 C(读取自备)**,最后在 Step 2 统一合并
3. 逐一执行各分支,每个分支完成后记录至 `data/raw/data_log.md`
---
### 分支 A:公开数据——API 自动获取
**判断条件:** 数据集可通过公开 API 或网络直接下载,无付费墙。
常见公开数据源包括但不限于:
| 数据源 | 类型 | 获取方式 |
|--------|------|---------|
| FRED | 美国宏观时间序列 | `fredapi` Python 包 |
| World Bank | 跨国发展指标 | `wbdata` Python 包 |
| OECD | 跨国统计 | OECD SDMX API |
| IMF | 国际金融统计 | `imf-reader` Python 包 |
| Yahoo Finance | 金融资产价格 | `yfinance` Python 包 |
| 国家统计局 | 中国宏观数据 | NBS API / EPS 数据平台 |
| CEIC | 宏观时间序列 | WebFetch(需检查访问权限) |
**执行流程:**
1. 调用 `data-pipeline` skill,传入目标数据集名称、变量清单、时间范围
2. 由 `data-pipeline` skill 负责 API Key 检查、代码生成与执行
3. 原始数据保存至 `data/raw/[source_name]_[YYYYMMDD].[csv|dta]`
4. 在 `data/raw/data_log.md` 中追加本次获取记录:
```markdown
## [数据集名称] 获取记录(分支 A)
- 获取日期:[YYYY-MM-DD]
- 数据版本/vintage:[如适用]
- 来源 URL / API endpoint:[...]
- 原始文件路径:data/raw/[文件名]
- 变量数:[K],观测数:[N]
- 合并键(用于多源合并):[如 country_code + year]
```
5. 报告数据维度,继续执行下一个待处理来源(若有),否则进入 Step 1.1。
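The raw-file naming pattern in step 3, `data/raw/[source_name]_[YYYYMMDD].[csv|dta]`, can be built with a small helper. A minimal sketch; the function name `raw_path` and its parameters are hypothetical and not part of the `data-pipeline` skill.

```python
from datetime import date
from pathlib import Path


def raw_path(source_name, ext="csv", on=None):
    """Build data/raw/[source_name]_[YYYYMMDD].[ext] for a download date."""
    on = on or date.today()  # default to today's date when none is given
    return Path("data/raw") / f"{source_name}_{on:%Y%m%d}.{ext}"


example = raw_path("fred", "csv", date(2024, 1, 5))
print(example.as_posix())
```

Stamping the date into the file name keeps successive vintages of the same series side by side instead of overwriting them.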
---
### 分支 B:商业数据库——指引用户自行获取
**判断条件:** 数据集需付费订阅或机构账号,存在付费墙,无法自动下载。
常见需要机构账号的数据库:
| 数据库 | 主要内容 | 典型用途 |
|--------|---------|---------|
| CSMAR(国泰安) | 中国上市公司财务、股票 | 企业金融、公司治理 |
| WIND(万得) | 中国金融市场全品类 | 宏观、债券、股票 |
| CNRDS | 中国研究数据服务平台 | 社会科学微观数据 |
| RESSET | 中国金融数据 | 资产定价 |
| 中国工业企业数据库 | 规模以上工业企业 | 产业经济、贸易 |
| 中国海关数据库 | 进出口交易记录 | 国际贸易 |
| IPUMS | 美国及国际人口普查微观数据 | 劳动经济、人口 |
| Compustat | 全球上市公司财务 | 企业金融 |
| WRDS 系列 | 金融研究综合平台 | 资产定价、金融 |
**执行流程:**
向用户给出具体操作指引:
> *"**[数据库名称]** 需要机构订阅账号,无法自动下载。请按以下步骤获取数据:*
>
> **① 登录数据库**
> [数据库官网或机构访问入口 URL]
>
> **② 下载以下变量**(根据 `identification-memo.md` 的数据需求规格书):
> - 结果变量:[Y 的精确变量名/字段名]
> - 处理变量:[D 的精确变量名/字段名]
> - 识别变量:[Z 的精确变量名/字段名]
> - 控制变量:[协变量列表]
> - 时间范围:[起止年份]
> - 地理范围:[省/行业/国家等]
> - **合并键**:[用于与其他数据集合并的标识变量,如股票代码 stkcd + 年份 year]
>
> **③ 保存文件至 `data/raw/`**:
> ```
> [workspace]/data/raw/[your_filename].[dta|csv|xlsx]
> ```
>
> **④ 完成后告知我**,我将继续处理其余数据来源。"*
等待用户确认数据已放入 `data/raw/` 后,在 `data_log.md` 中追加记录,继续执行下一个待处理来源(若有),否则进入 Step 1.1。
---
### 分支 C:用户自备数据——直接读取
**判断条件:** 用户已持有数据文件(调查数据、行政数据、手工整理数据等),无需获取。
**执行流程:**
提示用户将数据放入指定位置:
> *"请将您的数据文件放入工作目录的 `data/raw/` 文件夹:*
> ```
> [workspace]/data/raw/[your_filename].[dta|csv|xlsx|parquet]
> ```
> *支持格式:`.dta`(Stata)、`.csv`、`.xlsx`、`.parquet`。放好后告知我文件名,我将自动读取。"*
用户确认文件已就位后:
1. 用 `Read` 或 `Bash` 读取文件,报告基本维度(N × K)和变量列表
2. 对照 `identification-memo.md` 中的变量清单,检查**关键变量是否齐全**:
```
变量到位检查
──────────────────────────────────────
✅ 结果变量 Y:[变量名] — 找到
✅ 处理变量 D:[变量名] — 找到
⚠️ 识别变量 Z:[变量名] — 未找到,需在清洗阶段从 [变量X] 构建
✅ 协变量:[列表] — 找到 [N/M] 个
──────────────────────────────────────
```
3. 在 `data_log.md` 中追加记录,继续执行下一个待处理来源(若有),否则进入 Step 1.1。
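The variable check in step 2 amounts to comparing the file's column names with the variable list from `identification-memo.md`. A minimal sketch under stated assumptions: the CSV content, the variable names (`roa`, `treat`, `distance`) and both helper functions are invented for illustration.

```python
import csv
import io


def read_header(handle):
    """Return the column names from the first row of a CSV file object."""
    return next(csv.reader(handle))


def check_variables(columns, required):
    """Map each role (Y, D, Z, ...) to (variable name, found?)."""
    available = set(columns)
    return {role: (name, name in available) for role, name in required.items()}


# In-memory stand-in for a file under data/raw/ (hypothetical content).
sample = io.StringIO("stkcd,year,roa,treat\n000001,2015,0.03,1\n")
report = check_variables(
    read_header(sample), {"Y": "roa", "D": "treat", "Z": "distance"}
)
for role, (name, found) in report.items():
    print(f"{'found  ' if found else 'MISSING'} {role}: {name}")
```

A missing identifying variable is reported rather than treated as an error, since it may still be constructed during cleaning.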
---
### 1.1:多源数据合并规划(当存在两个及以上数据来源时触发)
所有数据来源处理完毕后,在进入 Step 2 前,明确合并方案:
> *"所有数据来源已就位,共 [N] 个数据文件:*
> *- `[文件1]`:[来源类型],[N1] 行,合并键:[key1]*
> *- `[文件2]`:[来源类型],[N2] 行,合并键:[key2]*
> *...*
>
> *计划合并方案:*
> *1. 以 `[主数据文件]` 为主体(left join / inner join)*
> *2. 依次合并 `[辅助数据1]`(合并键:[key],预期匹配率:[估计])*
> *3. 依次合并 `[辅助数据2]`(合并键:[key],预期匹配率:[估计])*
>
> *请确认合并逻辑,或告知是否需要调整。"*
用户确认后进入 Step 2,在 `data-pipeline` skill 中执行实际合并操作。
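To make the "expected match rate" in the merge plan concrete, the sketch below performs a left join on composite keys and reports the share of main-file rows that found a partner. It is an illustration only: `left_join` is a hypothetical helper, it assumes the keys are unique in the auxiliary file, and the actual merge is delegated to the `data-pipeline` skill.

```python
def left_join(main, aux, keys):
    """Left-join two lists of dict rows on `keys`; return (rows, match rate)."""
    # Index the auxiliary rows by their key tuple (assumes keys are unique).
    index = {tuple(row[k] for k in keys): row for row in aux}
    merged, matched = [], 0
    for row in main:
        hit = index.get(tuple(row[k] for k in keys))
        matched += hit is not None
        # Main-file values win if both files carry the same column.
        merged.append({**(hit or {}), **row})
    rate = matched / len(main) if main else 0.0
    return merged, rate


main = [{"id": 1, "year": 2020, "y": 5.0}, {"id": 2, "year": 2020, "y": 7.0}]
aux = [{"id": 1, "year": 2020, "gdp": 3.2}]
merged, rate = left_join(main, aux, ["id", "year"])
print(f"match rate: {rate:.0%}")
```

A match rate far below the planned estimate usually points to inconsistent key formats (for example, zero-padded versus numeric codes) rather than truly missing units.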
---
## Step 2: 数据清洗
调用 `data-pipeline` skill。传入以下上下文,确保清洗不是通用操作而是面向具体识别策略:
- 识别策略类型(来自 Step 0)
- 结果变量、处理变量、识别变量的名称(来自 Step 0)
- 面板结构(个体变量名、时间变量名)
`data-pipeline` skill 负责执行:
**通用清洗**
- 重复观测检测与处理
- 缺失值编码(识别 -99、-88 等缺失码)与处理方式决策
- 变量类型修正与命名规范化(snake_case)
- 异常值检测(1st/99th 百分位 Winsorize,需经济学判断)
- 变量标签与值标签
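Two of the generic cleaning steps, recoding missing-value codes and winsorizing at the 1st/99th percentiles, can be sketched as follows. The helper names are hypothetical, and the nearest-rank percentile used here is one of several conventions; statistical packages may interpolate differently.

```python
def recode_missing(values, codes=(-99, -88)):
    """Replace sentinel missing codes with None."""
    return [None if v in codes else v for v in values]


def winsorize(values, lower=0.01, upper=0.99):
    """Clamp observed values to the nearest-rank lower/upper percentiles."""
    observed = sorted(v for v in values if v is not None)
    if not observed:
        return list(values)
    lo = observed[round(lower * (len(observed) - 1))]
    hi = observed[round(upper * (len(observed) - 1))]
    # Missing values stay missing; everything else is clamped to [lo, hi].
    return [v if v is None else min(max(v, lo), hi) for v in values]


raw = [-99] + list(range(1, 102))  # one missing code plus the values 1..101
cleaned = winsorize(recode_missing(raw))
print(cleaned[:4], cleaned[-3:])
```

Recoding must come before winsorizing; otherwise a code such as -99 would be treated as a genuine extreme value and shift the lower bound.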
**识别策略专属变量构建**(根据 Step 0 的策略类型自动触发):
| 识别策略 | 必须构建的变量 |
|--------|--------------|
| DiD | `treated`(处理组虚拟变量)、`post`(政策后虚拟变量)、`treated_post`(交互项)、`event_time`(相对处理时间) |
| RDD | `running_var_centered`(以阈值中心化的分配变量)、`above_cutoff`(阈值上方虚拟变量)、带宽内样本筛选标记 |
| IV | 工具变量 Z(若需从原始变量构建,如地理距离、历史数据匹配) |
| Panel FE | 确认 `id` 和 `time` 变量唯一标识,生成平衡性标记 |
| 合成控制 | 预处理期与后处理期分组标记,结果变量的长格式重塑 |
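For the DiD row of the table, the four variables can be built as below. This sketch assumes a single common policy year (not a staggered design), and the function name and arguments are hypothetical.

```python
def add_did_vars(rows, treated_ids, policy_year):
    """Add treated, post, treated_post and event_time to each row."""
    out = []
    for row in rows:
        treated = int(row["id"] in treated_ids)
        post = int(row["year"] >= policy_year)
        out.append({
            **row,
            "treated": treated,
            "post": post,
            "treated_post": treated * post,            # interaction term
            "event_time": row["year"] - policy_year,   # time relative to policy
        })
    return out


rows = [{"id": "A", "year": 2019}, {"id": "A", "year": 2021}, {"id": "B", "year": 2021}]
panel = add_did_vars(rows, treated_ids={"A"}, policy_year=2020)
for r in panel:
    print(r)
```

With staggered adoption, `post` and `event_time` would instead be defined relative to each unit's own first treatment year.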
**面板结构验证**(若适用):
```stata
* 强/弱平衡性检查
xtset id year
xtdescribe
* 唯一标识符验证
isid id year
* 时间维度连续性检查
tab year
```
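When Stata is not available, the same three checks (unique identifier, balance, continuity of the time dimension) can be approximated in plain Python. A rough sketch: `panel_summary` is a hypothetical helper and it assumes integer-valued periods.

```python
from collections import Counter


def panel_summary(rows, id_var="id", time_var="year"):
    """Report duplicate id-time pairs, balance, and gaps in the time range."""
    pairs = Counter((r[id_var], r[time_var]) for r in rows)
    periods = sorted({t for _, t in pairs})
    obs_per_unit = Counter(i for i, _ in pairs)  # distinct periods per unit
    return {
        "duplicates": sorted(p for p, n in pairs.items() if n > 1),
        "balanced": all(n == len(periods) for n in obs_per_unit.values()),
        "missing_periods": [
            t for t in range(periods[0], periods[-1] + 1) if t not in periods
        ],
    }


rows = [
    {"id": "A", "year": 2019}, {"id": "A", "year": 2020}, {"id": "A", "year": 2022},
    {"id": "B", "year": 2019}, {"id": "B", "year": 2020}, {"id": "B", "year": 2020},
]
summary = panel_summary(rows)
print(summary)
```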
清洗完成后保存至:`data/clean/[project_name]_clean.[dta|parquet]`
生成清洗日志 `data/clean/cleaning_log.md`,记录每项转换操作、处理前后的样本量变化、关键变量的缺失率变化。
---
## Step 3: 探索性分析
调用 `results-analysis` skill,生成以下统计输出。所有表格同时存储为 `.tex`(用于论文)和 `.csv`(用于核查):
**Table 1 — 描述性统计**
全样本:N、均值、标准差、p25、中位数、p75、min、max。必须覆盖 Y、D、Z 及主要协变量。
保存至:`tables/table1_descriptive.tex` + `.csv`
**Table 2 — 处理组/控制组平衡性检验**(若存在处理变量)
处理组均值、控制组均值、差值、标准误、p 值。重点检验**预处理期协变量**的组间平衡性,不平衡的变量需在 Step 2 中重新检查。
保存至:`tables/table2_balance.tex` + `.csv`
**识别变量分布检查**(根据识别策略自动触发):
| 识别策略 | 检查内容 |
|--------|---------|
| DiD | 处理时点前后的结果变量趋势图(初步目测平行趋势) |
| RDD | 分配变量在阈值附近的分布直方图;McCrary 密度检验的初步目视 |
| IV | 工具变量与处理变量的散点图及初步相关系数 |
| Panel FE | 个体内(within)与个体间(between)方差分解 |
保存至:`figures/eda_[strategy_name].png`
**缺失值报告**
缺失率 > 5% 的变量列表,标注 MCAR / MAR / MNAR 的初步判断依据。
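The list of variables above the 5% threshold can be produced with a few lines. A minimal sketch with a hypothetical helper name; it only computes missing rates, and the MCAR / MAR / MNAR judgment still has to be made by the analyst.

```python
def missing_report(rows, threshold=0.05):
    """Return {variable: missing rate} for variables above the threshold."""
    names = {key for row in rows for key in row}
    rates = {
        name: sum(row.get(name) is None for row in rows) / len(rows)
        for name in names
    }
    return {name: rate for name, rate in sorted(rates.items()) if rate > threshold}


rows = [
    {"x": 1, "y": 2},
    {"x": None, "y": 3},
    {"x": 4, "y": 5},
    {"x": 6, "y": 7},
]
flagged = missing_report(rows)
print(flagged)
```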
---
## Step 4: 产出 `data-report.md`
整合以上三个步骤,生成阶段交付文档,保存至工作目录:
```
Write: [workspace]/data-report.md
```
**文档结构:**
```markdown
# Data Report
**项目:** [研究问题一句话]
**版本:** v1.0
**日期:** [YYYY-MM-DD]
---
## 1. 数据来源与获取
- **数据集:** [名称]
- **获取方式:** [API 自动 / 商业数据库 / 用户自备]
- **获取日期:** [日期](如适用)
- **原始数据路径:** `data/raw/[文件名]`
- **已清洗数据路径:** `data/clean/[文件名]`
## 2. 样本描述
- **分析单元:** [个体/企业/县/省…]
- **时间跨度:** [起止年份]
- **原始样本量:** [N]
- **清洗后样本量:** [N](删减原因:[缺失值/异常值/样本限制])
- **面板结构:** [强/弱平衡;个体数 × 时间期数](如适用)
## 3. 关键变量状态
| 变量 | 角色 | 变量名 | 缺失率 | 均值 | 标准差 | 备注 |
|------|------|--------|--------|------|--------|------|
| [Y] | 结果变量 | ... | ...% | ... | ... | ... |
| [D] | 处理变量 | ... | ...% | ... | ... | ... |
| [Z] | 识别变量 | ... | ...% | ... | ... | ... |
## 4. 数据质量问题与处理方式
[逐条列出发现的数据问题及处理决策,说明经济学依据]
## 5. 探索性分析关键发现
[2–4条实质性发现,如:处理组与控制组在预处理期特征基本平衡;工具变量与处理变量相关系数为 0.XX;分配变量在阈值处分布无明显跳跃]
## 6. 待关注事项
[进入 Phase 5 前需注意的数据层面风险,如:某协变量平衡性较差,建议在主回归中加入;工具变量与 Y 的原始相关性偏弱,需关注一阶段 F 统计量]
```
---
## Step 5: 阶段确认与移交
向用户呈现阶段摘要:
> **Phase 4 产出摘要**
>
> ✅ 原始数据:`data/raw/[文件名]`([N] 行 × [K] 列)
> ✅ 清洗数据:`data/clean/[文件名]`([N'] 行 × [K'] 列)
> ✅ 描述性统计:`tables/table1_descriptive.tex`
> ✅ 平衡性检验:`tables/table2_balance.tex`(如适用)
> ✅ EDA 图:`figures/eda_[strategy].png`
> ✅ 数据报告:`data-report.md`
>
> ---
>
> 待您确认数据质量后,进入下一阶段 Phase 5 计量模型构建
>
---
## 常见问题处理
**Q:`identification-memo.md` 中列出的某个变量在数据中不存在怎么办?**
在 Step 1 分支 C 的变量检查环节标注 ⚠️,并提出两个选项:
1. **构建代理变量**:在 Step 2 中用现有变量合成近似指标,需说明测量误差方向
2. **降格识别层级**:返回 Phase 3,在 `identification-memo.md` 中更新策略,再重新执行 `/data`
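**Q: The dataset is very large (more than 1 million rows) and the cleaning code times out. What should I do?**
In Step 2, generate chunked-processing code, or suggest that the user run the cleaning locally in Stata/Python and upload the cleaned result.
model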
Phase 5 Econometric Model Construction. Reads identification-memo.md and data-report.md, writes formal model specification with LaTeX equations, discusses identification assumptions and SE strategy, calls the appropriate estimation skill, and produces model-spec.md.
# /model — 计量模型构建
## 定位
`/model` 是实证研究工作流的**第五阶段**,承接 Phase 3(`/analyze`)的 `identification-memo.md` 和 Phase 4(`/data`)的 `data-report.md`,完成:
1. 根据识别策略**形式化写出计量模型**(含完整 LaTeX 公式)
2. 明确**识别假设的可检验形式**及其在数据中的诊断状态
3. 确定**标准误策略**(聚类层级、异方差处理方式等)
4. 调用对应的**估计技能**执行回归
5. 产出 `model-spec.md`,供 Phase 6(代码执行)和 Phase 9(论文写作)使用
---
## Step 0:读取上游输出
> 📎 **参见 [`shared/context-reader.md`](../shared/context-reader.md)**
> 本阶段所需文件:`identification-memo.md`(必需)、`data-report.md`(可选)。
读取完成后提取并整理以下关键信息:
| 来源 | 提取内容 |
|------|---------|
| `identification-memo.md` | 识别策略类型、结果变量 Y、处理变量 D、识别变量 Z、控制变量列表、面板结构(id/time)、目标参数(ATE/ATT/LATE) |
| `data-report.md` | 样本量 N、面板维度(个体数 × 时期数)、关键变量缺失率、EDA 预警(平衡性不足、弱工具变量等)、平衡性检验结果 |
提取完成后向用户确认:
> **Phase 5 启动确认**
>
> 识别策略:**[策略名称]**
> 目标参数:**[ATE / ATT / LATE / 平均处理效应...]**
> 结果变量 Y:`[变量名]`
> 处理变量 D:`[变量名]`
> 识别变量 Z / 运行变量:`[变量名(如适用)]`
> 样本:[N] 观测,[个体数] 个个体 × [时期数] 期
>
> 继续构建模型,或告知是否需要调整。
---
## Step 1:模型类型与规格讨论
在形式化写出方程前,与用户确认以下**建模决策**。每个决策点需给出推荐选项及经济学理由:
### 1.1 主方程类型
| 识别策略 | 推荐主方程类型 | 备选 |
|---------|--------------|------|
| DiD(2×2 或少期) | 双向固定效应 TWFE | 加权 DiD、有控制变量的 OLS |
| DiD(交错、多期) | Callaway–Sant'Anna / Sun–Abraham | TWFE(需诊断异质性处理效应偏误) |
| RDD(精确服从) | Sharp RDD,局部线性 | 高阶多项式(通常不推荐) |
| RDD(模糊服从) | Fuzzy RDD(2SLS at cutoff) | — |
| IV | 2SLS(线性第一阶段) | LIML(弱工具变量时稳健) |
| 面板固定效应 | TWFE(个体 + 时间) | 随机效应(需 Hausman 检验) |
| 合成控制 | 加权合成控制(Abadie 2003) | 合成 DiD(Arkhangelsky 2021) |
| 截面 OLS | OLS + 稳健 SE | — |
### 1.2 控制变量决策
参考 `identification-memo.md` 的协变量列表和 `data-report.md` 的平衡性检验结果:
- 平衡性检验标准化差异 > 0.25 的变量:**必须列为重点控制变量**
- 理论上影响 Y 但与 D 无关的变量:加入可提升精度(降低残差方差),建议加入
- 与 D 相关且位于处理路径上的中介变量:**不得控制**(会引入中介偏误)
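The 0.25 cutoff in the first bullet refers to the standardized difference in means. One common definition divides the difference in group means by the square root of the average of the two group variances; the sketch below uses that definition, which is an assumption about what the balance table reports.

```python
from statistics import mean, variance


def standardized_difference(treated, control):
    """(mean_T - mean_C) / sqrt((var_T + var_C) / 2), using sample variances."""
    pooled_sd = ((variance(treated) + variance(control)) / 2) ** 0.5
    return (mean(treated) - mean(control)) / pooled_sd


smd = standardized_difference([2, 4, 6], [1, 3, 5])
print(f"standardized difference = {smd:.2f}, must control: {abs(smd) > 0.25}")
```

Unlike a t statistic, this measure does not grow mechanically with the sample size, which is why it is often preferred for judging balance.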
### 1.3 固定效应层级
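Explain to the user what each level of fixed effects controls for:
| Fixed effect | What it controls for | Cost |
|---------|---------|------|
| Unit FE ($\alpha_i$) | Time-invariant omitted variables at the unit level | Absorbs all cross-sectional variation between units |
| Time FE ($\lambda_t$) | Common time trends (business cycle, inflation, etc.) | Absorbs all shocks common to a period |
| Group × time FE (e.g., industry × year) | Finer, group-specific trends | Sharply reduces degrees of freedom |
| Province × year FE | Time-varying confounders at the province-year level | Requires enough variation in the treatment within province-year cells |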
---
## Step 2:形式化模型写作(LaTeX)
根据 Step 1 确认的策略类型,**输出完整的 LaTeX 模型规格**。每个模型必须包含:
① 主方程
② 下标与符号说明
③ 关键假设的形式化表述
④ 目标参数的经济学解释
⑤ 识别条件
---
### 模型 A:双向固定效应 DiD(TWFE)
**适用**:二元处理、平行趋势假设成立、处理效应同质或近似同质。
```latex
% ── 主方程 ──────────────────────────────────────────
\begin{equation}\label{eq:twfe}
Y_{it} = \alpha_i + \lambda_t + \beta \, D_{it}
+ \mathbf{X}_{it}'\boldsymbol{\gamma} + \varepsilon_{it}
\end{equation}
% ── 符号说明 ─────────────────────────────────────────
% Y_{it} 结果变量(个体 i,时期 t)
% \alpha_i 个体固定效应(吸收不随时间变化的遗漏变量)
% \lambda_t 时间固定效应(吸收共同时间趋势)
% D_{it} 处理变量:= 1 若个体 i 在 t 期已受处理,否则 = 0
% \mathbf{X}_{it} 时变控制变量向量
% \varepsilon_{it} 误差项,假设 E[\varepsilon_{it} \mid \alpha_i, \lambda_t, D_{it}, \mathbf{X}_{it}] = 0
% ── 目标参数 ─────────────────────────────────────────
% \hat{\beta} 估计处理组的平均处理效应(ATT)
```
**平行趋势假设(Parallel Trends Assumption)的形式化表述:**
```latex
\begin{assumption}[Parallel Trends]
E[Y_{it}(0) - Y_{it'}(0) \mid D_i = 1]
= E[Y_{it}(0) - Y_{it'}(0) \mid D_i = 0],
\quad \forall\, t \neq t'
\end{assumption}
% 含义:在反事实意义上,若处理组未受处理,
% 其结果变量的时间趋势与控制组相同。
```
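In the simplest balanced 2×2 case without covariates, the TWFE coefficient $\hat{\beta}$ in the main equation equals the difference of the two before/after mean differences. The sketch below computes that quantity on made-up numbers; `did_2x2` and the data are purely illustrative.

```python
from statistics import mean


def did_2x2(rows):
    """(treated post - treated pre) - (control post - control pre)."""
    def cell(treated, post):
        return mean(
            r["y"] for r in rows if r["treated"] == treated and r["post"] == post
        )
    return (cell(1, 1) - cell(1, 0)) - (cell(0, 1) - cell(0, 0))


# Illustrative outcomes only.
rows = (
    [{"treated": 1, "post": 0, "y": y} for y in (10, 12)]
    + [{"treated": 1, "post": 1, "y": y} for y in (15, 17)]
    + [{"treated": 0, "post": 0, "y": y} for y in (8, 10)]
    + [{"treated": 0, "post": 1, "y": y} for y in (10, 12)]
)
beta = did_2x2(rows)
print(beta)
```

The control group's change (here 2) is the estimate of the counterfactual trend that parallel trends attributes to the treated group.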
---
### 模型 A':事件研究(Event Study)
**适用**:检验平行趋势假设;展示处理效应的动态路径(预期效应、短期冲击、长期效应)。
```latex
% ── 事件研究方程 ─────────────────────────────────────
\begin{equation}\label{eq:eventstudy}
Y_{it} = \alpha_i + \lambda_t
+ \sum_{\substack{k = k_{\min} \\ k \neq -1}}^{k_{\max}}
\beta_k \cdot \mathbf{1}[\text{EventTime}_{it} = k]
+ \mathbf{X}_{it}'\boldsymbol{\gamma} + \varepsilon_{it}
\end{equation}
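% EventTime_{it} = t - T_i^*: event time relative to T_i^*, the period in which unit i is first treated
% k = -1 (the last pre-treatment period) is the reference period; its coefficient is normalized to zero
% Coefficients with k < -1: pre-trend test (should be jointly insignificant)
% Coefficients with k \geq 0: dynamic path of the treatment effect
%
% Testable form of parallel trends:
% H_0: \beta_k = 0 \text{ for all } k \leq -2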
```
---
### 模型 B:Sharp RDD(精确断点回归)
**适用**:处理分配完全由连续型分配变量(Running Variable)是否超过阈值 $c$ 决定。
```latex
% ── Sharp RDD 主方程 ─────────────────────────────────
\begin{equation}\label{eq:rdd_sharp}
Y_i = \tau \cdot D_i + f(X_i - c) + \varepsilon_i
\end{equation}
% 其中:
% D_i = \mathbf{1}[X_i \geq c] \quad \text{(处理变量,精确服从)}
% f(\cdot) \quad \text{(分配变量的灵活函数,通常为局部线性)}
% c \quad \text{(阈值,对应 X_i - c = 0)}
%
% 局部线性规格(推荐):
\begin{equation}\label{eq:rdd_ll}
Y_i = \alpha_0 + \tau \cdot D_i
+ \beta_1 (X_i - c)
+ \beta_2 (X_i - c) \cdot D_i
+ \varepsilon_i, \quad |X_i - c| \leq h
\end{equation}
% 带宽 h 由 \texttt{rdbwselect}(MSE 最优或 CER 最优)选择
% 参数 \tau 估计阈值处的局部平均处理效应(LATE at cutoff)
% ── 连续性假设 ────────────────────────────────────────
\begin{assumption}[Continuity at Cutoff]
E[Y_i(0) \mid X_i = x] \text{ 和 } E[Y_i(1) \mid X_i = x]
\text{ 在 } x = c \text{ 处连续}
\end{assumption}
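% Not directly testable. Indirect evidence: no jump in the density of the running variable at the cutoff (McCrary test) and balanced predetermined covariates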
```
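With a uniform kernel and a fixed bandwidth $h$, the local linear specification above is equivalent to fitting a separate straight line on each side of the cutoff and taking the difference in intercepts. The sketch below does exactly that on noiseless illustrative data. It is not a substitute for `rdrobust`: there is no data-driven bandwidth and no bias correction, and all names are hypothetical.

```python
def ols_line(x, y):
    """Simple OLS of y on x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def sharp_rdd(x, y, cutoff, h):
    """Difference in intercepts at the cutoff within bandwidth h."""
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff - h <= xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if cutoff <= xi <= cutoff + h]
    a_left, _ = ols_line(*zip(*left))
    a_right, _ = ols_line(*zip(*right))
    return a_right - a_left


# Illustrative data: y = 1 + 0.5 x, plus a jump of 2 at x >= 0.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [1 + 0.5 * xi + (2 if xi >= 0 else 0) for xi in x]
tau = sharp_rdd(x, y, cutoff=0, h=2)
print(tau)
```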
---
### 模型 B':Fuzzy RDD(模糊断点回归)
**适用**:超过阈值显著提高处理概率,但非完全服从(部分个体跨越阈值但未受处理,或未跨越阈值但受处理)。
```latex
% ── 两阶段规格 ───────────────────────────────────────
% 第一阶段:
\begin{equation}\label{eq:rdd_fuzzy_fs}
D_i = \pi_0 + \pi_1 \cdot \mathbf{1}[X_i \geq c]
+ g(X_i - c) + \nu_i, \quad |X_i - c| \leq h
\end{equation}
% 第二阶段(2SLS):
\begin{equation}\label{eq:rdd_fuzzy_ss}
Y_i = \alpha + \tau_{\text{FRD}} \cdot \hat{D}_i
+ f(X_i - c) + \varepsilon_i, \quad |X_i - c| \leq h
\end{equation}
% \tau_{\text{FRD}} = \frac{\lim_{x \to c^+} E[Y_i \mid X_i=x] - \lim_{x \to c^-} E[Y_i \mid X_i=x]}
% {\lim_{x \to c^+} E[D_i \mid X_i=x] - \lim_{x \to c^-} E[D_i \mid X_i=x]}
% 估计 Compliers 的局部平均处理效应(LATE)
```
---
### 模型 C:工具变量(2SLS)
**适用**:处理变量 D 内生,存在有效工具变量 Z(相关性 + 排他性限制)。
```latex
% ── 第一阶段 ─────────────────────────────────────────
\begin{equation}\label{eq:iv_first}
D_i = \pi_0 + \pi_1 Z_i + \mathbf{X}_i'\boldsymbol{\delta} + \nu_i
\end{equation}
% ── 第二阶段 ─────────────────────────────────────────
\begin{equation}\label{eq:iv_second}
Y_i = \beta_0 + \beta_1 \hat{D}_i + \mathbf{X}_i'\boldsymbol{\gamma} + \varepsilon_i
\end{equation}
% \hat{D}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i + \mathbf{X}_i'\hat{\boldsymbol{\delta}}
% \quad \text{(第一阶段拟合值)}
%
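% Three conditions for IV validity:
%   1. Relevance: \pi_1 \neq 0; empirical check: F_{\text{first stage}} > 10
%   2. Exclusion: Z_i \perp \varepsilon_i \mid \mathbf{X}_i
%      (Z affects Y only through D; not directly testable, needs a theoretical argument)
%   3. Monotonicity: D_i(1) \geq D_i(0) \text{ for all } i, where D_i(z) is the potential treatment under Z_i = z
%      (no "defiers"; required for the LATE interpretation)
%
% \beta_1 \text{ estimates the LATE (local average treatment effect) for compliers}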
% ── 简约式(Reduced Form) ────────────────────────────
\begin{equation}\label{eq:iv_rf}
Y_i = \rho_0 + \rho_1 Z_i + \mathbf{X}_i'\boldsymbol{\phi} + \eta_i
\end{equation}
% \hat{\beta}_1^{\text{IV}} = \hat{\rho}_1 / \hat{\pi}_1
% 简约式应在论文中单独汇报,供读者核实
```
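The identity $\hat{\beta}_1^{\text{IV}} = \hat{\rho}_1 / \hat{\pi}_1$ can be checked numerically in the just-identified case with one instrument and no covariates. The sketch below uses made-up data in which the true effect of $D$ on $Y$ is 2 and the disturbance is uncorrelated with $Z$; function names are hypothetical.

```python
def cross(a, b):
    """Sum of cross-products of deviations from the mean."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))


def iv_wald(z, d, y):
    """Return (beta_iv, first-stage slope, reduced-form slope)."""
    pi1 = cross(z, d) / cross(z, z)    # first stage: D on Z
    rho1 = cross(z, y) / cross(z, z)   # reduced form: Y on Z
    return rho1 / pi1, pi1, rho1


# Illustrative data: y = 2 * d + e, with e = (+1, -1, +1, -1) uncorrelated with z.
z = [0, 0, 1, 1]
d = [1, 3, 4, 6]
y = [3, 5, 9, 11]
beta_iv, pi1, rho1 = iv_wald(z, d, y)
print(beta_iv, pi1, rho1)
```

Reporting `pi1` and `rho1` alongside `beta_iv` lets a reader see at once whether a large IV estimate comes from a large reduced form or from a small first stage.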
---
### 模型 D:面板固定效应(Panel FE)
**适用**:处理变量在个体内存在时变,个体固定效应足以控制截面异质性,无显著内生性(或已排除)。
```latex
% ── TWFE 主方程 ──────────────────────────────────────
\begin{equation}\label{eq:pfe}
Y_{it} = \alpha_i + \lambda_t
+ \boldsymbol{\beta}' \mathbf{D}_{it}
+ \mathbf{X}_{it}'\boldsymbol{\gamma}
+ \varepsilon_{it}
\end{equation}
% \alpha_i:个体固定效应(消除个体层面遗漏变量偏误)
% \lambda_t:时间固定效应(控制共同时间趋势)
% \mathbf{D}_{it}:关键解释变量向量(可为多个政策变量)
% \mathbf{X}_{it}:时变控制变量
%
% 外生性假设(严格外生性):
\begin{assumption}[Strict Exogeneity]
E[\varepsilon_{it} \mid \mathbf{D}_{i1}, \ldots, \mathbf{D}_{iT},
\mathbf{X}_{i1}, \ldots, \mathbf{X}_{iT}, \alpha_i] = 0
\end{assumption}
% 若严格外生性不满足(如 D 受滞后 Y 影响),需考虑 Arellano-Bond GMM
% ── Hausman 检验(FE vs RE) ──────────────────────────
% H_0: \text{随机效应模型一致}(\alpha_i \perp \mathbf{D}_{it})
% H_1: \text{固定效应模型必要}(\alpha_i \text{ 与 } \mathbf{D}_{it} 相关)
% 若拒绝 H_0,使用 FE;否则 RE 更有效率
```
---
### 模型 E:合成控制(Synthetic Control)
**适用**:单一处理单元(或少数处理单元)、较长预处理期、处理组与任何单一控制单元匹配不理想。
```latex
% ── 合成控制估计量 ────────────────────────────────────
% 构造控制组合权重 w_j^* 使得合成控制单元与处理单元在预处理期最相似:
\begin{equation}\label{eq:synth_weights}
(w_2^*, \ldots, w_{J+1}^*) = \arg\min_{\mathbf{w}}
\left\| \mathbf{X}_1 - \mathbf{X}_0 \mathbf{w} \right\|_V
\end{equation}
% \mathbf{X}_1:处理单元的预处理期结果和预测变量向量
% \mathbf{X}_0:控制池中各供体单元的对应向量
% \mathbf{V}:对预测变量的重要性加权矩阵(由外层优化确定)
% 约束:w_j \geq 0,\sum_j w_j = 1
% ── 处理效应估计 ──────────────────────────────────────
\begin{equation}\label{eq:synth_effect}
\hat{\alpha}_{1t} = Y_{1t} - \hat{Y}_{1t}^N
= Y_{1t} - \sum_{j=2}^{J+1} w_j^* Y_{jt},
\quad t > T_0
\end{equation}
% Y_{1t} 处理单元在 t 期的实际结果
% \hat{Y}_{1t}^N 合成控制(反事实)结果
% \hat{\alpha}_{1t} t 期的处理效应估计(可展示随时间变化的动态路径)
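% ── Inference (permutation test) ─────────────────────
% Re-run the synthetic control procedure for every donor unit in the pool ("placebos").
% p-value = share of units whose post/pre-treatment MSPE ratio is at least as large as the treated unit's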
```
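The permutation p-value described in the comments can be computed from the placebo gaps as sketched below. The sketch assumes the gaps (actual minus synthetic outcome) are already available for every unit, and it counts the treated unit itself in the ranking, so the smallest attainable p-value is 1/(J+1); both choices are assumptions of this illustration.

```python
def mspe(gaps):
    """Mean squared prediction error of a list of gaps."""
    return sum(g * g for g in gaps) / len(gaps)


def permutation_p(pre_gaps, post_gaps, treated):
    """Share of units whose post/pre MSPE ratio is >= the treated unit's."""
    ratio = {u: mspe(post_gaps[u]) / mspe(pre_gaps[u]) for u in pre_gaps}
    at_least = sum(r >= ratio[treated] for r in ratio.values())
    return at_least / len(ratio)


# Illustrative gaps for one treated unit "T" and three donors.
pre_gaps = {"T": [1, -1], "A": [1, 1], "B": [2, -2], "C": [1, -1]}
post_gaps = {"T": [4, 4], "A": [1, -1], "B": [2, 2], "C": [2, 2]}
p_value = permutation_p(pre_gaps, post_gaps, treated="T")
print(p_value)
```

Using the ratio rather than the post-treatment MSPE alone penalizes placebo units that were poorly fitted before treatment.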
---
## Step 3:标准误策略
根据数据结构和识别策略,明确推荐标准误类型,并给出理由:
| 数据结构 | 推荐 SE 类型 | 关键考量 |
|---------|------------|---------|
| 纯截面,同质方差 | OLS SE | 需通过 Breusch-Pagan 检验 |
| 纯截面,异方差 | HC3 稳健 SE | 几乎始终优于 OLS SE |
| 面板,DiD/TWFE | 双向聚类 SE(个体 + 时间) | 控制个体内序列相关和时间截面相关 |
| 面板,处理变量在组内不变 | 按处理分配单元聚类 | 如政策以省为单位,则按省聚类 |
| RDD,带宽内样本 | 稳健 bias-corrected SE(rdrobust) | 标准 SE 低估 bias;Calonico 等 2014 |
| IV / 2SLS | 与主回归一致的聚类 SE | 两阶段需保持聚类层级一致 |
| 少量聚类(< 30) | Wild Bootstrap(cgmwildboot / boottest) | 聚类数不足时 t 近似失效 |
```latex
% 在论文中标准误说明的规范写法(放在表格 Notes 中):
% "Standard errors clustered at the [province / firm / county] level
% are reported in parentheses. *, **, *** denote significance
% at the 10\%, 5\%, and 1\% levels, respectively."
```
---
## Step 3.5:诊断规格(Diagnostic Plan)
在进入代码执行之前,必须在模型规格阶段明确**要检验什么、为什么要检验、通过标准是什么、失败如何处理**。这些决策属于建模决策,不是执行细节。
诊断规格写入 `model-spec.md §6`,供 Phase 6(`/code`)直接读取执行,无需在代码阶段重新判断。
### 第一层:识别假设诊断规格
根据 Step 1 确认的策略,从以下表格中选取对应行,填入通过 / 失败时的具体数值标准:
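| Strategy | Identifying assumption | Diagnostic test | Pass criterion | Action on failure |
|------|---------|---------|---------|----------|
| DiD / TWFE | Parallel trends | Event-study plot + joint F test of pre-trend coefficients | Joint test p > 0.1 | Return to Phase 3; change the control group or time window |
| Staggered DiD | Homogeneous treatment effects | Bacon decomposition; C-S vs. TWFE coefficient comparison | TWFE and C-S main coefficients differ by < 20% | Switch to the Callaway-Sant'Anna estimator |
| Sharp RDD | Continuity of the running-variable density | McCrary density test (`rddensity`) | p > 0.1 | Return to Phase 3; question the validity of the RDD design |
| Sharp / Fuzzy RDD | Balance of predetermined covariates | RDD on each covariate | p > 0.05 for all covariates | Add covariate controls; re-check the data cleaning steps |
| Fuzzy RDD | Instrument relevance | First-stage F; Anderson-Rubin CI | F > 10; an AR CI that contains 0 means the effect cannot be distinguished from zero | Question whether the cutoff shifts treatment strongly enough |
| IV / 2SLS | Instrument relevance | First-stage F | F > 10 (rule of thumb); the stricter Stock-Yogo 10% maximal-size critical value is 16.38 with one endogenous regressor and one instrument | Find a stronger instrument or switch to LIML |
| IV / 2SLS | Overidentifying restrictions (multiple IVs) | Sargan / Hansen J test | p > 0.05 | Discuss which instrument may violate exclusion |
| Panel FE | FE vs. RE | Hausman test | p < 0.05 → choose FE | If p > 0.05, report both estimators and discuss |
| Panel FE | Strict exogeneity | Wooldridge serial-correlation test | p > 0.05 | Add a lagged dependent variable; consider AB-GMM |
| Synthetic control | Pre-treatment fit | Pre-treatment RMSPE relative to the donor distribution | Treated-unit RMSPE < 2 × donor median | Adjust the donor pool; re-select predictor weights |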
### 第二层:统计假设诊断规格
以下检验对所有策略通用,无论识别策略如何均须执行。**注意:失败不等于终止流程——取决于失败项与 SE 策略的配合情况。**
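| Test | What it checks | Pass criterion | Action on failure | Halts the workflow? |
|------|---------|---------|----------|------------|
| VIF (maximum) | Multicollinearity | VIF < 10 | Drop or combine collinear variables; update the control list in §5 | ⚠️ Case by case |
| Breusch-Pagan | Heteroskedasticity | p > 0.05 | Robust/clustered SE already specified → mark WARN, continue | No (if robust SE are used) |
| Breusch-Godfrey | Serial correlation (time-series data) | p > 0.05 | Add a lagged dependent variable or use Newey-West SE | ⚠️ Case by case |
| Wooldridge | Serial correlation (panel) | p > 0.05 | Two-way clustered SE or AR(1) errors | No (if two-way clustering is used) |
| RESET | Functional-form misspecification | p > 0.05 | Try a log transformation or add quadratic terms; add to the robustness list | No |
| Cook's D (maximum) | Influential observations | < 4/N | List the influential observations; include in Phase 8 robustness | No |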
**关键原则:**
- **识别假设失败(第一层)→ 立即终止,返回对应阶段**:识别失败意味着因果解释无效,后续结果无意义
- **统计假设警告(第二层)→ 记录于 `model-spec.md §6`,继续**:统计问题通常可通过 SE 策略或稳健性检验应对
---
## Step 4:调用估计技能
根据 Step 1 确认的策略类型,调用对应的专属估计 skill:
| 策略 | 调用的 skill | 传入上下文 |
|------|------------|----------|
| DiD / TWFE / Event Study | `did-analysis` | 主方程规格、FE 层级、SE 聚类层级、事件窗口 |
| Sharp / Fuzzy RDD | `rdd-analysis` | 运行变量名、阈值值、带宽选择方法、局部多项式阶次 |
| 2SLS / IV | `iv-estimation` | 工具变量名、第一/第二阶段规格、SE 策略 |
| Panel FE | `panel-data` | FE 层级、Hausman 检验、SE 策略 |
| 合成控制 | `synthetic-control` | 供体池列表、预处理期、预测变量权重矩阵 |
| OLS + FE | `ols-regression` | 主方程规格、控制变量、SE 类型 |
调用格式:
> *"现在调用 `[skill 名称]`,执行 [识别策略名称] 估计。传入以下参数:*
> *主方程:[LaTeX 公式中文概述]*
> *标准误:[聚类层级]*
> *数据路径:`data/clean/[project_name]_clean.[dta|parquet]`"*
---
## Step 5:产出 `model-spec.md`
估计 skill 完成后,将本阶段所有决策整合为阶段交付文档:
```
Write: [workspace]/model-spec.md
```
**文档结构:**
```markdown
# Model Specification
**项目:** [研究问题一句话]
**版本:** v1.0
**日期:** [YYYY-MM-DD]
---
## 1. 识别策略
- **策略类型:** [DiD / RDD / IV / Panel FE / Synthetic Control]
- **目标参数:** [ATE / ATT / LATE,含经济学含义]
- **识别变量:** [Z / 运行变量 / 政策时点]
## 2. 主方程
[LaTeX 方程,直接可粘贴至论文]
$$
[主方程 LaTeX]
$$
**参数说明:**
| 符号 | 含义 |
|------|------|
| $Y_{it}$ | [结果变量描述] |
| $\beta$ | [处理效应的经济学解释,含量级] |
| ... | ... |
## 3. 识别假设
| 假设 | 形式化表述 | 诊断状态 |
|------|-----------|---------|
| [假设名] | $[LaTeX]$ | ✅ 通过 / ⚠️ 存疑 / ❌ 不满足 |
诊断状态来源于 `data-report.md` 的 EDA 结果(平行趋势图、McCrary 检验等)。
## 4. 标准误策略
- **类型:** [聚类 SE / 稳健 SE / Wild Bootstrap]
- **聚类层级:** [省 / 企业 / 县]
- **理由:** [经济学依据]
## 5. 控制变量
[变量列表及加入理由;注明哪些变量因平衡性不足被强制加入]
## 6. 诊断计划(Diagnostic Plan)
> 本节由 `/model` Step 3.5 生成,供 `/code` Phase 6 直接读取执行。不得在代码阶段修改诊断逻辑。
### 6.1 第一层:识别假设诊断
| 假设 | 检验方法 | 通过标准 | 失败时处理 |
|------|---------|---------|----------|
| [填入该策略对应的识别假设] | [检验名称] | [具体数值标准] | [返回阶段 + 修正方向] |
### 6.2 第二层:统计假设诊断
| 检验 | 通过标准 | 失败时处理 | 是否终止 |
|------|---------|----------|--------|
| VIF(最大值)| < 10 | [处理方式] | ⚠️ |
| Breusch-Pagan | p > 0.05 | 已用 [SE类型] → WARN,继续 | 否 |
| Breusch-Godfrey / Wooldridge | p > 0.05 | [处理方式] | ⚠️ |
| RESET | p > 0.05 | [处理方式] | 否 |
| Cook's D | < 4/N | 纳入 Phase 8 稳健性 | 否 |
### 6.3 诊断输出
`/code` 执行后生成 `diagnostic_report.md`,汇总两层诊断结果(✅ / ⚠️ / ❌)。
⚠️ 项自动纳入 Phase 8 `/robustness` 检验清单;❌ 项触发流程终止并返回对应阶段。
## 7. 数据质量预警与应对
[来自 data-report.md 的预警事项,及在模型设计中的应对方式]
## 8. 待执行的稳健性检验
[进入 Phase 8 前预先列出的稳健性检验清单,如:
- 替换控制变量集合
- 更换带宽(RDD)
- 安慰剂检验
- 剔除特殊样本]
```
---
## Step 6:阶段确认与移交
向用户呈现阶段摘要:
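> **Phase 5 Deliverables**
>
> ✅ Main equation (LaTeX): `model-spec.md` §2
> ✅ Formal identifying assumptions: `model-spec.md` §3
> ✅ Standard-error strategy: `model-spec.md` §4
> ✅ Estimation code: generated by `[skill name]`, see `code/`
> ✅ Model specification document: `model-spec.md`
>
> ---
>
> Once you confirm the model specification, the workflow moves to Phase 6 (code execution and replication).
plot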
Publication Polish. Runs after results-analysis (Phase 7). Audits all tables and figures produced in Phases 4–7, upgrades them to top-journal standards by calling the table and figure skills.
# /plot — 表格和图表可视化
## 定位
`/plot` 是 Phase 7(`results-analysis`)与 Phase 9(`/write`)之间的**图表可视化阶段**。将 Phase 4–7 产出的所有功能性表格和图形,统一升级为符合 Top5 顶刊排版标准的最终版本,并检查是否有识别策略要求的标准图形尚未生成。
完成后产出**输出清单 `output-manifest.md`**,供 `/write`(Phase 9)直接引用,不再需要手动查找文件路径。
---
## Step 0:读取上下文
```
Read: [workspace]/results-memo.md # Phase 7 产出:识别策略、主要结果、建议图表
Read: [workspace]/model-spec.md # 识别策略类型(决定标准图形集合)
Read: [workspace]/data-report.md # 变量名、样本定义
```
同时扫描已有的输出目录:
```python
from pathlib import Path

# List everything already produced in the two output directories
tables = sorted(Path("tables").glob("*"))
figures = sorted(Path("figures").glob("*"))
print("=== Existing tables ===")
for f in tables:
    print(f"  {f.name}")
print("\n=== Existing figures ===")
for f in figures:
    print(f"  {f.name}")
```
根据识别策略,对照**标准输出清单**(见 Step 1)检查缺漏,输出审计报告:
```
=== 输出审计 ===
策略:[识别策略]
表格
✅ table1_descriptive.tex — 已有,待精修
✅ table2_balance.tex — 已有,待精修
✅ table_main.tex — 已有,待精修
❌ table_event_study.tex — 缺失,需生成
图形
✅ eda_did_trend.png — 已有,待升级为 PDF
❌ figure_event_study.pdf — 缺失,需生成(DiD 标准图)
✅ coef_main.png — 已有,待升级
```
---
## Step 1:按识别策略确定标准输出集合
> 📎 **输出格式规范参见 [`shared/output-standards.md`](../shared/output-standards.md)**
不同识别策略有约定俗成的"必备图表",缺少会影响审稿通过率:
### 标准表格集合(所有策略共用)
| 编号 | 文件名 | 内容 | 来源阶段 |
|------|--------|------|---------|
| Table 1 | `table1_descriptive.tex` | 描述性统计 | Phase 4 `results-analysis` |
| Table 2 | `table2_balance.tex` | 平衡性检验(若有处理变量)| Phase 4 `results-analysis` |
| Table 3 | `table_main.tex` | 主回归结果(多列规格)| Phase 6 `/code` |
| Table A1 | `table_robustness.tex` | 稳健性检验汇总 | Phase 8 `/robustness`(待生成)|
### 标准图形集合(按识别策略)
#### DiD / TWFE
| 编号 | 文件名 | 内容 | 优先级 |
|------|--------|------|--------|
| Figure 1 | `figure_event_study.pdf` | 事件研究系数图(动态处理效应)| **必须** |
| Figure 2 | `figure_parallel_trends.pdf` | 处理组 vs 控制组时间趋势 | **必须** |
| Figure A1 | `figure_bacon_decomp.pdf` | Bacon 分解(交错 DiD)| 若为交错设计 |
#### RDD
| 编号 | 文件名 | 内容 | 优先级 |
|------|--------|------|--------|
| Figure 1 | `figure_rdd_main.pdf` | RDD 主图:散点 + 两侧拟合线 + 跳跃 | **必须** |
| Figure 2 | `figure_mccrary.pdf` | McCrary 密度检验图 | **必须** |
| Figure A1 | `figure_rdd_bandwidth.pdf` | 带宽敏感性图 | 推荐 |
#### IV / 2SLS
| 编号 | 文件名 | 内容 | 优先级 |
|------|--------|------|--------|
| Figure 1 | `figure_iv_first_stage.pdf` | 第一阶段:Z vs D 散点 + 拟合线 | **必须** |
| Figure 2 | `figure_iv_reduced_form.pdf` | 简约式:Z vs Y 散点 + 拟合线 | **必须** |
#### Panel FE
| 编号 | 文件名 | 内容 | 优先级 |
|------|--------|------|--------|
| Figure 1 | `figure_coef_main.pdf` | 主系数图(多规格并列)| **必须** |
| Figure 2 | `figure_fe_variance.pdf` | Within/Between 方差分解 | 推荐 |
#### 合成控制
| 编号 | 文件名 | 内容 | 优先级 |
|------|--------|------|--------|
| Figure 1 | `figure_sc_gap.pdf` | Gap 图:处理单元 vs 合成控制 | **必须** |
| Figure 2 | `figure_sc_placebo.pdf` | In-space 安慰剂检验图 | **必须** |
| Figure 3 | `figure_sc_weights.pdf` | 供体权重条形图 | 推荐 |
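The audit in Step 0 is a comparison between the files found on disk and the "must have" rows of the tables above. A minimal sketch for three of the strategies; the dictionary and function names are hypothetical, while the file names are copied from the tables.

```python
from pathlib import Path

# "Must have" figures per strategy, taken from the tables above.
REQUIRED_FIGURES = {
    "DiD": ["figure_event_study.pdf", "figure_parallel_trends.pdf"],
    "RDD": ["figure_rdd_main.pdf", "figure_mccrary.pdf"],
    "IV": ["figure_iv_first_stage.pdf", "figure_iv_reduced_form.pdf"],
}


def audit(strategy, existing):
    """Map each required figure to True (present) or False (missing)."""
    names = {Path(p).name for p in existing}
    return {name: name in names for name in REQUIRED_FIGURES[strategy]}


status = audit("DiD", ["figures/figure_event_study.pdf", "figures/coef_main.png"])
for name, present in status.items():
    print(("present " if present else "missing ") + name)
```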
---
## Step 2:表格精修(调用 `table` skill)
对 Step 0 审计中标注"✅ 待精修"的所有 `.tex` 表格,调用 `table` skill 执行以下标准化操作:
**调用格式:**
> *"调用 `table` skill,精修 `[文件名]`。要求:*
> - *格式标准:booktabs(`\toprule` / `\midrule` / `\bottomrule`),无竖线*
> - *多列对齐:系数列居中,标准误括号行与系数行紧邻*
> - *显著性标注:\*, \*\*, \*\*\*(10%/5%/1%),置于系数右上角*
> - *Notes 行:说明标准误类型、聚类层级、样本限制*
> - *宽度:`\textwidth` 自适应或指定 `tabular` 列格式*
> - *输出:覆盖原文件,同时保存 `_final` 后缀版本备份"*
---
## Step 3:图形精修与生成(调用 `figure` skill)
对审计中"✅ 待升级"的图形执行精修,对"❌ 缺失"的标准图形从头生成。
**调用格式:**
> *"调用 `figure` skill,[精修/生成] `[图形名称]`。要求:*
> - *期刊标准:AER/QJE 风格(无顶框/右框,浅灰网格)*
> - *尺寸:单栏 3.5 in × 2.8 in / 双栏 7 in × 4 in*
> - *字体大小:轴标签 10pt,标题 11pt,图注 9pt*
> - *颜色:灰度优先,彩色图须灰度可读(无红绿对)*
> - *置信区间:阴影带(`fill_between`)或误差棒,透明度 0.2*
> - *输出格式:PDF(矢量,投稿用)+ PNG 300 DPI(草稿用)*
> - *文件名:`figures/[figure_name].pdf` + `.png`"*
各策略核心图形的具体实现代码(DiD 事件研究图、RDD 主图、合成控制 Gap 图、多规格系数图)统一维护在 **`skills/figure/SKILL.md`**。调用 `figure` skill 时传入图形名称和识别策略,由该 skill 负责选择对应模板并执行。
---
## Step 4:产出 `output-manifest.md`
精修完成后,生成完整的输出清单,供 `/write` 直接引用:
```
Write: [workspace]/output-manifest.md
```
**文档结构:**
````markdown
# Output Manifest
**Project:** [research question in one sentence]
**Generated on:** [YYYY-MM-DD]
**Polish standard:** AER / QJE / JPE
---
## Tables
| Paper number | File path | Content | LaTeX label | Status |
|---------|---------|------|-----------|------|
| Table 1 | `tables/table1_descriptive.tex` | Descriptive statistics | `\ref{tab:descriptive}` | ✅ Polished |
| Table 2 | `tables/table2_balance.tex` | Balance test | `\ref{tab:balance}` | ✅ Polished |
| Table 3 | `tables/table_main.tex` | Main regression results | `\ref{tab:main}` | ✅ Polished |
| Table A1 | `tables/table_robustness.tex` | Robustness checks | `\ref{tab:robustness}` | ⏳ Generated after Phase 8 |
## Figures
| Paper number | File path (PDF) | Content | LaTeX label | Status |
|---------|--------------|------|-----------|------|
| Figure 1 | `figures/figure_event_study.pdf` | Event-study plot | `\ref{fig:eventstudy}` | ✅ Polished |
| Figure 2 | `figures/figure_parallel_trends.pdf` | Parallel-trends plot | `\ref{fig:trends}` | ✅ Polished |
| Figure A1 | `figures/figure_bacon_decomp.pdf` | Bacon decomposition | `\ref{fig:bacon}` | ⏳ Generated after Phase 8 |
## Reference notes for /write
Insert a table in the paper:
```latex
\input{tables/table_main.tex}
```
Insert a figure in the paper:
```latex
\begin{figure}[htbp]
\centering
\includegraphics[width=\textwidth]{figures/figure_event_study.pdf}
\caption{Dynamic Treatment Effects}
\label{fig:eventstudy}
\end{figure}
```
````
---
## Step 5:阶段确认与移交
向用户呈现精修摘要:
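> **`/plot` Deliverables**
>
> ✅ Polished tables: [N] (`.tex`, booktabs conventions)
> ✅ Polished figures: [N] (`.pdf` vector + `.png` at 300 DPI)
> ⏳ Pending Phase 8: robustness tables, heterogeneity figures
> ✅ Output manifest: `output-manifest.md` (with LaTeX reference snippets)
>
> ---
>
> **Next step:**
> - Enter **Phase 8** (`/robustness`): robustness, heterogeneity, and mechanism checks
> - Or go straight to **Phase 9** (`/write`) if the robustness checks were already completed in `/code`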
---
## 常见问题处理
**Q:图形中文字用中文还是英文?**
论文图形一律用**英文**(变量名、轴标签、图注),以备直接投稿。描述性文字和分析讨论用中文或英文均可,但图形本身须为英文。
**Q:PDF 与 PNG 都需要吗?**
投稿期刊用 **PDF**(矢量,无限缩放),Word 稿件或 Overleaf 草稿用 **PNG**(300 DPI 足够屏幕和打印)。两者同步生成,按需取用。
**Q:表格宽度超出单栏怎么办?**
若变量多、列数多,调用 `table` skill 使用 `\resizebox{\textwidth}{!}{...}` 或 `landscape` 环境;或将控制变量行改为"Yes/No"标记,折叠系数展示。
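**Q: How do the Phase 8 robustness tables and figures get included?**
`output-manifest.md` already reserves ⏳ placeholders for Phase 8 outputs. After `/robustness` finishes, run `/plot` again to bring the new tables and figures into the polishing workflow and update `output-manifest.md`.
present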
Phase 10 (optional) Beamer-Style PPTX Generation. Reads all upstream outputs and paper/sections/, maps research content to a presentation-type-specific slide outline, calls beamer-ppt skill to generate a Beamer-style .pptx file (navy-blue Metropolis theme, publication-quality), and produces slides/ directory.
# /present — 学术报告幻灯片生成
## 定位
`/present` 是实证研究工作流的**可选阶段**,通常在 Phase 9(`/write`)完成后运行,也可在初稿完成前并行进行。核心职责:
1. 读取所有上游输出,提取幻灯片所需的关键内容
2. 根据演示类型(会议报告 / 学术讲座 / 求职报告)确定幻灯片数量和深度
3. 调用 `beamer-ppt` skill,用 python-pptx 生成 **Beamer 学术风格的 `.pptx` 文件**(海军蓝配色、16:9 宽屏、标题栏+进度条)
4. 验证幻灯片结构并输出文件
---
## Step 0:读取上游输出
### 0.1 读取文件
```
Read: [workspace]/research-question.md
Read: [workspace]/results-memo.md
Read: [workspace]/robustness-report.md
Read: [workspace]/model-spec.md
Read: [workspace]/data-report.md
```
若 `paper/sections/` 目录存在,同时读取:
```
Read: [workspace]/paper/sections/introduction.tex # 提取贡献定位
Read: [workspace]/paper/sections/results.tex # 提取主要发现措辞
```
**若 `results-memo.md` 不存在**,立即停止:
> *"未找到 `results-memo.md`。幻灯片的核心内容(主系数、量级解读、识别可信度)来自 Phase 7 产出。请先完成 Phase 7(`results-analysis` skill),再运行 `/present`。"*
### 0.2 提取幻灯片关键信息
从各文件中提取以下内容,作为幻灯片**内容素材库**:
| 来源 | 提取内容 | 用于哪页幻灯片 |
|------|---------|--------------|
| `research-question.md` | 研究问题一句话、政策背景数字 | Motivation、This Paper |
| `model-spec.md §1–2` | 识别策略名称、主方程(简化版)| Identification Strategy |
| `model-spec.md §3` | 识别假设(可检验形式)| Identification Strategy |
| `data-report.md §2` | 样本量 N、时间跨度、数据来源 | Data slide |
| `results-memo.md §1` | 主系数 β̂、SE、p 值 | Main Results |
| `results-memo.md §2` | 经济显著性(量级换算)| Main Results(Takeaway) |
| `results-memo.md §4` | 识别可信度 + 因果语言建议 | 全程语言控制 |
| `robustness-report.md §1` | 稳健性一句话结论 | Robustness slide |
| `robustness-report.md §2` | 关键异质性发现 | Heterogeneity slide |
| `robustness-report.md §3` | 机制证据 | Mechanism slide(若有)|
---
## Step 1:演示参数确认
### 1.1 演示类型
向用户确认:
> *"请选择演示类型(决定幻灯片数量和深度):*
>
> **A. 15–20 分钟 会议报告**(≤ 15 张幻灯片)
> — NBER/AEA 分会场报告;时间有限,直奔主题
>
> **B. 45–60 分钟 学术讲座/研讨班**(≤ 30 张幻灯片)
> — 院系 seminar;受众深度参与,需完整展示方法细节
>
> **C. 求职报告(Job Market Talk)**(≤ 20 张幻灯片)
> — 求职季使用;精炼呈现贡献、方法和主要结果"*
### 1.2 受众确认
> *"目标受众的领域背景:*
> *A. 同领域经济学家(可用技术语言和方程)*
> *B. 跨领域经济学家(需简化技术细节,强调直觉)*
> *C. 政策制定者 / 非经济学家(需要去专业化,强调政策含义)"*
### 1.3 主题风格
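> *"Slide theme (or accept the recommendation):*
> *A. Metropolis (recommended): clean and modern, widely used in academic talks*
> *B. Minimal custom (navy blue + white): understated, suits job talks and authors in top journals*
> *C. Madrid: traditional academic style, suits conservative settings"*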
---
## Step 2:幻灯片大纲设计
Based on the talk type confirmed in Step 1, generate a **slide outline** and confirm it with the user before generating any slides.
### 标准大纲(各类型幻灯片数量对照)
| 幻灯片板块 | A(15 min,≤15)| B(45 min,≤30)| C(求职,≤20)|
|----------|----------------|----------------|--------------|
| 标题页 | 1 | 1 | 1 |
| Motivation | 1–2 | 2–3 | 2–3 |
| This Paper(贡献预告)| 1 | 1 | 1 |
| Related Literature | — | 1–2 | 1–2 |
| Data | 1 | 2 | 2 |
| Identification Strategy | 2 | 3–4 | 3 |
| Main Results | 3 | 5–7 | 4–5 |
| Robustness | 1 | 2–3 | 2 |
| Heterogeneity & Mechanism | — | 2–3 | 1–2 |
| Conclusion / Takeaways | 1 | 1 | 1 |
| **合计上限** | **≤ 15** | **≤ 30** | **≤ 20** |
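The caps in the last row of the table can be enforced before any slides are generated. A minimal sketch: the type codes A/B/C and the caps come from the table, while `check_outline` and the example outline are hypothetical.

```python
# Slide caps per talk type, from the last row of the table above.
SLIDE_CAPS = {"A": 15, "B": 30, "C": 20}


def check_outline(talk_type, outline):
    """Compare the number of planned slides with the cap for the talk type."""
    cap = SLIDE_CAPS[talk_type]
    return {"slides": len(outline), "cap": cap, "within_cap": len(outline) <= cap}


outline = [
    "Title", "Motivation", "This Paper", "Data & Sample",
    "Identification Strategy", "Validity Check", "Main Results (table)",
    "Main Results (figure)", "Economic Magnitude", "Robustness", "Takeaways",
]
check = check_outline("A", outline)
print(check)
```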
向用户呈现大纲并确认:
```
═══════════════════════════════════════════════════
演示大纲([类型],共 [N] 张,上限 [L] 张)
═══════════════════════════════════════════════════
[1] Title
[2] Motivation: [研究问题背景]
[3] This Paper: [一句话问题 + 一句话发现]
[4] Data & Sample
[5] Identification Strategy
[6] Validity Check: [策略对应的诊断图/检验]
[7] Main Results(表格)
[8] Main Results(图形/事件研究)
[9] Economic Magnitude
[10] Robustness
[11] Takeaways
═══════════════════════════════════════════════════
```
待用户确认或修改大纲后,进入 Step 3 确认各页内容规范,再到 Step 4 生成 PPTX 文件。
---
## Step 3:各类幻灯片内容规范
> 📎 **输出格式规范参见 [`shared/output-standards.md`](../shared/output-standards.md)**
以下规范是经济学学术报告的约定俗成,逐条实施于每张幻灯片的内容生成。
---
### 3.1 标题页(Title Slide)
```latex
\begin{frame}[plain]
\titlepage
\end{frame}
```
标题格式要求:
- 论文标题:冒号后接副标题,副标题描述识别策略或数据
- 好标题:"The Effect of [X] on [Y]: Evidence from [Natural Experiment]"
- 差标题:"An Empirical Study of X and Y"(过于笼统)
- 作者行:姓名、机构(单位两行以内)
- 日期行:会议/研讨班名称 + 年月
---
### 3.2 Motivation(动机幻灯片,1–2 张)
**第 1 张:为什么关心这个问题**
每个条目不超过一行,用数字锚定重要性:
```latex
\begin{frame}{Why Does [TOPIC] Matter?}
\begin{itemize}
\item<1-> \textbf{[政策/社会重要性]:} [一句话 + 具体数字]
\item<2-> \textbf{[经济学理论相关性]:} [一句话]
\item<3-> \textbf{[识别缺口]:} Existing work relies on [OLS/cross-section]
— likely [overstates/understates] true effect
\end{itemize}
\vspace{1em}
\only<4>{
\begin{alertblock}{This paper's question}
Does [D] causally affect [Y]?
\end{alertblock}
}
\end{frame}
```
**禁忌**:
- ❌ 超过 5 个条目
- ❌ 只有文字没有数字("很多研究关注这个问题")
- ❌ 第一张就放方程
---
### 3.3 This Paper(核心预告幻灯片,1 张,必须)
**这是经济学幻灯片中最重要的一张**,通常是第 2–3 张,让听众在 5 分钟内知道这场报告值不值得继续听。
必须包含:
1. 研究问题(一句话)
2. 识别策略(一句话,直觉层面)
3. 主要发现(一句话,含量级数字)
4. 贡献定位(1–2 条,简洁)
```latex
\begin{frame}{This Paper}
\textbf{Question:} Does [D] causally affect [Y]?
\vspace{0.8em}
\textbf{What we do:} Exploit [NATURAL EXPERIMENT / INSTRUMENT / CUTOFF]
to isolate exogenous variation in [D].
\quad $\Rightarrow$ [DATA SOURCE], [N] [UNITS], [YEARS]
\vspace{0.8em}
\textbf{Main finding:} [D] [increases/decreases] [Y] by
\alert{[MAGNITUDE] [UNITS/\%]},
[significant at X\% level].
\vspace{0.8em}
\textbf{Contribution:}
\begin{enumerate}
\item First causal evidence on [TOPIC] using [METHOD]
\item [SECOND CONTRIBUTION, if any — be brief]
\end{enumerate}
\end{frame}
```
---
### 3.4 Data(数据幻灯片,1–2 张)
**第 1 张:数据来源与样本(来自 `data-report.md §2`)**
```latex
\begin{frame}{Data}
\begin{columns}
\begin{column}{0.5\textwidth}
\textbf{Data Sources:}
\begin{itemize}
\item [数据集1]:[描述,年份]
\item [数据集2]:[描述,链接方式]
\end{itemize}
\vspace{0.5em}
\textbf{Sample:}
\begin{itemize}
\item Unit: [个体/企业/县/省]
\item Period: [起止年份]
\item N = [样本量,加逗号分隔]
\end{itemize}
\end{column}
\begin{column}{0.5\textwidth}
\begin{table}
\caption{Summary Statistics}
\tiny
\begin{tabular}{lcc}
\toprule
Variable & Mean & SD \\
\midrule
[Y var] & [值] & [值] \\
[D var] & [值] & [值] \\
[Control 1] & [值] & [值] \\
\bottomrule
\end{tabular}
\end{table}
\end{column}
\end{columns}
\end{frame}
```
---
### 3.5 Identification Strategy(识别策略,2–3 张)
**第 1 张:直觉层面的识别逻辑(非技术)**
先给直觉,再给方程——不要一上来就放方程:
```latex
\begin{frame}{Identification Challenge}
\textbf{Problem:} [D] is endogenous because...
\begin{itemize}
\item \textbf{OVB:} [一句话描述混淆路径]
\item \textbf{Reverse causality:} [若适用]
\end{itemize}
\vspace{0.8em}
\textbf{Our solution:} Exploit [外生变异来源]
\begin{center}
\begin{tikzpicture}[node distance=2cm, auto]
% Minimal DAG: Z → D → Y, with U confounding D and Y; exclusion restriction: Z affects Y only through D
\node (Z) {$Z$};
\node (D) [right of=Z] {$D$};
\node (Y) [right of=D] {$Y$};
\node (U) [above of=D] {$U$ (unobserved)};
\draw[->] (Z) -- (D) node[midway,below]{\tiny First stage};
\draw[->] (D) -- (Y);
\draw[->, dashed] (U) -- (D);
\draw[->, dashed] (U) -- (Y);
\draw[->, red, thick] (Z) to[bend left=40]
node[above]{\tiny \textcolor{red}{Excluded}} (Y);
\end{tikzpicture}
\end{center}
\end{frame}
```
**第 2 张:识别假设 + 主方程**
```latex
\begin{frame}{Empirical Specification}
\textbf{Key assumption:} [识别假设一句话直觉表述]
\vspace{0.5em}
\begin{block}{Main Equation}
\begin{equation*}
[主方程 LaTeX — 来自 model-spec.md §2,简化版]
\end{equation*}
\end{block}
\vspace{0.5em}
\begin{itemize}
\item $[β]$: [系数的经济学含义]
\item SE clustered at [聚类层级]
\item [FE 说明]
\end{itemize}
\vspace{0.5em}
\textbf{Evidence for assumption:} $\Rightarrow$ [下一张幻灯片]
\end{frame}
```
**第 3 张:识别假设检验(策略特异性)**
| 识别策略 | 专属视觉化幻灯片 | 图形文件 |
|---------|---------------|---------|
| DiD / 交错 DiD | 事件研究图(Event Study Plot)| `figures/fig*_event_study.pdf` |
| RDD | 断点处散点图(Binscatter at Cutoff)| `figures/fig*_rdd_binscatter.pdf` |
| IV / 2SLS | 第一阶段散点图(First Stage Scatter)| `figures/fig*_iv_scatter.pdf` |
| 面板 FE | 平行趋势目视图 | `figures/fig*_parallel_trends.pdf` |
| 合成控制 | 预处理期拟合图 | `figures/fig*_sc_gap.pdf` |
```latex
% DiD 事件研究图幻灯片示例
\begin{frame}{Parallel Trends: Event Study}
\begin{center}
\includegraphics[width=0.85\textwidth]{figures/fig01_event_study.pdf}
\end{center}
\vspace{-0.5em}
{\small \textit{Notes:} Coefficients from Eq.~(\ref{eq:eventstudy}).
95\% CI. Pre-period joint F-test: $p = [值]$ (cannot reject $H_0$: all pre-period coefficients $= 0$).}
\end{frame}
```
---
### 3.6 Main Results(主要结果,2–3 张)
**原则:用图而不是表,如果必须用表则精简到极致。**
**第 1 张:主结果表(精简版)**
幻灯片中的回归表**必须精简**,规则:
- 最多 3–4 列(不要把完整的 6 列表格搬上来)
- 只保留主系数行 + SE + N + FE 指示器
- 用 `\alert{}` 或加粗突出首选规格的主系数
- 字号:`\footnotesize` 或 `\scriptsize`
```latex
\begin{frame}{Main Results}
\begin{table}
\centering
\footnotesize
\begin{tabular}{lccc}
\toprule
& (1) & (2) & \textbf{(3)} \\
& Baseline & + Controls & \textbf{Preferred} \\
\midrule
[D var] & [β̂_1]*** & [β̂_2]*** & \alert{[β̂_3]***} \\
& ([SE_1]) & ([SE_2]) & \alert{([SE_3])} \\
\midrule
FE & No & No & Yes \\
$N$ & [N1] & [N2] & [N3] \\
\bottomrule
\multicolumn{4}{l}{\tiny SE clustered at [level]. $^{*}p<0.1$, $^{**}p<0.05$, $^{***}p<0.01$}
\end{tabular}
\end{table}
\vspace{0.3em}
\textbf{Takeaway:} [D] [increases/decreases] [Y] by
\alert{[COEFFICIENT] [UNITS]}.
\end{frame}
```
**第 2 张:经济量级(Economic Magnitude)**
单独一张幻灯片专门讲量级——这是听众最想要的信息:
```latex
\begin{frame}{Economic Magnitude}
\begin{center}
\Large
\textbf{[D] [increases/decreases] [Y] by \alert{[MAGNITUDE]}}
\end{center}
\vspace{1em}
\begin{itemize}
\item Relative to mean [Y] of [MEAN]: \alert{[PCT]\% change}
\item In SD units: \alert{[SD-COEF] $\sigma_{Y}$}
([小/中/大] effect by Cohen's benchmark)
\item[vs.] Closest prior estimate ([AUTHOR Year]):
[他们的系数]
— our estimate is [larger/smaller] because [识别策略差异]
\end{itemize}
\vspace{0.5em}
{\small \textit{Policy implication:} [一句话政策含义]}
\end{frame}
```
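The two magnitude bullets on this slide are simple arithmetic on the main coefficient. A minimal sketch, assuming the coefficient is measured in the units of Y (a level specification); the helper name `economic_magnitude` and the input numbers are hypothetical and only illustrate how `[PCT]` and `[SD-COEF]` could be filled in.

```python
def economic_magnitude(coef, mean_y, sd_y):
    """Express a coefficient relative to the outcome mean and in SD units.

    Hypothetical helper: assumes `coef` is in the units of Y
    (a level regression, not a log specification).
    """
    if mean_y == 0 or sd_y <= 0:
        raise ValueError("mean_y must be non-zero and sd_y must be positive")
    return {
        "pct_of_mean": 100.0 * coef / mean_y,  # fills [PCT]
        "sd_units": coef / sd_y,               # fills [SD-COEF]
    }


# Illustrative numbers only, not results from any study
mag = economic_magnitude(coef=0.5, mean_y=10.0, sd_y=2.0)
print(f"{mag['pct_of_mean']:.1f}% of the mean, {mag['sd_units']:.2f} SD")
```

For a log-outcome specification the percentage change would instead come from the coefficient itself, so this helper should not be reused there without adjustment.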
**第 3 张(若 DiD):动态效应图(Event Study)**
直接引用 `figures/fig*_event_study.pdf`,比静态系数更有说服力:
```latex
\begin{frame}{Dynamic Effects}
\begin{center}
\includegraphics[width=0.85\textwidth]{figures/fig01_event_study.pdf}
\end{center}
\vspace{-0.5em}
{\small \textbf{Takeaway:} No pre-trend. Effect appears [immediately/gradually]
after treatment. [Long-run/Short-run] effect: [β̂_long] ([SE], [sig]).}
\end{frame}
```
---
### 3.7 Robustness(稳健性,1–3 张)
**原则:结论先行,不要把所有检验逐条念一遍。**
```latex
\begin{frame}{Robustness}
\textbf{Main finding is robust to:}
\begin{columns}[t]
\begin{column}{0.5\textwidth}
\textbf{Inference:}
\begin{itemize}
\item[\checkmark] HC3 robust SE
\item[\checkmark] Two-way clustered SE
\item[\checkmark] Wild bootstrap (if $<$ 30 clusters)
\end{itemize}
\vspace{0.5em}
\textbf{Sample:}
\begin{itemize}
\item[\checkmark] Exclude top/bottom 1\% of $Y$
\item[\checkmark] Drop high-influence observations
\end{itemize}
\end{column}
\begin{column}{0.5\textwidth}
\textbf{Specification:}
\begin{itemize}
\item[\checkmark] Log transformation
\item[\checkmark] Alternative control sets
\item[\checkmark] Oster $\delta = [值] > 1$
\end{itemize}
\vspace{0.5em}
\textbf{Identification:}
\begin{itemize}
\item[\checkmark] Placebo [treatment/cutoff/instrument]
\item[\checkmark] [策略特异性检验]
\end{itemize}
\end{column}
\end{columns}
\vspace{0.3em}
\small Coefficients range from [MIN] to [MAX] across all specifications.
\end{frame}
```
---
### 3.8 Takeaways(结论幻灯片,1 张)
**经济学报告的最后一张幻灯片绝对不能是 "Thank you / Questions?"**,必须是内容幻灯片,让听众带着结论离开。
```latex
\begin{frame}{Takeaways}
\begin{enumerate}
\item \textbf{Causal evidence:}
[D] [causes] [Y] to [increase/decrease] by
\alert{[MAGNITUDE]}.
[量级的直觉类比]
\item \textbf{Why it matters:}
[政策含义 / 理论贡献,一句话]
\item \textbf{Mechanism:}
[机制证据一句话(若有)]
\end{enumerate}
\vspace{1.5em}
\begin{center}
\normalsize [Author Name] \quad \texttt{[email]}
\end{center}
\end{frame}
```
---
## Step 4:PPTX 文件生成(Beamer 学术风格)
调用 `beamer-ppt` skill,使用 **python-pptx** 按 Step 3 的内容规范逐张生成幻灯片,输出 `.pptx` 文件。
### 4.1 配色主题方案
按 Step 1.3 用户选定的主题使用以下 RGB 配色:
| 主题 | 标题栏背景 | 强调色 | 正文背景 |
|------|-----------|--------|---------|
| **A. Metropolis(推荐)** | 海军蓝 `(0, 35, 82)` | 强调红 `(180, 30, 30)` | 浅灰 `(245, 245, 245)` |
| **B. 极简自定义** | 海军蓝 `(0, 35, 82)` | 海军蓝 `(0, 35, 82)` | 白色 `(255, 255, 255)` |
| **C. Madrid(传统)** | 深蓝 `(31, 73, 125)` | 金色 `(189, 152, 44)` | 白色 `(255, 255, 255)` |
### 4.2 幻灯片母版基础设置
```python
from pptx import Presentation
from pptx.util import Inches, Pt, Emu
from pptx.dml.color import RGBColor
from pptx.enum.text import PP_ALIGN
from pptx.enum.shapes import MSO_SHAPE  # MSO_SHAPE.RECTANGLE == 1, the shape id passed to add_shape below
# 16:9 宽屏(对应 Beamer aspectratio=169)
prs = Presentation()
prs.slide_width = Inches(13.33)
prs.slide_height = Inches(7.5)
# 配色(以 Metropolis 主题为例)
NAVY = RGBColor(0, 35, 82)
RED = RGBColor(180, 30, 30)
LGRAY = RGBColor(245, 245, 245)
WHITE = RGBColor(255, 255, 255)
BLACK = RGBColor(30, 30, 30)
MGRAY = RGBColor(100, 100, 100)
```
### 4.3 标题栏辅助函数(Frame Title)
每张内容幻灯片必须有海军蓝标题栏(高 1.1 英寸,白色加粗 24pt),模拟 Beamer `\frametitle`:
```python
from pptx.util import Inches, Pt
from pptx.dml.color import RGBColor
from pptx.enum.text import PP_ALIGN
from pptx.enum.shapes import MSO_SHAPE

def add_slide_bg(slide, prs, color):
    """Full-slide background rectangle."""
    bg = slide.shapes.add_shape(
        MSO_SHAPE.RECTANGLE, 0, 0, prs.slide_width, prs.slide_height)
    bg.fill.solid()
    bg.fill.fore_color.rgb = color
    bg.line.fill.background()  # no outline
    return bg

def add_frame_title(slide, prs, title_text, bg_color=None, fg_color=None):
    """Beamer-style title bar: navy background, bold white title."""
    bg_color = bg_color or RGBColor(0, 35, 82)
    fg_color = fg_color or RGBColor(255, 255, 255)
    bar = slide.shapes.add_shape(
        MSO_SHAPE.RECTANGLE, 0, 0, prs.slide_width, Inches(1.1))
    bar.fill.solid()
    bar.fill.fore_color.rgb = bg_color
    bar.line.fill.background()
    tf = bar.text_frame
    tf.word_wrap = False
    # Left and top padding for the title text
    tf.margin_left = Inches(0.3)
    tf.margin_top = Inches(0.2)
    p = tf.paragraphs[0]
    p.text = title_text
    p.font.bold = True
    p.font.size = Pt(24)
    p.font.color.rgb = fg_color
    p.alignment = PP_ALIGN.LEFT
    return bar
```
### 4.4 底部进度条(Metropolis 风格)
```python
def add_progress_bar(slide, prs, current, total):
    """Metropolis-style progress bar along the bottom edge (optional)."""
    bar_h = Inches(0.06)
    bar_top = prs.slide_height - bar_h
    # Grey track across the full width
    bg_bar = slide.shapes.add_shape(
        1, 0, bar_top, prs.slide_width, bar_h)
    bg_bar.fill.solid()
    bg_bar.fill.fore_color.rgb = RGBColor(200, 200, 200)
    bg_bar.line.fill.background()
    # Navy bar whose width is proportional to current / total
    prog_w = int(prs.slide_width * current / total)
    prog = slide.shapes.add_shape(1, 0, bar_top, prog_w, bar_h)
    prog.fill.solid()
    prog.fill.fore_color.rgb = RGBColor(0, 35, 82)
    prog.line.fill.background()
```
### 4.5 逐类幻灯片生成
按 Step 2 大纲依次生成,每张幻灯片内容来自 Step 0.2 的**内容素材库**。
#### 标题页(深色背景全屏)
```python
def make_title_slide(prs, title, subtitle, author, institute, date):
    slide = prs.slides.add_slide(prs.slide_layouts[6])  # blank layout
    add_slide_bg(slide, prs, NAVY)
    # Paper title (white, 36pt, bold, centered)
    tb = slide.shapes.add_textbox(Inches(1), Inches(1.8), Inches(11.33), Inches(2))
    tf = tb.text_frame
    tf.word_wrap = True
    p = tf.paragraphs[0]
    p.text = title
    p.font.bold = True
    p.font.size = Pt(36)
    p.font.color.rgb = WHITE
    p.alignment = PP_ALIGN.CENTER
    # Subtitle (light grey, 22pt)
    p2 = tf.add_paragraph()
    p2.text = subtitle
    p2.font.size = Pt(22)
    p2.font.color.rgb = LGRAY
    p2.alignment = PP_ALIGN.CENTER
    # Author (white, 18pt) and institute (light grey, 16pt)
    tb2 = slide.shapes.add_textbox(Inches(1), Inches(4.5), Inches(11.33), Inches(1.2))
    tf2 = tb2.text_frame
    tf2.paragraphs[0].text = author
    tf2.paragraphs[0].font.size = Pt(18)
    tf2.paragraphs[0].font.color.rgb = WHITE
    tf2.paragraphs[0].alignment = PP_ALIGN.CENTER
    p3 = tf2.add_paragraph()
    p3.text = institute
    p3.font.size = Pt(16)
    p3.font.color.rgb = LGRAY
    p3.alignment = PP_ALIGN.CENTER
    # Date / venue (light grey, 14pt)
    tb3 = slide.shapes.add_textbox(Inches(1), Inches(6.2), Inches(11.33), Inches(0.8))
    tf3 = tb3.text_frame
    tf3.paragraphs[0].text = date
    tf3.paragraphs[0].font.size = Pt(14)
    tf3.paragraphs[0].font.color.rgb = LGRAY
    tf3.paragraphs[0].alignment = PP_ALIGN.CENTER
    return slide
```
#### 通用内容幻灯片(条目列表)
```python
def make_content_slide(prs, title, bullets, current=None, total=None):
    """Title bar plus a bulleted body; `bullets` is a list of (level, text) pairs."""
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_slide_bg(slide, prs, LGRAY)
    add_frame_title(slide, prs, title)
    # Body area
    tb = slide.shapes.add_textbox(
        Inches(0.5), Inches(1.3), Inches(12.33), Inches(5.8))
    tf = tb.text_frame
    tf.word_wrap = True
    for i, (lvl, text) in enumerate(bullets):
        # Reuse the text frame's first paragraph, then append new ones
        p = tf.paragraphs[i] if i == 0 else tf.add_paragraph()
        p.text = text
        p.level = lvl
        p.font.size = Pt(20 if lvl == 0 else 17)
        p.font.color.rgb = BLACK
        p.space_before = Pt(6 if lvl == 0 else 3)
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide
```
#### 图形幻灯片
```python
def make_figure_slide(prs, title, img_path, caption="", current=None, total=None):
    """Title bar, centered figure, and optional caption."""
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_slide_bg(slide, prs, LGRAY)
    add_frame_title(slide, prs, title)
    slide.shapes.add_picture(
        img_path, Inches(1.17), Inches(1.3), Inches(11), Inches(5.2))
    if caption:
        cap = slide.shapes.add_textbox(
            Inches(0.5), Inches(6.6), Inches(12.33), Inches(0.7))
        cap.text_frame.paragraphs[0].text = caption
        cap.text_frame.paragraphs[0].font.size = Pt(11)
        cap.text_frame.paragraphs[0].font.color.rgb = MGRAY
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide
```
#### 回归表幻灯片
```python
def make_table_slide(prs, title, headers, rows, footnote="",
                     highlight_col=-1, current=None, total=None):
    """Title bar plus a trimmed regression table (at most 4 columns)."""
    slide = prs.slides.add_slide(prs.slide_layouts[6])
    add_slide_bg(slide, prs, LGRAY)
    add_frame_title(slide, prs, title)
    nc = len(headers)
    nr = len(rows) + 1  # one extra row for the header
    tbl = slide.shapes.add_table(
        nr, nc, Inches(0.5), Inches(1.4), Inches(12.33), Inches(4.5)).table
    # Header row: navy background, white bold text
    for j, h in enumerate(headers):
        cell = tbl.cell(0, j)
        cell.text = h
        cell.text_frame.paragraphs[0].font.bold = True
        cell.text_frame.paragraphs[0].font.size = Pt(14)
        cell.text_frame.paragraphs[0].font.color.rgb = WHITE
        cell.fill.solid()
        cell.fill.fore_color.rgb = NAVY
    # Data rows
    for i, row in enumerate(rows):
        for j, val in enumerate(row):
            cell = tbl.cell(i + 1, j)
            cell.text = str(val)
            cell.text_frame.paragraphs[0].font.size = Pt(13)
            # Bold the preferred-specification column (negative index counts from the right)
            if j == (nc + highlight_col) % nc:
                cell.text_frame.paragraphs[0].font.bold = True
    # Footnote
    if footnote:
        fn = slide.shapes.add_textbox(
            Inches(0.5), Inches(6.0), Inches(12.33), Inches(1.2))
        fn.text_frame.paragraphs[0].text = footnote
        fn.text_frame.paragraphs[0].font.size = Pt(10)
        fn.text_frame.paragraphs[0].font.color.rgb = MGRAY
    if current and total:
        add_progress_bar(slide, prs, current, total)
    return slide
```
### 4.6 图形文件预处理
若 `figures/` 目录存在 **PDF 向量图**,先转为高分辨率 PNG 再嵌入 PPTX:
```python
import subprocess, os

def pdf_to_png(pdf_path, dpi=200):
    """PDF → PNG via pdftoppm (requires poppler on the system)."""
    png_base = os.path.splitext(pdf_path)[0]  # strip only the trailing extension
    subprocess.run([
        "pdftoppm", "-r", str(dpi), "-png", "-singlefile",
        pdf_path, png_base
    ], check=True)
    return png_base + ".png"

# Fallback through pdf2image when calling pdftoppm directly fails.
# Note: pdf2image itself wraps the poppler binaries, so poppler must still be installed.
def pdf_to_png_fallback(pdf_path, dpi=200):
    from pdf2image import convert_from_path
    imgs = convert_from_path(pdf_path, dpi=dpi)
    png_path = os.path.splitext(pdf_path)[0] + ".png"
    imgs[0].save(png_path, "PNG")  # first page only
    return png_path
```
安装依赖:
```bash
pip install python-pptx pdf2image --break-system-packages
apt-get install -y poppler-utils 2>/dev/null || true
```
### 4.7 演讲者备注
为每张关键幻灯片在 `notes_slide` 中添加备注,包含:核心论点(1–2 句)、预期Q&A 回答、时间提示。
```python
def add_speaker_notes(slide, notes_text):
    notes_slide = slide.notes_slide  # created on first access
    tf = notes_slide.notes_text_frame
    tf.text = notes_text
```
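The three note components listed above (core claim, anticipated Q&A, timing hint) can be assembled into one string before it is handed to `add_speaker_notes`. A minimal sketch; the helper name `build_notes`, its labels, and the example text are assumptions for illustration, not part of the `beamer-ppt` skill.

```python
def build_notes(core_claim, qa_pairs=(), minutes=None):
    """Assemble speaker notes: core claim, anticipated Q&A, timing hint.

    Hypothetical helper; the labels ("Core claim", "Q", "A", "Time")
    are illustrative choices.
    """
    lines = [f"Core claim: {core_claim}"]
    for question, answer in qa_pairs:
        lines.append(f"Q: {question}")
        lines.append(f"A: {answer}")
    if minutes is not None:
        lines.append(f"Time: about {minutes:g} min")
    return "\n".join(lines)


# Placeholder content for illustration
notes_text = build_notes(
    "The policy raises the outcome; the estimate is stable across specifications.",
    qa_pairs=[("Could pre-trends drive this?", "Point to the event-study slide.")],
    minutes=1.5,
)
print(notes_text)
```

The resulting string would then be passed as the `notes_text` argument of `add_speaker_notes`.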
---
## Step 5:文件输出
### 5.1 文件目录
```
slides/
├── slides.pptx # Beamer 风格 PPTX(可用 PowerPoint / LibreOffice 编辑)
└── slides.pdf # PDF 版本(用于投稿、邮件分发、归档)
```
### 5.2 保存 PPTX 文件
```python
import os
output_dir = os.path.join(workspace, "slides")
os.makedirs(output_dir, exist_ok=True)
pptx_path = os.path.join(output_dir, "slides.pptx")
prs.save(pptx_path)
print(f"✅ PPTX 已保存:{pptx_path}")
print(f" 共 {len(prs.slides)} 张幻灯片")
```
### 5.3 导出 PDF
用 LibreOffice headless 将 PPTX 转为 PDF(推荐,保留字体和布局):
```python
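import os, shutil, subprocess

def pptx_to_pdf(pptx_path, output_dir):
    """Convert PPTX to PDF with LibreOffice in headless mode."""
    # The binary is called `libreoffice` or `soffice`, depending on the install
    office = shutil.which("libreoffice") or shutil.which("soffice")
    if office is None:
        print("⚠️ LibreOffice not found on PATH")
        return None
    result = subprocess.run(
        [office, "--headless", "--convert-to", "pdf",
         "--outdir", output_dir, pptx_path],
        capture_output=True, text=True
    )
    # LibreOffice writes <basename>.pdf into output_dir
    stem = os.path.splitext(os.path.basename(pptx_path))[0]
    pdf_path = os.path.join(output_dir, stem + ".pdf")
    if os.path.exists(pdf_path):
        print(f"✅ PDF exported: {pdf_path}")
        return pdf_path
    else:
        print(f"⚠️ LibreOffice conversion failed: {result.stderr}")
        return None

pdf_path = pptx_to_pdf(pptx_path, output_dir)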
```
若 LibreOffice 不可用,回退方案(需安装 `unoconv` 或 `soffice`):
```bash
# 检查 LibreOffice 是否可用
which libreoffice soffice 2>/dev/null || echo "未找到 LibreOffice"
# Ubuntu/Debian 安装
apt-get install -y libreoffice 2>/dev/null || true
```
### 5.4 验证检查
```python
import os
from pptx import Presentation
# 验证 PPTX
verify = Presentation(pptx_path)
assert len(verify.slides) > 0, "PPTX 文件为空!"
pptx_kb = os.path.getsize(pptx_path) // 1024
# 验证 PDF(若已生成)
pdf_ok = pdf_path and os.path.exists(pdf_path)
pdf_kb = os.path.getsize(pdf_path) // 1024 if pdf_ok else 0
print(f"✅ 验证通过")
print(f" PPTX:{len(verify.slides)} 张,{pptx_kb} KB")
if pdf_ok:
    print(f"  PDF : {pdf_kb} KB")
else:
    print("  PDF : not generated (export manually from PowerPoint)")
```
---
## Step 6:阶段确认与移交
> **演示文稿产出摘要**
>
> ✅ Beamer 风格幻灯片:`slides/slides.pptx`(共 [N] 张,上限 [L] 张)
> ✅ PDF 版本:`slides/slides.pdf`(用于投稿、邮件分发、归档)
> ✅ 主题:[A Metropolis / B 极简 / C Madrid],16:9 宽屏,海军蓝配色
> ✅ PPTX 可直接用 PowerPoint 或 LibreOffice Impress 打开和编辑
>
> ---
>
> **幻灯片质量自检(报告前建议核查):**
> - [ ] 总张数是否在上限以内(A ≤ 15 / B ≤ 30 / C ≤ 20)?
> - [ ] "This Paper" 幻灯片是否包含主系数数值和量级?
> - [ ] 回归表是否精简到 ≤ 3–4 列?首选规格列是否加粗高亮?
> - [ ] 最后一张是否为 Takeaways(不是 "Thank you / Questions?")?
> - [ ] 图形是否以 ≥ 200 DPI PNG 嵌入,清晰无锯齿?
> - [ ] 每张关键幻灯片是否有演讲者备注?
> - [ ] 总时长估算:[N 张] × 1.5 分钟/张 ≈ [总分钟数],是否合理?
---
## 常见问题处理
**Q:图形文件不存在(`figures/` 目录为空),如何处理?**
先运行 `/plot` 命令(Phase 7 的绘图输出),生成所有分析图形,再运行 `/present`。若用户希望先生成幻灯片框架,用占位矩形替代:
```python
# 占位图(灰色矩形 + 说明文字)
from pptx.util import Inches, Pt
ph = slide.shapes.add_shape(
1, Inches(1), Inches(1.5), Inches(11.33), Inches(4.5))
ph.fill.solid(); ph.fill.fore_color.rgb = RGBColor(200, 200, 200)
ph.line.fill.background()
tf = ph.text_frame
tf.paragraphs[0].text = "[图形占位:figures/figNN_name.png]"
tf.paragraphs[0].font.size = Pt(18)
tf.paragraphs[0].font.color.rgb = RGBColor(100, 100, 100)
```
**Q:python-pptx 未安装?**
```bash
pip install python-pptx pdf2image --break-system-packages
```
**Q:PDF 图形转换失败?**
检查是否安装了 poppler:`which pdftoppm`。若未安装:
```bash
# macOS
brew install poppler
# Ubuntu/Debian
apt-get install poppler-utils
```
若环境受限,直接将 PNG 图形(来自 `/plot` 阶段输出的 `figures/*.png`)嵌入,跳过 PDF→PNG 转换。
question
Help user transform their idea into a clear research question through interactive dialogue.
# /question — Research Question Scoping
When the user invokes `/question`, transform their preliminary idea into a confirmed, well-scoped research question through focused, multi-round conversation. The output is a single, clear research question statement.
---
## Step 1: Draw Out the Idea
Ask the user to describe their idea in 1–3 sentences. Frame it as low-stakes:
> "请用1–3句话描述你初步的研究想法。"
After they respond, identify the most unclear part and ask **one targeted follow-up** — focusing on whichever is most ambiguous:
- **谁受影响?**(研究对象:个体、企业、地区、国家?)
- **什么在变化?**(关键自变量或政策冲击是什么?)
- **关心什么结果?**(因变量:工资、健康、生产率、犯罪?)
Do not ask all three at once. One follow-up at a time.
---
## Step 2: Identify the Motivation
Once you understand the core idea, tell the user which type of research motivation this is. This helps them articulate the "why this paper" framing later.
| 类型 | 描述 | 例子 |
|------|------|------|
| **Type 1: 有趣的现象** | 数据中观察到的规律,需要解释 | 靠近高速公路的农村土地反而规模化更快 |
| **Type 2: 理论与现实的矛盾** | 理论预测X,但现实显示¬X | 最低工资理论上减少就业,但实证结果不一致 |
| **Type 3: 新制度/政策/技术** | 外生变化创造了研究机会 | 某省2015年推行宅基地改革 |
| **Type 4: 新数据** | 过去无法回答的问题现在数据可及 | 首次获得某平台的个体级交易数据 |
Say something like: *"你的研究动机最接近 Type X——[简短解释]。"*
---
## Step 3: Quick Check
Apply three quick checks. Flag any failure immediately.
**可检验 (Testable)**
Can the question be stated as "the effect of X on Y" ("X 对 Y 的影响"), where both X and Y are measurable variables?
- ✅ Pass: both X and Y can be operationalized as data columns
- ❌ Fail: question is too conceptual, normative, or tautological → help the user operationalize
**可识别 (Identifiable)**
Is there a plausible source of variation to answer this question credibly?
- ✅ Level A (因果): 有准实验变异(政策、阈值、工具变量)
- ✅ Level B (可信相关): 面板数据 + 固定效应 + 充分控制
- ⚠️ Level C (描述): 截面OLS,需降低因果主张的力度
- ❌ Fail: 完全内生,且无可用变异 → 需重新设计或降格为描述性研究
Be honest about the identification level. If it's Level C, say so clearly and note what it means for the kind of claims the paper can make.
**可操作 (Feasible)**
Is the required data obtainable and the method executable?
- Is there a publicly available dataset, or does the user have their own data?
- Is the scope answerable in one paper (one main question, not five)?
If any check fails, help the user either reformulate or scope down before proceeding.
---
## Step 4: Confirm the Research Question
Once the three checks pass (or the user accepts a scoped-down version), produce the confirmed research question in this format:
```
研究问题:[X] 如何影响 [Y]?
研究对象:[总体/单位],[时间/地域范围]
识别层级:Level [A/B/C]
数据来源(初步):[数据集名称或类型]
研究假设:
H₀:X 对 Y 无显著影响(β = 0)
H₁:X 对 Y 有 [正向/负向/显著] 影响(β ≠ 0)
预期方向:[理论或直觉上的预期,简要说明原因]
```
Then ask: *"这个研究问题符合你的想法吗?有什么需要调整的?"*
After the user explicitly confirms, save the output as `research-question.md` in your working directory, then move on to the next phase, `literature-review`.
---
## Common Reformulation Examples
**Too broad → Narrowed**
- ❌ "数字化对经济的影响"
- ✅ "工业机器人的普及是否降低了中国制造业城市的就业率(2000–2015)?"
**Fails identifiable → Redesigned**
- ❌ "ESG评分高的企业财务表现更好吗?"(OLS内生)
- ✅ "强制ESG披露政策(2008年深交所要求)是否改善了中小上市公司的财务绩效?"(DiD)
**Too vague → Operationalized**
- ❌ "教育对人的影响"
- ✅ "义务教育年限延长对个人30岁时工资的影响(利用教育改革作为工具变量)"
robustness
Phase 8 Robustness, Heterogeneity & Mechanism Tests. Reads model-spec.md, diagnostic_report.md, and results-memo.md to build a personalised checklist, then runs method-specific robustness checks, heterogeneity analysis, and mechanism tests; generates code; and produces robustness-report.md.
# /robustness — 稳健性、异质性与机制检验
## 定位
`/robustness` 是实证研究工作流的**第八阶段**,承接 Phase 6(`/code`)的 `diagnostic_report.md` 和 Phase 7(`results-analysis` skill)的 `results-memo.md`,完成三类核心检验:
1. **稳健性检验(Robustness Checks)**:验证主要结论在不同规格、样本和推断方法下的稳定性
2. **异质性分析(Heterogeneity Analysis)**:识别处理效应在不同子群体中的差异
3. **机制检验(Mechanism Tests)**:提供因果机制的间接证据,区分竞争性解释
本阶段产出 `robustness-report.md`,直接供 Phase 9(`/write`)的实证结果和稳健性讨论节使用。
---
## Step 0:读取上游输出,构建个性化检验清单
> 📎 **参见 [`shared/context-reader.md`](../shared/context-reader.md)**
> 本阶段所需文件:`model-spec.md`(必需)、`diagnostic_report.md`(可选)、`results-memo.md`(可选)。
### 0.2 提取关键信息
从各文件中提取以下信息,构建**个性化稳健性检验清单**:
| 来源 | 提取内容 | 用途 |
|------|---------|------|
| `model-spec.md §1` | 识别策略类型 | 选择方法特异性检验菜单 |
| `model-spec.md §3` | 识别假设及诊断状态(✅/⚠️/❌) | ⚠️ 项自动转为稳健性检验 |
| `model-spec.md §8` | 预列的稳健性检验清单 | 直接纳入执行清单 |
| `diagnostic_report.md` | 第二层统计诊断 ⚠️ 项 | 对应的稳健性处理方式 |
| `results-memo.md §6` | Phase 7 建议的优先检验 | 调整执行优先级 |
### 0.3 输出启动确认
向用户输出Phase 8 启动摘要:
```
═══════════════════════════════════════════════════════
Phase 8 启动确认
═══════════════════════════════════════════════════════
识别策略:[策略类型]
主系数:β̂ = [值](p = [值])
识别可信度(Phase 7):[高/中/低]
【已继承的待检验项】(来自上游阶段)
model-spec.md §8:[N1] 项预列稳健性检验
diagnostic_report.md ⚠️ 项:[N2] 项统计诊断警告
results-memo.md §6 优先建议:[N3] 项
【本阶段将执行】
① 方法特异性稳健性检验(见 Step 2)
② 异质性分析(见 Step 3)
③ 机制检验(见 Step 4)
═══════════════════════════════════════════════════════
```
待用户确认后继续,或询问是否有额外的稳健性检验需求。
---
## Step 1:构建稳健性检验主清单
综合三个来源,生成本次执行的**完整稳健性检验清单**,按优先级排序:
```
稳健性检验主清单
─────────────────────────────────────────────────────
编号 | 来源 | 检验类型 | 执行优先级
─────────────────────────────────────────────────────
R1 | 方法通用 | [具体检验名称] | 高(必须执行)
R2 | model-spec §8 | [具体检验名称] | 高(预列)
R3 | diagnostic ⚠️ | Cook's D 高影响观测值 | 高(已标记)
R4 | results §6 | [建议检验名称] | 中(建议执行)
R5 | 主动设计 | [检验名称] | 中(主动发现)
...
─────────────────────────────────────────────────────
```
**主动性原则(Activeness)**:在继承上游清单基础上,主动识别用户未明确提出但**审稿人很可能质疑**的检验点,自动补充到清单并标注"主动设计"。
---
## Step 1.5:写入上下文 → 自动调用 Checker Agent(并行执行)
> **此步骤是 Steps 2–4 的替代入口**。完成检验清单后,将上下文序列化并委托 `checker` agent 并行执行三轨检验,而非在主线程中串行运行 Steps 2/3/4。
### 1.5.1 序列化检验上下文
将 Step 0–1 提取的信息写入 `phase8/context.json`:
```json
{
"data_path": "<来自 model-spec.md 的数据路径>",
"depvar": "<因变量>",
"treatment": "<处理变量>",
"controls": ["<控制变量列表>"],
"fe": ["<固定效应>"],
"cluster": "<聚类层级>",
"method": "<识别策略:OLS/IV/DID/RDD/Panel FE>",
"baseline_coef": <来自 results-memo.md 的基准系数>,
"baseline_se": <基准标准误>,
"N": <样本量>,
"checklist": {
"robustness": ["R1: ...", "R2: ...", "R3: ..."],
"heterogeneity": ["H1: ...", "H2: ..."],
"mechanism": ["M1: ...", "M2: ..."]
}
}
```
`checklist` 字段直接来自 Step 1 构建的个性化清单,供各 subagent 优先执行用户研究特定的检验。
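Before dispatching the subagents, it can help to check that the serialized context is complete. A minimal sketch using only the standard library; the key names come from the JSON template above, while the function name `validate_context` and the example values (including the file path) are hypothetical placeholders.

```python
import json

REQUIRED_KEYS = ("data_path", "depvar", "treatment", "controls", "fe",
                 "cluster", "method", "baseline_coef", "baseline_se", "N")
CHECKLIST_TRACKS = ("robustness", "heterogeneity", "mechanism")


def validate_context(ctx):
    """Return a list of problems found in a parsed phase8/context.json dict."""
    problems = [f"missing key: {k}" for k in REQUIRED_KEYS if k not in ctx]
    checklist = ctx.get("checklist", {})
    for track in CHECKLIST_TRACKS:
        if not checklist.get(track):  # absent or empty list
            problems.append(f"empty checklist track: {track}")
    return problems


# Placeholder values for illustration only
example = json.loads("""{
  "data_path": "data/clean/panel.csv", "depvar": "y", "treatment": "d",
  "controls": ["x1"], "fe": ["id", "year"], "cluster": "id", "method": "DID",
  "baseline_coef": 0.1, "baseline_se": 0.05, "N": 1000,
  "checklist": {"robustness": ["R1: placebo"], "heterogeneity": ["H1: size"],
                "mechanism": []}
}""")
problems = validate_context(example)
print(problems)
```

An empty list means the context is ready; otherwise the listed items can be raised with the user before any subagent is launched.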
### 1.5.2 读取 Checker Agent 规格
```
Read: agents/checker.md
```
### 1.5.3 按 Checker agent spec 执行并行派发
严格按照 `agents/checker.md` 的 **Step 3** 规定,在**单条消息**中同时发出三个 Agent tool call:
- **Agent A**:稳健性检验 → 写入 `phase8/robustness/`
- **Agent B**:异质性分析 → 写入 `phase8/heterogeneity/`
- **Agent C**:机制检验 → 写入 `phase8/mechanism/`
每个 subagent 的 prompt 应包含:
1. 指向 `phase8/context.json` 的读取指令
2. 来自 `checklist` 字段的个性化检验项(优先于 `agents/checker.md` 中的通用菜单)
3. 来自 Steps 2–4 技术规范的对应方法代码(按 `method` 字段选择对应小节)
### 1.5.4 等待汇总
三路 subagent 完成后,由 checker 执行其 Step 4(汇总 `phase8_report.md`)和 Step 5(呈现给用户确认)。
> **Steps 2–4 以下内容为各 subagent 可调用的技术规范库**,不在主线程中串行执行。
---
## Step 2:方法特异性稳健性检验
根据 `model-spec.md §1` 中的识别策略,执行对应检验菜单。**同一研究中可能同时适用多个策略的检验**(如 DiD + IV 组合设计)。
---
### 2.A OLS / 截面回归稳健性
适用于主方程为 OLS 的情形。
**1. 控制变量敏感性(Oster 2019 系数稳定性)**
逐步添加控制变量,并计算 Oster δ 统计量:
```python
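# Linear approximation to Oster's (2019) δ: the degree of selection on unobservables,
# relative to observables, that would drive the bias-adjusted coefficient to zero.
# Inputs (assumed available from the two regressions):
#   beta_baseline, r2_baseline: coefficient and R² without controls
#   beta_controlled, r2_controlled: coefficient and R² with the full control set
# Convention: R²_max = 1.3 × R²_controlled, capped at 1
r2_max = min(1.3 * r2_controlled, 1.0)
delta_oster = (beta_controlled * (r2_controlled - r2_baseline)) / (
    (beta_baseline - beta_controlled) * (r2_max - r2_controlled))
print(f"Oster δ ≈ {delta_oster:.3f} (|δ| > 1: unobservables would need to matter more than observables to explain away the effect)")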
```
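For reference, a sketch of the linear approximation commonly used to compute δ by hand. The notation is an assumption of this note: $\beta^{\circ}$ and $R^{\circ}$ are the coefficient and R² without controls, $\tilde{\beta}$ and $\tilde{R}$ are those with the full control set. An exact calculation (for example Stata's `psacalc`) solves a nonlinear equation and can differ from this approximation.

```latex
% Bias-adjusted coefficient (linear approximation)
\begin{equation*}
\beta^{*} \approx \tilde{\beta}
  - \delta\,(\beta^{\circ} - \tilde{\beta})\,
    \frac{R_{\max} - \tilde{R}}{\tilde{R} - R^{\circ}}
\end{equation*}
% Setting \beta^{*} = 0 and solving for \delta
\begin{equation*}
\delta \approx
  \frac{\tilde{\beta}\,(\tilde{R} - R^{\circ})}
       {(\beta^{\circ} - \tilde{\beta})\,(R_{\max} - \tilde{R})},
\qquad R_{\max} = \min(1.3\,\tilde{R},\ 1)
\end{equation*}
```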
**2. 替代标准误**
并排报告 HC1、HC3 和聚类 SE:
```python
# Python — statsmodels
import statsmodels.formula.api as smf
m_ols = smf.ols(formula, data).fit()
m_hc1 = smf.ols(formula, data).fit(cov_type='HC1')
m_hc3 = smf.ols(formula, data).fit(cov_type='HC3')
m_clus = smf.ols(formula, data).fit(cov_type='cluster',
cov_kwds={'groups': data[cluster_var]})
```
**3. 异常值剔除**
- 剔除结果变量 Y 前后 1% 极端值
- 剔除 `diagnostic_report.md` 中 Cook's D 超阈值的高影响观测值
**4. 函数形式检验**
- RESET 检验(来自 Phase 6 诊断;若 ⚠️ 则此处测试对数变换规格)
- 对数变换:`log(Y)` 和 `log(D)`(若 Y/D 右偏分布)
- 二次项:在主方程中加入 $D^2$,检验非线性效应
**5. 子样本稳健性**
按关键维度分组估计,检验主效应是否由特定子样本驱动:
- 地理维度(如东/西部;高/低收入地区)
- 时间维度(前半段 vs. 后半段样本期)
- 规模维度(大/小企业;高/低收入群体)
---
### 2.B 双重差分(DiD / TWFE)稳健性
适用于主方程为 DiD 或交错 DiD 设计。
**1. 预趋势检验(最高优先级)**
事件研究图 + 预处理期系数联合 F 检验(此项若在 Phase 6 已通过仍须在报告中呈现):
```stata
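* Stata: reghdfe + coefplot
* Factor variables cannot take negative values, so shift event time
* (this sketch assumes the event window starts at -4)
gen et = event_time + 4
* Base period t = -1 (et = 3); give never-treated units a non-missing et so they stay in the sample
reghdfe Y ib3.et#1.treated controls, ///
absorb(id year) cluster(id)
* Joint test that the pre-period coefficients (t = -4, -3, -2) are zero
test 0.et#1.treated 1.et#1.treated 2.et#1.treated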
```
**2. 安慰剂处理时点(Placebo in Time)**
将处理时点提前 1–2 年重新估计;安慰剂效应应不显著:
```python
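# Keep only periods before the true treatment date, so any "effect" is spurious by construction
df_pre = df[df[time_var] < treatment_year].copy()
# Move the post indicator one year earlier (placebo)
df_pre['post_placebo'] = (df_pre[time_var] >= treatment_year - 1).astype(int)
df_pre['treat_post_placebo'] = df_pre['treated'] * df_pre['post_placebo']
# Re-estimate the main equation on df_pre with treat_post_placebo; the coefficient should be close to zero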
```
**3. 替代控制组**
- 限制为处理时点前特征更相似的对照单元(倾向得分匹配后构建控制组)
- 剔除政策可能有溢出效应的相邻单元
**4. 交错 DiD 估计量比较(Staggered Adoption)**
若存在交错处理,对比 TWFE 与异质性稳健估计量:
```r
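# R: TWFE vs. two heterogeneity-robust estimators, plus the Bacon decomposition
library(did); library(fixest)  # sunab() ships with fixest; there is no separate sunab package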
# (1) TWFE(可能有负权重偏误)
m_twfe <- feols(Y ~ treat_post | id + year,
cluster = ~id, data = df)
# (2) Callaway-Sant'Anna
m_cs <- att_gt(yname="Y", tname="year", idname="id",
gname="first_treated", data=df)
m_cs_agg <- aggte(m_cs, type="simple")
# (3) Sun-Abraham
m_sa <- feols(Y ~ sunab(first_treated, year) | id + year,
cluster = ~id, data = df)
# (4) Bacon 分解 — 检查负权重比例
library(bacondecomp)
bacon_result <- bacon(Y ~ treat_post, data=df,
id_var="id", time_var="year")
```
结果比较标准:若 TWFE 与 C-S / S-A 主系数差异 > 20%,优先使用异质性稳健估计量,并在论文中解释。
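The 20% comparison rule can be made explicit in a few lines. A minimal sketch, assuming the gap is measured relative to the heterogeneity-robust estimate (the text above does not fix the denominator); the function name `twfe_gap` and the input values are hypothetical.

```python
def twfe_gap(beta_twfe, beta_robust, threshold=0.20):
    """Relative gap between TWFE and a heterogeneity-robust estimate.

    Assumption: the gap is scaled by |beta_robust|.
    Returns (gap, prefer_robust).
    """
    if beta_robust == 0:
        raise ValueError("robust estimate is zero; relative gap undefined")
    gap = abs(beta_twfe - beta_robust) / abs(beta_robust)
    return gap, gap > threshold


# Illustrative coefficients only
gap, prefer_robust = twfe_gap(beta_twfe=0.30, beta_robust=0.40)
print(f"relative gap = {gap:.0%}, prefer robust estimator: {prefer_robust}")
```

When the two estimates have opposite signs the gap exceeds 100%, which the rule treats as a clear case for the robust estimator.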
**5. 推断方法稳健性**
- 若聚类数 < 30:报告 Wild Cluster Bootstrap p 值
```r
library(fwildclusterboot)
boot_res <- boottest(m_twfe, clustid = "id",
param = "treat_post", B = 9999)
print(boot_res)
```
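Whether the wild cluster bootstrap is needed depends only on the number of distinct clusters. A minimal sketch of that decision; the function name `inference_method`, the returned labels, and the example cluster ids are hypothetical, and the cutoff of 30 is the one stated above.

```python
def inference_method(cluster_ids, min_clusters=30):
    """Count distinct clusters and pick the inference method to report."""
    n_clusters = len(set(cluster_ids))
    if n_clusters < min_clusters:
        method = "wild cluster bootstrap"
    else:
        method = "cluster-robust SE"
    return n_clusters, method


# Hypothetical cluster labels: five observations in three clusters
n, method = inference_method(["A", "A", "B", "C", "C"])
print(n, method)
```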
---
### 2.C 工具变量(IV / 2SLS)稳健性
适用于主方程为 2SLS 设计。
**1. 工具变量强度诊断**
```stata
* Stata — ivreg2 完整诊断
ivreg2 Y controls (D = Z), robust ffirst
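* Report: first-stage F, Kleibergen-Paap rk Wald F, Cragg-Donald F
* Stock-Yogo (2005) critical value for 10% maximal IV size: 16.38 (one endogenous regressor, one instrument)
* Weak-instrument-robust confidence interval (Anderson-Rubin); requires: ssc install weakiv
weakiv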
```
**2. 2SLS vs. LIML vs. GMM comparison**
With weak instruments, LIML tends to be less biased than 2SLS in finite samples:
```stata
ivregress 2sls Y controls (D = Z), robust // 2SLS
ivregress liml Y controls (D = Z), robust // LIML
ivregress gmm Y controls (D = Z), wmatrix(robust) // GMM
```
**3. 排他性限制的间接检验(Reduced Form 对比)**
```stata
* 简约式(Reduced Form)应显著
reg Y Z controls, robust // 简约式
reg D Z controls, robust // 第一阶段
* IV 估计量 = 简约式系数 / 第一阶段系数
* 若简约式不显著但 IV 显著,说明工具变量可能无效
```
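The ratio identity in the comments above holds exactly when there is one instrument and one endogenous regressor. A minimal pure-Python sketch without controls; the toy data and helper names are hypothetical and serve only to show that the reduced-form slope divided by the first-stage slope equals the IV estimate cov(Z, Y) / cov(Z, D).

```python
def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def slope(y, x):
    """OLS slope from a bivariate regression of y on x."""
    return cov(x, y) / cov(x, x)


# Toy data for illustration only
z = [0, 0, 1, 1, 0, 1]                # instrument
d = [1.0, 2.0, 3.0, 4.0, 1.5, 3.5]    # endogenous regressor
y = [2.0, 3.0, 6.0, 7.5, 2.5, 6.5]    # outcome

first_stage = slope(d, z)             # D on Z
reduced_form = slope(y, z)            # Y on Z
beta_wald = reduced_form / first_stage
beta_iv = cov(z, y) / cov(z, d)       # just-identified IV estimator
print(f"Wald ratio = {beta_wald:.4f}, IV = {beta_iv:.4f}")
```

With controls the same identity applies after partialling the controls out of Y, D, and Z.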
**4. 安慰剂工具变量**
使用一个理论上应无第一阶段效应的变量代替 Z,验证排他性:
- 若安慰剂 IV 估计量显著,说明主工具变量可能违反排他性
**5. 控制变量集合敏感性**
在第一阶段和第二阶段中分别增减控制变量,检验 IV 估计量的稳定性。
---
### 2.D 断点回归(RDD)稳健性
适用于 Sharp 或 Fuzzy RDD 设计。
**1. 带宽敏感性(最重要的稳健性检验)**
在 MSE 最优带宽的 50%、75%、100%、125%、150% 下并排报告:
```r
# R — rdrobust
library(rdrobust)
bw_opt <- rdbwselect(Y, X, c=cutoff)$bws[1, 1]  # row "mserd", column "h (left)"
for (mult in c(0.5, 0.75, 1.0, 1.25, 1.5)) {
rdd_res <- rdrobust(Y, X, c=cutoff, h=bw_opt*mult)
cat(sprintf("h = %.2f×opt: β = %.4f (SE = %.4f, p = %.3f)\n",
mult, rdd_res$coef[1],
rdd_res$se[3], rdd_res$pv[3]))
}
```
**2. 多项式阶次敏感性**
对比局部线性(p=1)、局部二次(p=2)、局部三次(p=3):
```r
for (p in 1:3) {
rdd_res <- rdrobust(Y, X, c=cutoff, p=p)
cat(sprintf("p=%d: β = %.4f\n", p, rdd_res$coef[1]))
}
```
**3. 核函数敏感性**
三角核(triangular)、矩形核(uniform)、Epanechnikov 核结果对比。
**4. Donut-Hole 检验**
剔除阈值附近最近的 $\delta$ 个单位($\delta = 1, 2, 3$),检验精确操纵:
```r
# 检验是否有精确堆积在 cutoff 处(刚好等于阈值的观测值)
for (donut in c(1, 2, 3)) {
df_donut <- df[abs(df$X - cutoff) > donut, ]
rdd_donut <- rdrobust(df_donut$Y, df_donut$X, c=cutoff)
cat(sprintf("Donut Δ=%d: β = %.4f\n", donut, rdd_donut$coef[1]))
}
```
**5. 安慰剂截断点(Placebo Cutoffs)**
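Use the 25%, 50%, and 75% quantiles of the running variable below the true cutoff as fake cutoffs (and repeat on the sample above the cutoff); the estimated effect should be zero: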
```r
placebo_cutoffs <- quantile(df$X[df$X < cutoff], c(0.25, 0.5, 0.75))
for (c_fake in placebo_cutoffs) {
rdd_p <- rdrobust(df$Y[df$X < cutoff],
df$X[df$X < cutoff], c=c_fake)
cat(sprintf("Placebo c=%.2f: β = %.4f (p = %.3f)\n",
c_fake, rdd_p$coef[1], rdd_p$pv[3]))
}
```
**6. 预定协变量平衡(Covariate Balance at Cutoff)**
对所有预处理协变量做 RDD,估计量应接近零:
- 若协变量不平衡,说明存在操纵或其他处理同时发生
---
### 2.E 面板固定效应稳健性
适用于面板 TWFE 主方程。
**1. 聚类层级比较**
并排报告不同聚类层级的标准误:
- 个体层级聚类 vs. 行业/省级聚类 vs. 双向聚类(个体 + 时间)
**2. 样本时间窗口敏感性**
缩短/延长样本期,检验是否由特定年份驱动:
- 逐年剔除(Jackknife by year)
**3. 逐单位 Jackknife**
逐个剔除个体单元,检验是否存在极端影响的单元:
```python
import numpy as np

jackknife_coefs = []
for unit in df[id_var].unique():
    df_jack = df[df[id_var] != unit]  # drop one unit at a time
    model_j = run_model(df_jack)  # call the main-equation estimation function
    jackknife_coefs.append(model_j.params[D_var])
print(f"Jackknife coefficient range: [{min(jackknife_coefs):.4f}, {max(jackknife_coefs):.4f}]")
print(f"Jackknife mean: {np.mean(jackknife_coefs):.4f} (main coefficient: {beta_main:.4f})")
```
**4. 固定效应规格对比**
| 规格 | FE 层级 | β̂ | SE | N |
|------|---------|---|---|---|
| 仅个体 FE | $\alpha_i$ | ... | ... | ... |
| 双向 FE(主规格)| $\alpha_i + \lambda_t$ | ... | ... | ... |
| 个体 + 行业×年 FE | $\alpha_i + \delta_{j,t}$ | ... | ... | ... |
**5. 动态规格(加入滞后因变量)**
若主规格不含滞后因变量,加入 $Y_{i,t-1}$ 检验估计量稳定性(但注意 Nickell 偏误)。
---
### 2.F 合成控制稳健性
适用于合成控制(SCM)主方案。
**1. 供体池敏感性**
- 逐一剔除供体单元,检验合成控制估计是否对特定供体依赖过强(Leave-one-out)
- 扩大/缩小供体池条件(地理 / 经济相似性标准)
**2. 安慰剂检验(Permutation Test)**
对供体池每个单元重复合成控制,计算"虚假处理效应"分布,报告:
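- p-value = share of units (treated unit plus donors) whose post/pre-treatment MSPE ratio is at least as large as the treated unit's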
- 可视化:安慰剂效应路径图(灰色)+ 处理单元路径(黑色加粗)
```r
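library(Synth); library(SCtools)  # generate.placebos(), plot_placebos(), mspe.test() come from SCtools
# synth_data: output of dataprep(); synth_out: output of synth()
placebos <- generate.placebos(synth_data, synth_out, strategy="multisession")
plot_placebos(placebos)
mspe.test(placebos)  # permutation p-value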
```
**3. 预测变量权重矩阵(V 矩阵)敏感性**
分别使用:等权重、主成分确定权重、PCA 降维后的权重,对比合成路径拟合质量(pre-RMSPE)和处理效应。
---
## Step 3:异质性分析(Heterogeneity Analysis)
**异质性分析是 Top-5 期刊近年日益强调的内容**,要求从理论假设出发设计分组,而非事后数据挖掘。
### 3.1 异质性分析设计原则
设计异质性检验前,必须在 `robustness-report.md` 中**先写出理论预测**:
```
异质性分析设计记录
─────────────────────────────────────────────────────
分组变量:[变量名]
理论预测:[为什么该变量应调节处理效应大小或方向?]
预期模式:[预期哪个子组效应更大/更小/方向相反?]
检验方式:[交互项 / 子组分别估计]
─────────────────────────────────────────────────────
```
**无理论支撑的事后分组分析不得纳入论文正文**;若需报告,只能放入附录并标注为"探索性"。
### 3.2 子组分组估计
根据研究背景和理论,选择 2–3 个最重要的分组维度:
```python
# Python: subgroup-by-subgroup estimation
heterogeneity_vars = ['group_A', 'group_B', 'group_C']  # replace with the actual grouping variables
subgroup_results = {}
for var in heterogeneity_vars:
    for val in df[var].unique():
        df_sub = df[df[var] == val]
        res_sub = run_main_model(df_sub)
        subgroup_results[f"{var}={val}"] = {
            'coef': res_sub.params[D_var],
            'se': res_sub.bse[D_var],
            'n': len(df_sub)
        }
```stata
* Stata — 分组估计 + 系数对比(附交互项显著性检验)
* 高组
reghdfe Y D controls if group == 1, absorb(id year) cluster(id)
estimates store m_high
* 低组
reghdfe Y D controls if group == 0, absorb(id year) cluster(id)
estimates store m_low
* 交互项检验(形式上等价的 Wald 检验)
reghdfe Y c.D##i.group controls, absorb(id year) cluster(id)
test c.D#1.group // 交互项是否显著
```
### 3.3 交互项规格(首选方法)
**推荐做法**:在主方程中加入 $D \times \text{Moderator}$ 交互项,而非单独分组估计,以正式检验异质性的统计显著性。
```latex
% 含调节变量的主方程扩展
\begin{equation}\label{eq:heterogeneity}
Y_{it} = \alpha_i + \lambda_t
+ \beta_1 D_{it}
+ \beta_2 D_{it} \times M_{it}
+ \beta_3 M_{it}
+ \mathbf{X}_{it}'\boldsymbol{\gamma}
+ \varepsilon_{it}
\end{equation}
% M_{it}:调节变量(分组变量)
% \beta_2:异质性处理效应估计量(审稿人重点关注此系数)
% 若 M 为虚拟变量,\beta_1 为 M=0 时的效应;\beta_1+\beta_2 为 M=1 时的效应
```
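When M is a dummy, the effect for the M = 1 group is the sum of the two coefficients, and its standard error needs their covariance. A minimal sketch of that calculation; the function name `effect_at_m1` and the numbers are hypothetical, and the variances and covariance are assumed to come from the estimated (clustered) covariance matrix of the interaction model.

```python
import math


def effect_at_m1(b1, b2, var_b1, var_b2, cov_b1_b2):
    """Point estimate and SE of beta_1 + beta_2 (the effect when M = 1).

    Var(b1 + b2) = Var(b1) + Var(b2) + 2 Cov(b1, b2).
    """
    variance = var_b1 + var_b2 + 2.0 * cov_b1_b2
    if variance < 0:
        raise ValueError("implied variance is negative; check the inputs")
    return b1 + b2, math.sqrt(variance)


# Illustrative numbers only
est, se = effect_at_m1(b1=0.10, b2=0.05,
                       var_b1=0.0004, var_b2=0.0009, cov_b1_b2=-0.0002)
print(f"effect at M=1: {est:.3f} (SE {se:.3f})")
```

Ignoring the covariance term and simply adding the two standard errors would overstate or understate the uncertainty, depending on its sign.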
### 3.4 常见异质性维度(按识别策略推荐)
| 识别策略 | 建议分组维度 |
|---------|-----------|
| DiD(政策评估)| 政策合规强度高/低;处理密度强/弱;首次处理 vs. 后续处理(SUTVA 测试)|
| RDD | 阈值附近更靠近处理区 vs. 控制区(连续度量);人口密度;城乡差异 |
| IV | Compliers 子群体特征(低/高 first-stage 值);工具变量强度的空间分布 |
| Panel FE | 行业周期敏感度;企业规模;地区市场化程度 |
---
## Step 4:机制检验(Mechanism Tests)
机制检验的目标是**提供支持特定因果渠道的间接证据**,而非直接证明机制(通常不可能)。
### 4.1 机制检验设计框架
在执行机制检验前,先写出竞争性机制的逻辑链:
```
机制假设分析
═══════════════════════════════════════════════════════
主要机制假设:
D → [中间变量 M] → Y
理论基础:[经济学/制度学解释]
可观测预测:[若此机制成立,应观察到什么额外规律?]
竞争性机制(需要排除):
D → Y(直接效应,绕过 M)
D → [其他中间变量 M'] → Y
排除方法:[如何区分?]
═══════════════════════════════════════════════════════
```
### 4.2 中介变量分析(Mediation Analysis)
**注意**:社会科学中的"中介分析"通常仅为描述性,因为中介变量本身往往是内生的。应明确声明局限性。
```python
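# Three-equation mediation framework (Baron-Kenny 1986; state its limitations in the paper)
# OLS is shown for illustration; substitute the main-equation estimator (FE, clustering) as needed.
# Assumed names: Y_var, D_var, M_var are column names; controls is a list of control-variable names.
import statsmodels.formula.api as smf
rhs = " + ".join(controls)
# Equation 1: D → Y (total effect)
total_effect = smf.ols(f"{Y_var} ~ {D_var} + {rhs}", data=df).fit()
# Equation 2: D → M (effect of D on the mediator)
first_stage_M = smf.ols(f"{M_var} ~ {D_var} + {rhs}", data=df).fit()
# Equation 3: D + M → Y (effect of D conditional on M)
direct_effect = smf.ols(f"{Y_var} ~ {D_var} + {M_var} + {rhs}", data=df).fit()
# If the coefficient on D shrinks in Equation 3, M partially mediates D → Y.
# Caveat: M is itself likely endogenous, so this evidence is descriptive.
print(f"Total effect: {total_effect.params[D_var]:.4f}")
print(f"Direct effect (controlling for M): {direct_effect.params[D_var]:.4f}")
print(f"Indirect effect (approx.): {total_effect.params[D_var] - direct_effect.params[D_var]:.4f}")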
```
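The three-equation logic can be checked on simulated data. The sketch below is an illustration under assumed values, not the project's estimator: the data-generating process, the `ols` helper, and all parameter values are made up. It shows that in linear OLS with the same sample and regressors, the "difference" measure (total minus direct) coincides with the "product" measure (the D→M coefficient times the M→Y coefficient).

```python
import numpy as np

def ols(y, X):
    """OLS coefficients with an intercept prepended (index 0 is the constant)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    return beta

rng = np.random.default_rng(0)
n = 5_000
D = rng.binomial(1, 0.5, n).astype(float)
M = 0.8 * D + rng.normal(size=n)             # assumed D -> M path: a = 0.8
Y = 0.3 * D + 0.5 * M + rng.normal(size=n)   # assumed direct effect 0.3, M -> Y path b = 0.5

total = ols(Y, D)[1]                          # Equation 1: total effect
a = ols(M, D)[1]                              # Equation 2: D -> M
coefs = ols(Y, np.column_stack([D, M]))       # Equation 3: Y on D and M
direct, b = coefs[1], coefs[2]

indirect_diff = total - direct                # difference method
indirect_prod = a * b                         # product-of-coefficients method
share_mediated = indirect_diff / total
```

The simulation makes M exogenous given D by construction. In real data that rarely holds, which is why the decomposition is descriptive only.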
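### 4.3 Ruling Out Competing Explanations
For each competing mechanism, design a direct test to rule it out:
| Competing explanation | Test | Expected result (if the explanation does not hold) |
|----------|---------|----------------------|
| [Explanation A] | [test method] | [effect should be zero / opposite in sign] |
| [Explanation B] | [test method] | [effect should be zero] |
### 4.4 Validity Check for Mechanism Tests
**Key principle**: if the intermediate variable M lies on the causal path D→Y (that is, M is an outcome of treatment), M must **never** be controlled for in the main regression; doing so introduces over-control bias. Mechanism tests should be presented as separate regression equations.
```
Validity check
□ Is the mediator M controlled for in the main regression?
→ If so, remove it immediately. In the main equation M is a post-treatment "bad control" (a mediator, and possibly also a collider)
□ Does the mechanism analysis state the "descriptive evidence" limitation?
□ Do the mechanism regressions use the same standard-error strategy as the main equation?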
```
---
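## Step 5: Code Generation and Execution
### 5.1 Confirm the Software
Robustness code uses the same software as Phase 6. If the upstream files do not record it, ask the user:
> *"Please confirm the software for the robustness code:*
> *A. Python (run automatically) / B. R (run locally) / C. Stata (run locally)"*
### 5.2 Code File Layout
```
code/
├── 05_robustness.[py|R|do] # method-specific robustness checks
├── 06_heterogeneity.[py|R|do] # heterogeneity analysis
└── 07_mechanism.[py|R|do] # mechanism tests
```
Each script starts with the same file header format used in Phase 6:
```python
# ============================================================
# Project: [research question in one sentence]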
# Phase: 8 — Robustness, Heterogeneity & Mechanism
# Script: 05_robustness.py / 06_heterogeneity.py / 07_mechanism.py
# Strategy: [identification strategy]
# Date: [YYYY-MM-DD]
# Input: data/clean/[file name], tables/table_main.csv
# Output: tables/table_robustness.tex/.csv
# tables/table_heterogeneity.tex/.csv
# tables/table_mechanism.tex/.csv
# figures/fig_robustness_coefplot.png
# ============================================================
```
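### 5.3 Execution Rules
**Python (run automatically)**:
```bash
pip install pyfixest linearmodels rdrobust wildboottest doubleml --quiet
python code/05_robustness.py
python code/06_heterogeneity.py
python code/07_mechanism.py
```
**R / Stata (run locally by the user)**:
Generate the scripts in the `code/` directory and ask the user to run them locally and upload the results.
### 5.4 Robustness Results Checklist
```
=== Phase 8 checklist ===
□ Does the main coefficient stay significant with a consistent sign across robustness checks?
□ If the coefficient changes materially under some specification, has the reason been explained explicitly?
□ Do the heterogeneity interaction terms have theoretical support (to avoid data mining)?
□ Does the mechanism analysis avoid controlling for the mediator?
□ Does the robustness table have both .tex and .csv outputs?
□ Has the coefficient comparison plot (coefplot) been generated?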
```
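The first two checklist items (sign consistency, significance, and which specifications need an explanation) can be screened mechanically before writing the report. This is a minimal sketch with hypothetical names (`summarize_robustness`, the `spec`/`coef`/`p` keys) and an assumed 5% significance threshold; it does not replace reading each specification.

```python
def summarize_robustness(main_coef, results, alpha=0.05):
    """Screen robustness results given as dicts with keys 'spec', 'coef', 'p'."""
    same_sign = [r for r in results if r["coef"] * main_coef > 0]
    significant = [r for r in results if r["p"] < alpha]
    coefs = [r["coef"] for r in results]
    return {
        "n_specs": len(results),
        "sign_consistent": len(same_sign),
        "significant": len(significant),
        "coef_range": (min(coefs), max(coefs)),
        # Specifications that flip sign or lose significance need an explicit explanation
        "flagged": [r["spec"] for r in results
                    if r["coef"] * main_coef <= 0 or r["p"] >= alpha],
    }

# Example with made-up numbers
example = summarize_robustness(0.12, [
    {"spec": "HC3 SE", "coef": 0.11, "p": 0.01},
    {"spec": "Log Y", "coef": 0.02, "p": 0.30},
    {"spec": "Placebo", "coef": -0.01, "p": 0.80},
])
```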
---
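## Step 6: Produce the Robustness Summary Table (LaTeX)
### 6.1 Robustness Table Format
Robustness tables are usually wide, so **use landscape layout**, with specifications as rows and the numeric fields (β̂, SE, N) as columns: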
```latex
\begin{landscape}
\begin{table}[ht]
\centering
\caption{\textbf{Robustness Checks}}
\label{tab:robustness}
\begin{threeparttable}
{\footnotesize\setlength{\tabcolsep}{5pt}
\begin{tabular}{l p{5.5cm} c c c c c}
\toprule
& & \multicolumn{5}{c}{Dependent Variable: \textit{[Y]}} \\
\cmidrule(lr){3-7}
Spec. & Description & $\hat{\beta}$ & SE & $p$-val & $N$ & $R^2$ \\
\midrule
\textbf{(1)} & \textbf{Main specification} & & & & & \\
\midrule
\multicolumn{7}{l}{\textit{Panel A: Standard Error Alternatives}} \\
(2) & HC3 robust SE & & & & & \\
(3) & Two-way clustered SE & & & & & \\
\midrule
\multicolumn{7}{l}{\textit{Panel B: Sample Restrictions}} \\
(4) & Exclude top/bottom 1\% of $Y$ & & & & & \\
(5) & Exclude high Cook's $D$ obs. & & & & & \\
\midrule
\multicolumn{7}{l}{\textit{Panel C: Alternative Specifications}} \\
(6) & Log transformation of $Y$ & & & & & \\
(7) & Add quadratic $D^2$ & & & & & \\
\midrule
\multicolumn{7}{l}{\textit{Oster (2019) Coefficient Stability}} \\
\quad $\delta$ &
\multicolumn{6}{l}{$\delta = [VALUE]$;\quad
$|\delta| \gg 1$ implies OVB implausibly large to explain result.} \\
\bottomrule
\end{tabular}}
\begin{tablenotes}[flushleft]\small
\item \textit{Notes}: [sample description]. SE clustered at [level] unless stated.
$^{***}p<0.01$, $^{**}p<0.05$, $^{*}p<0.10$.
\end{tablenotes}
\end{threeparttable}
\end{table}
\end{landscape}
```
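Rows of a table with this column layout (specification number, description, coefficient, SE, p-value, N, R²) can be generated from stored results instead of typed by hand. A minimal sketch, assuming three-decimal rounding and the star thresholds from the table notes; `stars` and `robustness_row` are hypothetical helper names, and the description string is assumed to be LaTeX-safe already.

```python
def stars(p):
    """Significance stars matching the table notes: *** 1%, ** 5%, * 10%."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""

def robustness_row(num, desc, coef, se, p, n, r2):
    """One LaTeX row: Spec. & Description & beta-hat & SE & p-val & N & R2."""
    return (f"({num}) & {desc} & ${coef:.3f}^{{{stars(p)}}}$ & {se:.3f} "
            f"& {p:.3f} & {n:,} & {r2:.3f} \\\\")

# Example with made-up numbers
row = robustness_row(2, "HC3 robust SE", 0.1234, 0.0401, 0.002, 12345, 0.4567)
```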
**Oster δ formatting** (continuing the existing design):
If the δ text runs past one line, move it into the tablenotes and keep only the value in the table cell:
```latex
\quad $\delta$ & \multicolumn{6}{l}{$\delta = [VALUE]$} \\
% In tablenotes:
\item Oster (2019) $\delta$: ratio of selection on unobservables to observables
needed to fully explain the result with omitted variable bias.
$|\delta| \gg 1$ implies the result is robust to OVB.
```
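### 6.2 Heterogeneity Results Table
Standard layout for the heterogeneity table: full-sample effect, subgroup effects, and an interaction column:
```latex
% Column layout of the heterogeneity table (example)
\begin{tabular}{lcccc}
\toprule
& Full Sample & High [M] & Low [M] & Interaction \\
\midrule
$D$ (treatment) & [β̂] & [β̂_H] & [β̂_L] & [β̂_1] \\
& ([SE]) & ([SE]) & ([SE]) & ([SE]) \\
$D \times$ High [M] & & & & [β̂_int]$^{**}$ \\
& & & & ([SE]) \\
\midrule
$N$ & [N] & [N_H] & [N_L] & [N] \\
\bottomrule
\end{tabular}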
```
### 6.3 Coefficient Comparison Plot (Coefplot)
Plot the main coefficient and its 95% CI side by side across the robustness checks:
```python
# Python: coefficient comparison plot for the robustness checks
import matplotlib.pyplot as plt
import numpy as np
# beta_* and ci_* are placeholders for stored estimates; `...` marks further specifications to fill in
specs = ["Main", "HC3 SE", "Clustered SE", "No Outliers", "Log Y", ...]
coefs = [beta_main, beta_hc3, beta_clus, beta_noout, beta_log, ...]
ci_lo = [ci_lo_main, ...]
ci_hi = [ci_hi_main, ...]
fig, ax = plt.subplots(figsize=(8, 5))
y_pos = np.arange(len(specs))
for i, (spec, coef, lo, hi) in enumerate(zip(specs, coefs, ci_lo, ci_hi)):
    # Highlight the main specification (i == 0) in red
    ax.errorbar(coef, i, xerr=[[coef-lo], [hi-coef]],
                fmt='o', ms=6, capsize=4,
                color='#C0392B' if i==0 else '#2C3E50')
ax.axvline(0, color='gray', ls='--', lw=1)
ax.set_yticks(y_pos); ax.set_yticklabels(specs)
ax.set_xlabel("Coefficient (95% CI)")
ax.set_title("Robustness Checks: Main Coefficient")
plt.tight_layout()
plt.savefig("figures/fig_robustness_coefplot.png", dpi=150)
```
```stata
* Stata — coefplot
coefplot (m_main, label("Main")) ///
(m_hc3, label("HC3 SE")) ///
(m_clus, label("Clustered SE")) ///
(m_nout, label("No Outliers")) ///
, keep(D_var) xline(0) ///
title("Robustness Checks: Main Coefficient") ///
graphregion(color(white))
graph export "figures/fig_robustness_coefplot.png", replace
```
---
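## Step 7: Produce `robustness-report.md`
Consolidate all results from this phase and write them to the working directory:
```
Write: [workspace]/robustness-report.md
```
**Document structure:**
```markdown
# Robustness Report
**Project:** [research question in one sentence]
**Version:** v1.0
**Date:** [YYYY-MM-DD]
**Main coefficient (Phase 6):** β̂ = [value] (SE = [value], p = [value])
---
## 1. Robustness Check Summary
### 1.1 Checklist Completion
| No. | Check | Source | Result | Verdict |
|------|---------|------|------|------|
| R1 | [check name] | Method-general | β̂=[value], p=[value] | ✅ Robust / ⚠️ Changes / ❌ Not robust |
| ... | | | | |
### 1.2 Main Conclusions
[2–3 sentences: under which specifications is the conclusion robust? Under which does the coefficient change, and what is the economic explanation?]
**Robustness assessment of the main coefficient:**
- Sign consistency: [sign is consistent in all / nearly all / some specifications]
- Magnitude stability: [coefficient ranges from X to Y; main specification is Z]
- Statistical significance: [significant at the X% level in all / nearly all specifications]
---
## 2. Heterogeneity Results
### 2.1 [Grouping dimension 1]
**Theoretical prediction:** [pre-specified direction of heterogeneity]
**Result:** [β̂_high = X vs. β̂_low = Y; interaction p = Z]
**Interpretation:** [consistent with the theoretical prediction? If not, give an economic explanation]
### 2.2 [Grouping dimension 2]
[Same format as above]
---
## 3. Mechanism Test Results
### 3.1 Main Causal Channel
**Mechanism hypothesis:** D → [M] → Y
**Supporting evidence:**
- Effect of D on M: β̂_{D→M} = [value] (p = [value]) [✅ significant / ❌ not significant]
- Effect of D on Y after controlling for M: β̂_{direct} = [value] (original effect: [main coefficient])
- Indirect effect (approx.): [value], or [%]% of the total effect
**Caveats:** [endogeneity limitations of the mediation analysis]
### 3.2 Ruling Out Competing Explanations
| Competing explanation | Test | Result | Verdict |
|----------|---------|------|------|
| [Explanation A] | [test] | p = [value] | ✅ Ruled out / ⚠️ Partially supported |
---
## 4. Output Files
- `tables/table_robustness.tex` / `.csv`
- `tables/table_heterogeneity.tex` / `.csv`
- `tables/table_mechanism.tex` / `.csv` (if applicable)
- `figures/fig_robustness_coefplot.png`
- `code/05_robustness.[ext]`
- `code/06_heterogeneity.[ext]`
- `code/07_mechanism.[ext]` (if applicable)
---
## 5. Writing Advice for Phase 9
Based on this phase's results, advice for the paper's Robustness section:
- **One-sentence robustness conclusion** (usable directly in the paper):
"[Main coefficient] remains stable across robustness checks, ranging over [X, Y] and significant at the [Z]% level throughout (Table X)."
- **Deviations that need explanation**: [if the coefficient changes substantially under some specification, explain it proactively in the paper]
- **Placement of heterogeneity findings**: [main results section or appendix?]
- **Suggested language strength for the mechanism discussion**: [direct mechanism evidence / indirect consistent evidence / ruling out competing explanations]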
```
---
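## Step 8: Phase Sign-off and Handoff
Present the phase summary to the user:
> **Phase 8 Output Summary**
>
> ✅ Robustness code: `code/05_robustness.[ext]`
> ✅ Heterogeneity code: `code/06_heterogeneity.[ext]`
> ✅ Mechanism code: `code/07_mechanism.[ext]` (if applicable)
> ✅ Robustness table: `tables/table_robustness.tex` + `.csv`
> ✅ Heterogeneity table: `tables/table_heterogeneity.tex` + `.csv`
> ✅ Coefficient comparison plot: `figures/fig_robustness_coefplot.png`
> ✅ Robustness report: `robustness-report.md`
>
> ---
>
> **Phase 8 Conclusions:**
>
> - Robustness: [main coefficient is consistent in X of Y checks, sign stable, magnitude range ...]
> - Most important heterogeneity finding: [...]
> - Mechanism evidence: [supportive / not supportive / inconclusive]
>
> Once you confirm the robustness conclusions, proceed to **Phase 9 (full paper writing)** by running `/write`.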
---
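## Frequently Asked Questions
**Q: The main coefficient is no longer significant under one robustness check. What now?**
Work through the following in order of priority:
1. First check whether the specification change introduces an economically meaningful difference in identification (for example, a narrower bandwidth that changes sample representativeness)
2. If the magnitude barely moves but the SE grows (for example, in a small subsample), report it honestly rather than dropping it
3. If the magnitude shrinks materially, discuss in the paper which different assumption that specification reflects
4. **Never report only the significant results** (p-hacking); every pre-specified check must be reported
**Q: The heterogeneity analysis shows an unexpected pattern with no theoretical backing. How should it be handled?**
There are two ways to handle unexpected heterogeneity:
1. If a plausible economic mechanism can be offered after the fact, include it in the paper but label it explicitly as an "exploratory finding"
2. If no plausible explanation is available, report it only in the appendix and do not over-interpret it in the main text
**Q: The mediator in the mechanism analysis is itself endogenous. Is the analysis still worth doing?**
Yes, but the limitation must be stated clearly in the paper. Standard practice:
1. State that the mediation analysis only provides "descriptive evidence consistent with the mechanism"
2. If the mediator has a source of exogenous variation, validate the mediation path further with IV or DiD
3. If the mediator's endogeneity cannot be addressed, reposition the test as "ruling out competing explanations" (testing whether D→M is significant) rather than a full mediation analysis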
---
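## Links to Other Commands and Skills
- **Upstream**: `/code` (Phase 6, produces `diagnostic_report.md`) → `results-analysis` skill (Phase 7, produces `results-memo.md`)
- **Skills called in this phase**:
  - `did-analysis`: comparison of staggered DiD estimators (Callaway-Sant'Anna, Sun-Abraham)
  - `iv-estimation`: IV robustness (weak-instrument tests, LIML, Anderson-Rubin CI)
  - `rdd-analysis`: RDD robustness (bandwidth sensitivity, placebo cutoffs)
  - `panel-data`: panel robustness (comparison of clustering levels, jackknife)
  - `ml-causal`: machine-learning estimation of heterogeneous treatment effects (optional)
- **Downstream**: `/write` (Phase 9, reads `robustness-report.md` to write the robustness section)

write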
Phase 9 Full Paper Writing. Reads all upstream outputs (model-spec.md, results-memo.md, robustness-report.md, literature-review-report.md) and drafts any section or the complete paper to Top-5 economics journal standards following a narrative-driven structure; assembles LaTeX, compiles to PDF, and exports DOCX via pandoc, producing a final paper/ directory.
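# /write: Full Paper Writing
## Purpose
`/write` is the **ninth phase** of the empirical research workflow. It takes all outputs of the previous eight phases and moves the project from "results in hand" to "paper in hand". Core responsibilities:
1. Read all upstream output files and extract the numbers, table references, and writing advice that can be embedded directly in the paper
2. Draft the paper section by section following a **narrative-driven logic**, to the writing standards of the Top-5 journals (AER/QJE/JPE/REStud/Econometrica)
3. Split tables and figures between main text and appendix by importance; wide tables get their own landscape page
4. Assemble the full LaTeX document, compile and verify it, and produce `paper/paper.pdf`; then export `paper/paper.docx` with pandoc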
---
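## Step 0: Read Upstream Outputs
> 📎 **See [`shared/context-reader.md`](../shared/context-reader.md)**
> Files required in this phase: `model-spec.md` (required), `results-memo.md` (required). Optional files: `research-question.md`, `literature-review-report.md`, `identification-memo.md`, `data-report.md`, `robustness-report.md`, `tables/table_main.csv`.
### 0.2 Extract Key Writing Inputs
Extract the following from each file to build a **writing brief** that later sections draw on directly:
| Source | Extracted content | Used in |
|------|---------|---------|
| `research-question.md` | One-sentence research question, policy background | Abstract, Introduction |
| `literature-review-report.md` | Main threads of the literature, positioning of this paper's contribution, recent citations | Introduction, Literature Review |
| `identification-memo.md` | Logic of the identification strategy (three-layer defense), core assumptions | Empirical Strategy |
| `model-spec.md §2` | Main equation in LaTeX, notation | Empirical Strategy |
| `model-spec.md §4` | Standard-error strategy and rationale | Empirical Strategy |
| `data-report.md §2–3` | Sample size N, time span, variable definitions | Data |
| `results-memo.md §1` | Main coefficient β̂, SE, p-value, 95% CI | Abstract, Introduction, Results |
| `results-memo.md §2` | Economic significance (effect in σ units, % of the mean) | Results |
| `results-memo.md §4` | Identification credibility and suggested language strength | Causal language throughout |
| `results-memo.md §5` | Limits to external validity | Discussion, Conclusion |
| `robustness-report.md §1` | One-sentence robustness conclusion (range + significance) | Abstract, Robustness |
| `robustness-report.md §2` | Heterogeneity conclusions (groups + interaction significance) | Heterogeneity/Discussion |
| `robustness-report.md §3` | Mechanism evidence and language strength | Mechanisms/Discussion |
| `robustness-report.md §5` | Writing advice for Phase 9 (use directly) | All sections |
After extraction, show the writing brief to the user for confirmation:
```
═══════════════════════════════════════════════════════
Phase 9 Writing Brief
═══════════════════════════════════════════════════════
Research question: [effect of X on Y]
Identification strategy: [strategy name]
Target parameter: [ATE / ATT / LATE]
Main coefficient: β̂ = [value] (SE = [value], p = [value])
Causal language strength: [causal / cautious causal / correlational]
Robustness conclusion: [one sentence]
Key heterogeneity finding: [yes/no, one sentence]
Mechanism evidence: [direct evidence / indirect consistent evidence / none]
═══════════════════════════════════════════════════════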
```
---
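## Step 1: Confirm the Writing Task
### 1.1 Choose Sections
Confirm the writing scope with the user:
> *"Please choose a writing target:*
>
> - **A. Full paper** (recommended): draft all sections in order, assemble the full LaTeX document, and compile
> - **B. Selected sections**: draft one or more sections individually
>
> *Available sections:*
> ① Abstract
> ② Introduction
> ③ Literature Review
> ④ Data
> ⑤ Empirical Strategy
> ⑥ Results
> ⑦ Robustness
> ⑧ Heterogeneity & Mechanisms
> ⑨ Discussion
> ⑩ Conclusion"
### 1.2 Confirm the Language
> *"Paper language: Chinese or English? (Default: English, in line with international journal submission practice)"*
### 1.3 Target Journal (Optional)
If the user names a target journal, adjust to that journal's format requirements:
| Journal | Word limit | Table placement | Citation style | Special requirements |
|------|---------|---------|---------|---------|
| AER | ~12,000 words | End of paper | Author (Year) | Keep contribution claims concise, no more than 3 |
| QJE | No hard limit | End of paper | Author (Year) | Very high bar for the Introduction, typically 4–6 pages |
| JPE | ~12,000 words | End of paper | Author (Year) | Prefers brevity; Discussion may be merged with Results |
| REStud | ~12,000 words | End of paper | Author (Year) | Many papers combine theory and methods |
| Econometrica | No hard limit | End of paper | Author (Year) | Highest bar for mathematical rigor |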
---
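## Step 1.5: Narrative Principles (Read Before Writing)
**An empirical economics paper is, at its core, a causal story**: from the first paragraph the reader should be led through the whole research journey. Follow the principles below rather than producing a pile of isolated paragraphs.
### Core Narrative Arc
```
Introduction (the question)
→ Literature (the boundary of what is known)
→ Data (the setting)
→ Identification strategy (how the problem is solved)
→ Main results (the answer + its magnitude)
→ Robustness (why it is credible)
→ Mechanisms/heterogeneity (why it is so)
→ Conclusion (So what)
```
Each section should open with **one sentence that picks up from the previous section** and close with **one sentence that previews the next**, forming a continuous narrative chain.
### Rules for Narrating Tables and Figures
For every table or figure cited, the main text must do three things:
| Step | Description | Example |
|------|------|------|
| **① Preview** (before the citation) | Tell the reader what they are about to see | *"Table 2 presents our main results."* |
| **② Interpret** (same paragraph as the citation) | Distill one core conclusion from the table/figure, with specific numbers | *"The coefficient on D is 0.12 (s.e. = 0.04), implying..."* |
| **③ Evaluate** (after the citation) | Say what the result means for the research story | *"This is consistent with our identification assumption."* |
**Never** write only *"See Table X."* or *"Figure Y shows the results."* without giving numbers and an interpretation.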
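### Main Text vs. Appendix: Where Tables and Figures Go
**Tables and figures for the main text** (findings that directly support the main narrative):
| Type | Condition |
|------|------|
| Descriptive statistics table (Table 1) | Must be in the main text |
| Main regression table (usually Table 2) | Must be in the main text |
| Core identification figure (event-study plot, RDD discontinuity plot, IV first-stage plot) | Must be in the main text |
| Main heterogeneity table (if it is one of the core contributions) | In the main text |
**Tables and figures for the appendix** (supporting or supplementary findings):
| Type | Reason |
|------|------|
| Robustness tables (multi-column specification comparison) | Confirmatory, not a new finding |
| Balance tables (for DiD/IV) | Technical; editors usually accept this |
| Variable definition table | Reference material |
| Additional heterogeneity splits | Extensions beyond the core contribution |
| Data cleaning flow table (sample selection funnel) | Technical |
| Supplementary mechanism figures | Exploratory |
### Wide Tables (Landscape Pages)
**Criterion**: if a regression table has more than **5 columns**, or would trigger `Overfull \hbox > 5pt` within the standard page width (`\textwidth`), it **must** be set in landscape on its own page:
```latex
\begin{landscape}
\begin{table}[p] % [p] = float page of its own
\caption{[table title]}
\label{tab:xxx}
\footnotesize % slightly smaller font
\begin{tabular}{...}
...
\end{tabular}
\end{table}
\end{landscape}
```
Typical cases: the main robustness table (multi-specification comparison), heterogeneity tables (many subgroups), and parallel-trends tables with separate pre- and post-period columns.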
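The column-count half of the criterion can be checked from the `tabular` column specification before compiling. The sketch below is a rough heuristic with hypothetical helper names (`count_columns`, `needs_landscape`): it handles plain `l`/`c`/`r`/`S` columns, width columns such as `p{5.5cm}`, `D{.}{.}{-1}` columns, and `@{...}` decorations, but not `*{n}{...}` repeats. The overfull-box half of the rule still requires the compile log.

```python
import re

def count_columns(colspec):
    """Count columns in a tabular spec such as 'l p{5.5cm} c c c c c'."""
    # Drop inter-column decorations: @{...}, !{...}, >{...}, <{...}
    spec = re.sub(r"[@!<>]\{[^{}]*\}", "", colspec)
    # dcolumn-style D{.}{.}{-1} counts as one column
    spec = re.sub(r"D\{[^{}]*\}\{[^{}]*\}\{[^{}]*\}", "c", spec)
    # Keep the column letter of p{..}/m{..}/b{..} but drop the width argument
    spec = re.sub(r"([pmb])\{[^{}]*\}", r"\1", spec)
    return len(re.findall(r"[lcrpmbS]", spec))

def needs_landscape(colspec, max_cols=5):
    """Apply the 'more than 5 columns' rule to a column specification."""
    return count_columns(colspec) > max_cols
```

For example, `needs_landscape("l p{5.5cm} c c c c c")` flags a seven-column robustness layout, while `needs_landscape("lccc")` does not.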
---
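## Step 2: Section-by-Section Writing Rules
> 📎 **For output format rules see [`shared/output-standards.md`](../shared/output-standards.md)**
**Division of labor**: this step defines the **content rules**: what each section should contain, which upstream file the data comes from, and which expressions are banned. The **LaTeX template structure** (preamble, section skeletons, compile workflow) comes from the `paper-writing` skill; consult both when writing.
Call the `paper-writing` skill with the following context (Mode A, skipping the skill's internal Q&A):
- Research type: empirical
- Target journal: `[from Step 1.3]`
- Paper language: `[from Step 1.2]`
- Writing scope: `[from Step 1.1]`
- Causal language strength: `[from results-memo.md §4]`
The content rules for each section follow. These are the core requirements that set Top-5 journals apart from others: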
---
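### 2.1 Abstract
**Format**: 150–200 words, 5–6 sentences, fixed structure:
```
Sentence 1: research question (What question)
Sentences 2–3: method and data (How + Where)
Sentences 4–5: main findings, with specific numbers (What we find)
Sentence 6: contribution/policy implication (So what)
```
**Required elements**:
- Must state the magnitude of the main coefficient (absolute value or percentage of the mean)
- Must name the identification strategy ("exploiting..."/"using a difference-in-differences design...")
- Must not report statistical significance without the magnitude
**Template**: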
```latex
\begin{abstract}
\noindent
[RESEARCH QUESTION IN ONE SENTENCE].
Using [DATA SOURCE] covering [N] [UNITS] over [PERIOD],
we exploit [IDENTIFICATION STRATEGY] to identify
the causal effect of [D] on [Y].
We find that [MAIN FINDING WITH COEFFICIENT]:
[D] [increases/decreases] [Y] by [MAGNITUDE]
[UNITS/PERCENT/SD],
[SIGNIFICANCE STATEMENT].
This effect is [heterogeneous/concentrated in/robust to],
with [KEY HETEROGENEITY OR ROBUSTNESS FINDING].
Our results [CONTRIBUTION: inform/challenge/extend]
[POLICY/THEORY IMPLICATION].
\end{abstract}
```
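The 150–200 word target can be checked on the drafted LaTeX abstract. This is a rough sketch under simple assumptions: `abstract_word_count` is a hypothetical helper that strips `\begin{...}`/`\end{...}`, bare commands, and braces before splitting on whitespace, so math-heavy abstracts will be counted only approximately.

```python
import re

def abstract_word_count(tex):
    """Approximate word count of a LaTeX abstract."""
    text = re.sub(r"\\(begin|end)\{[^{}]*\}", " ", tex)   # environment markers
    text = re.sub(r"\\[a-zA-Z]+\*?", " ", text)           # commands such as \noindent, \textit
    text = re.sub(r"[{}$~]", " ", text)                   # leftover grouping and math delimiters
    return len(text.split())

def abstract_length_ok(tex, lo=150, hi=200):
    """True if the abstract falls inside the target word range."""
    return lo <= abstract_word_count(tex) <= hi
```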
---
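### 2.2 Introduction
A Top-5 Introduction typically runs 4–6 pages (about 1,500–2,000 words) and follows this **five-paragraph structure** strictly:
**Paragraph 1: research question + main finding (numbers must appear)**
> Requirement: by the end of the first paragraph the reader must know (1) what the research question is; (2) what the main finding is; (3) how large the effect is. This is an unwritten norm at AER/QJE.
```latex
% Paragraph 1 template
[HOOK SENTENCE: a fact or figure that raises a policy or theoretical concern].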
[RESEARCH QUESTION]. This paper provides causal evidence that
[MAIN FINDING]: [D] [causes/is associated with] a
[MAGNITUDE] [increase/decrease] in [Y],
[SIGNIFICANCE — e.g., "statistically significant at the 1\% level"].
```
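**Paragraph 2: why the question is hard to answer (endogeneity threats)**
> State the direction and sources of OLS bias to set up the identification strategy.
```latex
% Paragraph 2 template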
Estimating the causal effect of [D] on [Y] is challenging
for three reasons. First, [OVB THREAT]. Second,
[REVERSE CAUSALITY THREAT]. Third, [MEASUREMENT ERROR/SELECTION].
Prior work using [OLS/cross-section] is likely to
[overstate/understate] the true effect because [DIRECTION OF BIAS].
```
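**Paragraph 3: this paper's identification strategy and data**
> Describe the source of exogenous variation clearly so the reader understands why the estimates are credible.
**Paragraph 4: summary of main findings (with one sentence each on heterogeneity and robustness)**
> Magnitude, direction, main heterogeneity dimensions, robustness conclusion.
**Paragraph 5: contribution + roadmap**
> Follow the literature positioning in `literature-review-report.md` and state explicitly how the paper differs from the most closely related recent papers; the last sentence is the roadmap.
**Things to avoid in the Introduction**:
```
❌ Opening with "This paper examines..." (too weak; state the finding directly instead)
❌ Reporting only statistical significance without magnitude ("we find a significant positive effect")
❌ More than 4 contribution claims (referees will read this as overselling)
❌ A roadmap paragraph longer than 5 sentences (keep it short: one sentence per section)
❌ Mathematical formulas or technical detail in the Introduction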
```
---
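### 2.3 Literature Review / Related Work
**Structure**: organize by research theme, **not chronologically**. Typically 2–4 themes, with 1–3 paragraphs each.
**Must include**:
1. **The 3–5 papers most relevant to this one**: what they did, what they found, and how this paper differs
2. **Precedents for the identification strategy**: which papers used similar methods on similar questions
3. **Explicit contribution positioning**: close each theme with a sentence of the form "Unlike [PAPER], we..."
**Source**: use the structure of `literature-review-report.md` directly; do not search again.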
---
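### 2.4 Data
**Structure**:
```
§ Data sources and sample construction
§ Definitions of core variables
§ Descriptive statistics (Table 1, citing tables/table1_descriptive.tex)
[§ Balance tests, if DiD/IV, citing tables/table2_balance.tex]
```
**Rules**:
- Sample size, time span, and geographic coverage must be given as specific numbers (from `data-report.md §2`)
- Variable definitions must be precise and consistent with the notation in `model-spec.md §2`
- The treatment of outliers must be described in this section (from `data-report.md §4`)
- Table 1 must cover Y, D, Z, and the main control variables, in the format: N, mean, standard deviation, p25, p75
**Table 1 citation format**: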
```latex
Table~\ref{tab:sumstats} reports summary statistics for our
main analysis sample. The average [Y] is [MEAN] ([SD]),
and [D PERCENT]\% of [UNITS] are [TREATED].
[KEY OBSERVATION ABOUT THE DATA — e.g., variation, distribution].
```
---
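### 2.5 Empirical Strategy
**This is the section referees scrutinize most closely.** It must cover:
1. **Main equation**: use the LaTeX equation from `model-spec.md §2` directly; do not re-derive it
2. **Identifying assumptions**: state each assumption formally (from `identification-memo.md §4`)
3. **Testability of the identifying assumptions**: explain how each is tested and preview the result ("In Section X, we show...")
4. **Standard-error strategy**: type + clustering level + economic rationale (from `model-spec.md §4`)
**Template for stating an identifying assumption** (parallel trends as an example):
```latex
% Formal statement of the identifying assumption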
The key identifying assumption is that, in the absence of treatment,
treated and control units would have followed parallel trends:
\begin{equation*}
E[Y_{it}(0) - Y_{it'}(0) \mid D_i = 1]
= E[Y_{it}(0) - Y_{it'}(0) \mid D_i = 0], \quad \forall\, t \neq t'.
\end{equation*}
This assumption would be violated if [SPECIFIC THREAT].
We provide evidence supporting this assumption
in Section~\ref{sec:robustness}, where [BRIEF PREVIEW OF TEST RESULT].
```
**Language strength rules**:
| Identification credibility (from results-memo.md §4) | Wording of the identification claim | Wording of results |
|--------------------------------------|------------|---------|
| High (all assumptions ✅) | "The causal effect of D on Y..." | "causes", "increases", "reduces" |
| Medium (some ⚠️) | "The effect of D on Y, as estimated by [STRATEGY]..." | "is associated with", "predicts" |
| Low (some ❌) | "We document a correlation between D and Y..." | "is correlated with", "we observe" |
---
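### 2.6 Results
**Lead with numbers** (a core norm at Top-5 journals):
> The first sentence of the Results section must contain the value of the main coefficient. Do not write "Table 2 shows our main results" without giving specific numbers.
**Structure within the section**:
```
§ Main results (citing Table 2, the main regression table)
- First sentence: β̂ = [value], significance, interpretation of magnitude
- Column-by-column walk-through: Baseline → + controls → + FE → preferred specification
- Discussion of economic significance (required!)
[§ Event study / dynamic effects (if DiD, citing Fig. 1, the event-study plot)]
[§ Main heterogeneity findings (if important, citing Table 3, the heterogeneity table)]
```
**Format for the economic significance discussion** (from `results-memo.md §2`):
```latex
% Economic significance paragraph template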
To assess economic significance, note that the sample mean of
[Y] is [MEAN]. Our estimate of [COEFFICIENT] implies that
[D] is associated with a [MAGNITUDE PERCENT]\% [increase/decrease]
in [Y] relative to the mean, or approximately [SD-UNIT] standard deviations.
[COMPARISON TO LITERATURE: This is [larger/smaller] than the estimate
of \citet{CLOSEST PAPER}, who find [THEIR ESTIMATE] using [THEIR METHOD].]
```
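The two magnitudes in this paragraph template are simple ratios: the coefficient relative to the sample mean of Y, and the coefficient in standard-deviation units of Y. A minimal sketch, assuming Y is in levels (a log outcome would need a different conversion) and that the mean and SD come from the descriptive statistics table; `economic_significance` is a hypothetical helper name.

```python
def economic_significance(coef, mean_y, sd_y):
    """Express a coefficient as a percent of the mean of Y and in SD units of Y."""
    return {
        "pct_of_mean": 100.0 * coef / mean_y,  # fills [MAGNITUDE PERCENT]
        "sd_units": coef / sd_y,               # fills [SD-UNIT]
    }

# Example with made-up numbers: coefficient 0.12, mean of Y 2.4, SD of Y 0.8
sig = economic_significance(0.12, 2.4, 0.8)
```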
**Column-by-column walk-through**:
```latex
% Column-by-column template
Column (1) presents the baseline specification without controls.
The coefficient on [D] is [β̂_1] (s.e. = [SE_1]),
suggesting [DIRECTION] of [Y]. Adding [CONTROL SET] in column (2)
[leaves the estimate stable / reduces the estimate to [β̂_2]],
consistent with [INTERPRETATION OF CHANGE].
Our preferred specification in column ([N]) includes [FULL CONTROLS]
and yields [FINAL ESTIMATE] ([SE], [SIGNIFICANCE]).
```
---
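### 2.7 Robustness
**Structure within the section**:
```
§ Opening conclusion sentence (conclusion first, details after)
§ Panel A: robustness of inference (alternative standard errors)
§ Panel B: robustness to sample (subsamples, outlier exclusion)
§ Panel C: robustness to specification (functional form, control sets)
§ Panel D: tests of identifying assumptions (placebo, pre-trends, etc.; strategy-specific)
[§ Oster δ coefficient stability]
```
**Opening conclusion sentence** (use the suggested wording from `robustness-report.md §5` directly):
```latex
% Template for the opening sentence of the Robustness section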
Table~\ref{tab:robustness} reports a battery of robustness checks.
Our main finding is robust: [MAIN COEFFICIENT RANGE — e.g.,
"the coefficient on [D] ranges from [MIN] to [MAX] across specifications,
remaining statistically significant at the [X]\% level in all cases"].
```
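**For any specification where the coefficient changes materially**, explain it proactively (do not just present the table):
```latex
% Template for explaining a change in the coefficient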
In column ([N]), [WHAT CHANGES — e.g., "we restrict the sample to..."].
The estimate [increases/decreases] to [NEW ESTIMATE], reflecting
[ECONOMIC INTERPRETATION — e.g., "the composition of the subgroup"].
This change is consistent with [MECHANISM/THEORY], and does not
challenge our main conclusion because [REASON].
```
---
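### 2.8 Heterogeneity and Mechanisms
In many Top-5 papers this section is merged with Results; it can also stand alone. Present it in the following order:
**Heterogeneity analysis (from `robustness-report.md §2`)**:
1. State the theoretical prediction first (why heterogeneity is expected)
2. Then present the table (citing `tables/table_heterogeneity.tex`)
3. The conclusion must respond to the theoretical prediction (whether it matches, and if not, why)
**Mechanism analysis (from `robustness-report.md §3`)**:
Set language strength according to `robustness-report.md §5`:
- "Direct mechanism evidence" (exogenously identified) → "provides direct evidence for"
- "Indirect consistent evidence" (descriptive only) → "is consistent with" / "provides suggestive evidence"
- Ruling out competing explanations → "rules out" / "cannot be explained by"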
---
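### 2.9 Conclusion
**Conclusions in Top-5 journals are usually only 1–2 pages.** Structure:
```
① Restate the question + method (1 sentence)
② Main findings (2–3 sentences, with magnitudes)
③ Robustness summary (1 sentence)
④ Policy implications or theoretical contribution (2–3 sentences)
⑤ Limitations (1–2 sentences, honest but brief)
⑥ Directions for future research (1–2 sentences)
```
**Things to avoid in the Conclusion**:
```
❌ Introducing any result that did not appear in the main text
❌ Re-summarizing the whole paper (the reader has already read it)
❌ Saying only "we did X and found Y" without the So what
❌ A conclusion longer than 2 pages (referees will notice)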
```
---
## Step 3: Assemble the Full LaTeX Document
### 3.1 Directory Layout
```
paper/
├── paper.tex # main file (\input for each section)
├── sections/
│ ├── abstract.tex
│ ├── introduction.tex
│ ├── literature.tex
│ ├── data.tex
│ ├── strategy.tex
│ ├── results.tex
│ ├── robustness.tex
│ ├── heterogeneity.tex # heterogeneity and mechanisms section
│ ├── discussion.tex
│ ├── conclusion.tex
│ └── appendix.tex # appendix (all appendix tables and figures go here)
├── tables/ # symlink to or copy of ../tables/
├── figures/ # symlink to or copy of ../figures/
└── references.bib
```
**Standard structure of the appendix (appendix.tex)** (filled automatically from the available upstream files):
```latex
\appendix
\renewcommand{\thesection}{\Alph{section}} % appendix numbering: A, B, C...
\renewcommand{\thetable}{A\arabic{table}} % table numbering: A1, A2...
\renewcommand{\thefigure}{A\arabic{figure}} % figure numbering: A1, A2...
\setcounter{table}{0}
\setcounter{figure}{0}
\section{Additional Robustness Checks}
\label{sec:appendix_robustness}
% Source: tables/table_robustness_*.tex
\section{Balance Tests}
\label{sec:appendix_balance}
% Source: tables/table_balance.tex
\section{Variable Definitions}
\label{sec:appendix_variables}
% Source: variable definition table in data-report.md
\section{Supplementary Figures}
\label{sec:appendix_figures}
% Source: non-core figures in the figures/ directory
```
> **Citing appendix tables and figures**: whenever the main text cites an appendix table or figure, state its location explicitly, for example *"Appendix Table A1 reports the balance test results."* or *"(see Appendix Figure A1)"*.
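### 3.2 Main File (paper.tex)
**The LaTeX preamble and front-matter structure follow the template in the `paper-writing` skill** (see that skill's "LaTeX Preamble (Authoritative)" section). The preamble code is not repeated here; the skill is the single source, which avoids version drift.
Two points during assembly bear directly on the content rules:
1. **`\graphicspath{{../figures/}}`**: the path is relative to `paper/` and points to `figures/` in the parent directory. Use `./figures/` only if `paper/figures/` has actually been created as a symlink or copy, as in the Step 3.1 layout; otherwise that directory does not exist.
2. **`\input` order**: introduction → literature → data → strategy → results → robustness → heterogeneity → discussion → conclusion, matching the Step 3.1 directory layout. `abstract.tex` belongs in the front matter before the introduction, and `appendix.tex` follows the conclusion.
### 3.3 References (references.bib)
Generate standard BibTeX entries for the papers cited in `literature-review-report.md`: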
```bibtex
@article{Author2020,
author = {Author, First and Coauthor, Second},
title = {Title of the Paper},
journal = {American Economic Review},
year = {2020},
volume = {110},
number = {3},
pages = {101--130},
doi = {10.1257/aer.XXXXXXXX}
}
```
---
## Step 4: Compile Check + DOCX Export
### 4.1 LaTeX → PDF Compilation
```bash
cd [workspace]/paper/
pdflatex -interaction=nonstopmode paper.tex # pass 1: generate .aux
bibtex paper # process the references
pdflatex -interaction=nonstopmode paper.tex # pass 2: pull in the bibliography
pdflatex -interaction=nonstopmode paper.tex # pass 3: resolve cross-references
# Check for common errors
grep -i "overfull\|undefined\|missing\|error" paper.log
grep "Overfull .hbox" paper.log # > 10pt = visible overflow, must be fixed
grep "File.*not found" paper.log # missing package
grep "Citation.*undefined" paper.log # undefined citation
```
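The `grep` for overfull boxes lists every occurrence, but only those wider than 10pt need fixing. The sketch below filters them by width. It assumes the usual pdfTeX log wording `Overfull \hbox (12.34pt too wide)`; `overfull_boxes` and the threshold parameter are hypothetical names.

```python
import re

# Matches log lines such as: Overfull \hbox (15.72pt too wide) in paragraph at lines 40--55
OVERFULL = re.compile(r"Overfull \\hbox \(([0-9.]+)pt too wide\)")

def overfull_boxes(log_text, threshold_pt=10.0):
    """Return the widths (in pt) of overfull hboxes above the threshold."""
    widths = [float(w) for w in OVERFULL.findall(log_text)]
    return [w for w in widths if w > threshold_pt]

# Example on a made-up log excerpt
sample_log = (
    "Overfull \\hbox (2.3pt too wide) in paragraph at lines 10--12\n"
    "Overfull \\hbox (15.72pt too wide) in alignment at lines 40--55\n"
)
bad = overfull_boxes(sample_log)
```

In practice the log would be read with `open("paper.log", errors="replace").read()`, since TeX logs are not always valid UTF-8.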
**Automatic fixes for common errors**:
| Error | Cause | Fix |
|------|------|---------|
| `File 'siunitx.sty' not found` | Table uses `S` columns but `siunitx` is not installed | Switch to `c` columns, or to `D{.}{.}{-1}` (requires the `dcolumn` package) |
| `Unicode character U+XXXX` | Python wrote `≥` or `→` | Replace with `$\geq$`, `$\rightarrow$` |
| `Overfull \hbox (>10pt)` | Table wider than the page | Add `\begin{landscape}` + `\footnotesize` |
| `Citation xxx undefined` | Missing bib entry | Add it to `references.bib` and rerun bibtex |
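### 4.2 LaTeX → DOCX Export
Once the PDF compiles, convert the LaTeX source to DOCX with `pandoc` so that non-LaTeX users can review and annotate it:
```bash
cd [workspace]/paper/
# Option 1: pandoc (recommended; preserves equations and citations)
# With a custom reference template, if one exists:
pandoc paper.tex \
  --bibliography=references.bib \
  --citeproc \
  --reference-doc=../shared/docx_template.docx \
  -o paper.docx
# Without a custom template (generic):
pandoc paper.tex \
  --bibliography=references.bib \
  --citeproc \
  -o paper.docx
echo "✅ DOCX exported: paper/paper.docx"
```
If pandoc is unavailable, fall back to:
```bash
# Option 2: LibreOffice headless (lower fidelity, but no pandoc needed)
# The writer_pdf_import filter makes LibreOffice open the PDF in Writer rather than Draw
libreoffice --headless --infilter="writer_pdf_import" --convert-to docx paper.pdf --outdir .
echo "✅ DOCX exported (via LibreOffice): paper/paper.docx"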
```
Install pandoc (if missing):
```bash
# Ubuntu/Debian
apt-get install -y pandoc 2>/dev/null || true
# macOS
brew install pandoc 2>/dev/null || true
# Check the version
pandoc --version | head -1
```
### 4.3 Compile Check Summary
```
Compile check results
─────────────────────────────────────────────────────
✅ pdflatex compiled successfully (3 passes, with bibtex)
✅ No undefined references
✅ No missing citations
⚠️ Overfull \hbox (2.3pt): minor, acceptable
─────────────────────────────────────────────────────
✅ DOCX export succeeded (pandoc / LibreOffice)
─────────────────────────────────────────────────────
Output files: paper/paper.pdf ([N] pages)
paper/paper.docx
```
---
## Step 5: Output File List
```
Write: [workspace]/paper/paper.tex
Write: [workspace]/paper/sections/abstract.tex
Write: [workspace]/paper/sections/introduction.tex
Write: [workspace]/paper/sections/literature.tex
Write: [workspace]/paper/sections/data.tex
Write: [workspace]/paper/sections/strategy.tex
Write: [workspace]/paper/sections/results.tex
Write: [workspace]/paper/sections/robustness.tex
Write: [workspace]/paper/sections/heterogeneity.tex
Write: [workspace]/paper/sections/discussion.tex
Write: [workspace]/paper/sections/conclusion.tex
Write: [workspace]/paper/sections/appendix.tex # appendix (appendix tables and figures)
Write: [workspace]/paper/references.bib
```
**Versioning**: after each revision, save a copy with a version number in the file name:
```
paper_v1.0_YYYYMMDD.tex # first draft
paper_v1.1_YYYYMMDD.tex # first round of revisions
paper_v2.0_YYYYMMDD.tex # major revision
```
---
## Step 6: Phase Sign-off and Handoff
Present the phase summary to the user:
> **Phase 9 Output Summary**
>
> ✅ Abstract: `paper/sections/abstract.tex`
> ✅ Introduction: `paper/sections/introduction.tex`
> ✅ Literature review: `paper/sections/literature.tex`
> ✅ Data: `paper/sections/data.tex`
> ✅ Empirical strategy: `paper/sections/strategy.tex`
> ✅ Results: `paper/sections/results.tex`
> ✅ Robustness: `paper/sections/robustness.tex`
> ✅ Heterogeneity and mechanisms: `paper/sections/heterogeneity.tex`
> ✅ Discussion: `paper/sections/discussion.tex`
> ✅ Conclusion: `paper/sections/conclusion.tex`
> ✅ Appendix: `paper/sections/appendix.tex` (with appendix tables and figures)
> ✅ Main file + references: `paper/paper.tex` + `paper/references.bib`
> ✅ PDF: `paper/paper.pdf` ([N] pages)
> ✅ DOCX: `paper/paper.docx` (for review and annotation)
>
> Once the first draft is complete, run `/present` to turn the paper into academic presentation slides.

hooks
Event hooks configuration
{
"hooks": {
"SessionStart": [
{
"matcher": "",
"hooks": [
{
"type": "command",
"command": "bash -c 'echo \"\\n╔══════════════════════════════════════════╗\"; echo \"║ 📄 经济学论文工作流引导系统 ║\"; echo \"╚══════════════════════════════════════════╝\"; echo \"\\n完整工作流:\\n Phase 1: /question 明确研究问题与贡献\\n Phase 2: literature-review 技能 文献综述\\n Phase 3: /analyze 设计识别策略\\n Phase 4: /data 获取与清洗数据\\n Phase 5: /model 构建计量模型\\n Phase 6: /code 执行回归分析\\n Phase 7: /plot 结果分析与图表精修\\n Phase 8: /robustness 稳健性检验\\n Phase 9: /write 撰写完整论文\\n [可选]: /present 生成学术幻灯片\\n\\n运行 /question 开始你的论文之旅!\"'",
"timeout": 10
}
]
}
],
"Stop": [
{
"matcher": "",
"hooks": [
{
"type": "prompt",
"prompt": "You are a workflow guide for an econometrics paper writing pipeline. Review the conversation that just ended.\n\nThe pipeline has these ordered steps:\n- Phase 1: /question — Research question scoping (produces: research-question.md)\n- Phase 2: literature-review skill — Literature review (produces: literature-review-report.md)\n- Phase 3: /analyze — Identification strategy (produces: identification-memo.md)\n- Phase 4: /data — Data fetch & clean (produces: data-report.md)\n- Phase 5: /model — Model specification (produces: model-spec.md)\n- Phase 6: /code — Run regressions (produces: regression output/tables)\n- Phase 7: /plot — Results analysis & figure polish (produces: publication-ready tables/figures)\n- Phase 8: /robustness — Robustness checks (produces: robustness-report.md)\n- Phase 9: /write — Full paper writing (produces: paper/ directory)\n- Optional (no phase number): /present — Beamer slides, only relevant after Phase 9 is complete\n\nIf the assistant's last response clearly completed one of these steps (e.g. produced a key output document, finished a major analysis phase, or explicitly said a phase is done), output ONLY the following in Chinese:\n\n---\n✅ **[阶段名称]已完成!**\n\n📌 **下一步:运行 `/[command]`**\n[一句话说明下一步做什么,以及为什么现在是合适时机]\n\n💡 提示:[一句可选的小建议,帮助用户更好地进入下一阶段]\n---\n\nIf no major step was completed in this response (e.g. the user asked a question, had a discussion, or only partially worked on something), output absolutely nothing. Do not add commentary. Do not explain your reasoning.",
"timeout": 45
}
]
}
]
}
}