Codegen-to-Documentation Pipeline

I developed a pipeline that converts raw UI codegen (captured from screen interactions) into working end-to-end test files, then transforms those test files into full user-facing documentation (HTML and PDF) with minimal manual intervention. What follows is a generalized description of the methodology, architecture, and results.

Problem

The core question: how can one person keep up with the testing and documentation needs for an organization’s software, while producing both HTML and PDF output simultaneously, reusing as much as possible, and keeping the whole system reproducible?

Approach: JSON Configs as a Compression Layer

Re-reading the full documentation every time a test file is created is wasteful. The solution was JSON config files: a compact list of every function in the repository, each with a one- or two-line description of what it does. With this list, the agent can convert codegen into working test files without error and without re-reading extensive documentation each time.

{
  "functions": {
    "clickSubmitButton": "Clicks the primary submit button on the current form",
    "waitForConfirmationModal": "Waits for the confirmation dialog to appear",
    "verifySuccessMessage": "Asserts the success toast notification is visible",
    "navigateToSection": "Navigates to a named section via the sidebar menu",
    "fillFormField": "Enters a value into a form field by label text"
  }
}
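
As a rough illustration of how such a catalog might be consumed, the sketch below loads it, renders it as a compact context block for the agent, and flags calls in a generated test that are not in the catalog. The `helpers.` call prefix, the function names `render_prompt_context` and `unknown_helpers`, and the inlined two-entry catalog are assumptions made for this example, not part of the actual pipeline.

```python
import json
import re

# Hypothetical two-entry catalog, inlined so the sketch is self-contained.
CATALOG_JSON = """
{
  "functions": {
    "clickSubmitButton": "Clicks the primary submit button on the current form",
    "fillFormField": "Enters a value into a form field by label text"
  }
}
"""


def load_catalog(text: str) -> dict[str, str]:
    """Parse the JSON config and return the name -> description mapping."""
    return json.loads(text)["functions"]


def render_prompt_context(catalog: dict[str, str]) -> str:
    """Render the catalog as one short line per function, sorted by name."""
    return "\n".join(f"- {name}: {desc}" for name, desc in sorted(catalog.items()))


def unknown_helpers(test_source: str, catalog: dict[str, str],
                    prefix: str = "helpers.") -> set[str]:
    """Return helper names called in test_source that the catalog does not list.

    Assumes generated tests call helpers as `helpers.<name>(...)`.
    """
    called = set(re.findall(re.escape(prefix) + r"(\w+)\(", test_source))
    return called - set(catalog)
```

A check like `unknown_helpers` is one way to catch a generated test that refers to a function the repository does not define, before the test is ever run.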

The same config-driven approach extends to documentation content (column definitions, status types, form fields). One config update propagates to tests and docs simultaneously, keeping everything in sync and easy for both humans and automated tools to work with.
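
To make the propagation idea concrete, here is a minimal sketch in which a single config entry feeds both a documentation table and the values a test would assert against. The `status_types` key, the two statuses, and both function names are hypothetical and chosen only for illustration.

```python
import html

# Hypothetical shared config: status name -> user-facing description.
SHARED_CONFIG = {
    "status_types": {
        "Pending": "Submitted and awaiting review",
        "Approved": "Reviewed and accepted",
    }
}


def doc_rows(config: dict) -> str:
    """Render the status types as HTML table rows for the documentation."""
    return "\n".join(
        f"<tr><td>{html.escape(name)}</td><td>{html.escape(desc)}</td></tr>"
        for name, desc in config["status_types"].items()
    )


def expected_statuses(config: dict) -> list[str]:
    """Return the status names a test would expect to see in the UI."""
    return list(config["status_types"])
```

Because both functions read the same dictionary, adding a status to the config changes the rendered table and the test expectations together.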

Pipeline Overview

Each phase saves a checkpoint, so if something fails partway through, earlier phases don’t need to re-run. This makes the pipeline resumable and keeps iteration fast.

┌───────────────────────────┐
│  Capture & Normalize      │
│  Raw codegen → cleaned    │
│  input via rule sets      │
└─────────────┬─────────────┘
              │
              ▼
┌───────────────────────────┐
│  Test Generation          │
│  Agent + JSON config →    │
│  structured test files    │
└─────────────┬─────────────┘
              │
              ▼
┌───────────────────────────┐
│  Execution & Auto-Fix     │
│  Run tests, fix failures, │
│  capture screenshots      │
└─────────────┬─────────────┘
              │
              ▼
┌───────────────────────────┐
│  Documentation Generation │
│  Scaffold from tests →    │
│  multi-pass content fill  │
└─────────────┬─────────────┘
              │
              ▼
┌───────────────────────────┐
│  Quality Assurance        │
│  Completeness, style,     │
│  human review, PDF layout │
└─────────────┬─────────────┘
              │
              ▼
┌───────────────────────────┐
│  Final Output             │
│  HTML + PDF               │
└───────────────────────────┘
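
The per-phase checkpointing described above could be sketched roughly as follows. This is a minimal illustration, assuming each phase is a function from a JSON-serializable state dictionary to a new one and that checkpoints are stored as one JSON file per phase; the names `run_pipeline` and `Phase` are hypothetical.

```python
import json
from pathlib import Path
from typing import Callable

# A phase is a (name, function) pair; the function maps state -> new state.
Phase = tuple[str, Callable[[dict], dict]]


def run_pipeline(phases: list[Phase], checkpoint_dir: Path, initial: dict) -> dict:
    """Run phases in order, skipping any phase that already has a checkpoint."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    state = initial
    for name, phase in phases:
        checkpoint = checkpoint_dir / f"{name}.json"
        if checkpoint.exists():
            # Resume: reuse the saved output instead of re-running the phase.
            state = json.loads(checkpoint.read_text())
            continue
        state = phase(state)
        # Save only after the phase succeeds, so a crash leaves no checkpoint.
        checkpoint.write_text(json.dumps(state))
    return state
```

If a later phase raises, the earlier checkpoint files remain on disk, so the next call picks up from the failed phase. Deleting a phase's checkpoint file forces that phase to run again.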

Pipeline Phases

Execution & Auto-Fix

Tests run against the live application. On failure, they enter an auto-fix loop that adjusts waits for timeouts, tries alternative selectors, and retries with backoff. The loop runs up to five iterations before escalating to a human. (Claude can now access the running web application directly, identify what went wrong, and attempt to correct the test case itself.)
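
One way to picture the control flow of this loop is the sketch below. It is not the pipeline's actual implementation: the failure labels (`"timeout"`, `"selector"`), the `RunSettings` fields, the doubling of waits, and the choice to count the five iterations after the initial run are all assumptions for illustration. The agent-driven repair step is reduced here to a simple `adjust` function.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional

MAX_FIX_ITERATIONS = 5  # from the text: up to five iterations before escalation


@dataclass
class RunResult:
    passed: bool
    failure_kind: Optional[str] = None  # hypothetical labels: "timeout", "selector"


@dataclass
class RunSettings:
    wait_ms: int = 5000      # assumed default wait
    selector_index: int = 0  # which candidate selector to try


def adjust(settings: RunSettings, failure_kind: Optional[str]) -> RunSettings:
    """Pick a fix based on the failure: longer wait or the next selector."""
    if failure_kind == "timeout":
        return RunSettings(settings.wait_ms * 2, settings.selector_index)
    if failure_kind == "selector":
        return RunSettings(settings.wait_ms, settings.selector_index + 1)
    return settings


def auto_fix(run_test: Callable[[RunSettings], RunResult],
             settings: RunSettings,
             base_delay: float = 1.0,
             sleep: Callable[[float], None] = time.sleep) -> tuple[str, RunSettings]:
    """Run once, then retry with adjustments and exponential backoff."""
    result = run_test(settings)
    if result.passed:
        return "passed", settings
    for iteration in range(MAX_FIX_ITERATIONS):
        settings = adjust(settings, result.failure_kind)
        sleep(base_delay * 2 ** iteration)  # backoff: 1s, 2s, 4s, ...
        result = run_test(settings)
        if result.passed:
            return "auto-fixed", settings
    return "escalated", settings
```

The three return values correspond to the three outcomes a run can have: passed on the first run, fixed by the loop, or handed to a human.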

Real-world results across 376 test files for a multi-role application with four user manuals:

Metric	Value
Passed on first run	288 (76.6%)
Auto-fixed by agent	70 (18.6%)
Escalated to human	18 (4.8%)
Total pipeline time	~15 hours 43 minutes

In total, 358 of the 376 files (95.2%) were resolved without human intervention. The 18 escalated files involved application-specific state the agent couldn’t replicate without additional context.
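
For readers who want to check the figures, the percentages follow directly from the counts in the table. The short calculation below reproduces them; the per-file average is derived here from the reported total time and is not a figure stated in the results.

```python
# Counts taken from the results table.
passed_first_run = 288
auto_fixed = 70
escalated = 18
total = passed_first_run + auto_fixed + escalated  # 376 test files


def pct(count: int) -> float:
    """Share of all test files, as a percentage rounded to one decimal."""
    return round(100 * count / total, 1)


no_human_needed = pct(passed_first_run + auto_fixed)  # 95.2

# Derived, not reported: average wall-clock time per file.
total_minutes = 15 * 60 + 43                            # 943 minutes
minutes_per_file = round(total_minutes / total, 1)      # 2.5
```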

Quality Gates

Gate	Focus
First Passthrough	Does the generated doc match what the codegen describes?
Multi-Pass Audit	Side-by-side verification, style conformance, known-issue detection
Human Review	Final accuracy check, layout and formatting quality

Next Steps

The next steps are regression testing and statistical modeling to quantify how much the pipeline actually improves productivity.