Engineering · Apr 18, 2026 · 6 min read

Validation matters more than slogans

The AINative README already documents six concrete validation checks, spanning unit tests, smoke tests, and Playwright end-to-end coverage. Here is why rigorous testing is a first-class design requirement.

AINative Studio
Engineering

A framework that streams LLM output into a live UI has many more failure modes than a traditional REST API wrapper. Tokens can arrive out of order. Tool calls can time out mid-stream. Network interruptions can cut a response short. The only way to build confidence in such a system is exhaustive, automated validation.

The six validation checks

  1. Monorepo build: TypeScript in strict mode, zero type errors across all packages.
  2. Client unit tests: Jest with jsdom, covering the streaming state machine.
  3. Node server tests: supertest hitting a real Express app with an OpenAI mock.
  4. Python server smoke test: verifies adapter startup and a basic 200 response.
  5. Playwright E2E: a real browser opens basic-chat and receives a streamed reply.
  6. CLI smoke checks: the help and doctor commands exit cleanly.
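
To make the second check more concrete, here is a minimal sketch of what a streaming state machine and its assertions might look like. The event names, state shapes, and the `reduce` and `run` functions are hypothetical and chosen for illustration; AINative's actual client types may differ.

```typescript
// Hypothetical event and state shapes; AINative's real types may differ.
type StreamEvent =
  | { type: "start" }
  | { type: "delta"; text: string }
  | { type: "done" }
  | { type: "error"; message: string };

type StreamState =
  | { status: "idle"; text: string }
  | { status: "streaming"; text: string }
  | { status: "done"; text: string }
  | { status: "error"; text: string; message: string };

const initialState: StreamState = { status: "idle", text: "" };

// Pure transition function: given a state and an event, return the next state.
function reduce(state: StreamState, event: StreamEvent): StreamState {
  switch (event.type) {
    case "start":
      return { status: "streaming", text: "" };
    case "delta":
      // Ignore deltas that arrive outside an active stream (for example after "done").
      if (state.status !== "streaming") return state;
      return { status: "streaming", text: state.text + event.text };
    case "done":
      return state.status === "streaming"
        ? { status: "done", text: state.text }
        : state;
    case "error":
      // Keep the partial text so the UI can show what arrived before the failure.
      return { status: "error", text: state.text, message: event.message };
  }
}

// Replay a list of events from the initial state.
function run(events: StreamEvent[]): StreamState {
  return events.reduce(reduce, initialState);
}

// A completed stream accumulates every delta in order.
const completed = run([
  { type: "start" },
  { type: "delta", text: "Hello" },
  { type: "delta", text: " world" },
  { type: "done" },
]);

// A network interruption mid-response keeps the partial text.
const interrupted = run([
  { type: "start" },
  { type: "delta", text: "Hel" },
  { type: "error", message: "connection reset" },
]);
```

Because the transition function is pure, each failure mode (a late delta, an error mid-stream) becomes a short list of events and a single assertion on the resulting state, with no browser or network involved.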

Why we test the CLI separately

CLI tools fail in ways that unit tests never catch: missing binaries in PATH, wrong Node version, environment variables leaking from parent shells. Smoke-testing the compiled binary in a clean subprocess environment catches an entire class of bugs that would otherwise only surface in end-user issue reports.

yaml
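name: CI
on: [push, pull_request]
jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Assumes package.json sets "packageManager"; otherwise pass a `version` input
      - uses: pnpm/action-setup@v3
      - run: pnpm install --frozen-lockfile
      - run: pnpm build
      - run: pnpm test
      # Playwright needs its browser binaries before the E2E run
      - run: pnpm exec playwright install --with-deps
      - run: pnpm exec playwright test
      # CLI smoke checks: both commands must exit with status 0
      - run: ainative help && ainative doctor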
.github/workflows/ci.yml — simplified
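
The clean-subprocess idea can also be sketched outside of CI. The example below is a rough illustration, not AINative's actual test harness: the `smoke` function and `SmokeResult` type are hypothetical names, and the Node binary itself stands in for the compiled CLI so the snippet runs anywhere. The key point is that the child process receives an explicit, minimal environment instead of inheriting the parent shell's variables.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical result shape for illustration.
interface SmokeResult {
  exitCode: number | null;
  stdout: string;
  stderr: string;
}

// Run a command with an explicit, minimal environment. Only PATH is carried
// over from the parent, plus whatever the caller passes in `env`.
function smoke(
  command: string,
  args: string[],
  env: Record<string, string> = {},
): SmokeResult {
  const result = spawnSync(command, args, {
    env: { PATH: process.env.PATH ?? "", ...env },
    encoding: "utf8",
    timeout: 30_000, // fail instead of hanging forever
  });
  // `error` is set when the process could not be spawned (for example a
  // missing binary) or when the timeout was hit.
  if (result.error) throw result.error;
  return {
    exitCode: result.status,
    stdout: result.stdout,
    stderr: result.stderr,
  };
}

// Stand-in check: the Node binary running this script. In a real suite this
// would be the CLI under test, for example smoke("ainative", ["help"]),
// assuming the binary is installed on PATH.
const check = smoke(process.execPath, ["--version"]);
if (check.exitCode !== 0) {
  throw new Error(`smoke check failed: ${check.stderr}`);
}
```

On Windows the child may need a few more variables (such as `SystemRoot`) to start at all, so the allow-list in `env` would have to grow accordingly.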

Testing streaming — the hard part

Streaming responses are difficult to test because the behavior under test unfolds while the response is still in flight. We use a custom Jest helper that subscribes to the stream observable and collects every delta until the stream completes; the test then asserts on the accumulated result or compares it to a snapshot.

typescript
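// createMockStream is assumed to be exported from the same testing entry
// point; adjust the import if it lives elsewhere.
import { collectStream, createMockStream } from "@hari7261/ainative-client/testing";

test("accumulates tokens correctly", async () => {
  // Three deltas that should concatenate into one message
  const mockStream = createMockStream(["Hello", " world", "!"]);
  // Resolves once the stream has completed
  const result = await collectStream(mockStream);
  expect(result.finalText).toBe("Hello world!");
  expect(result.deltaCount).toBe(3);
});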
Custom helper for streaming assertions
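
For readers curious how such helpers could work internally, here is one possible sketch. It is not the code shipped in `@hari7261/ainative-client/testing`: the names `mockDeltaStream`, `collectDeltas`, and `CollectedStream` are hypothetical, and an async iterable is used in place of the observable mentioned above to keep the example dependency-free.

```typescript
// Hypothetical result shape, mirroring the fields asserted on in the test above.
interface CollectedStream {
  finalText: string;
  deltaCount: number;
  deltas: string[];
}

// Yields each chunk after a timer tick, so the consumer sees the chunks
// arrive over time rather than all at once.
async function* mockDeltaStream(
  chunks: string[],
  delayMs = 0,
): AsyncGenerator<string> {
  for (const chunk of chunks) {
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    yield chunk;
  }
}

// Drains the stream and resolves only once it has completed, so assertions
// run against the full, ordered list of deltas.
async function collectDeltas(
  stream: AsyncIterable<string>,
): Promise<CollectedStream> {
  const deltas: string[] = [];
  for await (const delta of stream) {
    deltas.push(delta);
  }
  return { finalText: deltas.join(""), deltaCount: deltas.length, deltas };
}
```

Keeping the individual deltas alongside the joined text lets a test check both the final message and the chunk boundaries, which is useful when a bug only appears for a particular split.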

"Tests are not a tax on development velocity. They are the reason you can ship confidently at 11 pm on a Friday."

AINative engineering principle