Testlash
OpenClaw uchta Vitest to‘plamiga ega (unit/integration, e2e, live) va kichik Docker runnerlar to‘plamiga ega. Ushbu hujjat — “biz qanday test qilamiz” qo‘llanmasi:- Har bir to‘plam nimani qamrab oladi (va ataylab nimani qamrab olmaydi)
- Keng tarqalgan ish jarayonlari uchun qaysi buyruqlarni ishga tushirish kerak (lokal, pre-push, nosozliklarni tuzatish)
- Live testlar autentifikatsiya ma’lumotlarini qanday aniqlaydi va model/provayderlarni qanday tanlaydi
- Haqiqiy model/provayder muammolari uchun regressiya testlarini qanday qo‘shish kerak
Tezkor boshlash
Most days:- Full gate (expected before push):
pnpm build && pnpm check && pnpm test
- Coverage gate:
pnpm test:coverage - E2E suite:
pnpm test:e2e
- Live suite (models + gateway tool/image probes):
pnpm test:live
Test suites (what runs where)
Think of the suites as “increasing realism” (and increasing flakiness/cost):Unit / integration (default)
- Buyruq:
pnpm test - Config:
vitest.config.ts - Files:
src/**/*.test.ts - Scope:
- Pure unit tests
- In-process integration tests (gateway auth, routing, tooling, parsing, config)
- Deterministic regressions for known bugs
- Expectations:
- Runs in CI
- Haqiqiy kalitlar talab qilinmaydi
- Should be fast and stable
- Pool eslatmasi:
- OpenClaw tezroq unit shard’lar uchun Node 22/23 da Vitest
vmForksdan foydalanadi. - Node 24+ da OpenClaw Node VM bog‘lanish xatolaridan (
ERR_VM_MODULE_LINK_FAILURE/module is already linked) qochish uchun avtomatik ravishda oddiyforksga o‘tadi. OPENCLAW_TEST_VM_FORKS=0(majburiyforks) yokiOPENCLAW_TEST_VM_FORKS=1(majburiyvmForks) bilan qo‘lda bekor qiling.
- OpenClaw tezroq unit shard’lar uchun Node 22/23 da Vitest
E2E (gateway smoke)
- Command:
pnpm test:e2e - Config:
vitest.e2e.config.ts - Files:
src/**/*.e2e.test.ts - Runtime standart sozlamalari:
- Fayllarni tezroq ishga tushirish uchun Vitest
vmForksdan foydalanadi. - Moslashuvchan worker’lardan foydalanadi (CI: 2-4, local: 4-8).
- Konsol I/O yukini kamaytirish uchun sukut bo‘yicha silent rejimida ishlaydi.
- Fayllarni tezroq ishga tushirish uchun Vitest
- Foydali override’lar:
- Worker sonini majburan belgilash uchun
OPENCLAW_E2E_WORKERS=<n>(maksimal 16 bilan cheklangan). - Haqiqiy kalitlar talab qilinmaydi
- Worker sonini majburan belgilash uchun
- Scope:
- Multi-instance gateway end-to-end behavior
- WebSocket/HTTP surfaces, node pairing, and heavier networking
- Expectations:
- Runs in CI (when enabled in the pipeline)
- Haqiqiy kalitlar talab qilinmaydi
- More moving parts than unit tests (can be slower)
Live (real providers + real models)
- Command:
pnpm test:live - Config:
vitest.live.config.ts - Files:
src/**/*.live.test.ts - Default: enabled by
pnpm test:live(setsOPENCLAW_LIVE_TEST=1) - Scope:
- “Does this provider/model actually work today with real creds?”
- Catch provider format changes, tool-calling quirks, auth issues, and rate limit behavior
- Expectations:
- Not CI-stable by design (real networks, real provider policies, quotas, outages)
- Costs money / uses rate limits
- Prefer running narrowed subsets instead of “everything”
- Live runs will source
~/.profileto pick up missing API keys - Anthropic key rotation: set
OPENCLAW_LIVE_ANTHROPIC_KEYS="sk-...,sk-..."(orOPENCLAW_LIVE_ANTHROPIC_KEY=sk-...) or multipleANTHROPIC_API_KEY*vars; tests will retry on rate limits
Which suite should I run?
Use this decision table:- Editing logic/tests: run
pnpm test(andpnpm test:coverageif you changed a lot) - Touching gateway networking / WS protocol / pairing: add
pnpm test:e2e - Debugging “my bot is down” / provider-specific failures / tool calling: run a narrowed
pnpm test:live
Live: model smoke (profile keys)
Live tests are split into two layers so we can isolate failures:- “Direct model” bizga provayder/model berilgan kalit bilan umuman javob bera olishini bildiradi.
- “Gateway smoke” tells us the full gateway+agent pipeline works for that model (sessions, history, tools, sandbox policy, etc.).
Layer 1: Direct model completion (no gateway)
- Test:
src/agents/models.profiles.live.test.ts - Goal:
- Enumerate discovered models
- Use
getApiKeyForModelto select models you have creds for - Har bir model uchun kichik yakunlashni ishga tushiring (zarur bo‘lsa, maqsadli regressiyalar bilan)
- Qanday yoqish:
pnpm test:live(orOPENCLAW_LIVE_TEST=1if invoking Vitest directly)
- Bu to‘plamni haqiqatan ham ishga tushirish uchun
OPENCLAW_LIVE_MODELS=modern(yokiall, modern uchun alias) ni o‘rnating; aks holda u o‘tkazib yuboriladi vapnpm test:livegateway smoke testlariga yo‘naltirilgan bo‘lib qoladi - Modellarni qanday tanlash:
- Zamonaviy ruxsat etilgan ro‘yxatni ishga tushirish uchun
OPENCLAW_LIVE_MODELS=modern(Opus/Sonnet/Haiku 4.5, GPT-5.x + Codex, Gemini 3, GLM 4.7, MiniMax M2.1, Grok 4) OPENCLAW_LIVE_MODELS=all— zamonaviy ruxsat etilgan ro‘yxat uchun alias- yoki
OPENCLAW_LIVE_MODELS="openai/gpt-5.2,anthropic/claude-opus-4-6,..."(vergul bilan ajratilgan allowlist)
- Zamonaviy ruxsat etilgan ro‘yxatni ishga tushirish uchun
- Provayderlarni qanday tanlash:
OPENCLAW_LIVE_PROVIDERS="google,google-antigravity,google-gemini-cli"(vergul bilan ajratilgan allowlist)
- Kalitlar qayerdan olinadi:
- Sukut bo‘yicha: profil ombori va muhit (env) fallback’lari
- Faqat profil omborini majburiy qilish uchun
OPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1ni o‘rnating
- Nega bu mavjud:
- “Provayder API buzilgan / kalit yaroqsiz” holatini “gateway agent pipeline buzilgan” holatidan ajratadi
- Kichik, izolyatsiyalangan regressiyalarni o‘z ichiga oladi (masalan: OpenAI Responses/Codex Responses mantiqiy qayta ijro + tool-call oqimlari)
2-qavat: Gateway + dev agent smoke (“@openclaw” aslida nima qiladi)
- Test:
src/gateway/gateway-models.profiles.live.test.ts - Goal:
- Jarayon ichidagi gateway’ni ishga tushiring
agent:dev:*sessiyasini yaratish/patch qilish (har ishga tushirishda modelni override qilish)- Kalitlari bor modellar bo‘ylab aylanish va quyidagilarni tekshirish:
- “mazmunli” javob (toolsiz)
- haqiqiy tool chaqiruvi ishlashi (read probe)
- ixtiyoriy qo‘shimcha tool probelari (exec+read probe)
- OpenAI regressiya yo‘llari (faqat tool-call → follow-up) ishlashda davom etadi
- Probe tafsilotlari (nosozliklarni tez tushuntira olish uchun):
readprobe: test ish maydonida nonce fayl yozadi va agentdan unireadqilib nonce’ni qayta aks ettirishni so‘raydi.exec+readprobe: test agentdan nonce’ni vaqtinchalik faylgaexecorqali yozishni, so‘ng unireadqilib qaytarishni so‘raydi.- image probe: test yaratilgan PNG’ni (mushuk + tasodifiy kod) biriktiradi va modeldan
cat <CODE>qaytarishini kutadi. - Implementatsiya uchun havola:
src/gateway/gateway-models.profiles.live.test.tsvasrc/gateway/live-image-probe.ts.
- Qanday yoqish:
pnpm test:live(orOPENCLAW_LIVE_TEST=1if invoking Vitest directly)
- Modellarni qanday tanlash:
- Odatiy: zamonaviy ruxsat etilgan ro‘yxat (Opus/Sonnet/Haiku 4.5, GPT-5.x + Codex, Gemini 3, GLM 4.7, MiniMax M2.1, Grok 4)
OPENCLAW_LIVE_GATEWAY_MODELS=all— zamonaviy ruxsat etilgan ro‘yxat uchun alias- Yoki toraytirish uchun
OPENCLAW_LIVE_GATEWAY_MODELS="provider/model"(yoki vergul bilan ajratilgan ro‘yxat) ni o‘rnating
- Provayderlarni qanday tanlash (“OpenRouter everything”dan qochish):
OPENCLAW_LIVE_GATEWAY_PROVIDERS="google,google-antigravity,google-gemini-cli,openai,anthropic,zai,minimax"(vergul bilan ajratilgan allowlist)
- Tool + image probelari bu live testda har doim yoqilgan:
readprobe +exec+readprobe (tool stress)- Model image input qo‘llab-quvvatlashini e’lon qilganda image probe ishga tushadi
- Oqim (yuqori darajada):
- Test “CAT” + tasodifiy kod bilan kichik PNG yaratadi (
src/gateway/live-image-probe.ts) - Uni
agentorqaliattachments: [{ mimeType: "image/png", content: "<base64>" }]bilan yuboradi - Gateway biriktirmalarni
images[]ga ajratadi (src/gateway/server-methods/agent.ts+src/gateway/chat-attachments.ts) - Ichki agent multimodal foydalanuvchi xabarini modelga uzatadi
- Tekshiruv: javob
cat+ kodni o‘z ichiga oladi (OCR tolerantligi: kichik xatolarga ruxsat beriladi)
- Test “CAT” + tasodifiy kod bilan kichik PNG yaratadi (
provider/model ID’larini) ko‘rish uchun quyidagini ishga tushiring:
Live: Anthropic setup-token smoke
- Test:
src/agents/anthropic.setup-token.live.test.ts - Goal: verify Claude Code CLI setup-token (or a pasted setup-token profile) can complete an Anthropic prompt.
- Enable:
pnpm test:live(orOPENCLAW_LIVE_TEST=1if invoking Vitest directly)OPENCLAW_LIVE_SETUP_TOKEN=1
- Token sources (pick one):
- Profile:
OPENCLAW_LIVE_SETUP_TOKEN_PROFILE=anthropic:setup-token-test - Raw token:
OPENCLAW_LIVE_SETUP_TOKEN_VALUE=sk-ant-oat01-...
- Profile:
- Model override (optional):
OPENCLAW_LIVE_SETUP_TOKEN_MODEL=anthropic/claude-opus-4-6
Live: CLI backend smoke (Claude Code CLI or other local CLIs)
- Test:
src/gateway/gateway-cli-backend.live.test.ts - Goal: validate the Gateway + agent pipeline using a local CLI backend, without touching your default config.
- Enable:
pnpm test:live(orOPENCLAW_LIVE_TEST=1if invoking Vitest directly)OPENCLAW_LIVE_CLI_BACKEND=1
- Defaults:
- Model:
claude-cli/claude-sonnet-4-5 - Buyruq:
claude - Args:
["-p","--output-format","json","--dangerously-skip-permissions"]
- Model:
- Overrides (optional):
OPENCLAW_LIVE_CLI_BACKEND_MODEL="claude-cli/claude-opus-4-6"OPENCLAW_LIVE_CLI_BACKEND_MODEL="codex-cli/gpt-5.3-codex"OPENCLAW_LIVE_CLI_BACKEND_COMMAND="/full/path/to/claude"OPENCLAW_LIVE_CLI_BACKEND_ARGS='["-p","--output-format","json","--permission-mode","bypassPermissions"]'OPENCLAW_LIVE_CLI_BACKEND_CLEAR_ENV='["ANTHROPIC_API_KEY","ANTHROPIC_API_KEY_OLD"]'OPENCLAW_LIVE_CLI_BACKEND_IMAGE_PROBE=1to send a real image attachment (paths are injected into the prompt).OPENCLAW_LIVE_CLI_BACKEND_IMAGE_ARG="--image"to pass image file paths as CLI args instead of prompt injection.IMAGE_ARGo‘rnatilganda rasm argumentlari qanday uzatilishini boshqarish uchunOPENCLAW_LIVE_CLI_BACKEND_IMAGE_MODE="repeat"(yoki"list").OPENCLAW_LIVE_CLI_BACKEND_RESUME_PROBE=1to send a second turn and validate resume flow.
OPENCLAW_LIVE_CLI_BACKEND_DISABLE_MCP_CONFIG=0to keep Claude Code CLI MCP config enabled (default disables MCP config with a temporary empty file).
Recommended live recipes
Narrow, explicit allowlists are fastest and least flaky:-
Single model, direct (no gateway):
OPENCLAW_LIVE_MODELS="openai/gpt-5.2" pnpm test:live src/agents/models.profiles.live.test.ts
-
Single model, gateway smoke:
OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.2" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts
-
Tool calling across several providers:
OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.2,anthropic/claude-opus-4-6,google/gemini-3-flash-preview,zai/glm-4.7,minimax/minimax-m2.1" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts
-
Google focus (Gemini API key + Antigravity):
- Gemini (API key):
OPENCLAW_LIVE_GATEWAY_MODELS="google/gemini-3-flash-preview" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts - Antigravity (OAuth):
OPENCLAW_LIVE_GATEWAY_MODELS="google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-pro-high" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts
- Gemini (API key):
google/...uses the Gemini API (API key).google-antigravity/...uses the Antigravity OAuth bridge (Cloud Code Assist-style agent endpoint).google-gemini-cli/...mashinangizdagi lokal Gemini CLI’dan foydalanadi (alohida autentifikatsiya + asboblar o‘ziga xosliklari).- Gemini API va Gemini CLI taqqoslanishi:
- API: OpenClaw Google’ning joylashtirilgan Gemini API’sini HTTP orqali chaqiradi (API kaliti / profil autentifikatsiyasi); ko‘pchilik foydalanuvchilar “Gemini” deganda shuni nazarda tutadi.
- CLI: OpenClaw lokal
geminibinarini ishga tushiradi; uning o‘z autentifikatsiyasi bor va u boshqacha ishlashi mumkin (striming/asboblar qo‘llab-quvvatlashi/versiya nomuvofiqligi).
Live: model matritsasi (biz qamrab oladiganlari)
Doimiy “CI model ro‘yxati” yo‘q (live — ixtiyoriy), ammo bular kalitlari mavjud bo‘lgan dev mashinada muntazam qamrab olish uchun tavsiya etilgan modellardir.Zamonaviy smoke to‘plam (asbob chaqirish + rasm)
Bu biz ishlashini saqlab qolishni kutadigan “umumiy modellar” yugurishi:- OpenAI (Codex emas):
openai/gpt-5.2(ixtiyoriy:openai/gpt-5.1) - OpenAI Codex:
openai-codex/gpt-5.3-codex(ixtiyoriy:openai-codex/gpt-5.3-codex-codex) - Anthropic:
anthropic/claude-opus-4-6(yokianthropic/claude-sonnet-4-5) - Google (Gemini API):
google/gemini-3-pro-previewvagoogle/gemini-3-flash-preview(eski Gemini 2.x modellaridan qoching) - Google (Antigravity):
google-antigravity/claude-opus-4-6-thinkingvagoogle-antigravity/gemini-3-flash - Z.AI (GLM):
zai/glm-4.7 - MiniMax:
minimax/minimax-m2.1
OPENCLAW_LIVE_GATEWAY_MODELS="openai/gpt-5.2,openai-codex/gpt-5.3-codex,anthropic/claude-opus-4-6,google/gemini-3-pro-preview,google/gemini-3-flash-preview,google-antigravity/claude-opus-4-6-thinking,google-antigravity/gemini-3-flash,zai/glm-4.7,minimax/minimax-m2.1" pnpm test:live src/gateway/gateway-models.profiles.live.test.ts
Asosiy: asbob chaqirish (Read + ixtiyoriy Exec)
Har bir provayder oilasidan kamida bittasini tanlang:- OpenAI:
openai/gpt-5.2(yokiopenai/gpt-5-mini) - Anthropic:
anthropic/claude-opus-4-6(yokianthropic/claude-sonnet-4-5) - Google:
google/gemini-3-flash-preview(yokigoogle/gemini-3-pro-preview) - Z.AI (GLM):
zai/glm-4.7 - MiniMax:
minimax/minimax-m2.1
- xAI:
xai/grok-4(yoki mavjud eng so‘nggi) - Mistral:
mistral/… (sizda yoqilgan “tools” imkoniyatiga ega bitta modelni tanlang) - Cerebras:
cerebras/… (agar ruxsatingiz bo‘lsa) - LM Studio:
lmstudio/… (lokal; asbob chaqirish API rejimiga bog‘liq)
Vision: rasm yuborish (ilova → multimodal xabar)
OPENCLAW_LIVE_GATEWAY_MODELS ichiga kamida bitta rasmni qo‘llab-quvvatlaydigan modelni kiriting (Claude/Gemini/OpenAI’ning vision imkoniyatli variantlari va h.k.) rasm tekshiruvini ishga tushirish uchun.
Aggregatorlar / muqobil gateway’lar
Agar kalitlaringiz yoqilgan bo‘lsa, biz quyidagilar orqali ham testlashni qo‘llab-quvvatlaymiz:- OpenRouter:
openrouter/...(yuzlab modellar; asbob+rasm imkoniyatiga ega nomzodlarni topish uchunopenclaw models scandan foydalaning) - OpenCode Zen:
opencode/...(OPENCODE_API_KEY/OPENCODE_ZEN_API_KEYorqali autentifikatsiya)
- O‘rnatilgan:
openai,openai-codex,anthropic,google,google-vertex,google-antigravity,google-gemini-cli,zai,openrouter,opencode,xai,groq,cerebras,mistral,github-copilot models.providersorqali (maxsus endpoint’lar):minimax(cloud/API), shuningdek OpenAI/Anthropic’ga mos keladigan istalgan proksi (LM Studio, vLLM, LiteLLM va h.k.)
discoverModels(...) nimani qaytarsa + mavjud kalitlar.
Credential’lar (hech qachon commit qilmang)
Live testlar credential’larni CLI qanday topса, xuddi shunday topadi. Amaliy oqibatlar:- Agar CLI ishlasa, live testlar ham xuddi shu kalitlarni topishi kerak.
-
Agar live test “no creds” desa,
openclaw models list/ model tanlashni qanday debug qilsangiz, xuddi shunday debug qiling. -
Profil ombori:
~/.openclaw/credentials/(afzal; testlarda “profile keys” degani shu) -
Konfiguratsiya:
~/.openclaw/openclaw.json(yokiOPENCLAW_CONFIG_PATH)
~/.profile), run local tests after source ~/.profile, or use the Docker runners below (they can mount ~/.profile into the container).
Deepgram live (audio transcription)
- Test:
src/media-understanding/providers/deepgram/audio.live.test.ts - Enable:
DEEPGRAM_API_KEY=... DEEPGRAM_LIVE_TEST=1 pnpm test:live src/media-understanding/providers/deepgram/audio.live.test.ts
Docker runners (optional “works in Linux” checks)
These runpnpm test:live inside the repo Docker image, mounting your local config dir and workspace (and sourcing ~/.profile if mounted):
- Direct models:
pnpm test:docker:live-models(script:scripts/test-live-models-docker.sh) - Gateway + dev agent:
pnpm test:docker:live-gateway(script:scripts/test-live-gateway-models-docker.sh) - Onboarding wizard (TTY, full scaffolding):
pnpm test:docker:onboard(script:scripts/e2e/onboard-docker.sh) - Gateway networking (two containers, WS auth + health):
pnpm test:docker:gateway-network(script:scripts/e2e/gateway-network-docker.sh) - Plugins (custom extension load + registry smoke):
pnpm test:docker:plugins(script:scripts/e2e/plugins-docker.sh)
OPENCLAW_CONFIG_DIR=...(default:~/.openclaw) mounted to/home/node/.openclawOPENCLAW_WORKSPACE_DIR=...(default:~/.openclaw/workspace) mounted to/home/node/.openclaw/workspaceOPENCLAW_PROFILE_FILE=...(default:~/.profile) mounted to/home/node/.profileand sourced before running testsOPENCLAW_LIVE_GATEWAY_MODELS=.../OPENCLAW_LIVE_MODELS=...to narrow the runOPENCLAW_LIVE_REQUIRE_PROFILE_KEYS=1to ensure creds come from the profile store (not env)
Docs sanity
Run docs checks after doc edits:pnpm docs:list.
Offline regression (CI-safe)
These are “real pipeline” regressions without real providers:- Gateway tool calling (mock OpenAI, real gateway + agent loop):
src/gateway/gateway.tool-calling.mock-openai.test.ts - Gateway wizard (WS
wizard.start/wizard.next, writes config + auth enforced):src/gateway/gateway.wizard.e2e.test.ts
Agent reliability evals (skills)
We already have a few CI-safe tests that behave like “agent reliability evals”:- Mock tool-calling through the real gateway + agent loop (
src/gateway/gateway.tool-calling.mock-openai.test.ts). - End-to-end wizard flows that validate session wiring and config effects (
src/gateway/gateway.wizard.e2e.test.ts).
- Decisioning: when skills are listed in the prompt, does the agent pick the right skill (or avoid irrelevant ones)?
- Compliance: does the agent read
SKILL.mdbefore use and follow required steps/args? - Workflow contracts: multi-turn scenarios that assert tool order, session history carryover, and sandbox boundaries.
- A scenario runner using mock providers to assert tool calls + order, skill file reads, and session wiring.
- A small suite of skill-focused scenarios (use vs avoid, gating, prompt injection).
- Optional live evals (opt-in, env-gated) only after the CI-safe suite is in place.
Adding regressions (guidance)
When you fix a provider/model issue discovered in live:- Add a CI-safe regression if possible (mock/stub provider, or capture the exact request-shape transformation)
- If it’s inherently live-only (rate limits, auth policies), keep the live test narrow and opt-in via env vars
- Prefer targeting the smallest layer that catches the bug:
- provider request conversion/replay bug → direct models test
- gateway session/history/tool pipeline bug → gateway live smoke or CI-safe gateway mock test