Spells v2 Walkthrough
Operational readiness, testing gates, and launch decisions.
Observability is treated as part of the financial safety model. Trade execution, position monitoring, trigger evaluation, and emergency close each have their own SLOs, metrics, dashboards, and alerting rules because the team needs to know when a degraded mode is still safe and when it has crossed into dangerous territory.
<30s: Target end-to-end latency on Arbitrum, from cast request to finalized state.
<45s: A healthy position-monitor cycle must finish inside its 60-second cadence.
<5s: From threshold breach to dispatch completion for critical alerts.
99.99%: Availability target for the safety net endpoint.
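These targets can be captured as explicit constants so dashboards and alerting evaluate against one source of truth. A minimal TypeScript sketch, with the numbers taken from the targets above; the identifiers and the breach rule are illustrative assumptions, not the project's actual constants:

```typescript
// Hypothetical SLO table mirroring the four targets above.
type Slo = { target: number; unit: "ms" | "pct"; description: string };

const SLOS: Record<string, Slo> = {
  tradeLatency: { target: 30_000, unit: "ms", description: "cast request to finalized state (Arbitrum)" },
  monitorCycle: { target: 45_000, unit: "ms", description: "position monitor cycle, inside the 60s cadence" },
  alertDispatch: { target: 5_000, unit: "ms", description: "threshold breach to dispatch completion" },
  safetyNetAvailability: { target: 99.99, unit: "pct", description: "safety net endpoint availability" },
};

// A latency SLO is breached when the observed value exceeds the target;
// an availability SLO is breached when it falls below the target.
function breached(name: keyof typeof SLOS, observed: number): boolean {
  const slo = SLOS[name];
  return slo.unit === "ms" ? observed > slo.target : observed < slo.target;
}
```

Keeping the unit alongside the number makes the breach direction explicit, which matters once latency and availability targets share one table.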
Worker, workflows, TLM, Neon, KV, queues, bundlers, and RPCs each contribute to a unified status grid.
Trade funnel, gas efficiency, failover events, paymaster health, and active exposure distribution.
Traffic, active users, active triggers, route latency, and user-facing error composition.
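The unified status grid can be sketched as a worst-status rollup over the contributing components. The component names come from the text above; the severity ordering and rollup logic are assumptions for illustration:

```typescript
// Each dependency reports one of three states; the grid rolls up to the
// worst state observed across all of them.
type Status = "healthy" | "degraded" | "down";

const SEVERITY: Record<Status, number> = { healthy: 0, degraded: 1, down: 2 };

const COMPONENTS = [
  "worker", "workflows", "tlm", "neon", "kv", "queues", "bundlers", "rpcs",
] as const;

function rollup(statuses: Record<(typeof COMPONENTS)[number], Status>): Status {
  let worst: Status = "healthy";
  for (const c of COMPONENTS) {
    if (SEVERITY[statuses[c]] > SEVERITY[worst]) worst = statuses[c];
  }
  return worst;
}
```

A worst-of rollup is deliberately pessimistic: one degraded RPC provider marks the whole grid degraded, which fits a system where a quiet partial failure is more dangerous than a loud one.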
The stack uses static analysis, unit and integration tests, on-chain tests, E2E journeys, load tests, and chaos tests. The point is not to use every tool; it is to match each tool to the type of failure the system can actually suffer.
Protect shared business logic, schemas, query modules, worker contracts, and the TLM state machine.
Only the narrowest critical journeys: cast, trigger fire, emergency close, and DISARM.
Stress the monitor, edge API, queue replay, bundler/RPC failover, and database outage cascades.
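The narrow E2E tier can be expressed as a launch gate: all four critical journeys must pass before a build is launchable. The journey names come from the text; the gate function itself is a hypothetical sketch, not the project's CI code:

```typescript
// The four journeys named above; nothing else blocks the gate.
const CRITICAL_JOURNEYS = ["cast", "trigger-fire", "emergency-close", "disarm"] as const;

type JourneyResult = { journey: string; passed: boolean };

// Returns whether launch is allowed and which critical journeys are
// missing a passing result (failed or never run).
function launchGate(results: JourneyResult[]): { ok: boolean; missing: string[] } {
  const passed = new Set(results.filter((r) => r.passed).map((r) => r.journey));
  const missing = CRITICAL_JOURNEYS.filter((j) => !passed.has(j));
  return { ok: missing.length === 0, missing: [...missing] };
}
```

Treating a never-run journey the same as a failed one keeps the gate honest: a skipped emergency-close test should block launch exactly as a failing one would.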
Shared contracts and business logic lock first. Execution backbone and GMX integration follow. API and web come after the backend seams stabilize. Resilience and intelligence hardening are added before launch-readiness closes the loop.
1. Audit edge cases, scaffold the monorepo, lock schemas and shared constants, and stabilize package contracts.
2. Implement the TLM, core workflows, registry, GMX calldata and wrapper path, and prove end-to-end execution on testnets.
3. Ship the user-facing surfaces, then harden queue replay, fallback reads, monitoring, TP/SL, and signal pipelines.
4. Close observability and runbooks, then leave cross-region Neon, stronger key infrastructure, and advanced nonce parallelism for justified later work.
The edge-case audit and GMX integration spec are effectively a list of things the team must not get wrong. These are not “nice-to-have hardening tasks.” They are the practical controls that prevent protocol quirks from becoming user-facing losses.
The spec marks a small set of questions as deferred rather than silently settled. They should stay visible so the team does not mistake present launch scope for a permanent architecture ceiling.
The constants reference exists so the team can audit drift between documents. It is where cadence, retry budgets, fee caps, wrapper bounds, alert thresholds, and quality gates become numerically explicit rather than narrative.
Cloudflare, Neon, and Vercel stay modest until monitoring scale, replay load, or observability storage grows materially.
Cadences, timeout windows, fee caps, alert thresholds, and wrapper bounds all live in one audit surface.
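The drift audit that surface enables can be as simple as comparing the value each document states for the same constant name. This is an illustrative sketch; the constant names and the helper are assumptions, not project code:

```typescript
// Compare a reference constants table against the values another document
// states, and report every named constant whose numbers disagree.
function auditDrift(
  reference: Record<string, number>,
  doc: Record<string, number>,
): string[] {
  const drift: string[] = [];
  for (const [name, value] of Object.entries(reference)) {
    if (name in doc && doc[name] !== value) {
      drift.push(`${name}: reference=${value}, doc=${doc[name]}`);
    }
  }
  return drift;
}
```

Constants a document simply omits are not flagged; only stated-but-different values count as drift, which keeps the audit focused on genuine contradictions between documents.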
Trade latency, alert delivery, monitor cycle completion, and emergency-close availability remain the visible reliability promises.