Is Cowork actually working
for your users and clients?

Argus captures every session of your users across your team or organization, then reads across thousands of them at once — surfacing where skills are holding, where they're quietly breaking, and which patterns are pointing at the next thing you should build. The Argus Agent does the reading. You can talk to it in the web app, and soon, from inside Cowork itself.

Argusv1.0 · MMXXVIsample-studio
sara@studio · 27.V.MMXXVI
№0001 · Week of 14 May 2026

Dashboard

Sessions captured192▲ from 158 last week
Skills under watch14across 6 client projects
Regressions surfaced43 unreviewed
Skill candidates2from 14 unmet prompts
Skills under watch14 total →
portfolio-reportingv1.2.0
47 sessions78% first-turnregression
quarterly-deck7a3f9c
31 sessions94% first-turnholding
export-csvv0.4.1
24 sessions88% first-turnholding
variance-summaryv1.0.0
18 sessions67% first-turnreview
monthly-summaryv0.9.2
12 sessions91% first-turnholding
Argus AgentSurfaced this week
Regression

portfolio-reporting v1.2.0 · first-turn acceptance dropped 96% → 22% over 7 days. 5 sessions mark the break.

Pattern

“export to notion” recurring 14× across 8 phrasings · candidate skill drafted.

Stall

monthly-summary asks “which property?” 31× this month · under-specified input.

3 of 9 surfaced this week →
A view of the dashboard, week of 14 May 2026 — sample workspace.
The problem

Once it ships, you go blind.

Telemetry tells you that a skill ran. It can't tell you whether it worked — and it definitely can't tell you whether it stopped working halfway through the month.

№ 01

The counter that says nothing.

Cowork's built-in telemetry logs your skill as Skill: 3. Three invocations. Across 412 sessions for one client this month, you have a single number per skill — invocation count. Nothing about which versions ran, what they returned, whether the user accepted the answer or had to ask twice. The unit every observability stack is built on can't answer the question every shipped skill has to face: across N runs, was it any good?

observed412 sessionssurfaced1 aggregate counter
№ 02

The skill that broke quietly.

You shipped portfolio-reporting four weeks ago. Across the first 38 sessions, users accepted the answer on the first turn. Across the next 9, they re-asked, rephrased, switched tools. Something started failing on session 39. Nobody noticed — the cost line stayed flat and no single session looked broken on its own.

skillportfolio-reporting · v1.2.0patternfirst-turn 96% → 22%
№ 03

The skill that wasn't there yet.

Across the client's traffic this month, “export the variance table to Notion” came up fourteen times in eight different phrasings. Each one got a different ad-hoc answer; one user gave up. A skill is waiting to be written there. No telemetry surface — yours or anyone else's — will ever find it.

pattern“export to notion”sessions14 · 8 phrasings · no skill
The blind spot

What the counters can't see.

Argus listens through Cowork's hook system, not its telemetry exporter. Every prompt, every assistant message, every tool output, every follow-up — captured in plain text and stitched back into the conversation the user actually had. The qualitative layer that makes everything else possible.

№ 01

Did the user accept the answer?

A skill that works ends the conversation. A skill that doesn't gets re-prompted, rephrased, abandoned. Argus captures the user's exact follow-ups so the Agent can see, at a glance across hundreds of sessions, which versions of which skill are landing on the first turn — and which aren't.

Captured · used in: first-turn acceptance, follow-up patterns
№ 02

Did the assistant ask the user to do its job?

Skills should answer questions, not ask new ones. When a skill is under-specified, the assistant stalls — “could you clarify…”, “which one did you mean…” — and the user does the work the skill was meant to do. Argus captures the stalls so the Agent can show where they cluster.

Captured · used in: stall frequency, under-specified prompts
№ 03

What did the tool actually return?

The MCP call succeeded. Status 200. But the payload was empty, or a 400-row dump, or a JSON that didn't match what the skill asked for. Argus captures the tool's plain-text output beside the assistant's response, so the Agent can flag the sessions where the skill kept going on bad input.

Captured · used in: tool-output mismatches, silent failures
№ 04

What did users ask for that no skill could handle?

The prompts your customisations don't yet cover. Same export, same lookup, same wrangle — captured verbatim, even when nothing answered them. The Agent reads across the unmet-prompts corpus and surfaces patterns ready to become the next skill.

Captured · used in: unmet-prompt clustering, skill candidates
Four views

Your work,
made legible.

Once every session is captured qualitatively, four surfaces become possible: across your sessions, across your skills, across versions over time, and across the needs your team keeps surfacing. The Argus Agent runs the analysis underneath; you read the result.

№ 01

Sessions, organised.

Every Cowork session your team runs, captured and indexed across your organisation. Filterable per skill, per MCP, per client, per error class, per outcome — not a flat trace list, but a surface built for the questions you actually ask: “every session this week where portfolio-reporting fell back to a follow-up turn.”

Filter · group · drill down
№ 02

Usage, per skill and per MCP.

A dashboard for every skill, MCP, agent, and plugin in your catalogue — built on the qualitative signals captured per session, not on token counts. First-turn acceptance, follow-up rate, stall frequency, tool-output mismatches, hand-offs. The metrics that tell you whether the work is actually working.

Qualitative metrics · per artefact
№ 03

Quality across versions.

Every skill version, plotted against its sessions. v1.2.0's first-turn acceptance against v1.1.0. The week you shipped a refinement against the week things drifted. Quality timelines per skill, tied to the commits in the client's marketplace repo.

Version timelines · quality drift
№ 04

User needs, at scale.

What your team and your client's team keep asking Cowork to do — clustered, ranked, surfaced. Recurring prompts no skill answers. Phrasings that should cluster into one but don't. The candidate skills waiting to be written. The Agent runs the clustering; you decide what to ship next.

Unmet-prompt clustering · skill candidates
The Argus Agent is itself a Cowork plugin

Available today in the Argus web app. Coming as a Cowork plugin — invoke it during a working session, in the same surface where your team builds.

Argus helps you ship better Cowork customisations.
Its Agent is one of them.

A sample exhibit

A session, as it is filed.

One session is the receipt. The panel above it tells you where this session sits in the pattern.

PORTFOLIO-REPORTING · v1.2.0 · ACROSS 47 SESSIONS THIS WEEK
First-turn acceptance78% · was 96% three weeks ago
Median follow-ups1.4
Tool-output mismatches3 · this session is one
№ 0042 · sample-client · sara@studio
Quarterly variance report — north-region portfolio
3m 14s6 turns9 tools$0.4205
User · 14:23:12
Run the Q2 variance report for the north-region portfolio. Flag anything more than 12% off budget.
Assistant · 14:23:14
Pulling the budget and actuals now, then comparing against the 12% threshold.
skillportfolio-reporting · v1.2.0 · invoked
mcp · property-mgrlist_budgets · 412ms ✓
mcp · property-mgractuals_by_period · 318ms ✓
Assistant · 14:23:55
Two line items above the 12% threshold — repairs & maintenance (+18.4%) and utilities (+14.1%). The rest of the portfolio is within tolerance.
User · 14:25:01
Generate the deck for the owner. Note the two variances on the summary slide and add a short comment on each.
Assistant · 14:25:04
Deck rendered — 7 slides, variance summary on slide 2, line items detailed on slides 4–5.
skillquarterly-deck · 7a3f9c · 2.4s ✓
№ 0042 · margin

“The metric is in the response now — this is the version we certify.”

— Sara, 5 min after close

ReviewedAccepted 1st turnv1.2.0Pinned

Hover any turn to read its tokens. Press A to write into the margin yourself.

Data & privacy

Where the data goes.

Argus captures the most sensitive content in your client's organisation — real prompts, real responses, real tool output. The product is built around that.

№ 01

Secrets never leave the machine.

The capture plugin scrubs API keys, OAuth tokens, Bearer headers, and common password patterns at the source — before the envelope leaves the user's computer. Anthropic, OpenAI, Supabase, GitHub, AWS, Slack patterns are caught by default; you can add your own.

Built · plugin-side · pre-transit
№ 02

One word makes a session private.

Type /private in Cowork at any point and the plugin stops capturing that session — and deletes anything already shipped. The escape hatch is the user's, not the agency's. No support ticket, no admin approval.

Built · the /private command
№ 03

Redact names before review.

Per-workspace patterns for emails, names, and custom regex run beside the secret scrubber. The reviewer sees what they need to QA the skill; client employee identity stays at home.

Coming · per-workspace redaction patterns
№ 04

Encrypted, isolated, never trained on.

TLS to the ingestion worker, AES-256 at rest in the database, workspace-isolated by row-level security on every query. Argus never uses your captured sessions to train any model — ours, Anthropic's, or anyone else's.

Standing policy · enforced at the database
Audience

Three rooms, one Argus.

Argus serves the people delivering AI inside other people's organisations. Three roles share the same surface, scoped differently.

Primary · I

The agency, the forward-deployed engineer.

You ship Cowork customisations to a roster of clients. Skills, MCPs, plugins, agents — each delivery lives inside someone else's setup, where you don't have eyes on it. Argus gives you a workspace per client: every session captured, every skill versioned to a commit in their marketplace, every regression catchable before the renewal meeting.

  • Catch regressions in shipped skills
  • Refine versions from session insight
  • Invite the client into a scoped view of their data
Secondary · II

The internal IT lead.

You rolled out Cowork inside your company. Adoption is the number you can't fake — and the one Cowork's own dashboard doesn't surface. Argus shows you who's using it, what they ask for that no skill answers, and which destructive operations ran last night.

  • Adoption + cost per department
  • Lapsed users, champions, spikes
  • Recurring unmet prompts → next skill
Read-only · III

The client stakeholder.

Your agency is delivering Cowork customisations on your behalf. They invite you into a scoped view inside their Argus workspace — your own org's sessions only, white-labelled to your house style, read-only.

  • Scoped to your org's data only
  • White-labelled to your agency's style
  • No setup on your side · invitation handles it
Progress

Where we are in the work.

Three things known to be true about the current state of AI inside organisations.

95%
of enterprise GenAI pilots produce no measurable P&L impact.
Most failures aren't model quality. They're a missing iteration loop.
MIT Project NANDA · State of AI in Business 2025
70%
of forward-deployed AI engagements forecast to be abandoned by 2028.
Vendors can't continuously evidence value; clients can't evolve the work alone.
Gartner · via CIO.com · 2025
78%
of AI users at work bring their own tools — the sanctioned deployment doesn't fit the work.
Shadow AI is the tell that the deployment missed.
Microsoft Work Trend Index · 2024
Launch

Doors open in June 26.

Argus is private while we finish the work. Web app, capture plugin, and Argus Agent — all built first for Cowork. Open to what follows it.

LEAVE A CARD№0001
What do you ship?

No marketing emails. One letter when we open, one if anything changes about that.

What you're joining
  • The Argus web app — review, catalogue, audience
  • The capture plugin — hook-level session telemetry
  • The Argus Agent — analysis across every session, every skill, every version

Available at invitation, in batches. We open access once the workshop holds a session without our hand on it.

Argus · est. MMXXVIBuilt first for Cowork · open to what follows itSet in Source Serif 4 & IBM Plex Mono