Karmaflow

Answers, with the work shown.

A data science agent that turns a plain-English question into a board-ready brief — executive summary, drivers, cohorts, recommended actions, and a methods appendix any analyst can audit. It reads your warehouse and your unstructured text in the same pass, and writes results back so the next analysis starts from the last one, not from zero.

Text-to-SQL chatbot

Answers the query

  • Generates a single query against a semantic layer.
  • Hands back a chart with no provenance and no narrative.
  • Analyst still has to write the brief, defend the method, and re-derive the cohorts next week.

Data science agent

Answers the question

  • Joins structured warehouse data with the unstructured text behind it.
  • Ships an exec summary, drivers, cohorts, and a methods appendix — every claim cited.
  • Writes findings back to graph memory so the next run starts from the last.

The output

A brief, not a chart.

Most “AI for data” tools stop at a chart and assume the analyst will finish the story. The agent ships the finished story by default — and the methods appendix that lets your reviewer trust it.

Brief · Q3 EU mid-market churn
Asked 2 minutes ago · 4 sources · confidence 0.82

Executive summary

14 EU mid-market accounts are at elevated 30-day churn risk.

Risk is concentrated in accounts with two repeated billing failures in the last 60 days and a measurable drop in seat utilisation after their renewal anniversary. Sentiment in their last five support interactions trended from neutral to frustrated. Two of the fourteen have an open Gong call where price is the dominant topic.

The wedge

Joins structured data with the text behind it.

Warehouse-only tools answer the half of the question that fits in a SQL grammar. Most of the question lives in tickets, transcripts, and notes. The agent reads both in one pass, and tells you which sources shaped each claim.

The question

“Why is EU mid-market churning, and is it pricing or product?”

Structured

Joins finance.invoices, product.seat_events, and crm.opportunities across the EU mid-market segment for the trailing 90 days.

Unstructured

Reads the support tickets and sales-call transcripts attached to those accounts — extracts sentiment trend, dominant topics, and concrete quotes for citation.

The answer

Mostly product friction, not pricing — and the brief proves both.

Population-level drivers are billing failures and seat-utilisation drop after renewal. Pricing shows up in two specific accounts and is flagged separately so the CSM treats them as bespoke, not as evidence of a pattern.

  • Structured: 3 warehouse tables, 14 rows surfaced, joined on account_id
  • Unstructured: 27 tickets, 8 calls, 4 verbatim quotes cited inline
  • Confidence: 0.82 per-claim; 2 claims under 0.65 flagged as “needs review”
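
The join behind this answer can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the agent's implementation: the record shapes, the `flag_at_risk` helper, the account names, and the sample values are all hypothetical. Only the `account_id` join key and the "two billing failures in the last 60 days" rule come from the brief above.

```python
from collections import defaultdict
from datetime import date, timedelta


def flag_at_risk(invoices, ticket_sentiment, today, window_days=60, min_failures=2):
    """Return accounts with at least `min_failures` failed invoices inside the
    window, paired with the sentiment labels found for the same account_id."""
    cutoff = today - timedelta(days=window_days)

    # Structured side: count failed invoices per account within the window.
    failures = defaultdict(int)
    for inv in invoices:
        if inv["status"] == "failed" and inv["date"] >= cutoff:
            failures[inv["account_id"]] += 1

    # Join on account_id with the unstructured side (sentiment per account).
    flagged = {}
    for account_id, count in failures.items():
        if count >= min_failures:
            flagged[account_id] = {
                "billing_failures": count,
                "sentiment": ticket_sentiment.get(account_id, []),
            }
    return flagged


# Hypothetical sample data, for illustration only.
today = date(2026, 9, 30)
invoices = [
    {"account_id": "acme", "status": "failed", "date": date(2026, 9, 1)},
    {"account_id": "acme", "status": "failed", "date": date(2026, 9, 20)},
    {"account_id": "globex", "status": "failed", "date": date(2026, 9, 10)},
    {"account_id": "initech", "status": "failed", "date": date(2026, 5, 1)},
    {"account_id": "initech", "status": "failed", "date": date(2026, 9, 2)},
]
ticket_sentiment = {"acme": ["neutral", "frustrated"]}

at_risk = flag_at_risk(invoices, ticket_sentiment, today)
print(at_risk)
```

Keeping the sentiment labels next to the billing count is what lets a reviewer see which sources shaped the claim for each account.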

The compounding part

Insight that does not reset every week.

Most analyst work today is rediscovery: the same cohorts re-derived from scratch, the same assumptions re-typed into a new notebook. Each brief is written back to a graph memory the agent reads on the next run, so yesterday's definitions are tomorrow's starting point.

Without compounding memory

Every run starts from zero.

An analyst writes the cohort, the thresholds, and the “exclude trial accounts” logic again. A reviewer compares two briefs and cannot tell whether the difference is the data or the definitions.

Run 2 · re-derived from scratch
  • cohort: re-typed by analyst
  • thresholds: re-typed by analyst
  • exclusions: re-typed by analyst
  • time-to-brief: hours

With compounding memory

Run 2 references Run 1 by name.

Cohorts, thresholds, and exclusions from the prior brief are stored as graph nodes the agent can call back. A new question against the same cohort inherits the definitions and any reviewer overrides.

Run 2 · references graph
  • cohort: eu-mm-churn-2026Q3 (inherited)
  • thresholds: inherited, audit-logged
  • exclusions: inherited from Run 1 reviewer
  • time-to-brief: minutes
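
One way to picture the inheritance described above is a small in-memory store. This is a sketch under assumptions, not the product's storage model: `CohortNode`, `GraphMemory`, the field names, and the override value are hypothetical. Only the cohort name `eu-mm-churn-2026Q3` and the idea that thresholds, exclusions, and reviewer overrides carry over come from the text.

```python
from dataclasses import dataclass, field


@dataclass
class CohortNode:
    """Hypothetical graph node holding one cohort definition."""
    name: str
    thresholds: dict
    exclusions: list
    reviewer_overrides: dict = field(default_factory=dict)


class GraphMemory:
    """Hypothetical store: briefs write nodes back, later runs inherit them."""

    def __init__(self):
        self._nodes = {}
        self.audit_log = []  # every write and read is recorded

    def write_back(self, node):
        self._nodes[node.name] = node
        self.audit_log.append(("write", node.name))

    def inherit(self, name):
        node = self._nodes[name]
        self.audit_log.append(("inherit", name))
        return {
            "cohort": node.name,
            # Reviewer overrides take precedence over the original thresholds.
            "thresholds": {**node.thresholds, **node.reviewer_overrides},
            "exclusions": list(node.exclusions),
        }


memory = GraphMemory()

# Run 1 writes its definitions back (values are illustrative).
memory.write_back(CohortNode(
    name="eu-mm-churn-2026Q3",
    thresholds={"billing_failures_60d": 2},
    exclusions=["trial accounts"],
    reviewer_overrides={"billing_failures_60d": 3},
))

# Run 2 references Run 1 by name instead of re-typing the definitions.
run_2 = memory.inherit("eu-mm-churn-2026Q3")
print(run_2)
```

Because both runs resolve the same node, a reviewer comparing two briefs can attribute any difference to the data rather than to the definitions.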

What it ships

Question in. Reviewable artifact out.

The agent does not produce a chat reply you have to copy-paste into a deck. It produces the artifact your team already uses to make decisions, with the citations attached.

The ask: Why is EU mid-market churning?
Sources it reads: Snowflake, Salesforce, Zendesk, Gong
What it ships: Board-ready brief with drivers, cohort cuts, and a methods appendix any analyst can audit.

The ask: Watch NRR weekly and flag anything that drifts.
Sources it reads: Warehouse, Billing, Semantic layer
What it ships: Scheduled monitor that posts an anomaly brief with the trigger, the cohort, and the suggested next look.

The ask: Build a Q3 pipeline dashboard.
Sources it reads: CRM, Product analytics, Marketing
What it ships: KPI tiles, trend views, top drivers, and an "if this changes, look here" panel, exported to your BI tool.

The ask: Which support trends are about to become product bugs?
Sources it reads: Ticket text, Release notes, Telemetry
What it ships: Ranked list of clusters tied to recent releases, with the underlying conversations cited and linked.
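
For the weekly NRR monitor, the page does not say how drift is detected. As a stand-in, the sketch below uses a simple z-score rule: flag the latest week when it sits more than a chosen number of sample standard deviations from the mean of prior weeks. The `drifted` function, the threshold of 2.0, and the NRR values are assumptions made for illustration.

```python
from statistics import mean, stdev


def drifted(history, latest, z_threshold=2.0):
    """Return True when `latest` is more than `z_threshold` sample standard
    deviations away from the mean of `history`."""
    if len(history) < 2:
        return False  # not enough history to estimate spread
    spread = stdev(history)
    if spread == 0:
        return latest != history[0]
    return abs(latest - mean(history)) / spread > z_threshold


# Hypothetical weekly NRR readings.
weekly_nrr = [1.04, 1.05, 1.03, 1.04, 1.05]

steady_week = drifted(weekly_nrr, 1.04)
bad_week = drifted(weekly_nrr, 0.95)
print(steady_week, bad_week)
```

A production monitor would likely account for seasonality and cohort mix; the point here is only that the trigger is an explicit, inspectable rule that can be quoted in the anomaly brief.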

Why a reviewer can sign off

We do not ask you to trust the model. We give you something to check.

The reason “AI for analysts” rarely makes it past pilot is that the work is unauditable: a number appears in a slide and nobody can tell where it came from. Three things we treat as required, not optional.

Show the work

Every answer ships with a methods appendix

Model used, features, assumptions, thresholds, source rows, and graph paths traversed — attached to the brief, not hidden in a notebook your reviewer never opens.

How: appendix is generated alongside the narrative on every run, not a separate "explainability" view.

No silent extrapolation

The agent refuses when the data does not support the question

When confidence is below threshold or a join is ambiguous, the brief states what is missing and what would resolve it — instead of guessing in prose.

How: confidence is computed per claim, not per response, so partial answers are still useful.
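
Scoring per claim rather than per response can be illustrated as follows. This is a sketch, not the agent's scoring logic: the claim texts, the scores, and the `split_claims` helper are hypothetical, and the 0.65 cutoff is borrowed from the example brief earlier on this page.

```python
REVIEW_THRESHOLD = 0.65  # from the example brief; assumed to be configurable


def split_claims(claims, threshold=REVIEW_THRESHOLD):
    """Separate claims that clear the threshold from those needing review."""
    supported, needs_review = [], []
    for claim in claims:
        if claim["confidence"] >= threshold:
            supported.append(claim)
        else:
            needs_review.append(claim)
    return supported, needs_review


# Hypothetical claims with illustrative scores.
claims = [
    {"text": "Repeated billing failures drive churn risk", "confidence": 0.90},
    {"text": "Seat utilisation drops after renewal", "confidence": 0.84},
    {"text": "Pricing is a segment-wide pattern", "confidence": 0.50},
]

supported, needs_review = split_claims(claims)
for claim in needs_review:
    print("needs review:", claim["text"])
```

A low score on one claim does not discard the rest of the brief; the well-supported claims still ship and the weak one is flagged on its own.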

Reviewable lineage

Every cohort, metric, and chart links back to a row

Click any number in the brief to land on the source query, the rows it returned, and the unstructured citations that shaped the interpretation.

How: lineage is captured at execution time and exported with the brief — not regenerated on demand.
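
Capturing lineage at execution time, rather than regenerating it later, might look like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for the warehouse. The `LineageRecord` shape, the `run_with_lineage` helper, the table contents, and the citation id `ticket-1042` are hypothetical.

```python
import sqlite3
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical record stored with the brief for one number."""
    sql: str
    rows: tuple
    citations: tuple
    executed_at: str


def run_with_lineage(conn, sql, citations=()):
    """Run the query and freeze the query text, the rows it returned, and the
    citations in a single record at the moment of execution."""
    rows = tuple(conn.execute(sql).fetchall())
    return LineageRecord(
        sql=sql,
        rows=rows,
        citations=tuple(citations),
        executed_at=datetime.now(timezone.utc).isoformat(),
    )


# In-memory stand-in for a warehouse table, with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (account_id TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("acme", "failed"), ("acme", "failed"), ("globex", "paid")],
)

record = run_with_lineage(
    conn,
    "SELECT account_id, COUNT(*) FROM invoices "
    "WHERE status = 'failed' GROUP BY account_id",
    citations=["ticket-1042"],
)
print(record.rows)
```

Because the rows are frozen into the record, a reviewer who clicks a number later sees what the query returned when the brief was written, even if the underlying table has changed since.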

First 30 days

What deployment actually looks like.

We do not promise an agent that ships board briefs on day one. We promise a four-week path from read-only connection to a brief your reviewers will sign off on — with shadow mode and a defined scope before anything goes live.

Week 1

Connect

Read-only access to your warehouse, semantic layer, and the unstructured sources you want the agent to read. Nothing writes back yet.

Week 2

Define questions

Pick the three to five questions your team actually answers every week. We pin them as recurring briefs with explicit success criteria.

Week 3

Shadow run

The agent produces briefs against the same questions an analyst is already working on. Side-by-side review surfaces gaps in scope or trust.

Week 4

Go live

Scheduled briefs, an ad-hoc Ask channel, and graph-memory projection turned on. Methods appendix is required on every output.

Talk to us

Bring us your hardest weekly question.

Send us the question your analyst answers the same way every week. We'll show you exactly what the agent would do on it — sources read, methods chosen, brief drafted — before any contract conversation.

We use this to scope the walkthrough. No newsletter. No retargeting pixel.