Why We Backed Shift Technology

Astrid Holm

Insurance fraud detection spent decades inside a conceptual model that was, in retrospect, upside down. The dominant approach built explicit rules: if a claim exhibits these characteristics — short policy tenure, repeat repair shop, certain address patterns, specific injury codes — flag it for investigation. The rules were written by experienced claims handlers who knew what fraudulent claims looked like based on what they had seen before.

The problem is structural, not operational. Rules-based detection is an adversarial system where the defender must enumerate every fraud pattern before it can be caught. The attacker only needs to discover one pattern the defender has not yet written a rule for. At scale, this is not a game the rule-writers can win.

Why Pattern-Learning Changes the Equation

What Shift Technology built, and why we backed them in early 2020, is a system that inverts the adversarial dynamic. Instead of encoding rules, the model learns statistical patterns from the carrier's entire claims corpus. It does not need to know in advance what the next fraud pattern looks like. It needs to know what legitimate claims look like, and to flag deviations from that baseline.
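To make "know what legitimate claims look like, and flag deviations" concrete, here is a minimal sketch of one classic anomaly detection technique, the robust z-score (median and median absolute deviation). It is an illustration only, not a description of Shift's model: the single feature (claim amount), the sample values, and the 3.5 threshold are all assumptions chosen for the example.

```python
from statistics import median


def robust_z_scores(values):
    """Score each value by its distance from the median, in scaled MAD units."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: nothing can be called a deviation.
        return [0.0 for _ in values]
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    return [abs(v - med) / (1.4826 * mad) for v in values]


def flag_deviations(claim_amounts, threshold=3.5):
    """Return the indices of claims whose robust z-score exceeds the threshold."""
    scores = robust_z_scores(claim_amounts)
    return [i for i, score in enumerate(scores) if score > threshold]


# Hypothetical claim amounts; no rule describes the outlier in advance.
claims = [1200, 1350, 1100, 1280, 1190, 1320, 9800, 1250]
flagged = flag_deviations(claims)
print(flagged)
```

The baseline is estimated from the bulk of the data, so a new kind of outlier is flagged without anyone writing a rule for it. A production system would use many features and a far richer model; the inversion of the rule-writing burden is the point.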

This is not a new idea in machine learning. Anomaly detection has existed as a discipline for decades. What makes it hard in insurance is specific to the domain: the labeled data problem. Most insurance carriers have formally adjudicated fraud rates of 1–3% of claims volume. But fraud researchers generally estimate actual rates at 10–15% of premium, with many fraudulent claims settled without formal fraud designation because investigation costs exceed expected recovery. The consequence is that your training data contains a substantial number of fraudulent claims labeled as legitimate, and a model trained naively on those labels learns to treat them as normal.
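A back-of-envelope calculation shows how large the label problem can be. The sketch below makes two simplifying assumptions that are not in the figures above: both rates are treated as fractions of claim count (the text quotes one on claims volume and the other on premium), and every claim labeled as fraud is assumed to be truly fraudulent. The function name and the midpoint inputs are illustrative.

```python
def label_contamination(labeled_fraud_rate, true_fraud_rate):
    """Estimate how noisy the 'legitimate' label is.

    Returns a tuple:
      contamination: share of 'legitimate'-labeled claims that are actually fraud
      label_recall:  share of actual fraud that carries a fraud label
    """
    if not 0 <= labeled_fraud_rate <= true_fraud_rate < 1 or true_fraud_rate == 0:
        raise ValueError("expected 0 <= labeled <= true < 1 and true > 0")
    hidden_fraud = true_fraud_rate - labeled_fraud_rate
    contamination = hidden_fraud / (1 - labeled_fraud_rate)
    label_recall = labeled_fraud_rate / true_fraud_rate
    return contamination, label_recall


# Midpoints of the ranges above, used purely as illustrative inputs.
contamination, label_recall = label_contamination(0.02, 0.12)
print(f"fraud hidden inside the 'legitimate' class: {contamination:.1%}")
print(f"share of actual fraud that is labeled:      {label_recall:.1%}")
```

Under these assumptions, roughly one in ten "legitimate" training examples is mislabeled fraud, and only about one in six fraudulent claims carries a fraud label.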

The teams that handle this well use semi-supervised approaches, carrier-specific ground truth calibration, and heavy investment in label quality — which requires genuine domain expertise in how claims are handled operationally, not just ML engineering skill. When I was running fraud detection at a Nordic banking group, we had the same problem with transaction data. The interesting signal is in the unlabeled examples. The systems that learned to extract it were the ones that survived contact with real adversarial conditions.

The Carrier Partnership Problem

There is a second reason the investment was compelling that is less about the technology and more about the go-to-market structure of the problem.

Fraud detection in insurance is not a standalone product decision at the carrier level. It touches claims operations, legal, compliance, SIU (Special Investigations Unit), and actuarial — a committee of stakeholders with conflicting incentives. Claims operations wants fewer false positives because investigating clean claims is expensive and damages policyholder experience. SIU wants higher sensitivity. Legal wants explainable decisions for cases that proceed to litigation.

Building a product that serves all of these simultaneously requires deep knowledge of how the claims workflow actually operates, not just how the detection algorithm performs on test data. The teams we backed had spent meaningful time embedded in carrier claims operations before the product was even at prototype stage. That kind of pre-product domain immersion is something we look for specifically, and it is rare.

Network Effects in Claims Data

There is a genuine network effect in insurance fraud detection that is worth naming. A carrier operating in isolation has access to its own claims corpus. A platform that aggregates patterns across multiple carriers has access to cross-carrier fraud ring signatures — the same individual or organized group appearing in different carriers' claims data under slightly different presentations.

This cross-carrier signal is powerful. Organized fraud rings operate at scale precisely because they understand that carriers do not share data with each other. A platform that creates a privacy-preserving mechanism for pattern sharing across carriers — without exposing individual carrier data competitively — becomes structurally more valuable as each new carrier joins. The moat grows with deployment, not with engineering headcount.
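One simple way to picture cross-carrier matching without exchanging raw records is keyed hashing: each carrier submits HMAC tokens of normalized identifiers, and the platform only checks which tokens recur across carriers. This is a sketch under stated assumptions, not the mechanism any particular platform uses. The carrier names, identifiers, key, and normalization rule are hypothetical, and keyed hashing alone is weak for low-entropy identifiers, since anyone holding the key can test guesses.

```python
import hashlib
import hmac
from collections import defaultdict


def tokenize(identifier: str, shared_key: bytes) -> str:
    """Normalize an identifier, then replace it with a keyed hash."""
    normalized = " ".join(identifier.lower().split())
    return hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def cross_carrier_tokens(submissions):
    """Map each token seen at two or more carriers to the carriers that saw it."""
    seen = defaultdict(set)
    for carrier, tokens in submissions.items():
        for token in tokens:
            seen[token].add(carrier)
    return {t: sorted(c) for t, c in seen.items() if len(c) >= 2}


key = b"demo-key"  # hypothetical; a real deployment needs proper key management
submissions = {
    "carrier_a": [tokenize(x, key) for x in ["ACME Auto Repair", "Jane Doe"]],
    "carrier_b": [tokenize(x, key) for x in ["acme  auto repair", "John Roe"]],
}
overlap = cross_carrier_tokens(submissions)
print(len(overlap), list(overlap.values()))
```

The repair shop appears at both carriers under slightly different spellings and still matches after normalization, while the identifiers unique to each carrier never leave it in readable form.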

We are not saying this makes fraud detection immune to competition. Other players have built similar approaches. The moat is not the algorithm; no algorithm holds competitive advantage for more than a few years. The moat is the accumulated carrier relationships, the proprietary labeled dataset, and the network effect of cross-carrier signal. That takes time to build. First-mover compounding is real here.

What We Got Wrong Initially

In our initial investment thesis, we underestimated the length of carrier procurement cycles. Insurance procurement is slow: compliance reviews, security audits, data sharing agreements, pilot structuring, actuarial sign-off. Even a well-run pilot at a mid-size carrier can take 9–14 months from first conversation to signed contract. We had modeled a faster adoption curve.

The lesson for how we evaluate claims AI investments now is to ask directly: does this team have a go-to-market motion that accounts for long enterprise sales cycles? Do they have enough runway to survive three or four of these procurement cycles before revenue becomes significant? Are they building carrier relationships in parallel rather than sequentially?
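The difference between parallel and sequential relationship-building is easy to quantify. The sketch below uses the 9–14 month cycle length mentioned earlier; the three-month stagger between deal starts and the function name are assumptions made for illustration.

```python
def months_to_nth_contract(n_deals, cycle_months, stagger_months):
    """Month in which the n-th contract signs if a new deal starts every stagger_months.

    Setting stagger_months equal to cycle_months models a purely sequential motion.
    """
    if n_deals < 1:
        raise ValueError("n_deals must be at least 1")
    return (n_deals - 1) * stagger_months + cycle_months


for cycle in (9, 14):
    sequential = months_to_nth_contract(4, cycle, stagger_months=cycle)
    parallel = months_to_nth_contract(4, cycle, stagger_months=3)
    print(f"{cycle}-month cycle: sequential {sequential} months, staggered {parallel} months")
```

With four deals, the sequential path takes 36 to 56 months while the staggered path takes 18 to 23, which is the gap between needing roughly four years of runway and needing roughly two.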

The teams that have navigated this successfully — and Shift's trajectory over subsequent years is evidence — are the ones who treat carrier partnership development as a core operational discipline, not as something that happens after the technology is ready. That means dedicated BD infrastructure earlier than most founders think they need it.

Claims AI as Infrastructure

The broader observation is that claims fraud detection, done well, is infrastructure, not a feature. It operates at every claim, all the time, across every product line the carrier writes. The ROI is not in dramatic individual fraud catches. It is in the systematic reduction of both the false positive rate and the false negative rate across millions of routine claims decisions.
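A rough cost model shows why small rate changes matter at volume. Every number below is hypothetical (claim count, fraud rate, error rates, investigation cost, average fraud loss), and the model counts only the cost of errors: wasted investigations of clean claims plus losses paid on missed fraud.

```python
def annual_error_cost(n_claims, fraud_rate, fpr, fnr,
                      cost_per_investigation, avg_fraud_loss):
    """Cost of triage errors: needless investigations plus missed fraud losses."""
    legit_claims = n_claims * (1 - fraud_rate)
    fraud_claims = n_claims * fraud_rate
    false_positives = legit_claims * fpr  # clean claims sent to investigation
    missed_fraud = fraud_claims * fnr     # fraudulent claims paid in full
    return false_positives * cost_per_investigation + missed_fraud * avg_fraud_loss


# Hypothetical carrier: one million claims a year, 10% of them fraudulent.
baseline = annual_error_cost(1_000_000, 0.10, fpr=0.05, fnr=0.80,
                             cost_per_investigation=500, avg_fraud_loss=5_000)
improved = annual_error_cost(1_000_000, 0.10, fpr=0.04, fnr=0.75,
                             cost_per_investigation=500, avg_fraud_loss=5_000)
savings = baseline - improved
print(f"annual savings: {savings:,.0f}")
```

In this illustration, a one-point drop in the false positive rate and a five-point drop in the false negative rate are worth about 29.5 million per year, with no single dramatic catch involved.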

That infrastructure framing is how we think about the investment thesis for claims AI generally: the value compounds in the operational layer, not in the headline numbers. A carrier that has trained its claims handlers to trust AI-assisted triage — and adjusted its investigation budget allocation accordingly — has baked a structural cost advantage into its loss ratio. That is a durable competitive position. That is the kind of change worth backing at seed.