Introduction

The enterprise AI market looks diverse on the surface. Hundreds of vendors sell AI-powered products across every major industry. But most of them don’t build their own AI. They build on top of someone else’s.

This project tried to find out who that someone else actually is. We studied 199 enterprise AI vendors across 10 sectors: cybersecurity, legal, healthcare, financial services, HR, sales, marketing, customer support, productivity, and developer tools.

We built a Python script that automatically scraped each vendor’s public materials, including its website, engineering blog, job postings, press releases, and GitHub. The scraped text was sent to the Anthropic API, which extracted any mentions of foundation model dependencies. Each result came back with a confidence score and a direct quote from the source as evidence.
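The evidence requirement can be sketched as a simple post-processing check. This is an illustrative sketch, not the project's actual script: the record fields (vendor, provider, confidence, quote), the 0.7 threshold, and the function names are all assumptions. It keeps an extracted dependency only when its quote appears in the scraped text and its confidence clears the threshold.

```python
from dataclasses import dataclass


@dataclass
class Extraction:
    """One extracted dependency claim (hypothetical record shape)."""
    vendor: str
    provider: str
    confidence: float  # assumed to lie in [0, 1]
    quote: str         # evidence quote returned with the claim


def _normalize(text: str) -> str:
    # Collapse runs of whitespace and lowercase so that line breaks and
    # casing in scraped pages do not cause false mismatches.
    return " ".join(text.split()).lower()


def keep_supported(extractions, source_text, min_confidence=0.7):
    """Keep claims whose quote occurs in the source and whose confidence
    is at least min_confidence (0.7 is an assumed cutoff)."""
    haystack = _normalize(source_text)
    kept = []
    for item in extractions:
        quote = _normalize(item.quote)
        if quote and quote in haystack and item.confidence >= min_confidence:
            kept.append(item)
    return kept
```

A check like this only confirms that the quote exists in the source. Whether the quote actually describes a dependency still needs manual review.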

After two rounds of validation and manual review, we confirmed 66 dependency relationships across 41 vendors. One vendor can have multiple relationships because many products integrate more than one foundation model. For example, a productivity tool might use OpenAI for text generation and Google for embeddings. The 66 relationships represent individual vendor-to-provider links, not unique vendors.
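The difference between relationships and unique vendors is easy to see on a toy example. The vendor names and links below are made up for illustration and are not taken from the dataset.

```python
from collections import Counter

# Hypothetical (vendor, provider) links; not real data from the study.
links = [
    ("VendorA", "OpenAI"),
    ("VendorA", "Google"),
    ("VendorB", "Anthropic"),
    ("VendorC", "OpenAI"),
    ("VendorC", "Anthropic"),
    ("VendorC", "OpenAI"),  # repeated mention of the same link
]

# A relationship is a distinct vendor-to-provider pair, so repeats collapse.
unique_links = set(links)

# Unique vendors are counted separately from relationships.
vendors = {vendor for vendor, _ in unique_links}

# Number of distinct vendors linked to each provider.
links_per_provider = Counter(provider for _, provider in unique_links)

print(len(unique_links), "relationships across", len(vendors), "vendors")
```

In this toy data, three vendors produce five relationships, the same way 41 vendors produce 66 in the study.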

Limitations

Selection bias. The 41 vendors with confirmed dependencies are not a random sample. Vendors that disclose tend to be developer-facing or proud of their AI partnerships. The opaque 79% are likely more enterprise-focused and risk-averse, which means the visible market probably understates true concentration.

Interoperability vs dependency. The scraper captures all model mentions including integrations, optional connectors, and multi-model support. A vendor that lets users choose between GPT-4 and Claude is structurally different from one whose product is built entirely on GPT-4. The data does not always allow that distinction to be made from public signals alone.

Point-in-time snapshot. Model dependencies shift quickly. A vendor using Claude 3.5 Sonnet today may switch providers next quarter. This dataset reflects public information available in May 2026.

Unverifiable upstream dependencies. Several vendors claim proprietary models but may still rely on frontier model APIs or fine-tuning. True independence cannot be confirmed from public sources.

HHI is a lower bound. The Herfindahl-Hirschman Index (HHI), the concentration score used in this study, is calculated only on the 21% of vendors who disclose. The true HHI across all 199 vendors is unknown and likely higher.
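For readers unfamiliar with the measure, HHI is the sum of squared market shares, with shares expressed in percentage points, so it ranges from near 0 (many small providers) to 10,000 (a single provider). The sketch below assumes a provider's share is its fraction of confirmed vendor-to-provider links. The counts are placeholders, not the study's data.

```python
def hhi(link_counts):
    """Herfindahl-Hirschman Index from per-provider link counts.

    Shares are in percentage points, so the result lies in (0, 10000].
    """
    total = sum(link_counts.values())
    if total <= 0:
        raise ValueError("at least one link is required")
    return sum((100 * count / total) ** 2 for count in link_counts.values())


# Placeholder counts for illustration only: shares of 50%, 30%, and 20%.
toy_counts = {"ProviderA": 5, "ProviderB": 3, "ProviderC": 2}
toy_score = hhi(toy_counts)  # 50^2 + 30^2 + 20^2
print(round(toy_score))
```

Because shares are squared, the largest providers dominate the score. Splitting the remaining links among more small providers changes it very little.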

Sample size. This study examined 199 vendors across 10 sectors. AIMultiple, one of the larger AI vendor databases, lists over 8,000 AI vendors globally. The 199 vendors studied here represent roughly 2.5% of the known market. The sample was selected to represent well-known, funded enterprise vendors with public-facing products, but it is not random and cannot be treated as representative of the full ecosystem.
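The headline percentages follow directly from the counts reported in this study. The short check below uses only those counts (41 disclosing vendors, 199 studied) and the AIMultiple figure cited above, taken as 8,000 for the lower bound. The variable names are illustrative.

```python
vendors_studied = 199
vendors_disclosing = 41
known_market = 8000  # "over 8,000", so coverage is at most this estimate

# Share of studied vendors with at least one confirmed dependency.
disclosure_rate = 100 * vendors_disclosing / vendors_studied
opaque_rate = 100 - disclosure_rate

# Share of the known market covered by the sample.
coverage = 100 * vendors_studied / known_market

print(f"disclosing: {disclosure_rate:.0f}%, opaque: {opaque_rate:.0f}%")
print(f"sample coverage: {coverage:.1f}%")
```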


79% of the 199 enterprise AI vendors do not publicly disclose which foundation models their products run on. The charts below cover what we found in the remaining 21%.

199 enterprise AI vendors studied across 10 sectors. Each square represents one vendor.


Among vendors that do disclose, OpenAI and Anthropic together account for 85% of confirmed dependencies. The HHI score of 3,950 puts this well above the threshold economists use to define high concentration. This is measured only on the 21% who disclosed anything. The true concentration across all 199 vendors is unknown.


Concentration is not isolated to one sector. OpenAI and Anthropic appear across productivity, legal, marketing, cybersecurity, developer tools, and sales. The sectors with the least disclosure, healthcare and financial services, are also the sectors where model failures would carry the highest real-world cost.


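3,950 is the HHI for the visible 21% of the market. For context, the 2010 Horizontal Merger Guidelines from the Department of Justice and the Federal Trade Commission treated any market above 2,500 as highly concentrated, and the 2023 Merger Guidelines lowered that threshold to 1,800. We read this number as a floor, not a ceiling. If the opaque 79% is more concentrated than the disclosing vendors, as the selection bias described under Limitations suggests, the true concentration score would be higher. We have no way to know because the data does not exist.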

Conclusion

The enterprise AI ecosystem has a transparency problem. Most vendors do not disclose what foundation models power their products. Among the minority that do, the market is highly concentrated around two providers.

This matters because concentration risk is only manageable when it is visible. Financial regulators can stress test banks because they can see the exposure. AI infrastructure has no equivalent. A single provider outage, pricing change, or policy shift could cascade across healthcare, legal, cybersecurity, and finance simultaneously. Nobody would see it coming because nobody can see the dependencies.

The CrowdStrike outage in 2024 showed what happens when a single vendor sits inside thousands of critical systems. AI foundation models are that vendor for the next generation of software. The difference is that CrowdStrike was at least a known dependency. Most AI dependencies are not.

Better disclosure standards are the minimum necessary first step. Without them, concentration risk in AI infrastructure is not just unmanaged. It is unmeasurable.