
Putting Data in a Corner: LLMs and Financial Services Data

Joe Schmidt Posted March 28, 2024


Nobody puts Baby in a corner — especially if Baby happens to be fragmented financial services data. Historically, the key to building a successful financial services data company has been to extricate and analyze valuable but difficult-to-access resources like public filings, diligence materials, research reports, conference call notes, and news. Now, with the advent of large language models, such previously shackled information might soon be easily and widely available to everyone, ushering in a new LLM-powered era that could shift the financial services landscape.

That said, the financial services data market is currently dominated by several billion-dollar companies that, given their scale and other moats, are extremely entrenched. But Bloomberg, Morningstar, and Verisk (three prominent examples) didn’t become market leaders in a vacuum. All initially followed a specific three-step playbook at the dawn of the digital age, whereby they 1) identified large, fragmented pools of data in valuable markets; 2) found a way to bring that data into a relational database; and 3) charged for access to that information. In doing so, each built the beginnings of what are now category-defining businesses.
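To make the three-step playbook concrete, here is a minimal Python sketch using the standard library's sqlite3 module. Everything in it is a hypothetical illustration rather than how any of these companies actually built their systems: the two sample sources, the `prices` table, the `build_database` and `query_prices` helpers, and the subscriber IDs are all assumptions made for the example.

```python
import sqlite3

# Step 1 (assumed data): two fragmented sources that report bond prices
# in different shapes. Both are hypothetical.
paper_ledger = [("ACME 5% 2030", "98.50"), ("GLOBEX 4% 2028", "101.25")]
dealer_feed = [{"bond": "ACME 5% 2030", "px": 98.75}]


def build_database(conn):
    """Step 2: normalize every source into one relational table."""
    conn.execute(
        "CREATE TABLE prices (instrument TEXT, price REAL, source TEXT)"
    )
    rows = [(name, float(px), "ledger") for name, px in paper_ledger]
    rows += [(r["bond"], r["px"], "dealer") for r in dealer_feed]
    conn.executemany("INSERT INTO prices VALUES (?, ?, ?)", rows)


def query_prices(conn, instrument, subscriber_ids, user_id):
    """Step 3: only paying subscribers may query the aggregated data."""
    if user_id not in subscriber_ids:
        raise PermissionError("subscription required")
    cur = conn.execute(
        "SELECT source, price FROM prices "
        "WHERE instrument = ? ORDER BY source",
        (instrument,),
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
build_database(conn)
result = query_prices(conn, "ACME 5% 2030", {"desk-1"}, "desk-1")
print(result)
```

The point of the sketch is that the value sits in the normalized table rather than in any single source: a subscriber sees every quote for an instrument in one query, while a non-subscriber is turned away.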

For new entrants looking to take advantage of LLMs and disrupt the status quo by going upstream of these incumbents, we’ve done a deep dive into the stories of Bloomberg, Morningstar, and Verisk. Below, we pull out several instructive lessons for startups and pose a few questions you should be thinking about. Let’s get started:

Bloomberg

Michael Bloomberg famously founded Bloomberg L.P. in 1981 after he was fired by Salomon Brothers, where he had been responsible for analytical tools for sales and trading. Having seen the massive amount of paper the investment bank went through to update its daily trades and prices, Bloomberg used his $10 million severance package to build an independent IT solution, initially called Innovative Market Systems, that gave other firms the same type of data and simple analytical tools.

Before Bloomberg and its aggregated financial data entered the scene, firms had to collect data independently, often relying on calculators. By being the first to capture this data accurately and quickly in a digital format, Bloomberg gathered a user base that allowed his business to add products and features, the most important of which became communication.

Though Bloomberg was able to continue building his business by acquiring offline businesses like John Aubert’s Sinkers, which published corporate bond prices in a reference book, the big unlock came when the company took on Merrill Lynch as its first design partner. Not only was this association critical for brand trust and recognition, but by powering Merrill’s growing bond trading operation, Bloomberg was able to acquire real-time data from a market leader. And because this data was better than what the other platforms had at the time, Bloomberg ended up supplying the daily bond prices for the Wall Street Journal, even though Dow Jones, the WSJ’s parent company, offered a competing product.

What Bloomberg realized, and what set it apart from competitors like Datastream (now Refinitiv), was that a data advantage built on information that cannot be cornered won’t last forever. So the company was quick to couple its early lead with additional services, such as news and communication tools. Without the data and distribution that the terminal provided, Bloomberg would never have been able to compete with Dow Jones and Reuters, the two largest incumbents at the time. 

Despite the incredibly rich data feeds, analytical models, and real-time news, the most valuable part of Bloomberg is its social layer. The data and news brought all of the relevant users onto the terminal and created a network-effect lock-in that rivals that of any social network.

Lessons for Startups

Bloomberg’s initial insight was to corner hard-to-access data and give it to customers in a usable format. To do so, it acquired a legacy offline business and used that to partner with a large player to further its data advantage. Bloomberg’s greatest insight, however, was that beyond continuing to innovate in financial data, its most significant value came from owning workflows and creating network effects driven by its communication tool. 

Questions for startups:

  • What data is Bloomberg not capturing properly today? 
  • Is there a workflow that isn’t well captured by Bloomberg? 
  • Are there groups your product serves that might benefit from a closed network communication center? 

 

Morningstar

Morningstar was founded by Joe Mansueto in 1984 after he realized that mutual funds were on the rise but there was no good system for assessing their quality. The holdings and reporting data needed to assess mutual funds accurately existed in various prospectuses, shareholder reports, and price histories, but acquiring this data was painful; analysts had to physically write to each fund’s managers to get access to their reports. After Mansueto realized that everyone who wanted to invest in mutual funds, from retail investors to large limited partners (LPs) representing many smaller investors’ capital, had to go through the same process, Morningstar was born.

Mansueto’s original insight was that simply aggregating all of this data into one searchable place was a product in itself. His first product was a book that was marketed in Barron’s and published four times a year; eventually, this database became the digital Morningstar product. Over time, Morningstar has leveraged its initial advantage in data aggregation to move into additional verticals. This has happened predominantly through acquisitions, and Morningstar can immediately increase the value of the businesses it acquires because it has an established sales channel and an industry-leading brand.

One early driver of Morningstar’s success was mutual fund LPs insisting that their funds use Morningstar for their analytics and reporting. These LPs needed an easy way to explain to their investors what they were investing in and why, and they knew they could trust Morningstar’s research and data due to the business’ early expertise and reputation. Once LPs started driving this requirement, the industry standardized around it and Morningstar’s brand positioning was cemented. 

Despite not owning any core workflows or collaborative element, Morningstar has a near monopoly on fund characterizations, and it has used the higher profit margins that come from a powerful brand to continue to acquire other businesses that have cornered interesting pools of financial data. Perhaps the most interesting of these acquisitions is PitchBook, which has unique data and is also a workflow tool that might help Morningstar create a more Bloomberg-like relationship with its customers.

Lessons for Startups

Morningstar is another classic example of an early financial services player understanding the value of aggregating challenging-to-access financial data and making it available in a single, searchable place.

What is unique about its strategy is how it leveraged LPs to push the industry toward adopting its offerings. At the end of the day, mutual funds serve at the pleasure of LPs. When enough large, institutional LPs push for a certain reporting requirement, it happens and funds standardize, creating a brand moat.

Questions for startups: 

  • Is there a tailwind “why now” moment for the data you are capturing to become more valuable? 
  • Are there players (customers, regulators) in your industry that want to help drive adoption of your product? 
  • Is there whitespace for a brand around what you are offering?

Verisk

A non-profit consortium of seven large property and casualty insurers founded the Insurance Services Office (ISO) in 1971 to be a neutral third party that could help with statistical and actuarial services, insurance programs, and state regulatory requirements. Recognizing the significant value in standardizing and aggregating the large swath of insurance ratings bureaus, ISO, now a subsidiary of Verisk, was able to consolidate a massive amount of data. That gave the carriers that accessed it greater economies of scale when it came to pricing and underwriting risk, and it was the start of Verisk’s data flywheel.

To further solidify and grow its data advantage, ISO expanded beyond ratings bureaus themselves and began standardizing ratings and forms. Modifying ratings schedules was particularly important in hard-to-insure categories like fire risk, where ISO created a single nationwide fire rating schedule that is still used today. At the same time, ISO created a vastly simplified personal lines (home and auto) insurance policy document that made it easier for a policyholder to understand what was covered by a policy. This standardization increased brand awareness and trust, and it gave Verisk ownership over the data ingestion format.

Outside the obvious financial incentives, the advent of the internet and the ability to share its information online were main drivers behind converting ISO from a non-profit organization to its current for-profit status as Verisk Analytics. In the late 1990s, ISO launched ClaimSearch, combining additional claims datasets with what it already had to create the industry’s largest claims database. As the first provider of online claims and pricing information, ISO continued to acquire more users and data, further strengthening its position as the leading provider of insurance data solutions.

Lessons for Startups

Verisk is unique in that it started as a non-profit dedicated to helping insurers better understand risk and claims data. At the time, the regulatory environment had fragmented the data, and building a new business was the best way to fix that problem. Offering better claims and risk data through a powerful give-to-get data exchange model solved the cold start problem of assembling data that is valuable to carriers and regulators. In turn, carriers and regulators helped push the industry toward Verisk and cement its scale advantages.
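For readers less familiar with the give-to-get idea, the following Python sketch shows the basic mechanic under simplifying assumptions: a member can only query the pooled data after contributing its own. The `GiveToGetExchange` class, the carrier names, and the record fields are hypothetical and are not drawn from how ISO or ClaimSearch actually works.

```python
class GiveToGetExchange:
    """Toy claims pool: contribute records first, then query the pool."""

    def __init__(self):
        self._claims = []            # pooled records from all members
        self._contributors = set()   # members that have given data

    def contribute(self, member, records):
        """Add a member's records to the pool and mark it as a contributor."""
        self._claims.extend(records)
        self._contributors.add(member)

    def query(self, member, claimant):
        """Return all pooled claims for a claimant, contributors only."""
        if member not in self._contributors:
            raise PermissionError("contribute data before querying")
        return [r for r in self._claims if r["claimant"] == claimant]


exchange = GiveToGetExchange()
exchange.contribute("carrier_a", [{"claimant": "c-1", "amount": 1200}])
exchange.contribute("carrier_b", [{"claimant": "c-1", "amount": 800}])

# carrier_a now sees its own claim plus the one reported by carrier_b.
matches = exchange.query("carrier_a", "c-1")
print(matches)
```

Each new contributor makes the pool more useful to every existing member, which is why this kind of model can ease the cold start problem described above.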

Questions for startups:

  • Are there significant data issues in your space that require industry collaboration but need a neutral third party? 
  • Do regulators want to drive substantial change in your industry? How might you position your company to be their instrument to drive forward this change? 
  • Is there a way you can drive standardization of fragmented workflows that leads to interesting data exhaust? 

Conclusion

The rise of dominant market leaders Bloomberg, Morningstar, and Verisk has been characterized by a playbook centered around identifying large, fragmented pools of data and leveraging technology to provide valuable insights and solutions. 

These companies have built category-defining businesses by capturing historical data, standardizing it, and offering it to customers in a simplified, accessible format, enhanced by additional services and features that reinforce defensibility. The advent of LLMs heralds a new era in data analytics and financial services. With the potential to unlock previously inaccessible data sources, LLMs may create opportunities for new players to emerge and disrupt the status quo by going upstream of these incumbents.

 

*A special thanks goes to Tom Elnick, the Co-CEO at Tegus. Tegus is one of my favorite products and a fantastic new example of success in this mold.

