Every company tracks certain success metrics—commonly accepted criteria for the health of a business. But when it comes to marketplaces, those measurements can often be imprecisely defined or muddled in their interpretation. Of course, as marketplaces vary widely in their product category and customer base, so do their benchmarks. But the following list serves as a primer for the key metrics marketplace founders should be aware of, both to calibrate their performance and evaluate future potential.
The job of any marketplace is to facilitate the matching of supply with demand. It’s therefore important to measure your successful “match rate” — the rate at which buyers can find sellers, and vice versa. How to define this metric depends on the unique business.
Match rate examples for particular businesses include:
A related metric is “zeros,” or unsuccessful matches. For ridesharing, what percentage of users open the app but don’t end up requesting a ride? Those “zeros” could be due to too long a wait time, surge pricing, or something else; all are instances in which the marketplace was unable to clear demand. Marketplace operators should identify the reasons matches don’t happen and take steps to remove or reduce these blockers by growing and incentivizing the more constrained side of the marketplace, improving product design, and other mechanisms.
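As a minimal sketch of how match rate and “zeros” might be computed, assume a session log in which every app open is one record. The `sessions` data, the `matched` flag, and the `reason` labels below are hypothetical illustrations, not a schema any particular marketplace uses.

```python
from collections import Counter

# Hypothetical session log: one record per app open.
# "matched" is True when the session ended in a transaction (e.g. a ride request).
# "reason" is an assumed label explaining why an unmatched session ended.
sessions = [
    {"user": "a", "matched": True, "reason": None},
    {"user": "b", "matched": False, "reason": "long_wait"},
    {"user": "c", "matched": False, "reason": "surge_pricing"},
    {"user": "d", "matched": True, "reason": None},
    {"user": "e", "matched": False, "reason": "long_wait"},
]


def match_rate(records):
    """Share of sessions that ended in a successful match."""
    if not records:
        return 0.0
    return sum(1 for r in records if r["matched"]) / len(records)


def zero_reasons(records):
    """Count unmatched sessions ("zeros") by their recorded reason."""
    return Counter(r["reason"] for r in records if not r["matched"])


rate = match_rate(sessions)
zero_rate = 1 - rate
reasons = zero_reasons(sessions)

print(f"match rate: {rate:.0%}, zero rate: {zero_rate:.0%}")
print(reasons.most_common())
```

Ranking the reasons behind zeros is one way to decide which blocker to tackle first. How a “session” and a “match” are defined will differ for each business.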
This metric is also closely related to multi-tenanting, in which users rely on several competing services at once. If match rate is low, then users will naturally be incentivized to go elsewhere and use other products. For instance, it’s common for employers to post their job listings on a variety of sites (their own website, LinkedIn, Indeed, as well as other networks) simply because no single network has a high enough match rate. If there’s even incremental revenue potential or even just minimum utility, multi-tenanting will take place; just think about all of the delivery marketplace stickers you see in any given restaurant’s window.
The concept of “offer depth” or market depth originated from financial markets, where it’s defined as the market’s ability to sustain relatively large orders without price movements. The higher the number of buy and sell orders at each price, the greater the depth of the market.
For consumer marketplaces, it’s important to measure market depth because it directly impacts the user experience. For heterogeneous supply marketplaces (where each supplier is different), market depth determines whether users will be able to find a match. When users open products like OfferUp or Airbnb, how many listings will they see, and how likely will they be to find an item they want to buy or a home they want to rent? For homogeneous supply marketplaces, market depth impacts ease of use. When users open Lime, how many bikes/scooters will they see near them? The greater the market depth, the easier it is to use Lime, because users have to walk less to reach a vehicle.
One of the primary jobs of any marketplace business is to reduce search costs — making it easy for participants to find and match with the other side. Failing to do this can result in a marketplace with negative network effects, where too much supply actually causes challenges in discovery. As consumers, we experience this as decision fatigue, or a paradox of choice. Conversion rates could fall in this scenario.
A note on heterogeneous vs. homogeneous supply: “homogeneous supply” marketplaces typically hit an asymptote in network effects, where the value to users eventually plateaus with greater market depth. For instance, if there were 6 Lime scooters on a city block near me, this is no more valuable than if there were only 4 or 5 scooters available for me to use in my vicinity — user value is unchanged despite the addition of more supply. On the other hand, for heterogeneous marketplaces, there is no asymptote because every node on the supply side is different and potentially can add greater value. In the Airbnb example, a user’s tastes may be quite specific, so every additional listing on the platform is useful to see.
Typically, match rate follows a curve over time: the longer the timespan, the greater the share of inventory that clears. For product marketplaces, this is commonly referred to as inventory turnover.
The inverse is days to turn. This metric is better suited to more traditional marketplaces, where matching happens when users opt in (one side creates a listing and the other responds), in contrast to on-demand marketplaces, which match in a centralized, algorithmic way that is less visible to users.
For instance, for job marketplaces, how long does it take an employer to find an employee? How long does it take to receive the first application? For P2P marketplaces, how long does it take for each side to engage in a transaction? For Thumbtack, how long does it take users to receive the first quote? How long does it take on OfferUp for a seller to sell their product?
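The “how long does it take” questions above can be sketched as a time-to-first-response calculation. The `listings` records and field names are hypothetical; “response” stands in for whatever event counts as a match for a given business (first quote, first application, or a sale).

```python
from datetime import datetime
from statistics import median

# Hypothetical listings: when each was posted and when it first got a response.
# None means the listing has not received a response yet.
listings = [
    {"posted": datetime(2024, 1, 1, 9), "first_response": datetime(2024, 1, 1, 15)},
    {"posted": datetime(2024, 1, 2, 9), "first_response": datetime(2024, 1, 4, 9)},
    {"posted": datetime(2024, 1, 3, 9), "first_response": None},
    {"posted": datetime(2024, 1, 5, 9), "first_response": datetime(2024, 1, 6, 9)},
]


def hours_to_first_response(rows):
    """Elapsed hours between posting and first response, for responded listings only."""
    return [
        (r["first_response"] - r["posted"]).total_seconds() / 3600
        for r in rows
        if r["first_response"] is not None
    ]


hours = hours_to_first_response(listings)
median_hours = median(hours)
response_share = len(hours) / len(listings)

print(f"median hours to first response: {median_hours}")
print(f"share of listings with a response: {response_share:.0%}")
```

One caveat with this sketch: listings that never get a response are excluded from the median, so it is worth reporting the response share alongside it rather than the median alone.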
Marketplaces where there is greater fragmentation on the supply and demand sides are more valuable and defensible. This means no participants on the demand or supply sides disproportionately account for a high share of transactions, which makes the business more sustainable and diversified. If demand or supply is too concentrated on a marketplace, there’s risk that a large buyer or seller can take a large share of transactions with them if they decide to leave the platform.
There’s also greater value when a marketplace aggregates fragmented goods or providers, as those would otherwise have been more difficult to discover and access. In effect, the marketplace takes the advantages of the long tail (more variety and niches, beyond just the popular hits) and makes those items as easy to find as the hits at the head of the distribution.
Marketplaces can gauge concentration by measuring the % of GMV the top X sellers or buyers account for (e.g. the share of GMV each grocery chain contributes, in the case of Instacart).
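A minimal sketch of this concentration measure, assuming GMV has already been totaled per seller. The `gmv_by_seller` figures are made up, and `top_share` is a hypothetical helper name.

```python
def top_share(gmv_by_participant, top_n):
    """Fraction of total GMV contributed by the top_n largest participants."""
    total = sum(gmv_by_participant.values())
    if total == 0:
        return 0.0
    largest = sorted(gmv_by_participant.values(), reverse=True)[:top_n]
    return sum(largest) / total


# Hypothetical GMV per seller for one period.
gmv_by_seller = {"s1": 500, "s2": 200, "s3": 150, "s4": 100, "s5": 50}

top1 = top_share(gmv_by_seller, 1)
top2 = top_share(gmv_by_seller, 2)

print(f"top 1 seller: {top1:.0%} of GMV, top 2 sellers: {top2:.0%} of GMV")
```

The same function works for the demand side by passing GMV per buyer. Tracking the result over time shows whether the marketplace is becoming more or less fragmented.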
In marketplace businesses, gross merchandise volume (GMV) and revenue are frequently used interchangeably. But GMV does not equal revenue.
Gross merchandise volume is the total sales dollar volume of merchandise transacting through the marketplace in a specific period. It’s the real top line, what the consumer side of the marketplace is spending. It is a useful measure of the size of the marketplace and can be useful as a “current run rate” measure based on annualizing the most recent month or quarter.
Revenue is the portion of GMV that the marketplace “takes.” Revenue consists of the various fees that the marketplace gets for providing its services; most typically these are transaction fees based on GMV successfully transacted on the marketplace, but can also include ad revenue, sponsorships, etc. These fees are usually a fraction of GMV.
The take rate, meaning revenue as a share of GMV, suggests the value that the marketplace itself provides.
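To make the distinction concrete, here is a small sketch that assumes the common convention that take rate is revenue divided by GMV. The `orders` data, the `fee` field, and the `ad_revenue` figure are hypothetical.

```python
# Hypothetical orders for one month: "amount" is what the buyer paid (counts toward GMV),
# "fee" is the transaction fee the marketplace kept (counts toward revenue).
orders = [
    {"amount": 100.0, "fee": 15.0},
    {"amount": 250.0, "fee": 30.0},
    {"amount": 50.0, "fee": 5.0},
]
ad_revenue = 10.0  # assumed non-transaction revenue for the same month

gmv = sum(o["amount"] for o in orders)
revenue = sum(o["fee"] for o in orders) + ad_revenue
take_rate = revenue / gmv

# "Current run rate": annualize the most recent month.
annualized_gmv = gmv * 12

print(f"GMV: {gmv}, revenue: {revenue}, take rate: {take_rate:.1%}")
print(f"annualized GMV run rate: {annualized_gmv}")
```

Whether non-transaction revenue such as ads belongs in the numerator is a definitional choice; some operators report a transaction-only take rate as well.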
Stronger network effects often show up as improved unit economics over time. This results from declining incentives that businesses need to offer to different sides of the market, a lower share of users acquired through paid channels, and an overall improvement in pricing power.
For businesses with local network effects, the impact of network effects should show up in unit economics over time, on a market-by-market basis. This is because in a given market, CAC should decrease and the organic share of users should grow over time. For businesses like Thumbtack or Instacart, which have network effects at the local level, tracking the unit economics over time per market is helpful because you’ll see the relationship between market age, network density, and profitability.
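One way to track this per market is sketched below, using blended CAC (paid spend divided by all new users) and organic share as the two signals. The `market_quarters` data and field names are hypothetical, and real unit economics would include more than acquisition cost.

```python
# Hypothetical acquisition data per market, ordered from oldest to newest quarter.
market_quarters = {
    "market_a": [
        {"paid_spend": 10000.0, "paid_users": 200, "organic_users": 50},
        {"paid_spend": 9000.0, "paid_users": 225, "organic_users": 150},
    ],
}


def acquisition_metrics(row):
    """Blended CAC and organic share of new users for one market-quarter."""
    new_users = row["paid_users"] + row["organic_users"]
    if new_users == 0:
        return {"blended_cac": None, "organic_share": None}
    return {
        "blended_cac": row["paid_spend"] / new_users,
        "organic_share": row["organic_users"] / new_users,
    }


trend = {
    market: [acquisition_metrics(row) for row in rows]
    for market, rows in market_quarters.items()
}

for market, metrics in trend.items():
    for quarter, m in enumerate(metrics, start=1):
        print(market, f"Q{quarter}", m)
```

In this made-up example, blended CAC falls and organic share rises as the market ages, which is the pattern the paragraph above describes.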
It’s important to understand whether your users are also using similar services, including related services where the functionality may not be exactly the same.
We’ve often observed that if a company is able to replicate a network, it can also layer on functionality that can obviate the need for another product. Even if it doesn’t wipe out the target company, such multi-tenanting can reduce usage and compress margins for all competitors. A marketplace for dog walkers and pet owners, for example, has the opportunity to move into pet health or food or other adjacent products, given it has built a network of pet owners from the core business. Facebook developed ephemeral Stories and added this feature into their various apps, including Instagram, in turn stymying the growth of Snapchat.
Measuring such multi-tenanting can be tricky — it might mean polling your users and asking whether they use another service; digging deeper into churn or declines in usage (and figuring out whether those users are moving to a different service); or simply brute-force searching for users’ profiles on other platforms! But once you see how many users are multi-tenanting, there are ways to shore up your product so users are less tempted to go somewhere else. In ride-sharing, for example (which had high multi-tenanting on both sides), companies rolled out subscriptions on the rider side and bonuses on the driver side to boost retention and reduce usage of competitors’ services.
Finally, even if you have a good sense of the overlap between your user base and that of another service’s, it’s important to consider how active your users are: are they merely maintaining a profile, or actively using your product?
Beyond the availability of substitutes, how easy is it for users of one network to sign up and complete the onboarding process for a competing network?
The friction involved in signing up and becoming an active user varies from product to product. Products that have an onboarding process that requires high upfront investment may find it challenging to activate prospective new users — but it also serves as a moat against competitors, because once those users are active, they’re less likely to multi-tenant. Looking at the landscape of online personal styling services, a Stitch Fix customer for instance may find it tedious to try out a different service because of the upfront investment in explaining her preferences to a new stylist; inputting information around her taste and sizing; calibrating various styles received and returned; and so on.
Conversely, if a product requires less activation energy from new users, it can more easily wedge its way into a market by getting users to multi-tenant and switch over. Because Uber already had millions of users’ credit card information for ride-sharing purposes, a user who was previously using another food delivery network could easily start using Uber Eats without much friction.
Another important consideration here is how much value users can get at the beginning when they join a new network: what’s the user experience with a cold start? For Facebook, even though users can easily join other social networks, their data, content, and networks are all on Facebook, so switching costs are high: they would have to re-invite their network and rebuild their social graph elsewhere. On the other hand, for job listing marketplaces, an employer can easily upload their hiring specs to multiple sites and start receiving candidate applications from the get-go.
Distilling switching or multi-homing costs into a quantifiable metric can be tricky, and any metric will be quite specific to that exact business and market. Potential metrics could be the time required to complete a competitor’s onboarding flow; or the ease of getting to the minimum threshold or “magic number” for a product to be useful (e.g. 10 friends for Facebook); and so on.
The classic definition of a network effect is that the value of a product or service to a user increases with the number of other users using the same product or service. This increase in user value should therefore be reflected in user retention cohorts: newer cohorts (who experience a product when the network is larger and more useful) should have better retention for any given time period than older cohorts that joined when the network was smaller.
However, theory often differs from reality here, and we often see businesses that have declining cohort retention over time. This is because a major confounding factor to consider when evaluating user retention is that the oldest user cohorts — especially for social network/community-based products — tend to be early adopters who are the most “ideal customers” for a product/service. Those early, often highly motivated users naturally translate into better retention cohorts for the oldest customers, rather than the newest.
Other circumstances can also change the analysis of this metric: the presence of a competitor; network effects that are hyperlocal and thus “reset” for new users in every new geography; or even negative network effects, where value to users actually decreases at a certain threshold (perhaps due to crowding or contaminants in the network).
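The cohort comparison described above can be sketched as follows. The data is hypothetical: `signups` maps each user to the month index in which they joined, and `active` holds (user, month) pairs in which the user was active.

```python
from collections import defaultdict

# Hypothetical data: signup month per user, and (user, month) pairs with activity.
signups = {
    "u1": 0, "u2": 0, "u3": 0, "u4": 0,   # older cohort, joined in month 0
    "u5": 3, "u6": 3, "u7": 3, "u8": 3,   # newer cohort, joined in month 3
}
active = {("u1", 1), ("u2", 1), ("u1", 2), ("u5", 4), ("u6", 4), ("u7", 4)}


def retention_by_cohort(signups, active, offset):
    """Share of each signup cohort that was active `offset` months after joining."""
    cohort_sizes = defaultdict(int)
    retained = defaultdict(int)
    for user, cohort in signups.items():
        cohort_sizes[cohort] += 1
        if (user, cohort + offset) in active:
            retained[cohort] += 1
    return {cohort: retained[cohort] / size for cohort, size in cohort_sizes.items()}


month1 = retention_by_cohort(signups, active, 1)
month2 = retention_by_cohort(signups, active, 2)

print("month-1 retention by cohort:", month1)
print("month-2 retention by cohort:", month2)
```

Comparing cohorts at the same offset is what matters here. If newer cohorts retain worse than older ones, the confounders listed above (early adopters, competitors, local resets, negative network effects) are the first things to check.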
Digging deeper into the engagement funnel, you want to see if more users are taking the “core action” of your product. The core action can be one that actually corresponds to users deriving value from your product, and/or something that maps closely to your business model.
For instance, if the core action of OpenTable is users booking a restaurant, then as the network density grows, they should expect to see retention improve when it is measured on this core action. This core action retention is more telling of network effects than just measuring top-level logins or app opens.
Subscription and paid products need to pay attention to dollar retention and paid user retention. New user cohorts should be better retained, in terms of cohort revenue, than older cohorts. Why? Paying for a product indicates how much users value it, so a product with network effects, which becomes more valuable over time, should show increasing dollar retention and paid user retention among newer cohorts.
For instance, as the network coverage of Angie’s List — a home services directory — improves, we’d expect to see that new user subscriber cohorts are better retained, both in terms of dollar retention as well as the number of users who remain subscribed, given the greater utility of the site.
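A minimal sketch of dollar retention by cohort, assuming revenue has already been totaled per cohort for each month since signup. The cohort labels and revenue figures are hypothetical.

```python
# Hypothetical revenue per cohort, indexed by months since signup (index 0 = first month).
cohort_revenue = {
    "2023-01": [1000.0, 800.0, 700.0],
    "2023-07": [1200.0, 1080.0, 1020.0],
}


def dollar_retention(revenue_by_month):
    """Revenue in each month as a fraction of the cohort's first-month revenue."""
    base = revenue_by_month[0]
    if base == 0:
        return [0.0 for _ in revenue_by_month]
    return [amount / base for amount in revenue_by_month]


retention = {
    cohort: dollar_retention(series) for cohort, series in cohort_revenue.items()
}

for cohort, curve in retention.items():
    print(cohort, [f"{value:.0%}" for value in curve])
```

Paid user retention can be computed the same way by passing subscriber counts instead of revenue. In this made-up data, the newer cohort keeps a larger share of its first-month revenue, which is the pattern a strengthening network effect would produce.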
For local network effect businesses, the network effects exist on a per-market basis and “reset” for new geographies. For Care.com users in Charlotte, for example, having more babysitters available in New York City doesn’t affect the user experience, but having more babysitters available locally does improve the usefulness of the network there.
As each geography matures and builds network density, retention should improve in those markets. Thus, the oldest or most established markets tend to have better retention than newer markets. We see this in practice in data shared by almost every local network effect business.
Power users drive some of the most successful companies, by contributing a ton of value to the network. While DAU/MAU — dividing daily active users by monthly active users — is a common metric for measuring engagement, it has its shortcomings, and power user curves provide a more nuanced way to understand user engagement.
In short, power user curves (commonly called L30 charts for 30 days of use, or L7 charts for 7 days of use) are histograms of users’ engagement, showing the total number of days users were active in doing a particular action in a given timeframe. In analyzing network effect businesses, seeing how often users take a specific action on a cohort basis allows you to see whether a product is really gaining utility with more users — aka the network effect. If a product is indeed more valuable with more users, then that should be reflected in a growing share of users shifting to higher-frequency engagement buckets or an increasingly right-leaning power user curve over time.
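The histogram described above can be sketched for a 7-day window (an L7 curve). The `active_days` data is hypothetical: it maps each user to the set of days, numbered 0 through 6, on which they took the core action.

```python
from collections import Counter

# Hypothetical activity: user -> set of days (0..6) on which they took the core action.
active_days = {
    "u1": {0, 1, 2, 3, 4, 5, 6},
    "u2": {0, 3},
    "u3": {2},
    "u4": {1, 2, 4, 5},
    "u5": {6},
}


def power_user_curve(active_days, window):
    """Share of active users in each "days active" bucket, from 1 to `window` days."""
    counts = Counter(len(days) for days in active_days.values() if days)
    total = sum(counts.values())
    if total == 0:
        return {bucket: 0.0 for bucket in range(1, window + 1)}
    return {bucket: counts.get(bucket, 0) / total for bucket in range(1, window + 1)}


curve = power_user_curve(active_days, 7)

for bucket, share in curve.items():
    print(f"{bucket} days active: {share:.0%}")
```

Computing this curve per cohort or per period and comparing the results shows whether users are shifting toward the higher-frequency buckets over time.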
This piece is part of the annual a16z Marketplace 100 series, which is a ranking of the largest and fastest-growing consumer-facing marketplace startups and private companies. See the full index and analysis here, and visit a16z.com/marketplace-100 for more marketplace-related content.
Jeff Jordan is an a16z General Partner focused on consumer companies.
D'Arcy Coolican co-founded Frank, a social lending platform that used behavioral economics to make it easy to lend and borrow money with friends and family, prior to joining a16z. He began his career at McKinsey & Co, where he was an engagement manager in the TMT practice.
Andrew Chen is a General Partner and founding investor at A16Z GAMES, focused on metaverse, VR/AR, and gaming investments.