Tachyon: A File and Storage System for the Future of Computing

In the expansive world of software, I have a particular fondness and passion for the arcane yet critically important area of system software. For my entire career, whether as a programmer, kernel hack, or CEO, I have been involved with building companies and selling system software.

Generally speaking, system software is the entire operating system and related services that directly interface with the operating system. Software for networking, storage, compute, and (sometimes) security are examples of system software. You might think of system software as the plumbing of computing — and to extend the analogy, applications are the fixtures we all see and use. Not to put too fine a point on it, but without the plumbing, shit would be everywhere!

Innovations in system software tend to follow broad trajectories, given the dependence on hardware innovation and adoption of compute architectures in organizations. One of the most fundamental building blocks in system software is a piece of software called a file system, which is essentially a “file cabinet” of data. Without a file system, modern computers would not operate. At some point, all I/O and most data on a computer finds its way into a file system — making it one of the most indispensable system software components.

In 1990, I joined a company called Veritas Software that pioneered a new file system. The file system was so innovative for its time that it became the de-facto standard in nearly all datacenters. Of course, no one believed that building and selling a file system would amount to much. Boy were they wrong! The broad adoption of the Veritas File System (affectionately called VxFS) enabled Veritas to become a multi-billion-dollar revenue company.

In the years since VxFS, however, there have been only incremental file system developments. File systems have always been designed for a disk-drive-centric storage environment, so today’s disk-based computing systems are well tuned and perfected. Little new innovation is needed. It’s a fundamental paradigm that has existed since the beginning of computing.

But the world is changing. As we confront unprecedented amounts of data, much of it in real-time, there is a shift from disk-centric to memory-centric computing. This shift requires a complete re-think in file — and storage — system design.

Memory-centric computing assumes that large working sets of data live in system memory (or RAM) as opposed to disk drives or solid-state disks (SSDs). We can reliably predict, with the cost of system memory decreasing, that memory for both storage and compute will be the exact same thing. Until now, we have always had tiered memory (RAM, disk drives, tape), which served to trade off cost and performance, and led to an entire computing architecture that has persisted for the past 40 years. Memory-centric computing will flatten this memory hierarchy and completely up-end compute architectures. It’s a revolutionary concept, one that has never existed before… and we are finally close to achieving this milestone.

This shift to memory-centric computing requires an entirely new file and storage system. And that’s where Tachyon comes in to play.

Tachyon is a memory-centric, fault-tolerant, distributed storage system, which enables reliable data sharing at memory-speed across a datacenter. Tachyon already supports many applications, with more to come, but will be especially valuable for big data applications. Think of Tachyon as the pioneer of next-generation file and storage systems, much like VERITAS was in the 1990s with VxFS.

So I am pleased to announce our investment in the company created by Haoyuan Li (“HY”), the co-inventor and lead of Tachyon and a Berkeley Ph.D. candidate (advised by top computer scientists Ion Stoica and Scott Shenker). HY drove the open source development of Tachyon at the U.C. Berkeley AMPLab, the birthplace of other industry-defining technologies such as Spark and Mesos.

HY’s insight and expertise — amplified by the incredible community working on the project — is bringing the next generation storage system to market. In less than two years, its industry traction is nothing short of phenomenal:

— Tachyon has more than 70 contributors from over 30 companies.

— Tachyon is deployed at more than 50 companies, some of which have deployed Tachyon with more than 100 nodes in production.

— It is the fastest-growing project in AMPLab history. Data also shows it is the fastest-growing project in the big data ecosystem.

One reason the IT community is going nuts over Tachyon is its ability to scale and perform with enterprise-class reliability. See for example this chart comparing Tachyon with today’s de-facto big data file system, HDFS:


As you can see, Tachyon linearly scales and blows HDFS away (even with HDFS running entirely in memory). Think of what this +100X performance boost will do for big data applications!

But that’s not all. As a general-purpose file system, Tachyon’s architecture supports data in traditional storage formats (e.g. HDFS, NFS) and modern workloads to power big data analytics.

Tachyon’s memory-centric architecture represents the future of storage. In fact, it’s the first new file system that was built from the ground up to take advantage of memory-centric compute architectures. With Tachyon as the “memory-centric” storage layer, our investments (including Databricks and Mesosphere) in the memory-centric infrastructure out of the “badass” Berkeley Data Analytics Stack (BDAS) — from big data storage to compute — is complete.

I am thrilled to be joining the board of HY’s company, currently dubbed Tachyon Nexus, based on the open source project Tachyon. We look forward to seeing this company become the next great software franchise delivering file and data management platform capabilities.




The views expressed here are those of the individual AH Capital Management, L.L.C. (“a16z”) personnel quoted and are not the views of a16z or its affiliates. Certain information contained in here has been obtained from third-party sources, including from portfolio companies of funds managed by a16z. While taken from sources believed to be reliable, a16z has not independently verified such information and makes no representations about the enduring accuracy of the information or its appropriateness for a given situation.

This content is provided for informational purposes only, and should not be relied upon as legal, business, investment, or tax advice. You should consult your own advisers as to those matters. References to any securities or digital assets are for illustrative purposes only, and do not constitute an investment recommendation or offer to provide investment advisory services. Furthermore, this content is not directed at nor intended for use by any investors or prospective investors, and may not under any circumstances be relied upon when making a decision to invest in any fund managed by a16z. (An offering to invest in an a16z fund will be made only by the private placement memorandum, subscription agreement, and other relevant documentation of any such fund and should be read in their entirety.) Any investments or portfolio companies mentioned, referred to, or described are not representative of all investments in vehicles managed by a16z, and there can be no assurance that the investments will be profitable or that other investments made in the future will have similar characteristics or results. A list of investments made by funds managed by Andreessen Horowitz (excluding investments and certain publicly traded cryptocurrencies/ digital assets for which the issuer has not provided permission for a16z to disclose publicly) is available at https://a16z.com/investments/.

Charts and graphs provided within are for informational purposes solely and should not be relied upon when making any investment decision. Past performance is not indicative of future results. The content speaks only as of the date indicated. Any projections, estimates, forecasts, targets, prospects, and/or opinions expressed in these materials are subject to change without notice and may differ or be contrary to opinions expressed by others. Please see https://a16z.com/disclosures for additional important information.