The cloud killed open source as a distribution model
December 28th, 2020
I’ve been struggling to make sense of open source since college. I remember installing Linux for the first time and marveling that I could get an operating system for free. Why are people giving away something so valuable?
This came to the fore when I joined Sentry. I was the first business hire in 2015. Open source seems ‘proven’ today, but back then neither Mongo nor Elastic had IPO’d yet and Red Hat seemed stagnant. Yes, Sentry was doing well as a business, but it wasn’t obvious that it could become a billion-dollar business. Red Hat had “made it”, but a lot of their margin stemmed from licenses…something that made vastly less sense with Sentry, where the product was wholly open source and actively being used at a lot of the tech giants. At the time, no one at the company believed that anyone would ever pay us more than $100,000/year. I hardly believed it myself.1
I previously posited that open source helps ideas spread faster. The unexpectedly rapid migration of enterprise to the cloud in the last two years has made this a lot less relevant by massively driving down the adoption cost of new software. It’s hard to overstate just how onerous traditional procurement was before the cloud. Purchasing was a VP-level decision and individual developers had almost zero chance of getting any third-party software.
Previously, open source circumvented the normal procurement/assessment/deployment process. Even when a company bought Oracle, it would take a year before they actually used it, and two years to completely switch over. In that time, half of your engineering team might have turned over! With open source, engineers could get a good-enough solution working in a few hours. The move to the cloud has simplified assessment, deployment, and procurement for paid software, whereas the increasing importance of security and uptime over the last few decades has added process to adopting open source software. Finally, software is becoming less the wild west of script kiddies governed by clueless suits. Management is increasingly sophisticated and savvy about monitoring what software gets added to the stack.
Open source’s “free” advantage has also eroded as the value of software has become much more self-evident. IT budgets have grown immensely as CEO after CEO attempts to prove they are not an obsolete incumbent and tries to lead a digital transformation of their business. From corporate credit cards to end-of-year expense budgets, there’s plenty of appetite to pay for software. Finally, as software is professionalized, there’s decreased appetite for liability. Do you really want to be on the hook for maintaining that shitty tool you built two years ago? Your boss doesn’t care that you saved the company $50k/year, but he does care that it could break and make the team look bad. Going with a vendor isn’t just outsourcing the work but also the blame.
This is really underlined for me by Snowflake. I once assumed that of all the categories, the data warehouse space would be dominated by open source vendors. All major transactional databases are open source, after all. Like Linux or git or React, a data warehouse is a pretty universal company need, so a popular project (like Hadoop or Spark) would benefit from contributors from other companies. Hosting data is tricky business, so the transparency of open source would engender trust. Finally, companies would always compare their costs to the base S3 costs, so open source companies would be best positioned to offer price-competitive services. Fast forward a few years and Snowflake has become the market leader, both because of its unique model of splitting storage and compute costs and because of its efficient sales machine.
Open source isn’t the untapped distribution channel it used to be. For any niche, there are multiple open source contenders, all with shiny marketing pages and VC money. IaaS players have made it clear they’re not afraid of offering directly competitive services using your code. Multiple players have now switched to BSL-like licenses. In Elastic’s case, AWS has responded by forking and continuing to work on the code under the original Apache 2.0 license. This is the worst possible case for a company like Elastic. AWS has the resources to maintain the database and now, potentially, the community. Most importantly, they have the incentive to do this right. AWS itself was organized under the principle that small teams interfacing through clearly defined and public-facing services would be the answer to Amazon’s growing pains, so there’s not much risk to them in maintaining the open source version of their products and significant benefit in terms of customer trust and community engagement.
Open source has started to feel a bit like “organic” food. Good for you and, all else being equal, preferred. But McDonald’s isn’t going out of business and in fact, McDonald’s can and will start to figure out how to integrate organic into their own menu, all while serving billions of meals a year. Open source can be useful when a developer needs to dive into the code to understand edge cases or bugs, but it’s not the core reason why people choose a product/service. Most people don’t need or even want to know how the sausage is made; they’d prefer a reliable product with good documentation and support.
Distribution used to be just a side effect of open source, not a core reason. Lowering an individual actor’s development cost, like git.2 Helping build trust and community, like MongoDB. Sharing control of critical infrastructure between frenemies, like Android. Collaboration across companies building towards a similar standard, like WebKit. Avoiding compliance concerns for security-sensitive industries, particularly finance, government, and military, like HashiCorp. In addition to these practical benefits, open source is a great ferment of ideas. And while most of those ideas are bad, the public sphere gives the best ideas the opportunity to be discussed and adopted. Large companies will pay hundreds of thousands of dollars to send their developers to conferences in the hopes of incorporating some of the latest industry thinking, but it’s incredibly hard to actually incorporate those ideas once you’re back. The engineering discipline needed to modularize and open source internal code speaks to a strong engineering culture. Leadership is able to weigh a quantitative cost against a qualitative benefit. A strong engineering culture has become especially important in B2B services. As most business processes get automated, having a robust API, good security practice, and near-perfect uptime is as important a selling point as features or price, often even more.
Not all code should be open source. In fact, unless code explicitly benefits from being open source, it’s still probably better to default to closed source. For the most part, open source is best for tightly focused, single-purpose functions, not sprawling products where leadership and complex decision-making are required. Because while open source has made it easier to distribute and even collaborate on software, it has not fundamentally improved how people make decisions and govern themselves.
Open source was critical to the success of companies like MongoDB, but going forward, I don’t believe that it’s vital for all companies, even in developer tooling, to open source their products. This is the same calculation made by Mongo, Sentry, Redis Labs, and so many other open source companies in recent years in adopting less permissive licenses to defend their businesses from AWS. Startups like Render or Vercel or Replit or Darklang haven’t bothered to be open source, despite being PaaSes. This isn’t to say that it’s a bad idea for a company to open source some or all of its code. It’s valuable to the wider community and, like being open-minded in life, usually has indirect and unexpected benefits. But the shift to the cloud means open source is no longer an innovative distribution model for disrupting proprietary incumbents.
This actually affected my early work at Sentry. We assumed that $100k/year was a soft cap on customer ACV. Working backwards, you figure your average ACV will probably never be more than $10k, so to support a billion-dollar business, assuming a normal 6x multiplier, you’d need about $1b / 6 / $10k ≈ 17k customers. At the time, we had 1k, so I concluded that getting broad adoption was our only path to success. This turned out to be wrong (Sentry has since gone on to build up a good-size list of customers paying $100k+/year) and Snowflake has made it obvious that a hosted service can command eight-figure contract sizes. ↩
Yes, git was not “created” to share costs between companies, but Linus disliked the cheap VCS and could not afford to keep using BitKeeper. ↩
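For anyone who wants to redo the back-of-envelope math from the first footnote, here is a minimal sketch. The function name and parameters are my own labels, not anything Sentry actually used, and it assumes the 6x multiplier is applied to annual revenue.

```python
import math


def customers_needed(target_valuation, revenue_multiple, average_acv):
    """Customers required for average_acv * customers * revenue_multiple
    to reach target_valuation (hypothetical helper, for illustration only)."""
    # Annual revenue implied by the target valuation at the given multiple.
    required_revenue = target_valuation / revenue_multiple
    # Round up: a fraction of a customer still means one more customer.
    return math.ceil(required_revenue / average_acv)


# Figures from the footnote: $1b target, 6x multiplier, $10k average ACV.
needed = customers_needed(1_000_000_000, 6, 10_000)
print(needed)
```

With the footnote's inputs this lands at roughly 17k customers. Holding everything else fixed, a 10x higher average ACV cuts the required customer count by 10x, which is the part of the original assumption that turned out to be wrong.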