Anti-Aggregation Hypothesis

  1. The absolute number of niche software businesses with a potential market of a few million users and total potential revenue of under ten million dollars will grow over time.
  2. Automation and the consumerization of IT mean B2B businesses will, over time, come to resemble B2C businesses in their ratios. The number of users per employee will only grow with time. B2C businesses have 1 employee per 100,000 active users; B2B businesses will have 1 per 10,000. This will, over time, close the gap between gross margins and operating margins.

Until you have 10 TB of data, you shouldn’t care. Scaling app servers is remarkably straightforward, which means the only limitation is really the database. Google Cloud provides 10 TB managed Postgres servers[1]. 10 TB means you have tens of millions of monthly active users or millions of daily active users and need access to historical data beyond the last year, so vacuuming isn’t an option.

  • Tiny: You know most of your users by name. Don’t analyze your user base; talk to them.
  • Small: You’ve gotten some word of mouth and have a few hundred to a few thousand users. Maybe you’re experimenting with marketing and want to know your ROI.
  • Medium: You’ve hit your stride. You’ve got tens if not hundreds of thousands of users. You have a <10 person team, depending on per-user revenue.
  • Large: You have millions of users. It’s a real business, though not necessarily a cash-flow positive one.
  • Massive: You have tens to hundreds of millions of users. You’re a unicorn. Your data team is yelling that they need Hadoop. You have a data team.

Pipet works like a dream for tiny to medium-sized companies. With some config, it should work fine for large companies. If you’re massive, you’ve already switched to Hadoop, so none of this matters.

10 GB for 100,000 users

Pipet monthly costs

Sufficient for 100,000 users

  • Redash instance: $24.27
  • Pipet instance: $3.88
  • Postgres micro instance: $9.37
  • Total: $37.52

Sufficient for 1,000,000 users

  • Redash instance: $24.27
  • Pipet instance: $3.88
  • Postgres micro instance with 100 GB: $24.67
  • Total: $52.82

Sufficient for 10,000,000 users

  • Redash instance: $24.27
  • Pipet instance: $24.27
  • Postgres micro instance with 1 TB: $195.55
  • Total: $244.09

Segment monthly costs

  • Segment: $0
  • Redshift 160 GB: $187.20
  • Mode: $500/month
  • Total: $687.20

100,000 “MTUs”

  • Segment: $1,000
  • Redshift 160 GB: $187.20
  • Mode: $500/month
  • Total: $1,687.20

1,000,000 “MTUs”, assuming 75% discounts

  • Segment: $2,500
  • Redshift 2 TB: $529.20
  • Mode: $500/month
  • Total: $3,529.20

10,000,000 “MTUs”, assuming 90% discounts

  • Segment: $10,000
  • Redshift 2 TB: $529.20
  • Mode: $500
  • Total: $11,029.20

For heavy usage, where you have multiple analysts and millions of daily active users, Postgres remains cheaper than Redshift:

  • Redshift 16 TB: $4,230.72
  • Postgres 32 vCPU, 208 GB RAM, 10 TB storage: $2,028.10

If you’re a side project or a startup without funding but with hundreds of thousands of freemium users, reducing the COGS of your free users is critical. $0.01/month is probably the total product infrastructure cost of each free user (unless you’re a “toothbrush” product like Google or Snapchat). Your analytics infrastructure cost should be a fraction of that, not equal to it. At the point at which Pipet “breaks” (10 million monthly users), the cost savings are more than an engineer’s salary. The flip side is that having analytics is, without question, valuable for decision making. When the cost of analytics drops by two orders of magnitude, analytics becomes affordable much earlier in a business’s development.
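
As a rough check on the per-user argument above, the monthly totals from the cost lists can be divided by the user counts they are sized for. This is only a sketch: the $0.01 budget and the totals come from the text, while the function names and the assumption that one Segment “MTU” equals one user are illustrative.

```python
# Per-user monthly analytics cost, using totals quoted in the cost lists.
# Assumption (not from the source): one Segment "MTU" is treated as one user.

FREE_USER_BUDGET = 0.01  # total product infrastructure cost per free user, $/month

# (users, monthly total in dollars)
PIPET = [(100_000, 37.52), (1_000_000, 52.82), (10_000_000, 244.09)]
SEGMENT = [(100_000, 1_687.20), (1_000_000, 3_529.20)]


def cost_per_user(users: int, monthly_total: float) -> float:
    """Monthly analytics spend divided by the number of users served."""
    return monthly_total / users


def share_of_budget(users: int, monthly_total: float) -> float:
    """Analytics cost per user as a fraction of the per-user budget."""
    return cost_per_user(users, monthly_total) / FREE_USER_BUDGET


if __name__ == "__main__":
    for name, tiers in (("Pipet", PIPET), ("Segment", SEGMENT)):
        for users, total in tiers:
            print(f"{name:8s}{users:>12,} users: "
                  f"${cost_per_user(users, total):.6f}/user, "
                  f"{share_of_budget(users, total):.1%} of budget")
```

Under these assumptions, Pipet at 100,000 users comes to under 4% of the $0.01 budget, while the Segment stack at the same size costs more per user than the entire budget.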

This plays into my anti-aggregation hypothesis that software without strong network effects is a commodity. Strong network effects are found predominantly in certain categories: communication, marketplaces, and data. These network effects give near-impenetrable moats and therefore give the firms near-unlimited pricing power. 99% gross margins[2] are not unheard of.

  • Communication: Facebook vs Google+
  • Marketplaces: Amazon vs Jet
  • Data: Google vs Bing

Most software has weak to no network effects. Things like “brand awareness” or “industry standardization” can help with adoption, but they do not prevent a well-funded or scrappier[3] competitor from taking the market. And if gross margins are too high, you’ll face an endless supply of such competitors. In Segment’s case, gross margins are 99%[4]. The difference in cost (and the relative ease of rolling our own) made it an easy decision to switch off. Segment, recognizing this, offered to cut our costs in half, reducing gross margins to 97%. Quite honestly, it would have required cutting the price to a fifth of the original (reducing margins to near 90%) to make it worth it.
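
The margin figures in this paragraph follow from the definition gross margin = (price − COGS) / price. The sketch below shows how a price cut moves the margin when COGS stays fixed. The function names are illustrative, and the 98.5% starting margin in the usage note is an assumption chosen only because it rounds to the figures quoted above; it is not a number from Segment.

```python
def gross_margin(price: float, cogs: float) -> float:
    """Gross margin as a fraction of price: (price - COGS) / price."""
    if price <= 0:
        raise ValueError("price must be positive")
    return (price - cogs) / price


def margin_after_price_cut(original_margin: float, price_multiplier: float) -> float:
    """Margin after scaling the price by price_multiplier with COGS unchanged.

    Prices are normalized so the original price is 1.0.
    """
    cogs = 1.0 - original_margin  # COGS per $1 of original price
    return gross_margin(price_multiplier, cogs)


if __name__ == "__main__":
    # Assumed starting margin of 98.5% (rounds to the 99% in the text).
    for multiplier in (1.0, 0.5, 0.2):
        margin = margin_after_price_cut(0.985, multiplier)
        print(f"price x{multiplier}: margin {margin:.1%}")
```

With that assumed starting point, halving the price gives a 97% margin and cutting it to a fifth gives 92.5%, which is in line with the rounded numbers in the text.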

My guess is that for software businesses without network effects, gross margins of ~90% should be the norm. So long as the team has a reasonably capital-efficient go-to-market[5], it seems unlikely VCs would fund competitors to try and displace them.

  1. A Google Cloud SQL for Postgres instance can be up to 10 TB.

  2. Gross margins are not operating margins. Gross margin compares revenue against COGS; fixed costs, like R&D, are excluded.

  3. Open source or more consumer-oriented plays can do this. MySQL/PostgreSQL displacing Oracle. Trello displacing Jira. 

  4. Sentry was paying well over $1,000/month when we switched off Segment. Our own “Segment” costs less than $10/month. 

  5. Assuming it’s a side project so R&D is “free”, CAC should be less than ACV * gross margins. Of course, assuming insanely great conversion rates (10%), gross margins of 90%, and signup conversions of 5%, you’re spending roughly 0.5% of your ACV on a click. Given that clicks generally cost $3, you need an ACV of about $600 to afford ads. My belief in Sentry’s eventual dominance comes from our seemingly ceaseless source of organic leads, largely acquired via open source. Sentry being open source both recognizes that our category is already a commodity and uses it to our advantage. Sentry’s software, which is normally seen as a fixed-cost moat, instead lowers our CAC while raising competitors’. These CAC economics are the weak moat that most software businesses have: investors will do the napkin math and realize the costs of breaking into the market are not worth the potential revenues.
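
The napkin math in footnote 5 can be written out as a short sketch. The function names are illustrative, and reading the 5% as click-to-signup and the 10% as signup-to-paid is an assumption; the product is the same either way.

```python
def affordable_click_share(signup_rate: float, paid_rate: float,
                           gross_margin: float) -> float:
    """Fraction of ACV that can be spent on one click while CAC <= ACV * gross margin.

    CAC = cost_per_click / (signup_rate * paid_rate), so the constraint
    rearranges to cost_per_click <= ACV * gross_margin * signup_rate * paid_rate.
    """
    return signup_rate * paid_rate * gross_margin


def break_even_acv(cost_per_click: float, signup_rate: float,
                   paid_rate: float, gross_margin: float) -> float:
    """Smallest ACV at which paying cost_per_click per click still breaks even."""
    return cost_per_click / affordable_click_share(signup_rate, paid_rate, gross_margin)


if __name__ == "__main__":
    share = affordable_click_share(signup_rate=0.05, paid_rate=0.10, gross_margin=0.90)
    acv = break_even_acv(3.00, signup_rate=0.05, paid_rate=0.10, gross_margin=0.90)
    print(f"share of ACV per click: {share:.2%}")
    print(f"break-even ACV at $3/click: ${acv:,.2f}")
```

The unrounded share is 0.45%, which puts the break-even ACV nearer $667; rounding the share to 0.5% gives the $600 quoted in the footnote.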