An engineer's SQL client
Someday, I’d love to build a SQL client for engineers.1 For the sake of this article, let’s call this hypothetical product “Potrero”.
First off, Potrero wouldn’t target the analyst. That market is done, captured. Instead, it’s engineers. Engineers already know the basics of SQL. Many startup engineers will write their own queries to satisfy occasional question. They generally don’t use Looker or Mode, but psql. And while it’s hard to compete with built-in solutions, people aren’t attached to psql in the way they’re attached to vim and even in that category, VS Code has emerged as the new dominate player in a decades-old market (though obviously subsidized by a massive company).
So, starting from a different audience, a very different product starts to evolve.
Instead of starting with Snowflake or Redshift, you start with localhost Postgres. Engineers need to be able to explore their local data. Sort of like requestb.in (RIP), it’s sometimes hard to know, within a complicated code base, what is actually being put in the database. Sure, you could fire up psql, but you really just want to lazily click around.
From there, you might annotate a column or table to remind your future self of weirdness in a particular table. But you’d also want to be able to share with your teammates. So you’d actually want a sort of collaborative document. You don’t really need/want it to be realtime, so something like git would be fine. You could make local modifications and when you’re ready, publish them to the team.
You’d also want some kind of data summary stats displayed alongside the human-editable parts. Postgres already has pg_stats, which tells you about NULL values, popular values, percentile distributions (if it’s a continuous value like numbers or datetime).
Now for running SQL, you have two parts: Query + Results. Yet at the same time, you want access to browse the schema objects as well as previous queries. For the quick answer, we can just look at other code editors: they use tabs and split views. I know, I know. No one except engineers know about split views. But remember? I told you Potero was for engineers. Cmd-P for GoTo is therefore acceptable.
Typing SQL is of course tiresome, so intensely great autocomplete is absolutely necessary. Luckily, because you’re only support Postgres, syntactical autocomplete is much easier. One of the clever tricks we used at Numeracy was mostly autocompleting within single clauses. Sometimes, a person might right the
from clause before the
select clause and autocomplete should still detect that it should prioritize table/view/table-returning functions first. “Perfect” autocomplete would be able to autocomplete not only keywords, tokens, and values based on the database, but from frequency queried (not for any user on the database, lest some machine is using the same connection, but for people using Potrero). Of course, the autocomplete should have references to the notes you commented on earlier.
Actually, since we’re talking about using query history for ordering autocomplete by frequency, you could add this data to the schema browser: what are common join’s, where’s, or group by’s for a table. What tables in a schema are most often queried. That sort of thing.
As another side note, some SQL clients sell the idea of real-time collaboration. But for collaborating on queries, you don’t want real-time collaboration because it’s impossible to do reasonable autocomplete if there are multiple cursors. Given the choice, I’d rather have autocomplete with presence (Eric is typing…) and git-style merging when needed. Even more importantly, we’re starting from localhost here, so the flow isn’t real-time, but compose-publish.
Eventually, you’d want to hook this up to your production app. Your schema comments and usage metadata would carry over, but now, it wouldn’t just be yours, but your entire team’s. This is where Potrero would actually solve a valuable but subtle problem: contextual knowledge transmission. Most SQL seems to be easy to write, but the devil’s always in the details. Understanding the subtleties (or just oddities) of SQL, your database, or your dataset is everything. Hopefully a product like Potrero would assist with that.
I don’t plan on building this myself anytime soon, because I don’t think this market is viable for a business (though it might be viable as a side project making beer money). The pain point isn’t sharp enough that anyone will pay enough to pay for CAC and isn’t general enough that a free product will spread virally enough to survive off VC money. ↩