SQL Passthrough

March 17th, 2022


Unlike most web apps, SQL clients should directly issue SQL to list, view, and modify database objects (i.e. listing tables in a schema). Otherwise, there sheer number of endpoints and arguments eventually result in the team creating SQL-to-REST endpoint generators and the API itself becomes a strange kind of DSL. Unfortunately, access commands for schema objects information is not uniform across databases, but this problem exists in both the generated endpoint and the SQL passthrough cases. The argument comes that you are still creating a translation layer, just one that translates JavaScript function calls made by frontend components into SQL, but this approach is much more stable. For instance, writing tests for how functions generate SQL strings is much easier.

For my own amusement and honestly, practice writing code again, I created sqlpassthru. Brian ended up doing something very similar when we were at Snowflake for Numeracy, so this is just my poor man’s imitation.

As a side note, connection and authorization management isn’t really addressed here. My guess is you’d probably want the ability to pass database and user/role via query args, but passwords or public keys should be stored in the database.

Request

POST / HTTP/1.1
Host: runsql.com
Content-Type: text/sql
Content-Length: 22
Accept: text/csv

SELECT * FROM users;

Response

Pass column types as MIME parameters. Keys and values are percent encoded.

HTTP/1.1 200 OK
Content-Range: rows 0-1/1
Content-Type: text/csv; id=int4; name=text
Content-Length: 24

id,name
1,Eric
2,Sherry

Or for partial content:

HTTP/1.1 206 Partial Content
Content-Range: rows 0-1/*
Content-Type: text/csv; id=int4; name=text
Content-Length: 24

id,name
1,Eric
2,Sherry

You can pass a custom header Max-Content-Length to change the maximum content length.

Ultimately, this lets us us execute code like this from the browser:

fetch('http://localhost:8090/query', {method: 'POST', headers: {'Content-Type': 'text/sql'}, body: 'select * from users'}).then(resp => console.log(resp.text()))

There should also probably be the option to Accept: application/json instead, but csv has the nice property of being streamable and also more compact

The Content-Range header exists because there should be a maximum number of bytes (i.e. 1MB) sent down to the client. Once the number of rows overflows this, the connection should be closed. Unfortunately, the total number of rows is included the CommandComplete tag, which is sent only after all DataRows, so total runs cannot be known until all results are fetched. In fact, it’s likely impossible to know beforehand, as Postgres server will start streaming rows down, even before the query “finishes”.

Content-Type includes coltypes in the same order as the columns. Postgres returns data types in a query responses’s RowDescription. You can see lib/pq’s parsing of RowDescription or pgx’s parsing of RowDescription. Tricky to get, but very possible.

Insert or updating rows would return no content (unless RETURNING clause is added).

Request

POST / HTTP/1.1
Host: runsql.com
Content-Type: text/sql
Accept: text/csv

INSERT INTO users(name) VALUES ('Emily')

Response

HTTP/1.1 204 OK
Content-Type: text/plain
Content-Length: 0
Rows-Affected: 1

Query arguments or parameters can be passed as query parameters.

Request

POST /?args=Emily,3 HTTP/1.1
Host: runsql.com
Content-Type: text/sql
Accept: text/csv

INSERT INTO users(name, login_attempts) VALUES ($1, $2)

Response

HTTP/1.1 204 OK
Content-Type: text/plain
Content-Length: 0
Rows-Affected: 1

And errors would be passed through as such:

Request

POST / HTTP/1.1
Host: runsql.com
Content-Type: text/sql
Accept: text/csv

INVALID SQL

Response

HTTP/1.1 400 Bad Request
Content-Type: text/plain
Content-Length: 67

ERROR:  syntax error at or near "INVALID SQL"
LINE 1: INVALID SQL;

As a sidenote, during my wanderings writing this, I found that sqlc parses expressions to determine types, but that’s only necessary because sqlc doesn’t want to have a dependency on the database to operate. It’s heuristic and therefore incomplete: select 'a' || 'b' just results in an interface{} instead of a string, but it doesn’t particularly matter.