Designing sharing for enterprise
March 1st, 2020
Enterprise sharing. When Numeracy was acquired by Snowflake, sharing was one of the core features they wanted to integrate. One year later
If we’re going to talk about sharing, we need to first talk about access control. In enterprise, the default is to lock everything down. Access is controlled and granted by admins. While sharing will always occur naturally out-of-band, you can’t build a sharing feature into your product without at least understanding how security thinks about access.
Access control models
There are a lot of access control models. The main five are ACL, DAC, MAC, RBAC, and ABAC. This is the best explanation that I’ve found, but for a short summary:
- ACL (Access control list) is a club guest list. If you’re on the list, you’re in.
- DAC (Discretionary access control) is a speakeasy passphrase. Knowing it lets you in, and you can share that passphrase with others.
- MAC (Mandatory access control) is like Top Secret classification. Secrets are classified Top Secret. To get access, you need Top Secret clearance. Only a higher authority can grant Top Secret clearance.
- RBAC (Role-based access control) is a sheriff’s deputy. Only the sheriff can deputize a citizen, but once you’re a deputy, you get to act like a lawman.
- ABAC (Attribute-based access control) is King Arthur and Excalibur. Our admin Merlin can create any set of rules: you’ve got to be Arthur, the sword must be in a stone, Britain needs a king, etc.
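To make the analogies a bit more concrete, here is a minimal sketch of each model as a single check. Every name in it (the functions, the `LEVELS` table, the Excalibur rules) is made up for illustration and ignores what real systems add on top, such as role hierarchies and auditing.

```python
# Hypothetical one-function sketches of the five models described above.

def acl_allows(user, guest_list):
    """ACL: you are in if your name is on the list."""
    return user in guest_list

def dac_allows(presented, passphrase):
    """DAC, as in the speakeasy: anyone holding the passphrase gets in."""
    return presented == passphrase

# Assumed ordering of classification levels, lowest to highest.
LEVELS = {"public": 0, "secret": 1, "top_secret": 2}

def mac_allows(clearance, classification):
    """MAC: your clearance must meet or exceed the object's classification."""
    return LEVELS[clearance] >= LEVELS[classification]

def rbac_allows(user_roles, role_privileges, privilege):
    """RBAC: privileges attach to roles, and users only get them via a role."""
    return any(privilege in role_privileges.get(role, set()) for role in user_roles)

def abac_allows(attributes, rules):
    """ABAC: every admin-defined rule over the attributes must hold."""
    return all(rule(attributes) for rule in rules)

# Merlin's rules for pulling Excalibur.
excalibur_rules = [
    lambda a: a["name"] == "Arthur",
    lambda a: a["sword_in_stone"],
    lambda a: a["britain_needs_king"],
]
```

The shape of each function hints at who holds the power: in ACL and DAC it sits with whoever owns the list or the passphrase, while in MAC, RBAC, and ABAC it sits with whoever defines the levels, roles, or rules.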
In an ideal world, access control systems are invisible to the user. If everyone is always granted the right level of privilege, no one ever has to think about access control. Because the world is neither static nor conformant, it’s impossible for an admin to design a set of rules that perfectly describe when who should have which privileges over what objects, including objects that do not yet exist. Because admins tend to err on the side of caution, users will inevitably face incorrect “access denied” screens and find themselves trying to understand why their intended recipient can’t access something.
It’s here that access control systems can range from straightforward to fuck-it-I’ll-just-screenshot-it. Numeracy has historically used ACL/DAC (much like Google Docs). We had ACLs per querypad. When someone needed access, they pinged the owner or an editor. We also had a secret, unguessable link for DAC. If you ever landed on a querypad that you didn’t have access to, you could almost always just ping the person who shared it with you (usually the owner or an editor) and they’d just add you to the ACL.
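A rough sketch of that ACL-plus-secret-link setup, in Python. This is not Numeracy’s actual code: the `Querypad` class, its fields, and the method names are all hypothetical, and it collapses owners and editors into "anyone already on the ACL".

```python
import secrets
from dataclasses import dataclass, field

@dataclass
class Querypad:
    owner: str
    # Per-querypad ACL: users the owner or an editor has added.
    acl: set = field(default_factory=set)
    # Secret, unguessable link token (the DAC part).
    link_token: str = field(default_factory=lambda: secrets.token_urlsafe(32))

    def can_open(self, user, token=None):
        """Allow access via the ACL, or via possession of the secret link."""
        if user == self.owner or user in self.acl:
            return True
        if token is None:
            return False
        # Constant-time comparison; encode to bytes so any string is accepted.
        return secrets.compare_digest(token.encode(), self.link_token.encode())

    def add_to_acl(self, granter, user):
        """Someone who already has ACL access adds the person who pinged them."""
        if granter != self.owner and granter not in self.acl:
            raise PermissionError(f"{granter} cannot share this querypad")
        self.acl.add(user)
```

The point of the sketch is how short the recovery path is: the person who can fix an "access denied" is the same person who sent you the link.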
When we came to Snowflake, we ran headfirst into RBAC. This one can be nearly impossible to decipher. Yes, you could ask the admin to just grant your role the privilege, but most admins won’t want to create exceptions to their ruleset, instead preferring to review the needs and the role hierarchy over a few meetings. This is fair — they’re often answering to lawyers and worse yet, customer lawyers. Helping you view a bar chart is hardly their top priority.
Business intelligence tools don’t take RBAC into account either. They treat database roles as part of the database connection and implement their own access control layer. This leads to everyone querying under the same role and, over time, that role becoming over-privileged but so intrinsic to the operations of the company (hundreds of dashboards) that it’s near-impossible to migrate off.
Numeracy allows users to set a role per querypad, much like a regular database session. However, because we’re the official UI, it’s crucial we respect Snowflake’s access rules, namely, RBAC. The problem is that when people run into that ‘Access denied’ screen, how can we make that ‘Access denied’ reasonable?
First and foremost, you can’t build a reasonable sharing experience with RBAC. While you can set up RBAC in a way that facilitates sharing (grant select on views to per-user roles), you can’t guarantee that a team has set up their system that way. Worse yet, a lot of querypads are one-offs. You don’t really need or even want to have them represented in the database. Sharing is fundamentally a UI concept and should be represented in the UI, not the database.
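For a sense of what "grant select on views to per-user roles" involves, here is a small sketch that builds the statements for sharing one view with one user. The helper name, the `<USER>_ROLE` naming convention, and the example identifiers are assumptions of mine; the statements follow Snowflake’s `CREATE ROLE` and `GRANT` syntax, but a real setup would also need USAGE on the database and schema.

```python
import re

# Unquoted identifiers only: letter or underscore first, then letters,
# digits, underscores, or dollar signs.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*$")

def per_user_share_statements(view, user):
    """Build the SQL needed to share one view with one user via a per-user role."""
    for ident in view.split(".") + [user]:
        if not _IDENT.match(ident):
            raise ValueError(f"unsupported identifier: {ident!r}")
    role = f"{user}_ROLE"  # hypothetical naming convention
    return [
        f"CREATE ROLE IF NOT EXISTS {role};",
        f"GRANT SELECT ON VIEW {view} TO ROLE {role};",
        f"GRANT ROLE {role} TO USER {user};",
    ]
```

That is three statements, run by someone with enough privilege, for every view and every recipient, which is a lot of database ceremony for a one-off querypad.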
Document vs Data
Sharing therefore is tied to the document and not the data. Every querypad is thus split into the SQL text and the results. We’ve kept ACL and DAC on the SQL text, whereas we enforce Snowflake’s RBAC on the query results, including charts. When you lack the role to see a query result, you can either get that role granted to you, or you can re-run it under a role that you own.
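The split can be sketched as two independent checks. This is a simplified illustration, not the real implementation: `Querypad`, `visible_parts`, and `rerun_roles` are hypothetical names, and "having a role" is reduced to set membership, where Snowflake would also consider the role hierarchy.

```python
from dataclasses import dataclass, field

@dataclass
class Querypad:
    sql_text: str
    role: str                                   # role the results were produced under
    acl: set = field(default_factory=set)       # who the document is shared with

def visible_parts(pad, user, user_roles, has_link=False):
    """Decide separately whether the SQL text and the results are visible."""
    # Document check: ACL membership or possession of the secret link (DAC).
    can_see_sql = user in pad.acl or has_link
    # Data check: RBAC. The viewer must hold the role the results ran under.
    can_see_results = can_see_sql and pad.role in user_roles
    return {"sql": can_see_sql, "results": can_see_results}

def rerun_roles(pad, user, user_roles, has_link=False):
    """Roles the viewer could re-run the query under: their own, if they can read the SQL."""
    if not (user in pad.acl or has_link):
        return set()
    return set(user_roles)
```

A viewer with the link but not the role gets the SQL text and an empty results pane, plus the two ways out described above: ask for the role, or re-run under one they already hold.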
Refreshable dashboards were another core part of Numeracy that we’ve had to greatly limit. Unless you have the same role tied to the dashboard, you can’t re-run the query. If you’re an editor of the dashboard, you can change the role and re-run it under that, though.
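In code, the refresh rule might look roughly like this. `Dashboard`, `refresh`, and `change_role` are hypothetical names, and I’m assuming an editor can only switch the dashboard to a role they hold themselves.

```python
from dataclasses import dataclass, field

@dataclass
class Dashboard:
    sql_text: str
    role: str                                   # role tied to the dashboard
    editors: set = field(default_factory=set)

def refresh(dash, user_roles):
    """Re-run the dashboard query, but only under the role tied to it."""
    if dash.role not in user_roles:
        raise PermissionError(f"role {dash.role} is required to refresh")
    # Stand-in for actually executing the query.
    return f"run as {dash.role}: {dash.sql_text}"

def change_role(dash, user, user_roles, new_role):
    """Editors may re-point the dashboard at a role they hold (assumed rule)."""
    if user not in dash.editors:
        raise PermissionError(f"{user} is not an editor")
    if new_role not in user_roles:
        raise PermissionError(f"{user} does not hold role {new_role}")
    dash.role = new_role
```

Note that nothing here lets a viewer borrow the dashboard author’s privileges: a refresh always runs under a role the person refreshing already has.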
It would have been nice to allow analysts to create dashboards that can be run by less privileged users, say engineering or sales. The problem is that we’d effectively be giving the analyst the MANAGE GRANTS privilege (effectively making them an admin). A key part of reasonable data governance is not having multiple access control models governing the same data. Instead of trying to circumvent RBAC, the right way to improve this flow is to build a better RBAC role request/approval flow.