Gaining Traction on the Intractable: Optimizing a push notification service for minimum storage
Tim Winchester
Reading time: about 9 min
Topics:
Background
Lucidchart is a cloud-based collaborative diagramming application, so features that enable our users to collaborate effectively are a must. We've had a commenting system in place for some time, but it has been plagued by pain points like users having to manually check each document for new comments instead of being automatically notified. Our goal was to alleviate those kinds of problems with the commenting system by building an extensible notification system, which would initially focus on notifications about unread comments. It turns out that doing this is not trivial.The Problem
Let's imagine that all users are connected at all times. Whenever a user submits a new comment, we push a notification to all collaborating users about that new comment, and let's further assume that every user clicks the notification and reads the comment immediately. In this perfect world, the server can take a fire-and-forget approach to notifications, and we don't need to store any extra notification data on the server. Obviously, this is not the case in the real world. Back in reality, users log in and out at their leisure, and even if we send them a notification, they don't necessarily read it immediately. Instead, we have to store data about it for each user until they do read it. The O(nm) problem has begun to appear: we have to store a "seen/unseen" flag for each of n new comments, for each of m collaborators. This means that unless we find a cheat of some kind, we're stuck with a bad space complexity requirement. (As an aside, we could store those flags alongside the comment in the corresponding diagram file, but that would be a terrible mistake: the flags don't actually have anything to do with document state, so they bloat the diagram's file size unnecessarily, which kills our load time.)Staking out a Solution
Outside of the document itself, we have two other places we can store the notification data: in another table in a server-side database or in the browser’s local storage. Local storage is always alluring because it’s easy to use and it’s free, but it has the unfortunate drawback of being device-specific: if you read a bunch of comments on your work computer, your personal laptop won’t know anything about that and will display those comments as if they’ve never been read. With this in mind, we went with the database option, but we also used local storage to get the best of both worlds. We operated under 3 primary constraints:- Save as little as possible in local storage and the database
- Rely on real-time notifications as much as possible
- In the case of failure, err on the side of marking comments as unread
Handling Hiccups
You may be wondering at this point where constraint #3 (“In the case of failure, err on the side of marking comments as unread”) comes into play. There may be cases where the delete attempt of a read comment notification fails, which means that only the local storage of the machine on which it was read knows that it has been read. While it will keep attempting to tell the server that the comment has been read, the user may open the document on another device before it successfully does so. This should not be a common occurrence, but we designed the system such that a second device will see all remaining notifications just as the original device did. Viewed from this perspective, it becomes difficult to imagine designing the system any other way. (For a more in-depth explanation of why we had to make the decision one way or the other, see The Two Generals Problem.)Conclusion
Looking back, even though the upper bound of the storage requirement never changed from O(nm), we were able to pare down the storage cost in the common case quite a bit. We did so by capitalizing on certain properties of the problem, namely:- Notification data is user-specific. This allows us to store it outside the document, since no other user’s data (and, by extension, document data) will depend on it.
- Notification data is Boolean and has a black-hole state. This allows us to delete entries instead of storing false values.
- Both GET and DELETE are idempotent. This means that although we can never guarantee that the server and client have identical information, the client can send requests with impunity, without worrying about side effects.
- We can reduce the storage requirement even further by adopting a policy of not showing notifications older than x days, which effectively reduces the n in O(nm).
About Lucid
Lucid Software is a pioneer and leader in visual collaboration dedicated to helping teams build the future. With its products—Lucidchart, Lucidspark, and Lucidscale—teams are supported from ideation to execution and are empowered to align around a shared vision, clarify complexity, and collaborate visually, no matter where they are. Lucid is proud to serve top businesses around the world, including customers such as Google, GE, and NBC Universal, and 99% of the Fortune 500. Lucid partners with industry leaders, including Google, Atlassian, and Microsoft. Since its founding, Lucid has received numerous awards for its products, business, and workplace culture. For more information, visit lucid.co.