I recently launched hiSHtory on Reddit and HN, and have gotten some questions about exactly how it is able to support complex queries and end-to-end encryption. So let’s do a guided tour of the syncing code!
Installation and Initial Syncing
When hishtory is installed, it generates a random secret key. In order to share a shell history, computers have to also share this secret key (this is done by having the user manually copy the key). It then derives two additional secrets from the key and generates a random device ID:
UserId = HMAC(SecretKey, "user_id")
EncryptionKey = HMAC(SecretKey, "encryption_key")
DeviceId = randomUuid()
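As a rough illustration of the derivation above, here is a minimal Python sketch. The choice of SHA-256 as the HMAC hash, the 32-byte key size, and the hex encoding are assumptions made for the example, and `derive` is a hypothetical helper name; none of these are stated above.

```python
import hashlib
import hmac
import secrets
import uuid


def derive(secret_key: bytes, label: str) -> str:
    """HMAC a fixed label under the secret key and hex-encode the tag.

    SHA-256 is an assumption; the pseudocode above only says "HMAC".
    """
    return hmac.new(secret_key, label.encode(), hashlib.sha256).hexdigest()


# Generated once at install time (the 32-byte size is an assumption).
secret_key = secrets.token_bytes(32)

user_id = derive(secret_key, "user_id")
encryption_key = derive(secret_key, "encryption_key")
device_id = str(uuid.uuid4())  # random per device, not derived from the key

# A second device that was given the same secret key derives the same
# UserId and EncryptionKey, but picks its own random DeviceId.
other_user_id = derive(secret_key, "user_id")
other_device_id = str(uuid.uuid4())
```

Because the labels differ, the user ID can be sent to the server without revealing the encryption key, even though both come from the same secret.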
UserId is shared between all devices that a user owns. The (UserId, DeviceId) tuple uniquely identifies a computer owned by a particular user.
hishtory then registers itself with the backend by calling the apiRegisterHandler, which stores the (UserId, DeviceId) tuple. Together, these tuples represent a one-to-many relationship between a user and their devices.
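To make the one-to-many relationship concrete, here is a small sketch of what a registration store could look like. The in-memory dictionary and the `register` function are hypothetical stand-ins for illustration, not the backend's actual storage.

```python
from collections import defaultdict

# user_id -> set of device_ids: one user, many devices.
devices_by_user = defaultdict(set)


def register(user_id, device_id):
    """Record that this device belongs to this user (idempotent)."""
    devices_by_user[user_id].add(device_id)


register("user-a", "laptop")
register("user-a", "desktop")
register("user-a", "desktop")  # registering twice changes nothing
register("user-b", "server")
```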
All the other endpoints in the server can then be characterized as a message queue. In essence, clients can put messages into them and pull messages out of them keyed on their
The first message created after registration is a DumpRequest, which signifies that a new device was created and needs a copy of the existing shell history from another device. When another client receives this message (via polling apiGetPendingDumpRequestsHandler), it responds by sending an encrypted copy of the shell history via the apiSubmitDumpHandler. This is then received by the newly created client (which polls the server for it).
At this point, the newly created client and the existing one both have a copy of your shell history, and the server never saw its contents (it only sees the number of entries).
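The dump handshake described above can be sketched as a pair of queues. Everything here is a hypothetical in-memory model: the `Server` class, its method names, and the placeholder ciphertexts are assumptions for illustration, and the encryption itself happens on the client and is not shown.

```python
from collections import defaultdict


class Server:
    """Hypothetical in-memory stand-in for the backend's queues."""

    def __init__(self):
        self.dump_requests = defaultdict(list)  # user_id -> requesting device_ids
        self.inbox = defaultdict(list)          # (user_id, device_id) -> ciphertexts

    def request_dump(self, user_id, device_id):
        self.dump_requests[user_id].append(device_id)

    def pending_dump_requests(self, user_id, device_id):
        # A device never answers its own request.
        return [d for d in self.dump_requests[user_id] if d != device_id]

    def submit_dump(self, user_id, requesting_device_id, encrypted_entries):
        self.inbox[(user_id, requesting_device_id)].extend(encrypted_entries)
        self.dump_requests[user_id].remove(requesting_device_id)

    def query(self, user_id, device_id):
        # Hand over everything queued for this device and clear its queue.
        return self.inbox.pop((user_id, device_id), [])


server = Server()
old_history = [b"<ciphertext 1>", b"<ciphertext 2>"]  # encrypted client-side

server.request_dump("user-a", "new-laptop")  # new device asks for history
for requester in server.pending_dump_requests("user-a", "old-desktop"):
    server.submit_dump("user-a", requester, old_history)  # existing device answers

received = server.query("user-a", "new-laptop")  # new device polls
```

The server only ever handles opaque byte strings here, so all it can learn is how many entries were transferred.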
Steady State Syncing
Now all that has to be done is to keep the histories in sync going forward. Whenever a command is run, it is encrypted and sent to the apiSubmitHandler. It is then stored and received by the other clients via polling the same apiQueryHandler endpoint as before.
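One way to picture the steady state is as a fan-out: each submitted entry is copied into a queue per device, and each device drains its own queue when it polls. The sketch below is an assumption-laden model, not the actual server: `submit` and `query` are hypothetical names, the device list is made up, and skipping the sending device is a guess.

```python
from collections import defaultdict

# Hypothetical registrations: user_id -> device_ids.
devices = {"user-a": {"laptop", "desktop", "server"}}
# (user_id, device_id) -> unread encrypted entries.
queues = defaultdict(list)


def submit(user_id, source_device_id, encrypted_entry):
    """Store one copy of the opaque ciphertext for every other device."""
    for device_id in devices[user_id]:
        if device_id != source_device_id:  # assumption: the sender is skipped
            queues[(user_id, device_id)].append(encrypted_entry)


def query(user_id, device_id):
    """Return the unread entries for this device and mark them as read."""
    unread = queues[(user_id, device_id)]
    queues[(user_id, device_id)] = []
    return unread


submit("user-a", "laptop", b"<ciphertext of a command>")
desktop_first = query("user-a", "desktop")
desktop_second = query("user-a", "desktop")  # nothing new since the last poll
server_first = query("user-a", "server")
```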
Since hishtory is meant to be used on clients that may not always have a reliable internet connection, we also need to support re-syncing changes after a connection is lost and then regained. Supporting this also has the advantage that if my backend ever goes down, syncing will resume as soon as the backend comes back up.
Whenever hishtory fails to call apiSubmitDumpHandler, it stores the timestamp of this failed upload locally. Then, every subsequent time it calls apiSubmitDumpHandler, it sends all entries since that timestamp.
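The retry logic above can be sketched as a client that remembers the earliest failed upload. The `Client` class, the `submit` callable, and the use of `ConnectionError` to signal an unreachable backend are all assumptions for the example, and encryption is left out to keep the focus on the bookkeeping.

```python
class Client:
    def __init__(self, submit):
        self.submit = submit             # hypothetical upload call; may raise
        self.local_entries = []          # (timestamp, command), oldest first
        self.missed_upload_since = None  # timestamp of the earliest failed upload

    def record(self, timestamp, command):
        self.local_entries.append((timestamp, command))
        if self.missed_upload_since is None:
            batch = [(timestamp, command)]
        else:
            # Resend everything recorded since the first failure.
            batch = [e for e in self.local_entries
                     if e[0] >= self.missed_upload_since]
        try:
            self.submit(batch)
        except ConnectionError:
            if self.missed_upload_since is None:
                self.missed_upload_since = timestamp
        else:
            self.missed_upload_since = None


uploaded = []
state = {"online": False}


def flaky_submit(batch):
    if not state["online"]:
        raise ConnectionError("backend unreachable")
    uploaded.extend(batch)


client = Client(flaky_submit)
client.record(100, "ls")        # fails; remembers timestamp 100
client.record(105, "cd /tmp")   # fails; still remembers 100
missed_while_offline = client.missed_upload_since
state["online"] = True
client.record(110, "make")      # succeeds and sends all three entries
```

Keeping only the earliest failure timestamp means a single successful upload is enough to catch the server up, however many uploads failed in between.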
The last critical syncing feature that hishtory supports is the ability to delete history entries. For example, if someone accidentally records a command that contains a sensitive secret that they don’t want stored locally, they’d want to delete it from their history. The user selects which entries to delete (using the standard query syntax). First, these are deleted from the local DB. Then, the list of deleted entry IDs is sent to the server which:
- Deletes them from any pending queues
- Sends the list of IDs to all other clients, which then also delete them from their local DBs
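The deletion flow above can be sketched as follows. This is a hypothetical model: the `Server` class, `delete_local`, and the assumption that each queued ciphertext is tagged with an entry ID the server can match against are all illustrative choices.

```python
def delete_local(db, entry_ids):
    """Remove entries from a client's local DB; unknown IDs are ignored."""
    for entry_id in entry_ids:
        db.pop(entry_id, None)


class Server:
    def __init__(self, device_ids):
        # device -> [(entry_id, ciphertext)] not yet delivered
        self.pending = {d: [] for d in device_ids}
        # device -> [entry_id] deletions not yet delivered
        self.deletion_requests = {d: [] for d in device_ids}

    def handle_deletion(self, sender, entry_ids):
        ids = set(entry_ids)
        # Step 1: drop the entries from any pending queues.
        for device in self.pending:
            self.pending[device] = [
                (i, c) for i, c in self.pending[device] if i not in ids
            ]
        # Step 2: tell every other device to delete them too.
        for device in self.deletion_requests:
            if device != sender:
                self.deletion_requests[device].extend(entry_ids)

    def poll_deletions(self, device):
        ids, self.deletion_requests[device] = self.deletion_requests[device], []
        return ids


laptop_db = {"e1": "ls", "e2": "export TOKEN=hunter2", "e3": "curl -H hunter2"}
desktop_db = {"e1": "ls", "e2": "export TOKEN=hunter2"}
server = Server(["laptop", "desktop"])
server.pending["desktop"].append(("e3", b"<ciphertext>"))  # not yet delivered

delete_local(laptop_db, ["e2", "e3"])              # delete locally first
server.handle_deletion("laptop", ["e2", "e3"])     # then notify the server
delete_local(desktop_db, server.poll_deletions("desktop"))
```

Since deleting an unknown ID is a no-op, a client can safely receive the same deletion more than once, which suits the eventually consistent model.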
Overall, this can be thought of as an eventually consistent, distributed, end-to-end encrypted set: a simple data structure that gets a lot more complex once syncing and encryption are added.