Self_update: In-place updates for Rust executables

jaemk · on Jan 30, 2020

Never thought I'd see someone link to one of my projects!

As others have pointed out, it's not a perfect solution and has a limited scope of use. I originally made this as an experiment to make it easy to distribute updates for CLI apps that I was working on. Yes, a package manager is the "right" way to do this, but since I was testing and building images in my CI flow (using https://github.com/japaric/trust) I wanted a way to take advantage of that pipeline in a seamless way. The library exposes a couple of predefined "backends" that handle common patterns like GitHub releases and S3, but the underlying steps (downloading, extracting, replacing) are also made available so you can build your own update pattern and add things like verification.
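
For readers wondering what the "replacing" step amounts to, here is a minimal sketch using only the standard library. It is not self_update's API: `replace_binary` is a hypothetical helper, and the `.new` staging name is an assumption made for illustration. It assumes the new bytes have already been downloaded and extracted.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical helper (not part of self_update): install `new_bytes` at
/// `target` by writing a sibling staging file and renaming it into place.
fn replace_binary(target: &Path, new_bytes: &[u8]) -> io::Result<()> {
    // Stage next to the target so the rename stays on one filesystem.
    let staging = target.with_extension("new");
    {
        let mut file = fs::File::create(&staging)?;
        file.write_all(new_bytes)?;
        file.sync_all()?;
    }
    // Carry over the old file's permissions (e.g. the executable bit on Unix).
    if let Ok(meta) = fs::metadata(target) {
        fs::set_permissions(&staging, meta.permissions())?;
    }
    // On Unix the rename swaps the directory entry in one step; a process
    // still running the old file keeps using the copy it already has open.
    fs::rename(&staging, target)
}
```

A caller would typically pass `std::env::current_exe()?` as `target`. On Windows this rename is expected to fail while the target executable is running, so a different strategy is needed there.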

dathinab · on Jan 30, 2020

For a moment I thought it would do process takeover (hot reload) which is closer to impossible to get right.

Nice, though not production ready yet.

Though I'm a bit hard pressed to find any good use case for it. In most installations the binary _should_ not have the right to overwrite any binaries at all, including itself. On Linux (and I think Windows too by now) there is a form of update functionality provided by the system. Even if you do your own updates for client-side software, it should be a separate binary (as the main binary should not be able to do so). Maybe it would be OK in a container, but then restarting the container would likely be done instead of the update.

(As a side note, I said "should" because I know that it's sometimes not the case, at least not without putting additional work into it. Still, using an update mechanism like that would make it harder to make it the case in the future.)

dijit · on Jan 30, 2020

I thought this was going to be replacing the process, but it appears to be a tool for replacing the binary on disk? Am I reading the code right?

Why would you want to do this without restarting the process? Isn't hot-reload a much more desirable thing?

And, to that end, replacing a binary in place just ensures that what's running cannot be re-run on stop, so you could end up in a situation where you've overwritten a working binary with a nonworking one, which could obfuscate critical information when trying to recover service.

adamnemecek · on Jan 30, 2020

It's for desktop applications. It's an alternative to Sparkle https://en.wikipedia.org/wiki/Sparkle_(software)

zamalek · on Jan 30, 2020

Hot reload is such a difficult goal, especially if there are kernel/out-of-process resources (such as network connections) involved.

I'm sure everyone believes that hot reload is the answer, but there are probably fewer than 5 projects where it has been done, and all those projects likely have some seriously PhD-level code.

Beyond the most extreme manual and cognitive effort, hot reload is completely and utterly unsolved.

cperciva · on Jan 30, 2020

Network connections aren't the hardest problem; you can hand those across Unix sockets.

The fundamental problem with hot reload is that data structures can change. If the old process has a priority queue implemented as a linked list and the new process has a priority queue implemented as a heap, how on earth is any automated mechanism going to copy the state across?

yorwba · on Jan 30, 2020

Changing the running code but keeping the data in memory is equivalent to changing the code but keeping the data in the database. Changing data-structure definitions is equivalent to changing the database schema.

If you want to change the database schema without losing data, you need to write a migration. If you want to hot-reload code with different data-structure definitions, you'll have to write a migration.

A hot-reloading solution for a statically typed programming language will at minimum have to detect whether data structures have changed and either prevent the reload or require that the new code includes a function to migrate the data.
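
A minimal sketch of that idea, using the linked-list-to-heap example from upthread. Everything here is hypothetical: the `QueueV1`/`QueueV2` types, the `Handoff` version tag, and the function names are made up for illustration, and a real hot-reload mechanism would also have to get the state out of the old code somehow.

```rust
use std::collections::{BinaryHeap, LinkedList};

/// Layout version 1 (hypothetical): priorities kept in a linked list.
struct QueueV1 {
    items: LinkedList<u32>,
}

/// Layout version 2 (hypothetical): priorities kept in a binary heap.
struct QueueV2 {
    items: BinaryHeap<u32>,
}

/// State handed over by the old code, tagged with its layout version.
enum Handoff {
    V1(QueueV1),
    V2(QueueV2),
    /// A layout the new code has never heard of.
    Unknown(u32),
}

/// The hand-written migration: rebuild the heap from the list's elements.
fn migrate_v1_to_v2(old: QueueV1) -> QueueV2 {
    QueueV2 {
        items: old.items.into_iter().collect(),
    }
}

/// Accept the state as-is, migrate it, or refuse the reload.
fn accept_state(handoff: Handoff) -> Result<QueueV2, String> {
    match handoff {
        Handoff::V2(queue) => Ok(queue),
        Handoff::V1(queue) => Ok(migrate_v1_to_v2(queue)),
        Handoff::Unknown(version) => Err(format!(
            "no migration from layout version {}; refusing reload",
            version
        )),
    }
}
```

The point of the `Unknown` arm is that the reload is rejected rather than guessed at when no migration exists.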

lkschubert8 · on Jan 30, 2020

I guess you could come up with some concept of migrations for data structures?

Joker_vD · on Jan 30, 2020

Obviously, you write a code_change/3 function that updates the old state to the new shape or, if asked to do so, downgrades the new state back to the old shape :)

pjmlp · on Jan 30, 2020

In languages like Common Lisp it is quite well solved, even in space.

http://www.flownet.com/gat/jpl-lisp.html

> The Remote Agent software, running on a custom port of Harlequin Common Lisp, flew aboard Deep Space 1 (DS1), the first mission of NASA's New Millennium program. Remote Agent controlled DS1 for two days in May of 1999. During that time we were able to debug and fix a race condition that had not shown up during ground testing. (Debugging a program running on a $100M piece of hardware that is 100 million miles away is an interesting experience. Having a read-eval-print loop running on the spacecraft proved invaluable in finding and fixing the problem. The story of the Remote Agent bug is an interesting one in and of itself.)

zamadatix · on Jan 30, 2020

These don't exclude each other; you can do both if you want.

This is true of any self contained updater.

wtmt · on Jan 30, 2020

It’s an updater for programs. Consider it like Firefox or any other application updating itself. The update check and actual update are enabled by this one for Rust programs. Any notifications to the user or restarts should be handled by the program itself.

blattimwind · on Jan 30, 2020

There doesn't seem to be support for signing in this.

sansnomme · on Jan 30, 2020

Does this do live process replacement too? E.g. transition smoothly from old web server to new web server while still handling requests without downtime.

microcolonel · on Jan 30, 2020

I'm pretty sure it doesn't.

That would be a quite tricky thing to do. You would probably need some very specific architecture in your application to allow for something like that.

It's not clear to me that this is a better architecture than just splitting your application into two or more processes, one of which is of such lasting value and high quality that it would never need replacing. Then still, you may be better off making your gateways/endpoints not need 100% uptime in order to maintain availability, or making your client applications not require total availability to behave well.

loeg · on Jan 30, 2020

For web serving (or TCP servers with short-lived connections) in particular, you could do an application-specific smooth transition. But this wouldn't be the responsibility of TFA's library.

You could have your program use SO_REUSEPORT (or SO_REUSEPORT_LB?) on the socket it listens on. When upgrading, the old version launches the new version, which begins accepting new requests on the configured addresses and ports. The old version closes its listening socket so all new connections are handled by the new version. The old version can then just wait for all existing clients to disconnect, or if the server handles long-poll style loads, maybe send a redirect of some kind to force the client to connect to the new instance. When clients are all exited or some acceptable timeout has expired, the old version can delete itself and exit.
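
The "wait for existing clients, or give up after a timeout" part can be sketched with the standard library alone. This is an illustration under assumptions, not a full handoff: `ActiveConnections` and `wait_for_drain` are hypothetical names, the 10 ms polling interval is arbitrary, and setting SO_REUSEPORT itself is not shown because std does not expose it.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::{Duration, Instant};

/// Hypothetical counter of connections still served by the old version.
#[derive(Clone, Default)]
struct ActiveConnections(Arc<AtomicUsize>);

impl ActiveConnections {
    /// Call when a connection is accepted.
    fn open(&self) {
        self.0.fetch_add(1, Ordering::SeqCst);
    }
    /// Call when a connection finishes.
    fn close(&self) {
        self.0.fetch_sub(1, Ordering::SeqCst);
    }
    fn count(&self) -> usize {
        self.0.load(Ordering::SeqCst)
    }
}

/// After the old version has closed its listening socket, poll until every
/// connection is done. Returns true if drained, false if the timeout hit.
fn wait_for_drain(active: &ActiveConnections, timeout: Duration) -> bool {
    let deadline = Instant::now() + timeout;
    while active.count() > 0 {
        if Instant::now() >= deadline {
            return false;
        }
        thread::sleep(Duration::from_millis(10));
    }
    true
}
```

The old version would exit once `wait_for_drain` returns, whichever way it returns; what to do with connections still open at the timeout is an application decision.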

sansnomme · on Jan 30, 2020

Technically, like Erlang, Go could probably do this by redirecting new I/O into the new process's channels and kill the old process when all channels are empty.

euank · on Jan 30, 2020

Go's channels aren't exposed outside a specific go process's runtime. The runtime doesn't give you any convenient way to redirect them. They're not like erlang's mailboxes at all in that regard.

Furthermore, channels aren't the primitive used for multiplexing IO / handling connections on a socket in go. You typically have a goroutine (e.g. 'http.ListenAndServe' spins up goroutines), and the goroutines are managed not by channels, but by the internal go runtime's scheduler and IO implementation (which internally uses epoll).

Because of all those things, replacing a running go process that's listening on sockets is no different from that same problem in C. You end up using SO_REUSEPORT and then passing the file-descriptors to the new process and converting them back into listeners. Channels don't end up factoring into it meaningfully.
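
Since the thread is about Rust, here is a rough sketch of the "converting them back into listeners" step with Rust's standard library, Unix only. It is not how tableflip or tableroll work internally; `listener_to_fd` and `listener_from_fd` are hypothetical helpers. Actually getting the descriptor into another process (inheriting it across exec, or sending it over a Unix socket) is not shown, since that needs calls std does not provide.

```rust
use std::net::TcpListener;
use std::os::unix::io::{FromRawFd, IntoRawFd, RawFd};

/// Old-process side: give up ownership of the listener and keep only the
/// descriptor number, so the socket is not closed when `listener` is dropped.
fn listener_to_fd(listener: TcpListener) -> RawFd {
    listener.into_raw_fd()
}

/// New-process side: wrap a received descriptor in a `TcpListener` again.
///
/// Safety: `fd` must be an open, listening TCP socket that nothing else in
/// this process owns, because the returned listener closes it on drop.
unsafe fn listener_from_fd(fd: RawFd) -> TcpListener {
    unsafe { TcpListener::from_raw_fd(fd) }
}
```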

If you're interested in what this looks like, cloudflare wrote a library called tableflip [0] which does this. I also forked that library [1] to handle file-descriptor handoff in a more generic way, so I've ended up digging pretty deeply into the details of how this works in go.

[0]: https://github.com/cloudflare/tableflip

[1]: https://github.com/ngrok/tableroll

andoriyu · on Jan 30, 2020

Hot-code reload like in Erlang isn't really a requirement in the modern world in most cases. It also goes directly against the immutable infrastructure we are all moving towards.

In the case of Rust, batteries aren't included, so each network stack will require different strategies, but in general it will be more like an nginx (or unicorn, if you come from the Ruby world) style of reload.

In other words - it's possible, but why?

cbolat · on Jan 30, 2020

What about the integrity and security of the downloaded binary? You should implement a signature check as well to verify the binary.

dickeytk · on Jan 30, 2020

This won't work on Windows, will it?

lpghatguy · on Jan 30, 2020

An issue was filed about what platforms the project supports[1] and it appears to work on Windows.

[1]: https://github.com/jaemk/self_update/issues/21

thijsvandien · on Jan 30, 2020

While Windows does not support replacing an open executable, it does support moving/renaming it, so you can get it out of the way and clean it up later.
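
A minimal sketch of that rename-aside approach with the standard library. `swap_in` and `cleanup` are hypothetical helpers and the `.old` suffix is an assumption; this is not taken from self_update's code.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: move the current executable aside, then put the new
/// file at the original path. Returns where the old file was parked.
fn swap_in(current: &Path, new_file: &Path) -> io::Result<PathBuf> {
    let parked = current.with_extension("old");
    // Clear a leftover from an earlier update; ignore it if there is none.
    let _ = fs::remove_file(&parked);
    fs::rename(current, &parked)?;
    if let Err(err) = fs::rename(new_file, current) {
        // Try to roll back so a working binary stays at the original path.
        let _ = fs::rename(&parked, current);
        return Err(err);
    }
    Ok(parked)
}

/// Run on a later start: delete the parked file. Failure is ignored because
/// the old process may still be running from it.
fn cleanup(parked: &Path) {
    let _ = fs::remove_file(parked);
}
```

Both paths should be on the same volume so the renames are plain moves rather than copies.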

asveikau · on Jan 30, 2020

Also: I solved this problem before by copying the binary to a different location, then launching it from the alternate location to copy itself over the original. Then the newly launched one in the old location deleted the temporary copy. [Man, this sounds clumsy to describe. But it worked!]