Previously, if a remote build failed for lack of disk space, the
failure was considered permanent and was thus cached as a build
failure when the local daemon ran with '--cache-failures'.
* guix/scripts/offload.scm (transfer-and-offload): Upon
'nix-protocol-error?' call 'node-free-disk-space' and return 1 instead
of 100 if the result is lower than 10 MiB.
Fixes <https://bugs.gnu.org/33378>.
* guix/scripts/offload.scm (node-free-disk-space): New procedure.
(%minimum-disk-space): New variable.
(choose-build-machine): Call 'node-free-disk-space' and take it into
account in addition to LOAD.
(check-machine-status): Display the free disk space.
* guix/scripts/offload.scm (machine-load): Remove.
(node-load, normalized-load): New procedures.
(choose-build-machine): Call 'open-ssh-session' and 'make-node' from
here; pass the node to 'node-load'.
(check-machine-status): Use 'node-load' instead of 'machine-load'. Call
'disconnect!' on SESSION.
* guix/scripts/offload.scm (call-with-timeout): New procedure.
(with-timeout): New macro.
(process-request): Use it around 'transfer-and-offload' call.
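For illustration only, a timeout wrapper of this kind can be sketched in
Guile with SIGALRM. This is not the code from offload.scm: the argument
order, the ON-TIMEOUT fallback and the 'timed-out key are assumptions
made for the example.

```scheme
;; Illustrative sketch; not the actual (guix scripts offload) code.
;; The ON-TIMEOUT argument and the 'timed-out key are hypothetical.
(define (call-with-timeout timeout thunk on-timeout)
  "Call THUNK.  If it runs for more than TIMEOUT seconds, abort it and
return the result of calling ON-TIMEOUT instead."
  (catch 'timed-out
    (lambda ()
      (dynamic-wind
        (lambda ()
          ;; Arrange for SIGALRM to unwind out of THUNK.
          (sigaction SIGALRM (lambda _ (throw 'timed-out)))
          (alarm timeout))
        thunk
        (lambda ()
          ;; Cancel any pending alarm and restore the default handler.
          (alarm 0)
          (sigaction SIGALRM SIG_DFL))))
    (lambda _ (on-timeout))))

(define-syntax-rule (with-timeout timeout on-timeout body ...)
  (call-with-timeout timeout
                     (lambda () body ...)
                     (lambda () on-timeout)))
```

With these definitions, (with-timeout 5 'late (do-something)) evaluates
to the value of the body when it finishes in time, and to 'late
otherwise; 'do-something' stands for any expression.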
Previously we were looking at the load of the past 5 minutes, which
means that, after a build, we could end up waiting for 5 minutes for
that metric to be low enough.
* guix/scripts/offload.scm (machine-load): Compute RAW based on ONE, not
FIVE.
This fixes a regression in 'retrieve-files*' introduced in 896fec476f,
whereby (guix scripts offload) would not read the initial sexp now sent
by the remote host via 'store-export-channel'. This would effectively
prevent file retrieval entirely when offloading.
* guix/ssh.scm (retrieve-files*): New procedure, like former
'retrieve-files' but with an extra #:import parameter.
(retrieve-files): Rewrite in terms of 'retrieve-files*'.
(file-retrieval-port): Make private.
* guix/scripts/offload.scm (transfer-and-offload): Pass #:import to
'retrieve-files*'.
(retrieve-files*): Remove.
* guix/scripts/offload.scm (check-machine-status): New procedure.
(guix-offload): Call it when the argument is "status".
* doc/guix.texi (Daemon Offload Setup): Document it.
* guix/scripts/offload.scm (build-machines): Comment out
'(set! %fresh-auto-compile #t)' since with Guile 2.2.3 it could lead to
an actual rebuild of everything that gets loaded from there on. See
<https://bugs.gnu.org/29226>.
* guix/ui.scm (load*): Likewise.
Previously we would call 'machine-load' once per machine, which was very
costly when there were many machines. Now we arrange to call it only
once on average (when all the machines have the same 'speed' value).
* guix/scripts/offload.scm (random-seed, shuffle): New procedures.
(choose-build-machine)[machines+slots+loads]: Rename to...
[machines+slots]: ... this. Remove load from the tuples therein.
[undecorate]: Adjust accordingly.
[machine-less-loaded-or-faster?]: Remove.
[machine-faster?]: New procedure.
Sort MACHINES+SLOTS according to 'machine-faster?'. Call
'machine-load' as the last thing.
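The idea can be sketched as follows, under simplifying assumptions:
machines are (NAME . SPEED) pairs, PROBE-LOAD is a hypothetical stand-in
for the costly remote call, and MAX-LOAD is an arbitrary threshold.
Shuffling before a stable sort keeps machines of equal speed in random
order, so the probe usually runs only on the first candidate. Falling
back to the next candidate when the first is overloaded is a choice made
for this sketch.

```scheme
;; Illustrative sketch; data layout and parameter names are hypothetical.
(use-modules (srfi srfi-1))

(define (shuffle lst)
  "Return LST in random order, using the Fisher-Yates algorithm."
  (define vec (list->vector lst))
  (let loop ((result '())
             (i (vector-length vec)))
    (if (zero? i)
        result
        (let* ((j (random i))
               (val (vector-ref vec j)))
          ;; Move the last live element into the slot just consumed.
          (vector-set! vec j (vector-ref vec (- i 1)))
          (loop (cons val result) (- i 1))))))

(define (choose-machine machines probe-load max-load)
  "Return the fastest of MACHINES whose load is below MAX-LOAD, or #f.
PROBE-LOAD is called lazily, starting with the fastest machine."
  (let ((candidates (stable-sort (shuffle machines)
                                 (lambda (a b)
                                   (> (cdr a) (cdr b))))))
    (find (lambda (machine)
            (< (probe-load machine) max-load))
          candidates)))
```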
The '%slots' list could grow indefinitely; in practice though,
guix-daemon is likely to restart 'guix offload' often enough.
* guix/scripts/offload.scm (%slots): Remove.
(choose-build-machine): Don't 'set!' %SLOTS. Return the acquired slot
as a second value.
(process-request): Adjust accordingly. Release the returned slot after
'transfer-and-offload'.
This fixes a memory leak that can be seen by running:
  (map (lambda _ (machine-load m)) (iota 1000))
* guix/scripts/offload.scm (machine-load): Add call to 'disconnect!'.
This avoids the open/fstat/close syscalls upon a cache hit that we had
with the previous idiom:
  (call-with-input-file file read-derivation)
where caching happened in 'read-derivation' itself.
* guix/derivations.scm (%read-derivation): Rename to...
(read-derivation): ... this.
(read-derivation-from-file): New procedure.
(derivation-prerequisites, substitution-oracle)
(derivation-prerequisites-to-build):
(derivation-path->output-path, derivation-path->output-paths):
(derivation-path->base16-hash, map-derivation): Use
'read-derivation-from-file' instead of (call-with-input-file …
read-derivation).
* guix/grafts.scm (item->deriver): Likewise.
* guix/scripts/build.scm (log-url, options->things-to-build): Likewise.
* guix/scripts/graph.scm (file->derivation): Remove.
(derivation-dependencies, %derivation-node-type): Use
'read-derivation-from-file' instead.
* guix/scripts/offload.scm (guix-offload): Likewise.
* guix/scripts/perform-download.scm (guix-perform-download): Likewise.
* guix/scripts/publish.scm (load-derivation): Remove.
(narinfo-string): Use 'read-derivation-from-file'.
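The difference between the two idioms can be sketched like this. The
names below are hypothetical and a plain hash table stands in for the
actual cache: the point is only that the cache is consulted before the
file is opened, so a hit performs no file system access.

```scheme
;; Illustrative sketch; %cache and read-from-file/cached are hypothetical.
(define %cache (make-hash-table))

(define (read-from-file/cached file read-proc)
  "Return the result of applying READ-PROC to a port opened on FILE.
%CACHE is looked up first, so FILE is not opened upon a cache hit."
  (or (hash-ref %cache file)
      (let ((value (call-with-input-file file read-proc)))
        (hash-set! %cache file value)
        value)))
```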
This allows 'guix publish' threads as well as 'guix substitute' and
'guix offload' processes to be properly labeled in 'top', 'pstree', etc.
* guix/workers.scm (worker-thunk): Add #:thread-name parameter and honor it.
(make-pool): Likewise.
* guix/scripts/publish.scm (http-write): Add calls to 'set-thread-name'
in bodies of 'call-with-new-thread'.
(guix-publish): Call 'set-thread-name'. Pass #:thread-name to 'make-pool'.
* guix/scripts/offload.scm (guix-offload): Call 'set-thread-name'.
* guix/scripts/substitute.scm (guix-substitute): Likewise.
* guix/scripts/offload.scm (connect-to-remote-daemon)
(store-import-channel, store-export-channel, send-files)
(retrieve-files): Move to (guix ssh).
(nonce): Add optional 'name' parameter and use it.
(retrieve-files*): New procedure.
(transfer-and-offload): Use it instead of 'retrieve-files', and add
first parameter to 'send-files'.
(assert-node-can-import): Likewise.
(assert-node-can-export): Use 'retrieve-files' instead of
'store-export-channel'.
* guix/ssh.scm: New file.
* configure.ac: Use 'GUIX_CHECK_GUILE_SSH' and define 'HAVE_GUILE_SSH'
Automake conditional.
* Makefile.am (MODULES) [HAVE_GUILE_SSH]: Add guix/ssh.scm.
* guix/scripts/offload.scm (check-machine-availability): Add 'pred'
parameter and honor it.
(guix-offload): For the "test" sub-command, accept an extra 'regexp'
parameter. Pass a second argument to 'check-machine-availability'.
This fixes a regression introduced in 21531add32 whereby the build log
would no longer be sent to FD 4, so the daemon would not see it.
* guix/scripts/offload.scm (transfer-and-offload): Parameterize
CURRENT-BUILD-OUTPUT-PORT.
* guix/scripts/offload.scm (assert-node-repl, assert-node-has-guix)
(nonce, assert-node-can-import, assert-node-can-export)
(check-machine-availability): New procedures.
(%random-state): New variable.
(guix-offload): Add case for "test".
* doc/guix.texi (Daemon Offload Setup): Document it. Remove obsolete
bit about remote invocation of 'guix build'.
This fixes a longstanding issue where 'choose-build-machine' would make
on average O(N log(N)) calls to 'machine-load', plus an extra call for
the selected machine, instead of N calls.
* guix/scripts/offload.scm (machine-load): Add comment.
(machine-power-factor, machine-less-loaded-or-faster?): Remove.
(choose-build-machine)[machines+slots]: Rename to...
[machines+slots+loads]: ... this.
[undecorate]: Adjust accordingly.
[machine-less-loaded-or-faster?]: New procedure.
Remove extra 'machine-load' call in body.
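The decorate-sort-undecorate pattern behind this change can be sketched
as below. The sketch sorts on load alone, whereas the real predicate
also takes speed into account; SORT-BY-LOAD and its MACHINE-LOAD
argument are names made up for the example.

```scheme
;; Illustrative sketch; sort-by-load is a hypothetical helper.
(define (sort-by-load machines machine-load)
  "Return MACHINES sorted by increasing load, calling MACHINE-LOAD exactly
once per machine rather than once per comparison."
  (let ((decorated (map (lambda (machine)
                          ;; Decorate: pair each machine with its load.
                          (cons machine (machine-load machine)))
                        machines)))
    ;; Sort on the cached loads, then undecorate.
    (map car
         (sort decorated
               (lambda (a b)
                 (< (cdr a) (cdr b)))))))
```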
* guix/scripts/offload.scm (<build-machine>)[daemon-socket]: New field.
(connect-to-remote-daemon): New procedure.
(%gc-root-file, register-gc-root, remove-gc-roots, offload): Remove.
(transfer-and-offload): Rewrite using 'connect-to-remote-daemon' and
RPCs over SSH.
(store-import-channel, store-export-channel): New procedures.
(send-files, retrieve-files): Rewrite using these.
* guix/scripts/offload.scm (<build-machine>)[ssh-options]: Remove.
[host-key, host-key-type]: New fields.
(%lsh-command, %lshg-command, user-lsh-private-key): Remove.
(user-openssh-private-key, private-key-from-file*): New procedures.
(host-key->type+key, open-ssh-session): New procedures.
(remote-pipe): Remove 'mode' parameter. Rewrite in terms of
'open-ssh-session' etc. Update users.
(send-files)[missing-files]: Rewrite using the bidirectional channel
port.
Remove call to 'call-with-compressed-output-port'.
(retrieve-files): Remove call to 'call-with-decompressed-port'.
(machine-load): Remove exit status logic.
* doc/guix.texi (Requirements): Mention Guile-SSH.
(Daemon Offload Setup): Document 'host-key' and 'private-key'. Show the
default value on each @item line.
* m4/guix.m4 (GUIX_CHECK_GUILE_SSH): New macro.
* config-daemon.ac: Use 'GUIX_CHECK_GUILE_SSH'. Set
'HAVE_DAEMON_OFFLOAD_HOOK' as a function of that.
Suggested by Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>.
* guix/scripts/offload.scm (register-gc-root)[script]: Replace
'false-if-exception' with a finer-grained 'system-error handler.
Provide the name of MACHINE in 'leave' error message.
Suggested by Ricardo Wurmus <ricardo.wurmus@mdc-berlin.de>.
* guix/scripts/offload.scm (remote-pipe): Remove unneeded 'catch'.
(machine-load): Check the exit value upon (close-pipe pipe). Call
'warning' when it is non-zero.
* guix/scripts/offload.scm (machine-less-loaded?, machine-faster?):
Remove.
(machine-power-factor): New procedure.
(machine-less-loaded-or-faster?): Use it.