Key Generation

This guide provides detailed instructions for generating SSH key pairs and securely deploying public keys to remote worker machines in various operating system configurations within the AGILab environment.

The rest of this guide uses the following terms:

  • Manager: The local machine from which SSH connections are initiated.

  • Worker: The remote machine that accepts SSH connections.

  • Remote account: The user account on the worker machine used for SSH login.

1. Generate the keys

manager$ ssh-keygen -a 100 -t ed25519

You will be prompted for a passphrase. Leave it empty by pressing Return twice.

If you have not changed the default path, the public key will be stored in ~/.ssh/id_ed25519.pub and the private key in ~/.ssh/id_ed25519.
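For scripted setups, the same key pair can be generated without any prompt. The following is a minimal sketch, not part of AGILab: the function name generate_key is hypothetical, and it assumes the OpenSSH ssh-keygen tool is on the PATH. The -N '' option sets an empty passphrase and -f sets the output path, matching the interactive choices described above.

```shell
# Hypothetical helper: create an ed25519 key pair without prompting.
# $1 = private key path (defaults to the standard OpenSSH location).
generate_key() {
    keyfile=${1:-$HOME/.ssh/id_ed25519}
    # Never overwrite an existing key.
    if [ -f "$keyfile" ]; then
        echo "key already exists: $keyfile"
        return 0
    fi
    keydir=$(dirname "$keyfile")
    # Create the directory with strict permissions only if it is missing.
    [ -d "$keydir" ] || mkdir -p -m 700 "$keydir"
    # -q: quiet, -a 100: KDF rounds, -N '': empty passphrase, -f: output file
    ssh-keygen -q -a 100 -t ed25519 -N '' -f "$keyfile"
}
```

Calling generate_key with no argument targets ~/.ssh/id_ed25519; a second call leaves the existing key untouched.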

2. Load the private key into the SSH agent

2.1 Load the private key

manager$ ssh-add ~/.ssh/id_ed25519

2.2 Verify the key addition

manager$ ssh-add -l

It should list the fingerprint of the loaded key (the private key itself is never displayed). To manually check the public key:

manager$ cat ~/.ssh/id_ed25519.pub

If you have set a passphrase, ssh-add asks you to enter it when loading the key. If you encounter any permission-related errors, refer to the Permissions section.

On Linux, if a window titled “Enter password to unlock the private key” appears when trying to establish an SSH connection, enter the passphrase and check the box “Automatically unlock this key whenever I’m logged in”.
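The load and verify steps can be combined so that the key is added only when the agent does not hold it yet. This is a sketch under assumptions: ensure_key_loaded is a hypothetical name, an SSH agent is assumed to be reachable from the current shell, and the public key is assumed to sit next to the private key with a .pub suffix.

```shell
# Hypothetical helper: add the key to the agent only if it is not loaded yet.
# $1 = private key path (defaults to the standard OpenSSH location).
ensure_key_loaded() {
    keyfile=${1:-$HOME/.ssh/id_ed25519}
    # ssh-keygen -lf prints "<bits> <fingerprint> <comment> (<type>)".
    fingerprint=$(ssh-keygen -lf "$keyfile.pub" | awk '{print $2}')
    if [ -z "$fingerprint" ]; then
        echo "cannot read $keyfile.pub" >&2
        return 1
    fi
    # ssh-add -l lists the fingerprints of the keys held by the agent.
    if ssh-add -l 2>/dev/null | grep -qF -- "$fingerprint"; then
        echo "already loaded"
    else
        ssh-add "$keyfile"
    fi
}
```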

3. Copy the public key to the worker

3.1 Allow your key

Follow these steps to add your key to the authorized_keys file of each worker:

Worker Linux:

manager$ ssh-copy-id -i ~/.ssh/id_ed25519 <remote account>@<worker ip>

Worker Windows:

manager$ cat ~/.ssh/id_ed25519.pub | ssh <remote account>@<worker ip> powershell -NoProfile -Command "Add-Content -Encoding ascii -Path 'C:\\Users\\<remote account>\\.ssh\\authorized_keys' -Value ([Console]::In.ReadToEnd())"

3.2 Verification

manager$ ssh <remote account>@<worker ip>

Success

It should connect without asking for the account password.
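With several workers, the verification can be run in a loop. The sketch below is illustrative only: check_workers is a hypothetical name and the 5-second timeout is an arbitrary choice. It relies on the OpenSSH option BatchMode=yes, which makes ssh fail instead of prompting for a password, so a worker that still requires a password is reported as FAIL.

```shell
# Hypothetical helper: test passwordless login on each worker.
# $1 = remote account, remaining arguments = worker addresses.
check_workers() {
    user=$1
    shift
    failed=0
    for host in "$@"; do
        # BatchMode=yes disables password prompts; stdin is detached so the
        # remote command cannot consume the caller's input.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" hostname \
            >/dev/null 2>&1 </dev/null; then
            echo "OK $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    # Non-zero when at least one worker rejected the key.
    return "$failed"
}
```

For example, check_workers followed by the remote account and the worker addresses prints one OK or FAIL line per worker.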

Bidirectional trust between worker Macs

When two macOS workers must talk to each other (for example <worker_a_ip> and <worker_b_ip>), install the same SSH key pair on both hosts so either side can ssh without a password prompt. Assuming the login user is <remote account> (adjust if yours differs):

  1. Enable Remote Login on each Mac if it is not already on:

    sudo systemsetup -setremotelogin on
    
  2. From <worker_a_ip> push the public key to <worker_b_ip>:

    ssh-copy-id -i ~/.ssh/id_ed25519 <remote account>@<worker_b_ip>
    
  3. From <worker_b_ip> push the same key back to <worker_a_ip>:

    ssh-copy-id -i ~/.ssh/id_ed25519 <remote account>@<worker_a_ip>
    
  4. Verify both directions once so the host keys land in ~/.ssh/known_hosts:

    ssh <remote account>@<worker_b_ip> hostname
    ssh <remote account>@<worker_a_ip> hostname
    

Each command should print the remote hostname without asking for a password. If either side still prompts, re-run ssh-copy-id and make sure ~/.ssh/authorized_keys on the target contains the public key content.
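Checking that authorized_keys on the target contains the public key can be scripted. The following sketch is meant to run on the target host; authorize_key is a hypothetical name, and the public key file is assumed to have been copied there beforehand. It compares the base64 key field rather than the whole line, so a differing comment does not produce a duplicate entry.

```shell
# Hypothetical helper: append a public key to authorized_keys only if absent.
# $1 = public key file, $2 = authorized_keys path (optional).
authorize_key() {
    pubkey_file=$1
    auth_file=${2:-$HOME/.ssh/authorized_keys}
    auth_dir=$(dirname "$auth_file")
    [ -d "$auth_dir" ] || mkdir -p -m 700 "$auth_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"
    # The second field of a public key line is the base64 key material.
    blob=$(awk 'NR == 1 { print $2 }' "$pubkey_file")
    if [ -z "$blob" ]; then
        echo "no key found in $pubkey_file" >&2
        return 1
    fi
    if grep -qF -- "$blob" "$auth_file"; then
        echo "already authorized"
    else
        cat "$pubkey_file" >> "$auth_file"
        echo "key added"
    fi
}
```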

Node reinstalled or host key changed

If a worker was reinstalled or its SSH host keys changed, fix the host-key trust first, then restore user-key authentication:

  1. Verify the new SSH host key fingerprint out of band.

  2. On the manager, remove the stale host key and register the new one:

    ssh-keygen -R <worker_ip>
    ssh-keyscan -H -t ed25519 <worker_ip> >> ~/.ssh/known_hosts
    ssh-keygen -F <worker_ip> -f ~/.ssh/known_hosts
    
  3. Re-push the manager public key to the rebuilt worker:

    ssh-copy-id -i ~/.ssh/id_ed25519 <remote account>@<worker_ip>
    
  4. If ssh-copy-id is unavailable, recreate ~/.ssh/authorized_keys on the worker manually and keep strict permissions:

    mkdir -p ~/.ssh
    chmod 700 ~/.ssh
    printf '%s\n' '<public key content>' >> ~/.ssh/authorized_keys
    chmod 600 ~/.ssh/authorized_keys
    
  5. Verify passwordless access again before relaunching AGILAB:

    ssh <remote account>@<worker_ip> hostname
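Steps 1 and 2 can be tied together so that the new host key is registered only when its fingerprint matches the value obtained out of band. This is a sketch under assumptions: trust_host is a hypothetical name, the expected fingerprint is supplied by the operator in the SHA256:... form printed by ssh-keygen -l, and the directory holding known_hosts already exists. Unlike the command in step 2, it scans without -H, so the host name is stored unhashed.

```shell
# Hypothetical helper: replace a host key only if its fingerprint is expected.
# $1 = worker address, $2 = expected fingerprint, $3 = known_hosts path (optional).
trust_host() {
    host=$1
    expected=$2
    known_hosts=${3:-$HOME/.ssh/known_hosts}
    scanned=$(ssh-keyscan -t ed25519 "$host" 2>/dev/null)
    if [ -z "$scanned" ]; then
        echo "no ed25519 host key received from $host" >&2
        return 1
    fi
    # ssh-keygen -lf - reads the key from stdin and prints its fingerprint
    # as the second field.
    actual=$(printf '%s\n' "$scanned" | ssh-keygen -lf - | awk '{print $2}')
    if [ "$actual" != "$expected" ]; then
        echo "fingerprint mismatch for $host: got $actual" >&2
        return 1
    fi
    # Drop any stale entry, then record the verified key.
    ssh-keygen -R "$host" -f "$known_hosts" >/dev/null 2>&1 || true
    printf '%s\n' "$scanned" >> "$known_hosts"
}
```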
    

If the worker also lost its AGILAB cluster mount, restore ~/.agilab/.env and remount the shared clustershare path before rerunning cluster installs or pipelines.

Troubleshooting

SSHD Service

Check the service status:

sudo systemctl status ssh  # Check SSH Server status
ssh-add -L  # List the public keys held by the SSH agent (fails if no agent is running)

Check the configuration

Check the SSH server configuration in the sshd_config file:

  • Windows Server: C:\ProgramData\ssh\sshd_config

  • Unix Server: /etc/ssh/sshd_config

Ensure the following configuration is set:

PubkeyAuthentication yes
PasswordAuthentication no

To modify these settings, open the file in an elevated text editor, update the lines as shown above, and restart the SSH server (see the Restart the SSHD service section).
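The two settings can also be read from the file with a small script. The sketch below is an approximation, not a full parser: sshd_option is a hypothetical name, only plain "Keyword value" lines are handled, and Include directives and Match blocks are ignored. It returns the first uncommented occurrence, since sshd uses the first value it obtains for a keyword, and it compares keywords case-insensitively.

```shell
# Hypothetical helper: print the first uncommented value of a keyword.
# $1 = path to sshd_config, $2 = keyword (case-insensitive).
sshd_option() {
    # Commented lines start with "#", so their first field never matches.
    awk -v key="$2" 'tolower($1) == tolower(key) { print tolower($2); exit }' "$1"
}
```

For example, sshd_option /etc/ssh/sshd_config PasswordAuthentication prints no when password logins are disabled in the main file. On hosts where it is available, running sshd -T as root prints the effective configuration and is the more reliable check.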

Restart the SSHD service

workers$ sudo systemctl restart ssh  # Restart SSH Server
workers$ eval "$(ssh-agent -s)"  # Start a new SSH agent in the current shell

Permissions

chmod 700 ~/.ssh
chmod 600 ~/.ssh/id_ed25519

To verify:

ls -la ~/.ssh
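The permission check can be automated by comparing the mode string printed by ls. This is a minimal sketch: check_mode is a hypothetical name, and the expected value is given in the ls -l notation (drwx------ for mode 700 on a directory, -rw------- for mode 600 on a file). Only the prefix is compared, because some systems append a marker such as "+", "." or "@" to the mode string.

```shell
# Hypothetical helper: compare the mode string reported by ls with an expected one.
# $1 = path, $2 = expected mode string, e.g. drwx------ or -rw-------.
check_mode() {
    actual=$(ls -ld "$1" | awk '{print $1}')
    case $actual in
        "$2"*) return 0 ;;
        *)
            echo "$1: expected $2, found $actual" >&2
            return 1
            ;;
    esac
}
```

For example, check_mode ~/.ssh drwx------ and check_mode ~/.ssh/id_ed25519 -rw------- both succeed silently after the chmod commands above.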