Skip to content

Auth (pubkeys and deprecate password auth)#62

Merged
brainstorm merged 23 commits into
mainfrom
auth
Mar 12, 2026
Merged

Auth (pubkeys and deprecate password auth)#62
brainstorm merged 23 commits into
mainfrom
auth

Conversation

@brainstorm

@brainstorm brainstorm commented Feb 24, 2026

Copy link
Copy Markdown
Owner

Fixes issues #20 #21.

  • Allows any and all usernames while ssh-ing (zssh@, root@, user@...), but keys/passwords must be right.
  • Removes unimplemented UART pin assignment logic via runtime ENV variables (other required subsystems for this feature are not ready yet anyway).
  • Password authentication be-gone (in favor of PubKeyAuth). We are in 2026 and we should know better, according to research:

“An average human brain is capable of remembering relatively simple passwords (~20-30 bits of entropy) that are much weaker than keys used for modern block ciphers (128/256 bits)”

How to test this?

On one console, run as usual:

$ espflash erase-flash # SSHStampConfig has changed and there's no migration path available, sorry!
$ cargo run-esp32c6

(...)
INFO - Waiting for SSH client...

On the other console, export the SSH_STAMP_PUBKEY variable and SendEnv it along:

 ssh -o SendEnv=SSH_STAMP_PUBKEY zssh@192.168.4.1

If all goes well one should see the following on the first console:

(...)
INFO - Checking pubkey string passed through ENV: ssh-ed25519 AAAAC3NzaC1lZDI1NTE5AAAAICWwC2CWtve93K0BubV0gf74kvzDG9WM5SfXAAcr+5dy rvalls@Romans-MBP.lan
INFO - Public key format valid, continuing to parse
INFO - Parsed Ed25519 public key, adding to config
INFO - Added new pubkey from ENV
(...)

As an extra, to make sure that any password authentication is ineffective (we only support PubKey auth), try connecting with the following cmdline, it should deny and not establish the SSH<-->Serial bridge at all:

$ ssh -o PubkeyAuthentication=no -o PreferredAuthentications=password zssh@192.168.4.1

And yet another extra test, generate another priv-pubkey pair and try to authenticate after first provisioning:

$ ssk-keygen -h
$ ssh -i ~/.ssh/<name_of_new_throwaway_private_key> -vvvv zssh@192.168.4.1

It should NOT establish the SSH<-->UART bridge, but reject (HSM reboot for now, needs refinement).

…e don't really have (need?) users but password must be right). Cleanup of remainders of dynamic UART pins allocations, minor import organization
@brainstorm

brainstorm commented Mar 1, 2026

Copy link
Copy Markdown
Owner Author

I'm currently trying to provision the client's public keys to the device on ServEvent::FirstAuth event via the following command:

$ ssh-copy-id -o PubkeyAuthentication=yes -o PasswordAuthentication=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=publickey root@192.168.4.1

The rationale behind using ssh-copy-id is that it's widely available and known (versus using SCP, manually copy/pasting public keys around or shuttling them using env vars (see issue #23).

The core provisioning code of ssh-copy-id, which is just a bash wrapper looks like this:

# installkey_sh [target_path]
#    produce a one-liner to add the keys to remote $TARGET_PATH
installkeys_sh() {
  # In setting INSTALLKEYS_SH:
  #    the tr puts it all on one line (to placate tcsh)
  #      (hence the excessive use of semi-colons (;) )
  # then in the command:
  #    cd to be at $HOME, just in case;
  #    the -z `tail ...` checks for a trailing newline. The echo adds one if was missing
  #    the cat adds the keys we're getting via STDIN
  #    and if available restorecon is used to restore the SELinux context
  # OpenWrt has a special case for root only.
  INSTALLKEYS_SH=$(tr '\t\n' ' ' <<-EOF
        $SET_X
        cd;
        umask 077;
        AUTH_KEY_FILE="${TARGET_PATH}";
        [ -f /etc/openwrt_release ] && [ "\$LOGNAME" = "root" ] &&
                AUTH_KEY_FILE=/etc/dropbear/authorized_keys;
        AUTH_KEY_DIR=\`dirname "\${AUTH_KEY_FILE}"\`;
        mkdir -p "\${AUTH_KEY_DIR}" &&
                { [ -z "\`tail -1c "\${AUTH_KEY_FILE}" 2>/dev/null\`" ] ||
                        echo >> "\${AUTH_KEY_FILE}" || exit 1; } &&
                cat >> "\${AUTH_KEY_FILE}" || exit 1;
        if type restorecon >/dev/null 2>&1; then
                restorecon -F "\${AUTH_KEY_DIR}" "\${AUTH_KEY_FILE}";
        fi
        EOF
  )

  # to defend against quirky remote shells: use 'exec sh -c' to get POSIX;
  printf "exec sh -c '%s'" "${INSTALLKEYS_SH}"
}

Which, due to the nature of our embedded system, fails because there's no underlying filesystem to store the public keys on, just a thin flash abstraction:

Terminated

/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: 196: cannot create /home/rvalls/.ssh/ssh-copy-id.0slarCAOI3/popids_tmp_id: Directory nonexistent
/usr/bin/ssh-copy-id: 202: cannot create /home/rvalls/.ssh/ssh-copy-id.0slarCAOI3/popids_output: Directory nonexistent
grep: /home/rvalls/.ssh/ssh-copy-id.0slarCAOI3/popids_output: No such file or directory
/usr/bin/ssh-copy-id: 215: cannot open /home/rvalls/.ssh/ssh-copy-id.0slarCAOI3/popids_output: No such file
cat: /home/rvalls/.ssh/ssh-copy-id.0slarCAOI3/popids_tmp_id: No such file or directory
/usr/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
                                                                (if you think this is a mistake, you may want to use -f option)

So the current task/approach is to "fake" those filesystem events/syscalls (returning success on each), capture the public key material and store it on SSHStampConfig flash struct... wish me luck.

EDIT: Shelved this idea in a separate branch: https://github.com/brainstorm/ssh-stamp/tree/ssh-copy-id

Some valuable feedback from sunset upstream:

I don't think it makes sense, but it probably wouldn't work at that layer anyway. the ssh-copy-id shell commands are just channel data, which could be spread over arbitrary many packets. you'd need a buffer in sunset core which would be messy

normal commands are passed as a packet field, not as channel data

Probably I can revisit this once shuttling-pubkeys-in-ssh-env-vars is finished... that'll give me a way to implement that feature in ssh-copy-id itself upstream (not so crazy strategy as there's already hardcoded vendor-specific logic for NetScreen (acquired by Juniper Networks)).

…keys. SECURITY: Allow FirstAuth unconditionally by default for now until we find a better provisioning mechanism
@brainstorm brainstorm marked this pull request as ready for review March 5, 2026 19:27
…ext up is making sure that pubkey is only added on FirstAuth and FirstBoot
@brainstorm brainstorm requested review from jubeormk1 and mmalenic March 5, 2026 21:07
@brainstorm

brainstorm commented Mar 5, 2026

Copy link
Copy Markdown
Owner Author

Fixed the pubkey SSH validation... pending before merge: make sure that one can only add a pubkey during FIRST BOOT and FIRST AUTH and block any attempts to overwrite/add further keys afterwards (see SECURITY comments).

@mmalenic Adding you in here (for the preliminary code review) because it'll eventually be very relevant piece of code with your MLKEM work.

TL;DR: I'm only considering Ed25519 keys for now, but we'll have to make it more flexible w.r.t key types when you are done with your magic ;)

brainstorm and others added 4 commits March 6, 2026 21:26
…4)...

INFO - Stored pubkey: [37, 176, 11, 96, 150, 182, 247, 189, 220, 173, 1, 185, 181, 116, 129, 254, 248, 146, 252, 195, 27, 213, 140, 229, 39, 215, 0, 7, 43, 251, 151, 89]
INFO - Presented pubkey: [37, 176, 11, 96, 150, 182, 247, 189, 220, 173, 1, 185, 181, 116, 129, 254, 248, 146, 252, 195, 27, 213, 140, 229, 39, 215, 0, 7, 43, 251, 151, 114]
…y taste, but will do for now... most obvious auth flows covered, but needs much closer inspection and refinement, specially at the HSM flow level /cc @Autofix
@brainstorm

brainstorm commented Mar 7, 2026

Copy link
Copy Markdown
Owner Author

Quite happy about how this auth preview looks like... needs refinement for sure and might be wrong somewhere in the HSM flow (I hope not, but one never knows). Auth flows seem to behave as they should now, from provisioning to runtime normal use (see how to test instructions on this PR description box).

SSHStamp's first_boot critical boolean flag is based on the SSHStampConfig struct which lives in regular flash and we don't yet have the proper "security features enablement workflows" (secure boot, glitching protection, encrypted flash and friends)... but time will come when we can spare silicon to go through those eFuse-blowing moves, early days for now ;)

Please test and give me some feedback, planning to merge it by next week, before the 10th of March.

/cc @ramrunner

@brainstorm brainstorm changed the title Auth (pubkeys and password auth) Auth (pubkeys and deprecate password auth) Mar 7, 2026
Comment thread src/settings.rs
Comment thread src/config.rs
Comment thread src/serve.rs Outdated
Comment thread src/serve.rs Outdated
Comment thread src/serve.rs
#[allow(unreachable_patterns)]
match ev {
// #[cfg(feature = "sftp-ota")]
ServEvent::SessionSubsystem(a) => {

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be overkill but what about a check for authenticated when receiving a SessionSubsystem or sessionShell? This way we could catch unexpected auth workaround

Comment thread src/serve.rs Outdated
Comment thread src/serve.rs Outdated
Comment thread src/serve.rs Outdated
} else {
info!("No matching pubkey slot found");
a.reject()?;
software_reset(); // TODO: Handle better HSM-flow-wise.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Off topic: Maybe connection_loop can return an error that can be used to determine what action should be taken eg. reconfigure wifi, uart parameters, etc.

Comment thread src/serve.rs Outdated
_ => {
// Only Ed25519 keys supported
a.reject()?;
software_reset(); // TODO: Handle better HSM-flow-wise.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to do a software_reset() Does the communication hang?

Comment thread src/serve.rs
}
}
ServEvent::OpenSession(a) => {
info!("ServEvent::OpenSession");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we check if the auth has been performed before processing any post auth event?

Comment thread src/serve.rs
// Only allow adding a pubkey via ENV on first-boot-like configs.

if !config_guard.first_boot {
warn!("SSH_STAMP_PUBKEY env received but not first-boot; rejecting");

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a way to replace the client pubkey?

@jubeormk1 jubeormk1 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments. I am going to test it now in hardware

@jubeormk1

jubeormk1 commented Mar 9, 2026

Copy link
Copy Markdown
Collaborator

Running on esp32c6. Once the AuthKey is set, the server rejects any password or unauthenticated login.

I have some questions that are more than bug, user handling decisions:

  • It is not possible to replace the client public key. If that is what we want all good but we have no way to replace it.
  • It is possible to start a bridge/sftp-ota without setting the client key. Shall we disable any Session/subsystem while first_boot is set?

With this change, only a Shell or Subsystem (sftp-ota) can be started unless the user has set a authentication key. This way we only allow secure uses of ssh-stamp by default
Apologies, I should have checked the warnings.
@brainstorm brainstorm force-pushed the auth branch 2 times, most recently from 6dcc2b6 to 396f496 Compare March 9, 2026 20:12
Before now we where cloning the stdio for the shell channel. We used one stdio to read and the other (stdio2) for writing. This causes some internal issue in sunset and trying to read on a channel that has received an EOF would not return Ok(0) or error.

Splitting the stdio instead of cloning fixes the behavior.

for the shake of exploring the situation I have tried the next combinations:

- cloning stdio, reading from the cloned stdio: No Ok(0)
- cloning stdio and splitting it. Using split stdio for read/write: No Ok(0)

In summary, when cloned, we loose the Ok(0) expected after the channel has received EOF

This time I have checked the warnings so the CI should be happy ;-)

@mmalenic mmalenic left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with @jubeormk1, I think it's worthwhile doing a check to ensure that no actions can be performed after auth, if auth has not been run. E.g. using an is_authed flag check at the start of each post-auth enum variant with a reject()? if it fails?

Otherwise looks good!

@brainstorm

brainstorm commented Mar 10, 2026

Copy link
Copy Markdown
Owner Author

@jubeormk1

It is not possible to replace the client public key. If that is what we want all good but we have no way to replace it.

Well, my thoughts on this (at least before we adopt a more serious, enterprise-y security posture) is to just run espflash erase-flash and add the keys you want/need?

Our current dev-friendly threat model is relatively relaxed right now (which is currently at: "physical access means game over anyway")... that will most probably change in the future, but I just don't want to have a silicon graveyard yet with eFuse-blowing tests for security... YET :-S

Happy to be convinced otherwise!

It is possible to start a bridge/sftp-ota without setting the client key. Shall we disable any Session/subsystem while first_boot is set?

Good point, yes, I believe so, we should disable any session/subsystem while first_boot is set!

@jubeormk1

Copy link
Copy Markdown
Collaborator

@jubeormk1

It is not possible to replace the client public key. If that is what we want all good but we have no way to replace it.

Well, my thoughts on this (at least before we adopt a more serious, enterprise-y security posture) is to just run espflash erase-flash and add the keys you want/need?

Our current dev-friendly threat model is relatively relaxed right now (which is currently at: "physical access means game over anyway")... that will most probably change in the future, but I just don't want to have a silicon graveyard yet with eFuse-blowing tests for security... YET :-S

Happy to be convinced otherwise!

It is possible to start a bridge/sftp-ota without setting the client key. Shall we disable any Session/subsystem while first_boot is set?

Good point, yes, I believe so, we should disable any session/subsystem while first_boot is set!

Hi there and good push! It is very close to be done!

I would like to go back to the client public key, I think that I might have created some confusion.

What I meant about the client public key not being updatable is that after the client provides a public key for authentication, while first boot flag is true, there is no way to replace it if the user decided to do so even when authenticated and physical access and erase-flash would be required.

Before there was a provision for replacing the client public key but now this is denied.

It is not a big deal and I would not delay this pull request for this.

brainstorm and others added 3 commits March 11, 2026 21:05
…fe6b2a gets fully addressed by @Autofix, wip/broken ATM"

This reverts commit 9ef5d39.
The software_reset() calls were masking an issue handling the borrow of `a` in the ev match for `ServEvent::SessionSubsystem(a)` and `ServEvent::SessionShell(a)`

The changes in this commit make the borrow exclusive using a simple if else if else cases handling.

As addition I have reduced the scope of the config_guard to check the first_boot to a minimum

Handling the auth on first boot and subsequent boots.

The behavior is:
- First boot:
	- Good key provided: Store key and open session
	- Other: No session. bye
- Not first boot (client pub key in config):
	- Correct key: session open
	- Other: No session. bye
The software resets where masking borrowing issues. I have removed them and added a WIP message.

In the future it would be desirable having the FSM dealing with the depth of the system closed instead of reverting each state until it get back to main.

Maybe a return value in main.rs::bridge_connected select3 requesting the right level of steps back?
- session closed/bad auth: could stay in that function
- config changes that imply hardware: Reset or better granularity
- server finishes: Not sure. Reset until decided
@brainstorm brainstorm merged commit 3427a32 into main Mar 12, 2026
8 checks passed
@brainstorm brainstorm deleted the auth branch March 12, 2026 16:27
@jubeormk1 jubeormk1 mentioned this pull request Mar 17, 2026
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants