Skip to content

CPD-11979: Replace SSH with Teleport (tsh ssh) for instance access#337

Open
shrikantbhosaleacquia wants to merge 5 commits intoproxy-cli-hook-aws-v3from
CPD-11979
Open

CPD-11979: Replace SSH with Teleport (tsh ssh) for instance access#337
shrikantbhosaleacquia wants to merge 5 commits intoproxy-cli-hook-aws-v3from
CPD-11979

Conversation

@shrikantbhosaleacquia
Copy link

  • Add TeleportConfig value object to compute proxy URL, account ID, hostname, and identity file from stack name and SSH user
  • Update SSHCommandBuilder to produce tsh ssh commands instead of ssh
  • Update Controller#ssh to pass TeleportConfig to SSHCommandBuilder
  • Update rotate_asg_instances SSH class to accept resources and build TeleportConfig from stack name at runtime
  • Fix rotate_asg_instances.rb to require individual aws-sdk gems
  • Add rexml to Gemfile (required by newer aws-sdk-core)
  • Update all affected specs for new command format and SSH constructor

Motivation

Fixes #NNN

Proposed changes

Alternatives considered

Testing steps

- Add TeleportConfig value object to compute proxy URL, account ID,
  hostname, and identity file from stack name and SSH user
- Update SSHCommandBuilder to produce tsh ssh commands instead of ssh
- Update Controller#ssh to pass TeleportConfig to SSHCommandBuilder
- Update rotate_asg_instances SSH class to accept resources and build
  TeleportConfig from stack name at runtime
- Fix rotate_asg_instances.rb to require individual aws-sdk gems
- Add rexml to Gemfile (required by newer aws-sdk-core)
- Update all affected specs for new command format and SSH constructor
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR replaces direct SSH access with Teleport (tsh ssh) for connecting to AWS EC2 instances. The change introduces a new TeleportConfig class that determines the appropriate Teleport proxy URL, AWS account ID, and identity file based on the stack name, SSH user, and region. The SSHCommandBuilder has been updated to generate tsh ssh commands instead of traditional ssh commands, and all affected code including the rotate_asg_instances plugin has been updated accordingly.

Changes:

  • Introduced TeleportConfig value object to compute Teleport connection parameters from stack name, SSH user, and AWS region
  • Updated SSHCommandBuilder to generate tsh ssh commands with Teleport-specific flags instead of traditional ssh commands
  • Modified rotate_asg_instances plugin to accept resources parameter and build TeleportConfig at runtime
  • Replaced monolithic aws-sdk dependency with individual gems (aws-sdk-autoscaling, aws-sdk-ec2) in rotate_asg_instances plugin
  • Added rexml gem to test dependencies (required by newer aws-sdk-core)

Reviewed changes

Copilot reviewed 11 out of 11 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
lib/moonshot/teleport_config.rb New class that encapsulates Teleport SSH configuration logic, determining proxy URLs, account IDs, and identity files based on environment detection
lib/moonshot/ssh_command_builder.rb Refactored to accept TeleportConfig and build tsh ssh commands instead of traditional ssh commands
lib/moonshot/controller.rb Updated ssh method to create TeleportConfig from stack name and region, passing it to SSHCommandBuilder
lib/plugins/rotate_asg_instances/ssh.rb Updated to accept resources parameter in constructor and build TeleportConfig for each command
lib/plugins/rotate_asg_instances/asg.rb Updated to pass resources when initializing SSH class
lib/plugins/rotate_asg_instances.rb Replaced monolithic aws-sdk require with specific gem requires
Gemfile Added rexml gem to test dependencies
spec/moonshot/teleport_config_spec.rb Comprehensive test coverage for TeleportConfig including prod/dev environments and bot users
spec/moonshot/ssh_spec.rb Updated tests to expect tsh ssh commands and Teleport hostnames
spec/moonshot/plugins/rotate_asg_instances/ssh_spec.rb Updated to pass resources parameter to SSH constructor
spec/moonshot/plugins/rotate_asg_instances/asg_spec.rb Updated shutdown_instance tests to expect tsh ssh commands with Teleport parameters

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

cmd << "-l #{@config.ssh_user}" if @config.ssh_user
cmd << instance_ip
cmd << "--proxy=#{@teleport_config.proxy_url}"
cmd << "-ti #{@teleport_config.identity_file}" if @teleport_config.bot_user?
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The identity file flag syntax is incorrect. The code constructs a string like '-ti tbot-auth-us-east-1/identity' which is then added as a single command argument. This means the shell will interpret this as a single token '-ti tbot-auth-us-east-1/identity', not as flags '-t' and '-i' with an argument. The correct approach is either: (1) add them as separate array elements before joining: cmd << '-t' << '-i' << @teleport_config.identity_file, or (2) use tsh's long-form option: cmd << "--identity-file=#{@teleport_config.identity_file}" << '-t'. Note that '-tA' on the next line already includes '-t', so you may only need '-i' plus the identity file for bot users.

Suggested change
cmd << "-ti #{@teleport_config.identity_file}" if @teleport_config.bot_user?
cmd << "--identity-file=#{@teleport_config.identity_file}" if @teleport_config.bot_user?

Copilot uses AI. Check for mistakes.
Comment on lines +4 to +7
# Encapsulates Teleport SSH configuration derived from the stack name,
# SSH user, and AWS region. Determines the correct proxy URL, account ID,
# and identity file path for both normal users and bot users.
class TeleportConfig
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The class documentation states it 'Encapsulates Teleport SSH configuration' but doesn't explain the key decision logic: how prod vs dev environments are determined (by checking if 'prod' appears in the stack name) or what account IDs these constants represent. Consider adding more detailed documentation explaining the environment detection logic, the meaning of the hardcoded account IDs, and example stack names that would be classified as prod vs dev.

Copilot uses AI. Check for mistakes.
Comment on lines +189 to +190
teleport_config = TeleportConfig.new(
stack.name, @config.ssh_config.ssh_user, ENV['AWS_REGION']
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AWS_REGION environment variable is accessed without validation. If ENV['AWS_REGION'] is nil or empty, this will create a TeleportConfig with an empty string for the region parameter (due to .to_s conversion), which could lead to incorrect proxy URLs in production environments (e.g., ".teleport.cloudservices.acquia.io") and malformed hostnames. Consider adding validation to raise a clear error message if AWS_REGION is not set, similar to how CodeDeploySetup handles this case.

Suggested change
teleport_config = TeleportConfig.new(
stack.name, @config.ssh_config.ssh_user, ENV['AWS_REGION']
region = ENV['AWS_REGION'].to_s.strip
if region.empty?
raise 'AWS_REGION environment variable must be set (for example, "us-east-1") to establish a Teleport SSH connection.'
end
teleport_config = TeleportConfig.new(
stack.name, @config.ssh_config.ssh_user, region

Copilot uses AI. Check for mistakes.
Comment on lines +36 to +39
teleport_config = Moonshot::TeleportConfig.new(
@resources.controller.stack.name,
Moonshot.config.ssh_config.ssh_user,
ENV['AWS_REGION']
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The AWS_REGION environment variable is accessed without validation. If ENV['AWS_REGION'] is nil or empty, this will create a TeleportConfig with an empty string for the region parameter (due to .to_s conversion), which could lead to incorrect proxy URLs in production environments (e.g., ".teleport.cloudservices.acquia.io") and malformed hostnames. Consider adding validation to raise a clear error message if AWS_REGION is not set, similar to how CodeDeploySetup handles this case.

Suggested change
teleport_config = Moonshot::TeleportConfig.new(
@resources.controller.stack.name,
Moonshot.config.ssh_config.ssh_user,
ENV['AWS_REGION']
region = ENV['AWS_REGION']
if region.nil? || region.empty?
raise 'AWS_REGION environment variable must be set for SSH teleport configuration'
end
teleport_config = Moonshot::TeleportConfig.new(
@resources.controller.stack.name,
Moonshot.config.ssh_config.ssh_user,
region

Copilot uses AI. Check for mistakes.
Comment on lines +189 to +190
teleport_config = TeleportConfig.new(
stack.name, @config.ssh_config.ssh_user, ENV['AWS_REGION']
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If @config.ssh_config.ssh_user is nil (for example, if neither the environment variable MOONSHOT_SSH_USER nor a configuration value is set), it will be converted to an empty string by TeleportConfig. This will result in an invalid Teleport hostname format like '@i-xxx.region.account' (missing username) and bot_user? will always return false (since '' != 'clouddatabot'). Consider validating that ssh_user is set before constructing the TeleportConfig, or add validation within TeleportConfig.initialize.

Suggested change
teleport_config = TeleportConfig.new(
stack.name, @config.ssh_config.ssh_user, ENV['AWS_REGION']
ssh_user = @config.ssh_config.ssh_user
if ssh_user.nil? || ssh_user.to_s.empty?
raise 'SSH user is not configured. Please set MOONSHOT_SSH_USER or configure ssh_user in your Moonshot configuration.'
end
teleport_config = TeleportConfig.new(
stack.name, ssh_user, ENV['AWS_REGION']

Copilot uses AI. Check for mistakes.
private

def prod?
@stack_name.include?('prod')
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The prod? method uses a simple string inclusion check (include?('prod')) to determine the environment. This could lead to false positives if 'prod' appears anywhere in the stack name, not just as the environment identifier. For example, a stack named 'product-dev' or 'reproduce-test' would incorrectly be identified as production. Consider using a more robust pattern match or explicit environment detection.

Suggested change
@stack_name.include?('prod')
@stack_name.match?(/\bprod\b/)

Copilot uses AI. Check for mistakes.

attr_reader :proxy_url, :account_id, :region, :ssh_user

def initialize(stack_name, ssh_user, region)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The initialize method converts nil parameters to empty strings via .to_s without validation. This leads to multiple issues: (1) empty region produces malformed URLs like '.teleport.cloudservices.acquia.io' and invalid identity paths like 'tbot-auth-/identity', (2) empty ssh_user produces invalid Teleport hostnames like '@instance.region.account', (3) empty stack_name will cause incorrect environment detection (will default to dev since 'prod' won't be found). Add validation to ensure all three parameters are present and non-empty before proceeding.

Suggested change
def initialize(stack_name, ssh_user, region)
def initialize(stack_name, ssh_user, region)
missing_params = []
missing_params << 'stack_name' if stack_name.to_s.strip.empty?
missing_params << 'ssh_user' if ssh_user.to_s.strip.empty?
missing_params << 'region' if region.to_s.strip.empty?
unless missing_params.empty?
raise ArgumentError, "TeleportConfig requires non-empty: #{missing_params.join(', ')}"
end

Copilot uses AI. Check for mistakes.
cmd << "-l #{@config.ssh_user}" if @config.ssh_user
cmd << instance_ip
cmd << "--proxy=#{@teleport_config.proxy_url}"
cmd << "-ti #{@teleport_config.identity_file}" if @teleport_config.bot_user?
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no test coverage for the bot user path in SSHCommandBuilder. When bot_user? returns true, the command includes '-ti' followed by the identity file path, but this code path is not tested. Consider adding test cases that verify the correct command format when using bot users (clouddatabot) to ensure the identity file flag is properly included in the tsh ssh command.

Suggested change
cmd << "-ti #{@teleport_config.identity_file}" if @teleport_config.bot_user?
cmd << "-ti #{@teleport_config.identity_file}" if @teleport_config.identity_file

Copilot uses AI. Check for mistakes.
def initialize(ssh_config, instance_id)
@config = ssh_config
@instance_id = instance_id
def initialize(ssh_config, instance_id, teleport_config)
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SSHCommandBuilder constructor signature has changed from 2 parameters (ssh_config, instance_id) to 3 parameters (ssh_config, instance_id, teleport_config). This is a breaking API change that could affect any external code or plugins that construct SSHCommandBuilder instances directly. While the code in this PR has been updated accordingly, consider whether this warrants a major version bump or additional documentation about the breaking change.

Copilot uses AI. Check for mistakes.
Comment on lines +8 to +9
PROD_ACCOUNT_ID = '546349603759'
DEV_ACCOUNT_ID = '672327909798'
Copy link

Copilot AI Feb 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoded AWS account IDs (PROD_ACCOUNT_ID and DEV_ACCOUNT_ID) are exposed in the code. While account IDs are generally not considered highly sensitive secrets on their own, hardcoding them makes it difficult to support additional environments or accounts without code changes. Consider whether these should be configurable via environment variables or configuration files to improve flexibility and follow the principle of configuration over hardcoding.

Suggested change
PROD_ACCOUNT_ID = '546349603759'
DEV_ACCOUNT_ID = '672327909798'
PROD_ACCOUNT_ID = ENV['TELEPORT_PROD_ACCOUNT_ID'] || '546349603759'
DEV_ACCOUNT_ID = ENV['TELEPORT_DEV_ACCOUNT_ID'] || '672327909798'

Copilot uses AI. Check for mistakes.
Change: minor
Purpose: feature
Change: minor
Purpose: feature
Change: minor
Purpose: feature
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +10 to 14
def initialize(ssh_config, instance_id, teleport_config)
@config = ssh_config
@instance_id = instance_id
@teleport_config = teleport_config
end
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSHCommandBuilder's initializer now requires a teleport_config, but there is still at least one in-repo caller using the old 2-arg form (lib/moonshot/tools/asg_rollout/hook_exec_environment.rb:16). This will raise ArgumentError at runtime; either update that caller to construct/pass a TeleportConfig, or make teleport_config optional with a backwards-compatible fallback.

Copilot uses AI. Check for mistakes.
Comment on lines 9 to 14
group :test do
gem 'codeclimate-test-reporter'
gem 'pry'
gem 'rexml'
gem 'rubocop'
end
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rexml is added only in the :test group, but the PR description indicates it's required by newer aws-sdk-core (a runtime dependency). If this gem is needed at runtime, it should be added as a normal dependency (e.g., in the gemspec) or outside the :test group so non-test installs don't break.

Copilot uses AI. Check for mistakes.
Comment on lines +15 to +16
PROD_IDENTITY_TEMPLATE = '/opt/machine-id/%<region>s/identity'
DEV_IDENTITY = '/opt/machine-id/dev-us-east-1/identity'
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TeleportConfig's identity file constants use '/opt/machine-id/...', but the new spec expects 'tbot-auth-.../identity' paths for the bot user. As written, TeleportConfig#identity_file will not match the spec (and likely the intended runtime path). Please align the constants/logic and the spec so they agree on the actual identity file location.

Suggested change
PROD_IDENTITY_TEMPLATE = '/opt/machine-id/%<region>s/identity'
DEV_IDENTITY = '/opt/machine-id/dev-us-east-1/identity'
PROD_IDENTITY_TEMPLATE = '/opt/tbot-auth-%<region>s/identity'
DEV_IDENTITY = '/opt/tbot-auth-dev-us-east-1/identity'

Copilot uses AI. Check for mistakes.
Comment on lines +20 to +24
def initialize(stack_name, ssh_user, region)
@stack_name = stack_name.to_s
@ssh_user = ssh_user.to_s
@region = region.to_s

Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TeleportConfig coerces a nil region to an empty string via region.to_s, which can silently build an invalid proxy URL/hostname when AWS_REGION is unset. Previously this would fail earlier via AWS SDK missing-region behavior; consider validating region is present/non-empty and raising an ArgumentError when it isn't.

Copilot uses AI. Check for mistakes.
Comment on lines +17 to +21
cmd = ['tsh', 'ssh']
cmd << @config.ssh_options if @config.ssh_options
cmd << "-i #{@config.ssh_identity_file}" if @config.ssh_identity_file
cmd << "-l #{@config.ssh_user}" if @config.ssh_user
cmd << instance_ip
cmd << "--proxy=#{@teleport_config.proxy_url}"
cmd << "-i #{@teleport_config.identity_file}" if @teleport_config.bot_user?
cmd << "#{@teleport_config.ssh_user}@#{instance_host}"
Copy link

Copilot AI Mar 11, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SSHCommandBuilder no longer adds any default TTY/agent-forwarding flags, but the updated specs expect -tA to always be present (similar to the old ssh -t behavior). If -tA is required for Teleport access, add it to the generated command (in the same position/order the specs expect) or ensure it is set consistently via configuration.

Copilot uses AI. Check for mistakes.
Change: minor
Purpose: feature
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants