Commit 24a476f

Merge pull request #45 from default50/elb_improvements
Improvements for ELB functionality
2 parents e907cba + 99c91fe commit 24a476f

4 files changed: +264 -91 lines changed

load-balancing/elb/README.md

Lines changed: 43 additions & 40 deletions
@@ -1,52 +1,55 @@
11
# ELB and ASG lifecycle event scripts
22

3-
Often when running a web service, you'll have your instances behind a load balancer. But when
4-
deploying new code to these instances, you don't want the load balancer to continue sending customer
5-
traffic to an instance while the deployment is in progress. Lifecycle event scripts give you the
6-
ability to integrate your AWS CodeDeploy deployments with instances that are behind an Elastic Load
7-
Balancer or in an Auto Scaling group. Simply set the name (or names) of the Elastic Load Balancer
8-
your instances are a part of, set the scripts in the appropriate lifecycle events, and the scripts
9-
will take care of deregistering the instance, waiting for connection draining, and re-registering
10-
after the deployment finishes.
3+
Often when running a web service, you'll have your instances behind a load balancer. But when deploying new code to these instances, you don't want the load balancer to continue sending customer traffic to an instance while the deployment is in progress. Lifecycle event scripts give you the ability to integrate your AWS CodeDeploy deployments with instances that are behind an Elastic Load Balancer or in an Auto Scaling group. Simply set the name (or names) of the Elastic Load Balancer your instances are a part of, set the scripts in the appropriate lifecycle events, and the scripts will take care of deregistering the instance, waiting for connection draining, and re-registering after the deployment finishes.
114

125
## Requirements
136

14-
The register and deregister scripts have a couple of dependencies in order to properly interact with
15-
Elastic Load Balancing and AutoScaling:
16-
17-
1. The [AWS CLI](http://aws.amazon.com/cli/). In order to take advantage of
18-
AutoScaling's Standby feature, the CLI must be at least version 1.3.25. If you
19-
have Python and PIP already installed, the CLI can simply be installed with `pip
20-
install awscli`. Otherwise, follow the [installation instructions](http://docs.aws.amazon.com/cli/latest/userguide/installing.html)
21-
in the CLI's user guide.
22-
1. An instance profile with a policy that allows, at minimum, the following actions:
23-
24-
```
25-
elasticloadbalancing:Describe*
26-
elasticloadbalancing:DeregisterInstancesFromLoadBalancer
27-
elasticloadbalancing:RegisterInstancesWithLoadBalancer
28-
autoscaling:Describe*
29-
autoscaling:EnterStandby
30-
autoscaling:ExitStandby
31-
autoscaling:UpdateAutoScalingGroup
32-
```
33-
34-
Note: the AWS CodeDeploy Agent requires that an instance profile be attached to all instances that
35-
are to participate in AWS CodeDeploy deployments. For more information on creating an instance
36-
profile for AWS CodeDeploy, see the [Create an IAM Instance Profile for Your Amazon EC2 Instances]()
37-
topic in the documentation.
38-
1. All instances are assumed to already have the AWS CodeDeploy Agent installed.
7+
The register and deregister scripts have a couple of dependencies in order to properly interact with Elastic Load Balancing and AutoScaling:
8+
9+
1. The [AWS CLI](http://aws.amazon.com/cli/). In order to take advantage of AutoScaling's Standby feature, the CLI must be at least version 1.3.25. If you have Python and PIP already installed, the CLI can simply be installed with `pip install awscli`. Otherwise, follow the [installation instructions](http://docs.aws.amazon.com/cli/latest/userguide/installing.html) in the CLI's user guide.
10+
11+
2. An instance profile with a policy that allows, at minimum, the following actions:
12+
13+
elasticloadbalancing:Describe*
14+
elasticloadbalancing:DeregisterInstancesFromLoadBalancer
15+
elasticloadbalancing:RegisterInstancesWithLoadBalancer
16+
autoscaling:Describe*
17+
autoscaling:EnterStandby
18+
autoscaling:ExitStandby
19+
autoscaling:UpdateAutoScalingGroup
20+
autoscaling:SuspendProcesses
21+
autoscaling:ResumeProcesses
22+
23+
**Note**: the AWS CodeDeploy Agent requires that an instance profile be attached to all instances that are to participate in AWS CodeDeploy deployments. For more information on creating an instance profile for AWS CodeDeploy, see the [Create an IAM Instance Profile for Your Amazon EC2 Instances](http://docs.aws.amazon.com/codedeploy/latest/userguide/how-to-create-iam-instance-profile.html) topic in the documentation.
24+
25+
3. All instances are assumed to already have the AWS CodeDeploy Agent installed.
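
The action list above could be turned into a policy document along the following lines. This is a minimal sketch, not the project's own policy: the function name `print_elb_asg_policy` is hypothetical, and `"Resource": "*"` is an assumption made for brevity that you may want to narrow.

```shell
#!/usr/bin/env bash
# Sketch: print an IAM policy document that allows the actions listed above.
# Assumption: "Resource": "*" is used for brevity; scope it down where you can.
# print_elb_asg_policy is a hypothetical helper, not part of these scripts.
print_elb_asg_policy() {
    cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "elasticloadbalancing:Describe*",
        "elasticloadbalancing:DeregisterInstancesFromLoadBalancer",
        "elasticloadbalancing:RegisterInstancesWithLoadBalancer",
        "autoscaling:Describe*",
        "autoscaling:EnterStandby",
        "autoscaling:ExitStandby",
        "autoscaling:UpdateAutoScalingGroup",
        "autoscaling:SuspendProcesses",
        "autoscaling:ResumeProcesses"
      ],
      "Resource": "*"
    }
  ]
}
EOF
}

# Print the policy to STDOUT; redirect it to a file to reuse it.
print_elb_asg_policy
```

The printed JSON can be saved to a file and attached to the role behind your instance profile, for example through the IAM console or the AWS CLI.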
3926

4027
## Installing the Scripts
4128

4229
To use these scripts in your own application:
4330

4431
1. Install the AWS CLI on all your instances.
45-
1. Update the policies on the EC2 instance profile to allow the above actions.
46-
1. Copy the `.sh` files in this directory into your application source.
47-
1. Edit your application's `appspec.yml` to run `deregister_from_elb.sh` on the ApplicationStop event,
48-
and `register_with_elb.sh` on the ApplicationStart event.
49-
1. Edit `common_functions.sh` to set `ELB_LIST` to contain the name(s) of the Elastic Load
50-
Balancer(s) your deployment group is a part of. Make sure the entries in ELB_LIST are separated by space.
51-
1. Deploy!
32+
2. Update the policies on the EC2 instance profile to allow the above actions.
33+
3. Copy the `.sh` files in this directory into your application source.
34+
4. Edit your application's `appspec.yml` to run `deregister_from_elb.sh` on the ApplicationStop event, and `register_with_elb.sh` on the ApplicationStart event.
35+
5. If your instance is not in an Auto Scaling Group, edit `common_functions.sh` to set `ELB_LIST` to contain the name(s) of the Elastic Load Balancer(s) your deployment group is a part of. Make sure the entries in ELB_LIST are separated by space.
36+
Alternatively, you can set `ELB_LIST` to `_all_` to automatically use all load balancers the instance is registered to, or `_any_` to get the same behaviour as `_all_` but without failing your deployments if the instance is not part of any ASG or ELB. This is more flexible in heterogeneous tag-based Deployment Groups.
37+
6. Optionally set up `HANDLE_PROCS=true` in `common_functions.sh`. See note below.
38+
7. Deploy!
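
As an illustration of steps 5 and 6, the settings in `common_functions.sh` might look like the sketch below. The load balancer names are made up, and `describe_elb_mode` is a hypothetical helper that only restates the behaviour described above; it is not part of the scripts.

```shell
#!/usr/bin/env bash
# Sketch of the two settings edited in common_functions.sh.
# The load balancer names below are hypothetical examples.

# Explicit, space-separated load balancer names:
ELB_LIST="my-web-elb my-api-elb"
# Alternatives described in the README:
# ELB_LIST="_all_"   # use every load balancer the instance is registered to
# ELB_LIST="_any_"   # like _all_, but do not fail when none is found

# Optional: let the scripts suspend and resume AutoScaling processes.
HANDLE_PROCS=true

# describe_elb_mode is a hypothetical helper, for illustration only.
describe_elb_mode() {
    case "$1" in
        "")    echo "no load balancers configured" ;;
        _all_) echo "auto-discover, fail if none found" ;;
        _any_) echo "auto-discover, tolerate none" ;;
        *)     echo "explicit list: $1" ;;
    esac
}

describe_elb_mode "$ELB_LIST"
```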
39+
40+
## Important notice about handling AutoScaling processes
41+
42+
When using AutoScaling with CodeDeploy you have to consider some edge cases during the deployment time window:
43+
44+
1. If you have a scale up event, the new instance(s) will get the latest successful *Revision*, and not the one you are currently deploying. You will end up with a fleet of mixed revisions.
45+
2. If you have a scale down event, instances are going to be terminated, and your deployment will (probably) fail.
46+
3. If your instances are not balanced across Availability Zones **and you are** using these scripts, AutoScaling may terminate some instances or create new ones to maintain balance (see [this doc](http://docs.aws.amazon.com/autoscaling/latest/userguide/as-suspend-resume-processes.html#process-types)), interfering with your deployment.
47+
4. If you have the health checks of your AutoScaling Group based off the ELB's ([documentation](http://docs.aws.amazon.com/autoscaling/latest/userguide/healthcheck.html)) **and you are not** using these scripts, then instances will be marked as unhealthy and terminated, interfering with your deployment.
48+
49+
In an effort to solve these cases, the scripts can suspend some AutoScaling processes (AZRebalance, AlarmNotification, ScheduledActions and ReplaceUnhealthy) while deploying, to avoid those events happening in the middle of your deployment. You only have to set up `HANDLE_PROCS=true` in `common_functions.sh`.
50+
51+
The scripts keep a record (on each instance) of the processes that were already suspended before the deployment started, so that when the deployment finishes the processes on the AutoScaling Group are returned to the state they were in before. For example, if AZRebalance was suspended manually, it will not be resumed. However, if the scripts don't run (failed deployment), you may end up with stale suspended processes.
52+
53+
Disclaimer: There's a small chance that an event is triggered while the deployment is progressing from one instance to another. The only way to avoid that completely would be to monitor the deployment externally to CodeDeploy/AutoScaling and act accordingly. Whether that extra effort is worthwhile depends on each use case.
5254

55+
**WARNING**: If you are using this functionality, you should only use the *CodeDeployDefault.OneAtATime* deployment configuration to ensure serial execution of the scripts. Concurrent runs are not supported.
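
The record of previously suspended processes described above can be illustrated with a small, self-contained sketch. It makes no AWS calls and uses a throwaway temp file as the flag file. The helper `processes_to_resume` is hypothetical, and `set_flag` and `get_flag` here are simplified stand-ins for the functions of the same name in `common_functions.sh`.

```shell
#!/usr/bin/env bash
# Self-contained sketch of the flag-file bookkeeping; no AWS calls are made.
# FLAGFILE is a throwaway temp file here, not the path the scripts use.
FLAGFILE="$(mktemp)"

# Append "<flag>=<value>" to the flag file.
set_flag() {
    echo "$1=$2" >> "$FLAGFILE"
}

# Print the value stored for <flag>; return non-zero if the file is unreadable.
get_flag() {
    [ -r "$FLAGFILE" ] || return 1
    awk -F= -v flag="$1" '$1 == flag { print $2 }' "$FLAGFILE"
}

# Hypothetical helper: print the processes that should be resumed, i.e. every
# managed process NOT recorded as suspended before the deployment started.
processes_to_resume() {
    local p value result
    result=""
    for p in AZRebalance AlarmNotification ScheduledActions ReplaceUnhealthy; do
        # Assign separately from `local` so get_flag's exit status is visible.
        if ! value=$(get_flag "$p"); then
            echo "flag file unreadable" >&2
            return 1
        fi
        [ "$value" = "true" ] || result="${result:+$result }$p"
    done
    echo "$result"
}

# Pretend AZRebalance had been suspended manually before the deployment.
set_flag "AZRebalance" "true"
processes_to_resume
rm -f "$FLAGFILE"
```

In this run AZRebalance is left out of the printed list, which mirrors the rule that a manually suspended process is not resumed.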

load-balancing/elb/common_functions.sh

Lines changed: 159 additions & 29 deletions
@@ -14,7 +14,10 @@
1414
# permissions and limitations under the License.
1515

1616
# ELB_LIST defines which Elastic Load Balancers this instance should be part of.
17-
# The elements in ELB_LIST should be seperated by space.
17+
# The elements in ELB_LIST should be separated by space. Safe default is "".
18+
# Set to "_all_" to automatically find all load balancers the instance is registered to.
19+
# Setting it to "_any_" works like "_all_" but does not fail if the instance is not attached to
20+
# any ASG or ELB, giving flexibility.
1821
ELB_LIST=""
1922

2023
# Under normal circumstances, you shouldn't need to change anything below this line.
@@ -34,6 +37,12 @@ WAITER_INTERVAL=1
3437
# AutoScaling Standby features at minimum require this version to work.
3538
MIN_CLI_VERSION='1.3.25'
3639

40+
# Create a flagfile for each deployment
41+
FLAGFILE="/tmp/asg_codedeploy_flags-$DEPLOYMENT_GROUP_ID-$DEPLOYMENT_ID"
42+
43+
# Handle ASG processes
44+
HANDLE_PROCS=false
45+
3746
# Usage: get_instance_region
3847
#
3948
# Writes to STDOUT the AWS region as known by the local instance.
@@ -49,6 +58,123 @@ get_instance_region() {
4958

5059
AWS_CLI="aws --region $(get_instance_region)"
5160

61+
# Usage: set_flag <flag> <value>
62+
#
63+
# Writes <flag>=<value> to FLAGFILE
64+
set_flag() {
65+
if echo "$1=$2" >> $FLAGFILE; then
66+
return 0
67+
else
68+
error_exit "Unable to write flag \"$1=$2\" to $FLAGFILE"
69+
fi
70+
}
71+
72+
# Usage: get_flag <flag>
73+
#
74+
# Checks for <flag> in FLAGFILE. Echoes its value and returns 0 on success or non-zero if it fails to read the file.
75+
get_flag() {
76+
if [ -r $FLAGFILE ]; then
77+
local result=$(awk -F= -v flag="$1" '{if ( $1 == flag ) {print $2}}' $FLAGFILE)
78+
echo "${result}"
79+
return 0
80+
else
81+
# FLAGFILE doesn't exist
82+
return 1
83+
fi
84+
}
85+
86+
# Usage: check_suspended_processes
87+
#
88+
# Checks which processes are suspended on the ASG before beginning and stores them in
89+
# the FLAGFILE to avoid resuming them afterwards. Also aborts if the Launch process
90+
# is suspended.
91+
check_suspended_processes() {
92+
# Get suspended processes in an array
93+
local suspended=($($AWS_CLI autoscaling describe-auto-scaling-groups \
94+
--auto-scaling-group-name "${asg_name}" \
95+
--query 'AutoScalingGroups[].SuspendedProcesses' \
96+
--output text | awk '{printf $1" "}'))
97+
98+
if [ ${#suspended[@]} -eq 0 ]; then
99+
msg "No processes were suspended on the ASG before starting."
100+
else
101+
msg "These processes were suspended on the ASG before starting: ${suspended[*]}"
102+
fi
103+
104+
# If "Launch" process is suspended abort because we will not be able to recover from StandBy. Note the "[[ ... =~" bashism.
105+
if [[ "${suspended[@]}" =~ "Launch" ]]; then
106+
error_exit "'Launch' process of AutoScaling is suspended which will not allow us to recover the instance from StandBy. Aborting."
107+
fi
108+
109+
for process in ${suspended[@]}; do
110+
set_flag "$process" "true"
111+
done
112+
}
113+
114+
# Usage: suspend_processes
115+
#
116+
# Suspend processes known to cause problems during deployments.
117+
# The API call is idempotent so it doesn't matter if any were previously suspended.
118+
suspend_processes() {
119+
local -a processes=(AZRebalance AlarmNotification ScheduledActions ReplaceUnhealthy)
120+
121+
msg "Suspending ${processes[*]} processes"
122+
$AWS_CLI autoscaling suspend-processes \
123+
--auto-scaling-group-name "${asg_name}" \
124+
--scaling-processes ${processes[@]}
125+
if [ $? != 0 ]; then
126+
error_exit "Failed to suspend ${processes[*]} processes for ASG ${asg_name}. Aborting as this may cause issues."
127+
fi
128+
}
129+
130+
# Usage: resume_processes
131+
#
132+
# Resumes the suspended processes, except for those that were already suspended before the deployment.
133+
resume_processes() {
134+
local -a processes=(AZRebalance AlarmNotification ScheduledActions ReplaceUnhealthy)
135+
local -a to_resume
136+
137+
for p in ${processes[@]}; do
138+
local tmp_flag_value; if ! tmp_flag_value=$(get_flag "$p"); then  # assign separately: `local` would mask get_flag's exit status
139+
error_exit "$FLAGFILE doesn't exist or is unreadable"
140+
elif [ ! "$tmp_flag_value" = "true" ] ; then
141+
to_resume=("${to_resume[@]}" "$p")
142+
fi
143+
done
144+
145+
msg "Resuming ${to_resume[*]} processes"
146+
$AWS_CLI autoscaling resume-processes \
147+
--auto-scaling-group-name "${asg_name}" \
148+
--scaling-processes ${to_resume[@]}
149+
if [ $? != 0 ]; then
150+
error_exit "Failed to resume ${to_resume[*]} processes for ASG ${asg_name}. Aborting as this may cause issues."
151+
fi
152+
}
153+
154+
# Usage: remove_flagfile
155+
#
156+
# Removes FLAGFILE. Returns non-zero if failure.
157+
remove_flagfile() {
158+
if rm $FLAGFILE; then
159+
msg "Successfully removed flagfile $FLAGFILE"
160+
return 0
161+
else
162+
msg "WARNING: Failed to remove flagfile $FLAGFILE."; return 1
163+
fi
164+
}
165+
166+
# Usage: finish_msg
167+
#
168+
# Prints some finishing statistics
169+
finish_msg() {
170+
msg "Finished $(basename $0) at $(/bin/date "+%F %T")"
171+
172+
end_sec=$(/bin/date +%s.%N)
173+
elapsed_seconds=$(echo "$end_sec" "$start_sec" | awk '{ print $1 - $2 }')
174+
175+
msg "Elapsed time: $elapsed_seconds"
176+
}
177+
52178
# Usage: autoscaling_group_name <EC2 instance ID>
53179
#
54180
# Prints to STDOUT the name of the AutoScaling group this instance is a part of and returns 0. If
@@ -101,6 +227,14 @@ autoscaling_enter_standby() {
101227
return 0
102228
fi
103229

230+
if [ "$HANDLE_PROCS" = "true" ]; then
231+
msg "Checking ASG ${asg_name} suspended processes"
232+
check_suspended_processes
233+
234+
# Suspend troublesome processes while deploying
235+
suspend_processes
236+
fi
237+
104238
msg "Checking to see if ASG ${asg_name} will let us decrease desired capacity"
105239
local min_desired=$($AWS_CLI autoscaling describe-auto-scaling-groups \
106240
--auto-scaling-group-name "${asg_name}" \
@@ -113,6 +247,7 @@ autoscaling_enter_standby() {
113247
if [ -z "$min_cap" -o -z "$desired_cap" ]; then
114248
msg "Unable to determine minimum and desired capacity for ASG ${asg_name}."
115249
msg "Attempting to put this instance into standby regardless."
250+
set_flag "asgmindecremented" "false"
116251
elif [ $min_cap == $desired_cap -a $min_cap -gt 0 ]; then
117252
local new_min=$(($min_cap - 1))
118253
msg "Decrementing ASG ${asg_name}'s minimum size to $new_min"
@@ -123,10 +258,13 @@ autoscaling_enter_standby() {
123258
msg "Failed to reduce ASG ${asg_name}'s minimum size to $new_min. Cannot put this instance into Standby."
124259
return 1
125260
else
126-
msg "ASG ${asg_name}'s minimum size has been decremented, creating flag file /tmp/asgmindecremented"
127-
# Create a "flag" file to denote that the ASG min has been decremented
128-
touch /tmp/asgmindecremented
261+
msg "ASG ${asg_name}'s minimum size has been decremented, creating flag in file $FLAGFILE"
262+
# Create a "flag" to denote that the ASG min has been decremented
263+
set_flag "asgmindecremented" "true"
129264
fi
265+
else
266+
msg "No need to decrement ASG ${asg_name}'s minimum size"
267+
set_flag "asgmindecremented" "false"
130268
fi
131269

132270
msg "Putting instance $instance_id into Standby"
@@ -192,7 +330,9 @@ autoscaling_exit_standby() {
192330
return 1
193331
fi
194332

195-
if [ -a /tmp/asgmindecremented ]; then
333+
local tmp_flag_value; if ! tmp_flag_value=$(get_flag "asgmindecremented"); then  # assign separately: `local` would mask get_flag's exit status
334+
error_exit "$FLAGFILE doesn't exist or is unreadable"
335+
elif [ "$tmp_flag_value" = "true" ]; then
196336
local min_desired=$($AWS_CLI autoscaling describe-auto-scaling-groups \
197337
--auto-scaling-group-name "${asg_name}" \
198338
--query 'AutoScalingGroups[0].[MinSize, DesiredCapacity]' \
@@ -207,16 +347,22 @@ autoscaling_exit_standby() {
207347
--min-size $new_min)
208348
if [ $? != 0 ]; then
209349
msg "Failed to increase ASG ${asg_name}'s minimum size to $new_min."
350+
remove_flagfile
210351
return 1
211352
else
212353
msg "Successfully incremented ASG ${asg_name}'s minimum size"
213-
msg "Removing /tmp/asgmindecremented flag file"
214-
rm -f /tmp/asgmindecremented
215354
fi
216355
else
217356
msg "Auto scaling group was not decremented previously, not incrementing min value"
218357
fi
219358

359+
if [ "$HANDLE_PROCS" = "true" ]; then
360+
# Resume processes, except for the ones suspended before deployment
361+
resume_processes
362+
fi
363+
364+
# Clean up the FLAGFILE
365+
remove_flagfile
220366
return 0
221367
}
222368

@@ -240,6 +386,9 @@ get_instance_state_asg() {
240386
fi
241387
}
242388

389+
# Usage: reset_waiter_timeout <ELB name> <state name>
390+
#
391+
# Resets the timeout value to account for the ELB timeout and also connection draining.
243392
reset_waiter_timeout() {
244393
local elb=$1
245394
local state_name=$2
@@ -396,30 +545,11 @@ validate_elb() {
396545
get_elb_list() {
397546
local instance_id=$1
398547

399-
local asg_name=$($AWS_CLI autoscaling describe-auto-scaling-instances \
400-
--instance-ids $instance_id \
401-
--query AutoScalingInstances[*].AutoScalingGroupName \
402-
--output text | sed -e $'s/\t/ /g')
403548
local elb_list=""
404549

405-
if [ -z "${asg_name}" ]; then
406-
msg "Instance is not part of an ASG. Looking up from ELB."
407-
local all_balancers=$($AWS_CLI elb describe-load-balancers \
408-
--query LoadBalancerDescriptions[*].LoadBalancerName \
409-
--output text | sed -e $'s/\t/ /g')
410-
for elb in $all_balancers; do
411-
local instance_health
412-
instance_health=$(get_instance_health_elb $instance_id $elb)
413-
if [ $? == 0 ]; then
414-
elb_list="$elb_list $elb"
415-
fi
416-
done
417-
else
418-
elb_list=$($AWS_CLI autoscaling describe-auto-scaling-groups \
419-
--auto-scaling-group-names "${asg_name}" \
420-
--query AutoScalingGroups[*].LoadBalancerNames \
421-
--output text | sed -e $'s/\t/ /g')
422-
fi
550+
elb_list=$($AWS_CLI elb describe-load-balancers \
551+
--query $'LoadBalancerDescriptions[].[join(`,`,Instances[?InstanceId==`'$instance_id'`].InstanceId),LoadBalancerName]' \
552+
--output text | grep $instance_id | awk '{ORS=" ";print $2}')
423553

424554
if [ -z "$elb_list" ]; then
425555
return 1
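
The `grep`/`awk` post-processing in the new `get_elb_list` can be tried in isolation with the sketch below. It does not call the AWS CLI. The `sample_rows` function is a hypothetical stand-in that assumes the text output is tab-separated, with the matching instance ID (or nothing) in the first column and the load balancer name in the second. The name `filter_elbs_for_instance` is also made up for this example.

```shell
#!/usr/bin/env bash
# Sketch of the grep/awk post-processing only; the AWS CLI is not called.
# sample_rows mimics (as an assumption) tab-separated text output where the
# first column is the matching instance ID (empty when the instance is not
# registered) and the second column is the load balancer name.
sample_rows() {
    printf 'i-0abc123\tweb-elb\n'
    printf '\tbatch-elb\n'
    printf 'i-0abc123\tapi-elb\n'
}

# Hypothetical helper: keep only the rows that mention the instance, then
# print the load balancer names on a single line, each followed by a space.
filter_elbs_for_instance() {
    local instance_id=$1
    grep "$instance_id" | awk '{ORS=" "; print $2}'
}

sample_rows | filter_elbs_for_instance "i-0abc123"
echo
```

When no row mentions the instance, the helper prints nothing, which is the empty-list case that the `[ -z "$elb_list" ]` check handles.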
