Commit 26fa215

Add proposal for new commands to retrieve container logs from CloudWatch.
1 parent fb37e03 commit 26fa215

1 file changed: +195 -0 lines changed
proposals/awslogs.md

@@ -0,0 +1,195 @@
# ECS CLI Logging

## Overview

This proposal lays out a design and implementation plan for an ECS CLI user experience for retrieving container logs from CloudWatch.

### Use Cases

1. User has a known error and wants to find more information about it.
2. User is doing a deployment and wants to tail the logs and grep for errors.
3. User wants to quickly set up CloudWatch Logs based on the configuration in their docker compose file, and have the CLI create any necessary log groups for them.
4. User wants to monitor their task or service, so they continually stream the logs.

## Phase 1 Solution
A top-level `ecs-cli logs` command that does not use the docker compose file. This allows it to be used by a wide array of ECS customers, not just compose users. The command lets customers find the logs for a given task.

```
ecs-cli logs --help
--follow                Stream logs (continuously poll for updates)
--task-id               [Required] View logs for a given Task ID
--task-def              Required with Task ID if the task has already been stopped. Format: family:revision
--filter-pattern        Substring to search for within the logs
--container-name, -c    Filter logs for a given container definition
--since                 Filter logs in the last X minutes (cannot be used with --start-time or --end-time)
--start-time            Filter logs within a time frame, use with --end-time
--end-time              Filter logs within a time frame, use with --start-time
--timestamps, -t        View time-stamps with the logs
```
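
The interaction between `--since`, `--start-time`, and `--end-time` can be sketched as a small validation step that produces the millisecond epoch bounds CloudWatch Logs expects. This is only an illustrative sketch: the function names are hypothetical, and it assumes that `--start-time` and `--end-time` may each be given alone, which the proposal does not settle.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional, Tuple


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(moment.timestamp() * 1000)


def resolve_time_window(
    now: datetime,
    since_minutes: Optional[int] = None,
    start_time: Optional[datetime] = None,
    end_time: Optional[datetime] = None,
) -> Tuple[Optional[int], Optional[int]]:
    """Return (start_ms, end_ms); None leaves that side of the window open."""
    if since_minutes is not None:
        # --since is mutually exclusive with an explicit time frame.
        if start_time is not None or end_time is not None:
            raise ValueError("--since cannot be used with --start-time or --end-time")
        if since_minutes <= 0:
            raise ValueError("--since must be a positive number of minutes")
        return to_epoch_ms(now - timedelta(minutes=since_minutes)), None
    if start_time is not None and end_time is not None and start_time > end_time:
        raise ValueError("--start-time must not be later than --end-time")
    start_ms = to_epoch_ms(start_time) if start_time is not None else None
    end_ms = to_epoch_ms(end_time) if end_time is not None else None
    return start_ms, end_ms
```

Passing `now` in explicitly, rather than reading the clock inside the function, keeps the logic easy to test.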

```
ecs-cli logs --task-id d86079d1-6858-45e9-8ce2-1ba881c55c12 --timestamps
Time-stamp             Message
2017-09-28 22:32:11    WordPress not found in /var/www/html - copying now...
2017-09-28 22:32:11    Complete! WordPress has been successfully copied to /var/www/html
2017-09-28 22:32:12    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message
2017-09-28 22:32:12    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message
2017-09-28 22:32:12    [Wed Sep 27 22:32:12.300422 2017] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.10 (Debian) PHP/5.6.31 configured -- resuming normal operations
2017-09-28 22:32:12    [Wed Sep 27 22:32:12.300456 2017] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'
```
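
CloudWatch Logs returns each event's `timestamp` as milliseconds since the epoch, so output like the above needs a formatting step. The sketch below is one possible way to do it; `format_event` is a hypothetical helper, and rendering in UTC with four spaces of padding is an assumption rather than something the proposal specifies.

```python
from datetime import datetime, timezone


def format_event(event: dict, show_timestamps: bool) -> str:
    """Render one log event as a line of output.

    `event` is assumed to have the `timestamp` (epoch milliseconds) and
    `message` keys that FilterLogEvents returns for each event.
    """
    message = event["message"].rstrip("\n")
    if not show_timestamps:
        return message
    stamp = datetime.fromtimestamp(event["timestamp"] / 1000, tz=timezone.utc)
    return f"{stamp:%Y-%m-%d %H:%M:%S}    {message}"
```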

### Implementation

- The logs implementation will not include any pagination: the command will return all logs that match the specified search. We expect most users to pipe the output of the command to a file, so this should not be a problem.
- If the user has not specified a log stream prefix in their task definition, the command will fail and print an error message, because without the log stream prefix we have no way of getting the logs for an individual task.
- For performance reasons, the command will only pull from *a single log group*. If the customer has not configured all of their container definitions to use the same log group, the command will fail with an error telling the customer to re-run the command with the `--container-name` argument. This way, only one log group needs to be queried.
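
The two failure conditions above (missing stream prefix, more than one log group) could be checked in a single pass over the container definitions. The following is a minimal sketch, not the CLI's actual implementation: `LogConfigError` and `resolve_log_target` are hypothetical names, and the input is assumed to be the `containerDefinitions` list from a DescribeTaskDefinition response. Skipping containers that do not use the `awslogs` driver is also an assumption.

```python
from typing import Dict, List, Optional, Tuple


class LogConfigError(Exception):
    """Raised when the task definition cannot be mapped to a single log group."""


def resolve_log_target(
    container_definitions: List[dict], container_name: Optional[str] = None
) -> Tuple[str, Dict[str, str]]:
    """Return (log_group, {container name: stream prefix}) for the selected containers."""
    groups = set()
    prefixes: Dict[str, str] = {}
    for container in container_definitions:
        if container_name is not None and container["name"] != container_name:
            continue
        log_config = container.get("logConfiguration") or {}
        if log_config.get("logDriver") != "awslogs":
            continue  # assumption: containers using other log drivers are ignored
        options = log_config.get("options") or {}
        if not options.get("awslogs-group"):
            raise LogConfigError(f"container {container['name']} has no awslogs-group")
        if not options.get("awslogs-stream-prefix"):
            raise LogConfigError(
                f"container {container['name']} has no awslogs-stream-prefix; "
                "logs cannot be matched to an individual task"
            )
        groups.add(options["awslogs-group"])
        prefixes[container["name"]] = options["awslogs-stream-prefix"]
    if not prefixes:
        raise LogConfigError("no matching container uses the awslogs driver")
    if len(groups) > 1:
        raise LogConfigError(
            "containers use different log groups; re-run with --container-name"
        )
    return groups.pop(), prefixes
```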

Workflow:
1. User gives Task ID.
2. Call DescribeTasks to get the Task Definition ARN (skip this step if the user provides `--task-def`).
3. Call DescribeTaskDefinition to get the Container Definitions.
4. From the Container Definitions, get the log configuration.
5. Create a list of the log streams in the log group that correspond to the task.
6. Call FilterLogEvents on the log group to get the log events.
7. Print log events.
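
Steps 5 and 6 of the workflow can be sketched as follows. The request keys (`logGroupName`, `logStreamNames`, `filterPattern`, `startTime`, `endTime`, `nextToken`) are FilterLogEvents parameters; everything else is hypothetical. In particular, `fetch_page` stands in for whatever function performs the API call, so the sketch makes no network requests itself. Although the command shows no pagination to the user, the API returns results in pages, so the loop follows `nextToken` until all events are collected.

```python
from typing import Callable, Dict, List, Optional


def build_filter_request(
    log_group: str,
    stream_prefixes: Dict[str, str],
    task_id: str,
    filter_pattern: Optional[str] = None,
    start_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
) -> dict:
    """Build FilterLogEvents parameters for one task's log streams."""
    # The ECS Agent names streams as prefix/container-name/task-id.
    streams = sorted(
        f"{prefix}/{name}/{task_id}" for name, prefix in stream_prefixes.items()
    )
    request: dict = {"logGroupName": log_group, "logStreamNames": streams}
    if filter_pattern:
        request["filterPattern"] = filter_pattern
    if start_ms is not None:
        request["startTime"] = start_ms
    if end_ms is not None:
        request["endTime"] = end_ms
    return request


def collect_events(fetch_page: Callable[[dict], dict], request: dict) -> List[dict]:
    """Call fetch_page repeatedly, following nextToken, and return every event."""
    events: List[dict] = []
    token = None
    while True:
        page_request = dict(request)
        if token:
            page_request["nextToken"] = token
        page = fetch_page(page_request)
        events.extend(page.get("events", []))
        token = page.get("nextToken")
        if not token:
            return events
```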


## Compose Logs (Phase 2)
Phase 2 is lower priority than Phase 1 and may not be implemented for some time. We welcome contributions from any customer who wishes to help start the implementation of Phase 2 sooner.

### Configure Logs
- Log configuration using the docker-compose file is already supported.
- Problem: customers are not required to specify a log stream prefix, but we effectively need one because of how the ECS Agent sets the log stream name. If a prefix is specified, the agent adds the container name and task ID to the log stream name, so we can use it to get the logs for each task. If a prefix is not specified, the log stream name is, for all intents and purposes, a random string (it's an ID picked by the docker daemon on the instance, which is meaningless from our point of view).
- The log stream will be named like this (by the ECS Agent): `prefix-name/container-name/ecs-task-id`
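
To illustrate why the prefix matters, here is a small sketch that splits a stream name of the form above back into its parts. `StreamName` and `parse_stream_name` are hypothetical names. Splitting from the right is a deliberate choice so that a prefix that itself contains `/` is still handled; a bare container ID (the no-prefix case) has no separators and is rejected.

```python
from typing import NamedTuple


class StreamName(NamedTuple):
    prefix: str
    container_name: str
    task_id: str


def parse_stream_name(stream_name: str) -> StreamName:
    """Split 'prefix/container-name/task-id' into its three components."""
    parts = stream_name.rsplit("/", 2)
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"not a prefixed ECS log stream name: {stream_name!r}")
    return StreamName(*parts)
```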

*Solution:* The existing ability to configure logs remains undisturbed, but a new flag, `--create-log-groups`, creates the necessary log group(s) in CloudWatch.
- The log configuration from the docker compose file will be read.
- If the user has not specified a log stream prefix, warn them that we are auto-setting it to a default value in their task definition.
- *Additionally*, even if `--create-log-groups` is not specified, if we detect that no prefix is configured in the docker compose file (but a log group and the awslogs driver are specified), the prefix will still be auto-set and the user will be warned. Technically this breaks backwards compatibility, but the risk is acceptable. It is very unlikely that ECS CLI users want their log streams named without a prefix. If no prefix is given, the ECS Agent sets the log stream name to the container ID, which is randomly generated by docker. Understanding this random ID requires logging into the underlying instances and retrieving information from the Docker daemon. For all intents and purposes, the container ID is meaningless from a customer standpoint.
- The ECS CLI is designed to simplify workflows and make ECS easier to understand. Therefore, we should be opinionated and protect users from accidentally configuring their logs poorly. We can help protect users from the less useful, complicated, legacy behavior of ECS.
- Additionally, the user will be warned that the default retention policy keeps all log events forever, so they will be charged for storage indefinitely. They can change the policy in the CloudWatch Console or with the AWS CLI.
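
The auto-set behavior described above might look roughly like the sketch below. It operates on the `logging` mapping of one compose service; `ensure_stream_prefix` is a hypothetical helper, and the default value `ecs-compose-` is taken from the sample warning output in this proposal rather than from any final design.

```python
import copy
from typing import List, Tuple

DEFAULT_STREAM_PREFIX = "ecs-compose-"  # value shown in the proposal's sample output


def ensure_stream_prefix(logging_config: dict) -> Tuple[dict, List[str]]:
    """Return a copy of the logging config with a stream prefix set, plus warnings.

    The prefix is only auto-set when the awslogs driver and a log group are
    configured but no prefix is present; other configurations are left unchanged.
    """
    updated = copy.deepcopy(logging_config)
    warnings: List[str] = []
    options = updated.get("options") or {}
    uses_awslogs = updated.get("driver") == "awslogs" and "awslogs-group" in options
    if uses_awslogs and not options.get("awslogs-stream-prefix"):
        options["awslogs-stream-prefix"] = DEFAULT_STREAM_PREFIX
        updated["options"] = options
        warnings.append(
            "You have not specified a log stream prefix, "
            f"auto-setting it to '{DEFAULT_STREAM_PREFIX}'"
        )
    return updated, warnings
```

Returning the warnings instead of printing them leaves the choice of logger to the caller.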

```
ecs-cli compose up --help
--create-log-groups     Creates any necessary log groups in CloudWatch.
```

```
ecs-cli compose up --create-log-groups
INFO[0000] Creating Resources in CloudWatch for your logs.
WARN[0001] You have not specified a log stream prefix, auto-setting it to 'ecs-compose-'
WARN[0002] By default, CloudWatch will store your logs forever, it is recommended that you set a retention policy.
```

*Suggested Configuration:*
- If the user has not specified a log configuration in their compose file, then using the `--create-log-groups` flag will fail and print a help message with the suggested configuration. Here is one possible idea:

For Services:
```
awslogs-group: ${cluster name}/${service name}
```
For Tasks:
```
awslogs-group: ${cluster name}/${task def family}
```
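
As a rough illustration of the help message, the sketch below builds the suggested `awslogs-group` value from the cluster name and either the service name or the task definition family. The function names and the wording of the message are hypothetical.

```python
def suggested_log_group(cluster_name: str, resource_name: str) -> str:
    """Suggested log group: cluster name, then service name or task def family."""
    return f"{cluster_name}/{resource_name}"


def missing_log_config_help(cluster_name: str, resource_name: str) -> str:
    """Message to print when --create-log-groups finds no log configuration."""
    return (
        "No log configuration found in the compose file. Suggested configuration:\n"
        f"  awslogs-group: {suggested_log_group(cluster_name, resource_name)}"
    )
```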

### View Logs
- New commands: `ecs-cli compose logs` and `ecs-cli compose service logs`
- *The log commands read the configuration in the user's docker compose file.*

*Solution:* In both docker-compose and the ECS task definition, logs are configured per container definition. In docker-compose, these are called services, and they must have names. Therefore, a user can view the logs per container definition. Since the agent adds the task ID to the log stream name, we can also list the logs for each task.

```
ecs-cli compose logs --help
--follow                Stream logs (continuously poll for updates)
--task-id               View logs for a given Task ID
--container-name, -c    View logs for a given container definition
--since                 View logs in the last X minutes (cannot be used with --start-time or --end-time)
--start-time            View logs within a time frame, use with --end-time
--end-time              View logs within a time frame, use with --start-time
--time-stamps, -t       View time-stamps with the logs
--output, -o            Output to a file
```

User's docker-compose file:
```
version: '2'
services:
  mysql:
    image: mysql
    cpu_shares: 100
    mem_limit: 524288000
    cap_add:
      - ALL
    logging:
      driver: awslogs
      options:
        awslogs-group: ecs-log-streaming
        awslogs-region: us-west-2
        awslogs-stream-prefix: mysql-logs
  wordpress:
    image: wordpress
    cpu_shares: 132
    mem_limit: 524288001
    ports:
      - "80:80"
    links:
      - mysql
    logging:
      driver: awslogs
      options:
        awslogs-group: ecs-log-streaming
        awslogs-region: us-west-2
        awslogs-stream-prefix: wordpress-logs
```

##### Examples

*User views logs for all MySQL Containers:*
- Outputs the logs for all containers running the given container definition. For example, if the user has 10 tasks running from this compose file, the logs for all 10 of the mysql containers will be printed. The output can be organized by task ID.

*Implementation Details:*
- From the compose file, we know the log group for this container definition. We can call DescribeLogStreams to get a list of the log streams. FilterLogEvents can then be called with the list of log streams to get the log events. Each returned log event has the log stream name associated with it, and that name contains the task ID.

```
ecs-cli compose logs --container-name mysql --time-stamps
INFO[0000] Showing logs for all mysql containers
_______________________________________
Task: d86079d1-6858-45e9-8ce2-1ba881c55c12
_______________________________________
Time-stamp             Message
2017-09-28 22:32:11    WordPress not found in /var/www/html - copying now...
2017-09-28 22:32:11    Complete! WordPress has been successfully copied to /var/www/html
2017-09-28 22:32:12    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message
2017-09-28 22:32:12    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message
_______________________________________
Task: d86079d1-6858-45e9-8ce2-1ba881c55c12
_______________________________________
Time-stamp             Message
2017-09-28 22:32:12    [Wed Sep 27 22:32:12.300422 2017] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.10 (Debian) PHP/5.6.31 configured -- resuming normal operations
2017-09-28 22:32:12    [Wed Sep 27 22:32:12.300456 2017] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'
```
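
Organizing output by task, as in the example above, amounts to grouping the returned events on the last component of their `logStreamName`. The following is a minimal sketch under that assumption; `group_events_by_task` is a hypothetical name, and events are assumed to carry the `logStreamName`, `timestamp`, and `message` keys that FilterLogEvents returns.

```python
from typing import Dict, List


def group_events_by_task(events: List[dict]) -> Dict[str, List[dict]]:
    """Group log events by task ID, keeping each group in timestamp order."""
    grouped: Dict[str, List[dict]] = {}
    for event in sorted(events, key=lambda e: e["timestamp"]):
        # Stream names end in the task ID: prefix/container-name/task-id.
        task_id = event["logStreamName"].rsplit("/", 1)[-1]
        grouped.setdefault(task_id, []).append(event)
    return grouped
```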


*User views logs for a given task:*
- Outputs the logs for a given task ID.
- The logs can be organized by container name.

*Implementation Details:*
- From the compose file, we know the log group for each container definition. We can call DescribeLogStreams to get a list of the log streams for each container definition. Each log stream contains the Task ID in its name, so we can then call FilterLogEvents with only the log streams for the given task ID as arguments. Each returned log event has the log stream name associated with it, and that name contains the container name.

```
ecs-cli compose logs --task-id <task-id> -t
Container: mysql
_______________________
Time-stamp             Message
2017-09-28 22:32:11    WordPress not found in /var/www/html - copying now...
2017-09-28 22:32:11    Complete! WordPress has been successfully copied to /var/www/html
2017-09-28 22:32:12    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message
2017-09-28 22:32:12    AH00558: apache2: Could not reliably determine the server's fully qualified domain name, using 172.17.0.3. Set the 'ServerName' directive globally to suppress this message
_______________________
Container: wordpress
_______________________
Time-stamp             Message
2017-09-28 22:32:12    [Wed Sep 27 22:32:12.300422 2017] [mpm_prefork:notice] [pid 1] AH00163: Apache/2.4.10 (Debian) PHP/5.6.31 configured -- resuming normal operations
2017-09-28 22:32:12    [Wed Sep 27 22:32:12.300456 2017] [core:notice] [pid 1] AH00094: Command line: 'apache2 -D FOREGROUND'
```
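
Selecting only the streams that belong to one task, and keying them by container name for the per-container output above, could be sketched like this. `streams_for_task` is a hypothetical helper; its input is assumed to be the `logStreamName` values from DescribeLogStreams responses.

```python
from typing import Dict, Iterable, List


def streams_for_task(stream_names: Iterable[str], task_id: str) -> Dict[str, List[str]]:
    """Map container name to the log streams that belong to the given task."""
    selected: Dict[str, List[str]] = {}
    for name in stream_names:
        # Expected shape: prefix/container-name/task-id. Anything else is skipped.
        parts = name.rsplit("/", 2)
        if len(parts) == 3 and parts[2] == task_id:
            selected.setdefault(parts[1], []).append(name)
    return selected
```

The values of the returned mapping can be flattened into the list of log stream names passed to FilterLogEvents.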
