Add APM Demo Data Test #112

Merged (5 commits) on May 16, 2025
`data_test/Readme.md` (new file, 314 additions)
# AWS Application Signals Testing Framework

This directory contains a comprehensive testing framework for validating the APM Demo App's functionality by running data validations against the traces, logs, and metrics it should generate.

## Overview

The testing framework consists of three main components:

1. **Logs Testing** (`run_logs_tests.py`)
   - Tests CloudWatch Logs queries and validations
   - Supports various validation types, including count checks and field content validation

2. **Metrics Testing** (`run_metrics_tests.py`)
   - Tests CloudWatch Metrics data collection and threshold validations
   - Supports business-hours and non-business-hours testing
   - Includes various comparison operators for threshold validation

3. **Trace Testing** (`run_trace_tests.py`)
   - Tests AWS X-Ray trace collection and analysis
   - Supports multiple validation types, including:
     - Trace count validation
     - Metadata checks
     - Attribute existence and value matching
     - Exception and error code validation
     - HTTP status code validation

## Directory Structure

```
data_test/
├── run_logs_tests.py
├── run_metrics_tests.py
├── run_trace_tests.py
├── deploy/
├── test_cases/
└── lambda/
```

## Prerequisites

- Python 3.x
- AWS CLI configured with appropriate credentials
- Required AWS permissions for:
- CloudWatch Logs
- CloudWatch Metrics
- X-Ray

## Usage

Each test script follows a similar pattern and requires a JSON file containing test cases:

```bash
# For Logs Testing
python3 run_logs_tests.py <path_to_test_cases.json>

# For Metrics Testing
python3 run_metrics_tests.py <path_to_test_cases.json>

# For Trace Testing
python3 run_trace_tests.py <path_to_test_cases.json>
```

## Test Case Format

### Logs Test Case Example
```json
{
  "log_test_cases": [
    {
      "test_case_id": "test-1",
      "description": "Test log group query",
      "log_group_names": ["/aws/lambda/example"],
      "query_string": "fields @timestamp, @message",
      "time_range": {
        "relative_minutes": 60
      },
      "validation_checks": [
        {
          "check_type": "count",
          "expected_count": 1,
          "comparison_operator": "GreaterThanOrEqualToThreshold"
        }
      ]
    }
  ]
}
```
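
At its core, a logs test like this maps onto the CloudWatch Logs Insights API. The following is a minimal sketch of a `count` check using boto3; it illustrates the shape of the test case above, not the script's actual implementation:

```python
import time
from datetime import datetime, timedelta, timezone

import boto3

logs = boto3.client("logs")

def run_count_check(case):
    """Run the test case's Logs Insights query and apply its 'count' check."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=case["time_range"]["relative_minutes"])
    query = logs.start_query(
        logGroupNames=case["log_group_names"],
        startTime=int(start.timestamp()),
        endTime=int(end.timestamp()),
        queryString=case["query_string"],
    )
    # Insights queries are asynchronous; poll until the query settles.
    while True:
        result = logs.get_query_results(queryId=query["queryId"])
        if result["status"] not in ("Scheduled", "Running"):
            break
        time.sleep(1)
    check = case["validation_checks"][0]
    # GreaterThanOrEqualToThreshold, as in the example above.
    return len(result["results"]) >= check["expected_count"]
```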

### Metrics Test Case Example
```json
{
  "metric_test_cases": [
    {
      "test_case_id": "test-1",
      "description": "Test metric threshold",
      "metric_namespace": "AWS/Lambda",
      "metric_name": "Errors",
      "dimensions": [
        {
          "Name": "FunctionName",
          "Value": "example-function"
        }
      ],
      "statistic": "Sum",
      "evaluation_period_minutes": 5,
      "threshold": {
        "comparison_operator": [
          {
            "operator": "LessThanThreshold",
            "threshold_value": 1
          }
        ]
      }
    }
  ]
}
```
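
A metrics case like this maps naturally onto `GetMetricStatistics`. Below is a sketch of the data-fetching half, assuming boto3; the script's real logic (including business-hours handling) may be more involved:

```python
from datetime import datetime, timedelta, timezone

import boto3

cloudwatch = boto3.client("cloudwatch")

def fetch_statistic(case):
    """Fetch the configured statistic over the case's evaluation window."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=case["evaluation_period_minutes"])
    resp = cloudwatch.get_metric_statistics(
        Namespace=case["metric_namespace"],
        MetricName=case["metric_name"],
        Dimensions=case["dimensions"],
        StartTime=start,
        EndTime=end,
        Period=case["evaluation_period_minutes"] * 60,
        Statistics=[case["statistic"]],
    )
    # One datapoint per period; aggregate in case several are returned.
    return sum(dp[case["statistic"]] for dp in resp["Datapoints"])
```

The returned value is then compared against each entry under `threshold.comparison_operator`; see the operator sketch under Validation Types below.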

### Trace Test Case Example
```json
{
  "trace_test_cases": [
    {
      "test_case_id": "test-1",
      "description": "Test trace attributes",
      "parameters": {
        "relative_minutes": 60,
        "filter_expression": "service(\"example-service\")"
      },
      "validation_checks": [
        {
          "check_type": "trace_attribute_exists",
          "attribute_type": "http.response.status_code"
        }
      ]
    }
  ]
}
```
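
The `parameters` block corresponds to an X-Ray `GetTraceSummaries` call. A hedged boto3 sketch of the collection step:

```python
from datetime import datetime, timedelta, timezone

import boto3

xray = boto3.client("xray")

def fetch_trace_summaries(params):
    """Collect all trace summaries matching the case's filter expression."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=params["relative_minutes"])
    summaries = []
    paginator = xray.get_paginator("get_trace_summaries")
    for page in paginator.paginate(
        StartTime=start,
        EndTime=end,
        FilterExpression=params["filter_expression"],
    ):
        summaries.extend(page["TraceSummaries"])
    return summaries
```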

## Validation Types

### Logs Validation Types
- `count`: Validates the number of log records
- `field_contains`: Checks if a field contains specific content
- `general_exists`: Checks for general text existence in logs

### Metrics Validation Types
- Supports various comparison operators:
- `GreaterThanThreshold`
- `LessThanThreshold`
- `GreaterThanOrEqualToThreshold`
- `LessThanOrEqualToThreshold`
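
These names follow the CloudWatch alarm operator convention and map one-to-one onto Python's `operator` module. A plausible dispatch (the scripts may implement this differently):

```python
import operator

# CloudWatch-style operator names mapped to Python comparison functions.
COMPARISONS = {
    "GreaterThanThreshold": operator.gt,
    "LessThanThreshold": operator.lt,
    "GreaterThanOrEqualToThreshold": operator.ge,
    "LessThanOrEqualToThreshold": operator.le,
}

def passes_threshold(value, op_name, threshold):
    """Return True if `value` satisfies the named comparison."""
    return COMPARISONS[op_name](value, threshold)

# Example: passes_threshold(0, "LessThanThreshold", 1) -> True
```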

### Trace Validation Types
- `count`: Validates trace count
- `metadata_check`: Checks for metadata existence
- `trace_attribute_exists`: Validates trace attribute presence
- `trace_attribute_value_match`: Matches specific attribute values
- `segment_has_exception`: Checks for exceptions in segments
- `exception_message`: Validates exception messages
- `error_code`: Checks for specific error codes
- `http_status_code`: Validates HTTP status codes
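
Checks such as `http_status_code` and `segment_has_exception` need the full segment documents, which `BatchGetTraces` returns as JSON strings. A sketch of the status-code case, assuming boto3:

```python
import json

import boto3

xray = boto3.client("xray")

def trace_has_status(trace_ids, expected_status):
    """Scan segment documents for a matching HTTP response status."""
    for i in range(0, len(trace_ids), 5):  # BatchGetTraces accepts <= 5 IDs
        resp = xray.batch_get_traces(TraceIds=trace_ids[i:i + 5])
        for trace in resp["Traces"]:
            for segment in trace["Segments"]:
                doc = json.loads(segment["Document"])
                status = doc.get("http", {}).get("response", {}).get("status")
                if status == expected_status:
                    return True
    return False
```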

## Contributing

When adding new test cases:
1. Create appropriate JSON test case files
2. Follow the existing validation patterns
3. Ensure proper error handling and logging
4. Test thoroughly before committing

## Notes

- All timestamps are handled in UTC
- Business hours are defined as 09:00-17:00 UTC
- Test cases should be designed to be idempotent
- Consider rate limits when designing test cases


## Optional: Lambda Deployment and Monitoring

The framework supports running tests in AWS Lambda and monitoring test results through CloudWatch Alarms. The deployment process is managed through scripts in the `deploy` directory.
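
A minimal sketch of what the Lambda entry point might look like. The module imports, function names, and event shape below are assumptions for illustration, not the repository's actual handler:

```python
# Hypothetical handler: the runner imports, their signatures, and the
# event shape are assumptions, not the repository's actual entry point.
import run_logs_tests      # assumed importable module
import run_metrics_tests   # assumed importable module
import run_trace_tests     # assumed importable module

RUNNERS = {
    "logs": run_logs_tests.main,        # assumed callable
    "metrics": run_metrics_tests.main,  # assumed callable
    "traces": run_trace_tests.main,     # assumed callable
}

def lambda_handler(event, context):
    # e.g. event = {"test_type": "logs", "test_cases": "test_cases/logs.json"}
    runner = RUNNERS[event["test_type"]]
    failed = runner(event["test_cases"])
    return {"test_type": event["test_type"], "failed": failed}
```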

### Deployment Process

1. **Prepare Lambda Package**

   ```bash
   # From the data_test directory
   ./deploy/deploy_lambda.sh
   ```

   This script will:
   - Create a deployment package with all necessary files
   - Optionally deploy to AWS Lambda
   - Clean up temporary files

2. **Configure Lambda Function**
   - Set the Lambda function name in `deploy_lambda.sh`
   - Ensure the Lambda has appropriate IAM permissions to:
     - Read CloudWatch Metrics, CloudWatch Logs, and X-Ray traces
     - Publish CloudWatch metrics

### Monitoring Setup

The framework provides two scripts for setting up monitoring:

1. **Individual Test Alarms** (`create_test_cases_alarms.py`)
   - Creates CloudWatch Alarms for each test case
   - Monitors test results in the `APMTestResults` namespace
   - Alarms trigger when tests fail (value < 0.5)
   - Sends notifications to the configured SNS topic

2. **Composite Alarms** (`create_composite_alarms.py`)
   - Creates a hierarchical alarm structure:
     - Individual test case alarms
     - Scenario-level composite alarms
     - A root-level composite alarm
   - Provides a consolidated monitoring view
   - Reduces alert noise through aggregation
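
Under the hood, a composite alarm is a single `PutCompositeAlarm` call whose rule references the child alarms. A sketch with illustrative alarm names (the script's naming scheme may differ):

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def create_scenario_alarm(scenario, test_alarm_names, sns_topic_arn):
    """Roll a scenario's per-test alarms up into one composite alarm."""
    # Fire if any child test-case alarm is in ALARM state.
    rule = " OR ".join(f'ALARM("{name}")' for name in test_alarm_names)
    cloudwatch.put_composite_alarm(
        AlarmName=f"APMDemoTest-{scenario}-Composite",  # illustrative name
        AlarmRule=rule,
        AlarmActions=[sns_topic_arn],
    )
```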

### Alarm Configuration

1. **Update SNS Topic**
   - Set your SNS topic ARN in both alarm scripts:

     ```python
     ActionSNS = 'FILL IN YOUR SNS TOPIC ARN HERE'
     ```

2. **Run Alarm Creation**

   ```bash
   # Create individual test alarms
   python3 deploy/create_test_cases_alarms.py

   # Create composite alarms
   python3 deploy/create_composite_alarms.py
   ```

### Alarm Structure

```
Root Composite Alarm
├── Scenario 1 Composite Alarm
│   ├── Test Case 1 Alarm
│   ├── Test Case 2 Alarm
│   └── ...
├── Scenario 2 Composite Alarm
│   ├── Test Case 1 Alarm
│   ├── Test Case 2 Alarm
│   └── ...
└── ...
```

### Test Results Metrics

- **Namespace**: `APMTestResults`
- **Metric Name**: `TestResult`
- **Dimensions**:
  - `TestType`: logs/metrics/traces
  - `TestCaseId`: unique test identifier
  - `TestScenario`: scenario grouping
- **Values**:
  - `1.0`: test passed
  - `0.0`: test failed
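
Publishing a result in this shape is a single `PutMetricData` call; a minimal boto3 sketch:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def publish_test_result(test_type, test_case_id, scenario, passed):
    """Emit 1.0 for a passing test, 0.0 for a failing one."""
    cloudwatch.put_metric_data(
        Namespace="APMTestResults",
        MetricData=[{
            "MetricName": "TestResult",
            "Dimensions": [
                {"Name": "TestType", "Value": test_type},
                {"Name": "TestCaseId", "Value": test_case_id},
                {"Name": "TestScenario", "Value": scenario},
            ],
            "Value": 1.0 if passed else 0.0,
        }],
    )
```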

### Alarm Parameters

- **Evaluation Period**: 30 minutes
- **Data Points**: 4 (2 hours)
- **Threshold**: 0.5
- **Operator**: LessThanThreshold
- **Missing Data**: treated as missing
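
Expressed as a `PutMetricAlarm` call, those parameters look roughly like this; the alarm name and `Statistic` choice are assumptions:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def create_test_alarm(test_case_id, dimensions, sns_topic_arn):
    """Alarm when the test result stays below 0.5 across 4 x 30-minute periods."""
    cloudwatch.put_metric_alarm(
        AlarmName=f"APMDemoTest-{test_case_id}",  # illustrative name
        Namespace="APMTestResults",
        MetricName="TestResult",
        Dimensions=dimensions,
        Statistic="Minimum",             # assumption; any failure breaches
        Period=1800,                     # 30-minute evaluation period
        EvaluationPeriods=4,             # 4 datapoints = 2 hours
        Threshold=0.5,
        ComparisonOperator="LessThanThreshold",
        TreatMissingData="missing",
        AlarmActions=[sns_topic_arn],
    )
```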

## Cleanup and Removal

### Removing Alarms

To remove all test-related CloudWatch alarms, use the `deploy/remove_alarms.sh` script:

```bash
# Run from the data_test directory
./deploy/remove_alarms.sh
```

This script will:
- Find all alarms starting with "APMDemoTest"
- Display the list of alarms to be deleted
- Delete these alarms
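
The same cleanup can also be done with boto3 if shell access is inconvenient; a sketch of the equivalent calls:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

def remove_apm_demo_alarms(prefix="APMDemoTest"):
    """Delete all composite and metric alarms whose names share the prefix."""
    composite, metric = [], []
    paginator = cloudwatch.get_paginator("describe_alarms")
    for page in paginator.paginate(
        AlarmNamePrefix=prefix,
        AlarmTypes=["CompositeAlarm", "MetricAlarm"],
    ):
        composite += [a["AlarmName"] for a in page.get("CompositeAlarms", [])]
        metric += [a["AlarmName"] for a in page.get("MetricAlarms", [])]
    # Composite alarms first: a metric alarm referenced by a composite
    # alarm's rule cannot be deleted while that rule still exists.
    for names in (composite, metric):
        for i in range(0, len(names), 100):  # DeleteAlarms takes <= 100 names
            cloudwatch.delete_alarms(AlarmNames=names[i:i + 100])
```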

### Removing Lambda Functions

To remove the deployed Lambda function, use the `deploy/remove_lambda.sh` script:

```bash
# Run from the data_test directory
./deploy/remove_lambda.sh
```

This script will:
- Ask for confirmation before deletion
- Delete the Lambda function named "APM_Demo_Test_Runner"
- Provide feedback on the deletion status

Note: Please ensure all related tests and monitoring have been stopped before deleting files and functions.


### Removing Deployment Files (if they were not cleaned up by the deploy script)

To clean up temporary files created during deployment:

```bash
# Run from the data_test directory
rm -rf deploy/*.zip
rm -rf deploy/temp_*
```