Skip to content

Conversation

@chngpe
Copy link
Contributor

@chngpe chngpe commented Nov 10, 2025

Issue #, if available:

Description of changes:

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@amazon-inspector-n-virginia
Copy link

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done

1 similar comment
@amazon-inspector-n-virginia
Copy link

⏳ I'm reviewing this pull request for security vulnerabilities and code quality issues. I'll provide an update when I'm done

engine = self.create_jdbc_engine(False)
with engine.connect() as conn:
conn = conn.execution_options(isolation_level="AUTOCOMMIT")
conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS {database_name}"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

Description: Master password for the RDS SQL Server instance

Resources:
SqlServerDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: This is a deployment risk. The change indicates that the RDS instances do not have the BackupRetentionPeriod property set or it is set to 0, which means that backups are not enabled for the RDS instances. The deployment risk here is the potential for data loss in the event of a system failure or other unexpected incident. Without backups, there is no way to recover the data stored in the RDS instances, which could have severe consequences for the application and the business. This could lead to service disruptions, financial losses, and reputational damage. Ensuring that RDS backups are properly configured is a critical part of the deployment process to ensure the overall reliability and resilience of the application.

Severity: High

resource_path = pkg_resources.files(resources).joinpath(f'tpcds_data/schema/{table_name}.json')

with pkg_resources.as_file(resource_path) as config_path:
with open(config_path) as config_file:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Path traversal vulnerability detected. User-controlled input in file paths can allow attackers to access files outside intended directories using ../ sequences. Secure your code by using framework functions like safe_join(), secure_filename() and .startswith() checks. Learn more: https://cwe.mitre.org/data/definitions/22.html

Severity: High


def upload_glue_job_script(script_name):
test = Path(__file__).parent.parent.parent
local_file_path = os.path.join(test, 'resources', 'glue_job', script_name)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Path traversal vulnerability detected. User-controlled input in file paths can allow attackers to access files outside intended directories using ../ sequences. Secure your code by using framework functions like safe_join(), secure_filename() and .startswith() checks. Learn more: https://cwe.mitre.org/data/definitions/22.html

Severity: High

Description: Master password for the RDS SQL Server instance

Resources:
SqlServerDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Ensure all data stored in the RDS is securely encrypted at rest

Severity: High

Description: Master password for the RDS instance

Resources:
MySQLDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: This is a deployment risk. The change indicates that the RDS instances do not have the BackupRetentionPeriod property set or it is set to 0, which means that backups are not enabled for the RDS instances. The deployment risk here is the potential for data loss in the event of a system failure or other unexpected incident. Without backups, there is no way to recover the data stored in the RDS instances, which could have severe consequences for the application and the business. This could lead to service disruptions, financial losses, and reputational damage. Ensuring that RDS backups are properly configured is a critical part of the deployment process to ensure the overall reliability and resilience of the application.

Severity: High

# Due to Glue ETL job can't define schema, it will automatically set to upper case, hence use CTAS to use to lower case all entities
with engine.connect() as con:
result = con.execute(
text(f"DESCRIBE TABLE \"{config_helper.get_tpcds_scale_factor()}\".{table_name}_upper"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

conn = conn.execution_options(isolation_level="AUTOCOMMIT")

# Check if DB exists
result = conn.execute(text(f"CREATE SCHEMA IF NOT EXISTS tpcds.{database_name}"))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

try:
# build the image
ecr_repository_name = self.get_ecr_repository_name()
subprocess.run(["docker", "build","--provenance", "false", "--platform", "linux/amd64" ,"-t", ecr_repository_name,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Starting a process with a partial executable path https://bandit.readthedocs.io/en/latest/plugins/b607_start_process_with_partial_path.html

Severity: High

Description: Master password for the RDS SQL Server instance

Resources:
SqlServerDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Ensure all data stored in the RDS is securely encrypted at rest

Severity: High

Description: Master password for the RDS instance

Resources:
PostgreSQLDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: This is a deployment risk. The change indicates that the RDS instances do not have the BackupRetentionPeriod property set or it is set to 0, which means that backups are not enabled for the RDS instances. The deployment risk here is the potential for data loss in the event of a system failure or other unexpected incident. Without backups, there is no way to recover the data stored in the RDS instances, which could have severe consequences for the application and the business. This could lead to service disruptions, financial losses, and reputational damage. Ensuring that RDS backups are properly configured is a critical part of the deployment process to ensure the overall reliability and resilience of the application.

Severity: High

SubnetId: !Ref AthenaFederationIntegPublicSubnet2
RouteTableId: !Ref AthenaFederationIntegPublicRouteTable

AthenaFederationIntegOpenSecurityGroup:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: CidrIp Property is found to be set as 0.0.0.0/0. To enhance the security of your AWS resources, it is important to set a more restrictive CIDR (Classless Inter-Domain Routing) for SSH (Secure Shell) access. By doing so, you limit access to only specific IP ranges or individual IP addresses, reducing the risk of unauthorised access. Like Instead of allowing access from any IP address (0.0.0.0/0), specify a specific IP range or individual IP addresses. By setting a more restrictive CIDR for SSH, you ensure that only authorized IP addresses are allowed access, thereby enhancing the security of your AWS resources.

Severity: High

raise error
try:
#aws ecr create-repository --repository-name athena-federation-repository-$CONNECTOR_NAME --region us-east-1
subprocess.run(["aws", "ecr", "create-repository", "--repository-name", self.get_ecr_repository_name()])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Starting a process with a partial executable path https://bandit.readthedocs.io/en/latest/plugins/b607_start_process_with_partial_path.html

Severity: High

# CTAS query
columns_str = ", ".join(columns)
ctas_query = f'CREATE TABLE \"{config_helper.get_tpcds_scale_factor()}\".\"{table_name}\" AS SELECT {columns_str} FROM \"{config_helper.get_tpcds_scale_factor()}\".{table_name}_upper'
con.execute(text(ctas_query))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

for table_name in table_names:
with engine.connect() as con:
result = con.execute(
text(f"SHOW TABLES LIKE '{table_name}' in \"{config_helper.get_tpcds_scale_factor()}\""))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

conn = conn.execution_options(isolation_level="AUTOCOMMIT")
# SQLSERVER has 3 level, create database first(catalog level) with tpcds
# SQL server doesn't support create schema if not exists
conn.execute(text(f"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

conn = conn.execution_options(isolation_level="AUTOCOMMIT")
# SQLSERVER has 3 level, create database first(catalog level) with tpcds
# SQL server doesn't support create schema if not exists
conn.execute(text(f"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

Description: Master password for the RDS instance

Resources:
MySQLDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: RDS_INSTANCE_LOGGING_ENABLED : Amazon Relational Database Service logs (Amazon RDS) are not enabled. By ensuring that EnableCloudwatchLogsExports is set to true for your RDS instances, you enable the export of database logs to Amazon CloudWatch Logs. This allows you to monitor and analyze the logs, gain insights into database activity, and troubleshoot issues. CloudWatch Logs can be used to set up alarms, create custom metrics, and perform log analysis, helping you maintain the health, performance, and security of your RDS instances.

Severity: High

logging.info("✅ Docker build completed successfully.")
# tag the image
registry = f"{common_infra.get_account_id()}.dkr.ecr.{config_helper.get_region()}.amazonaws.com"
subprocess.run(["docker", "tag", f"{ecr_repository_name}:latest", f"{registry}/{ecr_repository_name}:latest"], check=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Starting a process with a partial executable path https://bandit.readthedocs.io/en/latest/plugins/b607_start_process_with_partial_path.html

Severity: High

SecurityGroupIds:
- !Ref AthenaFederationIntegOpenSecurityGroup
PrivateDnsEnabled: true
Tags:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Invalid Property Resources/AthenaFederationIntegGlueVPCEndpoint/Properties/Tags

Severity: High

DestinationCidrBlock: 0.0.0.0/0
GatewayId: !Ref AthenaFederationIntegGateway

AthenaFederationIntegPublicSubnet1:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Subnet - Public IP : Ensure Amazon Virtual Private Cloud (Amazon VPC) subnets are assigned a public IP address. You enable instances launched in those subnets to have a publicly accessible IP. This allows them to communicate with the internet directly, making it easier to access resources hosted within the VPC from external networks. However, note that enabling public IP assignment also exposes the instances to potential security risks, so make sure to implement appropriate security measures like network access control lists (ACLs) and security groups

Severity: High

try:
# build the image
ecr_repository_name = self.get_ecr_repository_name()
subprocess.run(["docker", "build","--provenance", "false", "--platform", "linux/amd64" ,"-t", ecr_repository_name,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Starting a process with a partial executable path https://bandit.readthedocs.io/en/latest/plugins/b607_start_process_with_partial_path.html

Severity: High

Description: Master password for the RDS instance

Resources:
OracleDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: This is a deployment risk. The change indicates that the RDS instances do not have the BackupRetentionPeriod property set or it is set to 0, which means that backups are not enabled for the RDS instances. The deployment risk here is the potential for data loss in the event of a system failure or other unexpected incident. Without backups, there is no way to recover the data stored in the RDS instances, which could have severe consequences for the application and the business. This could lead to service disruptions, financial losses, and reputational damage. Ensuring that RDS backups are properly configured is a critical part of the deployment process to ensure the overall reliability and resilience of the application.

Severity: High

raise error
try:
#aws ecr create-repository --repository-name athena-federation-repository-$CONNECTOR_NAME --region us-east-1
subprocess.run(["aws", "ecr", "create-repository", "--repository-name", self.get_ecr_repository_name()])

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Starting a process with a partial executable path https://bandit.readthedocs.io/en/latest/plugins/b607_start_process_with_partial_path.html

Severity: High

subprocess.run(["docker", "tag", f"{ecr_repository_name}:latest", f"{registry}/{ecr_repository_name}:latest"], check=True)
logging.info("✅ Docker tag completed successfully.")
# push the image
subprocess.run(["docker", "push", f"{registry}/{ecr_repository_name}:latest"], check=True)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Starting a process with a partial executable path https://bandit.readthedocs.io/en/latest/plugins/b607_start_process_with_partial_path.html

Severity: High

Description: Master password for the RDS instance

Resources:
PostgreSQLDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: RDS instances have AutoMinorVersionUpgrade set to false or AutoMinorVersionUpgrade property is not present. Enabling Auto Minor Version Upgrade means that AWS RDS will automatically apply minor upgrades during the maintenance window defined for your RDS instance. It also ensures that your service receives the latest security patches and bug fixes automatically. This helps to keep your system secure and up-to-date without requiring manual upgrades.

Severity: High

Description: Master password for the RDS instance

Resources:
OracleDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: RDS instances have AutoMinorVersionUpgrade set to false or AutoMinorVersionUpgrade property is not present. Enabling Auto Minor Version Upgrade means that AWS RDS will automatically apply minor upgrades during the maintenance window defined for your RDS instance. It also ensures that your service receives the latest security patches and bug fixes automatically. This helps to keep your system secure and up-to-date without requiring manual upgrades.

Severity: High

- Key: Name
Value: AthenaFederationIntegPublicSubnet1

AthenaFederationIntegPublicSubnet2:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Subnet - Public IP : Ensure Amazon Virtual Private Cloud (Amazon VPC) subnets are assigned a public IP address. You enable instances launched in those subnets to have a publicly accessible IP. This allows them to communicate with the internet directly, making it easier to access resources hosted within the VPC from external networks. However, note that enabling public IP assignment also exposes the instances to potential security risks, so make sure to implement appropriate security measures like network access control lists (ACLs) and security groups

Severity: High

engine = self.create_jdbc_engine(True)
with engine.connect() as conn:
# Create Schema which use to represent the scale factor
conn.execute(text(f"""

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

logging.error("Error during insert:", e)
logging.info(f"✅ Data source:{self.get_connection_type_name()}, table:{table_name} checked/created.")
with engine.connect() as con:
con.execute(text(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

for table_name in tpcds_reader.get_data_tables():
try:
result = con.execute(
text(f"DROP TABLE IF EXISTS \"{config_helper.get_tpcds_scale_factor()}\".\"{table_name}\""))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

SecurityGroupIds:
- !Ref AthenaFederationIntegOpenSecurityGroup
PrivateDnsEnabled: true
Tags:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Invalid Property Resources/AthenaFederationIntegSecretsManagerVPCEndpoint/Properties/Tags

Severity: High

Description: Master password for the RDS instance

Resources:
OracleDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: RDS_INSTANCE_LOGGING_ENABLED : Amazon Relational Database Service logs (Amazon RDS) are not enabled. By ensuring that EnableCloudwatchLogsExports is set to true for your RDS instances, you enable the export of database logs to Amazon CloudWatch Logs. This allows you to monitor and analyze the logs, gain insights into database activity, and troubleshoot issues. CloudWatch Logs can be used to set up alarms, create custom metrics, and perform log analysis, helping you maintain the health, performance, and security of your RDS instances.

Severity: High

Description: Master password for the RDS instance

Resources:
OracleDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Ensure all data stored in the RDS is securely encrypted at rest

Severity: High

Description: Master password for the RDS instance

Resources:
PostgreSQLDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: RDS instances with DeletionProtection property is not present or set to false. Set RDS instances? DeletionProtection to true to ensure the safety and security of your valuable data, if we enable Deletion Protection for our RDS instances, this will provide an extra layer of protection which prevents accidental deletions, reducing the risk of data loss and ensuring uninterrupted availability of your databases.

Severity: High

# Initialize boto3 S3 client
s3 = aws_client_factory.get_aws_client('s3')
# Upload the file
s3.upload_file(local_file_path, config_helper.get_glue_job_bucket_name(common_infra.get_account_id()), f'scripts/{script_name}')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Path traversal vulnerability detected. User-controlled input in file paths can allow attackers to access files outside intended directories using ../ sequences. Secure your code by using framework functions like safe_join(), secure_filename() and .startswith() checks. Learn more: https://cwe.mitre.org/data/definitions/22.html

Severity: High

Description: Master password for the RDS instance

Resources:
OracleDBInstance:

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Ensure all data stored in the RDS is securely encrypted at rest

Severity: High

# Grant newly created user on write
sql = f"""alter user "{database_name}" quota unlimited on USERS """
logging.info(f"sql executed: {sql}")
conn.execute(text(sql))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Potential SQL injection risk detected in database operation. The query is constructed using string concatenation or formatting, which can be vulnerable if the data is not properly sanitized. To mitigate this risk, use parameterized queries appropriate for your database system. This approach separates SQL logic from data, enhancing query safety across different database systems. You can ignore this warning if you're certain that all inputs are properly sanitized and trusted. Learn more - https://cwe.mitre.org/data/definitions/89.html

Severity: High

return response['DomainStatus']['Endpoint']

def _get_es_table_name(self, table_name):
return f'{config_helper.get_tpcds_scale_factor()}_{table_name}' No newline at end of file

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Description: Untrusted data must be properly encoded or sanitized before being
incorporated into web page content or used to generate dynamic output.
Failure to do so can result in Cross-Site Scripting (XSS) vulnerabilities,
enabling attackers to inject malicious scripts into the application. These
scripts, when executed in users' browsers, can lead to various security
breaches such as session hijacking, data theft, or unauthorized actions
performed within the victim's session context. To mitigate this risk,
use built-in template escaping mechanisms, HTML escape functions like
html.escape(), or dedicated security libraries such as Bleach.
Learn more: https://owasp.org/www-community/attacks/xss/

Severity: High

@codecov
Copy link

codecov bot commented Nov 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 68.28%. Comparing base (fe0a5d9) to head (db5f8c2).
⚠️ Report is 99 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@             Coverage Diff              @@
##             master    #3104      +/-   ##
============================================
+ Coverage     63.67%   68.28%   +4.61%     
- Complexity     4344     5061     +717     
============================================
  Files           621      636      +15     
  Lines         23286    24584    +1298     
  Branches       2859     3131     +272     
============================================
+ Hits          14827    16787    +1960     
+ Misses         7070     6353     -717     
- Partials       1389     1444      +55     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant