diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000000..90df607a3a
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,5 @@
+*/target
+*/dependency-reduced-pom.xml
+.idea/
+/target/
+*/*.iml
diff --git a/CODE_OF_CONDUCT.md b/CODE_OF_CONDUCT.md
new file mode 100644
index 0000000000..5b627cfa60
--- /dev/null
+++ b/CODE_OF_CONDUCT.md
@@ -0,0 +1,4 @@
+## Code of Conduct
+This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
+For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact
+opensource-codeofconduct@amazon.com with any additional questions or comments.
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 0000000000..914e0741d7
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,61 @@
+# Contributing Guidelines
+
+Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional
+documentation, we greatly value feedback and contributions from our community.
+
+Please read through this document before submitting any issues or pull requests to ensure we have all the necessary
+information to effectively respond to your bug report or contribution.
+
+
+## Reporting Bugs/Feature Requests
+
+We welcome you to use the GitHub issue tracker to report bugs or suggest features.
+
+When filing an issue, please check existing open, or recently closed, issues to make sure somebody else hasn't already
+reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
+
+* A reproducible test case or series of steps
+* The version of our code being used
+* Any modifications you've made relevant to the bug
+* Anything unusual about your environment or deployment
+
+
+## Contributing via Pull Requests
+Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
+
+1. You are working against the latest source on the *master* branch.
+2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
+3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
+
+To send us a pull request, please:
+
+1. Fork the repository.
+2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
+3. Ensure local tests pass.
+4. Commit to your fork using clear commit messages.
+5. Send us a pull request, answering any default questions in the pull request interface.
+6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
+
+GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
+[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
+
+
+## Finding contributions to work on
+Looking at the existing issues is a great way to find something to contribute to. As our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any 'help wanted' issues is a great place to start.
+
+
+## Code of Conduct
+This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct).
+For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact +opensource-codeofconduct@amazon.com with any additional questions or comments. + + +## Security issue notifications +If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue. + + +## Licensing + +See the [LICENSE](LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution. + +We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes. diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000000..67db858821 --- /dev/null +++ b/LICENSE @@ -0,0 +1,175 @@ + + Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. 
The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
diff --git a/NOTICE b/NOTICE
new file mode 100644
index 0000000000..b8d0d46a06
--- /dev/null
+++ b/NOTICE
@@ -0,0 +1 @@
+Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved.
diff --git a/README.md b/README.md
new file mode 100644
index 0000000000..188b4b4f54
--- /dev/null
+++ b/README.md
@@ -0,0 +1,138 @@
+# Amazon Athena Query Federation
+
+The Amazon Athena Query Federation SDK allows you to customize Amazon Athena with your own code. This enables you to integrate with new data sources, support proprietary data formats, or build new user-defined functions. Initially these customizations will be limited to the parts of a query that occur during a TableScan operation, but they will eventually be expanded to include other parts of the query lifecycle using the same easy-to-understand interface.
+
+This functionality is currently in **Public Preview** while customers provide us feedback on usability and the ease of using the service or building new connectors. We do not recommend that you use these connectors in production or use this preview to make assumptions about the performance of Athena’s Federation features. As we receive more feedback, we will make improvements to the preview and raise limits associated with query/connector performance, APIs, SDKs, and user experience. The best way to understand the performance of Athena Data Source Connectors is to run a benchmark when they become generally available (GA) or review our performance guidance.
+
+![Architecture Image](https://github.com/awslabs/aws-athena-query-federation/blob/master/docs/img/athena_federation_summary.png?raw=true)
+
+We've written integrations with more than 20 databases, storage formats, and live APIs in order to refine this interface and balance flexibility with ease of use. We hope that making this SDK and initial set of connectors Open Source will allow us to continue to improve the experience and performance of Athena Query Federation.
+
+## Serverless Big Data Using AWS Lambda
+
+![Architecture Image](https://github.com/awslabs/aws-athena-query-federation/blob/master/docs/img/athena_federation_flow.png?raw=true)
+
+## Example Usages
+
+- SecretsManager integration
+- Serverless Application Repository
+
+### Queries That Span Data Stores
+
+Imagine a hypothetical e-commerce company whose architecture uses:
+
+1. Payment processing in a secure VPC, with transaction records stored in HBase on EMR.
+2. Redis to store active orders so that the processing engine can get fast access to them.
+3. DocumentDB (a MongoDB-compatible store) for customer account data like email addresses, shipping addresses, etc.
+4. An e-commerce site using auto-scaling on Fargate, with its product catalog in Amazon Aurora.
+5. CloudWatch Logs to house the Order Processor's log events.
+6. A write-once-read-many data warehouse on Redshift.
+7. Shipment tracking data in DynamoDB.
+8. A fleet of drivers performing last-mile delivery while utilizing IoT-enabled tablets.
+9. Advertising conversion data from a 3rd-party cloud provider.
+
+![Architecture Image](https://github.com/awslabs/aws-athena-query-federation/blob/master/docs/img/athena_federation_demo.png?raw=true)
+
+Customer service agents begin receiving calls about orders 'stuck' in a weird state. Some show as pending even though they have been delivered; others show as delivered but haven't actually shipped. It would be great if we could quickly run a query across this diverse architecture to understand which orders might be affected and what they have in common.
+
+Using Amazon Athena Query Federation and many of the connectors found in this repository, our hypothetical e-commerce company would be able to run a query that:
+
+1. Grabs all active orders from Redis. (see athena-redis)
+2. Joins against any orders with 'WARN' or 'ERROR' events in CloudWatch Logs by using regex matching and extraction. (see athena-cloudwatch)
+3. Joins against our EC2 inventory to get the hostname(s) and status of the Order Processor(s) that logged the 'WARN' or 'ERROR'. (see athena-aws-cmdb)
+4. Joins against DocumentDB to obtain customer contact details for the affected orders. (see athena-docdb)
+5. Joins against a scatter-gather query sent to the Driver Fleet via Android Push notification. (see athena-android)
+6. Joins against DynamoDB to get shipping status and tracking details. (see athena-dynamodb)
+7. Joins against HBase to get payment status for the affected orders. (see athena-hbase)
+8. Joins against the advertising conversion data in BigQuery to see which promotions need to be applied if a re-order is needed. (see athena-bigquery)
+
+```sql
+WITH logs
+     AS (SELECT log_stream,
+                message                                          AS order_processor_log,
+                Regexp_extract(message, '.*orderId=(\d+) .*', 1) AS orderId,
+                Regexp_extract(message, '(.*):.*', 1)            AS log_level
+         FROM "lambda:cloudwatch"."/var/ecommerce-engine/order-processor".all_log_streams
+         WHERE Regexp_extract(message, '(.*):.*', 1) IN ('WARN', 'ERROR')),
+     active_orders
+     AS (SELECT *
+         FROM redis.redis_db.redis_customer_orders),
+     order_processors
+     AS (SELECT instanceid,
+                publicipaddress,
+                state.NAME
+         FROM awscmdb.ec2.ec2_instances),
+     customer
+     AS (SELECT id,
+                email
+         FROM docdb.customers.customer_info),
+     addresses
+     AS (SELECT id,
+                is_residential,
+                address.street AS street
+         FROM docdb.customers.customer_addresses),
+     drivers
+     AS (SELECT name         as driver_name,
+                result_field as driver_order,
+                device_id    as truck_id,
+                last_updated
+         FROM android.android.live_query
+         where query_timeout = 5000 and query_min_results = 5),
+     impressions
+     AS (SELECT path as advertisement,
+                conversion
+         FROM bigquery.click_impressions.click_conversions),
+     shipments
+     AS (SELECT order_id,
+                shipment_id,
+                from_unixtime(cast(shipped_date as double)) as shipment_time,
+                carrier
+         FROM lambda_ddb.default.order_shipments),
+     payments
+     AS (SELECT "summary:order_id",
+                "summary:status",
+                "summary:cc_id",
+                "details:network"
+         FROM "hbase".hbase_payments.transactions)
+
+SELECT _key_                     AS redis_order_id,
+       customer_id,
+       customer.email            AS cust_email,
+       "summary:cc_id"           AS credit_card,
+       "details:network"         AS CC_type,
+       "summary:status"          AS payment_status,
+       impressions.advertisement as advertisement,
+       status                    AS redis_status,
+       addresses.street          AS street_address,
+       shipments.shipment_time   as shipment_time,
+       shipments.carrier         as shipment_carrier,
+       driver_name               AS driver_name,
+       truck_id                  AS truck_id,
+       last_updated              AS driver_updated,
+       publicipaddress           AS ec2_order_processor,
+       NAME                      AS ec2_state,
+       log_level,
+       order_processor_log
+FROM active_orders
+     LEFT JOIN logs
+            ON logs.orderid = active_orders._key_
+     LEFT JOIN order_processors
+            ON logs.log_stream = order_processors.instanceid
+     LEFT JOIN customer
+            ON customer.id = customer_id
+     LEFT JOIN addresses
+            ON addresses.id = address_id
+     LEFT JOIN drivers
+            ON drivers.driver_order = active_orders._key_
+     LEFT JOIN impressions
+            ON impressions.conversion = active_orders._key_
+     LEFT JOIN shipments
+            ON shipments.order_id = active_orders._key_
+     LEFT JOIN payments
+            ON payments."summary:order_id" = active_orders._key_
+```
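+
+If you're new to federated queries, a pared-down sketch of the same pattern may be an easier starting point. This hypothetical example assumes only the Redis and CloudWatch connectors are deployed, and reuses the table and column names from the full query above:
+
+```sql
+-- Find active orders in Redis that logged WARN or ERROR events in CloudWatch Logs
+SELECT o._key_                                 AS redis_order_id,
+       Regexp_extract(l.message, '(.*):.*', 1) AS log_level,
+       l.message                               AS order_processor_log
+FROM redis.redis_db.redis_customer_orders o
+     JOIN "lambda:cloudwatch"."/var/ecommerce-engine/order-processor".all_log_streams l
+       ON Regexp_extract(l.message, '.*orderId=(\d+) .*', 1) = o._key_
+WHERE Regexp_extract(l.message, '(.*):.*', 1) IN ('WARN', 'ERROR')
+```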
+
+## License
+
+This project is licensed under the Apache-2.0 License.
diff --git a/athena-android/pom.xml b/athena-android/pom.xml
new file mode 100644
index 0000000000..801e7933c9
--- /dev/null
+++ b/athena-android/pom.xml
@@ -0,0 +1,67 @@
+<?xml version="1.0"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-android</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.firebase</groupId>
+            <artifactId>firebase-admin</artifactId>
+            <version>6.10.0</version>
+        </dependency>
+        <dependency>
+            <groupId>com.fasterxml.jackson.core</groupId>
+            <artifactId>jackson-databind</artifactId>
+            <version>2.9.8</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-sqs</artifactId>
+            <version>1.11.636</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidDeviceTable.java b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidDeviceTable.java
new file mode 100644
index 0000000000..a64fd4fa52
--- /dev/null
+++ b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidDeviceTable.java
@@ -0,0 +1,147 @@
+/*-
+ * #%L
+ * athena-android
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.android;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+public class AndroidDeviceTable
+{
+    private final TableName tableName;
+    private final Schema schema;
+
+    public AndroidDeviceTable()
+    {
+        //Table name must match the Firebase push subscription topic used on the devices
+        this.tableName = new TableName("android", "live_query");
+        schema = SchemaBuilder.newBuilder()
+                .addStringField("device_id")
+                .addStringField("name")
+                .addStringField("echo_value")
+                .addStringField("result_field")
+                .addField("last_updated", Types.MinorType.DATEMILLI.getType())
+                .addIntField("score")
+                .addBigIntField("query_timeout")
+                .addBigIntField("query_min_results")
+                .addMetadata("device_id", "Android device id of the responding device.")
+                .addMetadata("name", "Name of the simulated device owner.")
+                .addMetadata("last_updated", "Last time this data was fetched.")
+                .addMetadata("echo_value", "The value requested by the search.")
+                .addMetadata("result_field", "Flattened copy of the first value from the values field.")
+                .addMetadata("score", "Randomly generated score.")
+                .addMetadata("query_timeout", "Used to configure the number of milliseconds the query waits for the min_results.")
+                .addMetadata("query_min_results", "The min number of results to wait for.")
+                .build();
+    }
+
+    public TableName getTableName()
+    {
+        return tableName;
+    }
+
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    public String getQueryMinResultsField()
+    {
+        return "query_min_results";
+    }
+
+    public String getQueryTimeout()
+    {
+        return "query_timeout";
+    }
+
+    public String getDeviceIdField()
+    {
+        return "device_id";
+    }
+
+    public String getLastUpdatedField()
+    {
+        return "last_updated";
+    }
+
+    public String getNameField()
+    {
+        return "name";
+    }
+
+    public String getEchoValueField()
+    {
+        return "echo_value";
+    }
+
+    public String getResultField()
+    {
+        return "result_field";
+    }
+
+    public String getScoreField()
+    {
+        return "score";
+    }
+
+    public FieldVector getQueryMinResultsField(Block block)
+    {
+        return block.getFieldVector("query_min_results");
+    }
+
+    public FieldVector getQueryTimeout(Block block)
+    {
+        return block.getFieldVector("query_timeout");
+    }
+
+    public FieldVector getDeviceIdField(Block block)
+    {
+        return block.getFieldVector("device_id");
+    }
+
+    public FieldVector getNameField(Block block)
+    {
+        return block.getFieldVector("name");
+    }
+
+    public FieldVector getLastUpdatedField(Block block)
+    {
+        return block.getFieldVector("last_updated");
+    }
+
+    public FieldVector getEchoValueField(Block block)
+    {
+        return block.getFieldVector("echo_value");
+    }
+
+    public FieldVector getResultField(Block block)
+    {
+        return block.getFieldVector("result_field");
+    }
+
+    public FieldVector getScoreField(Block block)
+    {
+        return block.getFieldVector("score");
+    }
+}
diff --git a/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidMetadataHandler.java b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidMetadataHandler.java
new file mode 100644
index 0000000000..8f5d7a4da6
--- /dev/null
+++ b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidMetadataHandler.java
@@ -0,0 +1,109 @@
+/*-
+ *
#%L + * athena-android + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.android; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKey; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; + +import java.util.Collections; + +public class AndroidMetadataHandler + extends MetadataHandler +{ + private static final String sourceType = "android"; + private static final AndroidDeviceTable androidDeviceTable = new AndroidDeviceTable(); + + public AndroidMetadataHandler() + { + super(sourceType); + } + + @VisibleForTesting + protected AndroidMetadataHandler( + EncryptionKeyFactory keyFactory, + AWSSecretsManager secretsManager, + AmazonAthena athena, + String spillBucket, + String spillPrefix) + { + super(keyFactory, secretsManager, athena, sourceType, spillBucket, spillPrefix); + } + + @Override + public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest listSchemasRequest) + { + String schemaName = androidDeviceTable.getTableName().getSchemaName(); + return new ListSchemasResponse(listSchemasRequest.getCatalogName(), Collections.singletonList(schemaName)); + } + + @Override + public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest listTablesRequest) + { + return new ListTablesResponse(listTablesRequest.getCatalogName(), + Collections.singletonList(androidDeviceTable.getTableName())); + } + + @Override + public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest) + { + if (!androidDeviceTable.getTableName().equals(getTableRequest.getTableName())) { + throw new RuntimeException("Unknown table " + 
getTableRequest.getTableName()); + } + + return new GetTableResponse(getTableRequest.getCatalogName(), + androidDeviceTable.getTableName(), + androidDeviceTable.getSchema()); + } + + @Override + public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception + { + //NoOp since we don't support partitioning + } + + @Override + public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest getSplitsRequest) + { + //Every split needs a unique spill location. + SpillLocation spillLocation = makeSpillLocation(getSplitsRequest); + EncryptionKey encryptionKey = makeEncryptionKey(); + Split split = Split.newBuilder(spillLocation, encryptionKey).build(); + return new GetSplitsResponse(getSplitsRequest.getCatalogName(), split); + } +} diff --git a/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidRecordHandler.java b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidRecordHandler.java new file mode 100644 index 0000000000..1e27ec4dbd --- /dev/null +++ b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/AndroidRecordHandler.java @@ -0,0 +1,174 @@ +/*- + * #%L + * athena-android + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.android; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.amazonaws.services.sqs.AmazonSQS; +import com.amazonaws.services.sqs.AmazonSQSClientBuilder; +import com.amazonaws.services.sqs.model.DeleteMessageBatchRequestEntry; +import com.amazonaws.services.sqs.model.GetQueueUrlResult; +import com.amazonaws.services.sqs.model.ReceiveMessageRequest; +import com.amazonaws.services.sqs.model.ReceiveMessageResult; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Field; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +public class AndroidRecordHandler + extends RecordHandler +{ + private static final String sourceType = "android"; + private static final Logger logger = LoggerFactory.getLogger(AndroidRecordHandler.class); + + private static final String FIREBASE_DB_URL = "FIREBASE_DB_URL"; + private static final String FIREBASE_CONFIG = "FIREBASE_CONFIG"; + private static final String RESPONSE_QUEUE_NAME = "RESPONSE_QUEUE_NAME"; + private static final String MAX_WAIT_TIME = "MAX_WAIT_TIME"; + private static final String MIN_RESULTS = "MIN_RESULTS"; + + private final AndroidDeviceTable androidTable = new AndroidDeviceTable(); + private final ObjectMapper mapper = new ObjectMapper(); + private final AmazonSQS amazonSQS; + private final LiveQueryService liveQueryService; + private final String queueUrl; + + public AndroidRecordHandler() + { + this(AmazonS3ClientBuilder.defaultClient(), + AWSSecretsManagerClientBuilder.defaultClient(), + AmazonAthenaClientBuilder.defaultClient(), + AmazonSQSClientBuilder.defaultClient(), + new LiveQueryService(System.getenv(FIREBASE_CONFIG), System.getenv(FIREBASE_DB_URL))); + } + + @VisibleForTesting + protected AndroidRecordHandler(AmazonS3 amazonS3, + AWSSecretsManager secretsManager, + AmazonAthena athena, + AmazonSQS amazonSQS, + LiveQueryService liveQueryService) + { + super(amazonS3, secretsManager, athena, sourceType); + this.amazonSQS = amazonSQS; + this.liveQueryService = liveQueryService; + GetQueueUrlResult queueUrlResult = amazonSQS.getQueueUrl(System.getenv(RESPONSE_QUEUE_NAME)); + queueUrl = queueUrlResult.getQueueUrl(); + } + + @Override + protected void readWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest readRecordsRequest, QueryStatusChecker queryStatusChecker) + { + QueryRequest request = QueryRequest.newBuilder() + .withQueryId(readRecordsRequest.getQueryId()) + .withQuery("query details") + .withResponseQueue(queueUrl) + .build(); + + String response = 
liveQueryService.broadcastQuery(readRecordsRequest.getTableName().getTableName(), request);
+        logger.info("readWithConstraint: Android broadcast result: " + response);
+
+        readResultsFromSqs(blockSpiller, readRecordsRequest);
+    }
+
+    private void readResultsFromSqs(BlockSpiller blockSpiller, ReadRecordsRequest readRecordsRequest)
+    {
+        final Map<String, Field> fields = new HashMap<>();
+        readRecordsRequest.getSchema().getFields().forEach(next -> fields.put(next.getName(), next));
+
+        ReceiveMessageRequest receiveRequest = new ReceiveMessageRequest()
+                .withQueueUrl(queueUrl)
+                .withWaitTimeSeconds(1);
+
+        ValueSet queryTimeoutValueSet = readRecordsRequest.getConstraints().getSummary().get(androidTable.getQueryTimeout());
+        ValueSet minResultsValueSet = readRecordsRequest.getConstraints().getSummary().get(androidTable.getQueryMinResultsField());
+
+        long maxWaitTime = queryTimeoutValueSet != null && queryTimeoutValueSet.isSingleValue() ?
+                (long) queryTimeoutValueSet.getSingleValue() : Long.parseLong(System.getenv(MAX_WAIT_TIME));
+        long minResults = minResultsValueSet != null && minResultsValueSet.isSingleValue() ?
+                (long) minResultsValueSet.getSingleValue() : Long.parseLong(System.getenv(MIN_RESULTS));
+
+        logger.info("readResultsFromSqs: using timeout of " + maxWaitTime + " ms and min_results of " + minResults);
+
+        long startTime = System.currentTimeMillis();
+        long numResults = 0;
+        ReceiveMessageResult receiveMessageResult;
+        List<DeleteMessageBatchRequestEntry> msgsToAck = new ArrayList<>();
+        do {
+            receiveMessageResult = amazonSQS.receiveMessage(receiveRequest);
+            //SQS's Message class is referenced by its fully qualified name since it is not imported above.
+            for (com.amazonaws.services.sqs.model.Message next : receiveMessageResult.getMessages()) {
+                try {
+                    QueryResponse queryResponse = mapper.readValue(next.getBody(), QueryResponse.class);
+                    if (queryResponse.getQueryId().equals(readRecordsRequest.getQueryId())) {
+                        numResults++;
+                        msgsToAck.add(new DeleteMessageBatchRequestEntry().withReceiptHandle(next.getReceiptHandle()).withId(next.getMessageId()));
+                        blockSpiller.writeRows((Block block, int rowNum) -> {
+                            int newRows = 0;
+
+                            for (String nextVal : queryResponse.getValues()) {
+                                boolean matches = true;
+                                int effectiveRow = newRows + rowNum;
+
+                                matches &= block.offerValue(androidTable.getDeviceIdField(), effectiveRow, queryResponse.getDeviceId());
+                                matches &= block.offerValue(androidTable.getNameField(), effectiveRow, queryResponse.getName());
+                                matches &= block.offerValue(androidTable.getEchoValueField(), effectiveRow, queryResponse.getEchoValue());
+                                matches &= block.offerValue(androidTable.getLastUpdatedField(), effectiveRow, System.currentTimeMillis());
+                                matches &= block.offerValue(androidTable.getResultField(), effectiveRow, nextVal);
+                                matches &= block.offerValue(androidTable.getScoreField(), effectiveRow, queryResponse.getRandom());
+                                matches &= block.offerValue(androidTable.getQueryMinResultsField(), effectiveRow, minResults);
+                                matches &= block.offerValue(androidTable.getQueryTimeout(), effectiveRow, maxWaitTime);
+
+                                newRows += matches ?
1 : 0; + } + + return newRows; + }); + logger.info("Received matching response " + queryResponse.toString()); + } + } + catch (RuntimeException | IOException ex) { + logger.error("Error processing msg", ex); + } + } + if (!msgsToAck.isEmpty()) { + amazonSQS.deleteMessageBatch(queueUrl, msgsToAck); + msgsToAck.clear(); + } + } + while (System.currentTimeMillis() - startTime < maxWaitTime && (numResults < minResults || receiveMessageResult.getMessages().size() > 0)); + } +} diff --git a/athena-android/src/main/java/com/amazonaws/athena/connectors/android/LiveQueryService.java b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/LiveQueryService.java new file mode 100644 index 0000000000..2e287c2092 --- /dev/null +++ b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/LiveQueryService.java @@ -0,0 +1,68 @@ +/*- + * #%L + * athena-android + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.android; + +import com.fasterxml.jackson.core.JsonProcessingException; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.google.auth.oauth2.GoogleCredentials; +import com.google.firebase.FirebaseApp; +import com.google.firebase.FirebaseOptions; +import com.google.firebase.messaging.FirebaseMessaging; +import com.google.firebase.messaging.FirebaseMessagingException; +import com.google.firebase.messaging.Message; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.InputStream; + +public class LiveQueryService +{ + private static final String PUSH_MSG_FIELD = "query_request"; + private final ObjectMapper mapper = new ObjectMapper(); + + public LiveQueryService(String authConfig, String databaseUrl) + { + try { + InputStream inputStream = new ByteArrayInputStream(authConfig.getBytes()); + FirebaseOptions options = new FirebaseOptions.Builder() + .setCredentials(GoogleCredentials.fromStream(inputStream)) + .setDatabaseUrl(databaseUrl) + .build(); + + FirebaseApp.initializeApp(options); + } + catch (IOException ex) { + throw new RuntimeException(ex); + } + } + + public String broadcastQuery(String topic, QueryRequest query) + { + try { + Message.Builder messageBuilder = Message.builder(); + messageBuilder.putData(PUSH_MSG_FIELD, mapper.writeValueAsString(query)); + messageBuilder.setTopic(topic); + return FirebaseMessaging.getInstance().send(messageBuilder.build()); + } + catch (JsonProcessingException | FirebaseMessagingException ex) { + throw new RuntimeException(ex); + } + } +} diff --git a/athena-android/src/main/java/com/amazonaws/athena/connectors/android/QueryRequest.java b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/QueryRequest.java new file mode 100644 index 0000000000..01b8e8ae3f --- /dev/null +++ b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/QueryRequest.java @@ -0,0 +1,132 @@ +/*- + * #%L + * athena-android + * %% + * Copyright (C) 2019 Amazon Web Services + * 
%% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.android; + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +public class QueryRequest +{ + private final String queryId; + private final String query; + private final String echoValue; + private final String responseQueue; + + @JsonCreator + public QueryRequest(@JsonProperty("queryId") String queryId, + @JsonProperty("query") String query, + @JsonProperty("echoValue") String echoValue, + @JsonProperty("responseQueue") String responseQueue) + { + this.queryId = queryId; + this.query = query; + this.echoValue = echoValue; + this.responseQueue = responseQueue; + } + + private QueryRequest(Builder builder) + { + queryId = builder.queryId; + query = builder.query; + echoValue = builder.echoValue; + responseQueue = builder.responseQueue; + } + + public static Builder newBuilder() + { + return new Builder(); + } + + @JsonProperty("query") + public String getQuery() + { + return query; + } + + @JsonProperty("echoValue") + public String getEchoValue() + { + return echoValue; + } + + @JsonProperty("queryId") + public String getQueryId() + { + return queryId; + } + + @JsonProperty("responseQueue") + public String getResponseQueue() + { + return responseQueue; + } + + @Override + public String toString() + { + return "QueryRequest{" + + "queryId='" + queryId + '\'' + + ", query='" + query + '\'' + + ", echoValue='" + echoValue + '\'' + + ", responseQueue='" + responseQueue + '\'' + + '}'; + } + + public static final class Builder + { + private String queryId; + private String query; + private String echoValue; + private String responseQueue; + + private Builder() + { + } + + public Builder withQuery(String val) + { + query = val; + return this; + } + + public Builder withEchoValue(String val) + { + echoValue = val; + return this; + } + + public Builder withResponseQueue(String val) + { + responseQueue = val; + return this; + } + + public Builder withQueryId(String val) + { + queryId = val; + return this; + } + + public QueryRequest build() + { + return new QueryRequest(this); + } + } +} diff --git a/athena-android/src/main/java/com/amazonaws/athena/connectors/android/QueryResponse.java b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/QueryResponse.java new file mode 100644 index 0000000000..b556d024e2 --- /dev/null +++ b/athena-android/src/main/java/com/amazonaws/athena/connectors/android/QueryResponse.java @@ -0,0 +1,170 @@ +/*- + * #%L + * athena-android + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.android;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import java.util.List;
+
+public class QueryResponse
+{
+    private final String deviceId;
+    private final String queryId;
+    private final String name;
+    private final String echoValue;
+    private final List<String> values;
+    private final int random;
+
+    @JsonCreator
+    public QueryResponse(@JsonProperty("deviceId") String deviceId,
+            @JsonProperty("queryId") String queryId,
+            @JsonProperty("name") String name,
+            @JsonProperty("echoValue") String echoValue,
+            @JsonProperty("values") List<String> values,
+            @JsonProperty("random") int random)
+    {
+        this.deviceId = deviceId;
+        this.queryId = queryId;
+        this.name = name;
+        this.echoValue = echoValue;
+        this.values = values;
+        this.random = random;
+    }
+
+    private QueryResponse(Builder builder)
+    {
+        queryId = builder.queryId;
+        deviceId = builder.deviceId;
+        name = builder.name;
+        echoValue = builder.echoValue;
+        values = builder.values;
+        random = builder.random;
+    }
+
+    public static Builder newBuilder()
+    {
+        return new Builder();
+    }
+
+    @JsonProperty("deviceId")
+    public String getDeviceId()
+    {
+        return deviceId;
+    }
+
+    @JsonProperty("queryId")
+    public String getQueryId()
+    {
+        return queryId;
+    }
+
+    @JsonProperty("name")
+    public String getName()
+    {
+        return name;
+    }
+
+    @JsonProperty("echoValue")
+    public String getEchoValue()
+    {
+        return echoValue;
+    }
+
+    @JsonProperty("values")
+    public List<String> getValues()
+    {
+        return values;
+    }
+
+    @JsonProperty("random")
+    public int getRandom()
+    {
+        return random;
+    }
+
+    @Override
+    public String toString()
+    {
+        return "QueryResponse{" +
+                "deviceId='" + deviceId + '\'' +
+                ", queryId='" + queryId + '\'' +
+                ", name='" + name + '\'' +
+                ", echoValue='" + echoValue + '\'' +
+                ", values=" + values +
+                ", random=" + random +
+                '}';
+    }
+
+    public static final class Builder
+    {
+        private String deviceId;
+        private String queryId;
+        private String name;
+        private String echoValue;
+        private List<String> values;
+        private int random;
+
+        private Builder()
+        {
+        }
+
+        public Builder withDeviceId(String val)
+        {
+            deviceId = val;
+            return this;
+        }
+
+        public Builder withQueryId(String val)
+        {
+            queryId = val;
+            return this;
+        }
+
+        public Builder withEchoValue(String val)
+        {
+            echoValue = val;
+            return this;
+        }
+
+        public Builder withName(String val)
+        {
+            name = val;
+            return this;
+        }
+
+        public Builder withValues(List<String> val)
+        {
+            values = val;
+            return this;
+        }
+
+        public Builder withRandom(int val)
+        {
+            random = val;
+            return this;
+        }
+
+        public QueryResponse build()
+        {
+            return new QueryResponse(this);
+        }
+    }
+}
diff --git a/athena-aws-cmdb/LICENSE.txt b/athena-aws-cmdb/LICENSE.txt
new file mode 100644
index 0000000000..418de4c108
--- /dev/null
+++ b/athena-aws-cmdb/LICENSE.txt
@@ -0,0 +1,174 @@
+Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. 
This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. \ No newline at end of file diff --git a/athena-aws-cmdb/README.md b/athena-aws-cmdb/README.md new file mode 100644 index 0000000000..1cfdfa2925 --- /dev/null +++ b/athena-aws-cmdb/README.md @@ -0,0 +1,66 @@ +# Amazon Athena AWS CMDB Connector + +This connector enables Amazon Athena to communicate with various AWS Services, making your AWS Resource inventory accessible via SQL. + +## Usage + +### Parameters + +The Athena AWS CMDB Connector provides several configuration options via Lambda environment variables. More detail on the available parameters can be found below. + +1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket) +2. **spill_prefix** - (Optional) Defaults to sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/Hours. +3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. 
Setting a KMS Key ID allows your Lambda function to use KMS for key generation, giving you a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331)
+4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM, either with a randomly generated key or using KMS to generate keys. Setting this to True disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False)
+5. **default_ec2_image_owner** - (Optional) When set, this controls the default EC2 image (aka AMI) owner used to filter AMIs. When this isn't set and your query against the ec2_images table does not include a filter for owner, you will get a large number of results, since the response will include all public images.
+
+### Databases & Tables
+
+The Athena AWS CMDB Connector makes the following databases and tables available for querying your AWS Resource Inventory. For more information on the columns available in each table, try running a 'describe database.table' from the Athena Console or API.
+
+1. **ec2** - This database contains EC2 related resources, including:
+ * **ebs_volumes** - Contains details of your EBS volumes.
+ * **ec2_instances** - Contains details of your EC2 Instances.
+ * **ec2_images** - Contains details of your EC2 Instance images.
+ * **routing_tables** - Contains details of your VPC Routing Tables.
+ * **security_groups** - Contains details of your Security Groups.
+ * **subnets** - Contains details of your VPC Subnets.
+ * **vpcs** - Contains details of your VPCs.
+2. **emr** - This database contains EMR related resources, including:
+ * **emr_clusters** - Contains details of your EMR Clusters.
+3. **rds** - This database contains RDS related resources, including:
+ * **rds_instances** - Contains details of your RDS Instances.
+4. **s3** - This database contains S3 related resources, including:
+ * **buckets** - Contains details of your S3 buckets.
+ * **objects** - Contains details of your S3 Objects (excludes their contents).
+
+### Required Permissions
+
+Review the "Policies" section of the athena-aws-cmdb.yaml file for full details on the IAM Policies required by this connector. A brief summary is below.
+
+1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3.
+1. EC2 Describe - The connector uses this access to describe your EC2 Instances, Security Groups, VPCs, EBS Volumes, etc.
+1. EMR Describe / List - The connector uses this access to describe your EMR Clusters.
+1. RDS Describe - The connector uses this access to describe your RDS Instances.
+1. S3 List - The connector uses this access to list your buckets and objects.
+1. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated.
+
+### Deploying The Connector
+
+To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-aws-cmdb dir, run `mvn clean install`.
+3. From the athena-aws-cmdb dir, run `../tools/publish.sh S3_BUCKET_NAME athena-aws-cmdb` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET_NAME in the command is where a copy of the connector's code will be stored for the Serverless Application Repository to retrieve it. This allows users with permission to deploy instances of the connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo)
+4. Try running a query like the one below in Athena, replacing `<function_name>` with the catalog name you gave the connector when deploying it:
+```sql
+select * from "lambda:<function_name>".ec2.ec2_instances limit 100
+```
+
+## Performance
+
+The Athena AWS CMDB Connector does not currently support parallel scans. Predicate pushdown is performed within the Lambda function and, where possible, partial predicates are pushed to the services being queried. For example, a query for the details of a specific EC2 Instance will turn into a targeted describe of that specific instance id against the EC2 API.
+
+## License
+
+This project is licensed under the Apache-2.0 License.
\ No newline at end of file
diff --git a/athena-aws-cmdb/athena-aws-cmdb.yaml b/athena-aws-cmdb/athena-aws-cmdb.yaml new file mode 100644 index 0000000000..76225a36b3 --- /dev/null +++ b/athena-aws-cmdb/athena-aws-cmdb.yaml @@ -0,0 +1,73 @@
+Transform: 'AWS::Serverless-2016-10-31'
+Metadata:
+  'AWS::ServerlessRepo::Application':
+    Name: AthenaAwsCmdbConnector
+    Description: 'This connector enables Amazon Athena to communicate with various AWS Services, making your resource inventories accessible via SQL.'
+    Author: 'Amazon Athena'
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels:
+      - athena-federation
+    HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+Parameters:
+  AthenaCatalogName:
+    Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.'
+    Type: String
+  SpillBucket:
+    Description: 'The bucket where this function can spill data.'
+    Type: String
+  SpillPrefix:
+    Description: 'The bucket prefix where this function can spill large responses.'
+    Type: String
+    Default: athena-spill
+  LambdaTimeout:
+    Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)'
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: 'Lambda memory in MB (min 128 - 3008 max).'
+    Default: 3008
+    Type: Number
+  DisableSpillEncryption:
+    Description: "WARNING: If set to 'true' encryption for spilled data is disabled."
+    Default: 'false'
+    Type: String
+Resources:
+  ConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      Environment:
+        Variables:
+          disable_spill_encryption: !Ref DisableSpillEncryption
+          spill_bucket: !Ref SpillBucket
+          spill_prefix: !Ref SpillPrefix
+      FunctionName: !Ref AthenaCatalogName
+      Handler: "com.amazonaws.athena.connectors.aws.cmdb.AwsCmdbCompositeHandler"
+      CodeUri: "./target/athena-aws-cmdb-1.0.jar"
+      Description: "Enables Amazon Athena to communicate with various AWS Services, making your resource inventories accessible via SQL."
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
+      Policies:
+        - Statement:
+            - Action:
+                - autoscaling:Describe*
+                - elasticloadbalancing:Describe*
+                - ec2:Describe*
+                - elasticmapreduce:Describe*
+                - elasticmapreduce:List*
+                - rds:Describe*
+                - rds:ListTagsForResource
+                - athena:GetQueryExecution
+                - s3:ListAllMyBuckets
+                - s3:ListBucket
+              Effect: Allow
+              Resource: '*'
+          Version: '2012-10-17'
+        #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy
+        #with one that is more restrictive and can only 'put' but not read, delete, or overwrite files.
+        - S3CrudPolicy:
+            BucketName: !Ref SpillBucket
\ No newline at end of file
diff --git a/athena-aws-cmdb/pom.xml b/athena-aws-cmdb/pom.xml new file mode 100644 index 0000000000..981a2aac8b --- /dev/null +++ b/athena-aws-cmdb/pom.xml @@ -0,0 +1,66 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-aws-cmdb</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-ec2</artifactId>
+            <version>${aws-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-emr</artifactId>
+            <version>${aws-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-rds</artifactId>
+            <version>${aws-sdk.version}</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbCompositeHandler.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbCompositeHandler.java new file mode 100644 index 0000000000..8036fd1e31 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbCompositeHandler.java @@ -0,0 +1,35 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb;
+
+import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler;
+
+/**
+ * Boilerplate composite handler that allows us to use a single Lambda function for both
+ * Metadata and Data. In this case we just compose AwsCmdbMetadataHandler and AwsCmdbRecordHandler.
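+ * <p>
+ * The SAM template above (athena-aws-cmdb.yaml) wires Lambda to this class via its Handler property, so a
+ * single function serves both request types:
+ * <pre>
+ * Handler: "com.amazonaws.athena.connectors.aws.cmdb.AwsCmdbCompositeHandler"
+ * </pre>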
+ */ +public class AwsCmdbCompositeHandler + extends CompositeHandler +{ + public AwsCmdbCompositeHandler() + { + super(new AwsCmdbMetadataHandler(), new AwsCmdbRecordHandler()); + } +} diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbMetadataHandler.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbMetadataHandler.java new file mode 100644 index 0000000000..78cd23ddc2 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbMetadataHandler.java @@ -0,0 +1,176 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKey; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; + +import java.util.List; +import java.util.Map; + +/** + * Handles metadata requests for the Athena AWS CMDB Connector. + *
+ * For more detail, please see the module's README.md. Some notable characteristics of this class include: + *
+ * 1. Maps AWS Resources to SQL tables using a set of TableProviders constructed from a TableProviderFactory.
+ * 2. This class is largely a mux that delegates requests to the appropriate TableProvider based on the
+ * requested TableName.
+ * 3. Provides a schema and table list by scanning all loaded TableProviders.
+ */
+public class AwsCmdbMetadataHandler
+        extends MetadataHandler
+{
+    private static final String SOURCE_TYPE = "cmdb";
+    //Map of schema name to list of TableNames generated by scanning all loaded TableProviders.
+    private Map<String, List<TableName>> schemas;
+    //Map of available fully qualified TableNames to their respective TableProviders.
+    private Map<TableName, TableProvider> tableProviders;
+
+    public AwsCmdbMetadataHandler()
+    {
+        super(SOURCE_TYPE);
+        TableProviderFactory tableProviderFactory = new TableProviderFactory();
+        schemas = tableProviderFactory.getSchemas();
+        tableProviders = tableProviderFactory.getTableProviders();
+    }
+
+    @VisibleForTesting
+    protected AwsCmdbMetadataHandler(TableProviderFactory tableProviderFactory,
+            EncryptionKeyFactory keyFactory,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            String spillBucket,
+            String spillPrefix)
+    {
+        super(keyFactory, secretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix);
+        schemas = tableProviderFactory.getSchemas();
+        tableProviders = tableProviderFactory.getTableProviders();
+    }
+
+    /**
+     * Returns the list of supported schemas discovered from the loaded TableProvider scan.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest listSchemasRequest)
+    {
+        return new ListSchemasResponse(listSchemasRequest.getCatalogName(), schemas.keySet());
+    }
+
+    /**
+     * Returns the list of supported tables on the requested schema discovered from the loaded TableProvider scan.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest listTablesRequest)
+    {
+        return new ListTablesResponse(listTablesRequest.getCatalogName(), schemas.get(listTablesRequest.getSchemaName()));
+    }
+
+    /**
+     * Delegates to the TableProvider that is registered for the requested table.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        TableProvider tableProvider = tableProviders.get(getTableRequest.getTableName());
+        if (tableProvider == null) {
+            throw new RuntimeException("Unknown table " + getTableRequest.getTableName());
+        }
+        return tableProvider.getTable(blockAllocator, getTableRequest);
+    }
+
+    /**
+     * Delegates to the TableProvider that is registered for the requested table.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request)
+    {
+        TableProvider tableProvider = tableProviders.get(request.getTableName());
+        if (tableProvider == null) {
+            throw new RuntimeException("Unknown table " + request.getTableName());
+        }
+        tableProvider.enhancePartitionSchema(partitionSchemaBuilder, request);
+    }
+
+    /**
+     * Delegates to the TableProvider that is registered for the requested table.
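+     * Note that TableProvider.getPartitions defaults to a single partition, since many of the underlying
+     * AWS APIs used by this connector do not lend themselves to parallel scans (see TableProvider).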
+ * + * @see MetadataHandler + */ + @Override + public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception + { + TableProvider tableProvider = tableProviders.get(request.getTableName()); + if (tableProvider == null) { + throw new RuntimeException("Unknown table " + request.getTableName()); + } + tableProvider.getPartitions(blockWriter, request); + } + + /** + * Delegates to the TableProvider that is registered for the requested table. + * + * @see MetadataHandler + */ + @Override + public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest getSplitsRequest) + { + TableProvider tableProvider = tableProviders.get(getSplitsRequest.getTableName()); + if (tableProvider == null) { + throw new RuntimeException("Unknown table " + getSplitsRequest.getTableName()); + } + + //Every split needs a unique spill location. + SpillLocation spillLocation = makeSpillLocation(getSplitsRequest); + EncryptionKey encryptionKey = makeEncryptionKey(); + Split split = Split.newBuilder(spillLocation, encryptionKey).build(); + return new GetSplitsResponse(getSplitsRequest.getCatalogName(), split); + } +} diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbRecordHandler.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbRecordHandler.java new file mode 100644 index 0000000000..ea78f6d996 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbRecordHandler.java @@ -0,0 +1,76 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; + +import java.util.Map; + +/** + * Handles record requests for the Athena AWS CMDB Connector. + *
+ * For more detail, please see the module's README.md. Some notable characteristics of this class include: + *
+ * 1. Maps AWS Resources to SQL tables using a set of TableProviders constructed from a TableProviderFactory.
+ * 2. This class is largely a mux that delegates requests to the appropriate TableProvider based on the
+ * requested TableName.
+ */
+public class AwsCmdbRecordHandler
+        extends RecordHandler
+{
+    private static final String SOURCE_TYPE = "cmdb";
+
+    //Map of available fully qualified TableNames to their respective TableProviders.
+    private Map<TableName, TableProvider> tableProviders;
+
+    public AwsCmdbRecordHandler()
+    {
+        super(SOURCE_TYPE);
+        tableProviders = new TableProviderFactory().getTableProviders();
+    }
+
+    @VisibleForTesting
+    protected AwsCmdbRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, TableProviderFactory tableProviderFactory)
+    {
+        super(amazonS3, secretsManager, athena, SOURCE_TYPE);
+        tableProviders = tableProviderFactory.getTableProviders();
+    }
+
+    /**
+     * Delegates to the TableProvider that is registered for the requested table.
+     *
+     * @see RecordHandler
+     */
+    @Override
+    protected void readWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest readRecordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        TableProvider tableProvider = tableProviders.get(readRecordsRequest.getTableName());
+        tableProvider.readWithConstraint(blockSpiller, readRecordsRequest, queryStatusChecker);
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/TableProviderFactory.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/TableProviderFactory.java new file mode 100644 index 0000000000..11259c0c2a --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/TableProviderFactory.java @@ -0,0 +1,123 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb;
+
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.EmrClusterTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.RdsTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.EbsTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.Ec2TableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.ImagesTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.RouteTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.SecurityGroupsTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.SubnetTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.ec2.VpcTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.s3.S3BucketsTableProvider;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.s3.S3ObjectsTableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.AmazonEC2ClientBuilder;
+import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
+import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduceClientBuilder;
+import com.amazonaws.services.rds.AmazonRDS;
+import com.amazonaws.services.rds.AmazonRDSClientBuilder;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.AmazonS3ClientBuilder;
+import org.apache.arrow.util.VisibleForTesting;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/**
+ * Acts as a factory for all supported TableProviders and also a source of meta-data about the
+ * schemas and tables that the loaded TableProviders support.
+ */
+public class TableProviderFactory
+{
+    private Map<String, List<TableName>> schemas = new HashMap<>();
+    private Map<TableName, TableProvider> tableProviders = new HashMap<>();
+
+    public TableProviderFactory()
+    {
+        this(AmazonEC2ClientBuilder.standard().build(),
+                AmazonElasticMapReduceClientBuilder.standard().build(),
+                AmazonRDSClientBuilder.standard().build(),
+                AmazonS3ClientBuilder.standard().build());
+    }
+
+    @VisibleForTesting
+    protected TableProviderFactory(AmazonEC2 ec2, AmazonElasticMapReduce emr, AmazonRDS rds, AmazonS3 amazonS3)
+    {
+        addProvider(new Ec2TableProvider(ec2));
+        addProvider(new EbsTableProvider(ec2));
+        addProvider(new VpcTableProvider(ec2));
+        addProvider(new SecurityGroupsTableProvider(ec2));
+        addProvider(new RouteTableProvider(ec2));
+        addProvider(new SubnetTableProvider(ec2));
+        addProvider(new ImagesTableProvider(ec2));
+        addProvider(new EmrClusterTableProvider(emr));
+        addProvider(new RdsTableProvider(rds));
+        addProvider(new S3ObjectsTableProvider(amazonS3));
+        addProvider(new S3BucketsTableProvider(amazonS3));
+    }
+
+    /**
+     * Adds a new TableProvider to the loaded set, if and only if, no existing TableProvider is known
+     * for the fully qualified table represented by the new TableProvider we are attempting to add.
+     *
+     * @param provider The TableProvider to add.
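+     * @throws RuntimeException if a TableProvider is already registered for the same fully qualified table name.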
+     */
+    private void addProvider(TableProvider provider)
+    {
+        if (tableProviders.putIfAbsent(provider.getTableName(), provider) != null) {
+            throw new RuntimeException("Duplicate provider for " + provider.getTableName());
+        }
+
+        List<TableName> tables = schemas.get(provider.getSchema());
+        if (tables == null) {
+            tables = new ArrayList<>();
+            schemas.put(provider.getSchema(), tables);
+        }
+        tables.add(provider.getTableName());
+    }
+
+    /**
+     * Provides access to the mapping of loaded TableProviders by their fully qualified table names.
+     *
+     * @return Map of TableNames to their corresponding TableProvider.
+     */
+    public Map<TableName, TableProvider> getTableProviders()
+    {
+        return tableProviders;
+    }
+
+    /**
+     * Provides access to the mapping of TableNames for each schema name discovered during the TableProvider
+     * scan.
+     *
+     * @return Map of schema names to their corresponding list of fully qualified TableNames.
+     */
+    public Map<String, List<TableName>> getSchemas()
+    {
+        return schemas;
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/EmrClusterTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/EmrClusterTableProvider.java new file mode 100644 index 0000000000..ee3b15da91 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/EmrClusterTableProvider.java @@ -0,0 +1,205 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
+import com.amazonaws.services.elasticmapreduce.model.Cluster;
+import com.amazonaws.services.elasticmapreduce.model.ClusterSummary;
+import com.amazonaws.services.elasticmapreduce.model.DescribeClusterRequest;
+import com.amazonaws.services.elasticmapreduce.model.DescribeClusterResult;
+import com.amazonaws.services.elasticmapreduce.model.ListClustersRequest;
+import com.amazonaws.services.elasticmapreduce.model.ListClustersResult;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Maps your EMR Clusters to a table.
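+ * <p>
+ * For example, assuming the connector was deployed with the hypothetical catalog/function name 'cmdb',
+ * the table can be queried from Athena like so:
+ * <pre>{@code
+ * SELECT name, state FROM "lambda:cmdb".emr.emr_clusters LIMIT 10;
+ * }</pre>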
+ */
+public class EmrClusterTableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonElasticMapReduce emr;
+
+    public EmrClusterTableProvider(AmazonElasticMapReduce emr)
+    {
+        this.emr = emr;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "emr";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "emr_clusters");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls ListClusters and DescribeCluster on the AWS EMR Client returning all clusters that match the supplied
+     * predicate and attempting to push down certain predicates (namely queries for a specific cluster) to EMR.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        boolean done = false;
+        ListClustersRequest request = new ListClustersRequest();
+
+        while (!done) {
+            ListClustersResult response = emr.listClusters(request);
+
+            for (ClusterSummary next : response.getClusters()) {
+                Cluster cluster = null;
+                if (!next.getStatus().getState().toLowerCase().contains("terminated")) {
+                    DescribeClusterResult clusterResponse = emr.describeCluster(new DescribeClusterRequest().withClusterId(next.getId()));
+                    cluster = clusterResponse.getCluster();
+                }
+                clusterToRow(next, cluster, spiller);
+            }
+
+            request.setMarker(response.getMarker());
+
+            if (response.getMarker() == null || !queryStatusChecker.isQueryRunning()) {
+                done = true;
+            }
+        }
+    }
+
+    /**
+     * Maps an EMR Cluster into a row in our Apache Arrow response block(s).
+     *
+     * @param clusterSummary The ClusterSummary for the provided Cluster.
+     * @param cluster The EMR Cluster to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
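+     * Each block.offerValue call below returns false if the offered value fails the query's constraints
+     * for that column, in which case the writeRows lambda returns 0 and the row is omitted from the response.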
+     */
+    private void clusterToRow(ClusterSummary clusterSummary,
+            Cluster cluster,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("id", row, clusterSummary.getId());
+            matched &= block.offerValue("name", row, clusterSummary.getName());
+            matched &= block.offerValue("instance_hours", row, clusterSummary.getNormalizedInstanceHours());
+            matched &= block.offerValue("state", row, clusterSummary.getStatus().getState());
+            matched &= block.offerValue("state_code", row, clusterSummary.getStatus().getStateChangeReason().getCode());
+            matched &= block.offerValue("state_msg", row, clusterSummary.getStatus().getStateChangeReason().getMessage());
+
+            if (cluster != null) {
+                matched &= block.offerValue("autoscaling_role", row, cluster.getAutoScalingRole());
+                matched &= block.offerValue("custom_ami", row, cluster.getCustomAmiId());
+                matched &= block.offerValue("instance_collection_type", row, cluster.getInstanceCollectionType());
+                matched &= block.offerValue("log_uri", row, cluster.getLogUri());
+                matched &= block.offerValue("master_public_dns", row, cluster.getMasterPublicDnsName());
+                matched &= block.offerValue("release_label", row, cluster.getReleaseLabel());
+                matched &= block.offerValue("running_ami", row, cluster.getRunningAmiVersion());
+                matched &= block.offerValue("scale_down_behavior", row, cluster.getScaleDownBehavior());
+                matched &= block.offerValue("service_role", row, cluster.getServiceRole());
+
+                List<String> applications = cluster.getApplications().stream()
+                        .map(next -> next.getName() + ":" + next.getVersion()).collect(Collectors.toList());
+                matched &= block.offerComplexValue("applications", row, FieldResolver.DEFAULT, applications);
+
+                List<String> tags = cluster.getTags().stream()
+                        .map(next -> next.getKey() + ":" + next.getValue()).collect(Collectors.toList());
+                matched &= block.offerComplexValue("tags", row, FieldResolver.DEFAULT, tags);
+            }
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("id")
+                .addStringField("name")
+                .addIntField("instance_hours")
+                .addStringField("state")
+                .addStringField("state_code")
+                .addStringField("state_msg")
+                .addStringField("autoscaling_role")
+                .addStringField("custom_ami")
+                .addStringField("instance_collection_type")
+                .addStringField("log_uri")
+                .addStringField("master_public_dns")
+                .addStringField("release_label")
+                .addStringField("running_ami")
+                .addStringField("scale_down_behavior")
+                .addStringField("service_role")
+                .addListField("applications", Types.MinorType.VARCHAR.getType())
+                .addListField("tags", Types.MinorType.VARCHAR.getType())
+                .addMetadata("id", "Cluster Id")
+                .addMetadata("name", "Cluster Name")
+                .addMetadata("state", "State of the cluster.")
+                .addMetadata("state_code", "Code associated with the state of the cluster.")
+                .addMetadata("state_msg", "Message associated with the state of the cluster.")
+                .addMetadata("autoscaling_role", "AutoScaling role used by the cluster.")
+                .addMetadata("custom_ami", "Custom AMI used by the cluster (if any)")
+                .addMetadata("instance_collection_type", "Instance collection type used by the cluster.")
+                .addMetadata("log_uri", "URI where debug logs can be found for the cluster.")
+                .addMetadata("master_public_dns", "Public DNS name of the master node.")
+                .addMetadata("release_label", "EMR release label the cluster is running.")
+                .addMetadata("running_ami", "AMI the cluster is running.")
+                .addMetadata("scale_down_behavior", "Scale down behavior of the cluster.")
+                .addMetadata("applications", "The EMR applications installed on the cluster.")
+                .addMetadata("tags", "Tags associated with the cluster.")
+                .build();
    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/RdsTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/RdsTableProvider.java new file mode 100644 index 0000000000..c8338bbeb8 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/RdsTableProvider.java @@ -0,0 +1,378 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldBuilder;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.services.rds.AmazonRDS;
+import com.amazonaws.services.rds.model.DBInstance;
+import com.amazonaws.services.rds.model.DBInstanceStatusInfo;
+import com.amazonaws.services.rds.model.DBParameterGroupStatus;
+import com.amazonaws.services.rds.model.DBSecurityGroupMembership;
+import com.amazonaws.services.rds.model.DBSubnetGroup;
+import com.amazonaws.services.rds.model.DescribeDBInstancesRequest;
+import com.amazonaws.services.rds.model.DescribeDBInstancesResult;
+import com.amazonaws.services.rds.model.DomainMembership;
+import com.amazonaws.services.rds.model.Endpoint;
+import com.amazonaws.services.rds.model.Subnet;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.stream.Collectors;
+
+/**
+ * Maps your RDS instances to a table.
+ */
+public class RdsTableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonRDS rds;
+
+    public RdsTableProvider(AmazonRDS rds)
+    {
+        this.rds = rds;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "rds";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "rds_instances");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeDBInstances on the AWS RDS Client returning all DB Instances that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for a specific DB Instance) to RDS.
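+     * For example, a query of the form shown below (catalog name hypothetical) is turned into a
+     * DescribeDBInstances call scoped to a single DBInstanceIdentifier instead of a full scan:
+     * <pre>{@code
+     * SELECT * FROM "lambda:cmdb".rds.rds_instances WHERE instance_id = 'my-db-instance';
+     * }</pre>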
+ * + * @See TableProvider + */ + @Override + public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) + { + boolean done = false; + DescribeDBInstancesRequest request = new DescribeDBInstancesRequest(); + + ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("instance_id"); + if (idConstraint != null && idConstraint.isSingleValue()) { + request.setDBInstanceIdentifier(idConstraint.getSingleValue().toString()); + } + + while (!done) { + DescribeDBInstancesResult response = rds.describeDBInstances(request); + + for (DBInstance instance : response.getDBInstances()) { + instanceToRow(instance, spiller); + } + + request.setMarker(response.getMarker()); + + if (response.getMarker() == null || !queryStatusChecker.isQueryRunning()) { + done = true; + } + } + } + + /** + * Maps a DBInstance into a row in our Apache Arrow response block(s). + * + * @param instance The DBInstance to map. + * @param spiller The BlockSpiller to use when we want to write a matching row to the response. + * @note The current implementation is rather naive in how it maps fields. It leverages a static + * list of fields that we'd like to provide and then explicitly filters and converts each field. + */ + private void instanceToRow(DBInstance instance, + BlockSpiller spiller) + { + spiller.writeRows((Block block, int row) -> { + boolean matched = true; + + matched &= block.offerValue("instance_id", row, instance.getDBInstanceIdentifier()); + matched &= block.offerValue("primary_az", row, instance.getAvailabilityZone()); + matched &= block.offerValue("storage_gb", row, instance.getAllocatedStorage()); + matched &= block.offerValue("is_encrypted", row, instance.getStorageEncrypted()); + matched &= block.offerValue("storage_type", row, instance.getStorageType()); + matched &= block.offerValue("backup_retention_days", row, instance.getBackupRetentionPeriod()); + matched &= block.offerValue("auto_upgrade", row, instance.getAutoMinorVersionUpgrade()); + matched &= block.offerValue("instance_class", row, instance.getDBInstanceClass()); + matched &= block.offerValue("port", row, instance.getDbInstancePort()); + matched &= block.offerValue("status", row, instance.getDBInstanceStatus()); + matched &= block.offerValue("dbi_resource_id", row, instance.getDbiResourceId()); + matched &= block.offerValue("name", row, instance.getDBName()); + matched &= block.offerValue("engine", row, instance.getEngine()); + matched &= block.offerValue("engine_version", row, instance.getEngineVersion()); + matched &= block.offerValue("license_model", row, instance.getLicenseModel()); + matched &= block.offerValue("secondary_az", row, instance.getSecondaryAvailabilityZone()); + matched &= block.offerValue("backup_window", row, instance.getPreferredBackupWindow()); + matched &= block.offerValue("maint_window", row, instance.getPreferredMaintenanceWindow()); + matched &= block.offerValue("read_replica_source_id", row, instance.getReadReplicaSourceDBInstanceIdentifier()); + matched &= block.offerValue("create_time", row, instance.getInstanceCreateTime()); + matched &= block.offerValue("public_access", row, instance.getPubliclyAccessible()); + matched &= block.offerValue("iops", row, instance.getIops()); + matched &= block.offerValue("is_multi_az", row, instance.getMultiAZ()); + + matched &= block.offerComplexValue("domains", row, (Field field, Object val) -> { + if (field.getName().equals("domain")) { + return ((DomainMembership) val).getDomain(); + } + else if 
(field.getName().equals("fqdn")) { + return ((DomainMembership) val).getFQDN(); + } + else if (field.getName().equals("iam_role")) { + return ((DomainMembership) val).getIAMRoleName(); + } + else if (field.getName().equals("status")) { + return ((DomainMembership) val).getStatus(); + } + + throw new RuntimeException("Unexpected field " + field.getName()); + }, + instance.getDomainMemberships()); + + matched &= block.offerComplexValue("param_groups", row, (Field field, Object val) -> { + if (field.getName().equals("name")) { + return ((DBParameterGroupStatus) val).getDBParameterGroupName(); + } + else if (field.getName().equals("status")) { + return ((DBParameterGroupStatus) val).getParameterApplyStatus(); + } + throw new RuntimeException("Unexpected field " + field.getName()); + }, + instance.getDBParameterGroups()); + + matched &= block.offerComplexValue("db_security_groups", + row, + (Field field, Object val) -> { + if (field.getName().equals("name")) { + return ((DBSecurityGroupMembership) val).getDBSecurityGroupName(); + } + else if (field.getName().equals("status")) { + return ((DBSecurityGroupMembership) val).getStatus(); + } + throw new RuntimeException("Unexpected field " + field.getName()); + }, + instance.getDBSecurityGroups()); + + matched &= block.offerComplexValue("subnet_group", + row, + (Field field, Object val) -> { + if (field.getName().equals("description")) { + return ((DBSubnetGroup) val).getDBSubnetGroupDescription(); + } + else if (field.getName().equals("name")) { + return ((DBSubnetGroup) val).getDBSubnetGroupName(); + } + else if (field.getName().equals("status")) { + return ((DBSubnetGroup) val).getSubnetGroupStatus(); + } + else if (field.getName().equals("vpc")) { + return ((DBSubnetGroup) val).getVpcId(); + } + else if (field.getName().equals("subnets")) { + return ((DBSubnetGroup) val).getSubnets().stream() + .map(next -> next.getSubnetIdentifier()).collect(Collectors.toList()); + } + else if (val instanceof Subnet) { + return ((Subnet) val).getSubnetIdentifier(); + } + throw new RuntimeException("Unexpected field " + field.getName()); + }, + instance.getDBSubnetGroup()); + + matched &= block.offerComplexValue("endpoint", + row, + (Field field, Object val) -> { + if (field.getName().equals("address")) { + return ((Endpoint) val).getAddress(); + } + else if (field.getName().equals("port")) { + return ((Endpoint) val).getPort(); + } + else if (field.getName().equals("zone")) { + return ((Endpoint) val).getHostedZoneId(); + } + throw new RuntimeException("Unexpected field " + field.getName()); + }, + instance.getEndpoint()); + + matched &= block.offerComplexValue("status_infos", + row, + (Field field, Object val) -> { + if (field.getName().equals("message")) { + return ((DBInstanceStatusInfo) val).getMessage(); + } + else if (field.getName().equals("is_normal")) { + return ((DBInstanceStatusInfo) val).getNormal(); + } + else if (field.getName().equals("status")) { + return ((DBInstanceStatusInfo) val).getStatus(); + } + else if (field.getName().equals("type")) { + return ((DBInstanceStatusInfo) val).getStatusType(); + } + throw new RuntimeException("Unexpected field " + field.getName()); + }, + instance.getStatusInfos()); + + return matched ? 1 : 0; + }); + } + + /** + * Defines the schema of this table. 
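+     * <p>
+     * Nested columns (domains, param_groups, db_security_groups, subnet_group, endpoint, status_infos) are
+     * modeled as Arrow LIST and STRUCT fields via FieldBuilder; the field resolvers in instanceToRow above
+     * map each struct child to the corresponding SDK getter by name.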
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("instance_id")
+                .addStringField("primary_az")
+                .addIntField("storage_gb")
+                .addBitField("is_encrypted")
+                .addStringField("storage_type")
+                .addIntField("backup_retention_days")
+                .addBitField("auto_upgrade")
+                .addStringField("instance_class")
+                .addIntField("port")
+                .addStringField("status")
+                .addStringField("dbi_resource_id")
+                .addStringField("name")
+                .addField(
+                        FieldBuilder.newBuilder("domains", new ArrowType.List())
+                                .addField(
+                                        FieldBuilder.newBuilder("domain", Types.MinorType.STRUCT.getType())
+                                                .addStringField("domain")
+                                                .addStringField("fqdn")
+                                                .addStringField("iam_role")
+                                                .addStringField("status")
+                                                .build())
+                                .build())
+                .addStringField("engine")
+                .addStringField("engine_version")
+                .addStringField("license_model")
+                .addStringField("secondary_az")
+                .addStringField("backup_window")
+                .addStringField("maint_window")
+                .addStringField("read_replica_source_id")
+                .addField(
+                        FieldBuilder.newBuilder("param_groups", new ArrowType.List())
+                                .addField(
+                                        FieldBuilder.newBuilder("param_group", Types.MinorType.STRUCT.getType())
+                                                .addStringField("name")
+                                                .addStringField("status")
+                                                .build())
+                                .build())
+                .addField(
+                        FieldBuilder.newBuilder("db_security_groups", new ArrowType.List())
+                                .addField(
+                                        FieldBuilder.newBuilder("db_security_group", Types.MinorType.STRUCT.getType())
+                                                .addStringField("name")
+                                                .addStringField("status")
+                                                .build())
+                                .build())
+                .addStructField("subnet_group")
+                .addChildField("subnet_group", "name", Types.MinorType.VARCHAR.getType())
+                .addChildField("subnet_group", "status", Types.MinorType.VARCHAR.getType())
+                .addChildField("subnet_group", "vpc", Types.MinorType.VARCHAR.getType())
+                .addChildField("subnet_group", FieldBuilder.newBuilder("subnets", Types.MinorType.LIST.getType())
+                        .addStringField("subnets").build())
+                .addField(FieldBuilder.newBuilder("endpoint", Types.MinorType.STRUCT.getType())
+                        .addStringField("address")
+                        .addIntField("port")
+                        .addStringField("zone")
+                        .build())
+                .addField("create_time", Types.MinorType.DATEMILLI.getType())
+                .addBitField("public_access")
+                .addField(
+                        FieldBuilder.newBuilder("status_infos", new ArrowType.List())
+                                .addField(
+                                        FieldBuilder.newBuilder("status_info", Types.MinorType.STRUCT.getType())
+                                                .addStringField("message")
+                                                .addBitField("is_normal")
+                                                .addStringField("status")
+                                                .addStringField("type")
+                                                .build())
+                                .build())
+                .addIntField("iops")
+                .addBitField("is_multi_az")
+                .addMetadata("instance_id", "Database Instance Id")
+                .addMetadata("primary_az", "The primary az for the database instance")
+                .addMetadata("storage_gb", "Total allocated storage for the Database Instances in GB.")
+                .addMetadata("is_encrypted", "True if the database is encrypted.")
+                .addMetadata("storage_type", "The type of storage used by this Database Instance.")
+                .addMetadata("backup_retention_days", "The number of days of backups to keep.")
+                .addMetadata("auto_upgrade", "True if the cluster auto-upgrades minor versions.")
+                .addMetadata("instance_class", "The instance type used by this database.")
+                .addMetadata("port", "Listen port for the database.")
+                .addMetadata("status", "Status of the DB Instance.")
+                .addMetadata("dbi_resource_id", "Unique id for the instance of the database.")
+                .addMetadata("name", "Name of the DB Instance.")
+                .addMetadata("domains", "Active Directory domains to which the DB Instance is associated.")
+                .addMetadata("engine", "The engine type of the DB Instance.")
+                .addMetadata("engine_version", "The engine version of the DB Instance.")
+                .addMetadata("license_model", "The license model of the DB Instance.")
+                .addMetadata("secondary_az", "The secondary AZ of the DB Instance.")
+                .addMetadata("backup_window", "The backup window of the DB Instance.")
+                .addMetadata("maint_window", "The maintenance window of the DB Instance.")
+                .addMetadata("read_replica_source_id", "The read replica source id, if present, of the DB Instance.")
+                .addMetadata("param_groups", "The param groups applied to the DB Instance.")
+                .addMetadata("db_security_groups", "The security groups applied to the DB Instance.")
+                .addMetadata("subnet_group", "The subnets available to the DB Instance.")
+                .addMetadata("endpoint", "The endpoint of the DB Instance.")
+                .addMetadata("create_time", "The create time of the DB Instance.")
+                .addMetadata("public_access", "True if publicly accessible.")
+                .addMetadata("status_infos", "The status info details associated with the DB Instance.")
+                .addMetadata("iops", "The total provisioned IOPs for the DB Instance.")
+                .addMetadata("is_multi_az", "True if the DB Instance is available in multiple AZs.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/TableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/TableProvider.java new file mode 100644 index 0000000000..dd2a3d7a25 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/TableProvider.java @@ -0,0 +1,88 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.BlockWriter;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+
+/**
+ * Defines the functionality required to supply the metadata and data required for the Athena AWS CMDB connector
+ * to allow SQL queries to run over the virtual table.
+ */
+public interface TableProvider
+{
+    /**
+     * The schema name (aka database) that this table provider's table belongs to.
+     *
+     * @return String containing the schema name.
+     */
+    String getSchema();
+
+    /**
+     * The fully qualified name of the table represented by this TableProvider.
+     *
+     * @return The TableName containing the fully qualified name of the Table.
+     */
+    TableName getTableName();
+
+    /**
+     * Provides access to the Schema details of the requested table.
+     *
+     * @See MetadataHandler
+     */
+    GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest);
+
+    /**
+     * Default implementation returns a single partition since many of the TableProviders may not support
+     * parallel scans.
+     *
+     * @See MetadataHandler
+     */
+    default void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request)
+            throws Exception
+    {
+        //NoOp as we do not support partitioning.
+    }
+
+    /**
+     * Default implementation does not enhance the partition results schema.
+     *
+     * @See MetadataHandler
+     */
+    default void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request)
+    {
+        //NoOp as we do not support partitioning or added partition data
+    }
+
+    /**
+     * Effects the requested read against the table, writing result row data using the supplied BlockSpiller.
+     *
+     * @See RecordHandler
+     */
+    void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker);
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/EbsTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/EbsTableProvider.java new file mode 100644 index 0000000000..48b6503757 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/EbsTableProvider.java @@ -0,0 +1,199 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.DescribeVolumesRequest;
+import com.amazonaws.services.ec2.model.DescribeVolumesResult;
+import com.amazonaws.services.ec2.model.Volume;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Maps your EBS volumes to a table.
+ */
+public class EbsTableProvider
+        implements TableProvider
+{
+    private static final Logger logger = LoggerFactory.getLogger(EbsTableProvider.class);
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public EbsTableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "ebs_volumes");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeVolumes on the AWS EC2 Client returning all volumes that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for specific volumes) to EC2.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        boolean done = false;
+        DescribeVolumesRequest request = new DescribeVolumesRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("id");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setVolumeIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+
+        while (!done) {
+            DescribeVolumesResult response = ec2.describeVolumes(request);
+            //Log each page of results once, rather than once per volume.
+            logger.info("readWithConstraint: {}", response);
+
+            for (Volume volume : response.getVolumes()) {
+                instanceToRow(volume, spiller);
+            }
+
+            request.setNextToken(response.getNextToken());
+
+            if (response.getNextToken() == null || !queryStatusChecker.isQueryRunning()) {
+                done = true;
+            }
+        }
+    }
+
+    /**
+     * Maps an EBS Volume into a row in our Apache Arrow response block(s).
+     *
+     * @param volume The EBS Volume to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
+     */
+    private void instanceToRow(Volume volume,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("id", row, volume.getVolumeId());
+            matched &= block.offerValue("type", row, volume.getVolumeType());
+            matched &= block.offerValue("availability_zone", row, volume.getAvailabilityZone());
+            matched &= block.offerValue("created_time", row, volume.getCreateTime());
+            matched &= block.offerValue("is_encrypted", row, volume.getEncrypted());
+            matched &= block.offerValue("kms_key_id", row, volume.getKmsKeyId());
+            matched &= block.offerValue("size", row, volume.getSize());
+            matched &= block.offerValue("iops", row, volume.getIops());
+            matched &= block.offerValue("snapshot_id", row, volume.getSnapshotId());
+            matched &= block.offerValue("state", row, volume.getState());
+
+            if (volume.getAttachments().size() == 1) {
+                matched &= block.offerValue("target", row, volume.getAttachments().get(0).getInstanceId());
+                matched &= block.offerValue("attached_device", row, volume.getAttachments().get(0).getDevice());
+                matched &= block.offerValue("attachment_state", row, volume.getAttachments().get(0).getState());
+                matched &= block.offerValue("attachment_time", row, volume.getAttachments().get(0).getAttachTime());
+            }
+
+            List<String> tags = volume.getTags().stream()
+                    .map(next -> next.getKey() + ":" + next.getValue()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("tags", row, FieldResolver.DEFAULT, tags);
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
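+     *
+     * For reference, a hypothetical query against this table (the "lambda:cmdb" catalog name is only an
+     * illustration of however you registered the connector) might look like:
+     * <pre>
+     *   SELECT id, size, state FROM "lambda:cmdb".ec2.ebs_volumes WHERE id = 'vol-00000000000000000';
+     * </pre>
+     * Only the single-value "id" predicate is pushed down to DescribeVolumes; all other predicates are
+     * applied row by row as fields are offered above.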
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("id")
+                .addStringField("type")
+                .addStringField("target")
+                .addStringField("attached_device")
+                .addStringField("attachment_state")
+                .addField("attachment_time", Types.MinorType.DATEMILLI.getType())
+                .addStringField("availability_zone")
+                .addField("created_time", Types.MinorType.DATEMILLI.getType())
+                .addBitField("is_encrypted")
+                .addStringField("kms_key_id")
+                .addIntField("size")
+                .addIntField("iops")
+                .addStringField("snapshot_id")
+                .addStringField("state")
+                .addListField("tags", Types.MinorType.VARCHAR.getType())
+                .addMetadata("id", "EBS Volume Id")
+                .addMetadata("type", "EBS Volume Type")
+                .addMetadata("target", "EC2 Instance Id that this volume is attached to.")
+                .addMetadata("attached_device", "Device name where this EBS volume is attached.")
+                .addMetadata("attachment_state", "The state of the volume attachment.")
+                .addMetadata("attachment_time", "The time this volume was attached to its target.")
+                .addMetadata("availability_zone", "The AZ that this EBS Volume is in.")
+                .addMetadata("created_time", "The date time that the volume was created.")
+                .addMetadata("is_encrypted", "True if the volume is encrypted with KMS managed key.")
+                .addMetadata("kms_key_id", "The KMS key id used to encrypt this volume.")
+                .addMetadata("size", "The size in GBs of this volume.")
+                .addMetadata("iops", "Provisioned IOPS supported by this volume.")
+                .addMetadata("snapshot_id", "ID of the last snapshot for this volume.")
+                .addMetadata("state", "State of the EBS Volume.")
+                .addMetadata("tags", "Tags associated with the volume.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/Ec2TableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/Ec2TableProvider.java
new file mode 100644
index 0000000000..1bf2f7af49
--- /dev/null
+++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/Ec2TableProvider.java
@@ -0,0 +1,313 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldBuilder;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.DescribeInstancesRequest;
+import com.amazonaws.services.ec2.model.DescribeInstancesResult;
+import com.amazonaws.services.ec2.model.Instance;
+import com.amazonaws.services.ec2.model.InstanceNetworkInterface;
+import com.amazonaws.services.ec2.model.InstanceState;
+import com.amazonaws.services.ec2.model.Reservation;
+import com.amazonaws.services.ec2.model.StateReason;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Maps your EC2 instances to a table.
+ */
+public class Ec2TableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public Ec2TableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "ec2_instances");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeInstances on the AWS EC2 Client returning all instances that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for specific EC2 instances) to EC2.
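+     *
+     * As an illustration (not part of the original contract): a predicate such as
+     * instance_id = 'i-00000000000000000' arrives as a single-value ValueSet and is pushed into the
+     * DescribeInstances call below, while a predicate like instance_type = 't3.micro' is not pushed
+     * down and is instead evaluated per row as fields are offered to the response block.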
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        boolean done = false;
+        DescribeInstancesRequest request = new DescribeInstancesRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("instance_id");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setInstanceIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+
+        while (!done) {
+            DescribeInstancesResult response = ec2.describeInstances(request);
+
+            for (Reservation reservation : response.getReservations()) {
+                for (Instance instance : reservation.getInstances()) {
+                    instanceToRow(instance, spiller);
+                }
+            }
+
+            request.setNextToken(response.getNextToken());
+
+            if (response.getNextToken() == null || !queryStatusChecker.isQueryRunning()) {
+                done = true;
+            }
+        }
+    }
+
+    /**
+     * Maps an EC2 Instance into a row in our Apache Arrow response block(s).
+     *
+     * @param instance The EC2 Instance to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
+     */
+    private void instanceToRow(Instance instance,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("instance_id", row, instance.getInstanceId());
+            matched &= block.offerValue("image_id", row, instance.getImageId());
+            matched &= block.offerValue("instance_type", row, instance.getInstanceType());
+            matched &= block.offerValue("platform", row, instance.getPlatform());
+            matched &= block.offerValue("private_dns_name", row, instance.getPrivateDnsName());
+            matched &= block.offerValue("private_ip_address", row, instance.getPrivateIpAddress());
+            matched &= block.offerValue("public_dns_name", row, instance.getPublicDnsName());
+            matched &= block.offerValue("public_ip_address", row, instance.getPublicIpAddress());
+            matched &= block.offerValue("subnet_id", row, instance.getSubnetId());
+            matched &= block.offerValue("vpc_id", row, instance.getVpcId());
+            matched &= block.offerValue("architecture", row, instance.getArchitecture());
+            matched &= block.offerValue("instance_lifecycle", row, instance.getInstanceLifecycle());
+            matched &= block.offerValue("root_device_name", row, instance.getRootDeviceName());
+            matched &= block.offerValue("root_device_type", row, instance.getRootDeviceType());
+            matched &= block.offerValue("spot_instance_request_id", row, instance.getSpotInstanceRequestId());
+            matched &= block.offerValue("virtualization_type", row, instance.getVirtualizationType());
+            matched &= block.offerValue("key_name", row, instance.getKeyName());
+            matched &= block.offerValue("kernel_id", row, instance.getKernelId());
+            matched &= block.offerValue("capacity_reservation_id", row, instance.getCapacityReservationId());
+            matched &= block.offerValue("launch_time", row, instance.getLaunchTime());
+
+            matched &= block.offerComplexValue("state",
+                    row,
+                    (Field field, Object val) -> {
+                        if (field.getName().equals("name")) {
+                            return ((InstanceState) val).getName();
+                        }
+                        else if (field.getName().equals("code")) {
+                            return ((InstanceState) val).getCode();
+                        }
+                        throw new RuntimeException("Unknown field " + field.getName());
+                    }, instance.getState());
+
+            matched &= block.offerComplexValue("network_interfaces",
block.offerComplexValue("network_interfaces", + row, + (Field field, Object val) -> { + if (field.getName().equals("status")) { + return ((InstanceNetworkInterface) val).getStatus(); + } + else if (field.getName().equals("subnet")) { + return ((InstanceNetworkInterface) val).getSubnetId(); + } + else if (field.getName().equals("vpc")) { + return ((InstanceNetworkInterface) val).getVpcId(); + } + else if (field.getName().equals("mac")) { + return ((InstanceNetworkInterface) val).getMacAddress(); + } + else if (field.getName().equals("private_dns")) { + return ((InstanceNetworkInterface) val).getPrivateDnsName(); + } + else if (field.getName().equals("private_ip")) { + return ((InstanceNetworkInterface) val).getPrivateIpAddress(); + } + else if (field.getName().equals("security_groups")) { + return ((InstanceNetworkInterface) val).getGroups().stream().map(next -> next.getGroupName() + ":" + next.getGroupId()).collect(Collectors.toList()); + } + else if (field.getName().equals("interface_id")) { + return ((InstanceNetworkInterface) val).getNetworkInterfaceId(); + } + + throw new RuntimeException("Unknown field " + field.getName()); + }, instance.getNetworkInterfaces()); + + matched &= block.offerComplexValue("state_reason", row, (Field field, Object val) -> { + if (field.getName().equals("message")) { + return ((StateReason) val).getMessage(); + } + else if (field.getName().equals("code")) { + return ((StateReason) val).getCode(); + } + throw new RuntimeException("Unknown field " + field.getName()); + }, instance.getStateReason()); + + matched &= block.offerValue("ebs_optimized", row, instance.getEbsOptimized()); + + List securityGroups = instance.getSecurityGroups().stream() + .map(next -> next.getGroupId()).collect(Collectors.toList()); + matched &= block.offerComplexValue("security_groups", row, FieldResolver.DEFAULT, securityGroups); + + List securityGroupNames = instance.getSecurityGroups().stream() + .map(next -> next.getGroupName()).collect(Collectors.toList()); + matched &= block.offerComplexValue("security_group_names", row, FieldResolver.DEFAULT, securityGroupNames); + + List ebsVolumes = instance.getBlockDeviceMappings().stream() + .map(next -> next.getEbs().getVolumeId()).collect(Collectors.toList()); + matched &= block.offerComplexValue("ebs_volumes", row, FieldResolver.DEFAULT, ebsVolumes); + + return matched ? 1 : 0; + }); + } + + /** + * Defines the schema of this table. 
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("instance_id")
+                .addStringField("image_id")
+                .addStringField("instance_type")
+                .addStringField("platform")
+                .addStringField("private_dns_name")
+                .addStringField("private_ip_address")
+                .addStringField("public_dns_name")
+                .addStringField("public_ip_address")
+                .addStringField("subnet_id")
+                .addStringField("vpc_id")
+                .addStringField("architecture")
+                .addStringField("instance_lifecycle")
+                .addStringField("root_device_name")
+                .addStringField("root_device_type")
+                .addStringField("spot_instance_request_id")
+                .addStringField("virtualization_type")
+                .addStringField("key_name")
+                .addStringField("kernel_id")
+                .addStringField("capacity_reservation_id")
+                .addField("launch_time", Types.MinorType.DATEMILLI.getType())
+                .addStructField("state")
+                .addChildField("state", "name", Types.MinorType.VARCHAR.getType())
+                .addChildField("state", "code", Types.MinorType.INT.getType())
+                .addStructField("state_reason")
+                .addChildField("state_reason", "message", Types.MinorType.VARCHAR.getType())
+                .addChildField("state_reason", "code", Types.MinorType.VARCHAR.getType())
+
+                //Example of a List of Structs
+                .addField(
+                        FieldBuilder.newBuilder("network_interfaces", new ArrowType.List())
+                                .addField(
+                                        FieldBuilder.newBuilder("interface", Types.MinorType.STRUCT.getType())
+                                                .addStringField("status")
+                                                .addStringField("subnet")
+                                                .addStringField("vpc")
+                                                .addStringField("mac")
+                                                .addStringField("private_dns")
+                                                .addStringField("private_ip")
+                                                .addListField("security_groups", Types.MinorType.VARCHAR.getType())
+                                                .addStringField("interface_id")
+                                                .build())
+                                .build())
+                .addBitField("ebs_optimized")
+                .addListField("security_groups", Types.MinorType.VARCHAR.getType())
+                .addListField("security_group_names", Types.MinorType.VARCHAR.getType())
+                .addListField("ebs_volumes", Types.MinorType.VARCHAR.getType())
+                .addMetadata("instance_id", "EC2 Instance id.")
+                .addMetadata("image_id", "The id of the AMI used to boot the instance.")
+                .addMetadata("instance_type", "The EC2 instance type.")
+                .addMetadata("platform", "The platform of the instance (e.g. Linux)")
+                .addMetadata("private_dns_name", "The private dns name of the instance.")
+                .addMetadata("private_ip_address", "The private ip address of the instance.")
+                .addMetadata("public_dns_name", "The public dns name of the instance.")
+                .addMetadata("public_ip_address", "The public ip address of the instance.")
+                .addMetadata("subnet_id", "The subnet id that the instance was launched in.")
+                .addMetadata("vpc_id", "The id of the VPC that the instance was launched in.")
+                .addMetadata("architecture", "The architecture of the instance (e.g. x86).")
+                .addMetadata("instance_lifecycle", "The lifecycle state of the instance.")
+                .addMetadata("root_device_name", "The name of the root device that the instance booted from.")
+                .addMetadata("root_device_type", "The type of the root device that the instance booted from.")
+                .addMetadata("spot_instance_request_id", "Spot Request ID if the instance was launched via spot.")
+                .addMetadata("virtualization_type", "The type of virtualization used by the instance (e.g. HVM).")
HVM)") + .addMetadata("key_name", "The name of the ec2 instance from the name tag.") + .addMetadata("kernel_id", "The id of the kernel used in the AMI that booted the instance.") + .addMetadata("capacity_reservation_id", "Capacity reservation id that this instance was launched against.") + .addMetadata("launch_time", "The time that the instance was launched at.") + .addMetadata("state", "The state of the ec2 instance.") + .addMetadata("state_reason", "The reason for the 'state' associated with the instance.") + .addMetadata("ebs_optimized", "True if the instance is EBS optimized.") + .addMetadata("network_interfaces", "The list of the network interfaces on the instance.") + .addMetadata("security_groups", "The list of security group (ids) attached to this instance.") + .addMetadata("security_group_names", "The list of security group (names) attached to this instance.") + .addMetadata("ebs_volumes", "The list of ebs volume (ids) attached to this instance.") + .build(); + } +} diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/ImagesTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/ImagesTableProvider.java new file mode 100644 index 0000000000..8c7bc7a4e0 --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/ImagesTableProvider.java @@ -0,0 +1,288 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldBuilder;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.BlockDeviceMapping;
+import com.amazonaws.services.ec2.model.DescribeImagesRequest;
+import com.amazonaws.services.ec2.model.DescribeImagesResult;
+import com.amazonaws.services.ec2.model.EbsBlockDevice;
+import com.amazonaws.services.ec2.model.Image;
+import com.amazonaws.services.ec2.model.Tag;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Maps your EC2 images (aka AMIs) to a table.
+ */
+public class ImagesTableProvider
+        implements TableProvider
+{
+    private static final String DEFAULT_OWNER_ENV = "default_ec2_image_owner";
+    private static final int MAX_IMAGES = 1000;
+    //Sets a default owner filter (when not null) to reduce the number of irrelevant AMIs returned when you do not
+    //query for a specific owner.
+    private static final String DEFAULT_OWNER = System.getenv(DEFAULT_OWNER_ENV);
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public ImagesTableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "ec2_images");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeImages on the AWS EC2 Client returning all images that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for specific images) to EC2.
+     *
+     * @note Because of the large number of public AMIs we also support using a default 'owner' filter if your query doesn't
+     * filter on owner itself. You can set this using an env variable on your Lambda function defined by DEFAULT_OWNER_ENV.
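+     *
+     * For example (values are illustrative), setting default_ec2_image_owner=123456789012 on the Lambda
+     * function limits un-filtered scans to AMIs owned by that account, while a query with
+     * owner = 'amazon' or id = 'ami-00000000000000000' overrides the default via push down.
+     *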
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        DescribeImagesRequest request = new DescribeImagesRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("id");
+        ValueSet ownerConstraint = recordsRequest.getConstraints().getSummary().get("owner");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setImageIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+        else if (ownerConstraint != null && ownerConstraint.isSingleValue()) {
+            request.setOwners(Collections.singletonList(ownerConstraint.getSingleValue().toString()));
+        }
+        else if (DEFAULT_OWNER != null) {
+            request.setOwners(Collections.singletonList(DEFAULT_OWNER));
+        }
+        else {
+            throw new RuntimeException("A default owner account must be set or the query must have owner " +
+                    "in the where clause with exactly 1 value otherwise results may be too big.");
+        }
+
+        DescribeImagesResult response = ec2.describeImages(request);
+
+        int count = 0;
+        for (Image next : response.getImages()) {
+            if (count++ > MAX_IMAGES) {
+                throw new RuntimeException("Too many images returned, add an owner or id filter.");
+            }
+            instanceToRow(next, spiller);
+        }
+    }
+
+    /**
+     * Maps an EC2 Image (AMI) into a row in our Apache Arrow response block(s).
+     *
+     * @param image The EC2 Image (AMI) to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
+     */
+    private void instanceToRow(Image image,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("id", row, image.getImageId());
+            matched &= block.offerValue("architecture", row, image.getArchitecture());
+            matched &= block.offerValue("created", row, image.getCreationDate());
+            matched &= block.offerValue("description", row, image.getDescription());
+            matched &= block.offerValue("hypervisor", row, image.getHypervisor());
+            matched &= block.offerValue("location", row, image.getImageLocation());
+            matched &= block.offerValue("type", row, image.getImageType());
+            matched &= block.offerValue("kernel", row, image.getKernelId());
+            matched &= block.offerValue("name", row, image.getName());
+            matched &= block.offerValue("owner", row, image.getOwnerId());
+            matched &= block.offerValue("platform", row, image.getPlatform());
+            matched &= block.offerValue("ramdisk", row, image.getRamdiskId());
+            matched &= block.offerValue("root_device", row, image.getRootDeviceName());
+            matched &= block.offerValue("root_type", row, image.getRootDeviceType());
+            matched &= block.offerValue("srvio_net", row, image.getSriovNetSupport());
+            matched &= block.offerValue("state", row, image.getState());
+            matched &= block.offerValue("virt_type", row, image.getVirtualizationType());
+            matched &= block.offerValue("is_public", row, image.getPublic());
+
+            List<Tag> tags = image.getTags();
+            matched &= block.offerComplexValue("tags",
+                    row,
+                    (Field field, Object val) -> {
+                        if (field.getName().equals("key")) {
+                            return ((Tag) val).getKey();
+                        }
+                        else if (field.getName().equals("value")) {
+                            return ((Tag) val).getValue();
+                        }
+
+                        throw new RuntimeException("Unexpected field " + field.getName());
+                    },
+                    tags);
+
+            matched &= block.offerComplexValue("block_devices",
block.offerComplexValue("block_devices", + row, + (Field field, Object val) -> { + if (field.getName().equals("dev_name")) { + return ((BlockDeviceMapping) val).getDeviceName(); + } + else if (field.getName().equals("no_device")) { + return ((BlockDeviceMapping) val).getNoDevice(); + } + else if (field.getName().equals("virt_name")) { + return ((BlockDeviceMapping) val).getVirtualName(); + } + else if (field.getName().equals("ebs")) { + return ((BlockDeviceMapping) val).getEbs(); + } + else if (field.getName().equals("ebs_size")) { + return ((EbsBlockDevice) val).getVolumeSize(); + } + else if (field.getName().equals("ebs_iops")) { + return ((EbsBlockDevice) val).getIops(); + } + else if (field.getName().equals("ebs_type")) { + return ((EbsBlockDevice) val).getVolumeType(); + } + else if (field.getName().equals("ebs_kms_key")) { + return ((EbsBlockDevice) val).getKmsKeyId(); + } + + throw new RuntimeException("Unexpected field " + field.getName()); + }, + image.getBlockDeviceMappings()); + + return matched ? 1 : 0; + }); + } + + /** + * Defines the schema of this table. + */ + static { + SCHEMA = SchemaBuilder.newBuilder() + .addStringField("id") + .addStringField("architecture") + .addStringField("created") + .addStringField("description") + .addStringField("hypervisor") + .addStringField("location") + .addStringField("type") + .addStringField("kernel") + .addStringField("name") + .addStringField("owner") + .addStringField("platform") + .addStringField("ramdisk") + .addStringField("root_device") + .addStringField("root_type") + .addStringField("srvio_net") + .addStringField("state") + .addStringField("virt_type") + .addBitField("is_public") + .addField( + FieldBuilder.newBuilder("tags", new ArrowType.List()) + .addField( + FieldBuilder.newBuilder("tag", Types.MinorType.STRUCT.getType()) + .addStringField("key") + .addStringField("value") + .build()) + .build()) + .addField( + FieldBuilder.newBuilder("block_devices", new ArrowType.List()) + .addField( + FieldBuilder.newBuilder("device", Types.MinorType.STRUCT.getType()) + .addStringField("dev_name") + .addStringField("no_device") + .addStringField("virt_name") + .addField( + FieldBuilder.newBuilder("ebs", Types.MinorType.STRUCT.getType()) + .addIntField("ebs_size") + .addIntField("ebs_iops") + .addStringField("ebs_type") + .addStringField("ebs_kms_key") + .build()) + .build()) + .build()) + .addMetadata("id", "The id of the image.") + .addMetadata("architecture", "The architecture required to run the image.") + .addMetadata("created", "The date and time the image was created.") + .addMetadata("description", "The description associated with the image.") + .addMetadata("hypervisor", "The type of hypervisor required by the image.") + .addMetadata("location", "The location of the image.") + .addMetadata("type", "The type of image.") + .addMetadata("kernel", "The kernel used by the image.") + .addMetadata("name", "The name of the image.") + .addMetadata("owner", "The owner of the image.") + .addMetadata("platform", "The platform required by the image.") + .addMetadata("ramdisk", "Detailed of the ram disk used by the image.") + .addMetadata("root_device", "The root device used by the image.") + .addMetadata("root_type", "The type of root device required by the image.") + .addMetadata("srvio_net", "Details of srvio network support in the image.") + .addMetadata("state", "The state of the image.") + .addMetadata("virt_type", "The type of virtualization supported by the image.") + .addMetadata("is_public", "True if the image is publically 
available.") + .addMetadata("tags", "Tags associated with the image.") + .addMetadata("block_devices", "Block devices required by the image.") + .build(); + } +} diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/RouteTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/RouteTableProvider.java new file mode 100644 index 0000000000..24583be45e --- /dev/null +++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/RouteTableProvider.java @@ -0,0 +1,215 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.FieldResolver; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.DescribeRouteTablesRequest; +import com.amazonaws.services.ec2.model.DescribeRouteTablesResult; +import com.amazonaws.services.ec2.model.Route; +import com.amazonaws.services.ec2.model.RouteTable; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.util.Collections; +import java.util.List; +import java.util.stream.Collectors; + +/** + * Maps your EC2 RouteTable entries (routes) to a table. 
+ */
+public class RouteTableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public RouteTableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "routing_tables");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeRouteTables on the AWS EC2 Client returning all Routes that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for specific RoutingTables) to EC2.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        boolean done = false;
+        DescribeRouteTablesRequest request = new DescribeRouteTablesRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("route_table_id");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setRouteTableIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+
+        while (!done) {
+            DescribeRouteTablesResult response = ec2.describeRouteTables(request);
+
+            for (RouteTable nextRouteTable : response.getRouteTables()) {
+                for (Route route : nextRouteTable.getRoutes()) {
+                    instanceToRow(nextRouteTable, route, spiller);
+                }
+            }
+
+            request.setNextToken(response.getNextToken());
+
+            if (response.getNextToken() == null || !queryStatusChecker.isQueryRunning()) {
+                done = true;
+            }
+        }
+    }
+
+    /**
+     * Maps an EC2 Route into a row in our Apache Arrow response block(s).
+     *
+     * @param routeTable The RouteTable that owns the given Route.
+     * @param route The Route to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
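+     * @note Associations, tags, and propagating VGWs are flattened below to simple "left:right" strings
+     *       (e.g. a hypothetical "subnet-000000:rtb-000000" association) rather than structs, which keeps
+     *       the schema simple at the cost of some fidelity.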
+     */
+    private void instanceToRow(RouteTable routeTable,
+            Route route,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("route_table_id", row, routeTable.getRouteTableId());
+            matched &= block.offerValue("owner", row, routeTable.getOwnerId());
+            matched &= block.offerValue("vpc", row, routeTable.getVpcId());
+            matched &= block.offerValue("dst_cidr", row, route.getDestinationCidrBlock());
+            matched &= block.offerValue("dst_cidr_v6", row, route.getDestinationIpv6CidrBlock());
+            matched &= block.offerValue("dst_prefix_list", row, route.getDestinationPrefixListId());
+            matched &= block.offerValue("egress_igw", row, route.getEgressOnlyInternetGatewayId());
+            matched &= block.offerValue("gateway", row, route.getGatewayId());
+            matched &= block.offerValue("instance_id", row, route.getInstanceId());
+            matched &= block.offerValue("instance_owner", row, route.getInstanceOwnerId());
+            matched &= block.offerValue("nat_gateway", row, route.getNatGatewayId());
+            matched &= block.offerValue("interface", row, route.getNetworkInterfaceId());
+            matched &= block.offerValue("origin", row, route.getOrigin());
+            matched &= block.offerValue("state", row, route.getState());
+            matched &= block.offerValue("transit_gateway", row, route.getTransitGatewayId());
+            matched &= block.offerValue("vpc_peering_con", row, route.getVpcPeeringConnectionId());
+
+            List<String> associations = routeTable.getAssociations().stream()
+                    .map(next -> next.getSubnetId() + ":" + next.getRouteTableId()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("associations", row, FieldResolver.DEFAULT, associations);
+
+            List<String> tags = routeTable.getTags().stream()
+                    .map(next -> next.getKey() + ":" + next.getValue()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("tags", row, FieldResolver.DEFAULT, tags);
+
+            List<String> propagatingVgws = routeTable.getPropagatingVgws().stream()
+                    .map(next -> next.getGatewayId()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("propagating_vgws", row, FieldResolver.DEFAULT, propagatingVgws);
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
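+     *
+     * Note that only route_table_id is pushed down to DescribeRouteTables; a hypothetical filter on another
+     * column, e.g. dst_cidr = '0.0.0.0/0' to find default routes, is applied as each route's fields are
+     * offered to the response block rather than by EC2.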
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("route_table_id")
+                .addStringField("owner")
+                .addStringField("vpc")
+                .addListField("associations", Types.MinorType.VARCHAR.getType())
+                .addListField("tags", Types.MinorType.VARCHAR.getType())
+                .addListField("propagating_vgws", Types.MinorType.VARCHAR.getType())
+                .addStringField("dst_cidr")
+                .addStringField("dst_cidr_v6")
+                .addStringField("dst_prefix_list")
+                .addStringField("egress_igw")
+                .addStringField("gateway")
+                .addStringField("instance_id")
+                .addStringField("instance_owner")
+                .addStringField("nat_gateway")
+                .addStringField("interface")
+                .addStringField("origin")
+                .addStringField("state")
+                .addStringField("transit_gateway")
+                .addStringField("vpc_peering_con")
+                .addMetadata("route_table_id", "Id of the route table the route belongs to.")
+                .addMetadata("owner", "Owner of the route table.")
+                .addMetadata("vpc", "VPC the route table is associated with.")
+                .addMetadata("associations", "List of associations for this route table.")
+                .addMetadata("tags", "Tags on the route table.")
+                .addMetadata("propagating_vgws", "Vgws the route table propagates through.")
+                .addMetadata("dst_cidr", "Destination IPv4 CIDR block for the route.")
+                .addMetadata("dst_cidr_v6", "Destination IPv6 CIDR block for the route.")
+                .addMetadata("dst_prefix_list", "Destination prefix list for the route.")
+                .addMetadata("egress_igw", "Egress gateway for the route.")
+                .addMetadata("gateway", "Gateway for the route.")
+                .addMetadata("instance_id", "Instance id of the route.")
+                .addMetadata("instance_owner", "Owner of the route.")
+                .addMetadata("nat_gateway", "NAT gateway used by the route.")
+                .addMetadata("interface", "Interface associated with the route.")
+                .addMetadata("origin", "Origin of the route.")
+                .addMetadata("state", "State of the route.")
+                .addMetadata("transit_gateway", "Transit Gateway associated with the route.")
+                .addMetadata("vpc_peering_con", "VPC Peering connection associated with the route.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SecurityGroupsTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SecurityGroupsTableProvider.java
new file mode 100644
index 0000000000..8f4f6dd3c3
--- /dev/null
+++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SecurityGroupsTableProvider.java
@@ -0,0 +1,209 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.DescribeSecurityGroupsRequest;
+import com.amazonaws.services.ec2.model.DescribeSecurityGroupsResult;
+import com.amazonaws.services.ec2.model.IpPermission;
+import com.amazonaws.services.ec2.model.SecurityGroup;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Maps your EC2 SecurityGroups to a table.
+ */
+public class SecurityGroupsTableProvider
+        implements TableProvider
+{
+    private static final String INGRESS = "ingress";
+    private static final String EGRESS = "egress";
+
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public SecurityGroupsTableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "security_groups");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeSecurityGroups on the AWS EC2 Client returning all SecurityGroup rules that match the supplied
+     * predicate and attempting to push down certain predicates (namely queries for specific SecurityGroups) to EC2.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        boolean done = false;
+        DescribeSecurityGroupsRequest request = new DescribeSecurityGroupsRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("id");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setGroupIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+
+        ValueSet nameConstraint = recordsRequest.getConstraints().getSummary().get("name");
+        if (nameConstraint != null && nameConstraint.isSingleValue()) {
+            request.setGroupNames(Collections.singletonList(nameConstraint.getSingleValue().toString()));
+        }
+
+        while (!done) {
+            DescribeSecurityGroupsResult response = ec2.describeSecurityGroups(request);
+
+            //Each rule is mapped to a row in the response. SGs have INGRESS and EGRESS rules.
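+            //For example, a group with two ingress permissions and one egress permission yields three rows,
+            //all sharing the group's id, name, and description.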
+            for (SecurityGroup next : response.getSecurityGroups()) {
+                for (IpPermission nextPerm : next.getIpPermissions()) {
+                    instanceToRow(next, nextPerm, INGRESS, spiller);
+                }
+
+                for (IpPermission nextPerm : next.getIpPermissionsEgress()) {
+                    instanceToRow(next, nextPerm, EGRESS, spiller);
+                }
+            }
+
+            request.setNextToken(response.getNextToken());
+            if (response.getNextToken() == null || !queryStatusChecker.isQueryRunning()) {
+                done = true;
+            }
+        }
+    }
+
+    /**
+     * Maps each SecurityGroup rule (aka IpPermission) to a row in the response.
+     *
+     * @param securityGroup The SecurityGroup that owns the permission entry.
+     * @param permission The permission entry (aka rule) to map.
+     * @param direction The direction (EGRESS or INGRESS) of the rule.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
+     */
+    private void instanceToRow(SecurityGroup securityGroup,
+            IpPermission permission,
+            String direction,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("id", row, securityGroup.getGroupId());
+            matched &= block.offerValue("name", row, securityGroup.getGroupName());
+            matched &= block.offerValue("description", row, securityGroup.getDescription());
+            matched &= block.offerValue("from_port", row, permission.getFromPort());
+            matched &= block.offerValue("to_port", row, permission.getToPort());
+            matched &= block.offerValue("protocol", row, permission.getIpProtocol());
+            matched &= block.offerValue("direction", row, direction);
+
+            List<String> ipv4Ranges = permission.getIpv4Ranges().stream()
+                    .map(next -> next.getCidrIp() + ":" + next.getDescription()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("ipv4_ranges", row, FieldResolver.DEFAULT, ipv4Ranges);
+
+            List<String> ipv6Ranges = permission.getIpv6Ranges().stream()
+                    .map(next -> next.getCidrIpv6() + ":" + next.getDescription()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("ipv6_ranges", row, FieldResolver.DEFAULT, ipv6Ranges);
+
+            List<String> prefixLists = permission.getPrefixListIds().stream()
+                    .map(next -> next.getPrefixListId() + ":" + next.getDescription()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("prefix_lists", row, FieldResolver.DEFAULT, prefixLists);
+
+            List<String> userIdGroups = permission.getUserIdGroupPairs().stream()
+                    .map(next -> next.getUserId() + ":" + next.getGroupId())
+                    .collect(Collectors.toList());
+            matched &= block.offerComplexValue("user_id_groups", row, FieldResolver.DEFAULT, userIdGroups);
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
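+     *
+     * The four list fields below hold flattened strings. For example, a hypothetical ipv4_ranges entry has
+     * the form "cidr:description" (e.g. "10.0.0.0/16:vpc local") and a user_id_groups entry has the form
+     * "userId:groupId".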
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("id")
+                .addStringField("name")
+                .addStringField("description")
+                .addIntField("from_port")
+                .addIntField("to_port")
+                .addStringField("protocol")
+                .addStringField("direction")
+                .addListField("ipv4_ranges", Types.MinorType.VARCHAR.getType())
+                .addListField("ipv6_ranges", Types.MinorType.VARCHAR.getType())
+                .addListField("prefix_lists", Types.MinorType.VARCHAR.getType())
+                .addListField("user_id_groups", Types.MinorType.VARCHAR.getType())
+                .addMetadata("id", "Security Group ID.")
+                .addMetadata("name", "Name of the security group.")
+                .addMetadata("description", "Description of the security group.")
+                .addMetadata("from_port", "Beginning of the port range covered by this security group.")
+                .addMetadata("to_port", "End of the port range covered by this security group.")
+                .addMetadata("protocol", "The network protocol covered by this security group.")
+                .addMetadata("direction", "Notes if the rule applies inbound (ingress) or outbound (egress).")
+                .addMetadata("ipv4_ranges", "The ip v4 ranges covered by this security group.")
+                .addMetadata("ipv6_ranges", "The ip v6 ranges covered by this security group.")
+                .addMetadata("prefix_lists", "The prefix lists covered by this security group.")
+                .addMetadata("user_id_groups", "The user id groups covered by this security group.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SubnetTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SubnetTableProvider.java
new file mode 100644
index 0000000000..f64bb9bd26
--- /dev/null
+++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SubnetTableProvider.java
@@ -0,0 +1,168 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.DescribeSubnetsRequest;
+import com.amazonaws.services.ec2.model.DescribeSubnetsResult;
+import com.amazonaws.services.ec2.model.Subnet;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Maps your EC2 Subnets to a table.
+ */
+public class SubnetTableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public SubnetTableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "subnets");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeSubnets on the AWS EC2 Client returning all subnets that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for specific subnets) to EC2.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        DescribeSubnetsRequest request = new DescribeSubnetsRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("id");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setSubnetIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+
+        DescribeSubnetsResult response = ec2.describeSubnets(request);
+        for (Subnet subnet : response.getSubnets()) {
+            instanceToRow(subnet, spiller);
+        }
+    }
+
+    /**
+     * Maps an EC2 Subnet into a row in our Apache Arrow response block(s).
+     *
+     * @param subnet The EC2 Subnet to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
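+     *
+     * As a usage sketch, rows from this table join naturally to the vpcs table on the "vpc" field, e.g. a
+     * hypothetical SELECT s.id, s.cidr_block FROM subnets s JOIN vpcs v ON s.vpc = v.id (table names
+     * qualified however the connector is registered).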
+     */
+    private void instanceToRow(Subnet subnet,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("id", row, subnet.getSubnetId());
+            matched &= block.offerValue("availability_zone", row, subnet.getAvailabilityZone());
+            matched &= block.offerValue("available_ip_count", row, subnet.getAvailableIpAddressCount());
+            matched &= block.offerValue("cidr_block", row, subnet.getCidrBlock());
+            matched &= block.offerValue("default_for_az", row, subnet.getDefaultForAz());
+            matched &= block.offerValue("map_public_ip", row, subnet.getMapPublicIpOnLaunch());
+            matched &= block.offerValue("owner", row, subnet.getOwnerId());
+            matched &= block.offerValue("state", row, subnet.getState());
+            matched &= block.offerValue("vpc", row, subnet.getVpcId());
+
+            List<String> tags = subnet.getTags().stream()
+                    .map(next -> next.getKey() + ":" + next.getValue()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("tags", row, FieldResolver.DEFAULT, tags);
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("id")
+                .addStringField("availability_zone")
+                .addIntField("available_ip_count")
+                .addStringField("cidr_block")
+                .addBitField("default_for_az")
+                .addBitField("map_public_ip")
+                .addStringField("owner")
+                .addStringField("state")
+                .addListField("tags", Types.MinorType.VARCHAR.getType())
+                .addStringField("vpc")
+                .addMetadata("id", "Subnet Id")
+                .addMetadata("availability_zone", "Availability zone the subnet is in.")
+                .addMetadata("available_ip_count", "Number of available IPs in the subnet.")
+                .addMetadata("cidr_block", "The CIDR block that the subnet uses to allocate addresses.")
+                .addMetadata("default_for_az", "True if this is the default subnet for the AZ.")
+                .addMetadata("map_public_ip", "True if public addresses are assigned by default in this subnet.")
+                .addMetadata("owner", "Owner of the subnet.")
+                .addMetadata("state", "The state of the subnet.")
+                .addMetadata("vpc", "The VPC the subnet is part of.")
+                .addMetadata("tags", "Tags associated with the subnet.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/VpcTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/VpcTableProvider.java
new file mode 100644
index 0000000000..18087ba5e5
--- /dev/null
+++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/VpcTableProvider.java
@@ -0,0 +1,161 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.DescribeVpcsRequest;
+import com.amazonaws.services.ec2.model.DescribeVpcsResult;
+import com.amazonaws.services.ec2.model.Vpc;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.List;
+import java.util.stream.Collectors;
+
+/**
+ * Maps your VPCs to a table.
+ */
+public class VpcTableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonEC2 ec2;
+
+    public VpcTableProvider(AmazonEC2 ec2)
+    {
+        this.ec2 = ec2;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "ec2";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "vpcs");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls DescribeVPCs on the AWS EC2 Client returning all VPCs that match the supplied predicate and attempting
+     * to push down certain predicates (namely queries for specific VPCs) to EC2.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        DescribeVpcsRequest request = new DescribeVpcsRequest();
+
+        ValueSet idConstraint = recordsRequest.getConstraints().getSummary().get("id");
+        if (idConstraint != null && idConstraint.isSingleValue()) {
+            request.setVpcIds(Collections.singletonList(idConstraint.getSingleValue().toString()));
+        }
+
+        DescribeVpcsResult response = ec2.describeVpcs(request);
+        for (Vpc vpc : response.getVpcs()) {
+            instanceToRow(vpc, spiller);
+        }
+    }
+
+    /**
+     * Maps a VPC into a row in our Apache Arrow response block(s).
+     *
+     * @param vpc The VPC to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
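+     * @note Tags are flattened below to "key:value" strings (e.g. a hypothetical "Name:prod-vpc") to fit
+     *       the simple list-of-varchar "tags" field in the schema.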
+     */
+    private void instanceToRow(Vpc vpc,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+
+            matched &= block.offerValue("id", row, vpc.getVpcId());
+            matched &= block.offerValue("cidr_block", row, vpc.getCidrBlock());
+            matched &= block.offerValue("dhcp_opts", row, vpc.getDhcpOptionsId());
+            matched &= block.offerValue("tenancy", row, vpc.getInstanceTenancy());
+            matched &= block.offerValue("owner", row, vpc.getOwnerId());
+            matched &= block.offerValue("state", row, vpc.getState());
+            matched &= block.offerValue("is_default", row, vpc.getIsDefault());
+
+            List<String> tags = vpc.getTags().stream()
+                    .map(next -> next.getKey() + ":" + next.getValue()).collect(Collectors.toList());
+            matched &= block.offerComplexValue("tags", row, FieldResolver.DEFAULT, tags);
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("id")
+                .addStringField("cidr_block")
+                .addStringField("dhcp_opts")
+                .addStringField("tenancy")
+                .addStringField("owner")
+                .addStringField("state")
+                .addBitField("is_default")
+                .addListField("tags", Types.MinorType.VARCHAR.getType())
+                .addMetadata("id", "VPC Id")
+                .addMetadata("cidr_block", "CIDR block used to vend IPs for the VPC.")
+                .addMetadata("dhcp_opts", "DHCP options used for DNS resolution in the VPC.")
+                .addMetadata("tenancy", "EC2 Instance tenancy of this VPC (e.g. dedicated).")
+                .addMetadata("owner", "The owner of the VPC.")
+                .addMetadata("state", "The state of the VPC.")
+                .addMetadata("is_default", "True if the VPC is the default VPC.")
+                .addMetadata("tags", "Tags associated with the VPC.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3BucketsTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3BucketsTableProvider.java
new file mode 100644
index 0000000000..0387ac6bf7
--- /dev/null
+++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3BucketsTableProvider.java
@@ -0,0 +1,133 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.s3;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.model.Bucket;
+import com.amazonaws.services.s3.model.Owner;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+/**
+ * Maps your S3 Buckets to a table.
+ */
+public class S3BucketsTableProvider
+        implements TableProvider
+{
+    private static final Schema SCHEMA;
+    private AmazonS3 amazonS3;
+
+    public S3BucketsTableProvider(AmazonS3 amazonS3)
+    {
+        this.amazonS3 = amazonS3;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "s3";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "buckets");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls ListBuckets on the AWS S3 Client and maps each bucket the caller owns to a row.
+     * ListBuckets offers no server-side filtering, so no predicates are pushed down here.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        for (Bucket next : amazonS3.listBuckets()) {
+            toRow(next, spiller);
+        }
+    }
+
+    /**
+     * Maps an S3 Bucket into a row in our Apache Arrow response block(s).
+     *
+     * @param bucket The S3 Bucket to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
+     */
+    private void toRow(Bucket bucket,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+            matched &= block.offerValue("bucket_name", row, bucket.getName());
+            matched &= block.offerValue("create_date", row, bucket.getCreationDate());
+
+            Owner owner = bucket.getOwner();
+            if (owner != null) {
+                matched &= block.offerValue("owner_name", row, owner.getDisplayName());
+                matched &= block.offerValue("owner_id", row, owner.getId());
+            }
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
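+     * <p>
+     * The addMetadata(...) calls below attach a human-readable description to each
+     * column; these strings travel in the Arrow schema's custom metadata, so callers
+     * can read them back, e.g. (an illustrative sketch, not exercised by this connector):
+     * <pre>
+     * String comment = SCHEMA.getCustomMetadata().get("bucket_name");
+     * </pre>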
+     */
+    static {
+        SCHEMA = SchemaBuilder.newBuilder()
+                .addStringField("bucket_name")
+                .addDateMilliField("create_date")
+                .addStringField("owner_name")
+                .addStringField("owner_id")
+                .addMetadata("bucket_name", "The name of the bucket.")
+                .addMetadata("create_date", "The time the bucket was created.")
+                .addMetadata("owner_name", "The display name of the bucket's owner.")
+                .addMetadata("owner_id", "The id of the bucket's owner.")
+                .build();
+    }
+}
diff --git a/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3ObjectsTableProvider.java b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3ObjectsTableProvider.java
new file mode 100644
index 0000000000..c58315f49e
--- /dev/null
+++ b/athena-aws-cmdb/src/main/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3ObjectsTableProvider.java
@@ -0,0 +1,166 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.s3;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.model.ListObjectsV2Request;
+import com.amazonaws.services.s3.model.ListObjectsV2Result;
+import com.amazonaws.services.s3.model.Owner;
+import com.amazonaws.services.s3.model.S3ObjectSummary;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+/**
+ * Maps your S3 Objects to a table.
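+ * <p>
+ * Note that queries against this table must constrain bucket_name to a single value,
+ * e.g. WHERE bucket_name = 'my_bucket'; readWithConstraint(...) below rejects
+ * unfiltered scans because listing every object in every bucket would be unbounded.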
+ */
+public class S3ObjectsTableProvider
+        implements TableProvider
+{
+    private static final int MAX_KEYS = 1000;
+    private static final Schema SCHEMA;
+    private AmazonS3 amazonS3;
+
+    public S3ObjectsTableProvider(AmazonS3 amazonS3)
+    {
+        this.amazonS3 = amazonS3;
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public String getSchema()
+    {
+        return "s3";
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public TableName getTableName()
+    {
+        return new TableName(getSchema(), "objects");
+    }
+
+    /**
+     * @See TableProvider
+     */
+    @Override
+    public GetTableResponse getTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        return new GetTableResponse(getTableRequest.getCatalogName(), getTableName(), SCHEMA);
+    }
+
+    /**
+     * Calls ListObjectsV2 on the AWS S3 Client for the single bucket named in the query's
+     * constraints, paging through results (up to MAX_KEYS per page) until the listing is
+     * exhausted or the query stops running. The bucket_name predicate is therefore pushed
+     * down; all other predicates are evaluated as rows are written.
+     *
+     * @See TableProvider
+     */
+    @Override
+    public void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        ValueSet bucketConstraint = recordsRequest.getConstraints().getSummary().get("bucket_name");
+        String bucket;
+        if (bucketConstraint != null && bucketConstraint.isSingleValue()) {
+            bucket = bucketConstraint.getSingleValue().toString();
+        }
+        else {
+            throw new IllegalArgumentException("Queries against the objects table must filter on a single bucket " +
+                    "(e.g. where bucket_name='my_bucket').");
+        }
+
+        ListObjectsV2Request req = new ListObjectsV2Request().withBucketName(bucket).withMaxKeys(MAX_KEYS);
+        ListObjectsV2Result result;
+        do {
+            result = amazonS3.listObjectsV2(req);
+            for (S3ObjectSummary objectSummary : result.getObjectSummaries()) {
+                toRow(objectSummary, spiller);
+            }
+            req.setContinuationToken(result.getNextContinuationToken());
+        }
+        while (result.isTruncated() && queryStatusChecker.isQueryRunning());
+    }
+
+    /**
+     * Maps an S3 ObjectSummary into a row in our Apache Arrow response block(s).
+     *
+     * @param objectSummary The S3 ObjectSummary to map.
+     * @param spiller The BlockSpiller to use when we want to write a matching row to the response.
+     * @note The current implementation is rather naive in how it maps fields. It leverages a static
+     * list of fields that we'd like to provide and then explicitly filters and converts each field.
+     */
+    private void toRow(S3ObjectSummary objectSummary,
+            BlockSpiller spiller)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean matched = true;
+            matched &= block.offerValue("bucket_name", row, objectSummary.getBucketName());
+            matched &= block.offerValue("e_tag", row, objectSummary.getETag());
+            matched &= block.offerValue("key", row, objectSummary.getKey());
+            matched &= block.offerValue("bytes", row, objectSummary.getSize());
+            matched &= block.offerValue("storage_class", row, objectSummary.getStorageClass());
+            matched &= block.offerValue("last_modified", row, objectSummary.getLastModified());
+
+            Owner owner = objectSummary.getOwner();
+            if (owner != null) {
+                matched &= block.offerValue("owner_name", row, owner.getDisplayName());
+                matched &= block.offerValue("owner_id", row, owner.getId());
+            }
+
+            return matched ? 1 : 0;
+        });
+    }
+
+    /**
+     * Defines the schema of this table.
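+     * <p>
+     * bytes is modeled as a BIGINT rather than an INT because S3ObjectSummary#getSize()
+     * returns a long and individual objects can exceed Integer.MAX_VALUE bytes.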
+ */ + static { + SCHEMA = SchemaBuilder.newBuilder() + .addStringField("bucket_name") + .addStringField("key") + .addStringField("e_tag") + .addBigIntField("bytes") + .addStringField("storage_class") + .addDateMilliField("last_modified") + .addStringField("owner_name") + .addStringField("owner_id") + .addMetadata("bucket_name", "The name of the bucket that this object is in.") + .addMetadata("key", "The key of the object.") + .addMetadata("e_tag", "eTag of the Object.") + .addMetadata("bytes", "The size of the object in bytes.") + .addMetadata("storage_class", "The storage class of the object.") + .addMetadata("last_modified", "The last time the object was modified.") + .addMetadata("owner_name", "The owner name of the object.") + .addMetadata("owner_id", "The owner_id of the object.") + .build(); + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbMetadataHandlerTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbMetadataHandlerTest.java new file mode 100644 index 0000000000..909ceda41d --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbMetadataHandlerTest.java @@ -0,0 +1,207 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.data.BlockWriter;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.athena.AmazonAthena;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.secretsmanager.AWSSecretsManager;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.mockito.Mock;
+import org.mockito.runners.MockitoJUnitRunner;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.*;
+import static org.mockito.Matchers.any;
+import static org.mockito.Matchers.eq;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.times;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.verifyNoMoreInteractions;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public class AwsCmdbMetadataHandlerTest
+{
+    private String catalog = "catalog";
+    private String bucket = "bucket";
+    private String prefix = "prefix";
+    private String queryId = "queryId";
+    private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account");
+
+    @Mock
+    private AmazonS3 mockS3;
+
+    @Mock
+    private TableProviderFactory mockTableProviderFactory;
+
+    @Mock
+    private Constraints mockConstraints;
+
+    @Mock
+    private TableProvider mockTableProvider1;
+
+    @Mock
+    private TableProvider mockTableProvider2;
+
+    @Mock
+    private TableProvider mockTableProvider3;
+
+    private BlockAllocator blockAllocator;
+
+    @Mock
+    private Block mockBlock;
+
+    @Mock
+    private AWSSecretsManager mockSecretsManager;
+
+    @Mock
+    private AmazonAthena mockAthena;
+
+    private AwsCmdbMetadataHandler handler;
+
+    @Before
+    public void setUp()
+            throws Exception
+    {
+        blockAllocator = new BlockAllocatorImpl();
+        Map<TableName, TableProvider> tableProviderMap = new HashMap<>();
+        tableProviderMap.putIfAbsent(new TableName("schema1", "table1"), mockTableProvider1);
+        tableProviderMap.putIfAbsent(new TableName("schema1", "table2"), mockTableProvider2);
+        tableProviderMap.putIfAbsent(new TableName("schema2", "table1"), mockTableProvider3);
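+
+        // The factory is stubbed to return the map above, so the handler under test is
+        // expected to resolve each TableName to its mock provider and delegate to it
+        // (assumed routing, verified for doGetTable below), roughly:
+        //   tableProviderMap.get(request.getTableName()).getTable(allocator, request);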
TableName("schema1", "table2"), mockTableProvider2); + tableProviderMap.putIfAbsent(new TableName("schema2", "table1"), mockTableProvider3); + + when(mockTableProviderFactory.getTableProviders()).thenReturn(tableProviderMap); + + Map> schemas = new HashMap<>(); + schemas.put("schema1", new ArrayList<>()); + schemas.put("schema2", new ArrayList<>()); + schemas.get("schema1").add(new TableName("schema1", "table1")); + schemas.get("schema1").add(new TableName("schema1", "table2")); + schemas.get("schema2").add(new TableName("schema2", "table1")); + + when(mockTableProviderFactory.getSchemas()).thenReturn(schemas); + + handler = new AwsCmdbMetadataHandler(mockTableProviderFactory, new LocalKeyFactory(), mockSecretsManager, mockAthena, bucket, prefix); + + verify(mockTableProviderFactory, times(1)).getTableProviders(); + verify(mockTableProviderFactory, times(1)).getSchemas(); + verifyNoMoreInteractions(mockTableProviderFactory); + } + + @After + public void tearDown() + throws Exception + { + blockAllocator.close(); + } + + @Test + public void doListSchemaNames() + { + ListSchemasRequest request = new ListSchemasRequest(identity, queryId, catalog); + ListSchemasResponse response = handler.doListSchemaNames(blockAllocator, request); + + assertEquals(2, response.getSchemas().size()); + assertTrue(response.getSchemas().contains("schema1")); + assertTrue(response.getSchemas().contains("schema2")); + } + + @Test + public void doListTables() + { + ListTablesRequest request = new ListTablesRequest(identity, queryId, catalog, "schema1"); + ListTablesResponse response = handler.doListTables(blockAllocator, request); + + assertEquals(2, response.getTables().size()); + assertTrue(response.getTables().contains(new TableName("schema1", "table1"))); + assertTrue(response.getTables().contains(new TableName("schema1", "table2"))); + } + + @Test + public void doGetTable() + { + GetTableRequest request = new GetTableRequest(identity, queryId, catalog, new TableName("schema1", "table1")); + + when(mockTableProvider1.getTable(eq(blockAllocator), eq(request))).thenReturn(mock(GetTableResponse.class)); + GetTableResponse response = handler.doGetTable(blockAllocator, request); + + assertNotNull(response); + verify(mockTableProvider1, times(1)).getTable(eq(blockAllocator), eq(request)); + } + + @Test + public void doGetTableLayout() + throws Exception + { + GetTableLayoutRequest request = new GetTableLayoutRequest(identity, queryId, catalog, + new TableName("schema1", "table1"), + mockConstraints, + SchemaBuilder.newBuilder().build(), + Collections.EMPTY_SET); + + GetTableLayoutResponse response = handler.doGetTableLayout(blockAllocator, request); + + assertNotNull(response); + assertEquals(1, response.getPartitions().getRowCount()); + } + + @Test + public void doGetSplits() + { + GetSplitsRequest request = new GetSplitsRequest(identity, queryId, catalog, + new TableName("schema1", "table1"), + mockBlock, + Collections.emptyList(), + new Constraints(new HashMap<>()), + null); + + GetSplitsResponse response = handler.doGetSplits(blockAllocator, request); + + assertNotNull(response); + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbRecordHandlerTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbRecordHandlerTest.java new file mode 100644 index 0000000000..5015515721 --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/AwsCmdbRecordHandlerTest.java @@ -0,0 +1,124 @@ +/*- + * #%L + * 
athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.runners.MockitoJUnitRunner; + +import java.util.Collections; +import java.util.UUID; + +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.verifyNoMoreInteractions; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class AwsCmdbRecordHandlerTest +{ + private String bucket = "bucket"; + private String prefix = "prefix"; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + + @Mock + private AmazonS3 mockS3; + + @Mock + private TableProviderFactory mockTableProviderFactory; + + @Mock + private ConstraintEvaluator mockEvaluator; + + @Mock + private BlockSpiller mockBlockSpiller; + + @Mock + private TableProvider mockTableProvider; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Mock + private QueryStatusChecker queryStatusChecker; + + private AwsCmdbRecordHandler handler; + + @Before + public void setUp() + throws Exception + { + when(mockTableProviderFactory.getTableProviders()) + .thenReturn(Collections.singletonMap(new TableName("schema", "table"), mockTableProvider)); + + handler = new AwsCmdbRecordHandler(mockS3, mockSecretsManager, mockAthena, mockTableProviderFactory); + + verify(mockTableProviderFactory, times(1)).getTableProviders(); + verifyNoMoreInteractions(mockTableProviderFactory); + + when(queryStatusChecker.isQueryRunning()).thenReturn(true); + } + + 
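+    /**
+     * The record handler should be a thin router. Roughly (a sketch of the assumed
+     * delegation, not the actual implementation):
+     * <pre>
+     * tableProviders.get(request.getTableName()).readWithConstraint(spiller, request, queryStatusChecker);
+     * </pre>
+     */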
+    @Test
+    public void readWithConstraint()
+    {
+        ReadRecordsRequest request = new ReadRecordsRequest(identity, "catalog",
+                "queryId",
+                new TableName("schema", "table"),
+                SchemaBuilder.newBuilder().build(),
+                Split.newBuilder(S3SpillLocation.newBuilder()
+                        .withBucket(bucket)
+                        .withSplitId(UUID.randomUUID().toString())
+                        .withQueryId(UUID.randomUUID().toString())
+                        .withIsDirectory(true)
+                        .build(), keyFactory.create()).build(),
+                new Constraints(Collections.EMPTY_MAP),
+                100_000,
+                100_000);
+
+        handler.readWithConstraint(mockBlockSpiller, request, queryStatusChecker);
+
+        verify(mockTableProvider, times(1)).readWithConstraint(any(BlockSpiller.class), eq(request), eq(queryStatusChecker));
+    }
+}
diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/TableProviderFactoryTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/TableProviderFactoryTest.java
new file mode 100644
index 0000000000..cea1c54fa4
--- /dev/null
+++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/TableProviderFactoryTest.java
@@ -0,0 +1,85 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb;
+
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
+import com.amazonaws.services.rds.AmazonRDS;
+import com.amazonaws.services.s3.AmazonS3;
+import org.junit.Test;
+import org.junit.runner.RunWith;
+import org.mockito.Mock;
+import org.mockito.runners.MockitoJUnitRunner;
+
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.*;
+
+@RunWith(MockitoJUnitRunner.class)
+public class TableProviderFactoryTest
+{
+    private int expectedSchemas = 4;
+    private int expectedTables = 11;
+
+    @Mock
+    private AmazonEC2 mockEc2;
+
+    @Mock
+    private AmazonElasticMapReduce mockEmr;
+
+    @Mock
+    private AmazonRDS mockRds;
+
+    @Mock
+    private AmazonS3 amazonS3;
+
+    private TableProviderFactory factory = new TableProviderFactory(mockEc2, mockEmr, mockRds, amazonS3);
+
+    @Test
+    public void getTableProviders()
+    {
+        int count = 0;
+        for (Map.Entry<TableName, TableProvider> next : factory.getTableProviders().entrySet()) {
+            assertEquals(next.getKey(), next.getValue().getTableName());
+            assertEquals(next.getKey().getSchemaName(), next.getValue().getSchema());
+            count++;
+        }
+        assertEquals(expectedTables, count);
+    }
+
+    @Test
+    public void getSchemas()
+    {
+        int schemas = 0;
+        int tables = 0;
+        for (Map.Entry<String, List<TableName>> next : factory.getSchemas().entrySet()) {
+            for (TableName nextTableName : next.getValue()) {
+                assertEquals(next.getKey(), nextTableName.getSchemaName());
+                tables++;
+            }
+            schemas++;
+        }
+        assertEquals(expectedSchemas, schemas);
+        assertEquals(expectedTables, tables);
+    }
+}
diff --git
a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/AbstractTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/AbstractTableProviderTest.java new file mode 100644 index 0000000000..9ed15516fd --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/AbstractTableProviderTest.java @@ -0,0 +1,262 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.SpillConfig; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.security.EncryptionKey; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.google.common.io.ByteStreams; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import 
java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.UUID;
+
+import static org.junit.Assert.*;
+import static org.mockito.Matchers.anyObject;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public abstract class AbstractTableProviderTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(AbstractTableProviderTest.class);
+
+    private BlockAllocator allocator;
+
+    private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account");
+    private String idField = getIdField();
+    private String idValue = getIdValue();
+    private String expectedQuery = "queryId";
+    private String expectedCatalog = "catalog";
+    private String expectedSchema = getExpectedSchema();
+    private String expectedTable = getExpectedTable();
+    private TableName expectedTableName = new TableName(expectedSchema, expectedTable);
+
+    private TableProvider provider;
+
+    private final List<ByteHolder> mockS3Store = new ArrayList<>();
+
+    @Mock
+    private AmazonS3 amazonS3;
+
+    @Mock
+    private QueryStatusChecker queryStatusChecker;
+
+    private S3BlockSpillReader blockSpillReader;
+
+    private EncryptionKeyFactory keyFactory = new LocalKeyFactory();
+
+    protected abstract String getIdField();
+
+    protected abstract String getIdValue();
+
+    protected abstract String getExpectedSchema();
+
+    protected abstract String getExpectedTable();
+
+    protected abstract TableProvider setUpSource();
+
+    protected abstract void setUpRead();
+
+    protected abstract int getExpectedRows();
+
+    protected abstract void validateRow(Block block, int pos);
+
+    @Before
+    public void setUp()
+    {
+        allocator = new BlockAllocatorImpl();
+
+        when(amazonS3.putObject(anyObject(), anyObject(), anyObject(), anyObject()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> {
+                    InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2];
+                    ByteHolder byteHolder = new ByteHolder();
+                    byteHolder.setBytes(ByteStreams.toByteArray(inputStream));
+                    mockS3Store.add(byteHolder);
+                    return mock(PutObjectResult.class);
+                });
+
+        when(amazonS3.getObject(anyString(), anyString()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> {
+                    S3Object mockObject = mock(S3Object.class);
+                    ByteHolder byteHolder = mockS3Store.get(0);
+                    mockS3Store.remove(0);
+                    when(mockObject.getObjectContent()).thenReturn(
+                            new S3ObjectInputStream(
+                                    new ByteArrayInputStream(byteHolder.getBytes()), null));
+                    return mockObject;
+                });
+
+        blockSpillReader = new S3BlockSpillReader(amazonS3, allocator);
+
+        provider = setUpSource();
+
+        when(queryStatusChecker.isQueryRunning()).thenReturn(true);
+    }
+
+    @After
+    public void after()
+    {
+        mockS3Store.clear();
+        allocator.close();
+    }
+
+    @Test
+    public void getSchema()
+    {
+        assertEquals(expectedSchema, provider.getSchema());
+    }
+
+    @Test
+    public void getTableName()
+    {
+        assertEquals(expectedTableName, provider.getTableName());
+    }
+
+    @Test
+    public void readTableTest()
+    {
+        GetTableRequest request = new GetTableRequest(identity, expectedQuery, expectedCatalog, expectedTableName);
+        GetTableResponse response = provider.getTable(allocator, request);
+        assertTrue(response.getSchema().getFields().size() > 1);
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+
+        constraintsMap.put(idField,
+                EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false)
+                        .add(idValue).build());
+
+        Constraints constraints = new Constraints(constraintsMap);
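+
+        // The single-value EquatableValueSet above simulates a point lookup such as
+        // WHERE <idField> = '<idValue>'. The ConstraintEvaluator constructed next is
+        // handed to S3BlockSpiller further down, so non-matching rows are filtered as
+        // the provider writes them.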
+
+        ConstraintEvaluator evaluator = new ConstraintEvaluator(allocator, response.getSchema(), constraints);
+
+        S3SpillLocation spillLocation = S3SpillLocation.newBuilder()
+                .withBucket("bucket")
+                .withPrefix("prefix")
+                .withSplitId(UUID.randomUUID().toString())
+                .withQueryId(UUID.randomUUID().toString())
+                .withIsDirectory(true)
+                .build();
+
+        ReadRecordsRequest readRequest = new ReadRecordsRequest(identity,
+                expectedCatalog,
+                "queryId",
+                expectedTableName,
+                response.getSchema(),
+                Split.newBuilder(spillLocation, keyFactory.create()).build(),
+                constraints,
+                100_000_000,
+                100_000_000);
+
+        SpillConfig spillConfig = SpillConfig.newBuilder()
+                .withSpillLocation(spillLocation)
+                .withMaxBlockBytes(3_000_000)
+                .withMaxInlineBlockBytes(0)
+                .withRequestId("queryid")
+                .withEncryptionKey(keyFactory.create())
+                .build();
+
+        setUpRead();
+
+        BlockSpiller spiller = new S3BlockSpiller(amazonS3, spillConfig, allocator, response.getSchema(), evaluator);
+        provider.readWithConstraint(spiller, readRequest, queryStatusChecker);
+
+        validateRead(response.getSchema(), blockSpillReader, spiller.getSpillLocations(), spillConfig.getEncryptionKey());
+    }
+
+    protected void validateRead(Schema schema, S3BlockSpillReader reader, List<SpillLocation> locations, EncryptionKey encryptionKey)
+    {
+        int blockNum = 0;
+        int rowNum = 0;
+        for (SpillLocation next : locations) {
+            S3SpillLocation spillLocation = (S3SpillLocation) next;
+            try (Block block = reader.read(spillLocation, encryptionKey, schema)) {
+                logger.info("validateRead: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount());
+
+                for (int i = 0; i < block.getRowCount(); i++) {
+                    logger.info("validateRead: {}", BlockUtils.rowToString(block, i));
+                    rowNum++;
+                    validateRow(block, i);
+                }
+            }
+            catch (Exception ex) {
+                throw new RuntimeException(ex);
+            }
+        }
+
+        assertEquals(getExpectedRows(), rowNum);
+    }
+
+    private class ByteHolder
+    {
+        private byte[] bytes;
+
+        public void setBytes(byte[] bytes)
+        {
+            this.bytes = bytes;
+        }
+
+        public byte[] getBytes()
+        {
+            return bytes;
+        }
+    }
+}
diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/EmrClusterTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/EmrClusterTableProviderTest.java
new file mode 100644
index 0000000000..b7d3d75b98
--- /dev/null
+++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/EmrClusterTableProviderTest.java
@@ -0,0 +1,201 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.services.elasticmapreduce.AmazonElasticMapReduce;
+import com.amazonaws.services.elasticmapreduce.model.Application;
+import com.amazonaws.services.elasticmapreduce.model.Cluster;
+import com.amazonaws.services.elasticmapreduce.model.ClusterStateChangeReason;
+import com.amazonaws.services.elasticmapreduce.model.ClusterStatus;
+import com.amazonaws.services.elasticmapreduce.model.ClusterSummary;
+import com.amazonaws.services.elasticmapreduce.model.DescribeClusterRequest;
+import com.amazonaws.services.elasticmapreduce.model.DescribeClusterResult;
+import com.amazonaws.services.elasticmapreduce.model.ListClustersRequest;
+import com.amazonaws.services.elasticmapreduce.model.ListClustersResult;
+import com.amazonaws.services.elasticmapreduce.model.Tag;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.joda.time.DateTimeZone;
+import org.junit.runner.RunWith;
+import org.mockito.Mock;
+import org.mockito.invocation.InvocationOnMock;
+import org.mockito.runners.MockitoJUnitRunner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public class EmrClusterTableProviderTest
+        extends AbstractTableProviderTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(EmrClusterTableProviderTest.class);
+
+    @Mock
+    private AmazonElasticMapReduce mockEmr;
+
+    protected String getIdField()
+    {
+        return "id";
+    }
+
+    protected String getIdValue()
+    {
+        return "123";
+    }
+
+    protected String getExpectedSchema()
+    {
+        return "emr";
+    }
+
+    protected String getExpectedTable()
+    {
+        return "emr_clusters";
+    }
+
+    protected int getExpectedRows()
+    {
+        return 2;
+    }
+
+    protected TableProvider setUpSource()
+    {
+        return new EmrClusterTableProvider(mockEmr);
+    }
+
+    @Override
+    protected void setUpRead()
+    {
+        when(mockEmr.listClusters(any(ListClustersRequest.class)))
+                .thenAnswer((InvocationOnMock invocation) -> {
+                    ListClustersResult mockResult = mock(ListClustersResult.class);
+                    List<ClusterSummary> values = new ArrayList<>();
+                    values.add(makeClusterSummary(getIdValue()));
+                    values.add(makeClusterSummary(getIdValue()));
+                    values.add(makeClusterSummary("fake-id"));
+                    when(mockResult.getClusters()).thenReturn(values);
+                    return mockResult;
+                });
+
+        when(mockEmr.describeCluster(any(DescribeClusterRequest.class)))
+                .thenAnswer((InvocationOnMock invocation) -> {
+                    DescribeClusterRequest request = (DescribeClusterRequest) invocation.getArguments()[0];
+                    DescribeClusterResult mockResult = mock(DescribeClusterResult.class);
+                    when(mockResult.getCluster()).thenReturn(makeCluster(request.getClusterId()));
+                    return mockResult;
+                });
+    }
+
+    protected void validateRow(Block block, int pos)
+    {
+        for (FieldReader fieldReader : block.getFieldReaders()) {
+            fieldReader.setPosition(pos);
+            Field field = fieldReader.getField();
+
if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$") || field.getName().equals("direction")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private ClusterSummary makeClusterSummary(String id) + { + return new ClusterSummary() + .withName("name") + .withId(id) + .withStatus(new ClusterStatus() + .withState("state") + .withStateChangeReason(new ClusterStateChangeReason() + .withCode("state_code") + .withMessage("state_msg"))) + .withNormalizedInstanceHours(100); + } + + private Cluster makeCluster(String id) + { + return new Cluster() + .withId(id) + .withName("name") + .withAutoScalingRole("autoscaling_role") + .withCustomAmiId("custom_ami") + .withInstanceCollectionType("instance_collection_type") + .withLogUri("log_uri") + .withMasterPublicDnsName("master_public_dns") + .withReleaseLabel("release_label") + .withRunningAmiVersion("running_ami") + .withScaleDownBehavior("scale_down_behavior") + .withServiceRole("service_role") + .withApplications(new Application().withName("name").withVersion("version")) + .withTags(new Tag("key", "value")); + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/RdsTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/RdsTableProviderTest.java new file mode 100644 index 0000000000..f27dec682f --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/RdsTableProviderTest.java @@ -0,0 +1,237 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.services.rds.AmazonRDS;
+import com.amazonaws.services.rds.model.DBInstance;
+import com.amazonaws.services.rds.model.DBInstanceStatusInfo;
+import com.amazonaws.services.rds.model.DBParameterGroupStatus;
+import com.amazonaws.services.rds.model.DBSecurityGroupMembership;
+import com.amazonaws.services.rds.model.DBSubnetGroup;
+import com.amazonaws.services.rds.model.DescribeDBInstancesRequest;
+import com.amazonaws.services.rds.model.DescribeDBInstancesResult;
+import com.amazonaws.services.rds.model.DomainMembership;
+import com.amazonaws.services.rds.model.Endpoint;
+import com.amazonaws.services.rds.model.Subnet;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.joda.time.DateTimeZone;
+import org.junit.runner.RunWith;
+import org.mockito.Mock;
+import org.mockito.invocation.InvocationOnMock;
+import org.mockito.runners.MockitoJUnitRunner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.List;
+import java.util.concurrent.atomic.AtomicLong;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public class RdsTableProviderTest
+        extends AbstractTableProviderTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(RdsTableProviderTest.class);
+
+    @Mock
+    private AmazonRDS mockRds;
+
+    protected String getIdField()
+    {
+        return "instance_id";
+    }
+
+    protected String getIdValue()
+    {
+        return "123";
+    }
+
+    protected String getExpectedSchema()
+    {
+        return "rds";
+    }
+
+    protected String getExpectedTable()
+    {
+        return "rds_instances";
+    }
+
+    protected int getExpectedRows()
+    {
+        return 6;
+    }
+
+    protected TableProvider setUpSource()
+    {
+        return new RdsTableProvider(mockRds);
+    }
+
+    @Override
+    protected void setUpRead()
+    {
+        final AtomicLong requestCount = new AtomicLong(0);
+        when(mockRds.describeDBInstances(any(DescribeDBInstancesRequest.class)))
+                .thenAnswer((InvocationOnMock invocation) -> {
+                    DescribeDBInstancesResult mockResult = mock(DescribeDBInstancesResult.class);
+                    List<DBInstance> values = new ArrayList<>();
+                    values.add(makeValue(getIdValue()));
+                    values.add(makeValue(getIdValue()));
+                    values.add(makeValue("fake-id"));
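+                    // Simulate marker-based pagination: getMarker() below returns a token
+                    // for the first two calls, so the provider must page through three
+                    // results; the two matching instances per page account for
+                    // getExpectedRows() == 6.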
when(mockResult.getDBInstances()).thenReturn(values); + + if (requestCount.incrementAndGet() < 3) { + when(mockResult.getMarker()).thenReturn(String.valueOf(requestCount.get())); + } + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + try { + logger.info("validate: {} {}", fieldReader.getField().getName(), fieldReader.getMinorType()); + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + catch (RuntimeException ex) { + throw new RuntimeException("Error validating field " + fieldReader.getField().getName(), ex); + } + } + + private DBInstance makeValue(String id) + { + return new DBInstance() + .withDBInstanceIdentifier(id) + .withAvailabilityZone("primary_az") + .withAllocatedStorage(100) + .withStorageEncrypted(true) + .withBackupRetentionPeriod(100) + .withAutoMinorVersionUpgrade(true) + .withDBInstanceClass("instance_class") + .withDbInstancePort(100) + .withDBInstanceStatus("status") + .withStorageType("storage_type") + .withDbiResourceId("dbi_resource_id") + .withDBName("name") + .withDomainMemberships(new DomainMembership() + .withDomain("domain") + .withFQDN("fqdn") + .withIAMRoleName("iam_role") + .withStatus("status")) + .withEngine("engine") + .withEngineVersion("engine_version") + .withLicenseModel("license_model") + .withSecondaryAvailabilityZone("secondary_az") + .withPreferredBackupWindow("backup_window") + .withPreferredMaintenanceWindow("maint_window") + .withReadReplicaSourceDBInstanceIdentifier("read_replica_source_id") + .withDBParameterGroups(new DBParameterGroupStatus() + .withDBParameterGroupName("name") + .withParameterApplyStatus("status")) + .withDBSecurityGroups(new DBSecurityGroupMembership() + .withDBSecurityGroupName("name") + .withStatus("status")) + .withDBSubnetGroup(new DBSubnetGroup() + .withDBSubnetGroupName("name") + .withSubnetGroupStatus("status") + .withVpcId("vpc") + .withSubnets(new Subnet() + .withSubnetIdentifier("subnet"))) + .withStatusInfos(new DBInstanceStatusInfo() + .withStatus("status") + .withMessage("message") + .withNormal(true) + .withStatusType("type")) + .withEndpoint(new Endpoint() + .withAddress("address") + .withPort(100) + .withHostedZoneId("zone")) + .withInstanceCreateTime(new Date(100000)) + .withIops(100) + .withMultiAZ(true) + .withPubliclyAccessible(true); + } +} diff --git 
a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/EbsTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/EbsTableProviderTest.java
new file mode 100644
index 0000000000..cf30fd58eb
--- /dev/null
+++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/EbsTableProviderTest.java
@@ -0,0 +1,181 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.ec2.AmazonEC2;
+import com.amazonaws.services.ec2.model.DescribeVolumesRequest;
+import com.amazonaws.services.ec2.model.DescribeVolumesResult;
+import com.amazonaws.services.ec2.model.Tag;
+import com.amazonaws.services.ec2.model.Volume;
+import com.amazonaws.services.ec2.model.VolumeAttachment;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.joda.time.DateTimeZone;
+import org.junit.runner.RunWith;
+import org.mockito.Mock;
+import org.mockito.invocation.InvocationOnMock;
+import org.mockito.runners.MockitoJUnitRunner;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.List;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public class EbsTableProviderTest
+        extends AbstractTableProviderTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(EbsTableProviderTest.class);
+
+    @Mock
+    private AmazonEC2 mockEc2;
+
+    protected String getIdField()
+    {
+        return "id";
+    }
+
+    protected String getIdValue()
+    {
+        return "123";
+    }
+
+    protected String getExpectedSchema()
+    {
+        return "ec2";
+    }
+
+    protected String getExpectedTable()
+    {
+        return "ebs_volumes";
+    }
+
+    protected int getExpectedRows()
+    {
+        return 2;
+    }
+
+    protected TableProvider setUpSource()
+    {
+        return new EbsTableProvider(mockEc2);
+    }
+
+    @Override
+    protected void setUpRead()
+    {
+        when(mockEc2.describeVolumes(any(DescribeVolumesRequest.class))).thenAnswer((InvocationOnMock invocation) -> {
+            DescribeVolumesRequest request = (DescribeVolumesRequest) invocation.getArguments()[0];
+
+            assertEquals(getIdValue(), request.getVolumeIds().get(0));
+            DescribeVolumesResult mockResult = mock(DescribeVolumesResult.class);
+            List<Volume> values = new ArrayList<>();
+            values.add(makeVolume(getIdValue()));
+
values.add(makeVolume(getIdValue())); + values.add(makeVolume("fake-id")); + when(mockResult.getVolumes()).thenReturn(values); + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private Volume makeVolume(String id) + { + Volume volume = new Volume(); + volume.withVolumeId(id) + .withVolumeType("type") + .withAttachments(new VolumeAttachment() + .withInstanceId("target") + .withDevice("attached_device") + .withState("attachment_state") + .withAttachTime(new Date(100_000))) + .withAvailabilityZone("availability_zone") + .withCreateTime(new Date(100_000)) + .withEncrypted(true) + .withKmsKeyId("kms_key_id") + .withSize(100) + .withIops(100) + .withSnapshotId("snapshot_id") + .withState("state") + .withTags(new Tag("key", "value")); + + return volume; + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/Ec2TableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/Ec2TableProviderTest.java new file mode 100644 index 0000000000..478d81100c --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/Ec2TableProviderTest.java @@ -0,0 +1,226 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.DescribeInstancesRequest; +import com.amazonaws.services.ec2.model.DescribeInstancesResult; +import com.amazonaws.services.ec2.model.EbsInstanceBlockDevice; +import com.amazonaws.services.ec2.model.GroupIdentifier; +import com.amazonaws.services.ec2.model.Instance; +import com.amazonaws.services.ec2.model.InstanceBlockDeviceMapping; +import com.amazonaws.services.ec2.model.InstanceNetworkInterface; +import com.amazonaws.services.ec2.model.InstanceState; +import com.amazonaws.services.ec2.model.Reservation; +import com.amazonaws.services.ec2.model.StateReason; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Date; +import java.util.List; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class Ec2TableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(Ec2TableProviderTest.class); + + @Mock + private AmazonEC2 mockEc2; + + protected String getIdField() + { + return "instance_id"; + } + + protected String getIdValue() + { + return "123"; + } + + protected String getExpectedSchema() + { + return "ec2"; + } + + protected String getExpectedTable() + { + return "ec2_instances"; + } + + protected int getExpectedRows() + { + return 4; + } + + protected TableProvider setUpSource() + { + return new Ec2TableProvider(mockEc2); + } + + @Override + protected void setUpRead() + { + when(mockEc2.describeInstances(any(DescribeInstancesRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + DescribeInstancesRequest request = (DescribeInstancesRequest) invocation.getArguments()[0]; + + assertEquals(getIdValue(), request.getInstanceIds().get(0)); + DescribeInstancesResult mockResult = mock(DescribeInstancesResult.class); + List reservations = new ArrayList<>(); + reservations.add(makeReservation()); + reservations.add(makeReservation()); + when(mockResult.getReservations()).thenReturn(reservations); + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), 
fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private Reservation makeReservation() + { + Reservation reservation = mock(Reservation.class); + List instances = new ArrayList<>(); + instances.add(makeInstance(getIdValue())); + instances.add(makeInstance(getIdValue())); + instances.add(makeInstance("non-matching-id")); + when(reservation.getInstances()).thenReturn(instances); + return reservation; + } + + private Instance makeInstance(String id) + { + Instance instance = new Instance(); + instance.withInstanceId(id) + .withImageId("image_id") + .withInstanceType("instance_type") + .withPlatform("platform") + .withPrivateDnsName("private_dns_name") + .withPrivateIpAddress("private_ip_address") + .withPublicDnsName("public_dns_name") + .withPublicIpAddress("public_ip_address") + .withSubnetId("subnet_id") + .withVpcId("vpc_id") + .withArchitecture("architecture") + .withInstanceLifecycle("instance_lifecycle") + .withRootDeviceName("root_device_name") + .withRootDeviceType("root_device_type") + .withSpotInstanceRequestId("spot_instance_request_id") + .withVirtualizationType("virtualization_type") + .withKeyName("key_name") + .withKernelId("kernel_id") + .withCapacityReservationId("capacity_reservation_id") + .withLaunchTime(new Date(100_000)) + .withState(new InstanceState().withCode(100).withName("name")) + .withStateReason(new StateReason().withCode("code").withMessage("message")) + .withEbsOptimized(true); + + List interfaces = new ArrayList<>(); + interfaces.add(new InstanceNetworkInterface() + .withStatus("status") + .withSubnetId("subnet") + .withVpcId("vpc") + .withMacAddress("mac_address") + .withPrivateDnsName("private_dns") + .withPrivateIpAddress("private_ip") + .withNetworkInterfaceId("interface_id") + .withGroups(new GroupIdentifier().withGroupId("group_id").withGroupName("group_name"))); + + interfaces.add(new InstanceNetworkInterface() + .withStatus("status") + .withSubnetId("subnet") + .withVpcId("vpc") + .withMacAddress("mac") + .withPrivateDnsName("private_dns") + .withPrivateIpAddress("private_ip") + .withNetworkInterfaceId("interface_id") + .withGroups(new GroupIdentifier().withGroupId("group_id").withGroupName("group_name"))); + + instance.withNetworkInterfaces(interfaces) + .withSecurityGroups(new GroupIdentifier().withGroupId("group_id").withGroupName("group_name")) + .withBlockDeviceMappings(new InstanceBlockDeviceMapping().withDeviceName("device_name").withEbs(new EbsInstanceBlockDevice().withVolumeId("volume_id"))); + + return instance; + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/ImagesTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/ImagesTableProviderTest.java new file mode 100644 index 0000000000..e58c6ee452 --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/ImagesTableProviderTest.java @@ -0,0 +1,193 @@ +/*- + * #%L + * 
athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.BlockDeviceMapping; +import com.amazonaws.services.ec2.model.DescribeImagesRequest; +import com.amazonaws.services.ec2.model.DescribeImagesResult; +import com.amazonaws.services.ec2.model.EbsBlockDevice; +import com.amazonaws.services.ec2.model.Image; +import com.amazonaws.services.ec2.model.Tag; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class ImagesTableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(ImagesTableProviderTest.class); + + @Mock + private AmazonEC2 mockEc2; + + protected String getIdField() + { + return "id"; + } + + protected String getIdValue() + { + return "123"; + } + + protected String getExpectedSchema() + { + return "ec2"; + } + + protected String getExpectedTable() + { + return "ec2_images"; + } + + protected int getExpectedRows() + { + return 2; + } + + protected TableProvider setUpSource() + { + return new ImagesTableProvider(mockEc2); + } + + @Override + protected void setUpRead() + { + when(mockEc2.describeImages(any(DescribeImagesRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + DescribeImagesRequest request = (DescribeImagesRequest) invocation.getArguments()[0]; + + assertEquals(getIdValue(), request.getImageIds().get(0)); + DescribeImagesResult mockResult = mock(DescribeImagesResult.class); + List values = new ArrayList<>(); + values.add(makeImage(getIdValue())); + values.add(makeImage(getIdValue())); + values.add(makeImage("fake-id")); + when(mockResult.getImages()).thenReturn(values); + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + 
assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private Image makeImage(String id) + { + Image image = new Image(); + image.withImageId(id) + .withArchitecture("architecture") + .withCreationDate("created") + .withDescription("description") + .withHypervisor("hypervisor") + .withImageLocation("location") + .withImageType("type") + .withKernelId("kernel") + .withName("name") + .withOwnerId("owner") + .withPlatform("platform") + .withRamdiskId("ramdisk") + .withRootDeviceName("root_device") + .withRootDeviceType("root_type") + .withSriovNetSupport("srvio_net") + .withState("state") + .withVirtualizationType("virt_type") + .withPublic(true) + .withTags(new Tag("key", "value")) + .withBlockDeviceMappings(new BlockDeviceMapping() + .withDeviceName("dev_name") + .withNoDevice("no_device") + .withVirtualName("virt_name") + .withEbs(new EbsBlockDevice() + .withIops(100) + .withKmsKeyId("ebs_kms_key") + .withVolumeType("ebs_type") + .withVolumeSize(100))); + + return image; + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/RouteTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/RouteTableProviderTest.java new file mode 100644 index 0000000000..f7afdbec72 --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/RouteTableProviderTest.java @@ -0,0 +1,187 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.DescribeRouteTablesRequest; +import com.amazonaws.services.ec2.model.DescribeRouteTablesResult; +import com.amazonaws.services.ec2.model.PropagatingVgw; +import com.amazonaws.services.ec2.model.Route; +import com.amazonaws.services.ec2.model.RouteTable; +import com.amazonaws.services.ec2.model.RouteTableAssociation; +import com.amazonaws.services.ec2.model.Tag; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class RouteTableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(RouteTableProviderTest.class); + + @Mock + private AmazonEC2 mockEc2; + + protected String getIdField() + { + return "route_table_id"; + } + + protected String getIdValue() + { + return "123"; + } + + protected String getExpectedSchema() + { + return "ec2"; + } + + protected String getExpectedTable() + { + return "routing_tables"; + } + + protected int getExpectedRows() + { + return 2; + } + + protected TableProvider setUpSource() + { + return new RouteTableProvider(mockEc2); + } + + @Override + protected void setUpRead() + { + when(mockEc2.describeRouteTables(any(DescribeRouteTablesRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + DescribeRouteTablesRequest request = (DescribeRouteTablesRequest) invocation.getArguments()[0]; + + assertEquals(getIdValue(), request.getRouteTableIds().get(0)); + DescribeRouteTablesResult mockResult = mock(DescribeRouteTablesResult.class); + List values = new ArrayList<>(); + values.add(makeRouteTable(getIdValue())); + values.add(makeRouteTable(getIdValue())); + values.add(makeRouteTable("fake-id")); + when(mockResult.getRouteTables()).thenReturn(values); + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + 
assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private RouteTable makeRouteTable(String id) + { + RouteTable routeTable = new RouteTable(); + routeTable.withRouteTableId(id) + .withOwnerId("owner") + .withVpcId("vpc") + .withAssociations(new RouteTableAssociation().withSubnetId("subnet").withRouteTableId("route_table_id")) + .withTags(new Tag("key", "value")) + .withPropagatingVgws(new PropagatingVgw().withGatewayId("gateway_id")) + .withRoutes(new Route() + .withDestinationCidrBlock("dst_cidr") + .withDestinationIpv6CidrBlock("dst_cidr_v6") + .withDestinationPrefixListId("dst_prefix_list") + .withEgressOnlyInternetGatewayId("egress_igw") + .withGatewayId("gateway") + .withInstanceId("instance_id") + .withInstanceOwnerId("instance_owner") + .withNatGatewayId("nat_gateway") + .withNetworkInterfaceId("interface") + .withOrigin("origin") + .withState("state") + .withTransitGatewayId("transit_gateway") + .withVpcPeeringConnectionId("vpc_peering_con") + ); + + return routeTable; + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SecurityGroupsTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SecurityGroupsTableProviderTest.java new file mode 100644 index 0000000000..d49562e47d --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SecurityGroupsTableProviderTest.java @@ -0,0 +1,179 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.DescribeSecurityGroupsRequest; +import com.amazonaws.services.ec2.model.DescribeSecurityGroupsResult; +import com.amazonaws.services.ec2.model.IpPermission; +import com.amazonaws.services.ec2.model.IpRange; +import com.amazonaws.services.ec2.model.Ipv6Range; +import com.amazonaws.services.ec2.model.PrefixListId; +import com.amazonaws.services.ec2.model.SecurityGroup; +import com.amazonaws.services.ec2.model.UserIdGroupPair; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class SecurityGroupsTableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(SecurityGroupsTableProviderTest.class); + + @Mock + private AmazonEC2 mockEc2; + + protected String getIdField() + { + return "id"; + } + + protected String getIdValue() + { + return "123"; + } + + protected String getExpectedSchema() + { + return "ec2"; + } + + protected String getExpectedTable() + { + return "security_groups"; + } + + protected int getExpectedRows() + { + return 2; + } + + protected TableProvider setUpSource() + { + return new SecurityGroupsTableProvider(mockEc2); + } + + @Override + protected void setUpRead() + { + when(mockEc2.describeSecurityGroups(any(DescribeSecurityGroupsRequest.class))) + .thenAnswer((InvocationOnMock invocation) -> { + DescribeSecurityGroupsRequest request = (DescribeSecurityGroupsRequest) invocation.getArguments()[0]; + + assertEquals(getIdValue(), request.getGroupIds().get(0)); + DescribeSecurityGroupsResult mockResult = mock(DescribeSecurityGroupsResult.class); + List values = new ArrayList<>(); + values.add(makeSecurityGroup(getIdValue())); + values.add(makeSecurityGroup(getIdValue())); + values.add(makeSecurityGroup("fake-id")); + when(mockResult.getSecurityGroups()).thenReturn(values); + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$") || field.getName().equals("direction")) { + 
assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private SecurityGroup makeSecurityGroup(String id) + { + return new SecurityGroup() + .withGroupId(id) + .withGroupName("name") + .withDescription("description") + .withIpPermissions(new IpPermission() + .withIpProtocol("protocol") + .withFromPort(100) + .withToPort(100) + .withIpv4Ranges(new IpRange().withCidrIp("cidr").withDescription("description")) + + .withIpv6Ranges(new Ipv6Range().withCidrIpv6("cidr").withDescription("description")) + .withPrefixListIds(new PrefixListId().withPrefixListId("prefix").withDescription("description")) + .withUserIdGroupPairs(new UserIdGroupPair().withGroupId("group_id").withUserId("user_id")) + ); + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SubnetTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SubnetTableProviderTest.java new file mode 100644 index 0000000000..a17e1d3faf --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/SubnetTableProviderTest.java @@ -0,0 +1,172 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.DescribeSubnetsRequest; +import com.amazonaws.services.ec2.model.DescribeSubnetsResult; +import com.amazonaws.services.ec2.model.Subnet; +import com.amazonaws.services.ec2.model.Tag; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class SubnetTableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(SubnetTableProviderTest.class); + + @Mock + private AmazonEC2 mockEc2; + + protected String getIdField() + { + return "id"; + } + + protected String getIdValue() + { + return "123"; + } + + protected String getExpectedSchema() + { + return "ec2"; + } + + protected String getExpectedTable() + { + return "subnets"; + } + + protected int getExpectedRows() + { + return 2; + } + + protected TableProvider setUpSource() + { + return new SubnetTableProvider(mockEc2); + } + + @Override + protected void setUpRead() + { + when(mockEc2.describeSubnets(any(DescribeSubnetsRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + DescribeSubnetsRequest request = (DescribeSubnetsRequest) invocation.getArguments()[0]; + + assertEquals(getIdValue(), request.getSubnetIds().get(0)); + DescribeSubnetsResult mockResult = mock(DescribeSubnetsResult.class); + List values = new ArrayList<>(); + values.add(makeSubnet(getIdValue())); + values.add(makeSubnet(getIdValue())); + values.add(makeSubnet("fake-id")); + when(mockResult.getSubnets()).thenReturn(values); + + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case 
STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private Subnet makeSubnet(String id) + { + return new Subnet() + .withSubnetId(id) + .withAvailabilityZone("availability_zone") + .withCidrBlock("cidr_block") + .withAvailableIpAddressCount(100) + .withDefaultForAz(true) + .withMapPublicIpOnLaunch(true) + .withOwnerId("owner") + .withState("state") + .withTags(new Tag().withKey("key").withValue("value")) + .withVpcId("vpc"); + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/VpcTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/VpcTableProviderTest.java new file mode 100644 index 0000000000..f22b9b4d8d --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/ec2/VpcTableProviderTest.java @@ -0,0 +1,171 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.ec2; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.ec2.AmazonEC2; +import com.amazonaws.services.ec2.model.DescribeVpcsRequest; +import com.amazonaws.services.ec2.model.DescribeVpcsResult; +import com.amazonaws.services.ec2.model.Tag; +import com.amazonaws.services.ec2.model.Vpc; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class VpcTableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(VpcTableProviderTest.class); + + @Mock + private AmazonEC2 mockEc2; + + protected String getIdField() + { + return "id"; + } + + protected String getIdValue() + { + return "123"; + } + + protected String getExpectedSchema() + { + return "ec2"; + } + + protected String getExpectedTable() + { + return "vpcs"; + } + + protected int getExpectedRows() + { + return 2; + } + + protected TableProvider setUpSource() + { + return new VpcTableProvider(mockEc2); + } + + @Override + protected void setUpRead() + { + when(mockEc2.describeVpcs(any(DescribeVpcsRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + DescribeVpcsRequest request = (DescribeVpcsRequest) invocation.getArguments()[0]; + + assertEquals(getIdValue(), request.getVpcIds().get(0)); + DescribeVpcsResult mockResult = mock(DescribeVpcsResult.class); + List values = new ArrayList<>(); + values.add(makeVpc(getIdValue())); + values.add(makeVpc(getIdValue())); + values.add(makeVpc("fake-id")); + when(mockResult.getVpcs()).thenReturn(values); + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + 
validate(fieldReader.reader(child.getName()));
+                }
+                break;
+            case LIST:
+                validate(fieldReader.reader());
+                break;
+            default:
+                throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren());
+        }
+    }
+
+    private Vpc makeVpc(String id)
+    {
+        Vpc vpc = new Vpc();
+        vpc.withVpcId(id)
+                .withCidrBlock("cidr_block")
+                .withDhcpOptionsId("dhcp_opts")
+                .withInstanceTenancy("tenancy")
+                .withOwnerId("owner")
+                .withState("state")
+                .withIsDefault(true)
+                .withTags(new Tag("key", "value"));
+
+        return vpc;
+    }
+}
diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3BucketsTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3BucketsTableProviderTest.java
new file mode 100644
index 0000000000..0e57ef3027
--- /dev/null
+++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3BucketsTableProviderTest.java
@@ -0,0 +1,155 @@
+/*-
+ * #%L
+ * athena-aws-cmdb
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.aws.cmdb.tables.s3;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest;
+import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.model.Bucket;
+import com.amazonaws.services.s3.model.Owner;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.joda.time.DateTimeZone;
+import org.mockito.Mock;
+import org.mockito.invocation.InvocationOnMock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Date;
+import java.util.List;
+
+import static org.junit.Assert.*;
+import static org.mockito.Mockito.when;
+
+public class S3BucketsTableProviderTest
+        extends AbstractTableProviderTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(S3BucketsTableProviderTest.class);
+
+    @Mock
+    private AmazonS3 mockS3;
+
+    protected String getIdField()
+    {
+        return "bucket_name";
+    }
+
+    protected String getIdValue()
+    {
+        return "my_bucket";
+    }
+
+    protected String getExpectedSchema()
+    {
+        return "s3";
+    }
+
+    protected String getExpectedTable()
+    {
+        return "buckets";
+    }
+
+    protected int getExpectedRows()
+    {
+        return 2;
+    }
+
+    protected TableProvider setUpSource()
+    {
+        return new S3BucketsTableProvider(mockS3);
+    }
+
+    @Override
+    protected void setUpRead()
+    {
+        when(mockS3.listBuckets()).thenAnswer((InvocationOnMock invocation) -> {
+            List<Bucket> values = new ArrayList<>();
+            values.add(makeBucket(getIdValue()));
+            values.add(makeBucket(getIdValue()));
+            values.add(makeBucket("fake-id"));
+            return values;
+        });
+    }
+
+    protected void validateRow(Block block, 
int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private Bucket makeBucket(String id) + { + Bucket bucket = new Bucket(); + bucket.setName(id); + Owner owner = new Owner(); + owner.setDisplayName("owner_name"); + owner.setId("owner_id"); + bucket.setOwner(owner); + bucket.setCreationDate(new Date(100_000)); + return bucket; + } +} diff --git a/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3ObjectsTableProviderTest.java b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3ObjectsTableProviderTest.java new file mode 100644 index 0000000000..3499e800e9 --- /dev/null +++ b/athena-aws-cmdb/src/test/java/com/amazonaws/athena/connectors/aws/cmdb/tables/s3/S3ObjectsTableProviderTest.java @@ -0,0 +1,185 @@ +/*- + * #%L + * athena-aws-cmdb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.aws.cmdb.tables.s3; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connectors.aws.cmdb.tables.AbstractTableProviderTest; +import com.amazonaws.athena.connectors.aws.cmdb.tables.TableProvider; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.ListObjectsV2Request; +import com.amazonaws.services.s3.model.ListObjectsV2Result; +import com.amazonaws.services.s3.model.Owner; +import com.amazonaws.services.s3.model.S3ObjectSummary; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.joda.time.DateTimeZone; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Date; +import java.util.List; +import java.util.concurrent.atomic.AtomicLong; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.junit.Assert.assertTrue; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +public class S3ObjectsTableProviderTest + extends AbstractTableProviderTest +{ + private static final Logger logger = LoggerFactory.getLogger(S3ObjectsTableProviderTest.class); + + @Mock + private AmazonS3 mockS3; + + protected String getIdField() + { + return "bucket_name"; + } + + protected String getIdValue() + { + return "my_bucket"; + } + + protected String getExpectedSchema() + { + return "s3"; + } + + protected String getExpectedTable() + { + return "objects"; + } + + protected int getExpectedRows() + { + return 4; + } + + protected TableProvider setUpSource() + { + return new S3ObjectsTableProvider(mockS3); + } + + @Override + protected void setUpRead() + { + AtomicLong count = new AtomicLong(0); + when(mockS3.listObjectsV2(any(ListObjectsV2Request.class))).thenAnswer((InvocationOnMock invocation) -> { + ListObjectsV2Request request = (ListObjectsV2Request) invocation.getArguments()[0]; + assertEquals(getIdValue(), request.getBucketName()); + + ListObjectsV2Result mockResult = mock(ListObjectsV2Result.class); + List values = new ArrayList<>(); + values.add(makeObjectSummary(getIdValue())); + values.add(makeObjectSummary(getIdValue())); + values.add(makeObjectSummary("fake-id")); + when(mockResult.getObjectSummaries()).thenReturn(values); + + if (count.get() > 0) { + assertNotNull(request.getContinuationToken()); + } + + if (count.incrementAndGet() < 2) { + when(mockResult.isTruncated()).thenReturn(true); + when(mockResult.getNextContinuationToken()).thenReturn("token"); + } + + return mockResult; + }); + } + + protected void validateRow(Block block, int pos) + { + for (FieldReader fieldReader : block.getFieldReaders()) { + fieldReader.setPosition(pos); + Field field = fieldReader.getField(); + + if (field.getName().equals(getIdField())) { + assertEquals(getIdValue(), fieldReader.readText().toString()); + } + else { + validate(fieldReader); + } + } + } + + private void validate(FieldReader fieldReader) + { + Field field = fieldReader.getField(); + Types.MinorType type = Types.getMinorTypeForArrowType(field.getType()); + switch (type) { + case VARCHAR: + if (field.getName().equals("$data$")) { + assertNotNull(fieldReader.readText().toString()); + } + else { + assertEquals(field.getName(), fieldReader.readText().toString()); + } + 
break; + case DATEMILLI: + assertEquals(100_000, fieldReader.readLocalDateTime().toDateTime(DateTimeZone.UTC).getMillis()); + break; + case BIT: + assertTrue(fieldReader.readBoolean()); + break; + case INT: + assertTrue(fieldReader.readInteger() > 0); + break; + case BIGINT: + assertTrue(fieldReader.readLong() > 0); + break; + case STRUCT: + for (Field child : field.getChildren()) { + validate(fieldReader.reader(child.getName())); + } + break; + case LIST: + validate(fieldReader.reader()); + break; + default: + throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren()); + } + } + + private S3ObjectSummary makeObjectSummary(String id) + { + S3ObjectSummary summary = new S3ObjectSummary(); + Owner owner = new Owner(); + owner.setId("owner_id"); + owner.setDisplayName("owner_name"); + summary.setOwner(owner); + summary.setBucketName(id); + summary.setETag("e_tag"); + summary.setKey("key"); + summary.setSize(100); + summary.setLastModified(new Date(100_000)); + summary.setStorageClass("storage_class"); + return summary; + } +} diff --git a/athena-bigquery/LICENSE.txt b/athena-bigquery/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-bigquery/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. 
+ + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
\ No newline at end of file
diff --git a/athena-bigquery/README.md b/athena-bigquery/README.md
new file mode 100644
index 0000000000..90f224dce7
--- /dev/null
+++ b/athena-bigquery/README.md
@@ -0,0 +1,40 @@
+# Amazon Athena Google BigQuery Connector
+
+This connector enables Amazon Athena to communicate with BigQuery, making your BigQuery data accessible.
+
+### Parameters
+
+The Athena Google BigQuery Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below.
+
+|Parameter Name|Example Value|Description|
+|--------------|--------------------|------------------|
+|spill_bucket|my_bucket|When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from.|
+|spill_prefix|temporary/split|(Optional) Defaults to a sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/Hours.|
+|kms_key_id|a7e63k4b-8loc-40db-a2a1-4d0en2cd8331|(Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. Setting a KMS Key ID allows your Lambda function to use KMS for key generation for a stronger source of encryption keys.|
+|disable_spill_encryption|True or False|(Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM, either with a randomly generated key or using KMS to generate keys. Setting this to True disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption.|
+|gcp_project_id|semiotic-primer-1234567|The project ID (not the project name) that contains the datasets that this connector should read from.|
+|secret_manager_gcp_creds_name|GoogleCloudPlatformCredentials|The name of the secret within AWS Secrets Manager that contains your BigQuery credentials JSON.|
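+
+For illustration, the sketch below shows how a Lambda function might resolve these variables at startup. This is a hypothetical example that uses only the parameter names from the table above; it is not the connector's actual initialization code:
+
+```java
+public final class BigQueryEnvExample
+{
+    public static void main(String[] args)
+    {
+        // Required: the S3 bucket used for spilling large responses.
+        String spillBucket = System.getenv("spill_bucket");
+        if (spillBucket == null) {
+            throw new IllegalStateException("spill_bucket must be set");
+        }
+
+        // Optional settings fall back to their documented defaults.
+        String spillPrefix = System.getenv().getOrDefault("spill_prefix", "athena-federation-spill");
+        boolean disableSpillEncryption = Boolean.parseBoolean(
+                System.getenv().getOrDefault("disable_spill_encryption", "false"));
+
+        // BigQuery-specific settings documented in the table above.
+        String gcpProjectId = System.getenv("gcp_project_id");
+        String credsSecretName = System.getenv("secret_manager_gcp_creds_name");
+
+        System.out.printf("Spilling to s3://%s/%s (encryption %s); project=%s, secret=%s%n",
+                spillBucket, spillPrefix, disableSpillEncryption ? "disabled" : "enabled",
+                gcpProjectId, credsSecretName);
+    }
+}
+```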
+
+### Deploying The Connector
+
+To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-bigquery dir, run `mvn clean install`.
+3. From the athena-bigquery dir, run `../tools/publish.sh S3_BUCKET_NAME athena-bigquery` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with permission to deploy instances of the connector via the 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo).
+
+
+## Limitations and Other Notes
+
+The following is a list of limitations or other notes.
+- Lambda has a maximum timeout value of 15 mins. Each split executes a query on BigQuery and must finish with enough time to store the results for Athena to read. If the Lambda times out, the query will fail.
+- Google BigQuery is case sensitive. We attempt to correct the case of dataset names and table names, but we do not do any case correction for project IDs. This is necessary because Presto lower-cases all metadata. These corrections will make many extra calls to Google BigQuery.
+- Many data types are currently not supported, such as Timestamps, Dates, Binary, and complex data types such as Maps, Lists, and Structs.
+
+## Performance
+
+This connector attempts to push as many constraints as possible down to Google BigQuery to decrease the number of results returned. This connector currently does not support querying partitioned tables. This will be added in a future release.
+
+## License
+
+This project is licensed under the Apache-2.0 License.
\ No newline at end of file
diff --git a/athena-bigquery/athena-bigquery.yaml b/athena-bigquery/athena-bigquery.yaml
new file mode 100644
index 0000000000..5a03b10707
--- /dev/null
+++ b/athena-bigquery/athena-bigquery.yaml
@@ -0,0 +1,79 @@
+Transform: 'AWS::Serverless-2016-10-31'
+
+Metadata:
+  AWS::ServerlessRepo::Application:
+    Name: AthenaBigQueryConnector
+    Description: An Athena connector to interact with BigQuery
+    Author: Amazon Athena
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels: ['athena-federation', 'BigQuery']
+    HomePageUrl: https://github.com/awslabs/aws-athena-query-federation
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: https://github.com/awslabs/aws-athena-query-federation
+
+# Parameters are CloudFormation features to pass input
+# to your template when you create a stack
+Parameters:
+  AthenaCatalogName:
+    Description: "The name you will give to this catalog in Athena. It will also be used as the function name prefix."
+    Type: String
+  SpillBucket:
+    Description: "The bucket where this function can spill data."
+    Type: String
+    Default: "athena-spill-test"
+  SpillPrefix:
+    Description: "The prefix within SpillBucket where this function can spill data."
+    Type: String
+    Default: "athena-spill"
+  LambdaTimeout:
+    Description: "Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)"
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: "Lambda memory in MB (min 128 - 3008 max)."
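+    # Note (hedged guidance): larger memory settings generally help this connector, since BigQuery
+    # results are buffered into blocks in memory before being returned or spilled to S3.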
+    Default: 3008
+    Type: Number
+  DisableSpillEncryption:
+    Description: "WARNING: If set to 'true' encryption for spilled data is disabled."
+    Default: "false"
+    Type: String
+  GCPProjectID:
+    Description: "The project ID within Google Cloud Platform."
+    Default: BigQueryCred
+    Type: String
+  SecretManagerGCPCredsName:
+    Description: "The secret name within AWS Secrets Manager that contains your Google Cloud Platform Credentials."
+    Default: GoogleCloudPlatformCredentials
+    Type: String
+
+Resources:
+  ConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      Environment:
+        Variables:
+          disable_spill_encryption: !Ref DisableSpillEncryption
+          spill_bucket: !Ref SpillBucket
+          spill_prefix: !Ref SpillPrefix
+          # This key must match BigQueryConstants.ENV_BIG_QUERY_CREDS_SM_ID, which the code reads as
+          # 'secret_manager_gcp_creds_name' (the original 'secret_manager_big_query_creds_name' would never be read).
+          secret_manager_gcp_creds_name: !Ref SecretManagerGCPCredsName
+          gcp_project_id: !Ref GCPProjectID
+      FunctionName: !Sub "${AthenaCatalogName}"
+      Handler: "com.amazonaws.athena.connectors.bigquery.BigQueryCompositeHandler"
+      CodeUri: "./target/athena-bigquery-1.0-SNAPSHOT.jar"
+      Description: "Allows Athena to call and execute BigQuery queries and process the results."
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
+      Policies:
+        - Statement:
+            - Action:
+                - athena:GetQueryExecution
+              Effect: Allow
+              Resource: '*'
+          Version: '2012-10-17'
+        #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy
+        #with one that is more restrictive and can only 'put' but not read, delete, or overwrite files.
+        - S3CrudPolicy:
+            BucketName: !Ref SpillBucket
\ No newline at end of file
diff --git a/athena-bigquery/pom.xml b/athena-bigquery/pom.xml
new file mode 100644
index 0000000000..5a985025e3
--- /dev/null
+++ b/athena-bigquery/pom.xml
@@ -0,0 +1,77 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-bigquery</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.cloud</groupId>
+            <artifactId>google-cloud-bigquery</artifactId>
+            <version>1.87.0</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.cloud</groupId>
+            <artifactId>google-cloud-resourcemanager</artifactId>
+            <version>0.108.0-alpha</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-secretsmanager</artifactId>
+            <version>${aws-sdk.version}</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                        <configuration>
+                            <artifactSet>
+                                <excludes>
+                                    <exclude>classworlds:classworlds</exclude>
+                                    <exclude>junit:junit</exclude>
+                                    <exclude>jmock:*</exclude>
+                                    <exclude>*:xml-apis</exclude>
+                                    <exclude>org.apache.maven:lib:tests</exclude>
+                                </excludes>
+                            </artifactSet>
+                            <filters>
+                                <filter>
+                                    <artifact>*:*</artifact>
+                                    <excludes>
+                                        <exclude>META-INF/*.SF</exclude>
+                                        <exclude>META-INF/*.DSA</exclude>
+                                        <exclude>META-INF/*.RSA</exclude>
+                                    </excludes>
+                                </filter>
+                            </filters>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryCompositeHandler.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryCompositeHandler.java
new file mode 100644
index 0000000000..b19b713f8e
--- /dev/null
+++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryCompositeHandler.java
@@ -0,0 +1,35 @@
+/*-
+ * #%L
+ * athena-bigquery
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ + +package com.amazonaws.athena.connectors.bigquery; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +import java.io.IOException; + +public class BigQueryCompositeHandler + extends CompositeHandler +{ + public BigQueryCompositeHandler() + throws IOException + { + super(new BigQueryMetadataHandler(), new BigQueryRecordHandler()); + } +} diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryConstants.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryConstants.java new file mode 100644 index 0000000000..ac7688f7fd --- /dev/null +++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryConstants.java @@ -0,0 +1,46 @@ +/*- + * #%L + * athena-bigquery + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.bigquery; + +public class BigQueryConstants +{ + /** + * The source type that is used to aid in logging diagnostic info when raising a support case. + */ + public static final String SOURCE_TYPE = "bigquery"; + + /** + * The maximum number of datasets and tables that can be returned from Google BigQuery API calls for metadata. + */ + public static final long MAX_RESULTS = 10_000; + + /** + * The Project ID within the Google Cloud Platform where the datasets and tables exist to query. + */ + public static final String GCP_PROJECT_ID = "gcp_project_id"; + + /** + * The name of the secret within Secrets Manager that contains credentials JSON that provides this Lambda access + * to call Google BigQuery. + */ + public static final String ENV_BIG_QUERY_CREDS_SM_ID = "secret_manager_gcp_creds_name"; + + private BigQueryConstants() {} +} diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryExceptions.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryExceptions.java new file mode 100644 index 0000000000..99a62806d4 --- /dev/null +++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryExceptions.java @@ -0,0 +1,33 @@ +/*- + * #%L + * athena-bigquery + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +package com.amazonaws.athena.connectors.bigquery; + +public class BigQueryExceptions +{ + static class TooManyTablesException + extends RuntimeException + { + TooManyTablesException() + { + super("Too many tables, exceeded max metadata results for schema count."); + } + } +} diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryMetadataHandler.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryMetadataHandler.java new file mode 100644 index 0000000000..6c44fd5770 --- /dev/null +++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryMetadataHandler.java @@ -0,0 +1,179 @@ +/*- + * #%L + * athena-bigquery + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.bigquery; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connectors.bigquery.BigQueryExceptions.TooManyTablesException; +import com.google.api.gax.paging.Page; +import com.google.cloud.bigquery.BigQuery; +import com.google.cloud.bigquery.Dataset; +import com.google.cloud.bigquery.DatasetId; +import com.google.cloud.bigquery.Field; +import com.google.cloud.bigquery.Table; +import com.google.cloud.bigquery.TableDefinition; +import com.google.cloud.bigquery.TableId; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.List; + +import static com.amazonaws.athena.connectors.bigquery.BigQueryUtils.fixCaseForDatasetName; +import static com.amazonaws.athena.connectors.bigquery.BigQueryUtils.fixCaseForTableName; +import static 
com.amazonaws.athena.connectors.bigquery.BigQueryUtils.translateToArrowType;
+
+public class BigQueryMetadataHandler
+        extends MetadataHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(BigQueryMetadataHandler.class);
+
+    /**
+     * The {@link BigQuery} client to interact with the BigQuery Service.
+     */
+    private final BigQuery bigQuery;
+
+    BigQueryMetadataHandler()
+            throws IOException
+    {
+        this(BigQueryUtils.getBigQueryClient());
+    }
+
+    @VisibleForTesting
+    BigQueryMetadataHandler(BigQuery bigQuery)
+    {
+        super(BigQueryConstants.SOURCE_TYPE);
+        this.bigQuery = bigQuery;
+    }
+
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest listSchemasRequest)
+    {
+        logger.info("doListSchemaNames called with Catalog: {}", listSchemasRequest.getCatalogName());
+
+        final List<String> schemas = new ArrayList<>();
+        final String projectName = BigQueryUtils.getProjectName(listSchemasRequest);
+        Page<Dataset> response = bigQuery.listDatasets(projectName);
+
+        for (Dataset dataset : response.iterateAll()) {
+            if (schemas.size() > BigQueryConstants.MAX_RESULTS) {
+                throw new TooManyTablesException();
+            }
+            schemas.add(dataset.getDatasetId().getDataset().toLowerCase());
+            logger.debug("Found Dataset: {}", dataset.getDatasetId().getDataset());
+        }
+
+        logger.info("Found {} schemas!", schemas.size());
+
+        return new ListSchemasResponse(listSchemasRequest.getCatalogName(), schemas);
+    }
+
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest listTablesRequest)
+    {
+        logger.info("doListTables called with request {}:{}", listTablesRequest.getCatalogName(),
+                listTablesRequest.getSchemaName());
+
+        //Get the project name, dataset name, and dataset id. Google BigQuery is case sensitive.
+        final String projectName = BigQueryUtils.getProjectName(listTablesRequest);
+        final String datasetName = fixCaseForDatasetName(projectName, listTablesRequest.getSchemaName(), bigQuery);
+        final DatasetId datasetId = DatasetId.of(projectName, datasetName);
+
+        Page<Table> response = bigQuery.listTables(datasetId);
+        List<TableName> tables = new ArrayList<>();
+
+        for (Table table : response.iterateAll()) {
+            if (tables.size() > BigQueryConstants.MAX_RESULTS) {
+                throw new TooManyTablesException();
+            }
+            tables.add(new TableName(listTablesRequest.getSchemaName(), table.getTableId().getTable().toLowerCase()));
+        }
+
+        logger.info("Found {} table(s)!", tables.size());
+
+        return new ListTablesResponse(listTablesRequest.getCatalogName(), tables);
+    }
+
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        logger.info("doGetTable called with request {}:{}", BigQueryUtils.getProjectName(getTableRequest),
+                getTableRequest.getTableName());
+
+        final Schema tableSchema = getSchema(BigQueryUtils.getProjectName(getTableRequest), getTableRequest.getTableName().getSchemaName(),
+                getTableRequest.getTableName().getTableName());
+        return new GetTableResponse(BigQueryUtils.getProjectName(getTableRequest).toLowerCase(),
+                getTableRequest.getTableName(), tableSchema);
+    }
+
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+            throws Exception
+    {
+        //NoOp since we don't support partitioning at this time.
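+        //Because no partitions are written here, doGetSplits below generates exactly one split,
+        //so the entire table scan is served by a single Lambda invocation rather than in parallel.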
+ } + + @Override + public GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request) + { + if (logger.isInfoEnabled()) { + logger.info("DoGetSplits: {}.{} Part Cols: {}", request.getSchema(), request.getTableName(), + String.join(",", request.getPartitionCols())); + } + + //Every split must have a unique location if we wish to spill to avoid failures + SpillLocation spillLocation = makeSpillLocation(request); + + return new GetSplitsResponse(request.getCatalogName(), Split.newBuilder(spillLocation, + makeEncryptionKey()).build()); + } + + private Schema getSchema(String projectName, String datasetName, String tableName) + { + datasetName = fixCaseForDatasetName(projectName, datasetName, bigQuery); + tableName = fixCaseForTableName(projectName, datasetName, tableName, bigQuery); + TableId tableId = TableId.of(projectName, datasetName, tableName); + Table response = bigQuery.getTable(tableId); + TableDefinition tableDefinition = response.getDefinition(); + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder(); + for (Field field : tableDefinition.getSchema().getFields()) { + schemaBuilder.addField(field.getName(), translateToArrowType(field.getType())); + } + return schemaBuilder.build(); + } +} diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryRecordHandler.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryRecordHandler.java new file mode 100644 index 0000000000..c517f99343 --- /dev/null +++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryRecordHandler.java @@ -0,0 +1,181 @@ +/*- + * #%L + * athena-bigquery + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+package com.amazonaws.athena.connectors.bigquery;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.handlers.RecordHandler;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.services.athena.AmazonAthena;
+import com.amazonaws.services.athena.AmazonAthenaClientBuilder;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.AmazonS3ClientBuilder;
+import com.amazonaws.services.secretsmanager.AWSSecretsManager;
+import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder;
+import com.google.cloud.bigquery.BigQuery;
+import com.google.cloud.bigquery.BigQueryException;
+import com.google.cloud.bigquery.FieldValue;
+import com.google.cloud.bigquery.FieldValueList;
+import com.google.cloud.bigquery.Job;
+import com.google.cloud.bigquery.JobId;
+import com.google.cloud.bigquery.JobInfo;
+import com.google.cloud.bigquery.QueryJobConfiguration;
+import com.google.cloud.bigquery.TableResult;
+import com.google.common.annotations.VisibleForTesting;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static com.amazonaws.athena.connectors.bigquery.BigQueryUtils.fixCaseForDatasetName;
+import static com.amazonaws.athena.connectors.bigquery.BigQueryUtils.fixCaseForTableName;
+import static com.amazonaws.athena.connectors.bigquery.BigQueryUtils.getObjectFromFieldValue;
+
+/**
+ * This record handler is an example of how you can implement a Lambda that calls BigQuery and pulls data.
+ * This Lambda requires that your BigQuery table is small enough so that a table scan can be completed
+ * within 5-10 minutes; otherwise this Lambda will time out and the query will fail.
+ */
+public class BigQueryRecordHandler
+        extends RecordHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(BigQueryRecordHandler.class);
+
+    /**
+     * The {@link BigQuery} client to interact with the BigQuery Service.
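+     * When the default constructor is used, this client is built from the Google credentials stored in AWS Secrets Manager.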
+     */
+    private final BigQuery bigQueryClient;
+
+    BigQueryRecordHandler()
+            throws IOException
+    {
+        this(AmazonS3ClientBuilder.defaultClient(),
+                AWSSecretsManagerClientBuilder.defaultClient(),
+                AmazonAthenaClientBuilder.defaultClient(),
+                BigQueryUtils.getBigQueryClient()
+        );
+    }
+
+    @VisibleForTesting
+    BigQueryRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, BigQuery bigQueryClient)
+    {
+        super(amazonS3, secretsManager, athena, BigQueryConstants.SOURCE_TYPE);
+        this.bigQueryClient = bigQueryClient;
+    }
+
+    @Override
+    protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+            throws Exception
+    {
+        final String projectName = BigQueryUtils.getProjectName(recordsRequest.getCatalogName());
+        final String datasetName = fixCaseForDatasetName(projectName, recordsRequest.getTableName().getSchemaName(), bigQueryClient);
+        final String tableName = fixCaseForTableName(projectName, datasetName, recordsRequest.getTableName().getTableName(),
+                bigQueryClient);
+
+        logger.info("Got Request with constraints: {}", recordsRequest.getConstraints());
+
+        final String sqlToExecute = BigQuerySqlUtils.buildSqlFromSplit(new TableName(datasetName, tableName),
+                recordsRequest.getSchema(), recordsRequest.getConstraints(), recordsRequest.getSplit());
+
+        QueryJobConfiguration queryConfig =
+                QueryJobConfiguration.newBuilder(sqlToExecute)
+                        // Use standard SQL syntax for queries.
+                        // See: https://cloud.google.com/bigquery/sql-reference/
+                        .setUseLegacySql(false)
+                        .build();
+
+        logger.info("Executing SQL Query: {} for Split: {}", sqlToExecute, recordsRequest.getSplit());
+
+        Job queryJob;
+        try {
+            JobId jobId = JobId.of(fixRequestId(recordsRequest.getQueryId()));
+            queryJob = bigQueryClient.create(JobInfo.newBuilder(queryConfig).setJobId(jobId).build());
+        }
+        catch (BigQueryException bqe) {
+            if (bqe.getMessage().contains("Already Exists: Job")) {
+                logger.info("Caught exception that this job is already running.");
+                //Return silently because another Lambda is already processing this.
+                //Ideally when this happens, we would want to get the existing queryJob.
+                //This would allow this Lambda to time out while waiting for the query
+                //and rejoin it. This would provide much more time for Lambda to wait for
+                //BigQuery to finish its query for up to 15 mins * the number of retries.
+                //However, Presto is creating multiple splits, even if we return a single split.
+                return;
+            }
+            throw bqe;
+        }
+
+        TableResult result;
+        try {
+            //Poll until the BigQuery job completes, cancelling it if Athena reports the query is no longer running.
+            while (true) {
+                if (queryJob.isDone()) {
+                    result = queryJob.getQueryResults();
+                    break;
+                }
+                else if (!queryStatusChecker.isQueryRunning()) {
+                    queryJob.cancel();
+                }
+                else {
+                    Thread.sleep(10);
+                }
+            }
+        }
+        catch (InterruptedException ie) {
+            throw new IllegalStateException("Got interrupted waiting for Big Query to finish the query.");
+        }
+
+        outputResults(spiller, recordsRequest, result);
+    }
+
+    private String fixRequestId(String queryId)
+    {
+        //BigQuery job IDs only allow letters, numbers, dashes and underscores, so strip anything else.
+        return queryId.replaceAll("[^a-zA-Z0-9-_]", "");
+    }
+
+    /**
+     * Iterates through all the results that come back from BigQuery and saves the result to be read by the Athena Connector.
+     *
+     * @param spiller The {@link BlockSpiller} provided when readWithConstraints() is called.
+     * @param recordsRequest The {@link ReadRecordsRequest} provided when readWithConstraints() is called.
+     * @param result The {@link TableResult} provided by {@link BigQuery} client after a query has completed executing.
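+     * The writeRows() callback below returns 1 for a fully written row and 0 for a rejected one, which is how the spiller tracks row counts.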
+     */
+    private void outputResults(BlockSpiller spiller, ReadRecordsRequest recordsRequest, TableResult result)
+    {
+        for (FieldValueList row : result.iterateAll()) {
+            spiller.writeRows((Block block, int rowNum) -> {
+                boolean isMatched = true;
+                for (Field field : recordsRequest.getSchema().getFields()) {
+                    FieldValue fieldValue = row.get(field.getName());
+                    Object val = getObjectFromFieldValue(field.getName(), fieldValue,
+                            field.getFieldType().getType());
+                    isMatched &= block.offerValue(field.getName(), rowNum, val);
+                    if (!isMatched) {
+                        //offerValue returned false, meaning a constraint rejected this row; report zero rows written.
+                        return 0;
+                    }
+                }
+                return 1;
+            });
+        }
+    }
+}
diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQuerySqlUtils.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQuerySqlUtils.java
new file mode 100644
index 0000000000..d277bdae2f
--- /dev/null
+++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQuerySqlUtils.java
@@ -0,0 +1,182 @@
+/*-
+ * #%L
+ * athena-bigquery
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+package com.amazonaws.athena.connectors.bigquery;
+
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Range;
+import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Map;
+import java.util.StringJoiner;
+
+/**
+ * Utilities that help with SQL operations.
+ */
+class BigQuerySqlUtils
+{
+    private BigQuerySqlUtils()
+    {
+    }
+
+    /**
+     * Builds an SQL statement from the schema, table name, split and constraints that can be executed by
+     * BigQuery.
+     *
+     * @param tableName The table name of the table we are querying.
+     * @param schema The schema of the table that we are querying.
+     * @param constraints The constraints that we want to apply to the query.
+     * @param split The split information to add as a constraint.
+     * @return SQL Statement that represents the table, columns, split, and constraints.
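+     * <p>
+     * As an illustrative (hypothetical) example: a schema with columns (name, age) on table
+     * dataset1.table1, with a summary constraint of age >= 18, would produce:
+     * SELECT name,age from dataset1.table1 WHERE (age >= 18)
+     * </p>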
+ */ + static String buildSqlFromSplit(TableName tableName, Schema schema, Constraints constraints, Split split) + { + StringBuilder sqlBuilder = new StringBuilder("SELECT "); + + StringJoiner sj = new StringJoiner(","); + if (schema.getFields().isEmpty()) { + sj.add("*"); + } + else { + for (Field field : schema.getFields()) { + sj.add(field.getName()); + } + } + sqlBuilder.append(sj.toString()) + .append(" from ") + .append(tableName.getSchemaName()) + .append(".") + .append(tableName.getTableName()); + + //Buids Where Clause + sj = new StringJoiner(") AND ("); + for (Map.Entry summary : constraints.getSummary().entrySet()) { + final ValueSet value = summary.getValue(); + final String columnName = summary.getKey(); + if (value instanceof EquatableValueSet) { + if (value.isSingleValue()) { + if (value.isNullAllowed()) { + sj.add(columnName + " is null"); + } + else { + //Check Arrow type to see if we + sj.add(columnName + " = " + getValueForWhereClause(columnName, value.getSingleValue(), value.getType())); + } + } + //TODO:: process multiple values in "IN" clause. + } + else if (value instanceof SortedRangeSet) { + SortedRangeSet sortedRangeSet = (SortedRangeSet) value; + if (sortedRangeSet.isNone()) { + if (sortedRangeSet.isNullAllowed()) { + sj.add(columnName + " is null"); + } + //If there is no values and null is not allowed, then that means ignore this valueset. + continue; + } + Range range = sortedRangeSet.getSpan(); + if (!sortedRangeSet.isNullAllowed() && range.getLow().isLowerUnbounded() && range.getHigh().isUpperUnbounded()) { + sj.add(columnName + " is not null"); + continue; + } + if (!range.getLow().isLowerUnbounded() && !range.getLow().isNullValue()) { + final String sqlValue = getValueForWhereClause(columnName, range.getLow().getValue(), value.getType()); + switch (range.getLow().getBound()) { + case ABOVE: + sj.add(columnName + " > " + sqlValue); + break; + case EXACTLY: + sj.add(columnName + " >= " + sqlValue); + break; + case BELOW: + throw new IllegalArgumentException("Low Marker should never use BELOW bound: " + range); + default: + throw new AssertionError("Unhandled bound: " + range.getLow().getBound()); + } + } + if (!range.getHigh().isUpperUnbounded() && !range.getHigh().isNullValue()) { + final String sqlValue = getValueForWhereClause(columnName, range.getHigh().getValue(), value.getType()); + switch (range.getHigh().getBound()) { + case ABOVE: + throw new IllegalArgumentException("High Marker should never use ABOVE bound: " + range); + case EXACTLY: + sj.add(columnName + " <= " + sqlValue); + break; + case BELOW: + sj.add(columnName + " < " + sqlValue); + break; + default: + throw new AssertionError("Unhandled bound: " + range.getHigh().getBound()); + } + } + } + } + if (sj.length() > 0) { + sqlBuilder.append(" WHERE (") + .append(sj.toString()) + .append(")"); + } + + return sqlBuilder.toString(); + } + + //Gets the representation of a value that can be used in a where clause, ie String values need to be quoted, numeric doesn't. 
+ private static String getValueForWhereClause(String columnName, Object value, ArrowType arrowType) + { + switch (arrowType.getTypeID()) { + case Int: + case Decimal: + case FloatingPoint: + return value.toString(); + case Bool: + if ((Boolean) value) { + return "true"; + } + else { + return "false"; + } + case Utf8: + return "'" + value.toString() + "'"; + case Date: + case Time: + case Timestamp: + case Interval: + case Binary: + case FixedSizeBinary: + case Null: + case Struct: + case List: + case FixedSizeList: + case Union: + case NONE: + throw new UnsupportedOperationException("The Arrow type: " + arrowType.getTypeID().name() + " is currently not supported"); + default: + throw new IllegalArgumentException("Unknown type has been encountered during range processing: " + columnName + + " Field Type: " + arrowType.getTypeID().name()); + } + } +} diff --git a/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryUtils.java b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryUtils.java new file mode 100644 index 0000000000..4bc3983a62 --- /dev/null +++ b/athena-bigquery/src/main/java/com/amazonaws/athena/connectors/bigquery/BigQueryUtils.java @@ -0,0 +1,231 @@ +/*- + * #%L + * athena-bigquery + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +package com.amazonaws.athena.connectors.bigquery; + +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import com.google.api.gax.paging.Page; +import com.google.auth.Credentials; +import com.google.auth.oauth2.ServiceAccountCredentials; +import com.google.cloud.bigquery.BigQuery; +import com.google.cloud.bigquery.BigQueryOptions; +import com.google.cloud.bigquery.Dataset; +import com.google.cloud.bigquery.DatasetId; +import com.google.cloud.bigquery.FieldValue; +import com.google.cloud.bigquery.LegacySQLTypeName; +import com.google.cloud.bigquery.Table; +import com.google.cloud.resourcemanager.ResourceManager; +import com.google.cloud.resourcemanager.ResourceManagerOptions; +import org.apache.arrow.vector.types.DateUnit; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.TimeUnit; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; + +import java.io.ByteArrayInputStream; +import java.io.IOException; + +class BigQueryUtils +{ + private BigQueryUtils() {} + + static Credentials getCredentialsFromSecretsManager() + throws IOException + { + AWSSecretsManager secretsManager = AWSSecretsManagerClientBuilder.defaultClient(); + GetSecretValueResult response = secretsManager.getSecretValue(new GetSecretValueRequest() + .withSecretId(getEnvBigQueryCredsSmId())); + return ServiceAccountCredentials.fromStream(new ByteArrayInputStream(response.getSecretString().getBytes())); + } + + static BigQuery getBigQueryClient() + throws IOException + { + BigQueryOptions.Builder bigqueryBuilder = BigQueryOptions.newBuilder(); + bigqueryBuilder.setCredentials(getCredentialsFromSecretsManager()); + return bigqueryBuilder.build().getService(); + } + + static ResourceManager getResourceManagerClient() + throws IOException + { + ResourceManagerOptions.Builder resourceManagerBuilder = ResourceManagerOptions.newBuilder(); + resourceManagerBuilder.setCredentials(getCredentialsFromSecretsManager()); + return resourceManagerBuilder.build().getService(); + } + + static String getEnvBigQueryCredsSmId() + { + return getEnvVar(BigQueryConstants.ENV_BIG_QUERY_CREDS_SM_ID); + } + + static String getEnvVar(String envVar) + { + String var = System.getenv(envVar); + if (var == null || var.length() == 0) { + throw new IllegalArgumentException("Lambda Environment Variable " + envVar + " has not been populated! "); + } + return var; + } + + /** + * Gets the project name that exists within Google Cloud Platform that contains the datasets that we wish to query. + * The Lambda environment variables are first inspected and if it does not exist, then we take it from the catalog + * name in the request. + * + * @param catalogNameFromRequest The Catalog Name from the request that is passed in from the Athena Connector framework. + * @return The project name. + */ + static String getProjectName(String catalogNameFromRequest) + { + if (System.getenv(BigQueryConstants.GCP_PROJECT_ID) != null) { + return System.getenv(BigQueryConstants.GCP_PROJECT_ID); + } + return catalogNameFromRequest; + } + + /** + * Gets the project name that exists within Google Cloud Platform that contains the datasets that we wish to query. 
+     * The Lambda environment variables are first inspected and if it does not exist, then we take it from the catalog
+     * name in the request.
+     *
+     * @param request The {@link MetadataRequest} from the request that is passed in from the Athena Connector framework.
+     * @return The project name.
+     */
+    static String getProjectName(MetadataRequest request)
+    {
+        return getProjectName(request.getCatalogName());
+    }
+
+    /**
+     * BigQuery is case sensitive for its Project and Dataset Names. This function will return the first
+     * case insensitive match.
+     *
+     * @param projectName The case-correct project name that contains the dataset.
+     * @param datasetName The dataset name we want to look up, in any case.
+     * @param bigQuery The BigQuery client used to list the project's datasets.
+     * @return A case correct dataset name.
+     */
+    static String fixCaseForDatasetName(String projectName, String datasetName, BigQuery bigQuery)
+    {
+        Page<Dataset> response = bigQuery.listDatasets(projectName);
+        for (Dataset dataset : response.iterateAll()) {
+            if (dataset.getDatasetId().getDataset().equalsIgnoreCase(datasetName)) {
+                return dataset.getDatasetId().getDataset();
+            }
+        }
+
+        throw new IllegalArgumentException("Google Dataset with name " + datasetName +
+                " could not be found in Project " + projectName + " in GCP. ");
+    }
+
+    static String fixCaseForTableName(String projectName, String datasetName, String tableName, BigQuery bigQuery)
+    {
+        Page<Table>
response = bigQuery.listTables(DatasetId.of(projectName, datasetName));
+        for (Table table : response.iterateAll()) {
+            if (table.getTableId().getTable().equalsIgnoreCase(tableName)) {
+                return table.getTableId().getTable();
+            }
+        }
+        throw new IllegalArgumentException("Google Table with name " + tableName +
+                " could not be found in Project " + projectName + " in GCP. ");
+    }
+
+    static Object getObjectFromFieldValue(String fieldName, FieldValue fieldValue, ArrowType arrowType)
+    {
+        if (fieldValue == null || fieldValue.isNull() || fieldValue.getValue().equals("null")) {
+            return null;
+        }
+        switch (Types.getMinorTypeForArrowType(arrowType)) {
+            case TIMESTAMPMILLI:
+                //getTimestampValue() returns a long in microseconds. Return it in milliseconds, which is how it's stored.
+                return fieldValue.getTimestampValue() / 1000;
+            case SMALLINT:
+            case TINYINT:
+            case INT:
+            case BIGINT:
+                return fieldValue.getLongValue();
+            case DECIMAL:
+                return fieldValue.getNumericValue();
+            case BIT:
+                return fieldValue.getBooleanValue();
+            case FLOAT4:
+            case FLOAT8:
+                return fieldValue.getDoubleValue();
+            case VARCHAR:
+                return fieldValue.getStringValue();
+            //TODO: Support complex types.
+            default:
+                throw new IllegalArgumentException("Unknown type has been encountered: Field Name: " + fieldName +
+                        " Field Type: " + arrowType.toString() + " MinorType: " + Types.getMinorTypeForArrowType(arrowType));
+        }
+    }
+
+    static ArrowType translateToArrowType(LegacySQLTypeName type)
+    {
+        switch (type.getStandardType()) {
+            case BOOL:
+                return new ArrowType.Bool();
+            /** A 64-bit signed integer value. */
+            case INT64:
+                return new ArrowType.Int(64, true);
+            /** A 64-bit IEEE binary floating-point value. */
+            case FLOAT64:
+                return new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE);
+            /** A decimal value with 38 digits of precision and 9 digits of scale. */
+            case NUMERIC:
+                return new ArrowType.Decimal(38, 9);
+            /** Variable-length character (Unicode) data. */
+            case STRING:
+                return new ArrowType.Utf8();
+            /** Variable-length binary data. */
+            case BYTES:
+                return new ArrowType.Binary();
+            /** Container of ordered fields each with a type (required) and field name (optional). */
+            case STRUCT:
+                return new ArrowType.Struct();
+            /** Ordered list of zero or more elements of any non-array type. */
+            case ARRAY:
+                return new ArrowType.List();
+            /**
+             * Represents an absolute point in time, with microsecond precision. Values range between the
+             * years 1 and 9999, inclusive.
+             */
+            case TIMESTAMP:
+                return new ArrowType.Timestamp(TimeUnit.MILLISECOND, null);
+            /** Represents a logical calendar date. Values range between the years 1 and 9999, inclusive. */
+            case DATE:
+                return new ArrowType.Date(DateUnit.DAY);
+            /** Represents a time, independent of a specific date, to microsecond precision. */
+            case TIME:
+                return new ArrowType.Time(TimeUnit.MILLISECOND, 32);
+            /** Represents a year, month, day, hour, minute, second, and subsecond (microsecond precision). */
+            case DATETIME:
+                return new ArrowType.Date(DateUnit.MILLISECOND);
+            /** Represents a set of geographic points, represented as a Well Known Text (WKT) string.
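+             * (Mapped to a UTF-8 string below, since Arrow has no native geography type.)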
*/ + case GEOGRAPHY: + return new ArrowType.Utf8(); + } + throw new IllegalArgumentException("Unable to map Google Type of StandardType: " + type.getStandardType().toString() + + " NonStandardType: " + type.name()); + } +} diff --git a/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryMetadataHandlerTest.java b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryMetadataHandlerTest.java new file mode 100644 index 0000000000..64311daec8 --- /dev/null +++ b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryMetadataHandlerTest.java @@ -0,0 +1,184 @@ +/*- + * #%L + * athena-bigquery + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +package com.amazonaws.athena.connectors.bigquery; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.google.cloud.bigquery.BigQuery; +import com.google.cloud.bigquery.Dataset; +import com.google.cloud.bigquery.DatasetId; +import com.google.cloud.bigquery.Schema; +import com.google.cloud.bigquery.StandardTableDefinition; +import com.google.cloud.bigquery.Table; +import com.google.cloud.bigquery.TableId; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mock; +import org.mockito.MockitoAnnotations; + +import java.util.Collections; +import java.util.HashMap; + +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertNotNull; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +public class BigQueryMetadataHandlerTest +{ + private static final String QUERY_ID = "queryId"; + private static final String CATALOG = "catalog"; + private static final TableName TABLE_NAME = new TableName("dataset1", "table1"); + + @Mock + BigQuery bigQuery; + + private BigQueryMetadataHandler bigQueryMetadataHandler; + + private BlockAllocator blockAllocator; + + @Before + public void setUp() + { + MockitoAnnotations.initMocks(this); + bigQueryMetadataHandler = new BigQueryMetadataHandler(bigQuery); + blockAllocator = 
new BlockAllocatorImpl();
+    }
+
+    @After
+    public void tearDown()
+    {
+        blockAllocator.close();
+    }
+
+    @Test
+    public void testDoListSchemaNames()
+    {
+        final int numDatasets = 5;
+        BigQueryPage<Dataset> datasetPage =
+                new BigQueryPage<>(BigQueryTestUtils.getDatasetList(BigQueryTestUtils.PROJECT_1_NAME, numDatasets));
+        when(bigQuery.listDatasets(any(String.class))).thenReturn(datasetPage);
+
+        //This will test case insensitivity
+        ListSchemasRequest request = new ListSchemasRequest(BigQueryTestUtils.FEDERATED_IDENTITY,
+                QUERY_ID, BigQueryTestUtils.PROJECT_1_NAME.toLowerCase());
+        ListSchemasResponse schemaNames = bigQueryMetadataHandler.doListSchemaNames(blockAllocator, request);
+
+        assertNotNull(schemaNames);
+        assertEquals("Schema count does not match!", numDatasets, schemaNames.getSchemas().size());
+    }
+
+    @Test
+    public void testDoListTables()
+    {
+        //Build mocks for Datasets
+        final int numDatasets = 5;
+        BigQueryPage<Dataset> datasetPage =
+                new BigQueryPage<>(BigQueryTestUtils.getDatasetList(BigQueryTestUtils.PROJECT_1_NAME, numDatasets));
+        when(bigQuery.listDatasets(any(String.class))).thenReturn(datasetPage);
+
+        //Get the first dataset name.
+        String datasetName = datasetPage.iterateAll().iterator().next().getDatasetId().getDataset();
+
+        final int numTables = 5;
+        BigQueryPage<Table>
tablesPage =
+                new BigQueryPage<>(BigQueryTestUtils.getTableList(BigQueryTestUtils.PROJECT_1_NAME,
+                        datasetName, numTables));
+
+        when(bigQuery.listTables(any(DatasetId.class))).thenReturn(tablesPage);
+
+        //This will test case insensitivity
+        ListTablesRequest request = new ListTablesRequest(BigQueryTestUtils.FEDERATED_IDENTITY,
+                QUERY_ID, BigQueryTestUtils.PROJECT_1_NAME.toLowerCase(),
+                datasetName);
+        ListTablesResponse tableNames = bigQueryMetadataHandler.doListTables(blockAllocator, request);
+
+        assertNotNull(tableNames);
+        assertEquals("Table count does not match!", numTables, tableNames.getTables().size());
+    }
+
+    @Test
+    public void testDoGetTable()
+    {
+        //Build mocks for Datasets
+        final int numDatasets = 5;
+        BigQueryPage<Dataset> datasetPage =
+                new BigQueryPage<>(BigQueryTestUtils.getDatasetList(BigQueryTestUtils.PROJECT_1_NAME, numDatasets));
+        when(bigQuery.listDatasets(any(String.class))).thenReturn(datasetPage);
+
+        //Get the first dataset name.
+        String datasetName = datasetPage.iterateAll().iterator().next().getDatasetId().getDataset();
+
+        //Build mocks for Tables
+        final int numTables = 5;
+        BigQueryPage<Table>
tablesPage =
+                new BigQueryPage<>(BigQueryTestUtils.getTableList(BigQueryTestUtils.PROJECT_1_NAME,
+                        datasetName, numTables));
+
+        String tableName = tablesPage.iterateAll().iterator().next().getTableId().getTable();
+
+        when(bigQuery.listTables(any(DatasetId.class))).thenReturn(tablesPage);
+
+        Schema tableSchema = BigQueryTestUtils.getTestSchema();
+        StandardTableDefinition tableDefinition = StandardTableDefinition.newBuilder()
+                .setSchema(tableSchema).build();
+
+        Table table = mock(Table.class);
+        when(table.getTableId()).thenReturn(TableId.of(BigQueryTestUtils.PROJECT_1_NAME, datasetName, tableName));
+        when(table.getDefinition()).thenReturn(tableDefinition);
+        when(bigQuery.getTable(any(TableId.class))).thenReturn(table);
+
+        //Make the call
+        GetTableRequest getTableRequest = new GetTableRequest(BigQueryTestUtils.FEDERATED_IDENTITY,
+                QUERY_ID, BigQueryTestUtils.PROJECT_1_NAME,
+                new TableName(datasetName, tableName));
+        GetTableResponse response = bigQueryMetadataHandler.doGetTable(blockAllocator, getTableRequest);
+
+        assertNotNull(response);
+
+        //Number of Fields
+        assertEquals(tableSchema.getFields().size(), response.getSchema().getFields().size());
+    }
+
+    @Test
+    public void testDoGetSplits()
+    {
+        GetSplitsRequest request = new GetSplitsRequest(BigQueryTestUtils.FEDERATED_IDENTITY,
+                QUERY_ID, CATALOG, TABLE_NAME,
+                mock(Block.class), Collections.emptyList(), new Constraints(new HashMap<>()), null);
+        GetSplitsResponse response = bigQueryMetadataHandler.doGetSplits(blockAllocator, request);
+
+        assertNotNull(response);
+    }
+}
diff --git a/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryPage.java b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryPage.java
new file mode 100644
index 0000000000..64fdad325d
--- /dev/null
+++ b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryPage.java
@@ -0,0 +1,86 @@
+/*-
+ * #%L
+ * athena-bigquery
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+package com.amazonaws.athena.connectors.bigquery;
+
+import com.google.api.gax.paging.Page;
+
+import java.util.Collection;
+import java.util.Iterator;
+
+/**
+ * This class is a wrapper around the {@link Page} class as a convenient way to create Pages in unit tests.
+ *
+ * @param <T> The type of object that is being returned from a Google BigQuery API call. For example, getDatasets().
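+ * All results are held in a single in-memory page, so hasNextPage() always returns false.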
+ */
+class BigQueryPage<T>
+        implements Page<T>
+{
+    final Collection<T> collection;
+
+    BigQueryPage(Collection<T> collection)
+    {
+        this.collection = collection;
+    }
+
+    @Override
+    public boolean hasNextPage()
+    {
+        return false;
+    }
+
+    @Override
+    public String getNextPageToken()
+    {
+        return null;
+    }
+
+    @Override
+    public Page<T> getNextPage()
+    {
+        return null;
+    }
+
+    @Override
+    public Iterable<T> iterateAll()
+    {
+        return new Iterable<T>()
+        {
+            @Override
+            public Iterator<T> iterator()
+            {
+                return collection.iterator();
+            }
+        };
+    }
+
+    @Override
+    public Iterable<T> getValues()
+    {
+        return new Iterable<T>()
+        {
+            @Override
+            public Iterator<T> iterator()
+            {
+                return collection.iterator();
+            }
+        };
+    }
+}
diff --git a/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryRecordHandlerTest.java b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryRecordHandlerTest.java
new file mode 100644
index 0000000000..f5555e5fa1
--- /dev/null
+++ b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryRecordHandlerTest.java
@@ -0,0 +1,314 @@
+/*-
+ * #%L
+ * athena-bigquery
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+package com.amazonaws.athena.connectors.bigquery;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader;
+import com.amazonaws.athena.connector.lambda.data.S3BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.SpillConfig;
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKey;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory;
+import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory;
+import com.amazonaws.services.athena.AmazonAthena;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.model.PutObjectResult;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.s3.model.S3ObjectInputStream;
+import com.amazonaws.services.secretsmanager.AWSSecretsManager;
+import com.google.api.gax.paging.Page;
+import com.google.cloud.bigquery.BigQuery;
+import com.google.cloud.bigquery.Dataset;
+import com.google.cloud.bigquery.DatasetId;
+import com.google.cloud.bigquery.FieldValue;
+import com.google.cloud.bigquery.FieldValueList;
+import
com.google.cloud.bigquery.Job;
+import com.google.cloud.bigquery.JobInfo;
+import com.google.cloud.bigquery.Table;
+import com.google.cloud.bigquery.TableResult;
+import com.google.common.io.ByteStreams;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mock;
+import org.mockito.MockitoAnnotations;
+import org.mockito.invocation.InvocationOnMock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayInputStream;
+import java.io.InputStream;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+import java.util.UUID;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+import static org.mockito.Matchers.any;
+import static org.mockito.Matchers.anyObject;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+public class BigQueryRecordHandlerTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(BigQueryRecordHandlerTest.class);
+
+    private String bucket = "bucket";
+    private String prefix = "prefix";
+
+    @Mock
+    BigQuery bigQuery;
+
+    @Mock
+    AWSSecretsManager awsSecretsManager;
+
+    @Mock
+    private AmazonAthena athena;
+
+    private BigQueryRecordHandler bigQueryRecordHandler;
+
+    private BlockAllocator allocator;
+    private List<ByteHolder> mockS3Storage = new ArrayList<>();
+    private AmazonS3 amazonS3;
+    private S3BlockSpiller spillWriter;
+    private S3BlockSpillReader spillReader;
+    private Schema schemaForRead;
+    private EncryptionKeyFactory keyFactory = new LocalKeyFactory();
+    private EncryptionKey encryptionKey = keyFactory.create();
+    private SpillConfig spillConfig;
+    private String queryId = UUID.randomUUID().toString();
+    private S3SpillLocation s3SpillLocation = S3SpillLocation.newBuilder()
+            .withBucket(UUID.randomUUID().toString())
+            .withSplitId(UUID.randomUUID().toString())
+            .withQueryId(queryId)
+            .withIsDirectory(true)
+            .build();
+
+    @Before
+    public void init()
+    {
+        logger.info("Starting init.");
+        MockitoAnnotations.initMocks(this);
+
+        allocator = new BlockAllocatorImpl();
+        amazonS3 = mock(AmazonS3.class);
+
+        mockS3Client();
+
+        //Create Spill config
+        spillConfig = SpillConfig.newBuilder()
+                .withEncryptionKey(encryptionKey)
+                //This will be enough for a single block
+                .withMaxBlockBytes(100000)
+                //This will force the writer to spill.
+                .withMaxInlineBlockBytes(100)
+                //Async Writing.
+                .withNumSpillThreads(0)
+                .withRequestId(UUID.randomUUID().toString())
+                .withSpillLocation(s3SpillLocation)
+                .build();
+
+        schemaForRead = new Schema(BigQueryTestUtils.getTestSchemaFieldsArrow());
+        spillWriter = new S3BlockSpiller(amazonS3, spillConfig, allocator, schemaForRead, ConstraintEvaluator.emptyEvaluator());
+        spillReader = new S3BlockSpillReader(amazonS3, allocator);
+
+        //Mock the BigQuery Client to return Datasets, and Table Schema information.
+        BigQueryPage<Dataset> datasets = new BigQueryPage<>(BigQueryTestUtils.getDatasetList(BigQueryTestUtils.PROJECT_1_NAME, 2));
+        when(bigQuery.listDatasets(any(String.class))).thenReturn(datasets);
+        BigQueryPage<Table>
+        BigQueryPage<Table> tables = new BigQueryPage<>
(BigQueryTestUtils.getTableList(BigQueryTestUtils.PROJECT_1_NAME, "dataset1", 2));
+        when(bigQuery.listTables(any(DatasetId.class))).thenReturn(tables);
+
+        //The class we want to test.
+        bigQueryRecordHandler = new BigQueryRecordHandler(amazonS3, awsSecretsManager, athena, bigQuery);
+
+        logger.info("Completed init.");
+    }
+
+    @Test
+    public void testReadWithConstraint()
+            throws Exception
+    {
+        try (ReadRecordsRequest request = new ReadRecordsRequest(
+                BigQueryTestUtils.FEDERATED_IDENTITY,
+                BigQueryTestUtils.PROJECT_1_NAME,
+                "queryId",
+                new TableName("dataset1", "table1"),
+                BigQueryTestUtils.getBlockTestSchema(),
+                Split.newBuilder(S3SpillLocation.newBuilder()
+                        .withBucket(bucket)
+                        .withPrefix(prefix)
+                        .withSplitId(UUID.randomUUID().toString())
+                        .withQueryId(UUID.randomUUID().toString())
+                        .withIsDirectory(true)
+                        .build(),
+                        keyFactory.create()).build(),
+                new Constraints(Collections.EMPTY_MAP),
+                0, //This is ignored when directly calling readWithConstraints.
+                0)) { //This is ignored when directly calling readWithConstraints.
+            //Always return true for the evaluator to keep all rows.
+            ConstraintEvaluator evaluator = mock(ConstraintEvaluator.class);
+            when(evaluator.apply(any(String.class), any(Object.class))).thenAnswer(
+                    (InvocationOnMock invocationOnMock) -> {
+                        return true;
+                    }
+            );
+
+            //Populate the schema and data that the mocked Google BigQuery client will return.
+            com.google.cloud.bigquery.Schema tableSchema = BigQueryTestUtils.getTestSchema();
+            List<FieldValueList> tableRows = Arrays.asList(
+                    BigQueryTestUtils.getBigQueryFieldValueList(false, 1000, "test1", 123123.12312),
+                    BigQueryTestUtils.getBigQueryFieldValueList(true, 500, "test2", 5345234.22111),
+                    BigQueryTestUtils.getBigQueryFieldValueList(false, 700, "test3", 324324.23423),
+                    BigQueryTestUtils.getBigQueryFieldValueList(true, 900, null, null),
+                    BigQueryTestUtils.getBigQueryFieldValueList(null, null, "test5", 2342.234234),
+                    BigQueryTestUtils.getBigQueryFieldValueList(true, 1200, "test6", 1123.12312),
+                    BigQueryTestUtils.getBigQueryFieldValueList(false, 100, "test7", 1313.12312),
+                    BigQueryTestUtils.getBigQueryFieldValueList(true, 120, "test8", 12313.1312),
+                    BigQueryTestUtils.getBigQueryFieldValueList(false, 300, "test9", 12323.1312)
+            );
+            Page<FieldValueList> fieldValueList = new BigQueryPage<>(tableRows);
+            TableResult result = new TableResult(tableSchema, tableRows.size(), fieldValueList);
+
+            //Mock out the Google BigQuery Job.
+            Job mockBigQueryJob = mock(Job.class);
+            when(mockBigQueryJob.isDone()).thenReturn(false).thenReturn(true);
+            when(mockBigQueryJob.getQueryResults()).thenReturn(result);
+            when(bigQuery.create(any(JobInfo.class))).thenReturn(mockBigQueryJob);
+
+            QueryStatusChecker queryStatusChecker = mock(QueryStatusChecker.class);
+            when(queryStatusChecker.isQueryRunning()).thenReturn(true);
+
+            //Execute the test
+            bigQueryRecordHandler.readWithConstraint(spillWriter, request, queryStatusChecker);
+
+            //Ensure that there was a spill so that we can read the spilled block.
+            assertTrue(spillWriter.spilled());
+            //Calling getSpillLocations() forces a flush.
+            assertEquals(1, spillWriter.getSpillLocations().size());
+
+            //Read the spilled block
+            Block block = spillReader.read(s3SpillLocation, encryptionKey, schemaForRead);
+
+            assertEquals("The number of rows returned does not match the expected count!", tableRows.size(), block.getRowCount());
+            validateBlock(block, tableRows);
+        }
+    }
+
+    private void validateBlock(Block block, List<FieldValueList> tableRows)
+    {
+        //Iterate through the fields
+        for (Field field : block.getFields()) {
+            FieldReader fieldReader = block.getFieldReader(field.getName());
+            int currentCount = 0;
+            //Iterate through the rows and match them up with the block
+            for (FieldValueList tableRow : tableRows) {
+                FieldValue orgValue = tableRow.get(field.getName());
+                fieldReader.setPosition(currentCount);
+                currentCount++;
+
+                logger.debug("comparing: {} with {}", orgValue.getValue(), fieldReader.readObject());
+
+                //Check for null values.
+                if ((orgValue.getValue() == null || fieldReader.readObject() == null)) {
+                    assertTrue(orgValue.isNull());
+                    assertFalse(fieldReader.isSet());
+                    continue;
+                }
+
+                //Check regular values.
+                Types.MinorType type = Types.getMinorTypeForArrowType(field.getType());
+                switch (type) {
+                    case INT:
+                        assertEquals(orgValue.getLongValue(), (long) fieldReader.readInteger());
+                        break;
+                    case BIT:
+                        assertEquals(orgValue.getBooleanValue(), fieldReader.readBoolean());
+                        break;
+                    case FLOAT4:
+                        assertEquals(orgValue.getDoubleValue(), fieldReader.readFloat(), 0.001);
+                        break;
+                    case FLOAT8:
+                        assertEquals(orgValue.getDoubleValue(), fieldReader.readDouble(), 0.001);
+                        break;
+                    case VARCHAR:
+                        assertEquals(orgValue.getStringValue(), fieldReader.readText().toString());
+                        break;
+                    default:
+                        throw new RuntimeException("No validation configured for field " + field.getName() + ":" + type + " " + field.getChildren());
+                }
+            }
+        }
+    }
+
+    //Mocks the S3 client by storing any putObject() payloads and returning them when getObject() is called.
+    private void mockS3Client()
+    {
+        when(amazonS3.putObject(anyObject(), anyObject(), anyObject(), anyObject()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> {
+                    InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2];
+                    ByteHolder byteHolder = new ByteHolder();
+                    byteHolder.setBytes(ByteStreams.toByteArray(inputStream));
+                    mockS3Storage.add(byteHolder);
+                    return mock(PutObjectResult.class);
+                });
+
+        when(amazonS3.getObject(anyString(), anyString()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> {
+                    S3Object mockObject = mock(S3Object.class);
+                    ByteHolder byteHolder = mockS3Storage.get(0);
+                    mockS3Storage.remove(0);
+                    when(mockObject.getObjectContent()).thenReturn(
+                            new S3ObjectInputStream(
+                                    new ByteArrayInputStream(byteHolder.getBytes()), null));
+                    return mockObject;
+                });
+    }
+
+    private class ByteHolder
+    {
+        private byte[] bytes;
+
+        void setBytes(byte[] bytes)
+        {
+            this.bytes = bytes;
+        }
+
+        byte[] getBytes()
+        {
+            return bytes;
+        }
+    }
+}
diff --git a/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQuerySqlUtilsTest.java b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQuerySqlUtilsTest.java
new file mode 100644
index 0000000000..ee0564cb80
--- /dev/null
+++ b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQuerySqlUtilsTest.java
@@ -0,0 +1,115 @@
+/*-
+ * #%L
+ * athena-bigquery
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +package com.amazonaws.athena.connectors.bigquery; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.AllOrNoneValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Marker; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Test; + +import java.util.HashMap; +import java.util.LinkedHashMap; +import java.util.Map; + +import static org.junit.Assert.assertEquals; + +public class BigQuerySqlUtilsTest +{ + static final TableName tableName = new TableName("schema", "table"); + static final Split split = null; + + static final ArrowType BOOLEAN_TYPE = ArrowType.Bool.INSTANCE; + static final ArrowType INT_TYPE = new ArrowType.Int(32, true); + + @Test + public void testSqlWithConstraintsEquality() + throws Exception + { + Map constraintMap = new LinkedHashMap<>(); + constraintMap.put("bool1", EquatableValueSet.newBuilder(new BlockAllocatorImpl(), BOOLEAN_TYPE, + true, false).add(false).build()); + constraintMap.put("int1", EquatableValueSet.newBuilder(new BlockAllocatorImpl(), INT_TYPE, + true, false).add(14).build()); + constraintMap.put("nullableField", EquatableValueSet.newBuilder(new BlockAllocatorImpl(), INT_TYPE, + true, true).build()); + + try (Constraints constraints = new Constraints(constraintMap)) { + String sql = BigQuerySqlUtils.buildSqlFromSplit(tableName, makeSchema(constraintMap), constraints, split); + assertEquals("SELECT bool1,int1,nullableField from schema.table WHERE (bool1 = false) AND (int1 = 14) AND (nullableField is null)", sql); + } + } + + @Test + public void testSqlWithConstraintsRanges() + throws Exception + { + Map constraintMap = new LinkedHashMap<>(); + ValueSet rangeSet = SortedRangeSet.newBuilder(INT_TYPE, true).add(new Range(Marker.above(new BlockAllocatorImpl(), INT_TYPE, 10), + Marker.exactly(new BlockAllocatorImpl(), INT_TYPE, 20))).build(); + + ValueSet isNullRangeSet = SortedRangeSet.newBuilder(INT_TYPE, true).build(); + + ValueSet isNonNullRangeSet = SortedRangeSet.newBuilder(INT_TYPE, false).add( + new Range(Marker.lowerUnbounded(new BlockAllocatorImpl(), INT_TYPE), + Marker.upperUnbounded(new BlockAllocatorImpl(), INT_TYPE))) + .build(); + + constraintMap.put("integerRange", rangeSet); + constraintMap.put("isNullRange", isNullRangeSet); + constraintMap.put("isNotNullRange", isNonNullRangeSet); + + try (Constraints constraints = new Constraints(constraintMap)) { + String sql = 
BigQuerySqlUtils.buildSqlFromSplit(tableName, makeSchema(constraintMap), constraints, split);
+            assertEquals("SELECT integerRange,isNullRange,isNotNullRange from schema.table WHERE (integerRange > 10) AND (integerRange <= 20) AND (isNullRange is null) AND (isNotNullRange is not null)", sql);
+        }
+    }
+
+    private Schema makeSchema(Map<String, ValueSet> constraintMap)
+    {
+        SchemaBuilder builder = new SchemaBuilder();
+        for (Map.Entry<String, ValueSet> field : constraintMap.entrySet()) {
+            ArrowType.ArrowTypeID typeId = field.getValue().getType().getTypeID();
+            switch (typeId) {
+                case Int:
+                    builder.addIntField(field.getKey());
+                    break;
+                case Bool:
+                    builder.addBitField(field.getKey());
+                    break;
+                case Utf8:
+                    builder.addStringField(field.getKey());
+                    break;
+                default:
+                    throw new UnsupportedOperationException("Type Not Implemented: " + typeId.name());
+            }
+        }
+        return builder.build();
+    }
+}
diff --git a/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryTestUtils.java b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryTestUtils.java
new file mode 100644
index 0000000000..168b86ac20
--- /dev/null
+++ b/athena-bigquery/src/test/java/com/amazonaws/athena/connectors/bigquery/BigQueryTestUtils.java
@@ -0,0 +1,143 @@
+/*-
+ * #%L
+ * athena-bigquery
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+package com.amazonaws.athena.connectors.bigquery;
+
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.google.cloud.bigquery.Dataset;
+import com.google.cloud.bigquery.DatasetId;
+import com.google.cloud.bigquery.Field;
+import com.google.cloud.bigquery.FieldList;
+import com.google.cloud.bigquery.FieldValue;
+import com.google.cloud.bigquery.FieldValueList;
+import com.google.cloud.bigquery.LegacySQLTypeName;
+import com.google.cloud.bigquery.Schema;
+import com.google.cloud.bigquery.Table;
+import com.google.cloud.bigquery.TableId;
+import org.apache.arrow.vector.types.FloatingPointPrecision;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.FieldType;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.List;
+
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+public class BigQueryTestUtils
+{
+    public static final FederatedIdentity FEDERATED_IDENTITY = new FederatedIdentity("id", "principal", "account");
+    public static final String BOOL_FIELD_NAME_1 = "bool1";
+    public static final String INTEGER_FIELD_NAME_1 = "int1";
+    public static final String STRING_FIELD_NAME_1 = "string1";
+    public static final String FLOAT_FIELD_NAME_1 = "float1";
+
+    private BigQueryTestUtils() {
+    }
+
+    public static final String PROJECT_1_NAME = "testProject";
+
+    //Returns a list of mocked Datasets.
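+    //(Only getDatasetId() and getFriendlyName() are stubbed; those are the only calls the code under test makes.)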
+    static List<Dataset> getDatasetList(String projectName, int numDatasets)
+    {
+        List<Dataset> datasetList = new ArrayList<>();
+        for (int i = 0; i < numDatasets; i++) {
+            Dataset dataset1 = mock(Dataset.class);
+            when(dataset1.getDatasetId()).thenReturn(DatasetId.of(projectName, "dataset" + i));
+            when(dataset1.getFriendlyName()).thenReturn("dataset" + i);
+            datasetList.add(dataset1);
+        }
+        return datasetList;
+    }
+
+    //Returns a list of mocked Tables
+    static List<Table>
getTableList(String projectName, String dataset, int numTables)
+    {
+        List<Table>
tableList = new ArrayList<>(); + for (int i = 0; i < numTables; i++) { + Table table = mock(Table.class); + when(table.getTableId()).thenReturn(TableId.of(projectName, dataset, "table" + i)); + tableList.add(table); + } + return tableList; + } + + //Returns the schema by returning a list of fields in Google BigQuery Format. + static List getTestSchemaFields() + { + return Arrays.asList(Field.of(BOOL_FIELD_NAME_1, LegacySQLTypeName.BOOLEAN), + Field.of(INTEGER_FIELD_NAME_1, LegacySQLTypeName.INTEGER), + Field.of(STRING_FIELD_NAME_1, LegacySQLTypeName.STRING), + Field.of(FLOAT_FIELD_NAME_1, LegacySQLTypeName.FLOAT) + ); + } + + static Schema getTestSchema() + { + return Schema.of(getTestSchemaFields()); + } + + //Gets the schema in Arrow Format. + static org.apache.arrow.vector.types.pojo.Schema getBlockTestSchema() + { + return SchemaBuilder.newBuilder() + .addBitField(BOOL_FIELD_NAME_1) + .addIntField(INTEGER_FIELD_NAME_1) + .addStringField(STRING_FIELD_NAME_1) + .addFloat8Field(FLOAT_FIELD_NAME_1) + .build(); + } + + static Collection getTestSchemaFieldsArrow() + { + return Arrays.asList( + new org.apache.arrow.vector.types.pojo.Field(BOOL_FIELD_NAME_1, + FieldType.nullable(ArrowType.Bool.INSTANCE), null), + new org.apache.arrow.vector.types.pojo.Field(INTEGER_FIELD_NAME_1, + FieldType.nullable(new ArrowType.Int(32, true)), null), + new org.apache.arrow.vector.types.pojo.Field(STRING_FIELD_NAME_1, + FieldType.nullable(new ArrowType.Utf8()), null), + new org.apache.arrow.vector.types.pojo.Field(FLOAT_FIELD_NAME_1, + FieldType.nullable(new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)), null) + ); + } + + static List generateBigQueryRowValue(Boolean bool, Integer integer, String string, Double floatVal) + { + return Arrays.asList( + //Primitives are stored as Strings. + FieldValue.of(FieldValue.Attribute.PRIMITIVE, bool == null ? null : String.valueOf(bool)), + FieldValue.of(FieldValue.Attribute.PRIMITIVE, integer == null ? null : String.valueOf(integer)), + //Timestamps are stored as a number, where the integer component of the number is seconds since epoch + //and the microsecond part is the decimal part. + FieldValue.of(FieldValue.Attribute.PRIMITIVE, string), + FieldValue.of(FieldValue.Attribute.PRIMITIVE, floatVal == null ? null : String.valueOf(floatVal)) + ); + } + + static FieldValueList getBigQueryFieldValueList(Boolean bool, Integer integer, String string, Double floatVal) + { + return FieldValueList.of(generateBigQueryRowValue(bool, integer, string, floatVal), + FieldList.of(getTestSchemaFields())); + } +} diff --git a/athena-cloudwatch-metrics/LICENSE.txt b/athena-cloudwatch-metrics/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-cloudwatch-metrics/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. 
For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. 
This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. \ No newline at end of file diff --git a/athena-cloudwatch-metrics/README.md b/athena-cloudwatch-metrics/README.md new file mode 100644 index 0000000000..caed435c27 --- /dev/null +++ b/athena-cloudwatch-metrics/README.md @@ -0,0 +1,85 @@ +# Amazon Athena Cloudwatch Metrics Connector + +This connector enables Amazon Athena to communicate with Cloudwatch Metrics, making your metrics data accessible via SQL. + +## Usage + +### Parameters + +The Athena Cloudwatch Metrics Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below. + +1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket) +2. **spill_prefix** - (Optional) Defaults to sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/Hours. +3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. 
Setting a KMS Key ID allows your Lambda function to use KMS for key generation, giving you a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331)
+4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM, either with a randomly generated key or using KMS to generate keys. Setting this to True disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False)
+
+The connector also supports AIMD Congestion Control for handling throttling events from Cloudwatch via the Athena Query Federation SDK's ThrottlingInvoker construct. You can tweak the default throttling behavior by setting any of the below (optional) environment variables:
+
+1. **throttle_initial_delay_ms** - (Default: 10ms) This is the initial call delay applied after the first congestion event.
+1. **throttle_max_delay_ms** - (Default: 1000ms) This is the max delay between calls. You can derive the minimum call rate (TPS) by dividing 1000ms by this value.
+1. **throttle_decrease_factor** - (Default: 0.5) This is the factor by which we reduce our call rate.
+1. **throttle_increase_ms** - (Default: 10ms) This is the amount by which we decrease the call delay after successful calls.
+
+
+### Databases & Tables
+
+The Athena Cloudwatch Metrics Connector maps your Namespaces, Dimensions, Metrics, and Metric Values into two tables in a single schema called "default".
+
+1. **metrics** - This table contains the available metrics as uniquely defined by a triple of namespace, dimension set, and name. More specifically, this table contains the following columns:
+
+   * **namespace** - A VARCHAR containing the namespace.
+   * **metric_name** - A VARCHAR containing the metric name.
+   * **dimensions** - A LIST of STRUCTs comprised of dim_name (VARCHAR) and dim_value (VARCHAR).
+   * **statistic** - A LIST of VARCHAR statistics (e.g. p90, AVERAGE, etc.) available for the metric.
+
+1. **metric_samples** - This table contains the available metric samples for each metric named in the **metrics** table. More specifically, the table contains the following columns:
+   * **namespace** - A VARCHAR containing the namespace.
+   * **metric_name** - A VARCHAR containing the metric name.
+   * **dimensions** - A LIST of STRUCTs comprised of dim_name (VARCHAR) and dim_value (VARCHAR).
+   * **dim_name** - A VARCHAR convenience field used to easily filter on a single dimension name.
+   * **dim_value** - A VARCHAR convenience field used to easily filter on a single dimension value.
+   * **period** - An INT field representing the 'period' of the metric in seconds. (e.g. a 60 second metric)
+   * **timestamp** - A BIGINT field representing the epoch time (in seconds) the metric sample is for.
+   * **value** - A FLOAT8 field containing the value of the sample.
+   * **statistic** - A VARCHAR containing the statistic type of the sample. (e.g. AVERAGE, p90, etc.)
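+For example, the dim_name/dim_value convenience fields let you scope a scan to a single dimension without unnesting the dimensions column. The query below is an illustrative sketch: it assumes the connector was deployed as a Lambda function (catalog) named "lambda:", as in the deployment example later in this README, and filters on a hypothetical Lambda function named my-function:
+
+```sql
+SELECT timestamp, value
+FROM "lambda:"."default".metric_samples
+WHERE namespace = 'AWS/Lambda'
+    AND metric_name = 'Invocations'
+    AND statistic = 'Average'
+    AND period = 60
+    AND dim_name = 'FunctionName'
+    AND dim_value = 'my-function'
+LIMIT 100;
+```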
+### Required Permissions
+
+Review the "Policies" section of the athena-cloudwatch-metrics.yaml file for full details on the IAM Policies required by this connector. A brief summary is below.
+
+1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3.
+2. Cloudwatch Metrics ReadOnly - The connector uses this access to query your metrics data.
+3. Cloudwatch Logs Write - The connector uses this access to write its own diagnostic logs.
+4. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated.
+
+### Deploying The Connector
+
+To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-cloudwatch-metrics dir, run `mvn clean install`.
+3. From the athena-cloudwatch-metrics dir, run `../tools/publish.sh S3_BUCKET_NAME athena-cloudwatch-metrics` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with the requisite permissions to deploy instances of the connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo)
+4. Try running a query like the one below in Athena:
+```sql
+-- Get the list of available metrics
+select * from "lambda:"."default".metrics limit 100
+
+-- Query the last 3 days of AWS/Lambda Invocations metrics
+SELECT *
+FROM "lambda:"."default".metric_samples
+WHERE metric_name = 'Invocations'
+  AND namespace = 'AWS/Lambda'
+  AND statistic IN ( 'p90', 'Average' )
+  AND period = 60
+  AND timestamp BETWEEN To_unixtime(Now() - INTERVAL '3' day) AND
+  To_unixtime(Now())
+LIMIT 100;
+```
+
+## Performance
+
+The Athena Cloudwatch Metrics Connector will attempt to parallelize queries against Cloudwatch Metrics by parallelizing scans of the various metrics needed for your query. Predicate Pushdown is performed within the Lambda function and also within Cloudwatch Metrics for certain time period, metric, namespace, and dimension filters.
+
+## License
+
+This project is licensed under the Apache-2.0 License.
\ No newline at end of file
diff --git a/athena-cloudwatch-metrics/athena-cloudwatch-metrics.yaml b/athena-cloudwatch-metrics/athena-cloudwatch-metrics.yaml
new file mode 100644
index 0000000000..1d9d44e84f
--- /dev/null
+++ b/athena-cloudwatch-metrics/athena-cloudwatch-metrics.yaml
@@ -0,0 +1,66 @@
+Transform: 'AWS::Serverless-2016-10-31'
+Metadata:
+  'AWS::ServerlessRepo::Application':
+    Name: AthenaCloudwatchMetricsConnector
+    Description: 'This connector enables Amazon Athena to communicate with Cloudwatch Metrics, making your metrics data accessible via SQL.'
+    Author: 'Amazon Athena'
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels:
+      - athena-federation
+    HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+Parameters:
+  AthenaCatalogName:
+    Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.'
+    Type: String
+  SpillBucket:
+    Description: 'The bucket where this function can spill data.'
+    Type: String
+  SpillPrefix:
+    Description: 'The bucket prefix where this function can spill large responses.'
+    Type: String
+    Default: athena-spill
+  LambdaTimeout:
+    Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)'
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: 'Lambda memory in MB (min 128 - 3008 max).'
+    Default: 3008
+    Type: Number
+  DisableSpillEncryption:
+    Description: "WARNING: If set to 'true' encryption for spilled data is disabled."
+    Default: 'false'
+    Type: String
+Resources:
+  ConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      Environment:
+        Variables:
+          disable_spill_encryption: !Ref DisableSpillEncryption
+          spill_bucket: !Ref SpillBucket
+          spill_prefix: !Ref SpillPrefix
+      FunctionName: !Ref AthenaCatalogName
+      Handler: "com.amazonaws.athena.connectors.cloudwatch.metrics.MetricsCompositeHandler"
+      CodeUri: "./target/athena-cloudwatch-metrics-1.0.jar"
+      Description: "Enables Amazon Athena to communicate with Cloudwatch Metrics, making your metrics data accessible via SQL"
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
+      Policies:
+        - Statement:
+            - Action:
+                - cloudwatch:Describe*
+                - cloudwatch:Get*
+                - cloudwatch:List*
+              Effect: Allow
+              Resource: '*'
+          Version: '2012-10-17'
+        #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy
+        #with one that is more restrictive and can only 'put' but not read,delete, or overwrite files.
+        - S3CrudPolicy:
+            BucketName: !Ref SpillBucket
\ No newline at end of file
diff --git a/athena-cloudwatch-metrics/pom.xml b/athena-cloudwatch-metrics/pom.xml
new file mode 100644
index 0000000000..b2bd8796f8
--- /dev/null
+++ b/athena-cloudwatch-metrics/pom.xml
@@ -0,0 +1,57 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-cloudwatch-metrics</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-cloudwatch</artifactId>
+            <version>1.11.490</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/DimensionSerDe.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/DimensionSerDe.java
new file mode 100644
index 0000000000..7752896ebe
--- /dev/null
+++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/DimensionSerDe.java
@@ -0,0 +1,93 @@
+/*-
+ * #%L
+ * athena-cloudwatch-metrics
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.cloudwatch.metrics;
+
+import com.amazonaws.services.cloudwatch.model.Dimension;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.databind.ObjectMapper;
+
+import java.io.IOException;
+import java.util.List;
+
+/**
+ * Used to serialize and deserialize Cloudwatch Metrics Dimension objects. This is used
+ * when creating and processing Splits.
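+ * <p>
+ * Illustrative round trip (a sketch, not code from the connector itself):
+ * <pre>
+ *     String json = DimensionSerDe.serialize(metric.getDimensions());
+ *     List&lt;Dimension&gt; restored = DimensionSerDe.deserialize(json);
+ * </pre>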
+ */
+public class DimensionSerDe
+{
+    protected static final String SERIALZIE_DIM_FIELD_NAME = "d";
+    private static final ObjectMapper mapper = new ObjectMapper();
+
+    private DimensionSerDe() {}
+
+    /**
+     * Serializes the provided List of Dimensions.
+     *
+     * @param dim The list of dimensions to serialize.
+     * @return A String containing the serialized list of Dimensions.
+     */
+    public static String serialize(List<Dimension> dim)
+    {
+        try {
+            return mapper.writeValueAsString(new DimensionHolder(dim));
+        }
+        catch (JsonProcessingException ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Deserializes the provided String into a List of Dimensions.
+     *
+     * @param serializeDim A serialized list of Dimensions.
+     * @return The List of Dimensions represented by the serialized string.
+     */
+    public static List<Dimension> deserialize(String serializeDim)
+    {
+        try {
+            return mapper.readValue(serializeDim, DimensionHolder.class).getDimensions();
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Helper which allows us to use Jackson's Object Mapper to serialize a List of Dimensions.
+     */
+    private static class DimensionHolder
+    {
+        private final List<Dimension> dimensions;
+
+        @JsonCreator
+        public DimensionHolder(@JsonProperty("dimensions") List<Dimension> dimensions)
+        {
+            this.dimensions = dimensions;
+        }
+
+        @JsonProperty
+        public List<Dimension> getDimensions()
+        {
+            return dimensions;
+        }
+    }
+}
diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricUtils.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricUtils.java
new file mode 100644
index 0000000000..862396bd58
--- /dev/null
+++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricUtils.java
@@ -0,0 +1,198 @@
+/*-
+ * #%L
+ * athena-cloudwatch-metrics
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.cloudwatch.model.Dimension; +import com.amazonaws.services.cloudwatch.model.DimensionFilter; +import com.amazonaws.services.cloudwatch.model.GetMetricDataRequest; +import com.amazonaws.services.cloudwatch.model.ListMetricsRequest; +import com.amazonaws.services.cloudwatch.model.Metric; +import com.amazonaws.services.cloudwatch.model.MetricDataQuery; +import com.amazonaws.services.cloudwatch.model.MetricStat; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.Date; +import java.util.List; +import java.util.Map; + +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_VALUE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.METRIC_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.NAMESPACE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.PERIOD_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.STATISTIC_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.TIMESTAMP_FIELD; + +/** + * Helper which prepares and filters Cloudwatch Metrics requests. + */ +public class MetricUtils +{ + private static final Logger logger = LoggerFactory.getLogger(MetricUtils.class); + + //this is a format required by Cloudwatch Metrics + private static final String METRIC_ID = "m1"; + + private MetricUtils() {} + + /** + * Filters metrics who have at least 1 metric dimension that matches DIMENSION_NAME_FIELD and DIMENSION_VALUE_FIELD filters. + * This is just an optimization and isn't fully correct. We depend on the calling engine to apply full constraints. Also + * filters metric name and namespace. + * + * @return True if the supplied metric contains at least 1 Dimension matching the evaluator. + */ + protected static boolean applyMetricConstraints(ConstraintEvaluator evaluator, Metric metric, String statistic) + { + if (!evaluator.apply(NAMESPACE_FIELD, metric.getNamespace())) { + return false; + } + + if (!evaluator.apply(METRIC_NAME_FIELD, metric.getMetricName())) { + return false; + } + + if (statistic != null && !evaluator.apply(STATISTIC_FIELD, statistic)) { + return false; + } + + for (Dimension next : metric.getDimensions()) { + if (evaluator.apply(DIMENSION_NAME_FIELD, next.getName()) && evaluator.apply(DIMENSION_VALUE_FIELD, next.getValue())) { + return true; + } + } + + if (metric.getDimensions().isEmpty() && + evaluator.apply(DIMENSION_NAME_FIELD, null) && + evaluator.apply(DIMENSION_VALUE_FIELD, null)) { + return true; + } + + return false; + } + + /** + * Attempts to push the supplied predicate constraints onto the Cloudwatch Metrics request. 
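+     * <p>
+     * For example (illustrative), a single-valued constraint such as namespace = 'AWS/Lambda' is applied via
+     * <pre>
+     *     listMetricsRequest.setNamespace("AWS/Lambda");
+     * </pre>
+     * so that Cloudwatch filters the metric listing server side rather than the connector filtering it client side.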
+     */
+    protected static void pushDownPredicate(Constraints constraints, ListMetricsRequest listMetricsRequest)
+    {
+        Map<String, ValueSet> summary = constraints.getSummary();
+
+        ValueSet namespaceConstraint = summary.get(NAMESPACE_FIELD);
+        if (namespaceConstraint != null && namespaceConstraint.isSingleValue()) {
+            listMetricsRequest.setNamespace(namespaceConstraint.getSingleValue().toString());
+        }
+
+        ValueSet metricConstraint = summary.get(METRIC_NAME_FIELD);
+        if (metricConstraint != null && metricConstraint.isSingleValue()) {
+            listMetricsRequest.setMetricName(metricConstraint.getSingleValue().toString());
+        }
+
+        ValueSet dimensionNameConstraint = summary.get(DIMENSION_NAME_FIELD);
+        ValueSet dimensionValueConstraint = summary.get(DIMENSION_VALUE_FIELD);
+        if (dimensionNameConstraint != null && dimensionNameConstraint.isSingleValue() &&
+                dimensionValueConstraint != null && dimensionValueConstraint.isSingleValue()) {
+            DimensionFilter filter = new DimensionFilter()
+                    .withName(dimensionNameConstraint.getSingleValue().toString())
+                    .withValue(dimensionValueConstraint.getSingleValue().toString());
+            listMetricsRequest.setDimensions(Collections.singletonList(filter));
+        }
+    }
+
+    /**
+     * Creates a Cloudwatch Metrics sample data request from the provided inputs
+     *
+     * @param readRecordsRequest The RecordReadRequest to make into a Cloudwatch Metrics Data request.
+     * @return The Cloudwatch Metrics Data request that matches the requested read operation.
+     */
+    protected static GetMetricDataRequest makeGetMetricDataRequest(ReadRecordsRequest readRecordsRequest)
+    {
+        Split split = readRecordsRequest.getSplit();
+        List<Dimension> dimensions = DimensionSerDe.deserialize(split.getProperty(DimensionSerDe.SERIALZIE_DIM_FIELD_NAME));
+        GetMetricDataRequest dataRequest = new GetMetricDataRequest();
+        com.amazonaws.services.cloudwatch.model.Metric metric = new com.amazonaws.services.cloudwatch.model.Metric();
+        metric.setNamespace(split.getProperty(NAMESPACE_FIELD));
+        metric.setMetricName(split.getProperty(METRIC_NAME_FIELD));
+
+        List<Dimension> dList = new ArrayList<>();
+        for (Dimension nextDim : dimensions) {
+            dList.add(new Dimension().withName(nextDim.getName()).withValue(nextDim.getValue()));
+        }
+        metric.setDimensions(dList);
+
+        MetricDataQuery mds = new MetricDataQuery()
+                .withMetricStat(new MetricStat()
+                        .withMetric(metric)
+                        .withPeriod(Integer.valueOf(split.getProperty(PERIOD_FIELD)))
+                        .withStat(split.getProperty(STATISTIC_FIELD)))
+                .withId(METRIC_ID);
+
+        dataRequest.withMetricDataQueries(Collections.singletonList(mds));
+
+        ValueSet timeConstraint = readRecordsRequest.getConstraints().getSummary().get(TIMESTAMP_FIELD);
+        if (timeConstraint instanceof SortedRangeSet && !timeConstraint.isNullAllowed()) {
+            //SortedRangeSet is how >, <, between is represented, and these are the easiest and most common
+            //predicates when searching logs, so we attempt to push them down here as an optimization. SQL can
+            //represent complex overlapping ranges which Cloudwatch cannot support, so this is not a replacement
+            //for applying constraints using the ConstraintEvaluator.
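+            //For example (illustrative): WHERE timestamp BETWEEN 1600000000 AND 1600003600 arrives here as a
+            //single Range whose low/high bounds (epoch seconds) become the request's start/end times below.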
+ + Range basicPredicate = ((SortedRangeSet) timeConstraint).getSpan(); + + if (!basicPredicate.getLow().isNullValue()) { + Long lowerBound = (Long) basicPredicate.getLow().getValue(); + //TODO: confirm timezone handling + logger.info("makeGetMetricsRequest: with startTime " + (lowerBound * 1000) + " " + new Date(lowerBound * 1000)); + dataRequest.withStartTime(new Date(lowerBound * 1000)); + } + else { + //TODO: confirm timezone handling + dataRequest.withStartTime(new Date(0)); + } + + if (!basicPredicate.getHigh().isNullValue()) { + Long upperBound = (Long) basicPredicate.getHigh().getValue(); + //TODO: confirm timezone handling + logger.info("makeGetMetricsRequest: with endTime " + (upperBound * 1000) + " " + new Date(upperBound * 1000)); + dataRequest.withEndTime(new Date(upperBound * 1000)); + } + else { + //TODO: confirm timezone handling + dataRequest.withEndTime(new Date(System.currentTimeMillis())); + } + } + else { + //TODO: confirm timezone handling + dataRequest.withStartTime(new Date(0)); + dataRequest.withEndTime(new Date(System.currentTimeMillis())); + } + + return dataRequest; + } +} diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsCompositeHandler.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsCompositeHandler.java new file mode 100644 index 0000000000..6c2999ddf3 --- /dev/null +++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose MetricsMetadataHandler and MetricsRecordHandler. + */ +public class MetricsCompositeHandler + extends CompositeHandler +{ + public MetricsCompositeHandler() + { + super(new MetricsMetadataHandler(), new MetricsRecordHandler()); + } +} diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsExceptionFilter.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsExceptionFilter.java new file mode 100644 index 0000000000..4810c6a017 --- /dev/null +++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsExceptionFilter.java @@ -0,0 +1,49 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.services.cloudwatch.model.AmazonCloudWatchException; +import com.amazonaws.services.cloudwatch.model.LimitExceededException; + +/** + * Used to identify Exceptions that are related to Cloudwatch Metrics throttling events. + */ +public class MetricsExceptionFilter + implements ThrottlingInvoker.ExceptionFilter +{ + public static final ThrottlingInvoker.ExceptionFilter EXCEPTION_FILTER = new MetricsExceptionFilter(); + + private MetricsExceptionFilter() {} + + @Override + public boolean isMatch(Exception ex) + { + if (ex instanceof AmazonCloudWatchException && ex.getMessage().startsWith("Rate exceeded")) { + return true; + } + + if (ex instanceof AmazonCloudWatchException && ex.getMessage().startsWith("Request has been throttled")) { + return true; + } + + return (ex instanceof LimitExceededException); + } +} diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsMetadataHandler.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsMetadataHandler.java new file mode 100644 index 0000000000..fa8e4dc7b0 --- /dev/null +++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsMetadataHandler.java @@ -0,0 +1,286 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.MetricSamplesTable; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.MetricsTable; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.cloudwatch.AmazonCloudWatch; +import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder; +import com.amazonaws.services.cloudwatch.model.ListMetricsRequest; +import com.amazonaws.services.cloudwatch.model.ListMetricsResult; +import com.amazonaws.services.cloudwatch.model.Metric; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import static com.amazonaws.athena.connectors.cloudwatch.metrics.MetricsExceptionFilter.EXCEPTION_FILTER; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.METRIC_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.NAMESPACE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.PERIOD_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.STATISTIC_FIELD; + +/** + * Handles metadata requests for the Athena Cloudwatch Metrics Connector. + *

+ * <p>
+ * For more detail, please see the module's README.md; some notable characteristics of this class include:
+ * <p>

+ * 1. Provides two tables (metrics and metric_samples) for accessing Cloudwatch Metrics data via the "default" schema. + * 2. Supports Predicate Pushdown into Cloudwatch Metrics for most fields. + * 3. If multiple Metrics (namespace, metric, dimension(s), and statistic) are requested, they can be read in parallel. + */ +public class MetricsMetadataHandler + extends MetadataHandler +{ + private static final Logger logger = LoggerFactory.getLogger(MetricsMetadataHandler.class); + + //Used to log diagnostic info about this connector + private static final String SOURCE_TYPE = "metrics"; + + //List of available statistics (AVERAGE, p90, etc...). + protected static final List STATISTICS = new ArrayList<>(); + //The schema (aka database) supported by this connector + protected static final String SCHEMA_NAME = "default"; + //Schema for the metrics table + private static final Table METRIC_TABLE; + //Schema for the metric_samples table. + private static final Table METRIC_DATA_TABLE; + //Name of the table which contains details of available metrics. + private static final String METRIC_TABLE_NAME; + //Name of the table which contains metric samples. + private static final String METRIC_SAMPLES_TABLE_NAME; + //Lookup table for resolving table name to Schema. + private static final Map TABLES = new HashMap<>(); + //The default metric period to query (60 seconds) + private static final int DEFAULT_PERIOD_SEC = 60; + //Used to handle throttling events by applying AIMD congestion control + private final ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + + private final AmazonCloudWatch metrics; + + static { + //The statistics supported by Cloudwatch Metrics by default + STATISTICS.add("Average"); + STATISTICS.add("Minimum"); + STATISTICS.add("Maximum"); + STATISTICS.add("Sum"); + STATISTICS.add("Sample Count"); + STATISTICS.add("p99"); + STATISTICS.add("p95"); + STATISTICS.add("p90"); + STATISTICS.add("p50"); + STATISTICS.add("p10"); + + METRIC_TABLE = new MetricsTable(); + METRIC_DATA_TABLE = new MetricSamplesTable(); + METRIC_TABLE_NAME = METRIC_TABLE.getName(); + METRIC_SAMPLES_TABLE_NAME = METRIC_DATA_TABLE.getName(); + TABLES.put(METRIC_TABLE_NAME, METRIC_TABLE); + TABLES.put(METRIC_SAMPLES_TABLE_NAME, METRIC_DATA_TABLE); + } + + public MetricsMetadataHandler() + { + super(SOURCE_TYPE); + metrics = AmazonCloudWatchClientBuilder.standard().build(); + } + + @VisibleForTesting + protected MetricsMetadataHandler(AmazonCloudWatch metrics, + EncryptionKeyFactory keyFactory, + AWSSecretsManager secretsManager, + AmazonAthena athena, + String spillBucket, + String spillPrefix) + { + super(keyFactory, secretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix); + this.metrics = metrics; + } + + /** + * Only supports a single, static, schema defined by SCHEMA_NAME. 
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest listSchemasRequest)
+    {
+        return new ListSchemasResponse(listSchemasRequest.getCatalogName(), Collections.singletonList(SCHEMA_NAME));
+    }
+
+    /**
+     * Supports a set of static tables defined by: TABLES
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest listTablesRequest)
+    {
+        List<TableName> tables = new ArrayList<>();
+        TABLES.keySet().stream().forEach(next -> tables.add(new TableName(SCHEMA_NAME, next)));
+        return new ListTablesResponse(listTablesRequest.getCatalogName(), tables);
+    }
+
+    /**
+     * Returns the details of the requested static table.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest)
+    {
+        validateTable(getTableRequest.getTableName());
+        Table table = TABLES.get(getTableRequest.getTableName().getTableName());
+        return new GetTableResponse(getTableRequest.getCatalogName(),
+                getTableRequest.getTableName(),
+                table.getSchema(),
+                table.getPartitionColumns());
+    }
+
+    /**
+     * Our tables don't support complex layouts or partitioning, so this method is a NoOp and the SDK will
+     * automatically generate a single placeholder partition for us, since Athena requires at least 1 partition
+     * to be returned if there is potentially any data to read. We do this because Cloudwatch Metrics' APIs do
+     * not support the kind of filtering required for reasonably scoped partition pruning. Instead, we prune at
+     * Split generation time and return a single partition here. The downside of pruning at Split generation
+     * time is that we sacrifice parallelizing Split generation; however, this is not a significant performance
+     * detriment for this connector since Splits can be generated quickly and easily.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+            throws Exception
+    {
+        validateTable(request.getTableName());
+        //NoOp as we do not support partitioning.
+    }
+
+    /**
+     * Each 'metric' in Cloudwatch is uniquely identified by a quad of Namespace, Dimensions, MetricName, and
+     * Statistic. As such, we can parallelize each metric as a unique Split.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest getSplitsRequest)
+            throws Exception
+    {
+        validateTable(getSplitsRequest.getTableName());
+
+        //Handle requests for the METRIC_TABLE which requires only 1 split to list available metrics.
+        if (METRIC_TABLE_NAME.equals(getSplitsRequest.getTableName().getTableName())) {
+            //The request is just for meta-data about what metrics exist.
+            Split metricsSplit = Split.newBuilder(makeSpillLocation(getSplitsRequest), makeEncryptionKey()).build();
+            return new GetSplitsResponse(getSplitsRequest.getCatalogName(), metricsSplit);
+        }
+
+        //handle generating splits for reading actual metrics data.
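+        //Note: ListMetrics is paginated. Each doGetSplits call processes a single page, fanning it out into one
+        //Split per (metric, statistic) pair that survives the constraints, and then hands any nextToken back to
+        //Athena as the continuation token so the next doGetSplits call resumes from the following page.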
+        try (ConstraintEvaluator constraintEvaluator = new ConstraintEvaluator(blockAllocator,
+                METRIC_DATA_TABLE.getSchema(),
+                getSplitsRequest.getConstraints())) {
+            ListMetricsRequest listMetricsRequest = new ListMetricsRequest();
+            MetricUtils.pushDownPredicate(getSplitsRequest.getConstraints(), listMetricsRequest);
+            listMetricsRequest.setNextToken(getSplitsRequest.getContinuationToken());
+
+            String period = getPeriodFromConstraint(getSplitsRequest.getConstraints());
+            Set<Split> splits = new HashSet<>();
+            ListMetricsResult result = invoker.invoke(() -> metrics.listMetrics(listMetricsRequest));
+            for (Metric nextMetric : result.getMetrics()) {
+                for (String nextStatistic : STATISTICS) {
+                    if (MetricUtils.applyMetricConstraints(constraintEvaluator, nextMetric, nextStatistic)) {
+                        splits.add(Split.newBuilder(makeSpillLocation(getSplitsRequest), makeEncryptionKey())
+                                .add(DimensionSerDe.SERIALZIE_DIM_FIELD_NAME, DimensionSerDe.serialize(nextMetric.getDimensions()))
+                                .add(METRIC_NAME_FIELD, nextMetric.getMetricName())
+                                .add(NAMESPACE_FIELD, nextMetric.getNamespace())
+                                .add(STATISTIC_FIELD, nextStatistic)
+                                .add(PERIOD_FIELD, period)
+                                .build());
+                    }
+                }
+            }
+
+            String continuationToken = null;
+            if (result.getNextToken() != null &&
+                    !result.getNextToken().equalsIgnoreCase(listMetricsRequest.getNextToken())) {
+                continuationToken = result.getNextToken();
+            }
+
+            return new GetSplitsResponse(getSplitsRequest.getCatalogName(), splits, continuationToken);
+        }
+    }
+
+    /**
+     * Resolves the metric period to query, using a default if no period constraint is found.
+     */
+    private String getPeriodFromConstraint(Constraints constraints)
+    {
+        ValueSet period = constraints.getSummary().get(PERIOD_FIELD);
+        if (period != null && period.isSingleValue()) {
+            return String.valueOf(period.getSingleValue());
+        }
+
+        return String.valueOf(DEFAULT_PERIOD_SEC);
+    }
+
+    /**
+     * Validates that the requested schema and table exist in our static set of supported tables.
+     */
+    private void validateTable(TableName tableName)
+    {
+        if (!SCHEMA_NAME.equals(tableName.getSchemaName())) {
+            throw new RuntimeException("Unknown table " + tableName);
+        }
+
+        if (TABLES.get(tableName.getTableName()) == null) {
+            throw new RuntimeException("Unknown table " + tableName);
+        }
+    }
+}
diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsRecordHandler.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsRecordHandler.java
new file mode 100644
index 0000000000..73434dbda6
--- /dev/null
+++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsRecordHandler.java
@@ -0,0 +1,262 @@
+/*-
+ * #%L
+ * athena-cloudwatch-metrics
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.MetricSamplesTable; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.MetricsTable; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.cloudwatch.AmazonCloudWatch; +import com.amazonaws.services.cloudwatch.AmazonCloudWatchClientBuilder; +import com.amazonaws.services.cloudwatch.model.Dimension; +import com.amazonaws.services.cloudwatch.model.GetMetricDataRequest; +import com.amazonaws.services.cloudwatch.model.GetMetricDataResult; +import com.amazonaws.services.cloudwatch.model.ListMetricsRequest; +import com.amazonaws.services.cloudwatch.model.ListMetricsResult; +import com.amazonaws.services.cloudwatch.model.Metric; +import com.amazonaws.services.cloudwatch.model.MetricDataResult; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Field; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Date; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.concurrent.TimeoutException; + +import static com.amazonaws.athena.connector.lambda.data.FieldResolver.DEFAULT; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.MetricsExceptionFilter.EXCEPTION_FILTER; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.MetricsMetadataHandler.STATISTICS; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSIONS_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_VALUE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.METRIC_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.NAMESPACE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.PERIOD_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.STATISTIC_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.TIMESTAMP_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.VALUE_FIELD; + +/** + * Handles data read record requests for the Athena Cloudwatch Metrics Connector. + *

+ * For more detail, please see the module's README.md. Some notable characteristics of this class include:
+ *

+ * 1. Reads and maps Cloudwatch Metrics and Metric Samples. + * 2. Attempts to push down time range predicates into Cloudwatch Metrics. + */ +public class MetricsRecordHandler + extends RecordHandler +{ + private static final Logger logger = LoggerFactory.getLogger(MetricsRecordHandler.class); + + //Used to log diagnostic info about this connector + private static final String SOURCE_TYPE = "metrics"; + //Schema for the metrics table. + private static final Table METRIC_TABLE = new MetricsTable(); + //Schema for the metric_samples table. + private static final Table METRIC_DATA_TABLE = new MetricSamplesTable(); + + //Used to handle throttling events by applying AIMD congestion control + private final ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + private final AmazonS3 amazonS3; + private final AmazonCloudWatch metrics; + + public MetricsRecordHandler() + { + this(AmazonS3ClientBuilder.defaultClient(), + AWSSecretsManagerClientBuilder.defaultClient(), + AmazonAthenaClientBuilder.defaultClient(), + AmazonCloudWatchClientBuilder.standard().build()); + } + + @VisibleForTesting + protected MetricsRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, AmazonCloudWatch metrics) + { + super(amazonS3, secretsManager, athena, SOURCE_TYPE); + this.amazonS3 = amazonS3; + this.metrics = metrics; + } + + /** + * Scans Cloudwatch Metrics for the list of available metrics or the samples for a specific metric. + * + * @see RecordHandler + */ + @Override + protected void readWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest readRecordsRequest, QueryStatusChecker queryStatusChecker) + throws TimeoutException + { + invoker.setBlockSpiller(blockSpiller); + if (readRecordsRequest.getTableName().getTableName().equalsIgnoreCase(METRIC_TABLE.getName())) { + readMetricsWithConstraint(blockSpiller, readRecordsRequest, queryStatusChecker); + } + else if (readRecordsRequest.getTableName().getTableName().equalsIgnoreCase(METRIC_DATA_TABLE.getName())) { + readMetricSamplesWithConstraint(blockSpiller, readRecordsRequest, queryStatusChecker); + } + } + + /** + * Handles retrieving the list of available metrics when the METRICS_TABLE is queried by listing metrics in Cloudwatch Metrics. 
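+     * Results are retrieved one ListMetrics page at a time; we loop until Cloudwatch stops returning a new
+     * nextToken (tracking the previous token so a repeated token cannot cause an infinite loop) or the query
+     * is no longer running.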
+     */
+    private void readMetricsWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest request, QueryStatusChecker queryStatusChecker)
+            throws TimeoutException
+    {
+        ListMetricsRequest listMetricsRequest = new ListMetricsRequest();
+        MetricUtils.pushDownPredicate(request.getConstraints(), listMetricsRequest);
+        String prevToken;
+        Set<String> requiredFields = new HashSet<>();
+        request.getSchema().getFields().stream().forEach(next -> requiredFields.add(next.getName()));
+        ValueSet dimensionNameConstraint = request.getConstraints().getSummary().get(DIMENSION_NAME_FIELD);
+        ValueSet dimensionValueConstraint = request.getConstraints().getSummary().get(DIMENSION_VALUE_FIELD);
+        do {
+            prevToken = listMetricsRequest.getNextToken();
+            ListMetricsResult result = invoker.invoke(() -> metrics.listMetrics(listMetricsRequest));
+            for (Metric nextMetric : result.getMetrics()) {
+                blockSpiller.writeRows((Block block, int row) -> {
+                    boolean matches = MetricUtils.applyMetricConstraints(blockSpiller.getConstraintEvaluator(), nextMetric, null);
+                    if (matches) {
+                        matches &= block.offerValue(METRIC_NAME_FIELD, row, nextMetric.getMetricName());
+                        matches &= block.offerValue(NAMESPACE_FIELD, row, nextMetric.getNamespace());
+                        matches &= block.offerComplexValue(STATISTIC_FIELD, row, DEFAULT, STATISTICS);
+
+                        matches &= block.offerComplexValue(DIMENSIONS_FIELD,
+                                row,
+                                (Field field, Object val) -> {
+                                    if (field.getName().equals(DIMENSION_NAME_FIELD)) {
+                                        return ((Dimension) val).getName();
+                                    }
+                                    else if (field.getName().equals(DIMENSION_VALUE_FIELD)) {
+                                        return ((Dimension) val).getValue();
+                                    }
+
+                                    throw new RuntimeException("Unexpected field " + field.getName());
+                                },
+                                nextMetric.getDimensions());
+
+                        //This field is 'faked' in that we just use it as a convenient way to filter single dimensions. As such
+                        //we always populate it with the value of the filter if the constraint passed and the filter was singleValue
+                        String dimName = (dimensionNameConstraint == null || !dimensionNameConstraint.isSingleValue())
+                                ? null : (dimensionNameConstraint.getSingleValue().toString());
+                        matches &= block.offerValue(DIMENSION_NAME_FIELD, row, dimName);
+
+                        //This field is 'faked' in that we just use it as a convenient way to filter single dimensions. As such
+                        //we always populate it with the value of the filter if the constraint passed and the filter was singleValue
+                        String dimValue = (dimensionValueConstraint == null || !dimensionValueConstraint.isSingleValue())
+                                ? null : dimensionValueConstraint.getSingleValue().toString();
+                        matches &= block.offerValue(DIMENSION_VALUE_FIELD, row, dimValue);
+                    }
+                    return matches ? 1 : 0;
+                });
+            }
+            listMetricsRequest.setNextToken(result.getNextToken());
+        }
+        while (listMetricsRequest.getNextToken() != null && !listMetricsRequest.getNextToken().equalsIgnoreCase(prevToken) && queryStatusChecker.isQueryRunning());
+    }
+
+    /**
+     * Handles retrieving the samples for a specific metric from Cloudwatch Metrics.
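+     * Most constraints were already applied when this Split was generated, so this method primarily maps the
+     * paged GetMetricData results into rows, emitting one row per (timestamp, value) sample for the Split's
+     * metric and statistic.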
+     */
+    private void readMetricSamplesWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest request, QueryStatusChecker queryStatusChecker)
+            throws TimeoutException
+    {
+        Split split = request.getSplit();
+        List<Dimension> dimensions = DimensionSerDe.deserialize(split.getProperty(DimensionSerDe.SERIALZIE_DIM_FIELD_NAME));
+        GetMetricDataRequest dataRequest = MetricUtils.makeGetMetricDataRequest(request);
+
+        String prevToken;
+        Set<String> requiredFields = new HashSet<>();
+        request.getSchema().getFields().stream().forEach(next -> requiredFields.add(next.getName()));
+        ValueSet dimensionNameConstraint = request.getConstraints().getSummary().get(DIMENSION_NAME_FIELD);
+        ValueSet dimensionValueConstraint = request.getConstraints().getSummary().get(DIMENSION_VALUE_FIELD);
+        do {
+            prevToken = dataRequest.getNextToken();
+            GetMetricDataResult result = invoker.invoke(() -> metrics.getMetricData(dataRequest));
+            for (MetricDataResult nextMetric : result.getMetricDataResults()) {
+                List<Date> timestamps = nextMetric.getTimestamps();
+                List<Double> values = nextMetric.getValues();
+                for (int i = 0; i < nextMetric.getValues().size(); i++) {
+                    int sampleNum = i;
+                    blockSpiller.writeRows((Block block, int row) -> {
+                        /**
+                         * Most constraints were already applied at split generation so we only need to apply
+                         * a subset.
+                         */
+                        block.offerValue(METRIC_NAME_FIELD, row, split.getProperty(METRIC_NAME_FIELD));
+                        block.offerValue(NAMESPACE_FIELD, row, split.getProperty(NAMESPACE_FIELD));
+                        block.offerValue(STATISTIC_FIELD, row, split.getProperty(STATISTIC_FIELD));
+
+                        block.offerComplexValue(DIMENSIONS_FIELD,
+                                row,
+                                (Field field, Object val) -> {
+                                    if (field.getName().equals(DIMENSION_NAME_FIELD)) {
+                                        return ((Dimension) val).getName();
+                                    }
+                                    else if (field.getName().equals(DIMENSION_VALUE_FIELD)) {
+                                        return ((Dimension) val).getValue();
+                                    }
+
+                                    throw new RuntimeException("Unexpected field " + field.getName());
+                                },
+                                dimensions);
+
+                        //This field is 'faked' in that we just use it as a convenient way to filter single dimensions. As such
+                        //we always populate it with the value of the filter if the constraint passed and the filter was singleValue
+                        String dimName = (dimensionNameConstraint == null || !dimensionNameConstraint.isSingleValue())
+                                ? null : dimensionNameConstraint.getSingleValue().toString();
+                        block.offerValue(DIMENSION_NAME_FIELD, row, dimName);
+
+                        //This field is 'faked' in that we just use it as a convenient way to filter single dimensions. As such
+                        //we always populate it with the value of the filter if the constraint passed and the filter was singleValue
+                        String dimVal = (dimensionValueConstraint == null || !dimensionValueConstraint.isSingleValue())
+                                ? null : dimensionValueConstraint.getSingleValue().toString();
+                        block.offerValue(DIMENSION_VALUE_FIELD, row, dimVal);
+
+                        block.offerValue(PERIOD_FIELD, row, Integer.valueOf(split.getProperty(PERIOD_FIELD)));
+
+                        boolean matches = true;
+                        block.offerValue(VALUE_FIELD, row, values.get(sampleNum));
+                        long timestamp = timestamps.get(sampleNum).getTime() / 1000;
+                        block.offerValue(TIMESTAMP_FIELD, row, timestamp);
+
+                        return matches ?
1 : 0; + }); + } + } + dataRequest.setNextToken(result.getNextToken()); + } + while (dataRequest.getNextToken() != null && !dataRequest.getNextToken().equalsIgnoreCase(prevToken) && queryStatusChecker.isQueryRunning()); + } +} diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/MetricSamplesTable.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/MetricSamplesTable.java new file mode 100644 index 0000000000..02f325a9ac --- /dev/null +++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/MetricSamplesTable.java @@ -0,0 +1,100 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics.tables; + +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.util.Collections; +import java.util.Set; + +/** + * Defines the metadata associated with our static metric_samples table. + *

+ * This table contains the available metric samples for each metric named in the **metrics** table. + * More specifically, the table contains the following columns: + *

+ * **namespace** - A VARCHAR containing the namespace.
+ * **metric_name** - A VARCHAR containing the metric name.
+ * **dimensions** - A LIST of STRUCTS comprised of dim_name (VARCHAR) and dim_value (VARCHAR).
+ * **dim_name** - A VARCHAR convenience field used to easily filter on a single dimension name.
+ * **dim_value** - A VARCHAR convenience field used to easily filter on a single dimension value.
+ * **period** - An INT field representing the 'period' of the metric in seconds. (e.g. 60 second metric)
+ * **timestamp** - A BIGINT field representing the epoch time (in seconds) the metric sample is for.
+ * **value** - A FLOAT8 field containing the value of the sample.
+ * **statistic** - A VARCHAR containing the statistic type of the sample. (e.g. AVERAGE, p90, etc..)
+ */
+public class MetricSamplesTable
+        extends Table
+{
+    private final Schema schema;
+    private final String name;
+
+    public MetricSamplesTable()
+    {
+        schema = new SchemaBuilder().newBuilder()
+                .addStringField(NAMESPACE_FIELD)
+                .addStringField(METRIC_NAME_FIELD)
+                .addField(FieldBuilder.newBuilder(DIMENSIONS_FIELD, Types.MinorType.LIST.getType())
+                        .addField(FieldBuilder.newBuilder(DIMENSIONS_FIELD, Types.MinorType.STRUCT.getType())
+                                .addStringField(DIMENSION_NAME_FIELD)
+                                .addStringField(DIMENSION_VALUE_FIELD)
+                                .build())
+                        .build())
+                .addStringField(DIMENSION_NAME_FIELD)
+                .addStringField(DIMENSION_VALUE_FIELD)
+                .addIntField(PERIOD_FIELD)
+                .addBigIntField(TIMESTAMP_FIELD)
+                .addFloat8Field(VALUE_FIELD)
+                .addStringField(STATISTIC_FIELD)
+                .addMetadata(NAMESPACE_FIELD, "Metric namespace")
+                .addMetadata(METRIC_NAME_FIELD, "Metric name")
+                .addMetadata(DIMENSIONS_FIELD, "Array of Dimensions for the given metric.")
+                .addMetadata(DIMENSION_NAME_FIELD, "Shortcut field that flattens dimensions to allow easier filtering on a single dimension name. This field is left blank unless used in the where clause.")
+                .addMetadata(DIMENSION_VALUE_FIELD, "Shortcut field that flattens dimensions to allow easier filtering on a single dimension value. This field is left blank unless used in the where clause.")
+                .addMetadata(STATISTIC_FIELD, "Statistic type of this value (e.g. Maximum, Minimum, Average, Sample Count)")
+                .addMetadata(TIMESTAMP_FIELD, "The epoch time (in seconds) the value is for.")
+                .addMetadata(PERIOD_FIELD, "The period, in seconds, for the metric (e.g. 60 seconds, 120 seconds)")
+                .addMetadata(VALUE_FIELD, "The value for the sample.")
+                .build();
+
+        name = "metric_samples";
+    }
+
+    @Override
+    public String getName()
+    {
+        return name;
+    }
+
+    @Override
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    @Override
+    public Set<String> getPartitionColumns()
+    {
+        return Collections.emptySet();
+    }
+}
diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/MetricsTable.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/MetricsTable.java
new file mode 100644
index 0000000000..6c76356e38
--- /dev/null
+++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/MetricsTable.java
@@ -0,0 +1,88 @@
+/*-
+ * #%L
+ * athena-cloudwatch-metrics
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics.tables; + +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.util.Collections; +import java.util.Set; + +/** + * Defines the metadata associated with our static metrics table. + *

+ * This table contains the available metrics as uniquely defined by a triple of namespace, set of dimensions, and name.
+ * More specifically, this table contains the following columns.
+ * * **namespace** - A VARCHAR containing the namespace.
+ * * **metric_name** - A VARCHAR containing the metric name.
+ * * **dimensions** - A LIST of STRUCTS comprised of dim_name (VARCHAR) and dim_value (VARCHAR).
+ * * **statistic** - A LIST of VARCHAR statistics (e.g. p90, AVERAGE, etc..) available for the metric.
+ */
+public class MetricsTable
+        extends Table
+{
+    private final Schema schema;
+    private final String name;
+
+    public MetricsTable()
+    {
+        schema = new SchemaBuilder().newBuilder()
+                .addStringField(NAMESPACE_FIELD)
+                .addStringField(METRIC_NAME_FIELD)
+                .addField(FieldBuilder.newBuilder(DIMENSIONS_FIELD, Types.MinorType.LIST.getType())
+                        .addField(FieldBuilder.newBuilder(DIMENSIONS_FIELD, Types.MinorType.STRUCT.getType())
+                                .addStringField(DIMENSION_NAME_FIELD)
+                                .addStringField(DIMENSION_VALUE_FIELD)
+                                .build())
+                        .build())
+                .addStringField(DIMENSION_NAME_FIELD)
+                .addStringField(DIMENSION_VALUE_FIELD)
+                .addListField(STATISTIC_FIELD, Types.MinorType.VARCHAR.getType())
+                .addMetadata(NAMESPACE_FIELD, "Metric namespace")
+                .addMetadata(METRIC_NAME_FIELD, "Metric name")
+                .addMetadata(STATISTIC_FIELD, "List of statistics available for this metric (e.g. Maximum, Minimum, Average, Sample Count)")
+                .addMetadata(DIMENSIONS_FIELD, "Array of Dimensions for the given metric.")
+                .addMetadata(DIMENSION_NAME_FIELD, "Shortcut field that flattens dimensions to allow easier filtering for metrics that contain the dimension name. This field is left blank unless used in the where clause.")
+                .addMetadata(DIMENSION_VALUE_FIELD, "Shortcut field that flattens dimensions to allow easier filtering for metrics that contain the dimension value. This field is left blank unless used in the where clause.")
+                .build();
+
+        name = "metrics";
+    }
+
+    @Override
+    public String getName()
+    {
+        return name;
+    }
+
+    @Override
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    @Override
+    public Set<String> getPartitionColumns()
+    {
+        return Collections.emptySet();
+    }
+}
diff --git a/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/Table.java b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/Table.java
new file mode 100644
index 0000000000..812803d58c
--- /dev/null
+++ b/athena-cloudwatch-metrics/src/main/java/com/amazonaws/athena/connectors/cloudwatch/metrics/tables/Table.java
@@ -0,0 +1,53 @@
+/*-
+ * #%L
+ * athena-cloudwatch-metrics
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.cloudwatch.metrics.tables;
+
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Set;
+
+/**
+ * Defines some commonly required field names used by all tables and consumers of tables in this connector.
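+ *
+ * As a purely hypothetical illustration of how these names surface in SQL, a query against the metric_samples
+ * table might filter on them directly, e.g.: SELECT namespace, metric_name, timestamp, value FROM metric_samples
+ * WHERE statistic = 'p90' AND dim_name = 'InstanceId'.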
+ */
+public abstract class Table
+{
+    //The name of the metric name field.
+    public static final String METRIC_NAME_FIELD = "metric_name";
+    //The name of the namespace field.
+    public static final String NAMESPACE_FIELD = "namespace";
+    //The name of the dimensions field which houses a list of Cloudwatch Metrics Dimensions.
+    public static final String DIMENSIONS_FIELD = "dimensions";
+    //The name of the convenience Dimension name field which gives easy access to 1 dimension name.
+    public static final String DIMENSION_NAME_FIELD = "dim_name";
+    //The name of the convenience Dimension value field which gives easy access to 1 dimension value.
+    public static final String DIMENSION_VALUE_FIELD = "dim_value";
+    //The name of the timestamp field, denoting the time period a particular metric sample was for.
+    public static final String TIMESTAMP_FIELD = "timestamp";
+    //The name of the metric value field which holds the value of a metric sample.
+    public static final String VALUE_FIELD = "value";
+    //The name of the statistic field (e.g. AVERAGE, p90).
+    public static final String STATISTIC_FIELD = "statistic";
+    //The name of the period field (e.g. 60 seconds).
+    public static final String PERIOD_FIELD = "period";
+
+    public abstract String getName();
+    public abstract Schema getSchema();
+    public abstract Set<String> getPartitionColumns();
+}
diff --git a/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/DimensionSerDeTest.java b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/DimensionSerDeTest.java
new file mode 100644
index 0000000000..279bd7196b
--- /dev/null
+++ b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/DimensionSerDeTest.java
@@ -0,0 +1,51 @@
+/*-
+ * #%L
+ * athena-cloudwatch-metrics
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.services.cloudwatch.model.Dimension; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +import static org.junit.Assert.*; + +public class DimensionSerDeTest +{ + private static final Logger logger = LoggerFactory.getLogger(DimensionSerDeTest.class); + private static final String EXPECTED_SERIALIZATION = "{\"dimensions\":[{\"name\":\"dim_name\",\"value\":\"dim_val\"}" + + ",{\"name\":\"dim_name1\",\"value\":\"dim_val1\"},{\"name\":\"dim_name2\",\"value\":\"dim_val2\"}]}"; + + @Test + public void serializeTest() + { + List expected = new ArrayList<>(); + expected.add(new Dimension().withName("dim_name").withValue("dim_val")); + expected.add(new Dimension().withName("dim_name1").withValue("dim_val1")); + expected.add(new Dimension().withName("dim_name2").withValue("dim_val2")); + String actualSerialization = DimensionSerDe.serialize(expected); + logger.info("serializeTest: {}", actualSerialization); + List actual = DimensionSerDe.deserialize(actualSerialization); + assertEquals(EXPECTED_SERIALIZATION, actualSerialization); + assertEquals(expected, actual); + } +} diff --git a/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricUtilsTest.java b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricUtilsTest.java new file mode 100644 index 0000000000..dbd839fac1 --- /dev/null +++ b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricUtilsTest.java @@ -0,0 +1,203 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.services.cloudwatch.model.Dimension; +import com.amazonaws.services.cloudwatch.model.DimensionFilter; +import com.amazonaws.services.cloudwatch.model.GetMetricDataRequest; +import com.amazonaws.services.cloudwatch.model.ListMetricsRequest; +import com.amazonaws.services.cloudwatch.model.Metric; +import com.amazonaws.services.cloudwatch.model.MetricStat; +import org.apache.arrow.vector.types.pojo.Schema; +import com.google.common.collect.ImmutableList; +import org.apache.arrow.vector.types.Types; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import static com.amazonaws.athena.connectors.cloudwatch.metrics.DimensionSerDe.SERIALZIE_DIM_FIELD_NAME; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.TestUtils.makeStringEquals; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_VALUE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.METRIC_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.NAMESPACE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.PERIOD_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.STATISTIC_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.TIMESTAMP_FIELD; +import static org.junit.Assert.*; + +public class MetricUtilsTest +{ + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private String catalog = "default"; + private BlockAllocator allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void applyMetricConstraints() + { + Schema schema = SchemaBuilder.newBuilder() + .addStringField(NAMESPACE_FIELD) + .addStringField(METRIC_NAME_FIELD) + .addStringField(STATISTIC_FIELD) + .addStringField(DIMENSION_NAME_FIELD) + .addStringField(DIMENSION_VALUE_FIELD) + .build(); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put(NAMESPACE_FIELD, makeStringEquals(allocator, "match1")); + constraintsMap.put(METRIC_NAME_FIELD, makeStringEquals(allocator, "match2")); + constraintsMap.put(STATISTIC_FIELD, makeStringEquals(allocator, "match3")); + constraintsMap.put(DIMENSION_NAME_FIELD, 
makeStringEquals(allocator, "match4")); + constraintsMap.put(DIMENSION_VALUE_FIELD, makeStringEquals(allocator, "match5")); + + ConstraintEvaluator constraintEvaluator = new ConstraintEvaluator(allocator, schema, new Constraints(constraintsMap)); + + Metric metric = new Metric() + .withNamespace("match1") + .withMetricName("match2") + .withDimensions(new Dimension().withName("match4").withValue("match5")); + String statistic = "match3"; + assertTrue(MetricUtils.applyMetricConstraints(constraintEvaluator, metric, statistic)); + + assertFalse(MetricUtils.applyMetricConstraints(constraintEvaluator, copyMetric(metric).withNamespace("no_match"), statistic)); + assertFalse(MetricUtils.applyMetricConstraints(constraintEvaluator, copyMetric(metric).withMetricName("no_match"), statistic)); + assertFalse(MetricUtils.applyMetricConstraints(constraintEvaluator, + copyMetric(metric).withDimensions(Collections.singletonList(new Dimension().withName("no_match").withValue("match5"))), statistic)); + assertFalse(MetricUtils.applyMetricConstraints(constraintEvaluator, + copyMetric(metric).withDimensions(Collections.singletonList(new Dimension().withName("match4").withValue("no_match"))), statistic)); + assertFalse(MetricUtils.applyMetricConstraints(constraintEvaluator, copyMetric(metric), "no_match")); + } + + private Metric copyMetric(Metric metric) + { + Metric newMetric = new Metric() + .withNamespace(metric.getNamespace()) + .withMetricName(metric.getMetricName()); + + List dims = new ArrayList<>(); + for (Dimension next : metric.getDimensions()) { + dims.add(new Dimension().withName(next.getName()).withValue(next.getValue())); + } + return newMetric.withDimensions(dims); + } + + @Test + public void pushDownPredicate() + { + Map constraintsMap = new HashMap<>(); + constraintsMap.put(NAMESPACE_FIELD, makeStringEquals(allocator, "match1")); + constraintsMap.put(METRIC_NAME_FIELD, makeStringEquals(allocator, "match2")); + constraintsMap.put(STATISTIC_FIELD, makeStringEquals(allocator, "match3")); + constraintsMap.put(DIMENSION_NAME_FIELD, makeStringEquals(allocator, "match4")); + constraintsMap.put(DIMENSION_VALUE_FIELD, makeStringEquals(allocator, "match5")); + + ListMetricsRequest request = new ListMetricsRequest(); + MetricUtils.pushDownPredicate(new Constraints(constraintsMap), request); + + assertEquals("match1", request.getNamespace()); + assertEquals("match2", request.getMetricName()); + assertEquals(1, request.getDimensions().size()); + assertEquals(new DimensionFilter().withName("match4").withValue("match5"), request.getDimensions().get(0)); + } + + @Test + public void makeGetMetricDataRequest() + { + String schema = "schema"; + String table = "table"; + Integer period = 60; + String statistic = "p90"; + String metricName = "metricName"; + String namespace = "namespace"; + + List dimensions = new ArrayList<>(); + dimensions.add(new Dimension().withName("dim_name1").withValue("dim_value1")); + dimensions.add(new Dimension().withName("dim_name2").withValue("dim_value2")); + + Split split = Split.newBuilder(null, null) + .add(NAMESPACE_FIELD, namespace) + .add(METRIC_NAME_FIELD, metricName) + .add(PERIOD_FIELD, String.valueOf(period)) + .add(STATISTIC_FIELD, statistic) + .add(SERIALZIE_DIM_FIELD_NAME, DimensionSerDe.serialize(dimensions)) + .build(); + + Schema schemaForRead = SchemaBuilder.newBuilder().addStringField(METRIC_NAME_FIELD).build(); + + Map constraintsMap = new HashMap<>(); + + constraintsMap.put(TIMESTAMP_FIELD, SortedRangeSet.copyOf(Types.MinorType.BIGINT.getType(), + 
ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.BIGINT.getType(), 1L)), false)); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + catalog, + "queryId-" + System.currentTimeMillis(), + new TableName(schema, table), + schemaForRead, + split, + new Constraints(constraintsMap), + 100_000_000_000L, //100GB don't expect this to spill + 100_000_000_000L + ); + + GetMetricDataRequest actual = MetricUtils.makeGetMetricDataRequest(request); + assertEquals(1, actual.getMetricDataQueries().size()); + assertNotNull(actual.getMetricDataQueries().get(0).getId()); + MetricStat metricStat = actual.getMetricDataQueries().get(0).getMetricStat(); + assertNotNull(metricStat); + assertEquals(metricName, metricStat.getMetric().getMetricName()); + assertEquals(namespace, metricStat.getMetric().getNamespace()); + assertEquals(statistic, metricStat.getStat()); + assertEquals(period, metricStat.getPeriod()); + assertEquals(2, metricStat.getMetric().getDimensions().size()); + assertEquals(1000L, actual.getStartTime().getTime()); + assertTrue(actual.getStartTime().getTime() <= System.currentTimeMillis() + 1_000); + } +} diff --git a/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsMetadataHandlerTest.java b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsMetadataHandlerTest.java new file mode 100644 index 0000000000..bbe9719a40 --- /dev/null +++ b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsMetadataHandlerTest.java @@ -0,0 +1,337 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.cloudwatch.AmazonCloudWatch; +import com.amazonaws.services.cloudwatch.model.ListMetricsRequest; +import com.amazonaws.services.cloudwatch.model.ListMetricsResult; +import com.amazonaws.services.cloudwatch.model.Metric; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; + +import static com.amazonaws.athena.connectors.cloudwatch.metrics.DimensionSerDe.SERIALZIE_DIM_FIELD_NAME; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.METRIC_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.NAMESPACE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.PERIOD_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.STATISTIC_FIELD; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class MetricsMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(MetricsMetadataHandlerTest.class); + + private final String defaultSchema = "default"; + private final 
FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + + private MetricsMetadataHandler handler; + private BlockAllocator allocator; + + @Mock + private AmazonCloudWatch mockMetrics; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + throws Exception + { + handler = new MetricsMetadataHandler(mockMetrics, new LocalKeyFactory(), mockSecretsManager, mockAthena, "spillBucket", "spillPrefix"); + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doListSchemaNames() + { + logger.info("doListSchemas - enter"); + + ListSchemasRequest req = new ListSchemasRequest(identity, "queryId", "default"); + ListSchemasResponse res = handler.doListSchemaNames(allocator, req); + logger.info("doListSchemas - {}", res.getSchemas()); + + assertTrue(res.getSchemas().size() == 1); + assertEquals(defaultSchema, res.getSchemas().iterator().next()); + + logger.info("doListSchemas - exit"); + } + + @Test + public void doListTables() + { + logger.info("doListTables - enter"); + + ListTablesRequest req = new ListTablesRequest(identity, "queryId", "default", defaultSchema); + ListTablesResponse res = handler.doListTables(allocator, req); + logger.info("doListTables - {}", res.getTables()); + + assertEquals(2, res.getTables().size()); + assertTrue(res.getTables().contains(new TableName(defaultSchema, "metrics"))); + assertTrue(res.getTables().contains(new TableName(defaultSchema, "metric_samples"))); + + logger.info("doListTables - exit"); + } + + @Test + public void doGetMetricsTable() + { + logger.info("doGetMetricsTable - enter"); + + GetTableRequest metricsTableReq = new GetTableRequest(identity, "queryId", "default", new TableName(defaultSchema, "metrics")); + GetTableResponse metricsTableRes = handler.doGetTable(allocator, metricsTableReq); + logger.info("doGetMetricsTable - {} {}", metricsTableRes.getTableName(), metricsTableRes.getSchema()); + + assertEquals(new TableName(defaultSchema, "metrics"), metricsTableRes.getTableName()); + assertNotNull(metricsTableRes.getSchema()); + assertEquals(6, metricsTableRes.getSchema().getFields().size()); + + logger.info("doGetMetricsTable - exit"); + } + + @Test + public void doGetMetricSamplesTable() + { + logger.info("doGetMetricSamplesTable - enter"); + + GetTableRequest metricsTableReq = new GetTableRequest(identity, + "queryId", + "default", + new TableName(defaultSchema, "metric_samples")); + + GetTableResponse metricsTableRes = handler.doGetTable(allocator, metricsTableReq); + logger.info("doGetMetricSamplesTable - {} {}", metricsTableRes.getTableName(), metricsTableRes.getSchema()); + + assertEquals(new TableName(defaultSchema, "metric_samples"), metricsTableRes.getTableName()); + assertNotNull(metricsTableRes.getSchema()); + assertEquals(9, metricsTableRes.getSchema().getFields().size()); + + logger.info("doGetMetricSamplesTable - exit"); + } + + @Test + public void doGetTableLayout() + throws Exception + { + logger.info("doGetTableLayout - enter"); + + Map constraintsMap = new HashMap<>(); + + constraintsMap.put(METRIC_NAME_FIELD, + EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false) + .add("MyMetric").build()); + + GetTableLayoutRequest req = new GetTableLayoutRequest(identity, + "queryId", + "default", + new TableName(defaultSchema, "metrics"), + new Constraints(constraintsMap), + SchemaBuilder.newBuilder().build(), + 
Collections.EMPTY_SET); + + GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req); + + logger.info("doGetTableLayout - {}", res.getPartitions().getSchema()); + logger.info("doGetTableLayout - {}", res.getPartitions()); + + assertTrue(res.getPartitions().getRowCount() == 1); + + logger.info("doGetTableLayout - exit"); + } + + @Test + public void doGetMetricsSplits() + throws Exception + { + logger.info("doGetMetricsSplits: enter"); + + Schema schema = SchemaBuilder.newBuilder().addIntField("partitionId").build(); + + Block partitions = allocator.createBlock(schema); + BlockUtils.setValue(partitions.getFieldVector("partitionId"), 1, 1); + partitions.setRowCount(1); + + String continuationToken = null; + GetSplitsRequest originalReq = new GetSplitsRequest(identity, + "queryId", + "catalog_name", + new TableName(defaultSchema, "metrics"), + partitions, + Collections.singletonList("partitionId"), + new Constraints(new HashMap<>()), + continuationToken); + int numContinuations = 0; + do { + GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken); + logger.info("doGetMetricsSplits: req[{}]", req); + + MetadataResponse rawResponse = handler.doGetSplits(allocator, req); + assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType()); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + continuationToken = response.getContinuationToken(); + + logger.info("doGetMetricsSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size()); + assertEquals(1, response.getSplits().size()); + + if (continuationToken != null) { + numContinuations++; + } + } + while (continuationToken != null); + + assertEquals(0, numContinuations); + + logger.info("doGetMetricsSplits: exit"); + } + + @Test + public void doGetMetricSamplesSplits() + throws Exception + { + logger.info("doGetMetricSamplesSplits: enter"); + + String namespaceFilter = "MyNameSpace"; + String statistic = "p90"; + int numMetrics = 10; + + when(mockMetrics.listMetrics(any(ListMetricsRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + ListMetricsRequest request = invocation.getArgumentAt(0, ListMetricsRequest.class); + + //assert that the namespace filter was indeed pushed down + assertEquals(namespaceFilter, request.getNamespace()); + String nextToken = (request.getNextToken() == null) ? 
"valid" : null; + List metrics = new ArrayList<>(); + + for (int i = 0; i < numMetrics; i++) { + metrics.add(new Metric().withNamespace(namespaceFilter).withMetricName("metric-" + i)); + } + + return new ListMetricsResult().withNextToken(nextToken).withMetrics(metrics); + }); + + Schema schema = SchemaBuilder.newBuilder().addIntField("partitionId").build(); + + Block partitions = allocator.createBlock(schema); + BlockUtils.setValue(partitions.getFieldVector("partitionId"), 1, 1); + partitions.setRowCount(1); + + Map constraintsMap = new HashMap<>(); + + constraintsMap.put(NAMESPACE_FIELD, + EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false) + .add(namespaceFilter).build()); + constraintsMap.put(STATISTIC_FIELD, + EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false) + .add(statistic).build()); + + String continuationToken = null; + GetSplitsRequest originalReq = new GetSplitsRequest(identity, + "queryId", + "catalog_name", + new TableName(defaultSchema, "metric_samples"), + partitions, + Collections.singletonList("partitionId"), + new Constraints(constraintsMap), + continuationToken); + + int numContinuations = 0; + do { + GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken); + logger.info("doGetMetricSamplesSplits: req[{}]", req); + + MetadataResponse rawResponse = handler.doGetSplits(allocator, req); + assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType()); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + continuationToken = response.getContinuationToken(); + + logger.info("doGetMetricSamplesSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size()); + assertEquals(numMetrics, response.getSplits().size()); + for (Split nextSplit : response.getSplits()) { + assertNotNull(nextSplit.getProperty(SERIALZIE_DIM_FIELD_NAME)); + assertNotNull(nextSplit.getProperty(METRIC_NAME_FIELD)); + assertEquals(statistic, nextSplit.getProperty(STATISTIC_FIELD)); + assertEquals("60", nextSplit.getProperty(PERIOD_FIELD)); + } + + if (continuationToken != null) { + numContinuations++; + } + } + while (continuationToken != null); + + assertEquals(1, numContinuations); + + logger.info("doGetMetricSamplesSplits: exit"); + } +} diff --git a/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsRecordHandlerTest.java b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsRecordHandlerTest.java new file mode 100644 index 0000000000..7ae483d36b --- /dev/null +++ b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/MetricsRecordHandlerTest.java @@ -0,0 +1,343 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.MetricSamplesTable; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.MetricsTable; +import com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.cloudwatch.AmazonCloudWatch; +import com.amazonaws.services.cloudwatch.model.Dimension; +import com.amazonaws.services.cloudwatch.model.GetMetricDataRequest; +import com.amazonaws.services.cloudwatch.model.GetMetricDataResult; +import com.amazonaws.services.cloudwatch.model.ListMetricsRequest; +import com.amazonaws.services.cloudwatch.model.ListMetricsResult; +import com.amazonaws.services.cloudwatch.model.Metric; +import com.amazonaws.services.cloudwatch.model.MetricDataQuery; +import com.amazonaws.services.cloudwatch.model.MetricDataResult; +import com.amazonaws.services.cloudwatch.model.MetricStat; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.io.ByteStreams; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.Collections; +import java.util.Date; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; +import java.util.concurrent.atomic.AtomicLong; + +import static com.amazonaws.athena.connectors.cloudwatch.metrics.TestUtils.makeStringEquals; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.DIMENSION_VALUE_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.METRIC_NAME_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.NAMESPACE_FIELD; +import static 
com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.PERIOD_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.metrics.tables.Table.STATISTIC_FIELD; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyObject; +import static org.mockito.Matchers.anyString; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class MetricsRecordHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(MetricsRecordHandlerTest.class); + //Schema for the metrics table. + private static final Table METRIC_TABLE = new MetricsTable(); + //Schema for the metric_samples table. + private static final Table METRIC_DATA_TABLE = new MetricSamplesTable(); + private static final TableName METRICS_TABLE_NAME = new TableName("default", METRIC_TABLE.getName()); + private static final TableName METRIC_SAMPLES_TABLE_NAME = new TableName("default", METRIC_DATA_TABLE.getName()); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private List<ByteHolder> mockS3Storage; + private MetricsRecordHandler handler; + private S3BlockSpillReader spillReader; + private BlockAllocator allocator; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + + @Mock + private AmazonCloudWatch mockMetrics; + + @Mock + private AmazonS3 mockS3; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + throws Exception + { + mockS3Storage = new ArrayList<>(); + allocator = new BlockAllocatorImpl(); + handler = new MetricsRecordHandler(mockS3, mockSecretsManager, mockAthena, mockMetrics); + spillReader = new S3BlockSpillReader(mockS3, allocator); + + when(mockS3.putObject(anyObject(), anyObject(), anyObject(), anyObject())) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2]; + ByteHolder byteHolder = new ByteHolder(); + byteHolder.setBytes(ByteStreams.toByteArray(inputStream)); + mockS3Storage.add(byteHolder); + return mock(PutObjectResult.class); + }); + + when(mockS3.getObject(anyString(), anyString())) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + S3Object mockObject = mock(S3Object.class); + ByteHolder byteHolder = mockS3Storage.get(0); + mockS3Storage.remove(0); + when(mockObject.getObjectContent()).thenReturn( + new S3ObjectInputStream( + new ByteArrayInputStream(byteHolder.getBytes()), null)); + return mockObject; + }); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void readMetricsWithConstraint() + throws Exception + { + logger.info("readMetricsWithConstraint: enter"); + + String namespace = "namespace"; + String dimName = "dimName"; + String dimValue = "dimValue"; + + int numMetrics = 100; + AtomicLong numCalls = new AtomicLong(0); + when(mockMetrics.listMetrics(any(ListMetricsRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + ListMetricsRequest request = invocation.getArgumentAt(0, ListMetricsRequest.class); + numCalls.incrementAndGet(); + //assert that the namespace filter was indeed pushed down + assertEquals(namespace, request.getNamespace()); + String nextToken = (request.getNextToken() == null) ?
"valid" : null; + List metrics = new ArrayList<>(); + + for (int i = 0; i < numMetrics; i++) { + metrics.add(new Metric().withNamespace(namespace).withMetricName("metric-" + i) + .withDimensions(new Dimension().withName(dimName).withValue(dimValue))); + metrics.add(new Metric().withNamespace(namespace + i).withMetricName("metric-" + i)); + } + + return new ListMetricsResult().withNextToken(nextToken).withMetrics(metrics); + }); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put(NAMESPACE_FIELD, makeStringEquals(allocator, namespace)); + constraintsMap.put(DIMENSION_NAME_FIELD, makeStringEquals(allocator, dimName)); + constraintsMap.put(DIMENSION_VALUE_FIELD, makeStringEquals(allocator, dimValue)); + + S3SpillLocation spillLocation = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + + Split split = Split.newBuilder(spillLocation, keyFactory.create()).build(); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + "catalog", + "queryId-" + System.currentTimeMillis(), + METRICS_TABLE_NAME, + METRIC_TABLE.getSchema(), + split, + new Constraints(constraintsMap), + 100_000_000_000L, + 100_000_000_000L//100GB don't expect this to spill + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("readMetricsWithConstraint: rows[{}]", response.getRecordCount()); + + assertEquals(numCalls.get() * numMetrics, response.getRecords().getRowCount()); + logger.info("readMetricsWithConstraint: {}", BlockUtils.rowToString(response.getRecords(), 0)); + + logger.info("readMetricsWithConstraint: exit"); + } + + @Test + public void readMetricSamplesWithConstraint() + throws Exception + { + logger.info("readMetricSamplesWithConstraint: enter"); + + String namespace = "namespace"; + String metricName = "metricName"; + String statistic = "p90"; + String period = "60"; + String dimName = "dimName"; + String dimValue = "dimValue"; + List dimensions = Collections.singletonList(new Dimension().withName(dimName).withValue(dimValue)); + + int numMetrics = 10; + int numSamples = 10; + AtomicLong numCalls = new AtomicLong(0); + when(mockMetrics.getMetricData(any(GetMetricDataRequest.class))).thenAnswer((InvocationOnMock invocation) -> { + numCalls.incrementAndGet(); + return mockMetricData(invocation, numMetrics, numSamples); + }); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put(NAMESPACE_FIELD, makeStringEquals(allocator, namespace)); + constraintsMap.put(STATISTIC_FIELD, makeStringEquals(allocator, statistic)); + constraintsMap.put(DIMENSION_NAME_FIELD, makeStringEquals(allocator, dimName)); + constraintsMap.put(DIMENSION_VALUE_FIELD, makeStringEquals(allocator, dimValue)); + + S3SpillLocation spillLocation = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + + Split split = Split.newBuilder(spillLocation, keyFactory.create()) + .add(DimensionSerDe.SERIALZIE_DIM_FIELD_NAME, DimensionSerDe.serialize(dimensions)) + .add(METRIC_NAME_FIELD, metricName) + .add(NAMESPACE_FIELD, namespace) + .add(STATISTIC_FIELD, statistic) + .add(PERIOD_FIELD, period) + .build(); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, 
+ "catalog", + "queryId-" + System.currentTimeMillis(), + METRIC_SAMPLES_TABLE_NAME, + METRIC_DATA_TABLE.getSchema(), + split, + new Constraints(constraintsMap), + 100_000_000_000L, + 100_000_000_000L//100GB don't expect this to spill + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("readMetricSamplesWithConstraint: rows[{}]", response.getRecordCount()); + + assertEquals(numCalls.get() * numMetrics * numSamples, response.getRecords().getRowCount()); + logger.info("readMetricSamplesWithConstraint: {}", BlockUtils.rowToString(response.getRecords(), 0)); + + logger.info("readMetricSamplesWithConstraint: exit"); + } + + private GetMetricDataResult mockMetricData(InvocationOnMock invocation, int numMetrics, int numSamples) + { + GetMetricDataRequest request = invocation.getArgumentAt(0, GetMetricDataRequest.class); + + /** + * Confirm that all available criteria were pushed down into Cloudwatch Metrics + */ + List queries = request.getMetricDataQueries(); + assertEquals(1, queries.size()); + MetricStat stat = queries.get(0).getMetricStat(); + assertNotNull(stat.getPeriod()); + assertNotNull(stat.getMetric()); + assertNotNull(stat.getStat()); + assertNotNull(stat.getMetric().getMetricName()); + assertNotNull(stat.getMetric().getNamespace()); + assertNotNull(stat.getMetric().getDimensions()); + assertEquals(1, stat.getMetric().getDimensions().size()); + + String nextToken = (request.getNextToken() == null) ? "valid" : null; + List samples = new ArrayList<>(); + + for (int i = 0; i < numMetrics; i++) { + List values = new ArrayList<>(); + List timestamps = new ArrayList<>(); + for (double j = 0; j < numSamples; j++) { + values.add(j); + timestamps.add(new Date(System.currentTimeMillis() + (int) j)); + } + samples.add(new MetricDataResult().withValues(values).withTimestamps(timestamps)); + } + + return new GetMetricDataResult().withNextToken(nextToken).withMetricDataResults(samples); + } + + private class ByteHolder + { + private byte[] bytes; + + public void setBytes(byte[] bytes) + { + this.bytes = bytes; + } + + public byte[] getBytes() + { + return bytes; + } + } +} diff --git a/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/TestUtils.java b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/TestUtils.java new file mode 100644 index 0000000000..58fa82f59e --- /dev/null +++ b/athena-cloudwatch-metrics/src/test/java/com/amazonaws/athena/connectors/cloudwatch/metrics/TestUtils.java @@ -0,0 +1,36 @@ +/*- + * #%L + * athena-cloudwatch-metrics + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch.metrics; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import org.apache.arrow.vector.types.Types; + +public class TestUtils +{ + private TestUtils() {} + + public static ValueSet makeStringEquals(BlockAllocator allocator, String value) + { + return EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false) + .add(value).build(); + } +} diff --git a/athena-cloudwatch/LICENSE.txt b/athena-cloudwatch/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-cloudwatch/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. 
The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
\ No newline at end of file diff --git a/athena-cloudwatch/README.md b/athena-cloudwatch/README.md new file mode 100644 index 0000000000..03edd2e8fc --- /dev/null +++ b/athena-cloudwatch/README.md @@ -0,0 +1,60 @@ +# Amazon Athena Cloudwatch Connector + +This connector enables Amazon Athena to communicate with Cloudwatch, making your log data accessible via SQL. + +## Usage + +### Parameters + +The Athena Cloudwatch Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below. + +1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket) +2. **spill_prefix** - (Optional) Defaults to a sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/hours. +3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. Setting a KMS Key ID allows your Lambda function to use KMS for key generation for a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331) +4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM, either with a randomly generated key or using KMS to generate keys. Setting this to True disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False) + +The connector also supports AIMD Congestion Control for handling throttling events from Cloudwatch via the Athena Query Federation SDK's ThrottlingInvoker construct. You can tweak the default throttling behavior by setting any of the below (optional) environment variables: + +1. **throttle_initial_delay_ms** - (Default: 10ms) This is the initial call delay applied after the first congestion event. +1. **throttle_max_delay_ms** - (Default: 1000ms) This is the max delay between calls. You can derive TPS by dividing it into 1000ms. +1. **throttle_decrease_factor** - (Default: 0.5) This is the factor by which we reduce our call rate. +1. **throttle_increase_ms** - (Default: 10ms) This is the rate at which we decrease the call delay.
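+
+As a concrete illustration, the sketch below shows, in simplified form, how an AIMD loop driven by these four settings adjusts the delay between calls to Cloudwatch. The class and method names are invented for illustration; the real logic lives in the SDK's ThrottlingInvoker.
+
+```java
+//Simplified AIMD pacing sketch; not the SDK's actual implementation.
+public class AimdPacerSketch
+{
+    private long delayMs = 0;
+
+    //Throttling event: multiplicative decrease of the call rate, i.e. the delay grows,
+    //starting at throttle_initial_delay_ms and capped at throttle_max_delay_ms.
+    public void onCongestion()
+    {
+        delayMs = (delayMs == 0)
+                ? 10 /* throttle_initial_delay_ms */
+                : Math.min((long) (delayMs / 0.5 /* throttle_decrease_factor */), 1000 /* throttle_max_delay_ms */);
+    }
+
+    //Successful call: additive increase of the call rate, i.e. the delay shrinks
+    //by throttle_increase_ms until it reaches zero again.
+    public void onSuccess()
+    {
+        delayMs = Math.max(0, delayMs - 10 /* throttle_increase_ms */);
+    }
+
+    public long currentDelayMs()
+    {
+        return delayMs;
+    }
+}
+```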
+ +### Databases & Tables + +The Athena Cloudwatch Connector maps your LogGroups as schemas (aka databases) and each LogStream as a table. The connector also maps a special "all_log_streams" View composed of all the LogStreams in the LogGroup. This View allows you to query all the logs in a LogGroup at once instead of searching through each LogStream individually. + +Every Table mapped by the Athena Cloudwatch Connector has the following schema, which matches the fields provided by Cloudwatch Logs itself. + +1. **log_stream** - A VARCHAR containing the name of the LogStream that the row is from. +2. **time** - An INT64 containing the epoch time at which the log line was generated. +3. **message** - A VARCHAR containing the log message itself. + +### Required Permissions + +Review the "Policies" section of the athena-cloudwatch.yaml file for full details on the IAM Policies required by this connector. A brief summary is below. + +1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3. +2. CloudWatch Logs Read/Write - The connector uses this access to read your log data in order to satisfy your queries but also to write its own diagnostic logs. +3. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated. + +### Deploying The Connector + +To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below or by using the more detailed tutorial in the athena-example module: + +1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already. +2. From the athena-cloudwatch dir, run `mvn clean install`. +3. From the athena-cloudwatch dir, run `../tools/publish.sh S3_BUCKET_NAME athena-cloudwatch` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET_NAME in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with permission to deploy instances of the connector via a 1-Click form. After publishing, navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo) to deploy it. +4. Try running a query like the one below in Athena (the catalog and log group names are placeholders; substitute your own): +```sql +select * from "lambda:<catalog_name>"."/aws/lambda/<function_name>".all_log_streams limit 100 +``` + +## Performance + +The Athena Cloudwatch Connector will attempt to parallelize queries against Cloudwatch by parallelizing scans of the various log_streams needed for your query. Predicate Pushdown is performed within the Lambda function and also within Cloudwatch Logs for certain time period filters.
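+
+For example, a query that bounds the `time` column allows the connector to push the time range down into Cloudwatch Logs rather than filtering every row in Lambda (the catalog and log group names below are placeholders; substitute your own):
+
+```sql
+select log_stream, time, message
+from "lambda:<catalog_name>"."/aws/lambda/<function_name>".all_log_streams
+where time between 1571149000000 and 1571156200000
+```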
+ +## License + +This project is licensed under the Apache-2.0 License. \ No newline at end of file diff --git a/athena-cloudwatch/athena-cloudwatch.yaml b/athena-cloudwatch/athena-cloudwatch.yaml new file mode 100644 index 0000000000..ce4915aae4 --- /dev/null +++ b/athena-cloudwatch/athena-cloudwatch.yaml @@ -0,0 +1,71 @@ +Transform: 'AWS::Serverless-2016-10-31' +Metadata: + 'AWS::ServerlessRepo::Application': + Name: AthenaCloudwatchConnector + Description: 'This connector enables Amazon Athena to communicate with Cloudwatch, making your logs accessible via SQL.' + Author: 'Amazon Athena' + SpdxLicenseId: Apache-2.0 + LicenseUrl: LICENSE.txt + ReadmeUrl: README.md + Labels: + - athena-federation + HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation' + SemanticVersion: 1.0.0 + SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation' +Parameters: + AthenaCatalogName: + Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.' + Type: String + SpillBucket: + Description: 'The bucket where this function can spill data.' + Type: String + SpillPrefix: + Description: 'The bucket prefix where this function can spill large responses.' + Type: String + Default: athena-spill + LambdaTimeout: + Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)' + Default: 900 + Type: Number + LambdaMemory: + Description: 'Lambda memory in MB (min 128 - 3008 max).' + Default: 3008 + Type: Number + DisableSpillEncryption: + Description: "WARNING: If set to 'true' encryption for spilled data is disabled." + Default: 'false' + Type: String +Resources: + ConnectorConfig: + Type: 'AWS::Serverless::Function' + Properties: + Environment: + Variables: + disable_spill_encryption: !Ref DisableSpillEncryption + spill_bucket: !Ref SpillBucket + spill_prefix: !Ref SpillPrefix + FunctionName: !Ref AthenaCatalogName + Handler: "com.amazonaws.athena.connectors.cloudwatch.CloudwatchCompositeHandler" + CodeUri: "./target/athena-cloudwatch-1.0.jar" + Description: "Enables Amazon Athena to communicate with Cloudwatch, making your logs accessible via SQL" + Runtime: java8 + Timeout: !Ref LambdaTimeout + MemorySize: !Ref LambdaMemory + Policies: + - Statement: + - Action: + - logs:Describe* + - logs:Get* + - logs:List* + - logs:StartQuery + - logs:StopQuery + - logs:TestMetricFilter + - logs:FilterLogEvents + - athena:GetQueryExecution + Effect: Allow + Resource: '*' + Version: '2012-10-17' + #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy + #with one that is more restrictive and can only 'put' but not read, delete, or overwrite files. + - S3CrudPolicy: + BucketName: !Ref SpillBucket \ No newline at end of file diff --git a/athena-cloudwatch/pom.xml b/athena-cloudwatch/pom.xml new file mode 100644 index 0000000000..02fa9a54d6 --- /dev/null +++ b/athena-cloudwatch/pom.xml @@ -0,0 +1,57 @@ +<?xml version="1.0" encoding="UTF-8"?> +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <parent> + <artifactId>aws-athena-query-federation</artifactId> + <groupId>com.amazonaws</groupId> + <version>1.0</version> + </parent> + <modelVersion>4.0.0</modelVersion> + + <artifactId>athena-cloudwatch</artifactId> + + <dependencies> + <dependency> + <groupId>com.amazonaws</groupId> + <artifactId>aws-athena-federation-sdk</artifactId> + <version>${aws-athena-federation-sdk.version}</version> + </dependency> + <dependency> + <groupId>com.amazonaws</groupId> + <artifactId>aws-java-sdk-logs</artifactId> + <version>1.11.490</version> + </dependency> + </dependencies> + + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-shade-plugin</artifactId> + <version>3.2.1</version> + <configuration> + <createDependencyReducedPom>false</createDependencyReducedPom> + <filters> + <filter> + <artifact>*:*</artifact> + <excludes> + <exclude>META-INF/*.SF</exclude> + <exclude>META-INF/*.DSA</exclude> + <exclude>META-INF/*.RSA</exclude> + </excludes> + </filter> + </filters> + </configuration> + <executions> + <execution> + <phase>package</phase> + <goals> + <goal>shade</goal> + </goals> + </execution> + </executions> + </plugin> + </plugins> + </build> +</project> \ No newline at end of file diff --git a/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchCompositeHandler.java b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchCompositeHandler.java new file mode 100644 index 0000000000..0db8b25753 --- /dev/null +++ b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose CloudwatchMetadataHandler and CloudwatchRecordHandler.
+ */ +public class CloudwatchCompositeHandler + extends CompositeHandler +{ + public CloudwatchCompositeHandler() + { + super(new CloudwatchMetadataHandler(), new CloudwatchRecordHandler()); + } +} diff --git a/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchExceptionFilter.java b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchExceptionFilter.java new file mode 100644 index 0000000000..c71db552cf --- /dev/null +++ b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchExceptionFilter.java @@ -0,0 +1,45 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.services.logs.model.AWSLogsException; +import com.amazonaws.services.logs.model.LimitExceededException; + +/** + * Used to identify Exceptions that are related to Cloudwatch Logs throttling events. + */ +public class CloudwatchExceptionFilter + implements ThrottlingInvoker.ExceptionFilter +{ + public static final ThrottlingInvoker.ExceptionFilter EXCEPTION_FILTER = new CloudwatchExceptionFilter(); + + private CloudwatchExceptionFilter() {} + + @Override + public boolean isMatch(Exception ex) + { + if (ex instanceof AWSLogsException && ex.getMessage().startsWith("Rate exceeded")) { + return true; + } + + return (ex instanceof LimitExceededException); + } +} diff --git a/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchMetadataHandler.java b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchMetadataHandler.java new file mode 100644 index 0000000000..d88b39d5a2 --- /dev/null +++ b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchMetadataHandler.java @@ -0,0 +1,345 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.logs.AWSLogs; +import com.amazonaws.services.logs.AWSLogsClientBuilder; +import com.amazonaws.services.logs.model.DescribeLogGroupsRequest; +import com.amazonaws.services.logs.model.DescribeLogGroupsResult; +import com.amazonaws.services.logs.model.DescribeLogStreamsRequest; +import com.amazonaws.services.logs.model.DescribeLogStreamsResult; +import com.amazonaws.services.logs.model.LogStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.concurrent.TimeoutException; + +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchExceptionFilter.EXCEPTION_FILTER; + +/** + * Handles metadata requests for the Athena Cloudwatch Connector. + *
<p> + * For more detail, please see the module's README.md, some notable characteristics of this class include: + * <p>
+ * 1. Each LogGroup is treated as a schema (aka database). + * 2. Each LogStream is treated as a table. + * 3. A special 'all_log_streams' view is added which allows you to query all LogStreams in a LogGroup. + * 4. LogStreams are treated as partitions and scanned in parallel. + * 5. Timestamp predicates are pushed into Cloudwatch itself. + */ +public class CloudwatchMetadataHandler + extends MetadataHandler +{ + private static final Logger logger = LoggerFactory.getLogger(CloudwatchMetadataHandler.class); + + //Used to tag log lines generated by this connector for diagnostic purposes when interacting with Athena. + private static final String sourceType = "cloudwatch"; + //some customers have a very large number of log groups and log streams. In those cases we limit + //the max results as a safety mechanism. They can still be queried but aren't returned in show tables or show databases. + private static final long MAX_RESULTS = 100_000; + //The maximum number of splits that will be generated by a single call to doGetSplits(...) before we paginate. + protected static final int MAX_SPLITS_PER_REQUEST = 1000; + //The name of the special table view which allows you to query all log streams in a LogGroup + protected static final String ALL_LOG_STREAMS_TABLE = "all_log_streams"; + //The name of the log stream field in our response and split objects. + protected static final String LOG_STREAM_FIELD = "log_stream"; + //The name of the log group field in our response and split objects. + protected static final String LOG_GROUP_FIELD = "log_group"; + //The name of the log time field in our response and split objects. + protected static final String LOG_TIME_FIELD = "time"; + //The name of the log message field in our response and split objects. + protected static final String LOG_MSG_FIELD = "message"; + //The name of the log stream size field in our split objects. + protected static final String LOG_STREAM_SIZE_FIELD = "log_stream_bytes"; + //The schema of all Cloudwatch tables.
+ protected static final Schema CLOUDWATCH_SCHEMA; + + static { + CLOUDWATCH_SCHEMA = SchemaBuilder.newBuilder() + .addField(LOG_STREAM_FIELD, Types.MinorType.VARCHAR.getType()) + .addField(LOG_TIME_FIELD, new ArrowType.Int(64, true)) + .addField(LOG_MSG_FIELD, Types.MinorType.VARCHAR.getType()) + //requests to read multiple log streams can be parallelized so let's treat it like a partition + .addMetadata("partitionCols", LOG_STREAM_FIELD) + .build(); + } + + private final AWSLogs awsLogs; + private final ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + private final CloudwatchTableResolver tableResolver; + + public CloudwatchMetadataHandler() + { + super(sourceType); + this.awsLogs = AWSLogsClientBuilder.standard().build(); + tableResolver = new CloudwatchTableResolver(invoker, awsLogs, MAX_RESULTS, MAX_RESULTS); + } + + @VisibleForTesting + protected CloudwatchMetadataHandler(AWSLogs awsLogs, + EncryptionKeyFactory keyFactory, + AWSSecretsManager secretsManager, + AmazonAthena athena, + String spillBucket, + String spillPrefix) + { + super(keyFactory, secretsManager, athena, sourceType, spillBucket, spillPrefix); + this.awsLogs = awsLogs; + tableResolver = new CloudwatchTableResolver(invoker, awsLogs, MAX_RESULTS, MAX_RESULTS); + } + + /** + * List LogGroups in your Cloudwatch account treating each as a 'schema' (aka database) + * + * @see MetadataHandler + */ + @Override + public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest listSchemasRequest) + throws TimeoutException + { + DescribeLogGroupsRequest request = new DescribeLogGroupsRequest(); + DescribeLogGroupsResult result; + List<String> schemas = new ArrayList<>(); + do { + if (schemas.size() > MAX_RESULTS) { + throw new RuntimeException("Too many log groups, exceeded max metadata results for schema count."); + } + result = invoker.invoke(() -> awsLogs.describeLogGroups(request)); + result.getLogGroups().forEach(next -> schemas.add(next.getLogGroupName().toLowerCase())); + request.setNextToken(result.getNextToken()); + logger.info("doListSchemaNames: Listing log groups {} {}", result.getNextToken(), schemas.size()); + } + while (result.getNextToken() != null); + + return new ListSchemasResponse(listSchemasRequest.getCatalogName(), schemas); + } + + /** + * List LogStreams within the requested schema (aka LogGroup) in your Cloudwatch account treating each as a 'table'. + * + * @see MetadataHandler + */ + @Override + public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest listTablesRequest) + throws TimeoutException + { + String logGroupName = tableResolver.validateSchema(listTablesRequest.getSchemaName()); + DescribeLogStreamsRequest request = new DescribeLogStreamsRequest(logGroupName); + DescribeLogStreamsResult result; + List<TableName> tables = new ArrayList<>(); + do { + if (tables.size() > MAX_RESULTS) { + throw new RuntimeException("Too many log streams, exceeded max metadata results for table count."); + } + result = invoker.invoke(() -> awsLogs.describeLogStreams(request)); + result.getLogStreams().forEach(next -> tables.add(toTableName(listTablesRequest, next))); + request.setNextToken(result.getNextToken()); + logger.info("doListTables: Listing log streams {} {}", result.getNextToken(), tables.size()); + } + while (result.getNextToken() != null); + + //We add a special table that represents all log streams. This is helpful depending on how + //you have your logs organized.
+ tables.add(new TableName(listTablesRequest.getSchemaName(), ALL_LOG_STREAMS_TABLE)); + + return new ListTablesResponse(listTablesRequest.getCatalogName(), tables); + } + + /** + * Returns the pre-set schema for the requested Cloudwatch table (LogStream) and schema (LogGroup) after + * validating that it exists. + * + * @see MetadataHandler + */ + @Override + public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest) + { + TableName tableName = getTableRequest.getTableName(); + tableResolver.validateTable(tableName); + return new GetTableResponse(getTableRequest.getCatalogName(), + getTableRequest.getTableName(), + CLOUDWATCH_SCHEMA, + Collections.singleton(LOG_STREAM_FIELD)); + } + + /** + * We add one additional field to the partition schema. This field is used for our own purposes and ignored + * by Athena but it will get passed to calls to GetSplits(...) which is where we will set it on our Split + * without the need to call Cloudwatch a second time. + * + * @see MetadataHandler + */ + @Override + public void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request) + { + partitionSchemaBuilder.addField(LOG_STREAM_SIZE_FIELD, new ArrowType.Int(64, true)); + partitionSchemaBuilder.addField(LOG_GROUP_FIELD, Types.MinorType.VARCHAR.getType()); + } + + /** + * Gets the list of LogStreams that need to be scanned to satisfy the requested table. In most cases this will be just + * 1 LogStream and this results in just 1 partition. If, however, the request is for the special ALL_LOG_STREAMS view + * then all LogStreams in the requested LogGroup (schema) are queried and turned into partitions 1:1. + * + * @note This method applies partition pruning based on the log_stream field. + * @see MetadataHandler + */ + @Override + public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception + { + CloudwatchTableName cwTableName = tableResolver.validateTable(request.getTableName()); + + DescribeLogStreamsRequest cwRequest = new DescribeLogStreamsRequest(cwTableName.getLogGroupName()); + if (!ALL_LOG_STREAMS_TABLE.equals(cwTableName.getLogStreamName())) { + cwRequest.setLogStreamNamePrefix(cwTableName.getLogStreamName()); + } + + DescribeLogStreamsResult result; + do { + result = invoker.invoke(() -> awsLogs.describeLogStreams(cwRequest)); + for (LogStream next : result.getLogStreams()) { + //Each log stream that matches any possible partition pruning should be added to the partition list. + blockWriter.writeRows((Block block, int rowNum) -> { + boolean matched = block.setValue(LOG_GROUP_FIELD, rowNum, cwRequest.getLogGroupName()); + matched &= block.setValue(LOG_STREAM_FIELD, rowNum, next.getLogStreamName()); + matched &= block.setValue(LOG_STREAM_SIZE_FIELD, rowNum, next.getStoredBytes()); + return matched ? 1 : 0; + }); + } + cwRequest.setNextToken(result.getNextToken()); + } + while (result.getNextToken() != null && queryStatusChecker.isQueryRunning()); + }
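+
+    /*
+     * Illustrative sketch (not part of the connector): callers consume the paginated
+     * doGetSplits(...) below by resubmitting the request with each continuation token
+     * until none is returned, along the lines of:
+     *
+     *   String token = null;
+     *   do {
+     *       GetSplitsResponse response = handler.doGetSplits(allocator, new GetSplitsRequest(originalRequest, token));
+     *       token = response.getContinuationToken();
+     *   } while (token != null);
+     */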
+
+    /**
+     * Each partition is converted into a single Split which means we will potentially read all LogStreams required for
+     * the query in parallel.
+     *
+     * @see MetadataHandler
+     */ + @Override + public GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request) + { + int partitionContd = decodeContinuationToken(request); + Set<Split> splits = new HashSet<>(); + Block partitions = request.getPartitions(); + for (int curPartition = partitionContd; curPartition < partitions.getRowCount(); curPartition++) { + FieldReader logStreamReader = partitions.getFieldReader(LOG_STREAM_FIELD); + logStreamReader.setPosition(curPartition); + + FieldReader logGroupReader = partitions.getFieldReader(LOG_GROUP_FIELD); + logGroupReader.setPosition(curPartition); + + FieldReader sizeReader = partitions.getFieldReader(LOG_STREAM_SIZE_FIELD); + sizeReader.setPosition(curPartition); + + //Every split must have a unique location if we wish to spill to avoid failures + SpillLocation spillLocation = makeSpillLocation(request); + + Split.Builder splitBuilder = Split.newBuilder(spillLocation, makeEncryptionKey()) + .add(CloudwatchMetadataHandler.LOG_GROUP_FIELD, String.valueOf(logGroupReader.readText())) + .add(CloudwatchMetadataHandler.LOG_STREAM_FIELD, String.valueOf(logStreamReader.readText())) + .add(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD, String.valueOf(sizeReader.readLong())); + + splits.add(splitBuilder.build()); + + if (splits.size() >= MAX_SPLITS_PER_REQUEST) { + //We exceeded the number of splits we want to return in a single request, return and provide + //a continuation token. + return new GetSplitsResponse(request.getCatalogName(), + splits, + encodeContinuationToken(curPartition)); + } + } + + return new GetSplitsResponse(request.getCatalogName(), splits, null); + } + + /** + * Used to handle paginated requests. + * + * @return The partition number to resume with. + */ + private int decodeContinuationToken(GetSplitsRequest request) + { + if (request.hasContinuationToken()) { + return Integer.valueOf(request.getContinuationToken()); + } + + //No continuation token present + return 0; + } + + /** + * Used to create pagination tokens by encoding the number of the next partition to process. + * + * @param partition The number of the next partition we should process on the next call. + * @return The encoded continuation token. + */ + private String encodeContinuationToken(int partition) + { + return String.valueOf(partition); + } + + /** + * Helper that converts a LogStream to a TableName by lowercasing the schema of the request and the LogStream name. + * + * @param request The ListTablesRequest to retrieve the schema name from. + * @param logStream The LogStream to turn into a table. + * @return A TableName with both the schema (LogGroup) and the table (LogStream) lowercased. + */ + private TableName toTableName(ListTablesRequest request, LogStream logStream) + { + return new TableName(request.getSchemaName(), logStream.getLogStreamName().toLowerCase()); + } +} diff --git a/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchRecordHandler.java b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchRecordHandler.java new file mode 100644 index 0000000000..78388c20bb --- /dev/null +++ b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchRecordHandler.java @@ -0,0 +1,172 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.logs.AWSLogs; +import com.amazonaws.services.logs.AWSLogsClientBuilder; +import com.amazonaws.services.logs.model.GetLogEventsRequest; +import com.amazonaws.services.logs.model.GetLogEventsResult; +import com.amazonaws.services.logs.model.OutputLogEvent; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import org.apache.arrow.util.VisibleForTesting; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.concurrent.TimeoutException; +import java.util.concurrent.atomic.AtomicLong; + +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchExceptionFilter.EXCEPTION_FILTER; +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchMetadataHandler.LOG_GROUP_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchMetadataHandler.LOG_MSG_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchMetadataHandler.LOG_STREAM_FIELD; +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchMetadataHandler.LOG_TIME_FIELD; + +/** + * Handles data read record requests for the Athena Cloudwatch Connector. + *
<p> + * For more detail, please see the module's README.md, some notable characteristics of this class include: + * <p>
+ * 1. Reads and maps Cloudwatch Logs data for a specific LogStream (split). + * 2. Attempts to push down time range predicates into Cloudwatch. + */ +public class CloudwatchRecordHandler + extends RecordHandler +{ + private static final Logger logger = LoggerFactory.getLogger(CloudwatchRecordHandler.class); + //Used to tag log lines generated by this connector for diagnostic purposes when interacting with Athena. + private static final String sourceType = "cloudwatch"; + //Used to handle Throttling events and apply AIMD congestion control + ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + private final AtomicLong count = new AtomicLong(0); + private final AWSLogs awsLogs; + + public CloudwatchRecordHandler() + { + this(AmazonS3ClientBuilder.defaultClient(), + AWSSecretsManagerClientBuilder.defaultClient(), + AmazonAthenaClientBuilder.defaultClient(), + AWSLogsClientBuilder.defaultClient()); + } + + @VisibleForTesting + protected CloudwatchRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, AWSLogs awsLogs) + { + super(amazonS3, secretsManager, athena, sourceType); + this.awsLogs = awsLogs; + } + + /** + * Scans Cloudwatch Logs using the LogStream and optional timestamp filters. + * + * @see RecordHandler + */ + @Override + protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) + throws TimeoutException + { + String continuationToken = null; + TableName tableName = recordsRequest.getTableName(); + Split split = recordsRequest.getSplit(); + invoker.setBlockSpiller(spiller); + do { + final String actualContinuationToken = continuationToken; + GetLogEventsResult logEventsResult = invoker.invoke(() -> awsLogs.getLogEvents( + pushDownConstraints(recordsRequest.getConstraints(), + new GetLogEventsRequest() + .withLogGroupName(split.getProperty(LOG_GROUP_FIELD)) + //We use the property instead of the table name because of the special all_streams table + .withLogStreamName(split.getProperty(LOG_STREAM_FIELD)) + .withNextToken(actualContinuationToken) + ))); + + if (continuationToken == null || !continuationToken.equals(logEventsResult.getNextForwardToken())) { + continuationToken = logEventsResult.getNextForwardToken(); + } + else { + continuationToken = null; + } + + for (OutputLogEvent ole : logEventsResult.getEvents()) { + spiller.writeRows((Block block, int rowNum) -> { + boolean matched = true; + matched &= block.offerValue(LOG_STREAM_FIELD, rowNum, split.getProperty(LOG_STREAM_FIELD)); + matched &= block.offerValue(LOG_TIME_FIELD, rowNum, ole.getTimestamp()); + matched &= block.offerValue(LOG_MSG_FIELD, rowNum, ole.getMessage()); + return matched ? 1 : 0; + }); + } + + logger.info("readWithConstraint: LogGroup[{}] LogStream[{}] Continuation[{}] rows[{}]", + tableName.getSchemaName(), tableName.getTableName(), continuationToken, + logEventsResult.getEvents().size()); + } + while (continuationToken != null && queryStatusChecker.isQueryRunning()); + }
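+
+    /*
+     * Illustrative example (not part of the connector): a predicate such as
+     * "WHERE time BETWEEN 1571149000000 AND 1571156200000" reaches this handler as a
+     * SortedRangeSet on the time column, and pushDownConstraints(...) below maps its
+     * span onto GetLogEventsRequest.setStartTime(...)/setEndTime(...).
+     */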
+
+    /**
+     * Attempts to push down predicates into Cloudwatch Logs by decorating the Cloudwatch Logs request.
+     *
+     * @param constraints The constraints for the read as provided by Athena based on the customer's query.
+     * @param request The Cloudwatch Logs request to inject predicates to.
+     * @return The decorated Cloudwatch Logs request.
+     * @note This impl currently only pushes down SortedRangeSet filters (>=, <=, between) on the log time column.
+     */ + private GetLogEventsRequest pushDownConstraints(Constraints constraints, GetLogEventsRequest request) + { + ValueSet timeConstraint = constraints.getSummary().get(LOG_TIME_FIELD); + if (timeConstraint instanceof SortedRangeSet && !timeConstraint.isNullAllowed()) { + //SortedRangeSet is how >, <, between is represented which are easiest and most common when + //searching logs so we attempt to push that down here as an optimization. SQL can represent complex + //overlapping ranges which Cloudwatch cannot support so this is not a replacement for applying + //constraints using the ConstraintEvaluator. + + Range basicPredicate = ((SortedRangeSet) timeConstraint).getSpan(); + + if (!basicPredicate.getLow().isNullValue()) { + Long lowerBound = (Long) basicPredicate.getLow().getValue(); + request.setStartTime(lowerBound); + } + + if (!basicPredicate.getHigh().isNullValue()) { + Long upperBound = (Long) basicPredicate.getHigh().getValue(); + request.setEndTime(upperBound); + } + } + + return request; + } +} diff --git a/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchTableName.java b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchTableName.java new file mode 100644 index 0000000000..7e083ebc71 --- /dev/null +++ b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchTableName.java @@ -0,0 +1,80 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.domain.TableName; + +import java.util.Objects; + +public class CloudwatchTableName +{ + private final String logGroupName; + private final String logStreamName; + + public CloudwatchTableName(String logGroupName, String logStreamName) + { + this.logGroupName = logGroupName; + this.logStreamName = logStreamName; + } + + public String getLogGroupName() + { + return logGroupName; + } + + public String getLogStreamName() + { + return logStreamName; + } + + public TableName toTableName() + { + return new TableName(logGroupName.toLowerCase(), logStreamName.toLowerCase()); + } + + @Override + public String toString() + { + return "CloudwatchTableName{" + + "logGroupName='" + logGroupName + '\'' + + ", logStreamName='" + logStreamName + '\'' + + '}'; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + CloudwatchTableName that = (CloudwatchTableName) o; + return Objects.equals(getLogGroupName(), that.getLogGroupName()) && + Objects.equals(getLogStreamName(), that.getLogStreamName()); + } + + @Override + public int hashCode() + { + return Objects.hash(getLogGroupName(), getLogStreamName()); + } +} diff --git a/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchTableResolver.java b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchTableResolver.java new file mode 100644 index 0000000000..52526f5498 --- /dev/null +++ b/athena-cloudwatch/src/main/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchTableResolver.java @@ -0,0 +1,289 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.services.logs.AWSLogs; +import com.amazonaws.services.logs.model.DescribeLogGroupsRequest; +import com.amazonaws.services.logs.model.DescribeLogGroupsResult; +import com.amazonaws.services.logs.model.DescribeLogStreamsRequest; +import com.amazonaws.services.logs.model.DescribeLogStreamsResult; +import com.amazonaws.services.logs.model.LogGroup; +import com.amazonaws.services.logs.model.LogStream; +import com.google.common.cache.CacheBuilder; +import com.google.common.cache.CacheLoader; +import com.google.common.cache.LoadingCache; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.concurrent.ExecutionException; +import java.util.concurrent.TimeoutException; + +import static com.amazonaws.athena.connectors.cloudwatch.CloudwatchMetadataHandler.ALL_LOG_STREAMS_TABLE; + +/** + * This class helps with resolving the differences in casing between cloudwatch log and Presto. 
Presto expects all
+ * databases, tables, and columns to be lower case. This class allows us to use cloudwatch logGroups and logStreams
+ * which may have capital letters in them without issue. It does so by caching LogGroups and LogStreams and doing
+ * a case insensitive search over them. It will first try to do a targeted get to reduce the penalty for LogGroups
+ * and LogStreams which don't have capitalization. It also has an optimization for LAMBDA which is a common
+ * cause of capitalized LogStreams by doing a targeted replace for LAMBDA's pattern.
+ */
+public class CloudwatchTableResolver
+{
+    private static final Logger logger = LoggerFactory.getLogger(CloudwatchTableResolver.class);
+
+    private AWSLogs awsLogs;
+    //Used to handle Throttling events using an AIMD strategy for congestion control.
+    private ThrottlingInvoker invoker;
+    //The LogStream pattern that is capitalized by LAMBDA
+    private static final String LAMBDA_PATTERN = "$latest";
+    //The LogStream pattern to replace
+    private static final String LAMBDA_ACTUAL_PATTERN = "$LATEST";
+    //The schema cache, mapping presto casing to cloudwatch casing
+    private final LoadingCache<String, String> schemaCache;
+    //The table cache, mapping presto casing to cloudwatch casing
+    private final LoadingCache<TableName, CloudwatchTableName> tableCache;
+
+    /**
+     * Constructs an instance of the table resolver.
+     *
+     * @param invoker The ThrottlingInvoker to use to handle throttling events.
+     * @param awsLogs The AWSLogs client to use for cache misses.
+     * @param maxSchemaCacheSize The max number of schemas to cache.
+     * @param maxTableCacheSize The max tables to cache.
+     */
+    public CloudwatchTableResolver(ThrottlingInvoker invoker, AWSLogs awsLogs, long maxSchemaCacheSize, long maxTableCacheSize)
+    {
+        this.invoker = invoker;
+        this.awsLogs = awsLogs;
+        this.tableCache = CacheBuilder.newBuilder()
+                .maximumSize(maxTableCacheSize)
+                .build(
+                        new CacheLoader<TableName, CloudwatchTableName>()
+                        {
+                            public CloudwatchTableName load(TableName schemaName)
+                                    throws TimeoutException
+                            {
+                                return loadLogStreams(schemaName.getSchemaName(), schemaName.getTableName());
+                            }
+                        });
+
+        this.schemaCache = CacheBuilder.newBuilder()
+                .maximumSize(maxSchemaCacheSize)
+                .build(
+                        new CacheLoader<String, String>()
+                        {
+                            public String load(String schemaName)
+                                    throws TimeoutException
+                            {
+                                return loadLogGroups(schemaName);
+                            }
+                        });
+    }
+
+    /**
+     * Loads the requested LogStream as identified by the TableName.
+     *
+     * @param logGroup The properly cased schema name.
+     * @param logStream The table name to validate.
+     * @return The CloudwatchTableName, or throws IllegalArgumentException if no match is found.
+     * @note This method also primes the cache with other CloudwatchTableNames found along the way while scanning Cloudwatch.
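+     * For example (illustrative): a table requested as "my-stream" will match a Cloudwatch LogStream named
+     * "My-Stream", and every LogStream seen during the scan is cached so later lookups can avoid another scan.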
+     */
+    private CloudwatchTableName loadLogStreams(String logGroup, String logStream)
+            throws TimeoutException
+    {
+        //As an optimization, see if the table name is an exact match (meaning likely no casing issues)
+        CloudwatchTableName result = loadLogStream(logGroup, logStream);
+        if (result != null) {
+            return result;
+        }
+
+        logger.info("loadLogStreams: Did not find a match for the table, falling back to LogGroup scan for {}:{}",
+                logGroup, logStream);
+        DescribeLogStreamsRequest validateTableRequest = new DescribeLogStreamsRequest(logGroup);
+        DescribeLogStreamsResult validateTableResult;
+        do {
+            validateTableResult = invoker.invoke(() -> awsLogs.describeLogStreams(validateTableRequest));
+            for (LogStream nextStream : validateTableResult.getLogStreams()) {
+                String logStreamName = nextStream.getLogStreamName();
+                CloudwatchTableName nextCloudwatch = new CloudwatchTableName(logGroup, logStreamName);
+                tableCache.put(nextCloudwatch.toTableName(), nextCloudwatch);
+                //Compare against the requested table name (case insensitive), not the stream we just read.
+                if (nextCloudwatch.getLogStreamName().equalsIgnoreCase(logStream)) {
+                    //We stop loading once we find the one we care about. This is an optimization that
+                    //attempts to exploit the fact that we likely access more recent logstreams first.
+                    logger.info("loadLogStreams: Matched {} for {}", nextCloudwatch, logStream);
+                    return nextCloudwatch;
+                }
+            }
+            validateTableRequest.setNextToken(validateTableResult.getNextToken());
+        }
+        while (validateTableResult.getNextToken() != null);
+
+        //We could not find a match
+        throw new IllegalArgumentException("No such table " + logGroup + " " + logStream);
+    }
+
+    /**
+     * Optimization that attempts to load a specific LogStream as identified by the TableName.
+     *
+     * @param logGroup The properly cased schema name.
+     * @param logStream The table name to validate.
+     * @return The CloudwatchTableName or null if not found.
+     * @note This method also primes the cache with other CloudwatchTableNames found along the way while scanning Cloudwatch.
+     */
+    private CloudwatchTableName loadLogStream(String logGroup, String logStream)
+            throws TimeoutException
+    {
+        if (ALL_LOG_STREAMS_TABLE.equalsIgnoreCase(logStream)) {
+            return new CloudwatchTableName(logGroup, ALL_LOG_STREAMS_TABLE);
+        }
+
+        String effectiveTableName = logStream;
+        if (effectiveTableName.contains(LAMBDA_PATTERN)) {
+            logger.info("loadLogStream: Appears to be a lambda log_stream, substituting Lambda pattern {} for {}",
+                    LAMBDA_PATTERN, effectiveTableName);
+            effectiveTableName = effectiveTableName.replace(LAMBDA_PATTERN, LAMBDA_ACTUAL_PATTERN);
+        }
+
+        DescribeLogStreamsRequest request = new DescribeLogStreamsRequest(logGroup)
+                .withLogStreamNamePrefix(effectiveTableName);
+        DescribeLogStreamsResult result = invoker.invoke(() -> awsLogs.describeLogStreams(request));
+        for (LogStream nextStream : result.getLogStreams()) {
+            String logStreamName = nextStream.getLogStreamName();
+            CloudwatchTableName nextCloudwatch = new CloudwatchTableName(logGroup, logStreamName);
+            //Compare against the requested (pattern substituted) table name, case insensitively.
+            if (nextCloudwatch.getLogStreamName().equalsIgnoreCase(effectiveTableName)) {
+                logger.info("loadLogStream: Matched {} for {}:{}", nextCloudwatch, logGroup, logStream);
+                return nextCloudwatch;
+            }
+        }
+
+        return null;
+    }
+
+    /**
+     * Loads the requested LogGroup as identified by the schemaName.
+     *
+     * @param schemaName The schemaName to load.
+     * @return The actual LogGroup name in cloudwatch.
+     * @note This method also primes the cache with other LogGroups found along the way while scanning Cloudwatch.
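+     * For example (illustrative): a schema requested by Presto as "/aws/lambda/myfunc" will match an actual
+     * LogGroup named "/aws/lambda/MyFunc", since the scan compares names case insensitively.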
+     */
+    private String loadLogGroups(String schemaName)
+            throws TimeoutException
+    {
+        //As an optimization, see if the schema name is an exact match (meaning likely no casing issues)
+        String result = loadLogGroup(schemaName);
+        if (result != null) {
+            return result;
+        }
+
+        logger.info("loadLogGroups: Did not find a match for the schema, falling back to LogGroup scan for {}", schemaName);
+        DescribeLogGroupsRequest validateSchemaRequest = new DescribeLogGroupsRequest();
+        DescribeLogGroupsResult validateSchemaResult;
+        do {
+            validateSchemaResult = invoker.invoke(() -> awsLogs.describeLogGroups(validateSchemaRequest));
+            for (LogGroup next : validateSchemaResult.getLogGroups()) {
+                String nextLogGroupName = next.getLogGroupName();
+                //Prime the cache with each LogGroup we encounter, keyed by its lower cased (Presto) form.
+                schemaCache.put(nextLogGroupName.toLowerCase(), nextLogGroupName);
+                if (nextLogGroupName.equalsIgnoreCase(schemaName)) {
+                    logger.info("loadLogGroups: Matched {} for {}", nextLogGroupName, schemaName);
+                    return nextLogGroupName;
+                }
+            }
+            validateSchemaRequest.setNextToken(validateSchemaResult.getNextToken());
+        }
+        while (validateSchemaResult.getNextToken() != null);
+
+        //We could not find a match
+        throw new IllegalArgumentException("No such schema " + schemaName);
+    }
+
+    /**
+     * Optimization that attempts to load a specific LogGroup as identified by the schemaName.
+     *
+     * @param schemaName The schemaName to load.
+     * @return The actual LogGroup name in cloudwatch or null if not found.
+     */
+    private String loadLogGroup(String schemaName)
+            throws TimeoutException
+    {
+        DescribeLogGroupsRequest request = new DescribeLogGroupsRequest().withLogGroupNamePrefix(schemaName);
+        DescribeLogGroupsResult result = invoker.invoke(() -> awsLogs.describeLogGroups(request));
+        for (LogGroup next : result.getLogGroups()) {
+            String nextLogGroupName = next.getLogGroupName();
+            if (nextLogGroupName.equalsIgnoreCase(schemaName)) {
+                logger.info("loadLogGroup: Matched {} for {}", nextLogGroupName, schemaName);
+                return nextLogGroupName;
+            }
+        }
+
+        return null;
+    }
+
+    /**
+     * Used to validate and convert the given TableName to a properly cased and qualified CloudwatchTableName.
+     *
+     * @param tableName The TableName to validate and convert.
+     * @return The CloudwatchTableName for the provided TableName or throws if the TableName could not be resolved to a
+     * CloudwatchTableName. This method mostly handles resolving case mismatches and ensuring the input is a valid entity
+     * in Cloudwatch.
+     */
+    public CloudwatchTableName validateTable(TableName tableName)
+    {
+        String actualSchema = validateSchema(tableName.getSchemaName());
+        CloudwatchTableName actual = null;
+        try {
+            actual = tableCache.get(new TableName(actualSchema, tableName.getTableName()));
+            if (actual == null) {
+                throw new IllegalArgumentException("Unknown table[" + tableName + "]");
+            }
+
+            return actual;
+        }
+        catch (ExecutionException ex) {
+            throw new RuntimeException("Exception while attempting to validate table " + tableName, ex);
+        }
+    }
+
+    /**
+     * Used to validate and convert the given schema name to a properly cased Cloudwatch LogGroup name.
+     *
+     * @param schema The schema name to validate and convert.
+     * @return The cloudwatch LogGroup (aka schema name) or throws if the schema name could not be resolved to a
+     * LogGroup. This method mostly handles resolving case mismatches and ensuring the input is a valid entity
+     * in Cloudwatch.
+ */ + public String validateSchema(String schema) + { + String actual = null; + try { + actual = schemaCache.get(schema); + if (actual == null) { + throw new IllegalArgumentException("Unknown schema[" + schema + "]"); + } + + return actual; + } + catch (ExecutionException ex) { + throw new RuntimeException("Exception while attempting to validate schema " + schema, ex); + } + } +} diff --git a/athena-cloudwatch/src/test/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchMetadataHandlerTest.java b/athena-cloudwatch/src/test/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchMetadataHandlerTest.java new file mode 100644 index 0000000000..a9e7ef1671 --- /dev/null +++ b/athena-cloudwatch/src/test/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchMetadataHandlerTest.java @@ -0,0 +1,409 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.logs.AWSLogs; +import com.amazonaws.services.logs.model.DescribeLogGroupsRequest; +import com.amazonaws.services.logs.model.DescribeLogGroupsResult; +import 
com.amazonaws.services.logs.model.DescribeLogStreamsRequest; +import com.amazonaws.services.logs.model.DescribeLogStreamsResult; +import com.amazonaws.services.logs.model.LogGroup; +import com.amazonaws.services.logs.model.LogStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeoutException; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.verifyNoMoreInteractions; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class CloudwatchMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(CloudwatchMetadataHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private CloudwatchMetadataHandler handler; + private BlockAllocator allocator; + + @Mock + private AWSLogs mockAwsLogs; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + throws Exception + { + when(mockAwsLogs.describeLogStreams(any(DescribeLogStreamsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + return new DescribeLogStreamsResult().withLogStreams(new LogStream().withLogStreamName("table-9"), + new LogStream().withLogStreamName("table-10")); + }); + + when(mockAwsLogs.describeLogGroups(any(DescribeLogGroupsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + return new DescribeLogGroupsResult().withLogGroups(new LogGroup().withLogGroupName("schema-1"), + new LogGroup().withLogGroupName("schema-20")); + }); + handler = new CloudwatchMetadataHandler(mockAwsLogs, new LocalKeyFactory(), mockSecretsManager, mockAthena, "spillBucket", "spillPrefix"); + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doListSchemaNames() + throws TimeoutException + { + logger.info("doListSchemas - enter"); + + when(mockAwsLogs.describeLogGroups(any(DescribeLogGroupsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + DescribeLogGroupsRequest request = (DescribeLogGroupsRequest) invocationOnMock.getArguments()[0]; + + DescribeLogGroupsResult result = new DescribeLogGroupsResult(); + + Integer nextToken; + if (request.getNextToken() == null) { + nextToken = 1; + } + else if (Integer.valueOf(request.getNextToken()) < 3) { + nextToken = Integer.valueOf(request.getNextToken()) + 1; + } + else { + nextToken = null; + } + + List logGroups = new ArrayList<>(); + if (request.getNextToken() == null || Integer.valueOf(request.getNextToken()) < 3) { + for (int i = 0; i < 10; i++) { + LogGroup nextLogGroup = new LogGroup(); + nextLogGroup.setLogGroupName("schema-" + String.valueOf(i)); + logGroups.add(nextLogGroup); + } + } + + 
result.withLogGroups(logGroups); + if (nextToken != null) { + result.setNextToken(String.valueOf(nextToken)); + } + + return result; + }); + + ListSchemasRequest req = new ListSchemasRequest(identity, "queryId", "default"); + ListSchemasResponse res = handler.doListSchemaNames(allocator, req); + logger.info("doListSchemas - {}", res.getSchemas()); + + assertTrue(res.getSchemas().size() == 30); + verify(mockAwsLogs, times(4)).describeLogGroups(any(DescribeLogGroupsRequest.class)); + verifyNoMoreInteractions(mockAwsLogs); + + logger.info("doListSchemas - exit"); + } + + @Test + public void doListTables() + throws TimeoutException + { + logger.info("doListTables - enter"); + + when(mockAwsLogs.describeLogStreams(any(DescribeLogStreamsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + DescribeLogStreamsRequest request = (DescribeLogStreamsRequest) invocationOnMock.getArguments()[0]; + + DescribeLogStreamsResult result = new DescribeLogStreamsResult(); + + Integer nextToken; + if (request.getNextToken() == null) { + nextToken = 1; + } + else if (Integer.valueOf(request.getNextToken()) < 3) { + nextToken = Integer.valueOf(request.getNextToken()) + 1; + } + else { + nextToken = null; + } + + List logStreams = new ArrayList<>(); + if (request.getNextToken() == null || Integer.valueOf(request.getNextToken()) < 3) { + for (int i = 0; i < 10; i++) { + LogStream nextLogStream = new LogStream(); + nextLogStream.setLogStreamName("table-" + String.valueOf(i)); + logStreams.add(nextLogStream); + } + } + + result.withLogStreams(logStreams); + if (nextToken != null) { + result.setNextToken(String.valueOf(nextToken)); + } + + return result; + }); + + ListTablesRequest req = new ListTablesRequest(identity, "queryId", "default", "schema-1"); + ListTablesResponse res = handler.doListTables(allocator, req); + logger.info("doListTables - {}", res.getTables()); + + assertTrue(res.getTables().contains(new TableName("schema-1", "all_log_streams"))); + + assertTrue(res.getTables().size() == 31); + + verify(mockAwsLogs, times(4)).describeLogStreams(any(DescribeLogStreamsRequest.class)); + verify(mockAwsLogs, times(1)).describeLogGroups(any(DescribeLogGroupsRequest.class)); + verifyNoMoreInteractions(mockAwsLogs); + + logger.info("doListTables - exit"); + } + + @Test + public void doGetTable() + { + logger.info("doGetTable - enter"); + String expectedSchema = "schema-20"; + + when(mockAwsLogs.describeLogStreams(any(DescribeLogStreamsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + DescribeLogStreamsRequest request = (DescribeLogStreamsRequest) invocationOnMock.getArguments()[0]; + + assertTrue(request.getLogGroupName().equals(expectedSchema)); + DescribeLogStreamsResult result = new DescribeLogStreamsResult(); + + Integer nextToken; + if (request.getNextToken() == null) { + nextToken = 1; + } + else if (Integer.valueOf(request.getNextToken()) < 3) { + nextToken = Integer.valueOf(request.getNextToken()) + 1; + } + else { + nextToken = null; + } + + List logStreams = new ArrayList<>(); + if (request.getNextToken() == null || Integer.valueOf(request.getNextToken()) < 3) { + for (int i = 0; i < 10; i++) { + LogStream nextLogStream = new LogStream(); + nextLogStream.setLogStreamName("table-" + String.valueOf(i)); + logStreams.add(nextLogStream); + } + } + + result.withLogStreams(logStreams); + if (nextToken != null) { + result.setNextToken(String.valueOf(nextToken)); + } + + return result; + }); + + GetTableRequest req = new GetTableRequest(identity, "queryId", "default", new 
TableName(expectedSchema, "table-9")); + GetTableResponse res = handler.doGetTable(allocator, req); + logger.info("doGetTable - {} {}", res.getTableName(), res.getSchema()); + + assertEquals(new TableName(expectedSchema, "table-9"), res.getTableName()); + assertTrue(res.getSchema() != null); + + verify(mockAwsLogs, times(1)).describeLogStreams(any(DescribeLogStreamsRequest.class)); + + logger.info("doGetTable - exit"); + } + + @Test + public void doGetTableLayout() + throws Exception + { + logger.info("doGetTableLayout - enter"); + + when(mockAwsLogs.describeLogStreams(any(DescribeLogStreamsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + DescribeLogStreamsRequest request = (DescribeLogStreamsRequest) invocationOnMock.getArguments()[0]; + + DescribeLogStreamsResult result = new DescribeLogStreamsResult(); + + Integer nextToken; + if (request.getNextToken() == null) { + nextToken = 1; + } + else if (Integer.valueOf(request.getNextToken()) < 3) { + nextToken = Integer.valueOf(request.getNextToken()) + 1; + } + else { + nextToken = null; + } + + List logStreams = new ArrayList<>(); + if (request.getNextToken() == null || Integer.valueOf(request.getNextToken()) < 3) { + int continuation = request.getNextToken() == null ? 0 : Integer.valueOf(request.getNextToken()); + for (int i = 0 + continuation * 100; i < 300; i++) { + LogStream nextLogStream = new LogStream(); + nextLogStream.setLogStreamName("table-" + String.valueOf(i)); + nextLogStream.setStoredBytes(i * 1000L); + logStreams.add(nextLogStream); + } + } + + result.withLogStreams(logStreams); + if (nextToken != null) { + result.setNextToken(String.valueOf(nextToken)); + } + + return result; + }); + + Map constraintsMap = new HashMap<>(); + + constraintsMap.put("log_stream", + EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false) + .add("table-10").build()); + + Schema schema = SchemaBuilder.newBuilder().addStringField("log_stream").build(); + + GetTableLayoutRequest req = new GetTableLayoutRequest(identity, + "queryId", + "default", + new TableName("schema-1", "all_log_streams"), + new Constraints(constraintsMap), + schema, + Collections.singleton("log_stream")); + + GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req); + + logger.info("doGetTableLayout - {}", res.getPartitions().getSchema()); + logger.info("doGetTableLayout - {}", res.getPartitions()); + + assertTrue(res.getPartitions().getSchema().findField("log_stream") != null); + assertTrue(res.getPartitions().getRowCount() == 1); + + verify(mockAwsLogs, times(4)).describeLogStreams(any(DescribeLogStreamsRequest.class)); + + logger.info("doGetTableLayout - exit"); + } + + @Test + public void doGetSplits() + { + logger.info("doGetSplits: enter"); + + Schema schema = SchemaBuilder.newBuilder() + .addField(CloudwatchMetadataHandler.LOG_STREAM_FIELD, new ArrowType.Utf8()) + .addField(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD, new ArrowType.Int(64, true)) + .addField(CloudwatchMetadataHandler.LOG_GROUP_FIELD, new ArrowType.Utf8()) + .build(); + + Block partitions = allocator.createBlock(schema); + + int num_partitions = 2_000; + for (int i = 0; i < num_partitions; i++) { + BlockUtils.setValue(partitions.getFieldVector(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD), i, 2016L + i); + BlockUtils.setValue(partitions.getFieldVector(CloudwatchMetadataHandler.LOG_STREAM_FIELD), i, "log_stream_" + i); + BlockUtils.setValue(partitions.getFieldVector(CloudwatchMetadataHandler.LOG_GROUP_FIELD), i, "log_group_" + i); + } 
+ partitions.setRowCount(num_partitions); + + String continuationToken = null; + GetSplitsRequest originalReq = new GetSplitsRequest(identity, + "queryId", + "catalog_name", + new TableName("schema", "all_log_streams"), + partitions, + Collections.singletonList(CloudwatchMetadataHandler.LOG_STREAM_FIELD), + new Constraints(new HashMap<>()), + continuationToken); + int numContinuations = 0; + do { + GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken); + logger.info("doGetSplits: req[{}]", req); + + MetadataResponse rawResponse = handler.doGetSplits(allocator, req); + assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType()); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + continuationToken = response.getContinuationToken(); + + logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size()); + + for (Split nextSplit : response.getSplits()) { + assertNotNull(nextSplit.getProperty(CloudwatchMetadataHandler.LOG_STREAM_SIZE_FIELD)); + assertNotNull(nextSplit.getProperty(CloudwatchMetadataHandler.LOG_STREAM_FIELD)); + assertNotNull(nextSplit.getProperty(CloudwatchMetadataHandler.LOG_GROUP_FIELD)); + } + + if (continuationToken != null) { + numContinuations++; + } + } + while (continuationToken != null); + + assertTrue(numContinuations > 0); + + logger.info("doGetSplits: exit"); + } +} diff --git a/athena-cloudwatch/src/test/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchRecordHandlerTest.java b/athena-cloudwatch/src/test/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchRecordHandlerTest.java new file mode 100644 index 0000000000..0f5e82043b --- /dev/null +++ b/athena-cloudwatch/src/test/java/com/amazonaws/athena/connectors/cloudwatch/CloudwatchRecordHandlerTest.java @@ -0,0 +1,293 @@ +/*- + * #%L + * athena-cloudwatch + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.cloudwatch; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.logs.AWSLogs; +import com.amazonaws.services.logs.model.GetLogEventsRequest; +import com.amazonaws.services.logs.model.GetLogEventsResult; +import com.amazonaws.services.logs.model.OutputLogEvent; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import com.google.common.io.ByteStreams; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyObject; +import static org.mockito.Matchers.anyString; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class CloudwatchRecordHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(CloudwatchRecordHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private List mockS3Storage; + private CloudwatchRecordHandler handler; + private S3BlockSpillReader spillReader; + private BlockAllocator allocator; + private Schema schemaForRead; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + + @Mock + private AWSLogs mockAwsLogs; + + @Mock 
+ private AmazonS3 mockS3; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + throws Exception + { + schemaForRead = CloudwatchMetadataHandler.CLOUDWATCH_SCHEMA; + + mockS3Storage = new ArrayList<>(); + allocator = new BlockAllocatorImpl(); + handler = new CloudwatchRecordHandler(mockS3, mockSecretsManager, mockAthena, mockAwsLogs); + spillReader = new S3BlockSpillReader(mockS3, allocator); + + when(mockS3.putObject(anyObject(), anyObject(), anyObject(), anyObject())) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2]; + ByteHolder byteHolder = new ByteHolder(); + byteHolder.setBytes(ByteStreams.toByteArray(inputStream)); + mockS3Storage.add(byteHolder); + return mock(PutObjectResult.class); + }); + + when(mockS3.getObject(anyString(), anyString())) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + S3Object mockObject = mock(S3Object.class); + ByteHolder byteHolder = mockS3Storage.get(0); + mockS3Storage.remove(0); + when(mockObject.getObjectContent()).thenReturn( + new S3ObjectInputStream( + new ByteArrayInputStream(byteHolder.getBytes()), null)); + return mockObject; + }); + + when(mockAwsLogs.getLogEvents(any(GetLogEventsRequest.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + GetLogEventsRequest request = (GetLogEventsRequest) invocationOnMock.getArguments()[0]; + + //Check that predicate pushdown was propagated to cloudwatch + assertNotNull(request.getStartTime()); + assertNotNull(request.getEndTime()); + + GetLogEventsResult result = new GetLogEventsResult(); + + Integer nextToken; + if (request.getNextToken() == null) { + nextToken = 1; + } + else if (Integer.valueOf(request.getNextToken()) < 3) { + nextToken = Integer.valueOf(request.getNextToken()) + 1; + } + else { + nextToken = null; + } + + List logEvents = new ArrayList<>(); + if (request.getNextToken() == null || Integer.valueOf(request.getNextToken()) < 3) { + long continuation = request.getNextToken() == null ? 
0 : Integer.valueOf(request.getNextToken()); + for (int i = 0; i < 100_000; i++) { + OutputLogEvent outputLogEvent = new OutputLogEvent(); + outputLogEvent.setMessage("message-" + (continuation * i)); + outputLogEvent.setTimestamp(i * 100L); + logEvents.add(outputLogEvent); + } + } + + result.withEvents(logEvents); + if (nextToken != null) { + result.setNextForwardToken(String.valueOf(nextToken)); + } + + return result; + }); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doReadRecordsNoSpill() + throws Exception + { + logger.info("doReadRecordsNoSpill: enter"); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("time", SortedRangeSet.copyOf(Types.MinorType.BIGINT.getType(), + ImmutableList.of(Range.equal(allocator, Types.MinorType.BIGINT.getType(), 100L)), false)); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + "catalog", + "queryId-" + System.currentTimeMillis(), + new TableName("schema", "table"), + schemaForRead, + Split.newBuilder(S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(), + keyFactory.create()).add(CloudwatchMetadataHandler.LOG_STREAM_FIELD, "table").build(), + new Constraints(constraintsMap), + 100_000_000_000L, + 100_000_000_000L//100GB don't expect this to spill + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("doReadRecordsNoSpill: rows[{}]", response.getRecordCount()); + + assertTrue(response.getRecords().getRowCount() == 3); + logger.info("doReadRecordsNoSpill: {}", BlockUtils.rowToString(response.getRecords(), 0)); + + logger.info("doReadRecordsNoSpill: exit"); + } + + @Test + public void doReadRecordsSpill() + throws Exception + { + logger.info("doReadRecordsSpill: enter"); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("time", SortedRangeSet.of( + Range.range(allocator, Types.MinorType.BIGINT.getType(), 100L, true, 100_000_000L, true))); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + "catalog", + "queryId-" + System.currentTimeMillis(), + new TableName("schema", "table"), + schemaForRead, + Split.newBuilder(S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(), + keyFactory.create()).add(CloudwatchMetadataHandler.LOG_STREAM_FIELD, "table").build(), + new Constraints(constraintsMap), + 1_500_000L, //~1.5MB so we should see some spill + 0 + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof RemoteReadRecordsResponse); + + try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) { + logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size()); + + assertTrue(response.getNumberBlocks() > 1); + + int blockNum = 0; + for (SpillLocation next : response.getRemoteBlocks()) { + S3SpillLocation spillLocation = (S3SpillLocation) next; + try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) { + + logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount()); + // assertTrue(++blockNum 
< response.getRemoteBlocks().size() && block.getRowCount() > 10_000); + + logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0)); + assertNotNull(BlockUtils.rowToString(block, 0)); + } + } + } + + logger.info("doReadRecordsSpill: exit"); + } + + private class ByteHolder + { + private byte[] bytes; + + public void setBytes(byte[] bytes) + { + this.bytes = bytes; + } + + public byte[] getBytes() + { + return bytes; + } + } +} diff --git a/athena-docdb/LICENSE.txt b/athena-docdb/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-docdb/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. 
For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. 
The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
\ No newline at end of file
diff --git a/athena-docdb/README.md b/athena-docdb/README.md
new file mode 100644
index 0000000000..04268f5a62
--- /dev/null
+++ b/athena-docdb/README.md
@@ -0,0 +1,95 @@
+# Amazon Athena DocumentDB Connector
+
+This connector enables Amazon Athena to communicate with your DocumentDB instance(s), making your DocumentDB data accessible via SQL. It also works with any MongoDB compatible endpoint.
+
+Unlike traditional relational data stores, DocumentDB collections do not have a set schema. Each entry can have different fields and data types. While we are investigating the best way to support schema-on-read use cases for this connector, it presently supports two mechanisms for generating traditional table schema information. The default mechanism is for the connector to scan a small number of documents in your collection in order to form a union of all fields and coerce fields with non-overlapping data types. This basic schema inference works well for collections that have mostly uniform entries. For more diverse collections, the connector supports retrieving meta-data from the Glue Data Catalog. If the connector sees a database and table which match your DocumentDB database and collection names it will use the corresponding Glue table for schema. We recommend creating your Glue table such that it is a superset of all fields you may want to access from your DocumentDB Collection.
+
+### Parameters
+
+The Amazon Athena DocumentDB Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below.
+
+1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket)
+2. **spill_prefix** - (Optional) Defaults to a sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/hours.
+3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. Setting a KMS Key ID allows your Lambda function to use KMS for key generation for a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331)
+4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM either with a randomly generated key or using KMS to generate keys. Setting this to true will disable spill encryption. You may wish to disable this for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False)
+5. **disable_glue** - (Optional) If present, with any value, the connector will no longer attempt to retrieve supplemental metadata from Glue.
+6. **glue_catalog** - (Optional) Can be used to target a cross-account Glue catalog. By default the connector will attempt to get metadata from its own Glue account.
+7. **default_docdb** - If present, this DocDB connection string is used when there is no catalog specific environment variable (as explained below). (e.g. mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0)
+
+You can also provide one or more properties which define the DocumentDB connection details for the DocumentDB instance(s) you'd like this connector to use. You can do this by setting a Lambda environment variable that corresponds to the catalog name you'd like to use in Athena. For example, if I'd like to query two different DocumentDB instances from Athena in the below queries:
+
+```sql
+ select * from "docdb_instance_1".database.table
+ select * from "docdb_instance_2".database.table
+ ```
+
+To support these two SQL statements we'd need to add two environment variables to our Lambda function:
+
+1. **docdb_instance_1** - The value should be the DocumentDB connection details in the format of: mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0
+2. **docdb_instance_2** - The value should be the DocumentDB connection details in the format of: mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0
+
+You can also optionally use SecretsManager for part or all of the value for the preceding connection details. For example, if I set a Lambda environment variable for **docdb_instance_1** to be "mongodb://${docdb_instance_1_creds}@myhostname.com:123/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0" the Athena Federation
SDK will automatically attempt to retrieve a secret from AWS SecretsManager named "docdb_instance_1_creds" and inject that value in place of "${docdb_instance_1_creds}". Basically anything between ${...} is attempted as a secret in SecretsManager. If no such secret exists, the text isn't replaced.
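+
+For example (an illustrative walk-through using the made-up values above): if the secret named "docdb_instance_1_creds" holds "myuser:mypassword", the effective connection string the connector uses becomes "mongodb://myuser:mypassword@myhostname.com:123/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0".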
+
+
+### Setting Up Databases & Tables
+
+To enable a Glue Table for use with DocumentDB, you simply need to have a Glue database and table that matches any DocumentDB Database and Collection that you'd like to supply supplemental metadata for (instead of relying on the DocumentDB Connector's ability to infer schema). The connector's built-in schema inference only supports a subset of data types and scans a limited number of documents. You can enable a Glue table to be used for supplemental metadata by setting the below table property from the Glue Console when editing the Table and database in question. The only other thing you need to do is ensure you use the appropriate data types listed in a later section.
+
+1. **docdb-metadata-flag** - Flag indicating that the table can be used for supplemental meta-data by the Athena DocDB Connector. The value is unimportant as long as this key is present in the properties of the table.
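+
+For example (illustrative): in the Glue Console you would edit the table, add a table property whose key is docdb-metadata-flag with any value (such as "true"), and save; the connector will then prefer that Glue table's schema over its own inference for the matching DocumentDB collection.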
+
+### Data Types
+
+The schema inference feature of this connector will attempt to infer values as one of the following:
+
+|Apache Arrow DataType|Java/DocDB Type|
+|-------------|-----------------|
+|VARCHAR|String|
+|INT|Integer|
+|BIGINT|Long|
+|BIT|Boolean|
+|FLOAT4|Float|
+|FLOAT8|Double|
+|TIMESTAMPSEC|Date|
+|VARCHAR|ObjectId|
+|LIST|List|
+|STRUCT|Document|
+
+Alternatively, if you are using Glue for supplemental metadata you can configure the following types:
+
+|Glue DataType|Apache Arrow Type|
+|-------------|-----------------|
+|int|INT|
+|bigint|BIGINT|
+|double|FLOAT8|
+|float|FLOAT4|
+|boolean|BIT|
+|binary|VARBINARY|
+|string|VARCHAR|
+|List|LIST|
+|Struct|STRUCT|
+
+### Required Permissions
+
+Review the "Policies" section of the athena-docdb.yaml file for full details on the IAM Policies required by this connector. A brief summary is below.
+
+1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3.
+2. SecretsManager Read Access - If you choose to store DocumentDB endpoint details in SecretsManager you will need to grant the connector access to those secrets.
+3. Glue Data Catalog - Since DocumentDB does not have a meta-data store, the connector requires Read-Only access to Glue's DataCatalog for supplemental table schema information.
+4. VPC Access - In order to connect to your VPC for the purposes of communicating with your DocumentDB instance(s), the connector needs the ability to attach/detach an interface to the VPC.
+5. CloudWatch Logs - This is a somewhat implicit permission when deploying a Lambda function but it needs access to cloudwatch logs for storing logs.
+6. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated.
+
+### Deploying The Connector
+
+To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the below steps or by using the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-docdb dir, run `mvn clean install`.
+3. From the athena-docdb dir, run `../tools/publish.sh S3_BUCKET_NAME athena-docdb` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with permission to deploy instances of the connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo)
+
+
+## Performance
+
+The Athena DocumentDB Connector does not currently support parallel scans but will attempt to push down predicates as part of its DocumentDB queries.
+
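+As an illustrative example of that pushdown (the exact query shape the connector produces may differ): a SQL predicate such as `WHERE year = 2019` could be translated into a MongoDB query document like `{"year": 2019}`, so that filtering happens inside DocumentDB rather than in the connector.
+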
+ +### Data Types + +The schema inference feature of this connector will attempt to infer values as one of the following: + +|Apache Arrow DataType|Java/DocDB Type| +|-------------|-----------------| +|VARCHAR|String| +|INT|Integer| +|BIGINT|Long| +|BIT|Boolean| +|FLOAT4|Float| +|FLOAT8|Double| +|DATEMILLI|Date| +|VARCHAR|ObjectId| +|LIST|List| +|STRUCT|Document| + +Alternatively, if you are using Glue for supplemental metadata you can configure the following types: + +|Glue DataType|Apache Arrow Type| +|-------------|-----------------| +|int|INT| +|bigint|BIGINT| +|double|FLOAT8| +|float|FLOAT4| +|boolean|BIT| +|binary|VARBINARY| +|string|VARCHAR| +|List|LIST| +|Struct|STRUCT| + +### Required Permissions + +Review the "Policies" section of the athena-docdb.yaml file for full details on the IAM Policies required by this connector. A brief summary is below. + +1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3. +2. SecretsManager Read Access - If you choose to store DocumentDB connection details in SecretsManager you will need to grant the connector access to those secrets. +3. Glue Data Catalog - Since DocumentDB does not have a meta-data store, the connector requires Read-Only access to Glue's DataCatalog for supplemental table schema information. +4. VPC Access - In order to connect to your VPC for the purposes of communicating with your DocumentDB instance(s), the connector needs the ability to attach/detach an interface to the VPC. +5. CloudWatch Logs - This is a somewhat implicit permission when deploying a Lambda function but it needs access to CloudWatch Logs for storing logs. +6. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated. + +### Deploying The Connector + +To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the below steps or by using the more detailed tutorial in the athena-example module: + +1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already. +2. From the athena-docdb dir, run `mvn clean install`. +3. From the athena-docdb dir, run `../tools/publish.sh S3_BUCKET_NAME athena-docdb` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for the Serverless Application Repository to retrieve it. This allows users with permission to deploy instances of the connector via the 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo) + + +## Performance + +The Athena DocumentDB Connector does not currently support parallel scans but will attempt to push down predicates as part of its DocumentDB queries. + diff --git a/athena-docdb/athena-docdb.yaml b/athena-docdb/athena-docdb.yaml new file mode 100644 index 0000000000..49b3d2e1c3 --- /dev/null +++ b/athena-docdb/athena-docdb.yaml @@ -0,0 +1,98 @@ +Transform: 'AWS::Serverless-2016-10-31' +Metadata: + 'AWS::ServerlessRepo::Application': + Name: AthenaDocumentDBConnector + Description: This connector enables Amazon Athena to communicate with your DocumentDB instance(s), making your DocumentDB data accessible via SQL. + Author: 'Amazon Athena' + SpdxLicenseId: Apache-2.0 + LicenseUrl: LICENSE.txt + ReadmeUrl: README.md + Labels: + - athena-federation + HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation' + SemanticVersion: 1.0.2 + SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation' +Parameters: + AthenaCatalogName: + Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.' + Type: String + SpillBucket: + Description: 'The bucket where this function can spill data.' + Type: String + Default: athena-federation-spill + SpillPrefix: + Description: 'The bucket prefix where this function can spill large responses.' + Type: String + Default: athena-spill + LambdaTimeout: + Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)' + Default: 900 + Type: Number + LambdaMemory: + Description: 'Lambda memory in MB (min 128 - 3008 max).' + Default: 3008 + Type: Number + DisableSpillEncryption: + Description: 'If set to ''false'' data spilled to S3 is encrypted with AES GCM' + Default: 'false' + Type: String + SecurityGroupIds: + Description: 'One or more SecurityGroup IDs corresponding to the SecurityGroup that should be applied to the Lambda function. (e.g. sg1,sg2,sg3)' + Type: 'List<AWS::EC2::SecurityGroup::Id>' + SubnetIds: + Description: 'One or more Subnet IDs corresponding to the Subnet that the Lambda function can use to access your data source. (e.g. subnet1,subnet2)' + Type: 'List<AWS::EC2::Subnet::Id>' + SecretNameOrPrefix: + Description: 'The name or prefix of a set of names within Secrets Manager that this function should have access to. (e.g. docdb-*).' + Type: String + DocDBConnectionString: + Description: 'The DocDB connection details to use by default if no catalog specific connection is defined, optionally using SecretsManager (e.g. ${secret_name}).' + Type: String + Default: "e.g. mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0" +Resources: + ConnectorConfig: + Type: 'AWS::Serverless::Function' + Properties: + Environment: + Variables: + disable_spill_encryption: !Ref DisableSpillEncryption + spill_bucket: !Ref SpillBucket + spill_prefix: !Ref SpillPrefix + default_docdb: !Ref DocDBConnectionString + FunctionName: !Ref AthenaCatalogName + Handler: "com.amazonaws.athena.connectors.docdb.DocDBCompositeHandler" + CodeUri: "./target/athena-docdb-1.0.jar" + Description: "Enables Amazon Athena to communicate with DocumentDB, making your DocumentDB data accessible via SQL." + Runtime: java8 + Timeout: !Ref LambdaTimeout + MemorySize: !Ref LambdaMemory + Policies: + - Statement: + - Action: + - secretsmanager:GetSecretValue + Effect: Allow + Resource: !Sub 'arn:aws:secretsmanager:*:*:secret:${SecretNameOrPrefix}' + Version: '2012-10-17' + - Statement: + - Action: + - glue:GetTableVersions + - glue:GetPartitions + - glue:GetTables + - glue:GetTableVersion + - glue:GetDatabases + - glue:GetTable + - glue:GetPartition + - glue:GetDatabase + - athena:GetQueryExecution + Effect: Allow + Resource: '*' + Version: '2012-10-17' + #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy + #with one that is more restrictive and can only 'put' but not read,delete, or overwrite files. + - S3CrudPolicy: + BucketName: !Ref SpillBucket + #VPCAccessPolicy allows our connector to run in a VPC so that it can access your data source. + - VPCAccessPolicy: {} + VpcConfig: + SecurityGroupIds: !Ref SecurityGroupIds + SubnetIds: !Ref SubnetIds \ No newline at end of file diff --git a/athena-docdb/pom.xml b/athena-docdb/pom.xml new file mode 100644 index 0000000000..4efd98b3ae --- /dev/null +++ b/athena-docdb/pom.xml @@ -0,0 +1,57 @@ +<?xml version="1.0" encoding="UTF-8"?> +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <parent> + <artifactId>aws-athena-query-federation</artifactId> + <groupId>com.amazonaws</groupId> + <version>1.0</version> + </parent> + <modelVersion>4.0.0</modelVersion> + + <artifactId>athena-docdb</artifactId> + + <dependencies> + <dependency> + <groupId>com.amazonaws</groupId> + <artifactId>aws-athena-federation-sdk</artifactId> + <version>${aws-athena-federation-sdk.version}</version> + </dependency> + <dependency> + <groupId>org.mongodb</groupId> + <artifactId>mongo-java-driver</artifactId> + <version>3.10.2</version> + </dependency> + </dependencies> + + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-shade-plugin</artifactId> + <version>3.2.1</version> + <configuration> + <createDependencyReducedPom>false</createDependencyReducedPom> + <filters> + <filter> + <artifact>*:*</artifact> + <excludes> + <exclude>META-INF/*.SF</exclude> + <exclude>META-INF/*.DSA</exclude> + <exclude>META-INF/*.RSA</exclude> + </excludes> + </filter> + </filters> + </configuration> + <executions> + <execution> + <phase>package</phase> + <goals> + <goal>shade</goal> + </goals> + </execution> + </executions> + </plugin> + </plugins> + </build> +</project> \ No newline at end of file diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBCompositeHandler.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBCompositeHandler.java new file mode 100644 index 0000000000..df5342f7f5 --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose DocDBMetadataHandler and DocDBRecordHandler. + */ +public class DocDBCompositeHandler + extends CompositeHandler +{ + public DocDBCompositeHandler() + { + super(new DocDBMetadataHandler(), new DocDBRecordHandler()); + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBConnectionFactory.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBConnectionFactory.java new file mode 100644 index 0000000000..715f2a4104 --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBConnectionFactory.java @@ -0,0 +1,93 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.mongodb.client.MongoClient; +import com.mongodb.client.MongoClients; +import org.apache.arrow.util.VisibleForTesting; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashMap; +import java.util.Map; + +/** + * Creates and caches DocumentDB connection instances, using the connection string as the cache key. + * + * @Note Connection String format is expected to be like: + * mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=<certs_file>&replicaSet=<replica_set> + */ +public class DocDBConnectionFactory +{ + private static final Logger logger = LoggerFactory.getLogger(DocDBConnectionFactory.class); + private final Map<String, MongoClient> clientCache = new HashMap<>(); + + /** + * Used to get an existing, pooled, connection or to create a new connection + * for the given connection string. + * + * @param connStr MongoClient connection details, format is expected to be: + * mongodb://<username>:<password>@<hostname>:<port>/?ssl=true&ssl_ca_certs=<certs_file>&replicaSet=<replica_set> + * @return A MongoClient connection if the connection succeeded, else the function will throw. + */ + public synchronized MongoClient getOrCreateConn(String connStr) + { + logger.info("getOrCreateConn: enter"); + MongoClient result = clientCache.get(connStr); + + if (result == null || !connectionTest(result)) { + result = MongoClients.create(connStr); + clientCache.put(connStr, result); + } + + logger.info("getOrCreateConn: exit"); + return result; + } + + /** + * Runs a 'quick' test on the connection and then returns it if it passes.
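+ * The test simply lists database names on the client; if that call throws, the stale client is discarded and getOrCreateConn(String) creates a fresh connection in its place.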
+ */ + private boolean connectionTest(MongoClient conn) + { + try { + logger.info("connectionTest: Testing connection started."); + conn.listDatabaseNames(); + logger.info("connectionTest: Testing connection completed - success."); + return true; + } + catch (RuntimeException ex) { + logger.warn("connectionTest: Exception while testing existing connection.", ex); + } + logger.info("connectionTest: Testing connection completed - fail."); + return false; + } + + /** + * Injects a connection into the client cache. + * + * @param conStr The connection string (aka the cache key) + * @param conn The connection to inject into the client cache, most often a Mock used in testing. + */ + @VisibleForTesting + protected synchronized void addConnection(String conStr, MongoClient conn) + { + clientCache.put(conStr, conn); + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBFieldResolver.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBFieldResolver.java new file mode 100644 index 0000000000..f805333ace --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBFieldResolver.java @@ -0,0 +1,54 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.data.FieldResolver; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.bson.Document; + +import java.util.List; + +/** + * Used to resolve DocDB complex structures to Apache Arrow Types.
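+ * For example, a LIST field's children are coerced element by element from the underlying iterator, while a field read from a STRUCT is looked up by name in the enclosing Document and coerced to the field's declared Arrow type.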
+ * + * @see com.amazonaws.athena.connector.lambda.data.FieldResolver + */ +public class DocDBFieldResolver + implements FieldResolver +{ + protected static final FieldResolver DEFAULT_FIELD_RESOLVER = new DocDBFieldResolver(); + + private DocDBFieldResolver() {} + + @Override + public Object getFieldValue(Field field, Object value) + { + Types.MinorType minorType = Types.getMinorTypeForArrowType(field.getType()); + if (minorType == Types.MinorType.LIST) { + return TypeUtils.coerce(field.getChildren().get(0), ((List) value).iterator()); + } + else if (value instanceof Document) { + Object rawVal = ((Document) value).get(field.getName()); + return TypeUtils.coerce(field, rawVal); + } + throw new RuntimeException("Expected LIST or Document type but found " + minorType); + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBMetadataHandler.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBMetadataHandler.java new file mode 100644 index 0000000000..5432061497 --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBMetadataHandler.java @@ -0,0 +1,250 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest; +import com.amazonaws.athena.connector.lambda.metadata.glue.GlueFieldLexer; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.AWSGlueClientBuilder; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.mongodb.client.MongoClient; +import com.mongodb.client.MongoCursor; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.List; + +/** + * Handles metadata requests for the Athena DocumentDB Connector. + *
<p>
+ * For more detail, please see the module's README.md, some notable characteristics of this class include: + *
<p>
+ * 1. Uses a Glue table property (docdb-metadata-flag) to indicate that the table (whose name matches the DocDB collection + * name) can indeed be used to supplement metadata from DocDB itself. + * 2. Attempts to resolve sensitive fields such as DocDB connection strings via SecretsManager so that you can substitute + * variables with values from SecretsManager by doing something like: + * mongodb://${docdb_instance_1_creds}@myhostname.com:123/?ssl=true&ssl_ca_certs=rds-combined-ca-bundle.pem&replicaSet=rs0 + */ +public class DocDBMetadataHandler + extends GlueMetadataHandler +{ + private static final Logger logger = LoggerFactory.getLogger(DocDBMetadataHandler.class); + + //Used to denote the 'type' of this connector for diagnostic purposes. + private static final String SOURCE_TYPE = "documentdb"; + //The Env variable name used to indicate that we want to disable the use of Glue DataCatalog for supplemental + //metadata and instead rely solely on the connector's schema inference capabilities. + private static final String GLUE_ENV_VAR = "disable_glue"; + //Field name used to store the connection string as a property on Split objects. + protected static final String DOCDB_CONN_STR = "connStr"; + //The Env variable name used to store the default DocDB connection string if no catalog specific + //env variable is set. + private static final String DEFAULT_DOCDB = "default_docdb"; + //The Glue table property that indicates that a table matching the name of a DocDB collection + //is indeed enabled for use by this connector. + private static final String DOCDB_METADATA_FLAG = "docdb-metadata-flag"; + //Used to filter out Glue tables which lack a docdb metadata flag. + private static final TableFilter TABLE_FILTER = (Table table) -> table.getParameters().containsKey(DOCDB_METADATA_FLAG); + //The number of documents to scan when attempting to infer schema from a DocDB collection. + private static final int SCHEMA_INFERENCE_NUM_DOCS = 10; + + private final AWSGlue glue; + private final DocDBConnectionFactory connectionFactory; + + public DocDBMetadataHandler() + { + super((System.getenv(GLUE_ENV_VAR) == null) ? AWSGlueClientBuilder.standard().build() : null, SOURCE_TYPE); + glue = getAwsGlue(); + connectionFactory = new DocDBConnectionFactory(); + } + + @VisibleForTesting + protected DocDBMetadataHandler(AWSGlue glue, + DocDBConnectionFactory connectionFactory, + EncryptionKeyFactory keyFactory, + AWSSecretsManager secretsManager, + AmazonAthena athena, + String spillBucket, + String spillPrefix) + { + super(glue, keyFactory, secretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix); + this.glue = glue; + this.connectionFactory = connectionFactory; + } + + private MongoClient getOrCreateConn(MetadataRequest request) + { + String endpoint = resolveSecrets(getConnStr(request)); + return connectionFactory.getOrCreateConn(endpoint); + } + + /** + * Retrieves the DocDB connection details from an env variable matching the catalog name, if no such + * env variable exists we fall back to the default env variable defined by DEFAULT_DOCDB.
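+ * For example, a query against catalog "docdb_instance_1" first looks for a Lambda environment variable named docdb_instance_1 and only falls back to the default_docdb variable when no such catalog-specific variable exists.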
+ */ + private String getConnStr(MetadataRequest request) + { + String conStr = System.getenv(request.getCatalogName()); + if (conStr == null) { + logger.info("getConnStr: No environment variable found for catalog {}, using default {}", + request.getCatalogName(), DEFAULT_DOCDB); + conStr = System.getenv(DEFAULT_DOCDB); + } + return conStr; + } + + /** + * List databases in your DocumentDB instance treating each as a 'schema' (aka database) + * + * @see GlueMetadataHandler + */ + @Override + public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest request) + { + List<String> schemas = new ArrayList<>(); + MongoClient client = getOrCreateConn(request); + try (MongoCursor<String> itr = client.listDatabaseNames().iterator()) { + while (itr.hasNext()) { + schemas.add(itr.next()); + } + + return new ListSchemasResponse(request.getCatalogName(), schemas); + } + } + + /** + * List collections in the requested schema in your DocumentDB instance treating the requested schema as a DocumentDB + * database. + * + * @see GlueMetadataHandler + */ + @Override + public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest request) + { + MongoClient client = getOrCreateConn(request); + List<TableName> tables = new ArrayList<>(); + + try (MongoCursor<String> itr = client.getDatabase(request.getSchemaName()).listCollectionNames().iterator()) { + while (itr.hasNext()) { + tables.add(new TableName(request.getSchemaName(), itr.next())); + } + + return new ListTablesResponse(request.getCatalogName(), tables); + } + } + + /** + * If Glue is enabled as a source of supplemental metadata we look up the requested Schema/Table in Glue and + * filter out any results that don't have the DOCDB_METADATA_FLAG set. If no matching results were found in Glue, + * then we resort to inferring the schema of the DocumentDB collection using SchemaUtils.inferSchema(...). If there + * is no such table in DocumentDB the operation will fail. + * + * @see GlueMetadataHandler + */ + @Override + public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request) + throws Exception + { + logger.info("doGetTable: enter - table[{}]", request.getTableName()); + Schema schema = null; + try { + if (glue != null) { + schema = super.doGetTable(blockAllocator, request, TABLE_FILTER).getSchema(); + logger.info("doGetTable: Retrieved schema for table[{}] from AWS Glue.", request.getTableName()); + } + } + catch (RuntimeException ex) { + logger.warn("doGetTable: Unable to retrieve table[{}:{}] from AWS Glue.", + request.getTableName().getSchemaName(), + request.getTableName().getTableName(), + ex); + } + + if (schema == null) { + logger.info("doGetTable: Inferring schema for table[{}].", request.getTableName()); + MongoClient client = getOrCreateConn(request); + schema = SchemaUtils.inferSchema(client, request.getTableName(), SCHEMA_INFERENCE_NUM_DOCS); + } + return new GetTableResponse(request.getCatalogName(), request.getTableName(), schema); + } + + /** + * Our connector doesn't support complex layouts or partitioning, so we simply make this method a NoOp. + * + * @see GlueMetadataHandler + */ + @Override + public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception + { + //NoOp as we do not support partitioning. + } + + /** + * Since our connector does not support parallel scans we generate a single Split and include the connection details + * as a property on the split so that the RecordHandler has easy access to it.
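+ * In other words there is exactly one split per query, the connection string rides along as a property of that split, and no continuation tokens are ever produced.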
+ * + * @see GlueMetadataHandler + */ + @Override + public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest request) + { + //Every split must have a unique location if we wish to spill to avoid failures + SpillLocation spillLocation = makeSpillLocation(request); + + //Since our connector does not support parallel reads we return a fixed split. + return new GetSplitsResponse(request.getCatalogName(), + Split.newBuilder(spillLocation, makeEncryptionKey()) + .add(DOCDB_CONN_STR, getConnStr(request)) + .build()); + } + + /** + * @see GlueMetadataHandler + */ + @Override + protected Field convertField(String name, String glueType) + { + return GlueFieldLexer.lex(name, glueType); + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBRecordHandler.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBRecordHandler.java new file mode 100644 index 0000000000..73ea87c9f1 --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/DocDBRecordHandler.java @@ -0,0 +1,169 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.mongodb.client.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoCursor; +import com.mongodb.client.MongoDatabase; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.bson.Document; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Map; +import java.util.concurrent.atomic.AtomicLong; + +import static com.amazonaws.athena.connectors.docdb.DocDBFieldResolver.DEFAULT_FIELD_RESOLVER; +import static com.amazonaws.athena.connectors.docdb.DocDBMetadataHandler.DOCDB_CONN_STR; + +/** + * Handles data read record requests for the Athena DocumentDB Connector. + *
<p>
+ * For more detail, please see the module's README.md, some notable characteristics of this class include: + *
<p>
+ * 1. Attempts to resolve sensitive configuration fields such as the DocDB connection string via SecretsManager so that you can + * substitute variables with values from SecretsManager by doing something like mongodb://${my_secret}@<hostname>:<port>/... + */ +public class DocDBRecordHandler + extends RecordHandler +{ + private static final Logger logger = LoggerFactory.getLogger(DocDBRecordHandler.class); + + //Used to denote the 'type' of this connector for diagnostic purposes. + private static final String SOURCE_TYPE = "documentdb"; + //Controls the page size for fetching batches of documents from the MongoDB client. + private static final int MONGO_QUERY_BATCH_SIZE = 100; + + private final DocDBConnectionFactory connectionFactory; + + public DocDBRecordHandler() + { + this(AmazonS3ClientBuilder.defaultClient(), + AWSSecretsManagerClientBuilder.defaultClient(), + AmazonAthenaClientBuilder.defaultClient(), + new DocDBConnectionFactory()); + } + + @VisibleForTesting + protected DocDBRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, DocDBConnectionFactory connectionFactory) + { + super(amazonS3, secretsManager, athena, SOURCE_TYPE); + this.connectionFactory = connectionFactory; + } + + /** + * Gets the special DOCDB_CONN_STR property from the provided split and uses its contents to getOrCreate + * a MongoDB client connection. + * + * @param split The split that we need to read, and thus the DocDB instance to connect to. + * @return A MongoClient connected to the requested DB instance. + * @note This method attempts to resolve any SecretsManager secrets that are used in the connection string, denoted + * by ${secret_name}. + */ + private MongoClient getOrCreateConn(Split split) + { + String conStr = split.getProperty(DOCDB_CONN_STR); + if (conStr == null) { + throw new RuntimeException(DOCDB_CONN_STR + " Split property is null! Unable to create connection."); + } + String endpoint = resolveSecrets(conStr); + return connectionFactory.getOrCreateConn(endpoint); + } + + /** + * Scans DocumentDB using the scan settings set on the requested Split by DocDBMetadataHandler.
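+ * The constraint summary is translated into a DocumentDB query document and the requested schema into a projection, so both predicate filtering and column pruning happen in DocumentDB before any rows are spilled.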
+ * + * @see RecordHandler + */ + @Override + protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) + { + TableName tableName = recordsRequest.getTableName(); + Map<String, ValueSet> constraintSummary = recordsRequest.getConstraints().getSummary(); + + MongoClient client = getOrCreateConn(recordsRequest.getSplit()); + MongoDatabase db = client.getDatabase(tableName.getSchemaName()); + MongoCollection<Document> table = db.getCollection(tableName.getTableName()); + + Document query = QueryUtils.makeQuery(recordsRequest.getSchema(), constraintSummary); + Document output = QueryUtils.makeProjection(recordsRequest.getSchema()); + + logger.info("readWithConstraint: query[{}] projection[{}]", query, output); + + final MongoCursor<Document> iterable = table + .find(query) + .projection(output) + .batchSize(MONGO_QUERY_BATCH_SIZE).iterator(); + + long numRows = 0; + AtomicLong numResultRows = new AtomicLong(0); + while (iterable.hasNext() && queryStatusChecker.isQueryRunning()) { + numRows++; + spiller.writeRows((Block block, int rowNum) -> { + Document doc = iterable.next(); + + boolean matched = true; + for (Field nextField : recordsRequest.getSchema().getFields()) { + Object value = TypeUtils.coerce(nextField, doc.get(nextField.getName())); + Types.MinorType fieldType = Types.getMinorTypeForArrowType(nextField.getType()); + try { + switch (fieldType) { + case LIST: + case STRUCT: + matched &= block.offerComplexValue(nextField.getName(), rowNum, DEFAULT_FIELD_RESOLVER, value); + break; + default: + matched &= block.offerValue(nextField.getName(), rowNum, value); + break; + } + if (!matched) { + return 0; + } + } + catch (Exception ex) { + throw new RuntimeException("Error while processing field " + nextField.getName(), ex); + } + } + + numResultRows.getAndIncrement(); + return 1; + }); + } + + logger.info("readWithConstraint: numRows[{}] numResultRows[{}]", numRows, numResultRows.get()); + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/QueryUtils.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/QueryUtils.java new file mode 100644 index 0000000000..fdf2b17191 --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/QueryUtils.java @@ -0,0 +1,247 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +/* + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * + * @note Portions of this file are attributable to: + * https://github.com/prestodb/presto/blob/master/presto-mongodb/src/main/java/com/facebook/presto/mongodb/MongoSession.java + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.arrow.vector.util.Text; +import org.bson.Document; + +import java.util.ArrayList; +import java.util.List; +import java.util.Map; + +import static com.google.common.base.Preconditions.checkState; +import static com.google.common.base.Verify.verify; +import static java.util.stream.Collectors.toList; + +/** + * Collection of helper methods which build Documents for use in DocumentDB queries, including: + * 1. Projections + * 2. Predicates + * 3. Queries (a collection of predicates) + */ +public final class QueryUtils +{ + private static final String OR_OP = "$or"; + private static final String AND_OP = "$and"; + private static final String NOT_OP = "$not"; + private static final String NOR_OP = "$nor"; + + private static final String EQ_OP = "$eq"; + private static final String NOT_EQ_OP = "$ne"; + private static final String EXISTS_OP = "$exists"; + private static final String GTE_OP = "$gte"; + private static final String GT_OP = "$gt"; + private static final String LT_OP = "$lt"; + private static final String LTE_OP = "$lte"; + private static final String IN_OP = "$in"; + private static final String NOTIN_OP = "$nin"; + + private QueryUtils() + { + } + + /** + * Given a Schema create a projection document which can be used to request only specific Document fields + * from DocumentDB. + * + * @param schema The schema containing the requested projection. + * @return A Document matching the requested field projections. + */ + public static Document makeProjection(Schema schema) + { + Document output = new Document(); + for (Field field : schema.getFields()) { + output.append(field.getName(), 1); + } + return output; + } + + /** + * Given a set of Constraints and the projection Schema, create the Query Document that can be used to + * push predicates into DocumentDB. + * + * @param schema The schema containing the requested projection. + * @param constraintSummary The set of constraints to apply to the query. + * @return The Document to use as the query. + */ + public static Document makeQuery(Schema schema, Map<String, ValueSet> constraintSummary) + { + Document query = new Document(); + for (Map.Entry<String, ValueSet> entry : constraintSummary.entrySet()) { + Document doc = makePredicate(schema.findField(entry.getKey()), entry.getValue()); + if (doc != null) { + query.putAll(doc); + } + } + + return query; + } + + /** + * Converts a single field constraint into a Document for use in a DocumentDB query. + * + * @param field The field for the given ValueSet constraint. + * @param constraint The constraint to apply to the given field. + * @return A Document describing the constraint for pushing down into DocumentDB.
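+ * For example, a range constraint of 10 < x <= 20 yields {"x": {"$gt": 10, "$lte": 20}}, while a set of discrete values collapses into a single $in predicate.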
+ */ + public static Document makePredicate(Field field, ValueSet constraint) + { + String name = field.getName(); + + if (constraint.isNone()) { + return documentOf(name, isNullPredicate()); + } + + if (constraint.isAll()) { + return documentOf(name, isNotNullPredicate()); + } + + if (constraint.isNullAllowed()) { + //TODO: support nulls mixed with discrete value constraints + return null; + } + + if (constraint instanceof EquatableValueSet) { + Block block = ((EquatableValueSet) constraint).getValues(); + List<Document> singleValues = new ArrayList<>(); + + FieldReader fieldReader = block.getFieldReaders().get(0); + for (int i = 0; i < block.getRowCount(); i++) { + Document nextEqVal = new Document(); + //position the reader on row i before reading the value + fieldReader.setPosition(i); + Object value = fieldReader.readObject(); + nextEqVal.put(EQ_OP, convert(value)); + singleValues.add(nextEqVal); + } + + return orPredicate(singleValues.stream() + .map(next -> new Document(name, next)) + .collect(toList())); + } + + List<Object> singleValues = new ArrayList<>(); + List<Document> disjuncts = new ArrayList<>(); + for (Range range : constraint.getRanges().getOrderedRanges()) { + if (range.isSingleValue()) { + singleValues.add(convert(range.getSingleValue())); + } + else { + Document rangeConjuncts = new Document(); + if (!range.getLow().isLowerUnbounded()) { + switch (range.getLow().getBound()) { + case ABOVE: + rangeConjuncts.put(GT_OP, convert(range.getLow().getValue())); + break; + case EXACTLY: + rangeConjuncts.put(GTE_OP, convert(range.getLow().getValue())); + break; + case BELOW: + throw new IllegalArgumentException("Low Marker should never use BELOW bound: " + range); + default: + throw new AssertionError("Unhandled bound: " + range.getLow().getBound()); + } + } + if (!range.getHigh().isUpperUnbounded()) { + switch (range.getHigh().getBound()) { + case ABOVE: + throw new IllegalArgumentException("High Marker should never use ABOVE bound: " + range); + case EXACTLY: + rangeConjuncts.put(LTE_OP, convert(range.getHigh().getValue())); + break; + case BELOW: + rangeConjuncts.put(LT_OP, convert(range.getHigh().getValue())); + break; + default: + throw new AssertionError("Unhandled bound: " + range.getHigh().getBound()); + } + } + // If rangeConjuncts is empty, then the range was ALL, which should already have been checked for + verify(!rangeConjuncts.isEmpty()); + disjuncts.add(rangeConjuncts); + } + } + + // Add back all of the possible single values either as an equality or an IN predicate + if (singleValues.size() == 1) { + disjuncts.add(documentOf(EQ_OP, singleValues.get(0))); + } + else if (singleValues.size() > 1) { + disjuncts.add(documentOf(IN_OP, singleValues)); + } + + return orPredicate(disjuncts.stream() + .map(disjunct -> new Document(name, disjunct)) + .collect(toList())); + } + + private static Document documentOf(String key, Object value) + { + return new Document(key, value); + } + + private static Document orPredicate(List<Document> values) + { + checkState(!values.isEmpty()); + if (values.size() == 1) { + return values.get(0); + } + return new Document(OR_OP, values); + } + + private static Document isNullPredicate() + { + return documentOf(EXISTS_OP, true).append(EQ_OP, null); + } + + private static Document isNotNullPredicate() + { + return documentOf(NOT_EQ_OP, null); + } + + private static Object convert(Object value) + { + if (value instanceof Text) { + return ((Text) value).toString(); + } + return value; + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/SchemaUtils.java
b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/SchemaUtils.java new file mode 100644 index 0000000000..9a2117dc7f --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/SchemaUtils.java @@ -0,0 +1,158 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.mongodb.client.MongoClient; +import com.mongodb.client.MongoCursor; +import com.mongodb.client.MongoDatabase; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.FieldType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.bson.Document; +import org.bson.types.ObjectId; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.Date; +import java.util.HashSet; +import java.util.List; +import java.util.Set; + +/** + * Collection of helpful utilities that handle DocumentDB schema inference, type, and naming conversion. + */ +public class SchemaUtils +{ + private static final Logger logger = LoggerFactory.getLogger(SchemaUtils.class); + + private SchemaUtils() {} + + /** + * This method will produce an Apache Arrow Schema for the given TableName and DocumentDB connection + * by scanning up to the requested number of rows and using basic schema inference to determine + * data types. + * + * @param client The DocumentDB connection to use for the scan operation. + * @param table The DocumentDB TableName for which to produce an Apache Arrow Schema. + * @param numObjToSample The number of records to scan as part of producing the Schema. + * @return An Apache Arrow Schema representing the schema of the DocumentDB collection. + * @note The resulting schema is a union of the schema of every row that is scanned. Presently the code does not + * attempt to resolve conflicts if a unique field has different types across documents. It is recommended that you + * use AWS Glue to define a schema for tables which may have such conflicts. In the future we may enhance this method + * to use a reasonable default (like String) and coerce heterogeneous fields to avoid query failure, but forcing + * explicit handling by defining the Schema in AWS Glue is likely a better approach.
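+ * For example, if the first sampled document contains {"a": 1} and a later document contains {"a": "one"}, the resulting Schema types column a as INT because only the first occurrence of each field name is inspected.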
+ */ + public static Schema inferSchema(MongoClient client, TableName table, int numObjToSample) + { + MongoDatabase db = client.getDatabase(table.getSchemaName()); + + try (MongoCursor<Document> docs = db.getCollection(table.getTableName()).find().batchSize(numObjToSample) + .maxScan(numObjToSample).limit(numObjToSample).iterator()) { + if (!docs.hasNext()) { + return SchemaBuilder.newBuilder().build(); + } + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder(); + + Set<String> discoveredColumns = new HashSet<>(); + while (docs.hasNext()) { + Document doc = docs.next(); + for (String key : doc.keySet()) { + if (!discoveredColumns.contains(key)) { + schemaBuilder.addField(getArrowField(key, doc.get(key))); + discoveredColumns.add(key); + } + } + } + + return schemaBuilder.build(); + } + } + + /** + * Infers the type of a single DocumentDB document field. + * + * @param key The key of the field we are attempting to infer. + * @param value A value from the key whose type we are attempting to infer. + * @return The Apache Arrow field definition of the inferred key/value. + */ + public static Field getArrowField(String key, Object value) + { + if (value instanceof String) { + return new Field(key, FieldType.nullable(Types.MinorType.VARCHAR.getType()), null); + } + else if (value instanceof Integer) { + return new Field(key, FieldType.nullable(Types.MinorType.INT.getType()), null); + } + else if (value instanceof Long) { + return new Field(key, FieldType.nullable(Types.MinorType.BIGINT.getType()), null); + } + else if (value instanceof Boolean) { + return new Field(key, FieldType.nullable(Types.MinorType.BIT.getType()), null); + } + else if (value instanceof Float) { + return new Field(key, FieldType.nullable(Types.MinorType.FLOAT4.getType()), null); + } + else if (value instanceof Double) { + return new Field(key, FieldType.nullable(Types.MinorType.FLOAT8.getType()), null); + } + else if (value instanceof Date) { + return new Field(key, FieldType.nullable(Types.MinorType.DATEMILLI.getType()), null); + } + else if (value instanceof ObjectId) { + return new Field(key, FieldType.nullable(Types.MinorType.VARCHAR.getType()), null); + } + else if (value instanceof List) { + Field child; + if (((List) value).isEmpty()) { + //Best-effort: attempt to reflectively instantiate the list's element type to infer a child type. + //This rarely succeeds for empty lists; defining the schema in AWS Glue is the reliable option here. + try { + Object subVal = ((List) value).getClass() + .getTypeParameters()[0].getGenericDeclaration().newInstance(); + child = getArrowField("", subVal); + } + catch (IllegalAccessException | InstantiationException ex) { + throw new RuntimeException(ex); + } + } + else { + child = getArrowField("", ((List) value).get(0)); + } + return new Field(key, FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(child)); + } + else if (value instanceof Document) { + List<Field> children = new ArrayList<>(); + Document doc = (Document) value; + for (String childKey : doc.keySet()) { + Object childVal = doc.get(childKey); + Field child = getArrowField(childKey, childVal); + children.add(child); + } + return new Field(key, FieldType.nullable(Types.MinorType.STRUCT.getType()), children); + } + + String className = (value == null) ?
"null" : value.getClass().getName(); + throw new RuntimeException("Unknown type[" + className + "] for field[" + key + "]"); + } +} diff --git a/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/TypeUtils.java b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/TypeUtils.java new file mode 100644 index 0000000000..0ae8391ab9 --- /dev/null +++ b/athena-docdb/src/main/java/com/amazonaws/athena/connectors/docdb/TypeUtils.java @@ -0,0 +1,92 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.bson.types.ObjectId; + +/** + * Helper class with useful methods for type conversion and coercion. + */ +public class TypeUtils +{ + private TypeUtils() {} + + /** + * Allows for coercing types in the event that schema has evolved or there were other data issues. + * + * @param field The field that we are coercing the value into. + * @param origVal The value to coerce + * @return The coerced value. + * @note This method does only basic coercion today but will likely support more advanced + * coercions in the future as a way of dealing with schema evolution. 
+ */ + public static Object coerce(Field field, Object origVal) + { + if (origVal == null) { + return origVal; + } + + if (origVal instanceof ObjectId) { + return origVal.toString(); + } + + ArrowType arrowType = field.getType(); + Types.MinorType minorType = Types.getMinorTypeForArrowType(arrowType); + + switch (minorType) { + case VARCHAR: + if (origVal instanceof String) { + return origVal; + } + else { + return String.valueOf(origVal); + } + case FLOAT8: + if (origVal instanceof Integer) { + return Double.valueOf((int) origVal); + } + else if (origVal instanceof Float) { + return Double.valueOf((float) origVal); + } + return origVal; + case FLOAT4: + if (origVal instanceof Integer) { + return Float.valueOf((int) origVal); + } + else if (origVal instanceof Double) { + return ((Double) origVal).floatValue(); + } + return origVal; + case INT: + if (origVal instanceof Float) { + return ((Float) origVal).intValue(); + } + else if (origVal instanceof Double) { + return ((Double) origVal).intValue(); + } + return origVal; + default: + return origVal; + } + } +} diff --git a/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBConnectionFactoryTest.java b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBConnectionFactoryTest.java new file mode 100644 index 0000000000..9ef845a11c --- /dev/null +++ b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBConnectionFactoryTest.java @@ -0,0 +1,58 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.mongodb.client.MongoClient; +import org.junit.Before; +import org.junit.Test; + +import java.io.IOException; + +import static org.junit.Assert.*; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +public class DocDBConnectionFactoryTest +{ + private DocDBConnectionFactory connectionFactory; + + @Before + public void setUp() + throws Exception + { + connectionFactory = new DocDBConnectionFactory(); + } + + @Test + public void clientCacheHitTest() + throws IOException + { + MongoClient mockConn = mock(MongoClient.class); + when(mockConn.listDatabaseNames()).thenReturn(null); + + connectionFactory.addConnection("conStr", mockConn); + MongoClient conn = connectionFactory.getOrCreateConn("conStr"); + + assertEquals(mockConn, conn); + verify(mockConn, times(1)).listDatabaseNames(); + } +} diff --git a/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBMetadataHandlerTest.java b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBMetadataHandlerTest.java new file mode 100644 index 0000000000..f572ff6573 --- /dev/null +++ b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBMetadataHandlerTest.java @@ -0,0 +1,311 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.mongodb.client.FindIterable; +import com.mongodb.client.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoDatabase; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.bson.Document; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.anyInt; +import static org.mockito.Matchers.anyString; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class DocDBMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(DocDBMetadataHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private String catalog = "default"; + private DocDBMetadataHandler handler; + private BlockAllocator allocator; + + @Mock + private DocDBConnectionFactory connectionFactory; + + @Mock + private MongoClient mockClient; + + @Mock + private AWSGlue awsGlue; + + @Mock + private AWSSecretsManager secretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + throws Exception + { + when(connectionFactory.getOrCreateConn(anyString())).thenReturn(mockClient); + + handler = new DocDBMetadataHandler(awsGlue, connectionFactory, new LocalKeyFactory(), secretsManager, 
mockAthena, "spillBucket", "spillPrefix"); + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doListSchemaNames() + { + logger.info("doListSchemaNames: enter"); + + List schemaNames = new ArrayList<>(); + schemaNames.add("schema1"); + schemaNames.add("schema2"); + schemaNames.add("schema3"); + + when(mockClient.listDatabaseNames()).thenReturn(StubbingCursor.iterate(schemaNames)); + + ListSchemasRequest req = new ListSchemasRequest(identity, "queryId", "default"); + ListSchemasResponse res = handler.doListSchemaNames(allocator, req); + + logger.info("doListSchemas - {}", res.getSchemas()); + assertEquals(schemaNames, new ArrayList<>(res.getSchemas())); + + logger.info("doListSchemaNames: exit"); + } + + @Test + public void doListTables() + { + logger.info("doListTables - enter"); + + String schema = "schema1"; + + List tableNames = new ArrayList<>(); + tableNames.add("table1"); + tableNames.add("table2"); + tableNames.add("table3"); + + MongoDatabase mockDatabase = mock(MongoDatabase.class); + when(mockClient.getDatabase(eq(schema))).thenReturn(mockDatabase); + when(mockDatabase.listCollectionNames()).thenReturn(StubbingCursor.iterate(tableNames)); + + ListTablesRequest req = new ListTablesRequest(identity, "queryId", "default", schema); + ListTablesResponse res = handler.doListTables(allocator, req); + logger.info("doListTables - {}", res.getTables()); + + for (TableName next : res.getTables()) { + assertEquals(schema, next.getSchemaName()); + assertTrue(tableNames.contains(next.getTableName())); + } + assertEquals(tableNames.size(), res.getTables().size()); + + logger.info("doListTables - exit"); + } + + /** + * TODO: Add more types. + */ + @Test + public void doGetTable() + throws Exception + { + logger.info("doGetTable - enter"); + + String schema = "schema1"; + String table = "table1"; + + List documents = new ArrayList<>(); + + Document doc1 = new Document(); + documents.add(doc1); + doc1.put("stringCol", "stringVal"); + doc1.put("intCol", 1); + doc1.put("doubleCol", 2.2D); + doc1.put("longCol", 100L); + + Document doc2 = new Document(); + documents.add(doc2); + doc2.put("stringCol2", "stringVal"); + doc2.put("intCol2", 1); + doc2.put("doubleCol2", 2.2D); + doc2.put("longCol2", 100L); + + Document doc3 = new Document(); + documents.add(doc3); + doc3.put("stringCol", "stringVal"); + doc3.put("intCol2", 1); + doc3.put("doubleCol", 2.2D); + doc3.put("longCol2", 100L); + + MongoDatabase mockDatabase = mock(MongoDatabase.class); + MongoCollection mockCollection = mock(MongoCollection.class); + FindIterable mockIterable = mock(FindIterable.class); + when(mockClient.getDatabase(eq(schema))).thenReturn(mockDatabase); + when(mockDatabase.getCollection(eq(table))).thenReturn(mockCollection); + when(mockCollection.find()).thenReturn(mockIterable); + when(mockIterable.limit(anyInt())).thenReturn(mockIterable); + when(mockIterable.maxScan(anyInt())).thenReturn(mockIterable); + when(mockIterable.batchSize(anyInt())).thenReturn(mockIterable); + when(mockIterable.iterator()).thenReturn(new StubbingCursor(documents.iterator())); + + GetTableRequest req = new GetTableRequest(identity, "queryId", catalog, new TableName(schema, table)); + GetTableResponse res = handler.doGetTable(allocator, req); + logger.info("doGetTable - {}", res); + + assertEquals(8, res.getSchema().getFields().size()); + + Field stringCol = res.getSchema().findField("stringCol"); + assertEquals(Types.MinorType.VARCHAR, 
Types.getMinorTypeForArrowType(stringCol.getType())); + + Field stringCol2 = res.getSchema().findField("stringCol2"); + assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(stringCol2.getType())); + + Field intCol = res.getSchema().findField("intCol"); + assertEquals(Types.MinorType.INT, Types.getMinorTypeForArrowType(intCol.getType())); + + Field intCol2 = res.getSchema().findField("intCol2"); + assertEquals(Types.MinorType.INT, Types.getMinorTypeForArrowType(intCol2.getType())); + + Field doubleCol = res.getSchema().findField("doubleCol"); + assertEquals(Types.MinorType.FLOAT8, Types.getMinorTypeForArrowType(doubleCol.getType())); + + Field doubleCol2 = res.getSchema().findField("doubleCol2"); + assertEquals(Types.MinorType.FLOAT8, Types.getMinorTypeForArrowType(doubleCol2.getType())); + + Field longCol = res.getSchema().findField("longCol"); + assertEquals(Types.MinorType.BIGINT, Types.getMinorTypeForArrowType(longCol.getType())); + + Field longCol2 = res.getSchema().findField("longCol2"); + assertEquals(Types.MinorType.BIGINT, Types.getMinorTypeForArrowType(longCol2.getType())); + + logger.info("doGetTable - exit"); + } + + @Test + public void doGetTableLayout() + throws Exception + { + logger.info("doGetTableLayout - enter"); + + Schema schema = SchemaBuilder.newBuilder().build(); + GetTableLayoutRequest req = new GetTableLayoutRequest(identity, + "queryId", + "default", + new TableName("schema1", "table1"), + new Constraints(new HashMap<>()), + schema, + Collections.EMPTY_SET); + + GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req); + + logger.info("doGetTableLayout - {}", res); + Block partitions = res.getPartitions(); + for (int row = 0; row < partitions.getRowCount() && row < 10; row++) { + logger.info("doGetTableLayout:{} {}", row, BlockUtils.rowToString(partitions, row)); + } + + assertTrue(partitions.getRowCount() > 0); + + logger.info("doGetTableLayout: partitions[{}]", partitions.getRowCount()); + } + + @Test + public void doGetSplits() + { + logger.info("doGetSplits: enter"); + + List partitionCols = new ArrayList<>(); + + Block partitions = BlockUtils.newBlock(allocator, "partitionId", Types.MinorType.INT.getType(), 0); + + String continuationToken = null; + GetSplitsRequest originalReq = new GetSplitsRequest(identity, + "queryId", + "catalog_name", + new TableName("schema", "table_name"), + partitions, + partitionCols, + new Constraints(new HashMap<>()), + null); + + GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken); + + logger.info("doGetSplits: req[{}]", req); + + MetadataResponse rawResponse = handler.doGetSplits(allocator, req); + assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType()); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + continuationToken = response.getContinuationToken(); + + logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", + new Object[] {continuationToken, response.getSplits().size()}); + + assertTrue("Continuation criteria violated", response.getSplits().size() == 1); + assertTrue("Continuation criteria violated", response.getContinuationToken() == null); + + logger.info("doGetSplits: exit"); + } +} diff --git a/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBRecordHandlerTest.java b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBRecordHandlerTest.java new file mode 100644 index 0000000000..6e5bd9d55b --- /dev/null +++ 
b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocDBRecordHandlerTest.java @@ -0,0 +1,373 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import com.google.common.io.ByteStreams; +import com.mongodb.client.FindIterable; +import com.mongodb.client.MongoClient; +import com.mongodb.client.MongoCollection; +import com.mongodb.client.MongoDatabase; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.bson.Document; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.mockito.stubbing.Answer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import 
java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; + +import static com.amazonaws.athena.connectors.docdb.DocDBMetadataHandler.DOCDB_CONN_STR; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyInt; +import static org.mockito.Matchers.anyObject; +import static org.mockito.Matchers.anyString; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class DocDBRecordHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(DocDBRecordHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private String catalog = "default"; + private String conStr = "connectionString"; + private DocDBRecordHandler handler; + private BlockAllocator allocator; + private List mockS3Storage = new ArrayList<>(); + private AmazonS3 amazonS3; + private S3BlockSpillReader spillReader; + private Schema schemaForRead; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + + @Mock + private DocDBConnectionFactory connectionFactory; + + @Mock + private MongoClient mockClient; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + { + logger.info("setUpBefore - enter"); + + schemaForRead = SchemaBuilder.newBuilder() + .addField("col1", new ArrowType.Int(32, true)) + .addField("col2", new ArrowType.Utf8()) + .addField("col3", new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)) + .addField("int", Types.MinorType.INT.getType()) + .addField("tinyint", Types.MinorType.TINYINT.getType()) + .addField("smallint", Types.MinorType.SMALLINT.getType()) + .addField("bigint", Types.MinorType.BIGINT.getType()) + .addField("uint1", Types.MinorType.UINT1.getType()) + .addField("uint2", Types.MinorType.UINT2.getType()) + .addField("uint4", Types.MinorType.UINT4.getType()) + .addField("uint8", Types.MinorType.UINT8.getType()) + .addField("float4", Types.MinorType.FLOAT4.getType()) + .addField("float8", Types.MinorType.FLOAT8.getType()) + .addField("bit", Types.MinorType.BIT.getType()) + .addField("varchar", Types.MinorType.VARCHAR.getType()) + .addField("varbinary", Types.MinorType.VARBINARY.getType()) + .addField("decimal", new ArrowType.Decimal(10, 2)) + .addField("decimalLong", new ArrowType.Decimal(36, 2)) + .addStructField("struct") + .addChildField("struct", "struct_string", Types.MinorType.VARCHAR.getType()) + .addChildField("struct", "struct_int", Types.MinorType.INT.getType()) + .addListField("list", Types.MinorType.VARCHAR.getType()) + .build(); + + when(connectionFactory.getOrCreateConn(anyString())).thenReturn(mockClient); + + allocator = new BlockAllocatorImpl(); + + amazonS3 = mock(AmazonS3.class); + + when(amazonS3.putObject(anyObject(), anyObject(), anyObject(), anyObject())) + .thenAnswer(new Answer() + { + @Override + public Object answer(InvocationOnMock invocationOnMock) + throws Throwable + { + InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2]; + DocDBRecordHandlerTest.ByteHolder byteHolder = new ByteHolder(); + byteHolder.setBytes(ByteStreams.toByteArray(inputStream)); + mockS3Storage.add(byteHolder); + return mock(PutObjectResult.class); + } + }); + + when(amazonS3.getObject(anyString(), anyString())) + .thenAnswer(new Answer() + { + @Override + public Object 
answer(InvocationOnMock invocationOnMock) + throws Throwable + { + S3Object mockObject = mock(S3Object.class); + ByteHolder byteHolder = mockS3Storage.get(0); + mockS3Storage.remove(0); + when(mockObject.getObjectContent()).thenReturn( + new S3ObjectInputStream( + new ByteArrayInputStream(byteHolder.getBytes()), null)); + return mockObject; + } + }); + + handler = new DocDBRecordHandler(amazonS3, mockSecretsManager, mockAthena, connectionFactory); + spillReader = new S3BlockSpillReader(amazonS3, allocator); + + logger.info("setUpBefore - exit"); + } + + @After + public void after() + { + allocator.close(); + } + + private Document makeDocument(Schema schema, int seed) + { + Document doc = new Document(); + doc.put("stringCol", "stringVal"); + doc.put("intCol", 1); + doc.put("col3", 22.0D); + return doc; + } + + @Test + public void doReadRecordsNoSpill() + throws Exception + { + logger.info("doReadRecordsNoSpill: enter"); + + String schema = "schema1"; + String table = "table1"; + + List documents = new ArrayList<>(); + + int docNum = 11; + Document doc1 = DocumentGenerator.makeRandomRow(schemaForRead.getFields(), docNum++); + documents.add(doc1); + doc1.put("col3", 22.0D); + + Document doc2 = DocumentGenerator.makeRandomRow(schemaForRead.getFields(), docNum++); + documents.add(doc2); + doc2.put("col3", 22.0D); + + Document doc3 = DocumentGenerator.makeRandomRow(schemaForRead.getFields(), docNum++); + documents.add(doc3); + doc3.put("col3", 21.0D); + + MongoDatabase mockDatabase = mock(MongoDatabase.class); + MongoCollection mockCollection = mock(MongoCollection.class); + FindIterable mockIterable = mock(FindIterable.class); + when(mockClient.getDatabase(eq(schema))).thenReturn(mockDatabase); + when(mockDatabase.getCollection(eq(table))).thenReturn(mockCollection); + when(mockCollection.find(any(Document.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + logger.info("doReadRecordsNoSpill: query[{}]", invocationOnMock.getArguments()[0]); + return mockIterable; + }); + when(mockIterable.projection(any(Document.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + logger.info("doReadRecordsNoSpill: projection[{}]", invocationOnMock.getArguments()[0]); + return mockIterable; + }); + when(mockIterable.batchSize(anyInt())).thenReturn(mockIterable); + when(mockIterable.iterator()).thenReturn(new StubbingCursor(documents.iterator())); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("col3", SortedRangeSet.copyOf(Types.MinorType.FLOAT8.getType(), + ImmutableList.of(Range.equal(allocator, Types.MinorType.FLOAT8.getType(), 22.0D)), false)); + + S3SpillLocation splitLoc = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + catalog, + "queryId-" + System.currentTimeMillis(), + new TableName(schema, table), + schemaForRead, + Split.newBuilder(splitLoc, keyFactory.create()).add(DOCDB_CONN_STR, conStr).build(), + new Constraints(constraintsMap), + 100_000_000_000L, //100GB don't expect this to spill + 100_000_000_000L + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("doReadRecordsNoSpill: rows[{}]", response.getRecordCount()); + + assertTrue(response.getRecords().getRowCount() == 2); + 
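+ //only doc1 and doc2 match the col3 = 22.0 constraint; doc3 (col3 = 21.0) is filtered out, hence exactly 2 rows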
logger.info("doReadRecordsNoSpill: {}", BlockUtils.rowToString(response.getRecords(), 0)); + + logger.info("doReadRecordsNoSpill: exit"); + } + + @Test + public void doReadRecordsSpill() + throws Exception + { + logger.info("doReadRecordsSpill: enter"); + + String schema = "schema1"; + String table = "table1"; + + List documents = new ArrayList<>(); + + for (int docNum = 0; docNum < 20_000; docNum++) { + documents.add(DocumentGenerator.makeRandomRow(schemaForRead.getFields(), docNum)); + } + + MongoDatabase mockDatabase = mock(MongoDatabase.class); + MongoCollection mockCollection = mock(MongoCollection.class); + FindIterable mockIterable = mock(FindIterable.class); + when(mockClient.getDatabase(eq(schema))).thenReturn(mockDatabase); + when(mockDatabase.getCollection(eq(table))).thenReturn(mockCollection); + when(mockCollection.find(any(Document.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + logger.info("doReadRecordsNoSpill: query[{}]", invocationOnMock.getArguments()[0]); + return mockIterable; + }); + when(mockIterable.projection(any(Document.class))).thenAnswer((InvocationOnMock invocationOnMock) -> { + logger.info("doReadRecordsNoSpill: projection[{}]", invocationOnMock.getArguments()[0]); + return mockIterable; + }); + when(mockIterable.batchSize(anyInt())).thenReturn(mockIterable); + when(mockIterable.iterator()).thenReturn(new StubbingCursor(documents.iterator())); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("col3", SortedRangeSet.copyOf(Types.MinorType.FLOAT8.getType(), + ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.FLOAT8.getType(), -10000D)), false)); + + S3SpillLocation splitLoc = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + catalog, + "queryId-" + System.currentTimeMillis(), + new TableName(schema, table), + schemaForRead, + Split.newBuilder(splitLoc, keyFactory.create()).add(DOCDB_CONN_STR, conStr).build(), + new Constraints(constraintsMap), + 1_500_000L, //~1.5MB so we should see some spill + 0L + ); + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof RemoteReadRecordsResponse); + + try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) { + logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size()); + + assertTrue(response.getNumberBlocks() > 1); + + int blockNum = 0; + for (SpillLocation next : response.getRemoteBlocks()) { + S3SpillLocation spillLocation = (S3SpillLocation) next; + try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) { + + logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount()); + // assertTrue(++blockNum < response.getRemoteBlocks().size() && block.getRowCount() > 10_000); + + logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0)); + assertNotNull(BlockUtils.rowToString(block, 0)); + } + } + } + + logger.info("doReadRecordsSpill: exit"); + } + + private class ByteHolder + { + private byte[] bytes; + + public void setBytes(byte[] bytes) + { + this.bytes = bytes; + } + + public byte[] getBytes() + { + return bytes; + } + } +} diff --git a/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocumentGenerator.java 
b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocumentGenerator.java new file mode 100644 index 0000000000..4ad482eac7 --- /dev/null +++ b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/DocumentGenerator.java @@ -0,0 +1,115 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.bson.Document; + +import java.util.ArrayList; +import java.util.List; + +public class DocumentGenerator +{ + private DocumentGenerator() {} + + /** + * This should be replaced with something that actually reads useful data. + */ + public static Document makeRandomRow(List fields, int seed) + { + Document result = new Document(); + + for (Field next : fields) { + boolean negative = seed % 2 == 1; + Types.MinorType minorType = Types.getMinorTypeForArrowType(next.getType()); + switch (minorType) { + case INT: + int iVal = seed * (negative ? -1 : 1); + result.put(next.getName(), iVal); + break; + case TINYINT: + case SMALLINT: + int stVal = (seed % 4) * (negative ? -1 : 1); + result.put(next.getName(), stVal); + break; + case UINT1: + case UINT2: + case UINT4: + case UINT8: + int uiVal = seed % 4; + result.put(next.getName(), uiVal); + break; + case FLOAT4: + float fVal = seed * 1.1f * (negative ? -1 : 1); + result.put(next.getName(), fVal); + break; + case FLOAT8: + case DECIMAL: + double d8Val = seed * 1.1D * (negative ? -1 : 1); + result.put(next.getName(), d8Val); + break; + case BIT: + boolean bVal = seed % 2 == 0; + result.put(next.getName(), bVal); + break; + case BIGINT: + long lVal = seed * 1L * (negative ? -1 : 1); + result.put(next.getName(), lVal); + break; + case VARCHAR: + String vVal = "VarChar" + seed; + result.put(next.getName(), vVal); + break; + case VARBINARY: + byte[] binaryVal = ("VarChar" + seed).getBytes(); + result.put(next.getName(), binaryVal); + break; + case STRUCT: + result.put(next.getName(), makeRandomRow(next.getChildren(), seed)); + break; + case LIST: + //TODO: pretty dirty way of generating lists should refactor this to support better generation + Types.MinorType listType = Types.getMinorTypeForArrowType(next.getChildren().get(0).getType()); + switch (listType) { + case VARCHAR: + List listVarChar = new ArrayList<>(); + listVarChar.add("VarChar" + seed); + listVarChar.add("VarChar" + seed + 1); + result.put(next.getName(), listVarChar); + break; + case INT: + List listIVal = new ArrayList<>(); + listIVal.add(seed * (negative ? -1 : 1)); + listIVal.add(seed * (negative ? 
-1 : 1) + 1); + result.put(next.getName(), listIVal); + break; + default: + throw new RuntimeException(minorType + " is not supported in list"); + } + break; + default: + throw new RuntimeException(minorType + " is not supported"); + } + } + + return result; + } +} diff --git a/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/StubbingCursor.java b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/StubbingCursor.java new file mode 100644 index 0000000000..f95668827c --- /dev/null +++ b/athena-docdb/src/test/java/com/amazonaws/athena/connectors/docdb/StubbingCursor.java @@ -0,0 +1,132 @@ +/*- + * #%L + * athena-mongodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.docdb; + +import com.mongodb.Block; +import com.mongodb.Function; +import com.mongodb.ServerAddress; +import com.mongodb.ServerCursor; +import com.mongodb.client.MongoCursor; +import com.mongodb.client.MongoIterable; + +import java.util.Collection; +import java.util.Iterator; +import java.util.function.Consumer; + +public class StubbingCursor + implements MongoCursor +{ + private Iterator values; + + public StubbingCursor(Iterator result) + { + this.values = result; + } + + @Override + public void remove() + { + throw new UnsupportedOperationException(); + } + + @Override + public void forEachRemaining(Consumer action) + { + + } + + @Override + public void close() + { + + } + + @Override + public boolean hasNext() + { + return values.hasNext(); + } + + @Override + public T next() + { + return values.next(); + } + + @Override + public T tryNext() + { + throw new UnsupportedOperationException(); + } + + @Override + public ServerCursor getServerCursor() + { + throw new UnsupportedOperationException(); + } + + @Override + public ServerAddress getServerAddress() + { + throw new UnsupportedOperationException(); + } + + public static MongoIterable iterate(Collection result) + { + return new MongoIterable() + { + @Override + public MongoCursor iterator() + { + return new StubbingCursor<>(result.iterator()); + } + + @Override + public X first() + { + throw new UnsupportedOperationException(); + } + + @Override + public MongoIterable map(Function function) + { + throw new UnsupportedOperationException(); + } + + @Override + public void forEach(Block block) + { + throw new UnsupportedOperationException(); + } + + @Override + public > A into(A objects) + { + throw new UnsupportedOperationException(); + } + + @Override + public MongoIterable batchSize(int i) + { + return this; + } + }; + } +} diff --git a/athena-dynamodb/LICENSE.txt b/athena-dynamodb/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-dynamodb/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. 
+ + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. 
This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. \ No newline at end of file diff --git a/athena-dynamodb/README.md b/athena-dynamodb/README.md new file mode 100644 index 0000000000..1bf551e9a5 --- /dev/null +++ b/athena-dynamodb/README.md @@ -0,0 +1,60 @@ +# Amazon Athena DynamoDB Connector + +This connector enables Amazon Athena to communicate with DynamoDB, making your tables accessible via SQL. + +## Usage + +### Parameters + +The Athena DynamoDB Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below. + +1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket) +2. **spill_prefix** - (Optional) Defaults to sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large +responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/Hours. +3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. 
Setting a KMS Key ID allows your Lambda function to use KMS for key +generation for a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331) +4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM either with a randomly generated key or using KMS to generate keys. +Setting this to 'true' disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False) +5. **disable_glue** - (Optional) If set to true, the connector will not attempt to retrieve supplemental metadata from Glue. +6. **glue_catalog** - (Optional) Can be used to target a cross-account Glue catalog. By default the connector will attempt to get metadata from its own Glue account. + +### Setting Up Databases & Tables in Glue + +To enable a Glue Table for use with DynamoDB, you simply need to have a Glue table that matches any DynamoDB Table that you'd like to supply supplemental metadata for (instead of relying on the DynamoDB +Connector's limited ability to infer schema). You can enable a Glue table to be used for supplemental metadata by setting one of the below table properties from the Glue Console when editing the Table in +question. These properties are automatically set if you use Glue's DynamoDB Crawler. The only other thing you need to do is ensure that you use the appropriate data types when defining tables manually, or validate +the columns and types that the Crawler discovered. + +1. **dynamodb** - String indicating that the table can be used for supplemental meta-data by the Athena DynamoDB Connector. This string can be in any one of the following places: + 1. in the table properties/parameters under a field called "classification" (exact match). + 2. in the table's storage descriptor's location field (substring match). + 3. in the table's storage descriptor's parameters under a field called "classification" (exact match). +2. **dynamo-db-flag** - String indicating that the *database* contains tables used for supplemental meta-data by the Athena DynamoDB Connector. This is required for any Glue databases other than "default" +and is useful for filtering out irrelevant databases in accounts that have lots of them. This string should be in the Location URI of the Glue Database (substring match). + + +### Required Permissions + +Review the "Policies" section of the athena-dynamodb.yaml file for full details on the IAM Policies required by this connector. A brief summary is below. + +1. DynamoDB Read Access - The connector uses the DescribeTable, ListSchemas, ListTables, Query, and Scan APIs. +2. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3. +3. Glue Data Catalog - Since DynamoDB does not have a meta-data store, the connector requires Read-Only access to Glue's DataCatalog for supplemental table schema information. +4. CloudWatch Logs - This permission is somewhat implicit when deploying a Lambda function, but the function needs write access to CloudWatch Logs to store its logs. +5. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated. + +### Deploying The Connector + +To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. 
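+
+If you want to script that deployment rather than use the 1-Click form, an AWS SAM CLI invocation along these lines should work (the stack, bucket, and catalog names here are illustrative, not part of this module):
+
+```bash
+# Hypothetical scripted deployment of the connector's SAM template.
+# CAPABILITY_IAM is required because the template creates IAM policies.
+sam deploy \
+    --template-file athena-dynamodb.yaml \
+    --stack-name athena-dynamodb-connector \
+    --s3-bucket my-deployment-artifacts-bucket \
+    --capabilities CAPABILITY_IAM \
+    --parameter-overrides AthenaCatalogName=dynamo SpillBucket=my-athena-spill-bucket
+```
+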
Alternatively, you can build and deploy this connector from +source by following the steps below, or use the more detailed tutorial in the athena-example module: + +1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already. +2. From the athena-dynamodb dir, run `mvn clean install`. +3. From the athena-dynamodb dir, run `../tools/publish.sh S3_BUCKET_NAME athena-dynamodb` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command +is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with the appropriate permissions to deploy instances of the +connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo). + +## Performance + +The Athena DynamoDB Connector supports parallel scans and will attempt to push down predicates as part of its DynamoDB queries. A hash key predicate with X distinct values will result in X Query +calls to DynamoDB. All other predicate scenarios will result in Y Scan calls, where Y is heuristically determined based on the size of your table and its provisioned throughput. \ No newline at end of file diff --git a/athena-dynamodb/athena-dynamodb.yaml b/athena-dynamodb/athena-dynamodb.yaml new file mode 100644 index 0000000000..821600a123 --- /dev/null +++ b/athena-dynamodb/athena-dynamodb.yaml @@ -0,0 +1,77 @@ +Transform: 'AWS::Serverless-2016-10-31' +Metadata: + 'AWS::ServerlessRepo::Application': + Name: AthenaDynamoDBConnector + Description: 'This connector enables Amazon Athena to communicate with DynamoDB, making your tables accessible via SQL.' + Author: 'Amazon Athena' + SpdxLicenseId: Apache-2.0 + LicenseUrl: LICENSE.txt + ReadmeUrl: README.md + Labels: + - athena-federation + HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation' + SemanticVersion: 1.0.0 + SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation' +Parameters: + AthenaCatalogName: + Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.' + Type: String + SpillBucket: + Description: 'The bucket where this function can spill data.' + Type: String + SpillPrefix: + Description: 'The bucket prefix where this function can spill large responses.' + Type: String + Default: athena-spill + LambdaTimeout: + Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)' + Default: 900 + Type: Number + LambdaMemory: + Description: 'Lambda memory in MB (min 128 - 3008 max).' + Default: 3008 + Type: Number + DisableSpillEncryption: + Description: "WARNING: If set to 'true' encryption for spilled data is disabled." 
+ Default: 'false' + Type: String +Resources: + ConnectorConfig: + Type: 'AWS::Serverless::Function' + Properties: + Environment: + Variables: + disable_spill_encryption: !Ref DisableSpillEncryption + spill_bucket: !Ref SpillBucket + spill_prefix: !Ref SpillPrefix + FunctionName: !Ref AthenaCatalogName + Handler: "com.amazonaws.athena.connectors.dynamodb.DynamoDBCompositeHandler" + CodeUri: "./target/athena-dynamodb-1.0.jar" + Description: "Enables Amazon Athena to communicate with DynamoDB, making your tables accessible via SQL" + Runtime: java8 + Timeout: !Ref LambdaTimeout + MemorySize: !Ref LambdaMemory + Policies: + - Statement: + - Action: + - dynamodb:DescribeTable + - dynamodb:ListSchemas + - dynamodb:ListTables + - dynamodb:Query + - dynamodb:Scan + - glue:GetTableVersions + - glue:GetPartitions + - glue:GetTables + - glue:GetTableVersion + - glue:GetDatabases + - glue:GetTable + - glue:GetPartition + - glue:GetDatabase + - athena:GetQueryExecution + Effect: Allow + Resource: '*' + Version: '2012-10-17' + #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy + #with one that is more restrictive and can only 'put' but not read,delete, or overwrite files. + - S3CrudPolicy: + BucketName: !Ref SpillBucket \ No newline at end of file diff --git a/athena-dynamodb/native-libs/libsqlite4java-linux-amd64-1.0.392.so b/athena-dynamodb/native-libs/libsqlite4java-linux-amd64-1.0.392.so new file mode 100644 index 0000000000..884615789b Binary files /dev/null and b/athena-dynamodb/native-libs/libsqlite4java-linux-amd64-1.0.392.so differ diff --git a/athena-dynamodb/native-libs/libsqlite4java-linux-i386-1.0.392.so b/athena-dynamodb/native-libs/libsqlite4java-linux-i386-1.0.392.so new file mode 100644 index 0000000000..15e7469e38 Binary files /dev/null and b/athena-dynamodb/native-libs/libsqlite4java-linux-i386-1.0.392.so differ diff --git a/athena-dynamodb/native-libs/libsqlite4java-osx-1.0.392.dylib b/athena-dynamodb/native-libs/libsqlite4java-osx-1.0.392.dylib new file mode 100644 index 0000000000..0276162614 Binary files /dev/null and b/athena-dynamodb/native-libs/libsqlite4java-osx-1.0.392.dylib differ diff --git a/athena-dynamodb/native-libs/sqlite4java-win32-x64-1.0.392.dll b/athena-dynamodb/native-libs/sqlite4java-win32-x64-1.0.392.dll new file mode 100644 index 0000000000..70d258f29b Binary files /dev/null and b/athena-dynamodb/native-libs/sqlite4java-win32-x64-1.0.392.dll differ diff --git a/athena-dynamodb/native-libs/sqlite4java-win32-x86-1.0.392.dll b/athena-dynamodb/native-libs/sqlite4java-win32-x86-1.0.392.dll new file mode 100644 index 0000000000..c988e5a697 Binary files /dev/null and b/athena-dynamodb/native-libs/sqlite4java-win32-x86-1.0.392.dll differ diff --git a/athena-dynamodb/pom.xml b/athena-dynamodb/pom.xml new file mode 100644 index 0000000000..bf5ec3c5a9 --- /dev/null +++ b/athena-dynamodb/pom.xml @@ -0,0 +1,99 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-dynamodb</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-dynamodb</artifactId>
+            <version>${aws-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>DynamoDBLocal</artifactId>
+            <version>1.11.477</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.hamcrest</groupId>
+            <artifactId>hamcrest</artifactId>
+            <version>2.1</version>
+            <scope>test</scope>
+        </dependency>
+    </dependencies>
+
+    <repositories>
+        <repository>
+            <id>dynamodb-local-oregon</id>
+            <name>DynamoDB Local Release Repository</name>
+            <url>https://s3-us-west-2.amazonaws.com/dynamodb-local/release</url>
+            <snapshots>
+                <enabled>false</enabled>
+            </snapshots>
+        </repository>
+    </repositories>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-dependency-plugin</artifactId>
+                <version>2.10</version>
+                <executions>
+                    <execution>
+                        <id>copy</id>
+                        <phase>test-compile</phase>
+                        <goals>
+                            <goal>copy-dependencies</goal>
+                        </goals>
+                        <configuration>
+                            <includeScope>test</includeScope>
+                            <includeTypes>so,dll,dylib</includeTypes>
+                            <outputDirectory>${project.basedir}/native-libs</outputDirectory>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBCompositeHandler.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBCompositeHandler.java new file mode 100644 index 0000000000..2278e3315c --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose DynamoDBMetadataHandler and DynamoDBRecordHandler. + */ +public class DynamoDBCompositeHandler + extends CompositeHandler +{ + public DynamoDBCompositeHandler() + { + super(new DynamoDBMetadataHandler(), new DynamoDBRecordHandler()); + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBMetadataHandler.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBMetadataHandler.java new file mode 100644 index 0000000000..c3db4e90e1 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBMetadataHandler.java @@ -0,0 +1,467 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.dynamodb; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.glue.GlueFieldLexer; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants; +import com.amazonaws.athena.connectors.dynamodb.model.DynamoDBTable; +import com.amazonaws.athena.connectors.dynamodb.resolver.DynamoDBTableResolver; +import com.amazonaws.athena.connectors.dynamodb.util.DDBPredicateUtils; +import com.amazonaws.athena.connectors.dynamodb.util.DDBTableUtils; +import com.amazonaws.athena.connectors.dynamodb.util.DDBTypeUtils; +import com.amazonaws.athena.connectors.dynamodb.util.IncrementingValueNameProducer; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDB; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder; +import com.amazonaws.services.dynamodbv2.document.ItemUtils; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.AWSGlueClientBuilder; +import com.amazonaws.services.glue.model.Database; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.util.json.Jackson; +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableList; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.LinkedHashSet; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.concurrent.TimeoutException; + +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.DEFAULT_SCHEMA; +import static 
com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_NAMES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_VALUES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.HASH_KEY_NAME_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.INDEX_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.NON_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.PARTITION_TYPE_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.QUERY_PARTITION_TYPE; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.RANGE_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.RANGE_KEY_NAME_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SCAN_PARTITION_TYPE; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_COUNT_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_ID_PROPERTY; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.TABLE_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.throttling.DynamoDBExceptionFilter.EXCEPTION_FILTER; + +/** + * Handles metadata requests for the Athena DynamoDB Connector. + *

+ * For more detail, please see the module's README.md; some notable characteristics of this class include: + *

+ * 1. Glue DataCatalog is used for schema information by default unless disabled. If disabled or the table
+ * is not found, it falls back to doing a small table scan and derives a schema from that.
+ * 2. Determines if the data splits will need to perform DDB Queries or Scans.
+ * 3. Splits up the hash key into distinct Query splits if possible, otherwise falls back to creating Scan splits.
+ * 4. Determines the best index to use, if any, when the available predicates align with Key Attributes.
+ * 5. Creates scan splits that support Parallel Scan and tries to choose the optimal number of splits.
+ * 6. Pushes down all other predicates into ready-to-use filter expressions to pass to DDB. + */ +public class DynamoDBMetadataHandler + extends GlueMetadataHandler +{ + @VisibleForTesting + static final int MAX_SPLITS_PER_REQUEST = 1000; + private static final Logger logger = LoggerFactory.getLogger(DynamoDBMetadataHandler.class); + static final String DYNAMODB = "dynamodb"; + private static final String sourceType = "ddb"; + private static final String GLUE_ENV = "disable_glue"; + // defines the value that should be present in the Glue Database URI to enable the DB for DynamoDB. + static final String DYNAMO_DB_FLAG = "dynamo-db-flag"; + // used to filter out Glue tables which lack indications of being used for DDB. + private static final TableFilter TABLE_FILTER = (Table table) -> table.getStorageDescriptor().getLocation().contains(DYNAMODB) + || (table.getParameters() != null && DYNAMODB.equals(table.getParameters().get("classification"))) + || (table.getStorageDescriptor().getParameters() != null && DYNAMODB.equals(table.getStorageDescriptor().getParameters().get("classification"))); + // used to filter out Glue databases which lack the DYNAMO_DB_FLAG in the URI. + private static final DatabaseFilter DB_FILTER = (Database database) -> (database.getLocationUri() != null && database.getLocationUri().contains(DYNAMO_DB_FLAG)) + || DEFAULT_SCHEMA.equals(database.getName()); + + private final ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + private final AmazonDynamoDB ddbClient; + private final AWSGlue glueClient; + private final DynamoDBTableResolver tableResolver; + + public DynamoDBMetadataHandler() + { + super((System.getenv(GLUE_ENV) == null || !Boolean.parseBoolean(System.getenv(GLUE_ENV))) ? AWSGlueClientBuilder.standard().build() : null, + sourceType); + ddbClient = AmazonDynamoDBClientBuilder.standard().build(); + glueClient = getAwsGlue(); + tableResolver = new DynamoDBTableResolver(invoker, ddbClient); + } + + @VisibleForTesting + DynamoDBMetadataHandler(EncryptionKeyFactory keyFactory, + AWSSecretsManager secretsManager, + AmazonAthena athena, + String spillBucket, + String spillPrefix, + AmazonDynamoDB ddbClient, + AWSGlue glueClient) + { + super(glueClient, keyFactory, secretsManager, athena, sourceType, spillBucket, spillPrefix); + this.glueClient = glueClient; + this.ddbClient = ddbClient; + this.tableResolver = new DynamoDBTableResolver(invoker, ddbClient); + } + + /** + * Since DynamoDB does not have "schemas" or "databases", this lists all the Glue databases (if not + * disabled) that contain {@value #DYNAMO_DB_FLAG} in their URIs . Otherwise returns just a "default" schema. + * + * @see GlueMetadataHandler + */ + @Override + public ListSchemasResponse doListSchemaNames(BlockAllocator allocator, ListSchemasRequest request) + throws Exception + { + if (glueClient != null) { + try { + return super.doListSchemaNames(allocator, request, DB_FILTER); + } + catch (RuntimeException e) { + logger.warn("doListSchemaNames: Unable to retrieve schemas from AWSGlue.", e); + } + } + + return new ListSchemasResponse(request.getCatalogName(), ImmutableList.of("default")); + } + + /** + * Lists all Glue tables (if not disabled) in the schema specified that indicate use for DynamoDB metadata. + * Indications for DynamoDB use in Glue are:
+ * 1. The top level table properties/parameters contains a key called "classification" with value {@value #DYNAMODB}.
+ * 2. Or the storage descriptor's location field contains {@value #DYNAMODB}.
+ * 3. Or the storage descriptor has a parameter called "classification" with value {@value #DYNAMODB}. + *

+ * If the specified schema is "default", this also returns an intersection with actual tables in DynamoDB. + * + * @see GlueMetadataHandler + */ + @Override + public ListTablesResponse doListTables(BlockAllocator allocator, ListTablesRequest request) + throws Exception + { + // LinkedHashSet for consistent ordering + Set<TableName> combinedTables = new LinkedHashSet<>(); + if (glueClient != null) { + try { + // does not validate that the tables are actually DDB tables + combinedTables.addAll(super.doListTables(allocator, request, TABLE_FILTER).getTables()); + } + catch (RuntimeException e) { + logger.warn("doListTables: Unable to retrieve tables from AWSGlue in database/schema {}", request.getSchemaName(), e); + } + } + + // add tables that may not be in Glue (if listing the default schema) + if (DynamoDBConstants.DEFAULT_SCHEMA.equals(request.getSchemaName())) { + combinedTables.addAll(tableResolver.listTables()); + } + return new ListTablesResponse(request.getCatalogName(), new ArrayList<>(combinedTables)); + } + + /** + * Fetches a table's schema from Glue DataCatalog if present and not disabled, otherwise falls + * back to doing a small table scan and derives a schema from that. + * + * @see GlueMetadataHandler + */ + @Override + public GetTableResponse doGetTable(BlockAllocator allocator, GetTableRequest request) + throws Exception + { + if (glueClient != null) { + try { + // does not validate that the table is actually a DDB table + return super.doGetTable(allocator, request); + } + catch (RuntimeException e) { + logger.debug("doGetTable: Unable to retrieve table {} from AWSGlue in database/schema {}", request.getTableName().getTableName(), request.getTableName().getSchemaName(), e); + } + } + + // ignore database/schema name since there are no databases/schemas in DDB + Schema schema = tableResolver.getTableSchema(request.getTableName().getTableName()); + return new GetTableResponse(request.getCatalogName(), request.getTableName(), schema); + } + + /** + * Generates a partition schema with metadata derived from available predicates. This metadata will be + * copied to splits in the #doGetSplits call. At this point it is determined whether we can partition + * by hash key or fall back to a full table scan. + * + * @see GlueMetadataHandler + */ + @Override + public void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request) + { + DynamoDBTable table = null; + try { + table = tableResolver.getTableMetadata(request.getTableName().getTableName()); + } + catch (TimeoutException e) { + throw new RuntimeException(e); + } + // add table name so we don't have to do case insensitive resolution again + partitionSchemaBuilder.addMetadata(TABLE_METADATA, table.getName()); + Map<String, ValueSet> summary = request.getConstraints().getSummary(); + DynamoDBTable index = DDBPredicateUtils.getBestIndexForPredicates(table, summary); + String hashKeyName = index.getHashKey(); + ValueSet hashKeyValueSet = summary.get(hashKeyName); + List<Object> hashKeyValues = (hashKeyValueSet != null) ? 
DDBPredicateUtils.getHashKeyAttributeValues(hashKeyValueSet) : Collections.emptyList(); + + Set<String> alreadyFilteredColumns = new HashSet<>(); + List<AttributeValue> valueAccumulator = new ArrayList<>(); + IncrementingValueNameProducer valueNameProducer = new IncrementingValueNameProducer(); + if (!hashKeyValues.isEmpty()) { + // can "partition" on hash key + partitionSchemaBuilder.addField(hashKeyName, hashKeyValueSet.getType()); + partitionSchemaBuilder.addMetadata(HASH_KEY_NAME_METADATA, hashKeyName); + alreadyFilteredColumns.add(hashKeyName); + partitionSchemaBuilder.addMetadata(PARTITION_TYPE_METADATA, QUERY_PARTITION_TYPE); + if (!table.equals(index)) { + partitionSchemaBuilder.addMetadata(INDEX_METADATA, index.getName()); + } + + // add range key filter if there is one + Optional<String> rangeKey = index.getRangeKey(); + if (rangeKey.isPresent()) { + String rangeKeyName = rangeKey.get(); + if (summary.containsKey(rangeKeyName)) { + String rangeKeyFilter = DDBPredicateUtils.generateSingleColumnFilter(rangeKeyName, summary.get(rangeKeyName), valueAccumulator, valueNameProducer); + partitionSchemaBuilder.addMetadata(RANGE_KEY_NAME_METADATA, rangeKeyName); + partitionSchemaBuilder.addMetadata(RANGE_KEY_FILTER_METADATA, rangeKeyFilter); + alreadyFilteredColumns.add(rangeKeyName); + } + } + } + else { + // always fall back to a scan + partitionSchemaBuilder.addField(SEGMENT_COUNT_METADATA, Types.MinorType.INT.getType()); + partitionSchemaBuilder.addMetadata(PARTITION_TYPE_METADATA, SCAN_PARTITION_TYPE); + } + + precomputeAdditionalMetadata(alreadyFilteredColumns, summary, valueAccumulator, valueNameProducer, partitionSchemaBuilder); + } + + /** + * Generates hash key partitions if possible, or generates a single partition with the heuristically + * determined optimal scan segment count specified inside of it. + * + * @see GlueMetadataHandler + */ + @Override + public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception + { + // TODO consider caching this repeated work in #enhancePartitionSchema + DynamoDBTable table = tableResolver.getTableMetadata(request.getTableName().getTableName()); + Map<String, ValueSet> summary = request.getConstraints().getSummary(); + DynamoDBTable index = DDBPredicateUtils.getBestIndexForPredicates(table, summary); + String hashKeyName = index.getHashKey(); + ValueSet hashKeyValueSet = summary.get(hashKeyName); + List<Object> hashKeyValues = (hashKeyValueSet != null) ? 
DDBPredicateUtils.getHashKeyAttributeValues(hashKeyValueSet) : Collections.emptyList();
+
+ if (!hashKeyValues.isEmpty()) {
+ for (Object hashKeyValue : hashKeyValues) {
+ blockWriter.writeRows((Block block, int rowNum) -> {
+ block.setValue(hashKeyName, rowNum, hashKeyValue);
+ // we add 1 partition per hash key value
+ return 1;
+ });
+ }
+ }
+ else {
+ // always fall back to a scan, need to return at least one partition so stick the segment count in it
+ int segmentCount = DDBTableUtils.getNumSegments(table.getProvisionedReadCapacity(), table.getApproxTableSizeInBytes());
+ blockWriter.writeRows((Block block, int rowNum) -> {
+ block.setValue(SEGMENT_COUNT_METADATA, rowNum, segmentCount);
+ return 1;
+ });
+ }
+ }
+
+ /*
+ Injects additional metadata into the partition schema, such as a non-key filter expression for additional DDB-side filtering
+ */
+ private void precomputeAdditionalMetadata(Set<String> columnsToIgnore, Map<String, ValueSet> predicates, List<AttributeValue> accumulator,
+ IncrementingValueNameProducer valueNameProducer, SchemaBuilder partitionsSchemaBuilder)
+ {
+ // precompute non-key filter
+ String filterExpression = DDBPredicateUtils.generateFilterExpression(columnsToIgnore, predicates, accumulator, valueNameProducer);
+ if (filterExpression != null) {
+ partitionsSchemaBuilder.addMetadata(NON_KEY_FILTER_METADATA, filterExpression);
+ }
+
+ if (!accumulator.isEmpty()) {
+ // add in mappings for aliased columns and value placeholders
+ Map<String, String> aliasedColumns = new HashMap<>();
+ for (String column : predicates.keySet()) {
+ aliasedColumns.put(DDBPredicateUtils.aliasColumn(column), column);
+ }
+ Map<String, AttributeValue> expressionValueMapping = new HashMap<>();
+ // IncrementingValueNameProducer is repeatable for simplicity
+ IncrementingValueNameProducer valueNameProducer2 = new IncrementingValueNameProducer();
+ for (AttributeValue value : accumulator) {
+ expressionValueMapping.put(valueNameProducer2.getNext(), value);
+ }
+ partitionsSchemaBuilder.addMetadata(EXPRESSION_NAMES_METADATA, Jackson.toJsonString(aliasedColumns));
+ partitionsSchemaBuilder.addMetadata(EXPRESSION_VALUES_METADATA, Jackson.toJsonString(expressionValueMapping));
+ }
+ }
+
+ /**
+ * Copies data from partitions and creates splits, serializing as necessary for later calls to RecordHandler#readWithConstraint.
+ * This API supports pagination.
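+ * <p>
+ * A worked example of the paging behavior (assuming, for illustration only, that
+ * MAX_SPLITS_PER_REQUEST is 1,000): with 2,500 query partitions, the first call returns splits
+ * for partitions 0-999 plus continuation token "999", the next call resumes at partition 1,000
+ * and returns token "1999", and the final call returns the remaining 500 splits with a null token.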
+ * + * @see GlueMetadataHandler + */ + @Override + public GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request) + { + int partitionContd = decodeContinuationToken(request); + Set splits = new HashSet<>(); + Block partitions = request.getPartitions(); + Map partitionMetadata = partitions.getSchema().getCustomMetadata(); + // copy all partition metadata to the split + Map splitMetadata = new HashMap<>(partitionMetadata); + String partitionType = partitionMetadata.get(PARTITION_TYPE_METADATA); + if (partitionType == null) { + throw new IllegalStateException(String.format("No metadata %s defined in Schema %s", PARTITION_TYPE_METADATA, partitions.getSchema())); + } + if (QUERY_PARTITION_TYPE.equals(partitionType)) { + String hashKeyName = partitionMetadata.get(HASH_KEY_NAME_METADATA); + FieldReader hashKeyValueReader = partitions.getFieldReader(hashKeyName); + // one split per hash key value (since one DDB query can only take one hash key value) + for (int curPartition = partitionContd; curPartition < partitions.getRowCount(); curPartition++) { + hashKeyValueReader.setPosition(curPartition); + + //Every split must have a unique location if we wish to spill to avoid failures + SpillLocation spillLocation = makeSpillLocation(request); + + Object hashKeyValue = DDBTypeUtils.convertArrowTypeIfNecessary(hashKeyValueReader.readObject()); + String hashKeyValueJSON = Jackson.toJsonString(ItemUtils.toAttributeValue(hashKeyValue)); + splitMetadata.put(hashKeyName, hashKeyValueJSON); + + splitMetadata.put(SEGMENT_COUNT_METADATA, String.valueOf(partitions.getRowCount())); + + splits.add(new Split(spillLocation, makeEncryptionKey(), splitMetadata)); + + if (splits.size() == MAX_SPLITS_PER_REQUEST && curPartition != partitions.getRowCount() - 1) { + // We've reached max page size and this is not the last partition + // so send the page back + return new GetSplitsResponse(request.getCatalogName(), + splits, + encodeContinuationToken(curPartition)); + } + } + return new GetSplitsResponse(request.getCatalogName(), splits, null); + } + else if (SCAN_PARTITION_TYPE.equals(partitionType)) { + FieldReader segmentCountReader = partitions.getFieldReader(SEGMENT_COUNT_METADATA); + int segmentCount = segmentCountReader.readInteger(); + for (int curPartition = partitionContd; curPartition < segmentCount; curPartition++) { + SpillLocation spillLocation = makeSpillLocation(request); + + splitMetadata.put(SEGMENT_ID_PROPERTY, String.valueOf(curPartition)); + splitMetadata.put(SEGMENT_COUNT_METADATA, String.valueOf(segmentCount)); + + splits.add(new Split(spillLocation, makeEncryptionKey(), splitMetadata)); + + if (splits.size() == MAX_SPLITS_PER_REQUEST && curPartition != segmentCount - 1) { + // We've reached max page size and this is not the last partition + // so send the page back + return new GetSplitsResponse(request.getCatalogName(), + splits, + encodeContinuationToken(curPartition)); + } + } + return new GetSplitsResponse(request.getCatalogName(), splits, null); + } + else { + throw new IllegalStateException("Unexpected partition type " + partitionType); + } + } + + /** + * @see GlueMetadataHandler + */ + @Override + protected Field convertField(String name, String glueType) + { + return GlueFieldLexer.lex(name, glueType); + } + + /* + Used to handle paginated requests. Returns the partition number to resume with. 
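+ For example, a continuation token of "999" means partitions 0 through 999 already produced
+ splits, so this returns 1000, the next partition to process.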
+ */ + private int decodeContinuationToken(GetSplitsRequest request) + { + if (request.hasContinuationToken()) { + return Integer.valueOf(request.getContinuationToken()) + 1; + } + + //No continuation token present + return 0; + } + + /* + Used to create pagination tokens by encoding the number of the next partition to process. + */ + private String encodeContinuationToken(int partition) + { + return String.valueOf(partition); + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBRecordHandler.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBRecordHandler.java new file mode 100644 index 0000000000..e4b5856969 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBRecordHandler.java @@ -0,0 +1,327 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb; + +import com.amazonaws.AmazonWebServiceRequest; +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.FieldResolver; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connectors.dynamodb.util.DDBPredicateUtils; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDB; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder; +import com.amazonaws.services.dynamodbv2.document.ItemUtils; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.amazonaws.services.dynamodbv2.model.QueryRequest; +import com.amazonaws.services.dynamodbv2.model.QueryResult; +import com.amazonaws.services.dynamodbv2.model.ScanRequest; +import com.amazonaws.services.dynamodbv2.model.ScanResult; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.util.json.Jackson; +import com.fasterxml.jackson.core.type.TypeReference; +import com.google.common.cache.CacheBuilder; +import com.google.common.cache.CacheLoader; +import com.google.common.cache.LoadingCache; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.concurrent.ExecutionException; +import 
java.util.concurrent.TimeoutException; +import java.util.concurrent.atomic.AtomicLong; +import java.util.concurrent.atomic.AtomicReference; + +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_NAMES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_VALUES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.HASH_KEY_NAME_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.INDEX_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.NON_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.RANGE_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_COUNT_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_ID_PROPERTY; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.TABLE_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.throttling.DynamoDBExceptionFilter.EXCEPTION_FILTER; +import static com.google.common.base.Preconditions.checkArgument; + +/** + * Handles data read record requests for the Athena DynamoDB Connector. + *

+ * For more detail, please see the module's README.md. Some notable characteristics of this class include:
+ *

+ * 1. Reads and maps DynamoDB data for a specific split. The split can represent either a single hash key
+ * value or a table scan segment.
+ * 2. Attempts to push down all predicates into DynamoDB to reduce read cost and bytes over the wire. + */ +public class DynamoDBRecordHandler + extends RecordHandler +{ + private static final Logger logger = LoggerFactory.getLogger(DynamoDBRecordHandler.class); + private static final String sourceType = "ddb"; + + private static final String HASH_KEY_VALUE_ALIAS = ":hashKeyValue"; + + private static final TypeReference> STRING_MAP_TYPE_REFERENCE = new TypeReference>() {}; + private static final TypeReference> ATTRIBUTE_VALUE_MAP_TYPE_REFERENCE = new TypeReference>() {}; + + private final LoadingCache invokerCache = CacheBuilder.newBuilder().build( + new CacheLoader() { + @Override + public ThrottlingInvoker load(String tableName) + throws Exception + { + return ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + } + }); + private final AmazonDynamoDB ddbClient; + + public DynamoDBRecordHandler() + { + super(sourceType); + this.ddbClient = AmazonDynamoDBClientBuilder.standard().build(); + } + + @VisibleForTesting + DynamoDBRecordHandler(AmazonDynamoDB ddbClient, AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, String sourceType) + { + super(amazonS3, secretsManager, athena, sourceType); + this.ddbClient = ddbClient; + } + + /** + * Reads data from DynamoDB by submitting either a Query or a Scan, depending + * on the type of split, and includes any filters specified in the split. + * + * @see RecordHandler + */ + @Override + protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) + throws ExecutionException + { + Split split = recordsRequest.getSplit(); + // use the property instead of the request table name because of case sensitivity + String tableName = split.getProperty(TABLE_METADATA); + invokerCache.get(tableName).setBlockSpiller(spiller); + Iterator> itemIterator = getIterator(split, tableName); + + long numRows = 0; + AtomicLong numResultRows = new AtomicLong(0); + while (itemIterator.hasNext()) { + if (!queryStatusChecker.isQueryRunning()) { + // we can stop processing because the query waiting for this data has already terminated + return; + } + numRows++; + spiller.writeRows((Block block, int rowNum) -> { + Map item = itemIterator.next(); + if (item == null) { + // this can happen regardless of the hasNext() check above for the very first iteration since itemIterator + // had not made any DDB calls yet and there may be zero items returned when it does + return 0; + } + + boolean matched = true; + numResultRows.getAndIncrement(); + for (Field nextField : recordsRequest.getSchema().getFields()) { + Object value = ItemUtils.toSimpleValue(item.get(nextField.getName())); + Types.MinorType fieldType = Types.getMinorTypeForArrowType(nextField.getType()); + try { + switch (fieldType) { + case LIST: + // DDB may return Set so coerce to List + List valueAsList = value != null ? 
new ArrayList((Collection) value) : null; + matched &= block.offerComplexValue(nextField.getName(), + rowNum, + FieldResolver.DEFAULT, + valueAsList); + break; + case STRUCT: + matched &= block.offerComplexValue(nextField.getName(), + rowNum, + (Field field, Object val) -> ((Map) val).get(field.getName()), value); + break; + default: + matched &= block.offerValue(nextField.getName(), rowNum, value); + break; + } + + if (!matched) { + return 0; + } + } + catch (Exception ex) { + throw new RuntimeException("Error while processing field " + nextField.getName(), ex); + } + } + return 1; + }); + } + + logger.info("readWithConstraint: numRows[{}] numResultRows[{}]", numRows, numResultRows.get()); + } + + /* + Converts a split into a Query or Scan request + */ + private AmazonWebServiceRequest buildReadRequest(Split split, String tableName) + { + validateExpectedMetadata(split.getProperties()); + // prepare filters + String rangeKeyFilter = split.getProperty(RANGE_KEY_FILTER_METADATA); + String nonKeyFilter = split.getProperty(NON_KEY_FILTER_METADATA); + Map expressionAttributeNames = new HashMap<>(); + Map expressionAttributeValues = new HashMap<>(); + if (rangeKeyFilter != null || nonKeyFilter != null) { + try { + expressionAttributeNames.putAll(Jackson.getObjectMapper().readValue(split.getProperty(EXPRESSION_NAMES_METADATA), STRING_MAP_TYPE_REFERENCE)); + expressionAttributeValues.putAll(Jackson.getObjectMapper().readValue(split.getProperty(EXPRESSION_VALUES_METADATA), ATTRIBUTE_VALUE_MAP_TYPE_REFERENCE)); + } + catch (IOException e) { + throw new RuntimeException(e); + } + } + + boolean isQuery = split.getProperty(SEGMENT_ID_PROPERTY) == null; + + if (isQuery) { + // prepare key condition expression + String indexName = split.getProperty(INDEX_METADATA); + String hashKeyName = split.getProperty(HASH_KEY_NAME_METADATA); + String hashKeyAlias = DDBPredicateUtils.aliasColumn(hashKeyName); + String keyConditionExpression = hashKeyAlias + " = " + HASH_KEY_VALUE_ALIAS; + if (rangeKeyFilter != null) { + keyConditionExpression += " AND " + rangeKeyFilter; + } + expressionAttributeNames.put(hashKeyAlias, hashKeyName); + expressionAttributeValues.put(HASH_KEY_VALUE_ALIAS, Jackson.fromJsonString(split.getProperty(hashKeyName), AttributeValue.class)); + + return new QueryRequest() + .withTableName(tableName) + .withIndexName(indexName) + .withKeyConditionExpression(keyConditionExpression) + .withFilterExpression(nonKeyFilter) + .withExpressionAttributeNames(expressionAttributeNames) + .withExpressionAttributeValues(expressionAttributeValues); + } + else { + int segmentId = Integer.parseInt(split.getProperty(SEGMENT_ID_PROPERTY)); + int segmentCount = Integer.parseInt(split.getProperty(SEGMENT_COUNT_METADATA)); + + return new ScanRequest() + .withTableName(tableName) + .withSegment(segmentId) + .withTotalSegments(segmentCount) + .withFilterExpression(nonKeyFilter) + .withExpressionAttributeNames(expressionAttributeNames.isEmpty() ? null : expressionAttributeNames) + .withExpressionAttributeValues(expressionAttributeValues.isEmpty() ? 
null : expressionAttributeValues); + } + } + + /* + Creates an iterator that can iterate through a Query or Scan, sending paginated requests as necessary + */ + private Iterator> getIterator(Split split, String tableName) + { + AmazonWebServiceRequest request = buildReadRequest(split, tableName); + return new Iterator>() { + AtomicReference> lastKeyEvaluated = new AtomicReference<>(); + AtomicReference>> currentPageIterator = new AtomicReference<>(); + + @Override + public boolean hasNext() + { + return currentPageIterator.get() == null + || currentPageIterator.get().hasNext() + || lastKeyEvaluated.get() != null; + } + + @Override + public Map next() + { + if (currentPageIterator.get() != null && currentPageIterator.get().hasNext()) { + return currentPageIterator.get().next(); + } + Iterator> iterator; + try { + if (request instanceof QueryRequest) { + QueryRequest paginatedRequest = ((QueryRequest) request).withExclusiveStartKey(lastKeyEvaluated.get()); + if (logger.isDebugEnabled()) { + logger.debug("Invoking DDB with Query request: {}", request); + } + QueryResult queryResult = invokerCache.get(tableName).invoke(() -> ddbClient.query(paginatedRequest)); + lastKeyEvaluated.set(queryResult.getLastEvaluatedKey()); + iterator = queryResult.getItems().iterator(); + } + else { + ScanRequest paginatedRequest = ((ScanRequest) request).withExclusiveStartKey(lastKeyEvaluated.get()); + if (logger.isDebugEnabled()) { + logger.debug("Invoking DDB with Scan request: {}", request); + } + ScanResult scanResult = invokerCache.get(tableName).invoke(() -> ddbClient.scan(paginatedRequest)); + lastKeyEvaluated.set(scanResult.getLastEvaluatedKey()); + iterator = scanResult.getItems().iterator(); + } + } + catch (TimeoutException | ExecutionException e) { + throw new RuntimeException(e); + } + currentPageIterator.set(iterator); + if (iterator.hasNext()) { + return iterator.next(); + } + else { + return null; + } + } + }; + } + + /* + Validates that the required metadata is present for split processing + */ + private void validateExpectedMetadata(Map metadata) + { + boolean isQuery = !metadata.containsKey(SEGMENT_ID_PROPERTY); + if (isQuery) { + checkArgument(metadata.containsKey(HASH_KEY_NAME_METADATA), "Split missing expected metadata [%s]", HASH_KEY_NAME_METADATA); + } + else { + checkArgument(metadata.containsKey(SEGMENT_COUNT_METADATA), "Split missing expected metadata [%s]", SEGMENT_COUNT_METADATA); + } + if (metadata.containsKey(RANGE_KEY_FILTER_METADATA) || metadata.containsKey(NON_KEY_FILTER_METADATA)) { + checkArgument(metadata.containsKey(EXPRESSION_NAMES_METADATA), "Split missing expected metadata [%s] when filters are present", EXPRESSION_NAMES_METADATA); + checkArgument(metadata.containsKey(EXPRESSION_VALUES_METADATA), "Split missing expected metadata [%s] when filters are present", EXPRESSION_VALUES_METADATA); + } + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/constants/DynamoDBConstants.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/constants/DynamoDBConstants.java new file mode 100644 index 0000000000..62587d76af --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/constants/DynamoDBConstants.java @@ -0,0 +1,40 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.constants; + +public final class DynamoDBConstants +{ + private DynamoDBConstants() {} + + public static final String DEFAULT_SCHEMA = "default"; + public static final String PARTITION_TYPE_METADATA = "partitionType"; + public static final String QUERY_PARTITION_TYPE = "query"; + public static final String SCAN_PARTITION_TYPE = "scan"; + public static final String SEGMENT_COUNT_METADATA = "segmentCount"; + public static final String SEGMENT_ID_PROPERTY = "segmentId"; + public static final String TABLE_METADATA = "table"; + public static final String INDEX_METADATA = "index"; + public static final String HASH_KEY_NAME_METADATA = "hashKeyName"; + public static final String RANGE_KEY_NAME_METADATA = "rangeKeyName"; + public static final String RANGE_KEY_FILTER_METADATA = "rangeKeyFilter"; + public static final String NON_KEY_FILTER_METADATA = "nonKeyFilter"; + public static final String EXPRESSION_NAMES_METADATA = "expressionAttributeNames"; + public static final String EXPRESSION_VALUES_METADATA = "expressionAttributeValues"; +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/model/DynamoDBTable.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/model/DynamoDBTable.java new file mode 100644 index 0000000000..29b9c96ba8 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/model/DynamoDBTable.java @@ -0,0 +1,122 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.model; + +import com.amazonaws.services.dynamodbv2.model.AttributeDefinition; +import com.google.common.collect.ImmutableList; + +import java.util.List; +import java.util.Objects; +import java.util.Optional; + +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Strings.isNullOrEmpty; +import static java.util.Objects.requireNonNull; + +/** + * A model class to store table metadata in an easy to consume manner. 
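+ * <p>
+ * A minimal construction sketch (all values, and the attributeDefinitions list, are illustrative
+ * only rather than taken from a real table):
+ * <pre>{@code
+ * DynamoDBTable orders = new DynamoDBTable(
+ *         "Orders",                  // table name
+ *         "customer_id",             // hash key attribute
+ *         Optional.of("order_date"), // range key attribute
+ *         attributeDefinitions,      // known AttributeDefinitions
+ *         ImmutableList.of(),        // secondary indexes
+ *         1_073_741_824L,            // approx table size in bytes (1 GiB)
+ *         1_000_000L,                // approx item count
+ *         4_000L);                   // provisioned read capacity
+ * }</pre>
+ * Note that {@link #equals} and {@link #hashCode} are based on the table name alone.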
+ */ +public class DynamoDBTable +{ + private final String name; + private final String hashKey; + private final Optional rangeKey; + private final List knownAttributeDefinitions; + private final List indexes; + private final long approxTableSizeInBytes; + private final long approxItemCount; + private final long provisionedReadCapacity; + + public DynamoDBTable( + String name, + String hashKey, + Optional rangeKey, + List knownAttributeDefinitions, + List indexes, + long approxTableSizeInBytes, + long approxItemCount, + long provisionedReadCapacity) + { + checkArgument(!isNullOrEmpty(name), "name is null or is empty"); + this.hashKey = requireNonNull(hashKey, "hashKey is null"); + this.rangeKey = requireNonNull(rangeKey, "rangeKey is null"); + this.knownAttributeDefinitions = requireNonNull(knownAttributeDefinitions, "knownAttributeDefinitions is null"); + this.name = requireNonNull(name, "name is null"); + this.indexes = ImmutableList.copyOf(requireNonNull(indexes, "indexes is null")); + this.approxTableSizeInBytes = approxTableSizeInBytes; + this.approxItemCount = approxItemCount; + this.provisionedReadCapacity = provisionedReadCapacity; + } + + public String getName() + { + return name; + } + + public String getHashKey() + { + return hashKey; + } + + public Optional getRangeKey() + { + return rangeKey; + } + + public List getKnownAttributeDefinitions() + { + return knownAttributeDefinitions; + } + + public List getIndexes() + { + return indexes; + } + + public long getApproxTableSizeInBytes() + { + return approxTableSizeInBytes; + } + + public long getProvisionedReadCapacity() + { + return provisionedReadCapacity; + } + + @Override + public int hashCode() + { + return Objects.hash(name); + } + + @Override + public boolean equals(Object obj) + { + if (this == obj) { + return true; + } + if ((obj == null) || (getClass() != obj.getClass())) { + return false; + } + + DynamoDBTable other = (DynamoDBTable) obj; + return Objects.equals(this.name, other.name); + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/resolver/DynamoDBTableResolver.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/resolver/DynamoDBTableResolver.java new file mode 100644 index 0000000000..d1eb433d14 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/resolver/DynamoDBTableResolver.java @@ -0,0 +1,163 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.resolver; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connectors.dynamodb.model.DynamoDBTable; +import com.amazonaws.athena.connectors.dynamodb.util.DDBTableUtils; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDB; +import com.amazonaws.services.dynamodbv2.model.ListTablesRequest; +import com.amazonaws.services.dynamodbv2.model.ListTablesResult; +import com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException; +import com.google.common.collect.ArrayListMultimap; +import com.google.common.collect.Multimap; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.List; +import java.util.Locale; +import java.util.Optional; +import java.util.concurrent.TimeoutException; + +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.DEFAULT_SCHEMA; +import static com.google.common.collect.ImmutableList.toImmutableList; + +/** + * This class helps with resolving the differences in casing between DynamoDB and Presto. Presto expects all + * databases, tables, and columns to be lower case. This class allows us to resolve DynamoDB tables + * which may have captial letters in them without issue. It does so by fetching all table names and doing + * a case insensitive search over them. It will first try to do a targeted get to reduce the penalty for + * tables which don't have capitalization. + * + * TODO add caching + */ +public class DynamoDBTableResolver +{ + private static final Logger logger = LoggerFactory.getLogger(DynamoDBTableResolver.class); + + private AmazonDynamoDB ddbClient; + // used to handle Throttling events using an AIMD strategy for congestion control. + private ThrottlingInvoker invoker; + + public DynamoDBTableResolver(ThrottlingInvoker invoker, AmazonDynamoDB ddbClient) + { + this.invoker = invoker; + this.ddbClient = ddbClient; + } + + /** + * Fetches the list of tables from DynamoDB via paginated ListTables calls + * + * @return the list of tables in DynamoDB + */ + public List listTables() + throws TimeoutException + { + List tables = new ArrayList<>(); + String nextToken = null; + do { + ListTablesRequest ddbRequest = new ListTablesRequest() + .withExclusiveStartTableName(nextToken); + ListTablesResult result = invoker.invoke(() -> ddbClient.listTables(ddbRequest)); + tables.addAll(result.getTableNames().stream().map(table -> new TableName(DEFAULT_SCHEMA, table)).collect(toImmutableList())); + nextToken = result.getLastEvaluatedTableName(); + } + while (nextToken != null); + return tables; + } + + /** + * Fetches table schema by first doing a Scan on the given table name, falling back to case insensitive + * resolution if the table isn't found. Delegates actual schema derivation to {@link + * DDBTableUtils#peekTableForSchema}. 
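+ * <p>
+ * For example (table names illustrative), asking for "orders" first attempts a Scan against
+ * "orders" directly; if DynamoDB raises ResourceNotFoundException and exactly one actual table
+ * lower-cases to "orders" (say, "Orders"), the schema is derived from that table instead.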
+ * + * @param tableName the case insensitive table name + * @return the table's schema + */ + public Schema getTableSchema(String tableName) + throws TimeoutException + { + try { + return DDBTableUtils.peekTableForSchema(tableName, invoker, ddbClient); + } + catch (ResourceNotFoundException e) { + Optional caseInsensitiveMatch = tryCaseInsensitiveSearch(tableName); + if (caseInsensitiveMatch.isPresent()) { + return DDBTableUtils.peekTableForSchema(caseInsensitiveMatch.get(), invoker, ddbClient); + } + else { + throw e; + } + } + } + + /** + * Fetches table metadata by first doing a DescribeTable on the given table table, falling back to case + * insensitive resolution if the table isn't found. + * + * @param tableName the case insensitive table name + * @return the table's metadata + */ + public DynamoDBTable getTableMetadata(String tableName) + throws TimeoutException + { + try { + return DDBTableUtils.getTable(tableName, invoker, ddbClient); + } + catch (ResourceNotFoundException e) { + Optional caseInsensitiveMatch = tryCaseInsensitiveSearch(tableName); + if (caseInsensitiveMatch.isPresent()) { + return DDBTableUtils.getTable(caseInsensitiveMatch.get(), invoker, ddbClient); + } + else { + throw e; + } + } + } + + /* + Performs a case insensitive table search by listing the tables, mapping them to their lowercase transformation, + and then mapping the given tableName back to a unique table. To prevent ambiguity, an IllegalStateException is + thrown if multiple tables map to the given tableName. + */ + private Optional tryCaseInsensitiveSearch(String tableName) + throws TimeoutException + { + logger.info("Table {} not found. Falling back to case insensitive search.", tableName); + Multimap lowerCaseNameMapping = ArrayListMultimap.create(); + for (TableName nextTableName : listTables()) { + lowerCaseNameMapping.put(nextTableName.getTableName().toLowerCase(Locale.ENGLISH), nextTableName.getTableName()); + } + Collection mappedNames = lowerCaseNameMapping.get(tableName); + if (mappedNames.size() > 1) { + throw new IllegalStateException(String.format("Multiple tables resolved from case insensitive name %s: %s", tableName, mappedNames)); + } + else if (mappedNames.size() == 1) { + return Optional.of(mappedNames.iterator().next()); + } + else { + return Optional.empty(); + } + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/throttling/DynamoDBExceptionFilter.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/throttling/DynamoDBExceptionFilter.java new file mode 100644 index 0000000000..9f289f6aa3 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/throttling/DynamoDBExceptionFilter.java @@ -0,0 +1,42 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.throttling; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.services.dynamodbv2.model.LimitExceededException; +import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughputExceededException; + +/** + * Used by {@link ThrottlingInvoker} to determine which DynamoDB exceptions are thrown for throttling. + */ +public class DynamoDBExceptionFilter + implements ThrottlingInvoker.ExceptionFilter +{ + public static final ThrottlingInvoker.ExceptionFilter EXCEPTION_FILTER = new DynamoDBExceptionFilter(); + + private DynamoDBExceptionFilter() {} + + @Override + public boolean isMatch(Exception ex) + { + return ex instanceof LimitExceededException + || ex instanceof ProvisionedThroughputExceededException; + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBPredicateUtils.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBPredicateUtils.java new file mode 100644 index 0000000000..16f1c3e1ad --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBPredicateUtils.java @@ -0,0 +1,297 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.util; + +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connectors.dynamodb.model.DynamoDBTable; +import com.amazonaws.services.dynamodbv2.document.ItemUtils; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.google.common.base.Joiner; +import com.google.common.collect.ImmutableList; + +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.stream.Stream; + +import static com.google.common.base.Preconditions.checkState; +import static com.google.common.collect.ImmutableList.toImmutableList; +import static com.google.common.collect.Iterables.getOnlyElement; + +/** + * Provides utility methods relating to predicate handling. + */ +public class DDBPredicateUtils +{ + private DDBPredicateUtils() {} + + private static final Joiner AND_JOINER = Joiner.on(" AND "); + private static final Joiner COMMA_JOINER = Joiner.on(","); + private static final Joiner OR_JOINER = Joiner.on(" OR "); + + /** + * Attempts to pick an optimal index (if any) from the given predicates. Returns the original table if + * one was not found. 
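+ * <p>
+ * For example (names illustrative), given a table with hash key "pk" and range key "sk" plus a
+ * GSI "byCustomer" with hash key "customer_id": an equality predicate on "customer_id" alone
+ * selects the GSI, while predicates on both "pk" and "sk" select the base table because both of
+ * its key columns are constrained.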
+ * + * @param table the original table + * @param predicates the predicates + * @return the optimal index if found, otherwise the original table + */ + public static DynamoDBTable getBestIndexForPredicates(DynamoDBTable table, Map predicates) + { + Set columnNames = predicates.keySet(); + + ImmutableList.Builder hashKeyMatchesBuilder = ImmutableList.builder(); + // if the original table has a hash key matching a predicate, start with that + if (columnNames.contains(table.getHashKey())) { + hashKeyMatchesBuilder.add(table); + } + + // get indices with hash keys that match a predicate + table.getIndexes().stream() + .filter(index -> columnNames.contains(index.getHashKey()) && !getHashKeyAttributeValues(predicates.get(index.getHashKey())).isEmpty()) + .forEach(hashKeyMatchesBuilder::add); + List hashKeyMatches = hashKeyMatchesBuilder.build(); + + // if the original table has a range key matching a predicate, start with that + ImmutableList.Builder rangeKeyMatchesBuilder = ImmutableList.builder(); + if (table.getRangeKey().isPresent() && columnNames.contains(table.getRangeKey().get())) { + rangeKeyMatchesBuilder.add(table); + } + + // get indices with range keys that match a predicate + table.getIndexes().stream() + .filter(index -> index.getRangeKey().isPresent() && columnNames.contains(index.getRangeKey().get())) + .forEach(rangeKeyMatchesBuilder::add); + List rangeKeyMatches = rangeKeyMatchesBuilder.build(); + + // return first index where both hash and range key can be specified with predicates + for (DynamoDBTable index : hashKeyMatches) { + if (rangeKeyMatches.contains(index)) { + return index; + } + } + // else return the first index with a hash key predicate, or the original table if there are none + return hashKeyMatches.isEmpty() ? table : hashKeyMatches.get(0); + } + + /** + * Generates a list of distinct values from the given {@link ValueSet} or an empty list if not possible. + * + * @param valueSet the value set to generate from + * @return the list of distinct values + */ + public static List getHashKeyAttributeValues(ValueSet valueSet) + { + if (valueSet.isSingleValue()) { + return ImmutableList.of(valueSet.getSingleValue()); + } + else if (valueSet instanceof SortedRangeSet) { + List ranges = valueSet.getRanges().getOrderedRanges(); + ImmutableList.Builder attributeValues = ImmutableList.builder(); + for (Range range : ranges) { + if (range.isSingleValue()) { + attributeValues.add(range.getSingleValue()); + } + else { + // DDB Query can't handle non-equality conditions for the hash key + return ImmutableList.of(); + } + } + return attributeValues.build(); + } + else if (valueSet instanceof EquatableValueSet) { + EquatableValueSet equatableValueSet = (EquatableValueSet) valueSet; + if (equatableValueSet.isWhiteList()) { + ImmutableList.Builder values = ImmutableList.builder(); + for (int pos = 0; pos < equatableValueSet.getValueBlock().getRowCount(); pos++) { + values.add(equatableValueSet.getValue(pos)); + } + return values.build(); + } + } + + return ImmutableList.of(); + } + + /** + * Generates a simple alias for a column to satisfy filter expressions. + * + * @see + * https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ExpressionAttributeNames.html + * @param columnName the input column name + * @return the aliased column name + */ + public static String aliasColumn(String columnName) + { + return "#" + columnName; + } + + /* + Adds a value to the value accumulator. 
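+ For example, binding the Java value 123 appends an AttributeValue with N = "123", since the
+ value is first converted by DDBTypeUtils.convertArrowTypeIfNecessary and then by
+ ItemUtils.toAttributeValue.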
+ */ + private static void bindValue(Object value, List accumulator) + { + accumulator.add(ItemUtils.toAttributeValue(DDBTypeUtils.convertArrowTypeIfNecessary(value))); + } + + /* + Adds a value to the value accumulator and also returns an expression with given operands. + */ + private static String toPredicate(String columnName, String operator, Object value, List accumulator, String valueName) + { + bindValue(value, accumulator); + return columnName + " " + operator + " " + valueName; + } + + /** + * Generates a filter expression for a single column given a {@link ValueSet} predicate for that column. + * + * @param originalColumnName the column name + * @param predicate the associated predicate + * @param accumulator the value accumulator to add values to + * @param valueNameProducer the value name producer to generate value aliases with + * @return the generated filter expression + */ + public static String generateSingleColumnFilter(String originalColumnName, ValueSet predicate, List accumulator, + IncrementingValueNameProducer valueNameProducer) + { + String columnName = aliasColumn(originalColumnName); + + if (predicate.isNone()) { + return "(attribute_not_exists(" + columnName + ") OR " + toPredicate(columnName, "=", null, accumulator, valueNameProducer.getNext()) + ")"; + } + + if (predicate.isAll()) { + return "(attribute_exists(" + columnName + ") AND " + toPredicate(columnName, "<>", null, accumulator, valueNameProducer.getNext()) + ")"; + } + + List disjuncts = new ArrayList<>(); + List singleValues = new ArrayList<>(); + boolean isWhitelist = true; + if (predicate instanceof SortedRangeSet) { + for (Range range : predicate.getRanges().getOrderedRanges()) { + checkState(!range.isAll()); // Already checked + if (range.isSingleValue()) { + singleValues.add(range.getLow().getValue()); + } + else { + List rangeConjuncts = new ArrayList<>(); + if (!range.getLow().isLowerUnbounded()) { + switch (range.getLow().getBound()) { + case ABOVE: + rangeConjuncts.add(toPredicate(columnName, ">", range.getLow().getValue(), accumulator, valueNameProducer.getNext())); + break; + case EXACTLY: + rangeConjuncts.add(toPredicate(columnName, ">=", range.getLow().getValue(), accumulator, valueNameProducer.getNext())); + break; + case BELOW: + throw new IllegalArgumentException("Low marker should never use BELOW bound"); + default: + throw new AssertionError("Unhandled lower bound: " + range.getLow().getBound()); + } + } + if (!range.getHigh().isUpperUnbounded()) { + switch (range.getHigh().getBound()) { + case ABOVE: + throw new IllegalArgumentException("High marker should never use ABOVE bound"); + case EXACTLY: + rangeConjuncts.add(toPredicate(columnName, "<=", range.getHigh().getValue(), accumulator, valueNameProducer.getNext())); + break; + case BELOW: + rangeConjuncts.add(toPredicate(columnName, "<", range.getHigh().getValue(), accumulator, valueNameProducer.getNext())); + break; + default: + throw new AssertionError("Unhandled upper bound: " + range.getHigh().getBound()); + } + } + // If rangeConjuncts is null, then the range was ALL, which should already have been checked for + checkState(!rangeConjuncts.isEmpty()); + disjuncts.add("(" + AND_JOINER.join(rangeConjuncts) + ")"); + } + } + } + else { + EquatableValueSet equatablePredicate = (EquatableValueSet) predicate; + isWhitelist = equatablePredicate.isWhiteList(); + long valueCount = equatablePredicate.getValueBlock().getRowCount(); + for (int i = 0; i < valueCount; i++) { + singleValues.add(equatablePredicate.getValue(i)); + } + } + + // 
Add back all of the possible single values either as an equality or an IN predicate + if (singleValues.size() == 1) { + disjuncts.add(toPredicate(columnName, isWhitelist ? "=" : "<>", getOnlyElement(singleValues), accumulator, valueNameProducer.getNext())); + } + else if (singleValues.size() > 1) { + for (Object value : singleValues) { + bindValue(value, accumulator); + } + String values = COMMA_JOINER.join(Stream.generate(valueNameProducer::getNext).limit(singleValues.size()).collect(toImmutableList())); + disjuncts.add((isWhitelist ? "" : "NOT ") + columnName + " IN (" + values + ")"); + } + + // at this point we should have some disjuncts + checkState(!disjuncts.isEmpty()); + + // add nullability disjuncts + if (predicate.isNullAllowed()) { + disjuncts.add("attribute_not_exists(" + columnName + ") OR " + toPredicate(columnName, "=", null, accumulator, valueNameProducer.getNext())); + } + + // DDB doesn't like redundant parentheses + if (disjuncts.size() == 1) { + return disjuncts.get(0); + } + + return "(" + OR_JOINER.join(disjuncts) + ")"; + } + + /** + * Generates a combined filter expression for the given predicates. + * + * @param columnsToIgnore the columns to not generate filters for + * @param predicates the map of columns to predicates + * @param accumulator the value accumulator to add values to + * @param valueNameProducer the value name producer to generate value aliases with + * @return the combined filter expression + */ + public static String generateFilterExpression(Set columnsToIgnore, Map predicates, List accumulator, + IncrementingValueNameProducer valueNameProducer) + { + ImmutableList.Builder builder = ImmutableList.builder(); + for (Map.Entry predicate : predicates.entrySet()) { + String columnName = predicate.getKey(); + if (!columnsToIgnore.contains(columnName)) { + builder.add(generateSingleColumnFilter(columnName, predicate.getValue(), accumulator, valueNameProducer)); + } + } + ImmutableList filters = builder.build(); + if (!filters.isEmpty()) { + return AND_JOINER.join(filters); + } + return null; + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBTableUtils.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBTableUtils.java new file mode 100644 index 0000000000..25700e5e14 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBTableUtils.java @@ -0,0 +1,216 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.util; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connectors.dynamodb.model.DynamoDBTable; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDB; +import com.amazonaws.services.dynamodbv2.document.ItemUtils; +import com.amazonaws.services.dynamodbv2.model.AttributeDefinition; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.amazonaws.services.dynamodbv2.model.DescribeTableRequest; +import com.amazonaws.services.dynamodbv2.model.GlobalSecondaryIndexDescription; +import com.amazonaws.services.dynamodbv2.model.KeySchemaElement; +import com.amazonaws.services.dynamodbv2.model.KeyType; +import com.amazonaws.services.dynamodbv2.model.LocalSecondaryIndexDescription; +import com.amazonaws.services.dynamodbv2.model.ScanRequest; +import com.amazonaws.services.dynamodbv2.model.ScanResult; +import com.amazonaws.services.dynamodbv2.model.TableDescription; +import com.google.common.collect.ImmutableList; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.Set; +import java.util.concurrent.TimeoutException; + +/** + * Provides utility methods relating to table handling. + */ +public final class DDBTableUtils +{ + private static final Logger logger = LoggerFactory.getLogger(DDBTableUtils.class); + + // for scan segmentation calculation + private static final long PSUEDO_CAPACITY_FOR_ON_DEMAND = 40_000; + private static final int MAX_SCAN_SEGMENTS = 1000000; + private static final int MIN_SCAN_SEGMENTS = 1; + private static final long MAX_BYTES_PER_SEGMENT = 1024L * 1024L * 1024L; + private static final double MIN_IO_PER_SEGMENT = 100.0; + private static final int SCHEMA_INFERENCE_NUM_RECORDS = 4; + + private DDBTableUtils() {} + + /** + * Fetches metadata for a DynamoDB table + * + * @param tableName the (case sensitive) table name + * @param invoker the ThrottlingInvoker to call DDB with + * @param ddbClient the DDB client to use + * @return the table metadata + */ + public static DynamoDBTable getTable(String tableName, ThrottlingInvoker invoker, AmazonDynamoDB ddbClient) + throws TimeoutException + { + DescribeTableRequest request = new DescribeTableRequest().withTableName(tableName); + TableDescription table = invoker.invoke(() -> ddbClient.describeTable(request).getTable()); + + KeyNames keys = getKeys(table.getKeySchema()); + + // get data statistics + long approxTableSizeInBytes = table.getTableSizeBytes(); + long approxItemCount = table.getItemCount(); + final long provisionedReadCapacity = table.getProvisionedThroughput() != null ? table.getProvisionedThroughput().getReadCapacityUnits() : PSUEDO_CAPACITY_FOR_ON_DEMAND; + + // get secondary indexes + List localSecondaryIndexes = table.getLocalSecondaryIndexes() != null ? table.getLocalSecondaryIndexes() : ImmutableList.of(); + List globalSecondaryIndexes = table.getGlobalSecondaryIndexes() != null ? 
table.getGlobalSecondaryIndexes() : ImmutableList.of(); + ImmutableList.Builder indices = ImmutableList.builder(); + localSecondaryIndexes.forEach(i -> { + KeyNames indexKeys = getKeys(i.getKeySchema()); + indices.add(new DynamoDBTable(i.getIndexName(), indexKeys.getHashKey(), indexKeys.getRangeKey(), table.getAttributeDefinitions(), ImmutableList.of(), i.getIndexSizeBytes(), i.getItemCount(), + provisionedReadCapacity)); + }); + globalSecondaryIndexes.forEach(i -> { + KeyNames indexKeys = getKeys(i.getKeySchema()); + indices.add(new DynamoDBTable(i.getIndexName(), indexKeys.getHashKey(), indexKeys.getRangeKey(), table.getAttributeDefinitions(), ImmutableList.of(), i.getIndexSizeBytes(), i.getItemCount(), + i.getProvisionedThroughput() != null ? i.getProvisionedThroughput().getReadCapacityUnits() : PSUEDO_CAPACITY_FOR_ON_DEMAND)); + }); + + return new DynamoDBTable(tableName, keys.getHashKey(), keys.getRangeKey(), table.getAttributeDefinitions(), indices.build(), approxTableSizeInBytes, approxItemCount, provisionedReadCapacity); + } + + /* + Parses the key attributes from the given list of KeySchemaElements + */ + private static KeyNames getKeys(List keys) + { + String hashKey = null; + String rangeKey = null; + for (KeySchemaElement key : keys) { + if (key.getKeyType().equals(KeyType.HASH.toString())) { + hashKey = key.getAttributeName(); + } + else if (key.getKeyType().equals(KeyType.RANGE.toString())) { + rangeKey = key.getAttributeName(); + } + } + return new KeyNames(hashKey, rangeKey); + } + + /** + * Derives an Arrow {@link Schema} for the given table by performing a small table scan and mapping the returned + * attribute values' types to Arrow types. If the table is empty, only attributes found in the table's metadata + * are added to the return schema. + * + * @param tableName the table to derive a schema for + * @param invoker the ThrottlingInvoker to call DDB with + * @param ddbClient the DDB client to use + * @return the table's derived schema + */ + public static Schema peekTableForSchema(String tableName, ThrottlingInvoker invoker, AmazonDynamoDB ddbClient) + throws TimeoutException + { + ScanRequest scanRequest = new ScanRequest().withTableName(tableName).withLimit(SCHEMA_INFERENCE_NUM_RECORDS); + ScanResult scanResult = invoker.invoke(() -> ddbClient.scan(scanRequest)); + List> items = scanResult.getItems(); + Set discoveredColumns = new HashSet<>(); + SchemaBuilder schemaBuilder = new SchemaBuilder(); + if (!items.isEmpty()) { + for (Map item : items) { + for (Map.Entry column : item.entrySet()) { + if (!discoveredColumns.contains(column.getKey()) && !Boolean.TRUE.equals(column.getValue().getNULL())) { + schemaBuilder.addField(DDBTypeUtils.getArrowField(column.getKey(), ItemUtils.toSimpleValue(column.getValue()))); + discoveredColumns.add(column.getKey()); + } + } + } + } + else { + // there's no items, so use any attributes defined in the table metadata + DynamoDBTable table = getTable(tableName, invoker, ddbClient); + for (AttributeDefinition attributeDefinition : table.getKnownAttributeDefinitions()) { + schemaBuilder.addField(DDBTypeUtils.getArrowFieldFromDDBType(attributeDefinition.getAttributeName(), attributeDefinition.getAttributeType())); + } + } + return schemaBuilder.build(); + } + + /** + * This hueristic determines an optimal segment count to perform Parallel Scans with using the table's capacity + * and size. 
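+ * <p>
+ * Worked example using the constants above: a 10 GiB table provisioned with 4,000 RCUs yields
+ * max(10 GiB / 1 GiB, 4,000 / 100) = max(10, 40) = 40 segments, which already lies within the
+ * [1, 1,000,000] bounds.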
+ * + * @see + * https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Scan.html#Scan.ParallelScan + * @param tableNormalizedReadThroughput the provisioned read capacity for the table + * @param currentTableSizeBytes the table's approximate size in bytes + * @return an optimal segment count + */ + public static int getNumSegments(long tableNormalizedReadThroughput, long currentTableSizeBytes) + { + // Segments for size + int numSegmentsForSize = (int) (currentTableSizeBytes / MAX_BYTES_PER_SEGMENT); + logger.debug("Would use {} segments for size", numSegmentsForSize); + + // Segments for total throughput + int numSegmentsForThroughput = (int) (tableNormalizedReadThroughput / MIN_IO_PER_SEGMENT); + logger.debug("Would use {} segments for throughput", numSegmentsForThroughput); + + // Take the larger + int numSegments = Math.max(numSegmentsForSize, numSegmentsForThroughput); + + // Fit to bounds + numSegments = Math.min(numSegments, MAX_SCAN_SEGMENTS); + numSegments = Math.max(numSegments, MIN_SCAN_SEGMENTS); + + logger.debug("Using computed number of segments: {}", numSegments); + return numSegments; + } + + /* + Simple convenient holder for key data + */ + private static class KeyNames + { + private String hashKey; + private String rangeKey; + + private KeyNames(String hashKey, String rangeKey) + { + this.hashKey = hashKey; + this.rangeKey = rangeKey; + } + + private String getHashKey() + { + return hashKey; + } + + private Optional getRangeKey() + { + return Optional.ofNullable(rangeKey); + } + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBTypeUtils.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBTypeUtils.java new file mode 100644 index 0000000000..34f8eb582c --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/DDBTypeUtils.java @@ -0,0 +1,161 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.util; + +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.FieldType; +import org.apache.arrow.vector.util.Text; +import org.joda.time.DateTimeZone; +import org.joda.time.LocalDateTime; + +import java.math.BigDecimal; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.Set; + +/** + * Provides utility methods relating to type handling. 
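+ * <p>
+ * For example (values illustrative), getArrowField("price", new BigDecimal("9.99")) yields a
+ * nullable Decimal(38, 9) field, while getArrowField("tags", Collections.singletonList("a"))
+ * yields a nullable LIST field with a VARCHAR child inferred from the first element.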
+ */ +public final class DDBTypeUtils +{ + // DDB attribute "types" + private static final String STRING = "S"; + private static final String NUMBER = "N"; + private static final String BOOLEAN = "BOOL"; + private static final String BINARY = "B"; + private static final String STRING_SET = "SS"; + private static final String NUMBER_SET = "NS"; + private static final String BINARY_SET = "BS"; + private static final String LIST = "L"; + private static final String MAP = "M"; + + private DDBTypeUtils() {} + + /** + * Converts a given field's Java type to a corresponding Arrow type. + * + * @param key the name of the field + * @param value the valie of the field + * @return the converted Arrow field + */ + public static Field getArrowField(String key, Object value) + { + if (value instanceof String) { + return new Field(key, FieldType.nullable(Types.MinorType.VARCHAR.getType()), null); + } + else if (value instanceof byte[]) { + return new Field(key, FieldType.nullable(Types.MinorType.VARBINARY.getType()), null); + } + else if (value instanceof Boolean) { + return new Field(key, FieldType.nullable(Types.MinorType.BIT.getType()), null); + } + else if (value instanceof BigDecimal) { + return new Field(key, FieldType.nullable(new ArrowType.Decimal(38, 9)), null); + } + else if (value instanceof List || value instanceof Set) { + Field child; + if (((Collection) value).isEmpty()) { + try { + Object subVal = ((Collection) value).getClass() + .getTypeParameters()[0].getGenericDeclaration().newInstance(); + child = getArrowField("", subVal); + } + catch (IllegalAccessException | InstantiationException ex) { + throw new RuntimeException(ex); + } + } + else { + child = getArrowField("", ((Collection) value).iterator().next()); + } + return new Field(key, FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(child)); + } + else if (value instanceof Map) { + List children = new ArrayList<>(); + Map doc = (Map) value; + for (String childKey : doc.keySet()) { + Object childVal = doc.get(childKey); + Field child = getArrowField(childKey, childVal); + children.add(child); + } + return new Field(key, FieldType.nullable(Types.MinorType.STRUCT.getType()), children); + } + + String className = value.getClass() == null ? "null" : value.getClass().getName(); + throw new RuntimeException("Unknown type[" + className + "] for field[" + key + "]"); + } + + /** + * Converts certain Arrow POJOs to Java POJOs to make downstream conversion easier. + * + * @param object the input object + * @return the converted-to object if convertible, otherwise the original object + */ + public static Object convertArrowTypeIfNecessary(Object object) + { + if (object instanceof Text) { + return object.toString(); + } + else if (object instanceof LocalDateTime) { + return ((LocalDateTime) object).toDateTime(DateTimeZone.UTC).getMillis(); + } + return object; + } + + /** + * Converts from DynamoDB Attribute Type to Arrow type. 
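+ * For example, "S" maps to VARCHAR, "N" to Decimal(38, 9), "BOOL" to BIT, "B" to VARBINARY, and
+ * the set types "SS", "NS", and "BS" map to LIST fields with the corresponding scalar child type.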
+ * @param attributeName the DDB Attribute name + * @param attributeType the DDB Attribute type + * @return the converted-to Arrow Field + */ + public static Field getArrowFieldFromDDBType(String attributeName, String attributeType) + { + switch (attributeType) { + case STRING: + return new Field(attributeName, FieldType.nullable(Types.MinorType.VARCHAR.getType()), null); + case NUMBER: + return new Field(attributeName, FieldType.nullable(new ArrowType.Decimal(38, 9)), null); + case BOOLEAN: + return new Field(attributeName, FieldType.nullable(Types.MinorType.BIT.getType()), null); + case BINARY: + return new Field(attributeName, FieldType.nullable(Types.MinorType.VARBINARY.getType()), null); + case STRING_SET: + return new Field(attributeName, FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(new Field("", FieldType.nullable(Types.MinorType.VARCHAR.getType()), null))); + case NUMBER_SET: + return new Field(attributeName, FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(new Field("", FieldType.nullable(new ArrowType.Decimal(38, 9)), null))); + case BINARY_SET: + return new Field(attributeName, FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(new Field("", FieldType.nullable(Types.MinorType.VARBINARY.getType()), null))); + case LIST: + return new Field(attributeName, FieldType.nullable(Types.MinorType.LIST.getType()), null); + case MAP: + return new Field(attributeName, FieldType.nullable(Types.MinorType.STRUCT.getType()), null); + default: + throw new RuntimeException("Unknown type[" + attributeType + "] for field[" + attributeName + "]"); + } + } +} diff --git a/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/IncrementingValueNameProducer.java b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/IncrementingValueNameProducer.java new file mode 100644 index 0000000000..bfda748837 --- /dev/null +++ b/athena-dynamodb/src/main/java/com/amazonaws/athena/connectors/dynamodb/util/IncrementingValueNameProducer.java @@ -0,0 +1,41 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb.util; + +/** + * A simple, repeatable name producer used to alias values in DynamoDB filter expressions. + * + * @see + * https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Expressions.ExpressionAttributeValues.html + */ +public class IncrementingValueNameProducer +{ + private int i = 0; + + /** + * Returns the next alias. 
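+ * For example, successive calls yield ":v0", ":v1", ":v2", matching the ":vN" value placeholders
+ * used in DynamoDB expression attribute values.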
+ * + * @return the next alias + */ + public String getNext() + { + return ":v" + i++; + } +} diff --git a/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBMetadataHandlerTest.java b/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBMetadataHandlerTest.java new file mode 100644 index 0000000000..211f00db30 --- /dev/null +++ b/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBMetadataHandlerTest.java @@ -0,0 +1,435 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb; + +import com.amazonaws.AmazonServiceException; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.dynamodbv2.document.ItemUtils; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.model.GetTablesResult; +import com.amazonaws.services.glue.model.StorageDescriptor; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.util.json.Jackson; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import 
com.google.common.collect.Iterables; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.joda.time.Days; +import org.joda.time.LocalDateTime; +import org.joda.time.MutableDateTime; +import org.junit.After; +import org.junit.Before; +import org.junit.Ignore; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_NAMES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_VALUES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.HASH_KEY_NAME_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.INDEX_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.NON_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.PARTITION_TYPE_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.QUERY_PARTITION_TYPE; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.RANGE_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.RANGE_KEY_NAME_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SCAN_PARTITION_TYPE; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_COUNT_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_ID_PROPERTY; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.DEFAULT_SCHEMA; +import static com.amazonaws.athena.connectors.dynamodb.DynamoDBMetadataHandler.MAX_SPLITS_PER_REQUEST; +import static org.hamcrest.MatcherAssert.assertThat; +import static org.hamcrest.Matchers.equalTo; +import static org.hamcrest.Matchers.is; +import static org.junit.Assert.assertEquals; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.when; + +/** + * Glue logic is tested by GlueMetadataHandlerTest in SDK + */ +@RunWith(MockitoJUnitRunner.class) +public class DynamoDBMetadataHandlerTest + extends TestBase +{ + private static final Logger logger = LoggerFactory.getLogger(DynamoDBMetadataHandlerTest.class); + + @Mock + private AWSGlue glueClient; + + @Mock + private AWSSecretsManager secretsManager; + + @Mock + private AmazonAthena athena; + + private DynamoDBMetadataHandler handler; + + private BlockAllocator allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + handler = new DynamoDBMetadataHandler(new LocalKeyFactory(), secretsManager, athena, "spillBucket", "spillPrefix", ddbClient, glueClient); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void doListSchemaNamesDynamo() + throws Exception + { + logger.info("doListSchemaNamesDynamo: enter"); + + when(glueClient.getDatabases(any())).thenThrow(new AmazonServiceException("")); + + ListSchemasRequest req = new ListSchemasRequest(TEST_IDENTITY, TEST_QUERY_ID, 
TEST_CATALOG_NAME); + ListSchemasResponse res = handler.doListSchemaNames(allocator, req); + + logger.info("doListSchemas - {}", res.getSchemas()); + + assertThat(new ArrayList<>(res.getSchemas()), equalTo(Collections.singletonList(DEFAULT_SCHEMA))); + + logger.info("doListSchemaNamesDynamo: exit"); + } + + @Test + public void doListTablesGlueAndDynamo() + throws Exception + { + logger.info("doListTablesGlueAndDynamo: enter"); + + List tableNames = new ArrayList<>(); + tableNames.add("table1"); + tableNames.add("table2"); + tableNames.add("table3"); + + GetTablesResult mockResult = new GetTablesResult(); + List

<Table> tableList = new ArrayList<>();
+        tableList.add(new Table().withName("table1")
+                .withParameters(ImmutableMap.of("classification", "dynamodb"))
+                .withStorageDescriptor(new StorageDescriptor()
+                        .withLocation("some.location")));
+        tableList.add(new Table().withName("table2")
+                .withParameters(ImmutableMap.of())
+                .withStorageDescriptor(new StorageDescriptor()
+                        .withLocation("some.location")
+                        .withParameters(ImmutableMap.of("classification", "dynamodb"))));
+        tableList.add(new Table().withName("table3")
+                .withParameters(ImmutableMap.of())
+                .withStorageDescriptor(new StorageDescriptor()
+                        .withLocation("arn:aws:dynamodb:us-east-1:012345678910:table/table3")));
+        tableList.add(new Table().withName("notADynamoTable").withParameters(ImmutableMap.of()).withStorageDescriptor(
+                new StorageDescriptor().withParameters(ImmutableMap.of()).withLocation("some_location")));
+        mockResult.setTableList(tableList);
+        when(glueClient.getTables(any())).thenReturn(mockResult);
+
+        ListTablesRequest req = new ListTablesRequest(TEST_IDENTITY, TEST_QUERY_ID, TEST_CATALOG_NAME, DEFAULT_SCHEMA);
+        ListTablesResponse res = handler.doListTables(allocator, req);
+
+        logger.info("doListTables - {}", res.getTables());
+
+        List<TableName> expectedTables = tableNames.stream().map(table -> new TableName(DEFAULT_SCHEMA, table)).collect(Collectors.toList());
+        expectedTables.add(TEST_TABLE_NAME);
+        expectedTables.add(new TableName(DEFAULT_SCHEMA, "Test_table2"));
+
+        assertThat(new HashSet<>(res.getTables()), equalTo(new HashSet<>(expectedTables)));
+
+        logger.info("doListTablesGlueAndDynamo: exit");
+    }
+
+    @Test
+    public void doGetTable()
+            throws Exception
+    {
+        logger.info("doGetTable: enter");
+
+        when(glueClient.getTable(any())).thenThrow(new AmazonServiceException(""));
+
+        GetTableRequest req = new GetTableRequest(TEST_IDENTITY, TEST_QUERY_ID, TEST_CATALOG_NAME, TEST_TABLE_NAME);
+        GetTableResponse res = handler.doGetTable(allocator, req);
+
+        logger.info("doGetTable - {}", res.getSchema());
+
+        assertThat(res.getTableName().getSchemaName(), equalTo(DEFAULT_SCHEMA));
+        assertThat(res.getTableName().getTableName(), equalTo(TEST_TABLE));
+        assertThat(res.getSchema().getFields().size(), equalTo(10));
+
+        logger.info("doGetTable: exit");
+    }
+
+    @Test
+    public void doGetEmptyTable()
+            throws Exception
+    {
+        logger.info("doGetEmptyTable: enter");
+
+        when(glueClient.getTable(any())).thenThrow(new AmazonServiceException(""));
+
+        GetTableRequest req = new GetTableRequest(TEST_IDENTITY, TEST_QUERY_ID, TEST_CATALOG_NAME, TEST_TABLE_2_NAME);
+        GetTableResponse res = handler.doGetTable(allocator, req);
+
+        logger.info("doGetEmptyTable - {}", res.getSchema());
+
+        assertThat(res.getTableName(), equalTo(TEST_TABLE_2_NAME));
+        assertThat(res.getSchema().getFields().size(), equalTo(2));
+
+        logger.info("doGetEmptyTable: exit");
+    }
+
+    @Test
+    public void testCaseInsensitiveResolve()
+            throws Exception
+    {
+        logger.info("doGetTable: enter");
+
+        when(glueClient.getTable(any())).thenThrow(new AmazonServiceException(""));
+
+        GetTableRequest req = new GetTableRequest(TEST_IDENTITY, TEST_QUERY_ID, TEST_CATALOG_NAME, TEST_TABLE_2_NAME);
+        GetTableResponse res = handler.doGetTable(allocator, req);
+
+        logger.info("doGetTable - {}", res.getSchema());
+
+        assertThat(res.getTableName(), equalTo(TEST_TABLE_2_NAME));
+
+        logger.info("doGetTable: exit");
+    }
+
+    @Test
+    public void doGetTableLayoutScan()
+            throws Exception
+    {
+        logger.info("doGetTableLayoutScan: enter");
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("col_3",
EquatableValueSet.newBuilder(allocator, new ArrowType.Bool(), true, true) + .add(true).build()); + + GetTableLayoutRequest req = new GetTableLayoutRequest(TEST_IDENTITY, + TEST_QUERY_ID, + TEST_CATALOG_NAME, + new TableName(TEST_CATALOG_NAME, TEST_TABLE), + new Constraints(constraintsMap), + SchemaBuilder.newBuilder().build(), + Collections.EMPTY_SET); + + GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req); + + logger.info("doGetTableLayout schema - {}", res.getPartitions().getSchema()); + logger.info("doGetTableLayout partitions - {}", res.getPartitions()); + + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(PARTITION_TYPE_METADATA), equalTo(SCAN_PARTITION_TYPE)); + // no hash key constraints, so look for segment count column + assertThat(res.getPartitions().getSchema().findField(SEGMENT_COUNT_METADATA) != null, is(true)); + assertThat(res.getPartitions().getRowCount(), equalTo(1)); + + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(NON_KEY_FILTER_METADATA), equalTo("(#col_3 = :v0 OR attribute_not_exists(#col_3) OR #col_3 = :v1)")); + + ImmutableMap expressionNames = ImmutableMap.of("#col_3", "col_3"); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(EXPRESSION_NAMES_METADATA), equalTo(Jackson.toJsonString(expressionNames))); + + ImmutableMap expressionValues = ImmutableMap.of(":v0", ItemUtils.toAttributeValue(true), ":v1", ItemUtils.toAttributeValue(null)); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(EXPRESSION_VALUES_METADATA), equalTo(Jackson.toJsonString(expressionValues))); + + logger.info("doGetTableLayoutScan: exit"); + } + + @Test + public void doGetTableLayoutQueryIndex() + throws Exception + { + logger.info("doGetTableLayoutQueryIndex: enter"); + Map constraintsMap = new HashMap<>(); + SortedRangeSet.Builder dateValueSet = SortedRangeSet.newBuilder(Types.MinorType.DATEDAY.getType(), false); + SortedRangeSet.Builder timeValueSet = SortedRangeSet.newBuilder(Types.MinorType.DATEMILLI.getType(), false); + LocalDateTime dateTime = new LocalDateTime().withYear(2019).withMonthOfYear(9).withDayOfMonth(23).withHourOfDay(11).withMinuteOfHour(18).withSecondOfMinute(37); + MutableDateTime epoch = new MutableDateTime(); + epoch.setDate(0); //Set to Epoch time + dateValueSet.add(Range.equal(allocator, Types.MinorType.DATEDAY.getType(), Days.daysBetween(epoch, dateTime.toDateTime()).getDays())); + LocalDateTime dateTime2 = dateTime.plusHours(26); + dateValueSet.add(Range.equal(allocator, Types.MinorType.DATEDAY.getType(), Days.daysBetween(epoch, dateTime2.toDateTime()).getDays())); + long startTime = dateTime.toDateTime().getMillis(); + long endTime = dateTime2.toDateTime().getMillis(); + timeValueSet.add(Range.range(allocator, Types.MinorType.DATEMILLI.getType(), startTime, true, + endTime, true)); + constraintsMap.put("col_4", dateValueSet.build()); + constraintsMap.put("col_5", timeValueSet.build()); + + GetTableLayoutResponse res = handler.doGetTableLayout(allocator, new GetTableLayoutRequest(TEST_IDENTITY, + TEST_QUERY_ID, + TEST_CATALOG_NAME, + TEST_TABLE_NAME, + new Constraints(constraintsMap), + SchemaBuilder.newBuilder().build(), + Collections.EMPTY_SET)); + + logger.info("doGetTableLayout schema - {}", res.getPartitions().getSchema()); + logger.info("doGetTableLayout partitions - {}", res.getPartitions()); + + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(PARTITION_TYPE_METADATA), equalTo(QUERY_PARTITION_TYPE)); + 
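+        // The col_4/col_5 constraints line up with the hash and range keys of the "test_index" GSI
+        // created in TestBase, so the layout should target that index with a Query rather than a Scan.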
assertThat(res.getPartitions().getSchema().getCustomMetadata().containsKey(INDEX_METADATA), is(true)); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(INDEX_METADATA), equalTo("test_index")); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(HASH_KEY_NAME_METADATA), equalTo("col_4")); + assertThat(res.getPartitions().getRowCount(), equalTo(2)); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(RANGE_KEY_NAME_METADATA), equalTo("col_5")); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(RANGE_KEY_FILTER_METADATA), equalTo("(#col_5 >= :v0 AND #col_5 <= :v1)")); + + ImmutableMap expressionNames = ImmutableMap.of("#col_4", "col_4", "#col_5", "col_5"); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(EXPRESSION_NAMES_METADATA), equalTo(Jackson.toJsonString(expressionNames))); + + ImmutableMap expressionValues = ImmutableMap.of(":v0", ItemUtils.toAttributeValue(startTime), ":v1", ItemUtils.toAttributeValue(endTime)); + assertThat(res.getPartitions().getSchema().getCustomMetadata().get(EXPRESSION_VALUES_METADATA), equalTo(Jackson.toJsonString(expressionValues))); + + logger.info("doGetTableLayoutQueryIndex: exit"); + } + + @Test + public void doGetSplitsScan() + throws Exception + { + logger.info("doGetSplitsScan: enter"); + + GetTableLayoutResponse layoutResponse = handler.doGetTableLayout(allocator, new GetTableLayoutRequest(TEST_IDENTITY, + TEST_QUERY_ID, + TEST_CATALOG_NAME, + TEST_TABLE_NAME, + new Constraints(ImmutableMap.of()), + SchemaBuilder.newBuilder().build(), + Collections.EMPTY_SET)); + + GetSplitsRequest req = new GetSplitsRequest(TEST_IDENTITY, + TEST_QUERY_ID, + TEST_CATALOG_NAME, + TEST_TABLE_NAME, + layoutResponse.getPartitions(), + ImmutableList.of(), + new Constraints(new HashMap<>()), + null); + logger.info("doGetSplits: req[{}]", req); + + MetadataResponse rawResponse = handler.doGetSplits(allocator, req); + assertThat(rawResponse.getRequestType(), equalTo(MetadataRequestType.GET_SPLITS)); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + String continuationToken = response.getContinuationToken(); + + logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size()); + + assertThat(continuationToken == null, is(true)); + + Split split = Iterables.getOnlyElement(response.getSplits()); + assertThat(split.getProperty(SEGMENT_ID_PROPERTY), equalTo("0")); + + logger.info("doGetSplitsScan: exit"); + } + + @Test + public void doGetSplitsQuery() + throws Exception + { + logger.info("doGetSplitsQuery: enter"); + + Map constraintsMap = new HashMap<>(); + EquatableValueSet.Builder valueSet = EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false); + for (int i = 0; i < 2000; i++) { + valueSet.add("test_str_" + i); + } + constraintsMap.put("col_0", valueSet.build()); + GetTableLayoutResponse layoutResponse = handler.doGetTableLayout(allocator, new GetTableLayoutRequest(TEST_IDENTITY, + TEST_QUERY_ID, + TEST_CATALOG_NAME, + TEST_TABLE_NAME, + new Constraints(constraintsMap), + SchemaBuilder.newBuilder().build(), + Collections.EMPTY_SET)); + + GetSplitsRequest req = new GetSplitsRequest(TEST_IDENTITY, + TEST_QUERY_ID, + TEST_CATALOG_NAME, + TEST_TABLE_NAME, + layoutResponse.getPartitions(), + ImmutableList.of("col_0"), + new Constraints(new HashMap<>()), + null); + logger.info("doGetSplits: req[{}]", req); + + GetSplitsResponse response = handler.doGetSplits(allocator, req); + 
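+        // Each of the 2000 distinct hash-key values becomes its own split, which exceeds
+        // MAX_SPLITS_PER_REQUEST, so the first page should be full and carry a continuation token.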
assertThat(response.getRequestType(), equalTo(MetadataRequestType.GET_SPLITS)); + + String continuationToken = response.getContinuationToken(); + + logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size()); + + assertThat(continuationToken, equalTo(String.valueOf(MAX_SPLITS_PER_REQUEST - 1))); + assertThat(response.getSplits().size(), equalTo(MAX_SPLITS_PER_REQUEST)); + + response = handler.doGetSplits(allocator, new GetSplitsRequest(req, continuationToken)); + + logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size()); + + assertThat(response.getContinuationToken(), equalTo(null)); + assertThat(response.getSplits().size(), equalTo(MAX_SPLITS_PER_REQUEST)); + + logger.info("doGetSplitsQuery: exit"); + } +} diff --git a/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBRecordHandlerTest.java b/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBRecordHandlerTest.java new file mode 100644 index 0000000000..02faa0f142 --- /dev/null +++ b/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/DynamoDBRecordHandlerTest.java @@ -0,0 +1,210 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.dynamodb; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableMap; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Map; +import java.util.UUID; + +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_NAMES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.EXPRESSION_VALUES_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.HASH_KEY_NAME_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.NON_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.RANGE_KEY_FILTER_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_COUNT_METADATA; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.SEGMENT_ID_PROPERTY; +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.TABLE_METADATA; +import static com.amazonaws.services.dynamodbv2.document.ItemUtils.toAttributeValue; +import static com.amazonaws.util.json.Jackson.toJsonString; +import static org.junit.Assert.assertEquals; +import static org.junit.Assert.assertTrue; +import static org.mockito.Mockito.mock; + +public class DynamoDBRecordHandlerTest + extends TestBase +{ + + private static final Logger logger = LoggerFactory.getLogger(DynamoDBRecordHandlerTest.class); + + private static final SpillLocation SPILL_LOCATION = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + + private BlockAllocator allocator; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + private DynamoDBRecordHandler handler; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + handler = new DynamoDBRecordHandler(ddbClient, mock(AmazonS3.class), mock(AWSSecretsManager.class), mock(AmazonAthena.class), "source_type"); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void testReadScanSplit() + throws Exception + { + logger.info("testReadScanSplit: enter"); + Map expressionNames = ImmutableMap.of("#col_6", "col_6"); + Map 
expressionValues = ImmutableMap.of(":v0", toAttributeValue(0), ":v1", toAttributeValue(1)); + Split split = Split.newBuilder(SPILL_LOCATION, keyFactory.create()) + .add(TABLE_METADATA, TEST_TABLE) + .add(SEGMENT_ID_PROPERTY, "0") + .add(SEGMENT_COUNT_METADATA, "1") + .add(NON_KEY_FILTER_METADATA, "NOT #col_6 IN (:v0,:v1)") + .add(EXPRESSION_NAMES_METADATA, toJsonString(expressionNames)) + .add(EXPRESSION_VALUES_METADATA, toJsonString(expressionValues)) + .build(); + + ReadRecordsRequest request = new ReadRecordsRequest( + TEST_IDENTITY, + TEST_CATALOG_NAME, + TEST_QUERY_ID, + TEST_TABLE_NAME, + schema, + split, + new Constraints(ImmutableMap.of()), + 100_000_000_000L, // too big to spill + 100_000_000_000L); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("testReadScanSplit: rows[{}]", response.getRecordCount()); + + assertEquals(992, response.getRecords().getRowCount()); + logger.info("testReadScanSplit: {}", BlockUtils.rowToString(response.getRecords(), 0)); + + logger.info("testReadScanSplit: exit"); + } + + @Test + public void testReadQuerySplit() + throws Exception + { + logger.info("testReadQuerySplit: enter"); + Map expressionNames = ImmutableMap.of("#col_1", "col_1"); + Map expressionValues = ImmutableMap.of(":v0", toAttributeValue(1)); + Split split = Split.newBuilder(SPILL_LOCATION, keyFactory.create()) + .add(TABLE_METADATA, TEST_TABLE) + .add(HASH_KEY_NAME_METADATA, "col_0") + .add("col_0", toJsonString(toAttributeValue("test_str_0"))) + .add(RANGE_KEY_FILTER_METADATA, "#col_1 >= :v0") + .add(EXPRESSION_NAMES_METADATA, toJsonString(expressionNames)) + .add(EXPRESSION_VALUES_METADATA, toJsonString(expressionValues)) + .build(); + + ReadRecordsRequest request = new ReadRecordsRequest( + TEST_IDENTITY, + TEST_CATALOG_NAME, + TEST_QUERY_ID, + TEST_TABLE_NAME, + schema, + split, + new Constraints(ImmutableMap.of()), + 100_000_000_000L, // too big to spill + 100_000_000_000L); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("testReadQuerySplit: rows[{}]", response.getRecordCount()); + + assertEquals(2, response.getRecords().getRowCount()); + logger.info("testReadQuerySplit: {}", BlockUtils.rowToString(response.getRecords(), 0)); + + logger.info("testReadQuerySplit: exit"); + } + + @Test + public void testZeroRowQuery() + throws Exception + { + logger.info("testZeroRowQuery: enter"); + Map expressionNames = ImmutableMap.of("#col_1", "col_1"); + Map expressionValues = ImmutableMap.of(":v0", toAttributeValue(1)); + Split split = Split.newBuilder(SPILL_LOCATION, keyFactory.create()) + .add(TABLE_METADATA, TEST_TABLE) + .add(HASH_KEY_NAME_METADATA, "col_0") + .add("col_0", toJsonString(toAttributeValue("test_str_999999"))) + .add(RANGE_KEY_FILTER_METADATA, "#col_1 >= :v0") + .add(EXPRESSION_NAMES_METADATA, toJsonString(expressionNames)) + .add(EXPRESSION_VALUES_METADATA, toJsonString(expressionValues)) + .build(); + + ReadRecordsRequest request = new ReadRecordsRequest( + TEST_IDENTITY, + TEST_CATALOG_NAME, + TEST_QUERY_ID, + TEST_TABLE_NAME, + schema, + split, + new Constraints(ImmutableMap.of()), + 100_000_000_000L, // too big to spill + 100_000_000_000L); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + 
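+        // TestBase seeds hash keys "test_str_0" through "test_str_999", so "test_str_999999"
+        // matches nothing and the query should return zero rows.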
assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("testZeroRowQuery: rows[{}]", response.getRecordCount()); + + assertEquals(0, response.getRecords().getRowCount()); + + logger.info("testZeroRowQuery: exit"); + } +} diff --git a/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/TestBase.java b/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/TestBase.java new file mode 100644 index 0000000000..be497a4259 --- /dev/null +++ b/athena-dynamodb/src/test/java/com/amazonaws/athena/connectors/dynamodb/TestBase.java @@ -0,0 +1,164 @@ +/*- + * #%L + * athena-dynamodb + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.dynamodb; + +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connectors.dynamodb.util.DDBTableUtils; +import com.amazonaws.services.dynamodbv2.AmazonDynamoDB; +import com.amazonaws.services.dynamodbv2.document.DynamoDB; +import com.amazonaws.services.dynamodbv2.document.Index; +import com.amazonaws.services.dynamodbv2.document.Table; +import com.amazonaws.services.dynamodbv2.document.TableWriteItems; +import com.amazonaws.services.dynamodbv2.local.embedded.DynamoDBEmbedded; +import com.amazonaws.services.dynamodbv2.model.AttributeDefinition; +import com.amazonaws.services.dynamodbv2.model.AttributeValue; +import com.amazonaws.services.dynamodbv2.model.CreateGlobalSecondaryIndexAction; +import com.amazonaws.services.dynamodbv2.model.CreateTableRequest; +import com.amazonaws.services.dynamodbv2.model.KeySchemaElement; +import com.amazonaws.services.dynamodbv2.model.KeyType; +import com.amazonaws.services.dynamodbv2.model.Projection; +import com.amazonaws.services.dynamodbv2.model.ProjectionType; +import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput; +import com.amazonaws.services.dynamodbv2.model.ScalarAttributeType; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import com.google.common.collect.ImmutableSet; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.AfterClass; +import org.junit.BeforeClass; + +import java.sql.Timestamp; +import java.time.LocalDateTime; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.Map; + +import static com.amazonaws.athena.connectors.dynamodb.constants.DynamoDBConstants.DEFAULT_SCHEMA; +import static com.amazonaws.athena.connectors.dynamodb.throttling.DynamoDBExceptionFilter.EXCEPTION_FILTER; +import static com.amazonaws.services.dynamodbv2.document.ItemUtils.toAttributeValue; +import static com.amazonaws.services.dynamodbv2.document.ItemUtils.toItem; + +public class TestBase +{ + protected FederatedIdentity TEST_IDENTITY = new 
FederatedIdentity("id", "principal", "account"); + protected static final String TEST_QUERY_ID = "queryId"; + protected static final String TEST_CATALOG_NAME = "default"; + protected static final String TEST_TABLE = "test_table"; + protected static final TableName TEST_TABLE_NAME = new TableName(DEFAULT_SCHEMA, TEST_TABLE); + protected static final TableName TEST_TABLE_2_NAME = new TableName(DEFAULT_SCHEMA, "Test_table2"); + + protected static AmazonDynamoDB ddbClient; + protected static Schema schema; + + @BeforeClass + public static void setupOnce() throws Exception + { + ddbClient = setupDatabase(); + ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(EXCEPTION_FILTER).build(); + schema = DDBTableUtils.peekTableForSchema(TEST_TABLE, invoker, ddbClient); + } + + @AfterClass + public static void tearDownOnce() + { + ddbClient.shutdown(); + } + + private static AmazonDynamoDB setupDatabase() throws InterruptedException + { + System.setProperty("sqlite4java.library.path", "native-libs"); + AmazonDynamoDB client = DynamoDBEmbedded.create().amazonDynamoDB(); + DynamoDB ddb = new DynamoDB(client); + + ArrayList attributeDefinitions = new ArrayList<>(); + attributeDefinitions.add(new AttributeDefinition().withAttributeName("col_0").withAttributeType("S")); + attributeDefinitions.add(new AttributeDefinition().withAttributeName("col_1").withAttributeType("N")); + + ArrayList keySchema = new ArrayList<>(); + keySchema.add(new KeySchemaElement().withAttributeName("col_0").withKeyType(KeyType.HASH)); + keySchema.add(new KeySchemaElement().withAttributeName("col_1").withKeyType(KeyType.RANGE)); + + ProvisionedThroughput provisionedThroughput = new ProvisionedThroughput() + .withReadCapacityUnits(5L) + .withWriteCapacityUnits(6L); + CreateTableRequest createTableRequest = new CreateTableRequest() + .withTableName(TEST_TABLE) + .withKeySchema(keySchema) + .withAttributeDefinitions(attributeDefinitions) + .withProvisionedThroughput(provisionedThroughput); + + Table table = ddb.createTable(createTableRequest); + + table.waitForActive(); + + TableWriteItems tableWriteItems = new TableWriteItems(TEST_TABLE); + int len = 1000; + LocalDateTime dateTime = LocalDateTime.of(2019, 9, 23, 11, 18, 37); + for (int i = 0; i < len; i++) { + Map item = new HashMap<>(); + item.put("col_0", toAttributeValue("test_str_" + (i - i % 3))); + item.put("col_1", toAttributeValue(i)); + double doubleVal = 200000.0 + i / 2.0; + if (Math.floor(doubleVal) != doubleVal) { + item.put("col_2", toAttributeValue(200000.0 + i / 2.0)); + } + item.put("col_3", toAttributeValue(ImmutableMap.of("modulo", i % 2 == 0, "nextModulos", ImmutableList.of((i + 1) % 2 == 0, ((i + 2) % 2 == 0))))); + item.put("col_4", toAttributeValue(dateTime.toLocalDate().toEpochDay())); + item.put("col_5", toAttributeValue(Timestamp.valueOf(dateTime).toInstant().toEpochMilli())); + item.put("col_6", toAttributeValue(i % 128 == 0 ? 
null : i % 128)); + item.put("col_7", toAttributeValue(-i)); + item.put("col_8", toAttributeValue(ImmutableSet.of(i - 100, i - 200))); + item.put("col_9", toAttributeValue(100.0f + i)); + tableWriteItems.addItemToPut(toItem(item)); + + if (tableWriteItems.getItemsToPut().size() == 25) { + ddb.batchWriteItem(tableWriteItems); + tableWriteItems = new TableWriteItems(TEST_TABLE); + } + + dateTime = dateTime.plusHours(26); + } + + CreateGlobalSecondaryIndexAction createIndexRequest = new CreateGlobalSecondaryIndexAction() + .withIndexName("test_index") + .withKeySchema( + new KeySchemaElement().withKeyType(KeyType.HASH).withAttributeName("col_4"), + new KeySchemaElement().withKeyType(KeyType.RANGE).withAttributeName("col_5")) + .withProjection(new Projection().withProjectionType(ProjectionType.ALL)) + .withProvisionedThroughput(provisionedThroughput); + Index gsi = table.createGSI(createIndexRequest, + new AttributeDefinition().withAttributeName("col_4").withAttributeType(ScalarAttributeType.N), + new AttributeDefinition().withAttributeName("col_5").withAttributeType(ScalarAttributeType.N)); + gsi.waitForActive(); + + // for case sensitivity testing + createTableRequest = new CreateTableRequest() + .withTableName("Test_table2") + .withKeySchema(keySchema) + .withAttributeDefinitions(attributeDefinitions) + .withProvisionedThroughput(provisionedThroughput); + table = ddb.createTable(createTableRequest); + table.waitForActive(); + + return client; + } +} diff --git a/athena-example/LICENSE.txt b/athena-example/LICENSE.txt new file mode 100644 index 0000000000..834d25ef58 --- /dev/null +++ b/athena-example/LICENSE.txt @@ -0,0 +1 @@ +my license diff --git a/athena-example/README.md b/athena-example/README.md new file mode 100644 index 0000000000..48a70e094f --- /dev/null +++ b/athena-example/README.md @@ -0,0 +1,193 @@ +## Example Athena Connector + +This module is meant to serve as a guided example for writing and deploying your own connector to enable Athena to query a custom source. The goal with this guided tutorial is to help you understand the development process and point out capabilities. Out of necessity some of the examples are rather contrived and make use of hard coded schemas to separate learning how to write a connector from learning how to interface with the target systems you will inevitably want to federate to. + +## What is a 'Connector'? + +A 'Connector' is a piece of code that can translate between your target data source and Athena. Today this code is expected to run in an AWS Lambda function but in the future we hope to offer more options. You can think of a connector as an extension of Athena's query engine. Athena will delegate portions of the federated query plan to your connector. More specifically: + +1. Your connector must provide a source of meta-data for Athena to get schema information about what databases, tables, and columns your connector has. This is done by building and deploying a lambda function that extends com.amazonaws.athena.connector.lambda.handlers.MetadataHandler in the athena-federation-sdk module. +2. Your connector must provide a way for Athena to read the data stored in your tables. This is done by building and deploying a lambda function that extends com.amazonaws.athena.connector.lambda.handlers.RecordHandler in the athena-federation-sdk module. 
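+
+As the next paragraph explains, these two pieces can also be combined into a single Lambda function. A minimal sketch of that combined approach looks roughly like the following (ExampleMetadataHandler and ExampleRecordHandler stand in for your own implementations; the athena-example module points its Lambda handler at an ExampleCompositeHandler along these lines):
+
+```java
+// CompositeHandler (from the athena-federation-sdk) dispatches each incoming Lambda
+// invocation to the appropriate delegate: metadata requests to the MetadataHandler,
+// read requests to the RecordHandler.
+public class ExampleCompositeHandler
+        extends CompositeHandler
+{
+    public ExampleCompositeHandler()
+    {
+        // ExampleMetadataHandler and ExampleRecordHandler are your implementations
+        // of MetadataHandler and RecordHandler respectively.
+        super(new ExampleMetadataHandler(), new ExampleRecordHandler());
+    }
+}
+```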
+
+Alternatively, you can deploy a single Lambda function which combines the two above requirements by using com.amazonaws.athena.connector.lambda.handlers.CompositeHandler or com.amazonaws.athena.connector.lambda.handlers.UnifiedHandler. While breaking this into two separate Lambda functions allows you to independently control the cost and timeout of your Lambda functions, using a single Lambda function can be simpler and can perform better because it incurs fewer cold starts.
+
+In the next section we take a closer look at the methods we must implement on the MetadataHandler and RecordHandler.
+
+### MetadataHandler Details
+
+Let's take a closer look at what is required for a MetadataHandler. Below are the basic functions we need to implement when using the Amazon Athena Query Federation SDK's MetadataHandler to handle the boilerplate work of serialization and initialization. The abstract class we are extending takes care of all the Lambda interface bits and delegates to the discrete operations that are relevant to the task at hand: querying our new data source.
+
+```java
+public class MyMetadataHandler extends MetadataHandler
+{
+    /**
+     * Used to get the list of schemas (aka databases) that this source contains.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @return A ListSchemasResponse which primarily contains a Set of schema names and a catalog name
+     * corresponding to the Athena catalog that was queried.
+     */
+    @Override
+    protected ListSchemasResponse doListSchemaNames(BlockAllocator allocator, ListSchemasRequest request) {}
+
+    /**
+     * Used to get the list of tables that this source contains.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog and database they are querying.
+     * @return A ListTablesResponse which primarily contains a List<TableName> enumerating the tables in this
+     * catalog, database tuple. It also contains the catalog name corresponding to the Athena catalog that was queried.
+     */
+    @Override
+    protected ListTablesResponse doListTables(BlockAllocator allocator, ListTablesRequest request) {}
+
+    /**
+     * Used to get definition (field names, types, descriptions, etc...) of a Table.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog, database, and table they are querying.
+     * @return A GetTableResponse which primarily contains:
+     * 1. An Apache Arrow Schema object describing the table's columns, types, and descriptions.
+     * 2. A Set of partition column names (or empty if the table isn't partitioned).
+     */
+    @Override
+    protected GetTableResponse doGetTable(BlockAllocator allocator, GetTableRequest request) {}
+
+    /**
+     * Used to get the partitions that must be read from the request table in order to satisfy the requested predicate.
+     *
+     * @param blockWriter Used to write rows (partitions) into the Apache Arrow response.
+     * @param request Provides details of the catalog, database, and table being queried as well as any filter predicate.
+     * @note Partitions are partially opaque to Amazon Athena in that it only understands your partition columns and
+     * how to filter out partitions that do not meet the query's constraints.
+     * Any additional columns you add to the partition data are ignored by Athena but passed on to calls on
+     * GetSplits. Also note that the BlockWriter handles automatically constraining and filtering out values that
+     * don't satisfy the query's predicate. This is how we accomplish partition pruning. You can optionally retrieve
+     * a ConstraintEvaluator from BlockWriter if you have your own need to apply filtering in Lambda. Otherwise you
+     * can get the actual predicate from the request object for pushing down into the source you are querying.
+     */
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request) {}
+
+    /**
+     * Used to split up the reads required to scan the requested batch of partition(s).
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details of the catalog, database, table, and partition(s) being queried as well as
+     * any filter predicate.
+     * @return A GetSplitsResponse which primarily contains:
+     * 1. A Set<Split> which represents the read operations Amazon Athena must perform by calling your read function.
+     * 2. (Optional) A continuation token which allows you to paginate the generation of splits for large queries.
+     * @note A Split is a mostly opaque object to Amazon Athena. Amazon Athena will use the optional SpillLocation and
+     * optional EncryptionKey for pipelined reads but all properties you set on the Split are passed to your read
+     * function to help you perform the read.
+     */
+    @Override
+    protected GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request) {}
+}
+```
+
+You can find example MetadataHandlers by looking at some of the connectors in the repository. athena-cloudwatch and athena-tpcds are fairly easy to follow along with.
+
+Alternatively, if you wish to use AWS Glue DataCatalog as the authoritative (or supplemental) source of meta-data for your connector, you can extend com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler instead of com.amazonaws.athena.connector.lambda.handlers.MetadataHandler. GlueMetadataHandler comes with implementations for doListSchemas(...), doListTables(...), and doGetTable(...), leaving you to implement only 2 methods. The Amazon Athena DocumentDB Connector in the athena-docdb module is an example of using GlueMetadataHandler.
+
+### RecordHandler Details
+
+Let's take a closer look at what is required for a RecordHandler. Below are the basic functions we need to implement when using the Amazon Athena Query Federation SDK's RecordHandler to handle the boilerplate work of serialization and initialization. The abstract class we are extending takes care of all the Lambda interface bits and delegates to the discrete operations that are relevant to the task at hand: querying our new data source.
+
+```java
+public class MyRecordHandler
+        extends RecordHandler
+{
+    /**
+     * Used to read the row data associated with the provided Split.
+     *
+     * @param constraints A ConstraintEvaluator capable of applying constraints from the query that requested this read.
+     * @param spiller A BlockSpiller that should be used to write the row data associated with this Split.
+     * The BlockSpiller automatically handles chunking the response, encrypting, and spilling to S3.
+     * @param recordsRequest Details of the read request, including:
+     * 1. The Split
+     * 2. The Catalog, Database, and Table the read request is for.
+     * 3. The filtering predicate (if any)
+     * 4. The columns required for projection.
+     * @note Avoid writing >10 rows per-call to BlockSpiller.writeRow(...) because this will limit the BlockSpiller's
+     * ability to control Block size. The resulting increase in Block size may cause failures and reduced performance.
+     */
+    @Override
+    protected void readWithConstraint(ConstraintEvaluator constraints, BlockSpiller spiller, ReadRecordsRequest recordsRequest) {}
+}
+```
+
+## How To Build & Deploy
+
+You can use any IDE or even just a command line editor to write your connector. The steps below show you how to use an AWS Cloud9 IDE running on EC2 to get started, but most of the steps are applicable to any Linux-based development machine.
+
+
+### Step 1: Create your Cloud9 Instance
+
+1. Open the AWS Console and navigate to the [Cloud9 Service or Click Here](https://console.aws.amazon.com/cloud9/)
+2. Click 'Create Environment' and follow the steps to create a new instance using a new EC2 Instance (we recommend m4.large) running Amazon Linux.
+
+
+### Step 2: Download The SDK + Connectors
+
+1. At your Cloud9 terminal run `git clone https://github.com/awslabs/aws-athena-query-federation.git` to get a copy of the Amazon Athena Query Federation SDK, Connector Suite, and Example Connector.
+
+### Step 3: Install Development Tools (Pre-Requisites)
+
+1. This step may be optional if you are working on a development machine that already has Apache Maven, the AWS CLI, and the AWS SAM build tool for Serverless Applications installed. If not, you can run the `./tools/prepare_dev_env.sh` script in the root of the github project you checked out.
+2. To ensure your terminal can see the new tools we installed, run `source ~/.profile` or open a fresh terminal. If you skip this step you will get errors later about the aws cli or sam build tool not being able to publish your connector.
+
+Now run `mvn clean install -DskipTests=true` from the athena-federation-sdk directory within the github project you checked out earlier. We are skipping tests just to make the build faster. Normally you should run the tests as a matter of best practice.
+
+### Step 4: Write The Code
+
+1. Create an S3 bucket (in the same region where you will deploy the connector) that we can use for spill and for uploading some sample data, using the following command: `aws s3 mb s3://BUCKET_NAME`. Be sure to put your actual bucket name in the command and to pick a name that is unlikely to already exist.
+2. (If using Cloud9) Navigate to the aws-athena-query-federation/athena-example folder on the left nav. This is the code you extracted back in Step 2.
+3. Complete the TODOs in ExampleMetadataHandler by uncommenting the provided example code and providing missing code where indicated.
+4. Complete the TODOs in ExampleRecordHandler by uncommenting the provided example code and providing missing code where indicated.
+5. Run the following command from the aws-athena-query-federation/athena-example directory to ensure your connector is valid: `mvn clean install`
+6. Upload our sample data by running the following command from the aws-athena-query-federation/athena-example directory. Be sure to replace BUCKET_NAME with the name of the bucket you created earlier: `aws s3 cp ./sample_data.csv s3://BUCKET_NAME/2017/11/1/sample_data.csv`
+
+### Step 5: Package and Deploy Your New Connector
+
+We have two options for deploying our connector: directly to Lambda or via the Serverless Application Repository. We'll use the Serverless Application Repository below.
+
+*Publish Your Connector To Serverless Application Repository*
+
+Run `../tools/publish.sh S3_BUCKET_NAME athena-example` to publish the connector to your private AWS Serverless Application Repository. This gives users with permission to do so the ability to deploy instances of the connector via a 1-Click form.
+
+If the publish command gave you an error about the aws cli or sam tool not recognizing an argument, you likely forgot to source the new bash profile after
+updating your development environment, so run `source ~/.profile` and try again.
+
+Then you can navigate to [Serverless Application Repository](https://console.aws.amazon.com/serverlessrepo/) to search for your application and deploy it before using it from Athena.
+
+(Alternatively you can publish your connector directly to Lambda but for simplicity this tutorial uses Serverless Application Repository.)
+
+### Step 6: Validate our Connector.
+
+One of the most challenging aspects of integrating systems (in this case our connector and Athena) is testing how these two things will work together. Lambda will capture logging from our connector in CloudWatch Logs, but we've also tried to provide some tools to streamline detecting and correcting common semantic and logical issues with your custom connector. By running Athena's connector validation tool you can simulate how Athena will interact with your Lambda function and get access to diagnostic information that would normally only be available within Athena or require you to add extra diagnostics to your connector.
+
+Run `../tools/validate_connector.sh --lambda-func lambda_func --schema schema1 --table table1 --constraints year=2017,month=11,day=1`
+Be sure to replace lambda_func with the name you gave to your function/catalog when you deployed it via Serverless Application Repository.
+
+If everything worked as expected you should see the script generate useful debugging info and end with:
+```txt
+2019-11-07 20:25:08 <> INFO ConnectorValidator:==================================================
+2019-11-07 20:25:08 <> INFO ConnectorValidator:Successfully Passed Validation!
+2019-11-07 20:25:08 <> INFO ConnectorValidator:==================================================
+```
+
+### Step 7: Run a Query!
+
+Ok, now we are ready to try running some queries using our new connector. Some good examples to try include (be sure to put in your actual database and table names):
+
+`select * from "lambda:<function_name>".schema1.table1 where year=2017 and month=11 and day=1;`
+
+`select transaction.completed, count(*) from "lambda:<function_name>".schema1.table1 where year=2017 and month=11 and day=1 group by transaction.completed;`
+
+*note that <function_name> corresponds to the name of your Lambda function.
+
+
+
+
diff --git a/athena-example/athena-example.yaml b/athena-example/athena-example.yaml
new file mode 100644
index 0000000000..181ce69536
--- /dev/null
+++ b/athena-example/athena-example.yaml
@@ -0,0 +1,74 @@
+Transform: 'AWS::Serverless-2016-10-31'
+
+Metadata:
+  AWS::ServerlessRepo::Application:
+    Name: ExampleAthenaConnector
+    Description: ExampleAthenaConnector Description
+    Author: user1
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels: ['athena-federation']
+    HomePageUrl: https://github.com/awslabs/aws-athena-query-federation
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: https://github.com/awslabs/aws-athena-query-federation
+
+# Parameters are CloudFormation features to pass input
+# to your template when you create a stack
+Parameters:
+  AthenaCatalogName:
+    Description: "The name you will give to this catalog in Athena will also be used as your Lambda function name."
+    Type: String
+  SpillBucket:
+    Description: "The bucket where this function can spill large responses."
+    Type: String
+  DataBucket:
+    Description: "The bucket where this tutorial's data lives."
+    Type: String
+  SpillPrefix:
+    Description: "The bucket prefix where this function can spill large responses."
+    Type: String
+    Default: "athena-spill"
+  LambdaTimeout:
+    Description: "Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)"
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: "Lambda memory in MB (min 128 - 3008 max)."
+    Default: 3008
+    Type: Number
+  DisableSpillEncryption:
+    Description: "WARNING: If set to 'true' encryption for spilled data is disabled."
+    Default: "false"
+    Type: String
+
+Resources:
+  ConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      Environment:
+        Variables:
+          disable_spill_encryption: !Ref DisableSpillEncryption
+          spill_bucket: !Ref SpillBucket
+          spill_prefix: !Ref SpillPrefix
+          data_bucket: !Ref DataBucket
+      FunctionName: !Sub "${AthenaCatalogName}"
+      Handler: "com.amazonaws.connectors.athena.example.ExampleCompositeHandler"
+      CodeUri: "./target/athena-example-1.0.jar"
+      Description: "A guided example for writing and deploying your own federated Amazon Athena connector for a custom source."
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
+      Policies:
+        - Statement:
+            - Action:
+                - athena:GetQueryExecution
+              Effect: Allow
+              Resource: '*'
+          Version: '2012-10-17'
+        #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy
+        #with one that is more restrictive and can only 'put' but not read, delete, or overwrite files.
+        - S3CrudPolicy:
+            BucketName: !Ref SpillBucket
+        - S3CrudPolicy:
+            BucketName: !Ref DataBucket
\ No newline at end of file
diff --git a/athena-example/pom.xml b/athena-example/pom.xml
new file mode 100644
index 0000000000..9e22fdcad6
--- /dev/null
+++ b/athena-example/pom.xml
@@ -0,0 +1,65 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-example</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.cloud</groupId>
+            <artifactId>google-cloud-bigquery</artifactId>
+            <version>1.87.0</version>
+            <scope>compile</scope>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-checkstyle-plugin</artifactId>
+                <configuration>
+                    <skip>true</skip>
+                </configuration>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-example/sample_data.csv b/athena-example/sample_data.csv
new file mode 100644
index 0000000000..cccb0dd2d1
--- /dev/null
+++ b/athena-example/sample_data.csv
@@ -0,0 +1,11601 @@
+2017,7,18,1795956446,894609933,false
+2017,7,19,1716339081,272577275,false
+2017,7,19,688171731,1378122016,false
+2017,7,19,193902233,1433256248,true
+2017,7,19,2129370796,2049762888,true
+2017,7,19,171953495,365729954,true
+2017,7,19,195334229,1228390096,false
+2017,7,19,1779205754,686833825,true
+2017,7,19,903033383,1454168653,false
+2017,7,19,1443458971,334050035,true
+2017,7,19,1115189438,511247096,true
+2017,7,20,2034403243,1875721250,false
+2017,7,20,1955829989,444515885,true
+2017,7,20,1383703271,475742027,false
+2017,7,20,1868008787,17732604,true
+2017,7,20,1046166032,1163070105,true
+2017,7,20,1908512098,959782166,true
+2017,7,20,1197601506,1009204990,true
+2017,7,20,1841480979,455877752,true
+2017,7,20,1203863537,1904933137,true
+2017,7,20,351714297,1797721759,false
+2017,7,21,1421427832,318585367,true
+2017,7,21,1775258585,818080304,false
+2017,7,21,756188357,1688459171,false
+2017,7,21,1099288870,1882623571,true
+2017,7,21,1453352512,1316701522,true
+2017,7,21,1982887597,2079803704,false
+2017,7,21,881896000,1491709837,false
+2017,7,21,1823576827,75072892,false
+2017,7,21,2000254254,1038477143,true
+2017,7,21,2101582186,207247205,true
+2017,7,22,1499244448,222715536,false
+2017,7,22,352192992,833755954,false
+2017,7,22,1078845818,802437432,false
+2017,7,22,1148727703,458474483,false
+2017,7,22,1584547636,305795983,true
+2017,7,22,1937357967,75382435,false
+2017,7,22,330637264,1973870125,false
+2017,7,22,339991167,239267704,true
+2017,7,22,1638559277,1288300645,true
+2017,7,22,1679647568,997121957,true
+2017,7,23,143676089,129379982,true
+2017,7,23,725665426,501497865,false
+2017,7,23,2048154282,726537479,true
+2017,7,23,1994258033,655065466,true
+2017,7,23,1828550729,2083861010,false
+2017,7,23,1703721473,1052376490,false
+2017,7,23,1007961086,412121051,true
+2017,7,23,721826825,778507234,false
+2017,7,23,1277093899,286541127,true
+2017,7,23,1672856212,668353276,true
+2017,7,24,371050820,1200582488,true
+2017,7,24,1763834093,982990439,true
+2017,7,24,1467624746,167216722,true
+2017,7,24,730605775,2024656418,false
+2017,7,24,1490344486,1052262623,true
+2017,7,24,678889715,1314359781,true
+2017,7,24,2097383649,1306158311,false
+2017,7,24,1649956422,332512756,true
+2017,7,24,253756906,298289369,true
+2017,7,24,720642662,1420170328,false
+2017,7,25,552184101,393933196,false
+2017,7,25,1081519421,1741719400,false
+2017,7,25,612588734,1995796497,true
+2017,7,25,417888458,1937226474,true
+2017,7,25,1201627024,1600440330,false
+2017,7,25,461016451,2035068733,true
+2017,7,25,2083737413,199664212,false +2017,7,25,1543172818,352183869,true +2017,7,25,704078993,1907142964,false +2017,7,25,2062877329,97653738,true +2017,7,26,938686562,1649033799,false +2017,7,26,435799966,882926838,true +2017,7,26,1856533203,191059932,true +2017,7,26,907204318,1180226502,true +2017,7,26,1564704067,945313417,false +2017,7,26,1803048762,184680587,false +2017,7,26,578701035,1740025107,true +2017,7,26,1790511237,2130123641,false +2017,7,26,266301362,2144037897,true +2017,7,26,185227860,1617344590,true +2017,7,27,387338053,1910953888,false +2017,7,27,181787419,1061251234,true +2017,7,27,743661209,733178281,true +2017,7,27,125879612,1826541241,true +2017,7,27,1871517777,1524419267,true +2017,7,27,1334313674,545459402,true +2017,7,27,180846803,1272893717,true +2017,7,27,555612706,1139938556,true +2017,7,27,1060356330,781574188,false +2017,7,27,616866994,1552874633,true +2017,7,28,591177274,550693085,true +2017,7,28,646194827,1147871429,true +2017,7,28,1887634183,704188726,true +2017,7,28,1485318274,1168848631,false +2017,7,28,665049838,1721928666,false +2017,7,28,2104092692,2123520079,false +2017,7,28,1984666789,206023529,false +2017,7,28,1201477457,1044861270,true +2017,7,28,263350499,749115839,true +2017,7,28,1009161323,31785134,false +2017,7,29,517612788,1534368815,true +2017,7,29,1692369859,571011838,false +2017,7,29,341304238,1030374654,true +2017,7,29,1730723979,962378329,true +2017,7,29,1554422209,394440661,true +2017,7,29,1205516121,1098861960,true +2017,7,29,1871456913,709409246,false +2017,7,29,933684024,149952505,false +2017,7,29,1651912995,165371845,true +2017,7,29,569516312,1254153535,true +2017,7,30,117935301,1249958619,true +2017,7,30,1385377466,173357937,true +2017,7,30,2110798698,966593538,false +2017,7,30,1060487518,865402308,false +2017,7,30,1201634311,1797721661,false +2017,7,30,793837022,1216062766,true +2017,7,30,1787633989,1885912037,false +2017,7,30,2127400705,2044010240,true +2017,7,30,565606757,228350203,false +2017,7,30,538669098,317366704,true +2017,7,31,962120243,461160983,true +2017,7,31,1607373226,614966863,false +2017,7,31,934160459,575488722,false +2017,7,31,431624544,1115029850,true +2017,7,31,1769994028,103473103,false +2017,7,31,2140122481,962119658,false +2017,7,31,1003926473,1722957730,true +2017,7,31,2018093110,164300407,true +2017,7,31,680156560,1547027276,false +2017,7,31,1467177913,2032395414,true +2017,8,1,909046311,1696504329,false +2017,8,1,717523027,343127437,false +2017,8,1,943296080,1603992981,true +2017,8,1,114740557,1853503372,false +2017,8,1,1638724268,2082085699,true +2017,8,1,1507263373,1482093541,false +2017,8,1,1103557979,755223409,true +2017,8,1,1538142652,276422824,true +2017,8,1,519496106,1997335616,false +2017,8,1,1470748258,380641371,true +2017,8,2,602504523,264785314,true +2017,8,2,780738545,513447637,false +2017,8,2,632319694,1277979655,true +2017,8,2,1491446709,1144590947,true +2017,8,2,1686479697,559811685,true +2017,8,2,378100759,626233016,false +2017,8,2,1476993446,1763374923,true +2017,8,2,713561034,51227485,false +2017,8,2,2133919582,2071192111,true +2017,8,2,1149335213,1122525596,false +2017,8,3,72108148,2043403596,false +2017,8,3,796101252,1172858395,false +2017,8,3,2066007188,1998760353,true +2017,8,3,1127743717,1910571166,false +2017,8,3,1075014962,1631059440,false +2017,8,3,1798400222,1241223343,true +2017,8,3,2115509740,29051597,false +2017,8,3,1928286786,288555034,true +2017,8,3,380140774,1250348321,true +2017,8,3,409329063,756751482,true +2017,8,4,2047298345,1209533357,false 
+2017,8,4,1494293361,554747733,true +2017,8,4,1650265384,549631849,false +2017,8,4,943409538,1197944946,false +2017,8,4,1784705532,120588256,false +2017,8,4,125321832,1257074796,true +2017,8,4,627076081,742335437,false +2017,8,4,2144049694,208660915,true +2017,8,4,900929401,2067911591,true +2017,8,4,713737149,734684285,true +2017,8,5,204367271,482447079,false +2017,8,5,2022408370,1847522890,true +2017,8,5,1544328634,1366937120,false +2017,8,5,795383113,1704329236,false +2017,8,5,1481436517,1871084359,true +2017,8,5,1546952635,366212670,true +2017,8,5,1407607250,1774720705,false +2017,8,5,1562625534,1870711768,false +2017,8,5,1475810192,765342475,false +2017,8,5,316738554,1011666655,false +2017,8,6,1479618037,987863576,true +2017,8,6,1518425861,972625270,true +2017,8,6,1153955479,857511613,false +2017,8,6,104578216,426466387,false +2017,8,6,2111452133,7195074,true +2017,8,6,376015387,296940750,false +2017,8,6,314087946,124941677,true +2017,8,6,370591485,1609444702,true +2017,8,6,107361183,375624617,true +2017,8,6,1227989294,1510279517,false +2017,8,7,2014302174,1144924688,false +2017,8,7,480346337,956446065,false +2017,8,7,1948774983,869627010,false +2017,8,7,696798003,573602795,false +2017,8,7,142319615,1328839192,false +2017,8,7,1914562288,1584374006,false +2017,8,7,325993810,640008582,false +2017,8,7,373897803,925175926,true +2017,8,7,1336924696,1991373140,true +2017,8,7,1955979700,441202651,false +2017,8,8,923268595,1662092656,false +2017,8,8,2014551830,1681413214,false +2017,8,8,1154858399,1011889930,true +2017,8,8,250496542,1456601693,true +2017,8,8,1441401032,1198623928,false +2017,8,8,1332905303,1700345042,true +2017,8,8,2115298340,784472146,true +2017,8,8,1057926470,1920525335,false +2017,8,8,534721989,1094614286,true +2017,8,8,401952201,885492090,false +2017,8,9,751163866,1232880566,true +2017,8,9,1944747876,301924683,true +2017,8,9,1613601888,1888359449,true +2017,8,9,1795277121,25255782,true +2017,8,9,2108887143,110096904,true +2017,8,9,1833487886,769802864,true +2017,8,9,189923326,337963377,true +2017,8,9,665876741,716050487,true +2017,8,9,124590057,47708291,false +2017,8,9,1333123508,154384892,true +2017,8,10,1485707422,1310281691,false +2017,8,10,484939884,328298893,false +2017,8,10,1902332489,527433766,false +2017,8,10,2010275418,920399988,true +2017,8,10,223462340,52507076,false +2017,8,10,157750198,1262463152,true +2017,8,10,325735620,44097100,true +2017,8,10,62522260,1453734577,true +2017,8,10,935141801,918422784,true +2017,8,10,936706094,799305346,true +2017,8,11,81677239,1329351312,false +2017,8,11,2092665799,1995062900,true +2017,8,11,450620065,1456419268,false +2017,8,11,72329598,79089573,false +2017,8,11,652096683,688745953,false +2017,8,11,1792712351,684527917,true +2017,8,11,1795721421,1298434273,true +2017,8,11,1792960595,455390441,false +2017,8,11,1283142305,1892353761,false +2017,8,11,852249292,501357833,false +2017,8,12,526369282,1731700035,true +2017,8,12,193558066,228573176,false +2017,8,12,274579615,253435779,true +2017,8,12,1242864385,1290680378,true +2017,8,12,1318321656,54285482,false +2017,8,12,1194080921,1434404933,false +2017,8,12,1737991803,771198948,false +2017,8,12,1893521966,46176907,false +2017,8,12,369194167,192766888,false +2017,8,12,545049946,1323402290,false +2017,8,13,2136576999,526240361,false +2017,8,13,1753703164,970693924,true +2017,8,13,31641919,1010591963,true +2017,8,13,210151988,370682496,true +2017,8,13,1162841534,552316187,true +2017,8,13,28251787,305606810,true +2017,8,13,2033194103,726166379,false 
+2017,8,13,1983238579,473634813,true +2017,8,13,582982030,589190207,true +2017,8,13,1940248168,805516602,false +2017,8,14,1061667929,569934022,false +2017,8,14,1187697482,460703081,true +2017,8,14,1600803686,697772602,true +2017,8,14,770088401,926425495,true +2017,8,14,759477083,393002395,true +2017,8,14,577207470,894081449,true +2017,8,14,1826445794,1834397860,true +2017,8,14,1217205804,137929554,false +2017,8,14,487648289,837359125,false +2017,8,14,1234643573,165241903,false +2017,8,15,195279988,1266925227,false +2017,8,15,1325253317,2047288874,true +2017,8,15,1965067617,210438490,false +2017,8,15,1208284656,1467255721,false +2017,8,15,63494133,542716074,false +2017,8,15,1589583816,652111172,true +2017,8,15,1971692054,1300509763,false +2017,8,15,314391622,297450729,false +2017,8,15,666258796,1467172105,true +2017,8,15,1599426688,982821564,true +2017,8,16,33571304,933832541,false +2017,8,16,1050622674,255927829,true +2017,8,16,1440087837,1446060632,true +2017,8,16,1845442766,1028964716,false +2017,8,16,554468828,63249440,false +2017,8,16,256294078,685541904,false +2017,8,16,1624063958,192299687,true +2017,8,16,1421680107,1797889275,true +2017,8,16,1241128248,1107322223,true +2017,8,16,1224021473,1980086312,false +2017,8,17,299802742,923451433,false +2017,8,17,2119051874,265538602,true +2017,8,17,873434810,1962981200,true +2017,8,17,706471638,1257255937,true +2017,8,17,725878718,862502665,false +2017,8,17,605253960,145344859,false +2017,8,17,48067066,1316987373,true +2017,8,17,1952671325,519051713,false +2017,8,17,1667432541,1144025228,true +2017,8,17,1581534399,1511497059,false +2017,8,18,276049180,321182666,false +2017,8,18,1855532022,2029947518,false +2017,8,18,735668534,721644392,true +2017,8,18,366556075,1039321234,true +2017,8,18,1768489658,1891579721,false +2017,8,18,975419324,617229541,false +2017,8,18,1254683636,1365988707,true +2017,8,18,21455539,696491520,false +2017,8,18,1354595344,432037724,false +2017,8,18,1688351615,151752236,true +2017,8,19,1471946193,1911495216,false +2017,8,19,349762683,673021210,false +2017,8,19,1594784618,2053420937,true +2017,8,19,1921460518,269184825,false +2017,8,19,977243811,1372924807,true +2017,8,19,73024800,396596588,true +2017,8,19,1630438091,1785854755,false +2017,8,19,520104515,1235962388,false +2017,8,19,1830226472,1233684099,true +2017,8,19,1497283828,296231089,true +2017,8,20,1656290015,154433479,false +2017,8,20,429630712,722377610,true +2017,8,20,1208835987,1080654400,false +2017,8,20,226778747,1009794840,false +2017,8,20,25331236,60555029,true +2017,8,20,1075405322,125680932,true +2017,8,20,436405079,323120053,true +2017,8,20,1136571302,852419033,true +2017,8,20,2068622420,89058225,true +2017,8,20,1273587779,1350654791,false +2017,8,21,1196375859,132387536,false +2017,8,21,1729033807,1912434681,false +2017,8,21,1745971734,142267770,true +2017,8,21,2085634317,617075629,false +2017,8,21,872686719,1574169808,true +2017,8,21,1705545717,362352986,true +2017,8,21,1423274213,1012388105,true +2017,8,21,1094113868,1315178500,true +2017,8,21,1886235829,1622859751,false +2017,8,21,488000696,645385223,false +2017,8,22,1433805632,1258051180,true +2017,8,22,1553432957,765423844,false +2017,8,22,1182315587,1050758160,false +2017,8,22,335928728,610328509,false +2017,8,22,328898835,2072324900,true +2017,8,22,1652601869,506428663,true +2017,8,22,1926151056,658572590,true +2017,8,22,19695083,1603799997,true +2017,8,22,369965158,2029382492,false +2017,8,22,923386107,627083363,true +2017,8,23,2057280077,1007481446,true +2017,8,23,2099318812,702261969,false 
+2017,8,23,982363717,2010386028,true +2017,8,23,1737783216,1895233660,true +2017,8,23,1368035582,792809150,true +2017,8,23,1292658091,1886507421,true +2017,8,23,887182319,2006418063,true +2017,8,23,2123522078,878790568,false +2017,8,23,1880909000,748603127,true +2017,8,23,152266337,568908585,true +2017,8,24,1959414850,2028026848,false +2017,8,24,804993049,1687948690,false +2017,8,24,1294496125,362672296,true +2017,8,24,714726099,2026114987,true +2017,8,24,31188327,423187099,true +2017,8,24,771076595,1595625363,false +2017,8,24,153017830,103798428,true +2017,8,24,1463262746,716001690,true +2017,8,24,471665338,404982929,true +2017,8,24,1243787355,627746142,false +2017,8,25,1420451408,63149677,true +2017,8,25,749624017,1211170032,false +2017,8,25,55563004,853700076,false +2017,8,25,767982454,171965931,true +2017,8,25,1905537370,663243617,false +2017,8,25,1805612879,767648958,false +2017,8,25,2026337981,2072390949,true +2017,8,25,1391635263,1169363109,true +2017,8,25,141378301,1140703877,false +2017,8,25,1097796902,302226167,true +2017,8,26,2033466115,673588803,true +2017,8,26,672580380,1097713367,true +2017,8,26,1158128092,773698568,true +2017,8,26,417990620,2037870198,false +2017,8,26,1895242816,1067493718,false +2017,8,26,121420482,859318612,true +2017,8,26,1101247182,1761246683,true +2017,8,26,1546307305,178834257,true +2017,8,26,1051018611,1295401012,false +2017,8,26,1207038027,8167015,false +2017,8,27,723046058,1759484806,false +2017,8,27,1679351538,2055268706,true +2017,8,27,816160669,1258997618,true +2017,8,27,129253270,1753711610,true +2017,8,27,1366373971,638277452,false +2017,8,27,33412676,205649170,false +2017,8,27,343999847,4716198,true +2017,8,27,1710111454,563573258,true +2017,8,27,246253827,1883271299,true +2017,8,27,155318419,376970857,true +2017,8,28,1020782353,25834459,true +2017,8,28,1868349598,1183953227,true +2017,8,28,1122773212,1223799598,true +2017,8,28,163159540,1006058067,true +2017,8,28,909535902,346672724,true +2017,8,28,509831844,211333655,true +2017,8,28,944060845,1525202140,true +2017,8,28,1765572888,1343513141,true +2017,8,28,1735545918,1125645446,false +2017,8,28,1939640948,1763217309,true +2017,8,29,1535911425,289696453,false +2017,8,29,1020133579,1575959985,false +2017,8,29,136069884,455620944,false +2017,8,29,1091888864,1449126727,true +2017,8,29,269735534,1841474850,false +2017,8,29,27682988,1578033846,false +2017,8,29,857214377,1341906045,false +2017,8,29,1415696101,1187621996,true +2017,8,29,1573586716,1690090445,false +2017,8,29,473079087,303200203,true +2017,8,30,887507976,1017055869,false +2017,8,30,1969034696,689336375,true +2017,8,30,550135700,88860698,false +2017,8,30,568379088,905839772,false +2017,8,30,1758156859,276518751,false +2017,8,30,1778883698,68859351,true +2017,8,30,838758604,452495349,false +2017,8,30,1502468250,402660550,false +2017,8,30,772691054,1020639506,false +2017,8,30,1361579016,278467305,false +2017,8,31,540220784,1297856061,true +2017,8,31,1491144152,1017991391,false +2017,8,31,1799684291,1541055709,true +2017,8,31,501859791,238943267,false +2017,8,31,296743026,1317004384,false +2017,8,31,437762841,1960481226,true +2017,8,31,960982397,1426355228,false +2017,8,31,2085325887,545058370,true +2017,8,31,1191638484,734755128,false +2017,8,31,418713194,1438118458,true +2017,9,1,909694320,799894589,true +2017,9,1,299147621,1506083637,false +2017,9,1,570010037,1276901669,true +2017,9,1,1442224848,1775334020,false +2017,9,1,55651496,1721763323,true +2017,9,1,1292523134,1533362059,false +2017,9,1,497217947,162493291,true 
+2017,9,1,1205324548,1819001513,false +2017,9,1,1912418137,1437945817,true +2017,9,1,700401874,2126980122,true +2017,9,2,296588288,1556951246,false +2017,9,2,471046003,378480291,false +2017,9,2,381746439,273319906,false +2017,9,2,366004174,1302351858,false +2017,9,2,394451487,1712185196,false +2017,9,2,1656036213,1211820272,false +2017,9,2,374049782,1003243679,false +2017,9,2,310255717,1654800002,false +2017,9,2,1347241984,1106163675,true +2017,9,2,799335884,1727006558,false +2017,9,3,328696021,1879500121,true +2017,9,3,528009641,1866558984,true +2017,9,3,1671927490,1413633923,true +2017,9,3,117055584,255985174,false +2017,9,3,1091581606,1050202512,false +2017,9,3,1062973093,766212670,false +2017,9,3,1259887913,1314212925,false +2017,9,3,1508633402,128521451,false +2017,9,3,448266993,1460481980,false +2017,9,3,111350141,472774467,true +2017,9,4,1011815016,930843949,false +2017,9,4,303072959,1101591267,true +2017,9,4,679412265,1560929091,false +2017,9,4,801489865,1902516423,true +2017,9,4,1303861041,309812445,false +2017,9,4,1743071534,298073095,false +2017,9,4,244840543,1692254651,false +2017,9,4,1710469167,1121153403,true +2017,9,4,1017360634,872501404,false +2017,9,4,593664860,954469736,false +2017,9,5,1776488998,936951650,true +2017,9,5,494727487,571889336,true +2017,9,5,742990737,351041813,true +2017,9,5,552073390,1763191667,true +2017,9,5,280793542,1752249050,false +2017,9,5,1648357780,1417458799,false +2017,9,5,1464423651,1848698931,false +2017,9,5,491138647,1310463606,true +2017,9,5,1759491237,1703840809,true +2017,9,5,2062831912,1556503889,false +2017,9,6,1020864265,198316084,false +2017,9,6,276043220,1847703922,false +2017,9,6,31218930,128129777,true +2017,9,6,1358662647,75244351,false +2017,9,6,1961751700,1180547084,true +2017,9,6,1252804261,994221860,true +2017,9,6,2098744211,13352940,true +2017,9,6,1758436824,1628652532,false +2017,9,6,1585699610,2128838909,false +2017,9,6,1118501709,535091534,false +2017,9,7,1085994578,104179115,false +2017,9,7,1282613144,2107700721,false +2017,9,7,105103361,1342454132,false +2017,9,7,280208781,312018507,true +2017,9,7,1316308547,1808751760,true +2017,9,7,11376985,743973290,true +2017,9,7,1557134220,626227494,true +2017,9,7,271781341,515580115,true +2017,9,7,1913965177,386659849,false +2017,9,7,480380458,1739428591,true +2017,9,8,1896535587,1241334379,true +2017,9,8,209459755,1219495435,false +2017,9,8,2054957801,293337871,false +2017,9,8,44877030,1108160157,true +2017,9,8,1711424488,901031131,true +2017,9,8,1266849752,758504912,false +2017,9,8,1110501278,1803114975,true +2017,9,8,1722905170,1623521414,false +2017,9,8,873499255,1553964327,false +2017,9,8,785949935,1160106309,true +2017,9,9,1869656096,1901635319,true +2017,9,9,442001522,1668347293,true +2017,9,9,192003324,964668986,false +2017,9,9,248550568,1350155163,false +2017,9,9,1324277534,901991763,false +2017,9,9,409919276,1896167014,false +2017,9,9,1783249532,1509814160,true +2017,9,9,1354656541,1051246881,false +2017,9,9,1412752995,21727721,true +2017,9,9,250341921,1188140445,true +2017,9,10,1528994396,1225034966,false +2017,9,10,254364748,230291596,true +2017,9,10,1063696120,221531747,true +2017,9,10,4706248,2026174246,true +2017,9,10,2037494583,732885700,true +2017,9,10,1552841293,1325699720,false +2017,9,10,1444835733,1695311644,false +2017,9,10,1243248075,1158419254,false +2017,9,10,304981982,1607804215,false +2017,9,10,912572117,1991846783,false +2017,9,11,2087775054,324894813,false +2017,9,11,1972952743,740956607,false +2017,9,11,885597872,13499026,false 
+2017,9,11,26762218,742405033,true +2017,9,11,1345770107,680322777,true +2017,9,11,801698943,1914157601,false +2017,9,11,800737052,1022834775,true +2017,9,11,1526521832,1131780437,true +2017,9,11,61636584,1667571559,true +2017,9,11,1616780608,1494178574,true +2017,9,12,283947379,773967105,false +2017,9,12,2020833461,1276256630,true +2017,9,12,1906098238,265597439,false +2017,9,12,1214796796,733187774,false +2017,9,12,1351679220,2104206116,false +2017,9,12,973185601,459741809,false +2017,9,12,1826169266,897565687,true +2017,9,12,1385350016,1986989909,true +2017,9,12,836202514,160942536,true +2017,9,12,671960129,1731286310,false +2017,9,13,1392284140,739204479,true +2017,9,13,1054744119,1468780434,true +2017,9,13,864849672,1389948418,false +2017,9,13,1554567981,1827967037,true +2017,9,13,1872451643,972191322,false +2017,9,13,1087254339,1456008121,true +2017,9,13,545399968,1501651413,false +2017,9,13,2089445495,1514717619,true +2017,9,13,1297463287,986326494,true +2017,9,13,707354841,1349355202,true +2017,9,14,393195840,1667705928,true +2017,9,14,1345832461,1174858404,false +2017,9,14,100714294,3904047,false +2017,9,14,988558302,1228464067,false +2017,9,14,1371630646,927063753,false +2017,9,14,2052606225,1116954971,true +2017,9,14,1585661363,1637841592,false +2017,9,14,171041984,1616784284,true +2017,9,14,1256538950,52112321,true +2017,9,14,205836248,215989656,false +2017,9,15,777682939,1212185067,false +2017,9,15,1355725207,2134179553,false +2017,9,15,1548321905,73038456,false +2017,9,15,151910177,1818678048,false +2017,9,15,2050853593,500062778,true +2017,9,15,1181726076,464344077,true +2017,9,15,1202789895,306115627,false +2017,9,15,277779095,1016686219,false +2017,9,15,1296261892,1553865301,false +2017,9,15,310816041,791181796,true +2017,9,16,1207804143,1208141419,false +2017,9,16,452557961,976390218,false +2017,9,16,1351269640,1014725797,true +2017,9,16,1549334738,2136372777,true +2017,9,16,173502774,1604687545,true +2017,9,16,686233833,1693893112,true +2017,9,16,1189646707,242892297,false +2017,9,16,1450327875,643934338,true +2017,9,16,1156632791,2059101023,false +2017,9,16,930621315,314147299,false +2017,9,17,1423225153,1704611792,false +2017,9,17,1120147003,1531492776,false +2017,9,17,130068682,2027441554,false +2017,9,17,1659945342,560492762,false +2017,9,17,350875273,477343991,true +2017,9,17,589198344,1325998804,true +2017,9,17,1421767760,1820920006,true +2017,9,17,125279768,629972536,false +2017,9,17,723101032,817611428,true +2017,9,17,1454124643,1332402534,false +2017,9,18,715133565,1434118993,false +2017,9,18,380304030,124628529,false +2017,9,18,1684582702,932131811,true +2017,9,18,1174012558,420997846,false +2017,9,18,86000134,298179652,true +2017,9,18,1421722027,1829467740,false +2017,9,18,317594056,1614864543,false +2017,9,18,1829102252,2020676832,true +2017,9,18,581081933,1182112389,false +2017,9,18,1879887266,817280130,false +2017,9,19,639239534,2050965476,false +2017,9,19,1126663265,931309036,false +2017,9,19,1189600427,431429937,false +2017,9,19,1976918637,1165085606,true +2017,9,19,1917172366,1018117460,false +2017,9,19,1687317675,1473982908,false +2017,9,19,128761085,78769457,true +2017,9,19,757807735,28226390,false +2017,9,19,1886633464,2128910823,true +2017,9,19,1711076607,1024452185,false +2017,9,20,818018418,1096091675,false +2017,9,20,1639826152,951551280,true +2017,9,20,1144609099,522530792,true +2017,9,20,1239352744,145788223,true +2017,9,20,1672250259,645479500,true +2017,9,20,1101620000,559109304,true +2017,9,20,575215007,1415951916,false 
+2017,9,20,1291747676,1452246216,true +2017,9,20,426315924,1542669891,true +2017,9,20,532343671,1061764441,true +2017,9,21,1782576756,120315541,true +2017,9,21,1323212320,1892695511,true +2017,9,21,499536602,633493312,false +2017,9,21,1165002547,1132450489,true +2017,9,21,1517125851,1488645062,true +2017,9,21,1466083166,711526853,true +2017,9,21,1764601037,179789085,false +2017,9,21,2028260810,769613634,true +2017,9,21,814799587,493165096,true +2017,9,21,2127291716,1424704657,true +2017,9,22,31601631,2096923500,true +2017,9,22,32362329,1796316398,true +2017,9,22,389254983,42825758,true +2017,9,22,1567466847,1203185475,false +2017,9,22,1325020893,319531475,false +2017,9,22,978532453,1492767826,true +2017,9,22,1084539222,1104349572,true +2017,9,22,910270585,58642081,true +2017,9,22,937407094,858114980,true +2017,9,22,374597841,629968366,true +2017,9,23,652074274,1548178030,false +2017,9,23,405944752,157696534,false +2017,9,23,691560457,863802560,true +2017,9,23,2013567519,516309428,true +2017,9,23,1057755117,968336671,false +2017,9,23,2125943512,828671711,true +2017,9,23,1844857524,2067439534,true +2017,9,23,144601323,1982116054,false +2017,9,23,5772496,1680985927,false +2017,9,23,277197330,1678510968,false +2017,9,24,1866339994,1034819831,true +2017,9,24,433952166,142005717,true +2017,9,24,1812996526,1045037698,true +2017,9,24,2077242541,550434631,true +2017,9,24,1107846772,1313653701,false +2017,9,24,777334927,844338888,true +2017,9,24,567148633,780211214,true +2017,9,24,1479887677,1064184455,false +2017,9,24,1009257307,458726332,true +2017,9,24,224355575,206546409,false +2017,9,25,1356177410,314535883,true +2017,9,25,539654906,1103624822,false +2017,9,25,367979221,584028184,true +2017,9,25,625813430,1985952869,true +2017,9,25,821471974,124410489,false +2017,9,25,868933724,1055782338,true +2017,9,25,1212663582,997463048,false +2017,9,25,1072893160,703634277,true +2017,9,25,1291185207,1701207853,true +2017,9,25,1587343092,696728882,false +2017,9,26,1929598392,384270409,true +2017,9,26,434465005,343441241,true +2017,9,26,1328124489,1822622740,false +2017,9,26,913238669,911162513,true +2017,9,26,579555382,873759239,true +2017,9,26,115826848,581038469,true +2017,9,26,986028287,349250157,false +2017,9,26,68212099,70701025,true +2017,9,26,1903061828,248189681,true +2017,9,26,1434305751,1968645070,false +2017,9,27,1637083252,115768769,false +2017,9,27,615895139,2056265917,true +2017,9,27,1468293115,746763931,false +2017,9,27,1400886556,1873363653,true +2017,9,27,1522398190,2135848548,true +2017,9,27,1432757956,50055182,true +2017,9,27,1073836900,1803745560,true +2017,9,27,86328746,1299802613,false +2017,9,27,1948509439,1432417073,true +2017,9,27,1046906213,1221066738,true +2017,9,28,1744119883,1654811352,true +2017,9,28,45229226,1357719146,false +2017,9,28,1220517629,1210965634,true +2017,9,28,1947211144,1046016489,false +2017,9,28,1772095302,429364672,false +2017,9,28,1438546607,1315779458,false +2017,9,28,1543513765,2115818094,true +2017,9,28,1384675727,524512578,false +2017,9,28,2097719557,1438558088,true +2017,9,28,1097908847,1584225784,true +2017,9,29,1747204619,2072891463,false +2017,9,29,1800922836,323064261,false +2017,9,29,1300262620,1249496409,false +2017,9,29,1746375620,399030574,true +2017,9,29,1419846854,802234195,false +2017,9,29,1011275982,525548195,false +2017,9,29,748459406,1345420455,false +2017,9,29,1604328931,1867879099,false +2017,9,29,1976900697,783396553,false +2017,9,29,527702370,661796251,true +2017,9,30,1875414620,917841228,false +2017,9,30,1018449116,532205621,true 
+2017,9,30,14856420,850302735,false +2017,9,30,444013190,1619300973,true +2017,9,30,1547881039,815007974,true +2017,9,30,85243474,1773183809,false +2017,9,30,1239227915,673027635,false +2017,9,30,652336412,2145752708,true +2017,9,30,1347249090,1245262706,false +2017,9,30,206023035,390774744,false +2017,9,31,2141423655,789160872,false +2017,9,31,1053876494,137896159,true +2017,9,31,1709199923,1919209770,false +2017,9,31,221773403,637698268,false +2017,9,31,1143697836,1073084227,false +2017,9,31,1571740326,1786481217,true +2017,9,31,1434129483,390618524,true +2017,9,31,253460541,1926735579,true +2017,9,31,566818011,1375674229,false +2017,9,31,1965369648,2060340633,true +2017,10,1,219151403,223504395,false +2017,10,1,1462898546,38706466,true +2017,10,1,1786613161,107021001,true +2017,10,1,1101089490,1781007540,true +2017,10,1,296454649,1608116491,false +2017,10,1,2061751886,993489158,false +2017,10,1,1169981423,1366419617,false +2017,10,1,1579338155,366904610,true +2017,10,1,414089388,1368941912,false +2017,10,1,431369568,139519855,true +2017,10,2,1604724179,1390929197,true +2017,10,2,1567383083,1530931558,false +2017,10,2,963748116,1887263223,true +2017,10,2,2075921137,547445926,false +2017,10,2,1790554742,1644119212,false +2017,10,2,745947827,355439202,true +2017,10,2,3553069,1864150416,true +2017,10,2,1120934336,720329312,false +2017,10,2,380056799,756404711,false +2017,10,2,945747054,599485697,false +2017,10,3,161408421,1796023050,false +2017,10,3,575394556,1952452619,false +2017,10,3,880928552,2059932651,false +2017,10,3,508747118,241633318,false +2017,10,3,282090411,1413155176,true +2017,10,3,491796354,388634968,true +2017,10,3,1985287656,64151302,true +2017,10,3,1731896330,2038351573,true +2017,10,3,470409467,2059314404,true +2017,10,3,777127000,851394313,false +2017,10,4,1447832544,1143788805,false +2017,10,4,280173431,1369815713,true +2017,10,4,977396775,141364554,true +2017,10,4,1588887145,1695555068,false +2017,10,4,209235433,1714991276,false +2017,10,4,2048987558,386396758,false +2017,10,4,1113257768,1046314391,false +2017,10,4,1108554115,584638585,false +2017,10,4,839332637,1205500022,true +2017,10,4,81688157,2005354727,false +2017,10,5,1478092325,373051602,false +2017,10,5,1638678159,494826936,false +2017,10,5,1906746693,1709110415,true +2017,10,5,449157017,487506074,true +2017,10,5,1852167299,12593194,true +2017,10,5,1751782671,1177023415,false +2017,10,5,1732466052,1774584847,true +2017,10,5,1677881123,2000336498,true +2017,10,5,1204259886,2133653073,false +2017,10,5,714388446,162533179,true +2017,10,6,423735618,2011016568,false +2017,10,6,1028734321,1576171998,false +2017,10,6,2010475302,449047577,false +2017,10,6,1819790627,1577506339,false +2017,10,6,324707162,1526007410,true +2017,10,6,938274239,1235883168,true +2017,10,6,1887279948,328106572,true +2017,10,6,1686923670,434378741,true +2017,10,6,2069753223,251875013,true +2017,10,6,1302789998,1711886079,true +2017,10,7,791707759,358489297,false +2017,10,7,1328670455,830417059,true +2017,10,7,581764620,405854872,true +2017,10,7,1932914845,1625766961,true +2017,10,7,235505810,1600193276,false +2017,10,7,1325100492,1621158302,true +2017,10,7,1350256035,1190963176,true +2017,10,7,1106667014,1109908227,true +2017,10,7,645712513,1435543862,false +2017,10,7,743193178,2047952158,true +2017,10,8,1225159081,537049932,false +2017,10,8,1046907931,357799627,true +2017,10,8,335814292,750553038,true +2017,10,8,346969319,676956080,true +2017,10,8,2035084936,1298964456,false +2017,10,8,1925868897,1274459280,false 
+2017,10,8,1745264066,1196423966,false +2017,10,8,490509614,1918452736,true +2017,10,8,92612196,1553376943,false +2017,10,8,317405709,473495671,true +2017,10,9,1526837993,1667819746,true +2017,10,9,1055001891,159100137,true +2017,10,9,654569482,1127334109,true +2017,10,9,134397149,569118577,false +2017,10,9,2081630812,531113685,true +2017,10,9,508966633,1840746939,true +2017,10,9,2090205717,1197674932,false +2017,10,9,937392440,633964835,false +2017,10,9,2108156846,1284759182,true +2017,10,9,1742635536,523004108,false +2017,10,10,1792196194,358060918,false +2017,10,10,828714633,1728015894,false +2017,10,10,511920574,1773103770,false +2017,10,10,658185236,610473026,true +2017,10,10,1893523895,229862308,true +2017,10,10,2013314265,1656536608,true +2017,10,10,1473189124,2027120858,true +2017,10,10,2046693394,154750239,true +2017,10,10,2107630511,1248971381,false +2017,10,10,1085646508,1953325843,false +2017,10,11,1571609627,622067230,true +2017,10,11,220273101,2072699021,false +2017,10,11,975114986,1987277209,true +2017,10,11,592882718,1811899300,false +2017,10,11,2115307151,904038147,true +2017,10,11,692045793,1376004972,false +2017,10,11,1394456946,1985953328,false +2017,10,11,310524733,1310810782,true +2017,10,11,200324584,1213411850,true +2017,10,11,1655228742,91062298,true +2017,10,12,1199953264,2061062661,false +2017,10,12,576631209,1496102472,true +2017,10,12,1082628343,684080890,true +2017,10,12,1858241032,258957955,false +2017,10,12,1706897936,1941092368,false +2017,10,12,34712426,1755030547,false +2017,10,12,1976769888,2078793592,false +2017,10,12,1581531974,1099679337,false +2017,10,12,13561009,1825349675,false +2017,10,12,197712658,1019645065,true +2017,10,13,1598860136,1078523108,true +2017,10,13,516342376,772848628,true +2017,10,13,456241746,658476563,false +2017,10,13,1814177556,627643250,true +2017,10,13,782030849,1593714551,true +2017,10,13,1567337381,1230845565,false +2017,10,13,943800633,656938914,true +2017,10,13,1360267363,1063424072,true +2017,10,13,74299668,1257902746,false +2017,10,13,991259573,1177606551,true +2017,10,14,729545446,1871187531,false +2017,10,14,57042511,1501778654,true +2017,10,14,2110441486,427459209,false +2017,10,14,1612549330,304871520,true +2017,10,14,1434161719,489959383,true +2017,10,14,832007068,759334988,true +2017,10,14,1512898222,186646313,true +2017,10,14,1693008629,576533254,false +2017,10,14,380772829,1997257310,false +2017,10,14,1894715620,925813789,false +2017,10,15,131209742,1784815655,true +2017,10,15,1432945487,1090412189,true +2017,10,15,1869821694,804814729,false +2017,10,15,919033903,2081272253,false +2017,10,15,831139823,709518930,true +2017,10,15,1253655302,1066574624,false +2017,10,15,1878694956,1840468743,false +2017,10,15,1777235409,321704477,true +2017,10,15,889127497,1156748857,false +2017,10,15,1191164209,968096709,false +2017,10,16,842592390,2034364493,false +2017,10,16,130245252,413215144,false +2017,10,16,551337235,1836240904,true +2017,10,16,528597014,122062214,false +2017,10,16,376461101,853062546,true +2017,10,16,193237702,861393515,false +2017,10,16,427035216,1777787040,true +2017,10,16,1594732963,381437933,true +2017,10,16,727304938,1031520778,false +2017,10,16,1702612852,791504512,true +2017,10,17,1319741531,31033335,true +2017,10,17,308960043,1406428191,true +2017,10,17,1592123934,917011809,true +2017,10,17,1820421031,587175612,false +2017,10,17,965734004,1042788527,true +2017,10,17,1758782562,319608364,false +2017,10,17,1844580074,1857642108,false +2017,10,17,469521359,437698969,true 
+2017,10,17,1092283274,705484025,false +2017,10,17,1592182171,2014721174,false +2017,10,18,868067231,1323012972,true +2017,10,18,1816900111,1872467558,false +2017,10,18,1983908683,1772424363,true +2017,10,18,1238018368,607984160,false +2017,10,18,1841841000,479132851,true +2017,10,18,682238936,911151277,true +2017,10,18,113143146,2043183264,false +2017,10,18,1722832019,53290457,true +2017,10,18,1246574390,1745249723,true +2017,10,18,1431944393,782850021,true +2017,10,19,889543472,265764689,false +2017,10,19,104133540,250461544,false +2017,10,19,1996898029,1435964563,true +2017,10,19,1082348230,102649690,false +2017,10,19,1634895687,781954738,false +2017,10,19,1479959698,1371602330,true +2017,10,19,1065459405,1643908332,false +2017,10,19,1894436695,1137500761,true +2017,10,19,2008155074,709171022,true +2017,10,19,830823942,601982249,true +2017,10,20,1181108515,1623368275,false +2017,10,20,1152221763,1301305504,true +2017,10,20,1218845296,176358566,false +2017,10,20,761498861,775685583,true +2017,10,20,934990896,50542001,true +2017,10,20,686731629,210321157,true +2017,10,20,270889826,1143830215,false +2017,10,20,804259189,275629964,false +2017,10,20,2078952538,1305623041,true +2017,10,20,249684463,771600485,true +2017,10,21,475149755,1692737834,true +2017,10,21,1339401034,1237944884,true +2017,10,21,1322642,1121639504,false +2017,10,21,1279891061,1562839760,true +2017,10,21,1821183672,2146579152,false +2017,10,21,1811690771,1093169468,false +2017,10,21,1886590725,764799738,false +2017,10,21,1384414939,946167026,true +2017,10,21,1045646080,1839162270,true +2017,10,21,1759940391,941417925,true +2017,10,22,697298951,1003280419,true +2017,10,22,2028098826,413079493,true +2017,10,22,552562221,1923236599,false +2017,10,22,403497365,1106877162,true +2017,10,22,1556983100,1272050548,true +2017,10,22,443867920,2022559862,false +2017,10,22,2080896154,1561396120,false +2017,10,22,1428417875,1346838599,false +2017,10,22,685498120,1431058200,true +2017,10,22,371781247,597630612,false +2017,10,23,1393044336,2100379833,false +2017,10,23,1092547781,1257680586,true +2017,10,23,1057794618,98976759,false +2017,10,23,786826504,488825472,true +2017,10,23,1046170348,55736417,false +2017,10,23,1544782706,1696668606,false +2017,10,23,799338485,954034481,false +2017,10,23,704865880,16678929,false +2017,10,23,96091891,1579824905,true +2017,10,23,1564174594,1168973334,false +2017,10,24,21213663,1385616093,false +2017,10,24,1774946098,457628039,false +2017,10,24,1520486872,1476866164,true +2017,10,24,778208124,806375926,true +2017,10,24,917503628,1464220675,false +2017,10,24,693037518,205067954,true +2017,10,24,1906240876,1105969614,false +2017,10,24,492516542,180528776,false +2017,10,24,318435885,1083359192,true +2017,10,24,866797425,2118943843,false +2017,10,25,451397192,1400346884,false +2017,10,25,1172152124,153887229,false +2017,10,25,170760053,826517049,false +2017,10,25,878921549,80091156,true +2017,10,25,544687447,448494947,false +2017,10,25,532350461,1678845922,false +2017,10,25,1827714252,446187370,false +2017,10,25,232406683,2123391955,true +2017,10,25,682390439,2028041588,false +2017,10,25,332393427,537505189,true +2017,10,26,469684664,1011049232,true +2017,10,26,2001122807,1935738485,true +2017,10,26,1636183153,88000023,false +2017,10,26,752668122,61081960,true +2017,10,26,1425997960,22251116,false +2017,10,26,847753212,2091312120,false +2017,10,26,1896642918,288130783,true +2017,10,26,1940297012,1488209669,true +2017,10,26,1776367683,417291738,false +2017,10,26,612363553,1620750904,false 
+2017,10,27,1696136509,1898366893,false +2017,10,27,1857362947,473771874,false +2017,10,27,775637712,609619338,true +2017,10,27,1845602514,408266198,true +2017,10,27,1575295098,611419024,true +2017,10,27,418865575,949513719,false +2017,10,27,1740096359,1712909693,true +2017,10,27,951356370,1514687334,true +2017,10,27,1573172352,824040668,false +2017,10,27,247614010,474448029,false +2017,10,28,799422776,1422214339,true +2017,10,28,1747177425,1292753112,false +2017,10,28,1535172826,1208049288,false +2017,10,28,692585800,1869045145,true +2017,10,28,409395802,1234296819,true +2017,10,28,1686229065,1671856937,false +2017,10,28,1847439047,2017738446,true +2017,10,28,1561032660,945617897,false +2017,10,28,33649398,1973264107,false +2017,10,28,535397534,1205707851,true +2017,10,29,381351684,1440407294,true +2017,10,29,1664370056,3746254,false +2017,10,29,1375311028,1245795786,false +2017,10,29,557686646,575061330,false +2017,10,29,1777333726,650773666,false +2017,10,29,256202043,1992711578,true +2017,10,29,60916326,1319673131,false +2017,10,29,1584942934,219062496,true +2017,10,29,733866359,431785944,true +2017,10,29,500231337,1601399996,true +2017,10,30,335909876,1482845604,false +2017,10,30,344841342,1746686418,false +2017,10,30,171843160,443839212,false +2017,10,30,1285209966,1234851081,true +2017,10,30,1196541575,8626814,false +2017,10,30,1647288327,2112315948,true +2017,10,30,566498872,1866518325,true +2017,10,30,208244036,37389389,true +2017,10,30,1501859066,1385873334,false +2017,10,30,2114195514,557833795,true +2017,10,31,596423948,52206386,true +2017,10,31,399977342,2036484626,true +2017,10,31,1369470078,1230346036,false +2017,10,31,1972100659,1631981610,true +2017,10,31,209690704,867345892,true +2017,10,31,200742341,1551065008,true +2017,10,31,1771596172,93759818,true +2017,10,31,2032885943,708749150,true +2017,10,31,910179476,2121616238,false +2017,10,31,1055973905,778744470,false +2017,11,1,505981200,1930517617,false +2017,11,1,283822162,1007292597,false +2017,11,1,1045834,1300778732,false +2017,11,1,1357722603,580639960,false +2017,11,1,1559275988,144271423,false +2017,11,1,1706481845,705432196,true +2017,11,1,1360247502,2040941969,false +2017,11,1,1144064737,1260713977,true +2017,11,1,347114026,627249068,false +2017,11,1,159327054,142786832,false +2017,11,2,816505527,1655867141,false +2017,11,2,356288586,1426044410,false +2017,11,2,1472646622,1992230534,false +2017,11,2,1186856745,954924856,false +2017,11,2,904779357,176879357,false +2017,11,2,980912188,104918179,false +2017,11,2,1278310491,1856380466,false +2017,11,2,721533447,911818592,false +2017,11,2,459226027,640547867,true +2017,11,2,1739041227,571473681,false +2017,11,3,1520946973,1508775333,true +2017,11,3,1630912189,1675995297,false +2017,11,3,26903591,497158152,true +2017,11,3,1264099972,1197724032,false +2017,11,3,1210077271,679618723,true +2017,11,3,1827515599,1319846959,true +2017,11,3,1905724991,1820660944,true +2017,11,3,1992882473,232993627,false +2017,11,3,109095397,699873846,true +2017,11,3,1360239949,2057893642,true +2017,11,4,1060062423,2029837182,true +2017,11,4,319980462,899869570,false +2017,11,4,1116454004,1712109170,false +2017,11,4,1062605038,575474665,false +2017,11,4,893348112,1982859256,false +2017,11,4,593970292,86233819,false +2017,11,4,1118041840,94406503,false +2017,11,4,868828010,198859108,false +2017,11,4,1344270938,800843293,true +2017,11,4,731415236,724019484,false +2017,11,5,333878830,1519237479,false +2017,11,5,782426141,174608480,false +2017,11,5,2143006909,369181725,false 
+2017,11,5,1324825231,301174743,true +2017,11,5,1572771035,1335536250,true +2017,11,5,1711690714,37584576,false +2017,11,5,1783242084,887522840,true +2017,11,5,624050021,1570985927,true +2017,11,5,1603557284,558162206,true +2017,11,5,1489188876,2112770634,true +2017,11,6,1376421226,151791354,true +2017,11,6,1013427892,1367936090,false +2017,11,6,82928040,10348100,true +2017,11,6,703501075,2067802072,false +2017,11,6,494729962,1202463382,true +2017,11,6,1215073602,2138597759,false +2017,11,6,1519230235,2144190087,true +2017,11,6,1638066654,278065856,true +2017,11,6,146946114,2092172546,false +2017,11,6,1922477037,160311217,true +2017,11,7,1183188027,2125258415,true +2017,11,7,1513250501,767690885,true +2017,11,7,1514398119,1985780531,true +2017,11,7,1381642145,1774144068,true +2017,11,7,1202894456,637581158,false +2017,11,7,1072781738,1514593471,true +2017,11,7,583717102,1590788313,false +2017,11,7,1628264602,1417993200,true +2017,11,7,1207223776,1622881338,false +2017,11,7,364344655,237943112,true +2017,11,8,1408849817,1889929084,false +2017,11,8,1768056028,59557276,true +2017,11,8,597838025,258918867,true +2017,11,8,1585950202,1214693033,true +2017,11,8,1979425569,723268312,true +2017,11,8,283159024,1287170924,false +2017,11,8,1630632897,1781919606,false +2017,11,8,247467335,1262467268,true +2017,11,8,1349037956,1424344694,true +2017,11,8,1310409185,387972109,false +2017,11,9,1881800635,64364474,false +2017,11,9,704401229,1947571138,true +2017,11,9,822244366,2017894539,true +2017,11,9,1825892214,1703059008,true +2017,11,9,1903526003,489405577,true +2017,11,9,1606302965,1157397824,false +2017,11,9,126105196,261807637,false +2017,11,9,1223548337,433973414,true +2017,11,9,497504034,1160606298,true +2017,11,9,892450711,1266154933,true +2017,11,10,641347567,242910260,true +2017,11,10,622984840,1122357195,true +2017,11,10,2080161128,257095341,false +2017,11,10,2000794070,1036949031,true +2017,11,10,502657309,1195879111,true +2017,11,10,176311523,406877813,false +2017,11,10,921584758,1126448583,true +2017,11,10,720277950,633777679,true +2017,11,10,1331113854,1344496712,true +2017,11,10,1685497303,1472891950,true +2017,11,11,1835779196,862684079,true +2017,11,11,1403893788,1002429834,true +2017,11,11,93383076,709943016,true +2017,11,11,573664545,2065638928,true +2017,11,11,848062546,758498410,false +2017,11,11,1171073309,1013833665,true +2017,11,11,1175391612,892697326,true +2017,11,11,1964527845,1667104209,false +2017,11,11,1423243626,96108403,false +2017,11,11,1722126890,1580775834,true +2017,11,12,652807381,1202180685,true +2017,11,12,1392148647,938951439,true +2017,11,12,578662390,1974264445,true +2017,11,12,51390723,2108678304,true +2017,11,12,304221231,140438917,false +2017,11,12,373158794,240937693,true +2017,11,12,1569901398,2141042541,true +2017,11,12,257761306,1389653553,true +2017,11,12,664246214,1716396477,false +2017,11,12,1824168393,840542606,false +2017,11,13,1772715716,1228321170,true +2017,11,13,942320690,1065396967,false +2017,11,13,694946526,2094093341,true +2017,11,13,453219700,1766589120,false +2017,11,13,1902968387,1668609575,false +2017,11,13,708564793,304184904,true +2017,11,13,1760695632,318807665,true +2017,11,13,1688458860,518265773,true +2017,11,13,616413125,669121548,false +2017,11,13,503946339,59805361,true +2017,11,14,1390611286,311529723,false +2017,11,14,2106897825,1693334446,true +2017,11,14,482280604,1465851584,true +2017,11,14,998246791,593763712,true +2017,11,14,2067393427,1452772761,false +2017,11,14,892496778,594290449,false 
+2017,11,14,306044707,766083234,true +2017,11,14,1215062462,1069280129,true +2017,11,14,1235830630,377223192,true +2017,11,14,1516994021,242073629,true +2017,11,15,412444845,1653151849,true +2017,11,15,969964858,1420865948,false +2017,11,15,965376369,988428248,true +2017,11,15,320700593,507615843,false +2017,11,15,2078034862,837223905,false +2017,11,15,1524294245,2018900413,false +2017,11,15,26793572,1516947105,true +2017,11,15,1220724328,1628367943,true +2017,11,15,1090012713,906791400,false +2017,11,15,1961514042,452224003,false +2017,11,16,319538191,1866713284,false +2017,11,16,1107417986,607029966,false +2017,11,16,1903388761,911031720,false +2017,11,16,537317864,1812243500,false +2017,11,16,1257169576,153938550,true +2017,11,16,1050208551,1600707497,true +2017,11,16,936956836,932996374,true +2017,11,16,64570277,734118720,true +2017,11,16,774878297,12025449,false +2017,11,16,1220827956,588966275,false +2017,11,17,204801469,658929836,true +2017,11,17,1940279504,455307645,false +2017,11,17,1190948622,960174509,true +2017,11,17,534865265,1584203400,true +2017,11,17,123206680,159203939,true +2017,11,17,224926538,1492612228,true +2017,11,17,318760061,904014022,true +2017,11,17,1927955472,1652140343,false +2017,11,17,1860549523,819341812,false +2017,11,17,840293398,384007618,true +2017,11,18,738372726,408500749,false +2017,11,18,1103319913,2093182825,false +2017,11,18,1079773282,1216842831,true +2017,11,18,481795376,1150743840,true +2017,11,18,1640103277,820443468,true +2017,11,18,1902467829,887452577,true +2017,11,18,1312447035,1683700316,true +2017,11,18,474975403,1050575855,false +2017,11,18,983621446,965875862,false +2017,11,18,201500552,692150422,true +2017,11,19,1420515524,1848736497,false +2017,11,19,187278606,694250510,true +2017,11,19,1266197600,95789668,true +2017,11,19,177328339,584249087,false +2017,11,19,2058185786,483470356,true +2017,11,19,1376102128,1678964574,true +2017,11,19,58316005,909865206,true +2017,11,19,2079475226,1535162782,false +2017,11,19,1174548067,836689778,true +2017,11,19,1477353426,856722279,true +2017,11,20,168228944,1978124008,false +2017,11,20,509869618,578720656,true +2017,11,20,727893941,2020478496,true +2017,11,20,787380797,964635980,true +2017,11,20,804710719,1911571731,true +2017,11,20,495773375,970837203,false +2017,11,20,903447752,1777647505,false +2017,11,20,230904605,927039980,false +2017,11,20,1979490171,2126756263,true +2017,11,20,1359212203,360452377,true +2017,11,21,669986810,1606107096,true +2017,11,21,2073006172,1859229274,true +2017,11,21,1833868357,1756333190,true +2017,11,21,412272617,1482804428,true +2017,11,21,1809120097,1248772383,false +2017,11,21,321713347,241695872,true +2017,11,21,273704164,1496485091,true +2017,11,21,1934933100,1082999906,true +2017,11,21,1696204534,1804010599,false +2017,11,21,2098605325,2103707663,false +2017,11,22,1563735916,2106393628,true +2017,11,22,818557538,73386467,false +2017,11,22,185454335,1945396610,true +2017,11,22,2143392333,1381971882,false +2017,11,22,1720673745,1991733716,true +2017,11,22,1042660278,1185332408,false +2017,11,22,921636541,206366286,true +2017,11,22,51060098,1408767571,true +2017,11,22,198837036,1201849685,false +2017,11,22,297848005,818560950,true +2017,11,23,1756423000,330044723,false +2017,11,23,931842169,1718558998,false +2017,11,23,541511500,1773561893,true +2017,11,23,705394272,1548210539,false +2017,11,23,1622555399,519236320,true +2017,11,23,485480419,1592425716,true +2017,11,23,1596034176,1513131495,true +2017,11,23,1873526530,260518568,true 
+2017,11,23,1218829444,1498020674,false +2017,11,23,1003539126,1658554642,true +2017,11,24,1152265124,43531636,true +2017,11,24,454981459,1821132623,false +2017,11,24,161977008,469341637,true +2017,11,24,621996745,1255381283,true +2017,11,24,1060207211,2081935162,true +2017,11,24,1883547101,877190495,false +2017,11,24,792021122,1176023247,false +2017,11,24,1012644523,1280533523,false +2017,11,24,1608007660,1050270932,true +2017,11,24,1025369613,1167951421,true +2017,11,25,1936793909,1199357050,true +2017,11,25,1227472068,1510666997,true +2017,11,25,1556444015,841080299,true +2017,11,25,1874941550,1169223237,true +2017,11,25,2028198066,1291434129,true +2017,11,25,209276847,1708530735,true +2017,11,25,519076217,1884614221,true +2017,11,25,1236529697,376004242,false +2017,11,25,1038415932,966994381,true +2017,11,25,746429399,935581413,true +2017,11,26,1978176277,1723758923,false +2017,11,26,256143776,1103385866,true +2017,11,26,66628443,1755040337,false +2017,11,26,775212624,1776612003,true +2017,11,26,1128983398,1297073752,false +2017,11,26,131731124,331636397,true +2017,11,26,157630730,1079710051,false +2017,11,26,716840185,1572405070,true +2017,11,26,1308421995,1495835360,false +2017,11,26,191154100,854958110,false +2017,11,27,1768203556,845475437,true +2017,11,27,762269647,236351821,false +2017,11,27,1816976401,1193452902,false +2017,11,27,547028479,2136848498,false +2017,11,27,1743019533,2088850229,true +2017,11,27,631950228,1671253492,false +2017,11,27,976662143,1428145633,true +2017,11,27,581536701,1665919545,false +2017,11,27,218159773,567175512,true +2017,11,27,63330720,1650584625,true +2017,11,28,812901021,1241532248,true +2017,11,28,1789963474,120239820,false +2017,11,28,1116675154,902926092,true +2017,11,28,594946311,1774687610,true +2017,11,28,670084070,61025980,true +2017,11,28,1427724067,594302934,true +2017,11,28,794429878,1899001623,true +2017,11,28,896976109,1673073398,false +2017,11,28,1308726384,632275718,true +2017,11,28,196861453,137347876,false +2017,11,29,678884621,842860554,false +2017,11,29,1445198886,1120891626,false +2017,11,29,1585491514,1818088081,false +2017,11,29,2134506786,462169265,false +2017,11,29,629089092,842782403,false +2017,11,29,1185543698,2009436775,true +2017,11,29,1678885557,155564474,true +2017,11,29,1125361191,734853869,false +2017,11,29,1624399638,1720080419,false +2017,11,29,1887236279,168900403,false +2017,11,30,1920019605,2125489846,false +2017,11,30,1142894155,1418805582,true +2017,11,30,1837687937,1637819916,true +2017,11,30,266036255,2111882741,true +2017,11,30,622938074,355129012,false +2017,11,30,1600829097,452220186,true +2017,11,30,430518377,390067102,true +2017,11,30,1871808245,2001658020,false +2017,11,30,1816568857,814204979,false +2017,11,30,1667623936,937918579,false +2017,11,31,13664394,1184238770,false +2017,11,31,1464103003,798718258,true +2017,11,31,1280009488,440245926,false +2017,11,31,1170766257,519002020,false +2017,11,31,570625955,2055961751,true +2017,11,31,997610255,2044994676,false +2017,11,31,154565994,1886237510,false +2017,11,31,2033896442,751850667,false +2017,11,31,118184417,543176716,true +2017,11,31,1820102078,829143381,false +2018,1,1,1321447376,673025847,false +2018,1,1,227802482,2073713927,true +2018,1,1,1290663306,473425534,false +2018,1,1,889425063,2017691708,true +2018,1,1,947225137,1333906142,false +2018,1,1,1070140357,976428484,false +2018,1,1,1740753899,282994307,false +2018,1,1,1388034164,1918285499,false +2018,1,1,1263701842,267226754,true +2018,1,1,179220401,456878532,false 
+2018,1,2,1626521220,790391190,false +2018,1,2,1906699861,2058521465,false +2018,1,2,201581208,1732833314,false +2018,1,2,984642624,741606992,false +2018,1,2,701957233,1490141973,false +2018,1,2,1299004767,1336718946,true +2018,1,2,433736213,2082187926,false +2018,1,2,735897927,1213895135,false +2018,1,2,1439550492,399394425,false +2018,1,2,160359716,175825165,false +2018,1,3,1075628754,377547241,true +2018,1,3,1046482207,756511013,false +2018,1,3,454552865,1468059861,true +2018,1,3,1644220819,1613566830,true +2018,1,3,355115869,821831926,false +2018,1,3,1492797859,776072799,false +2018,1,3,411154050,1443846545,false +2018,1,3,2010255378,688306365,false +2018,1,3,1085245587,1063672897,true +2018,1,3,108241003,154662622,false +2018,1,4,1543972108,1421289063,true +2018,1,4,422050266,1344819366,false +2018,1,4,1235814574,1967515160,true +2018,1,4,2042094152,755704531,false +2018,1,4,471469578,59997855,false +2018,1,4,524160915,1956518066,false +2018,1,4,814235794,958074859,true +2018,1,4,124915199,1981084506,false +2018,1,4,643566120,1768339847,true +2018,1,4,2028847664,1450588661,true +2018,1,5,462638932,79561960,false +2018,1,5,48378441,979399369,false +2018,1,5,599010474,1744487477,false +2018,1,5,301901068,1522222842,false +2018,1,5,1779315724,622496219,true +2018,1,5,1502244619,750756667,false +2018,1,5,1995100878,1617132161,true +2018,1,5,1482081606,5641708,true +2018,1,5,937624029,1252060740,true +2018,1,5,1433203154,207054679,false +2018,1,6,1889802110,589197118,false +2018,1,6,1029055066,439108358,false +2018,1,6,1035834946,1357574842,true +2018,1,6,53348754,2016203110,false +2018,1,6,770422587,637075384,true +2018,1,6,1282900993,1619978045,true +2018,1,6,991378843,1956028544,false +2018,1,6,1606593952,671810709,false +2018,1,6,1439395910,672058995,false +2018,1,6,368847178,1366773547,true +2018,1,7,1455616504,997740134,false +2018,1,7,195367918,3216519,false +2018,1,7,987179792,1381499170,true +2018,1,7,103746809,342212159,false +2018,1,7,1768628261,977293026,false +2018,1,7,1326854790,568037151,true +2018,1,7,527988452,878697957,false +2018,1,7,1424997153,1748535844,true +2018,1,7,710811382,1019741038,false +2018,1,7,385632329,439196812,false +2018,1,8,55387859,135898056,true +2018,1,8,116651005,255859311,true +2018,1,8,473875668,2024624421,false +2018,1,8,867420393,1769595286,true +2018,1,8,1164122849,413317752,true +2018,1,8,1070241811,1282316429,true +2018,1,8,720553865,770880153,false +2018,1,8,1575569443,1085509671,false +2018,1,8,1723695470,1163302216,false +2018,1,8,442603916,872963051,true +2018,1,9,683201096,2084922103,false +2018,1,9,1217413186,1612993053,false +2018,1,9,2057182439,136963428,true +2018,1,9,1524887246,1055678055,true +2018,1,9,602701312,1183997535,true +2018,1,9,495177681,61078095,true +2018,1,9,180836782,1661990740,true +2018,1,9,24363611,1906984183,false +2018,1,9,2078709731,116798333,false +2018,1,9,328452768,867903454,false +2018,1,10,1704083526,1064720309,false +2018,1,10,1053842415,1147521176,false +2018,1,10,668754481,873652844,true +2018,1,10,468826619,1251100606,false +2018,1,10,2122682308,1407720243,true +2018,1,10,866025172,1581091706,true +2018,1,10,2027993931,1269974220,true +2018,1,10,1365157925,227713156,true +2018,1,10,408521842,569075146,false +2018,1,10,504183177,1120773738,false +2018,1,11,1450596084,915585825,true +2018,1,11,409195291,594498674,false +2018,1,11,1027493573,1080626493,false +2018,1,11,154165907,716470912,false +2018,1,11,1304065514,999234416,false +2018,1,11,344302215,271091136,false +2018,1,11,1209048705,2024239831,false 
+2018,1,11,1390755491,1559925513,true +2018,1,11,1951647654,1312955669,false +2018,1,11,1845591298,264440109,true +2018,1,12,724280875,1683419234,false +2018,1,12,162335721,1059365815,true +2018,1,12,1950442095,470859239,false +2018,1,12,763648143,1663259335,false +2018,1,12,1072958128,1033860197,false +2018,1,12,282881720,1086910216,false +2018,1,12,1520565669,997088398,true +2018,1,12,1520509524,1392150775,false +2018,1,12,1364407123,231483810,true +2018,1,12,2014182236,1577033972,false +2018,1,13,1809046092,1903199268,false +2018,1,13,1440962971,692120016,true +2018,1,13,593481180,1465599439,false +2018,1,13,271669559,1445980635,false +2018,1,13,690095576,1118283019,false +2018,1,13,996899337,747695769,false +2018,1,13,1604508415,1411210131,false +2018,1,13,236436258,1389487534,false +2018,1,13,1775493944,1391277838,true +2018,1,13,911216925,807066009,false +2018,1,14,1074188043,1525730244,false +2018,1,14,1027713283,268409692,false +2018,1,14,1629496869,1303314620,true +2018,1,14,717724396,689478179,true +2018,1,14,1455985364,163787232,false +2018,1,14,657544875,1868406911,true +2018,1,14,1182575789,1114800129,false +2018,1,14,883196444,1978020577,false +2018,1,14,2012770531,1394337738,false +2018,1,14,551551987,155069768,true +2018,1,15,608756475,555397126,false +2018,1,15,1009639198,1298861669,false +2018,1,15,763050716,298319046,true +2018,1,15,1651253782,1881648296,false +2018,1,15,1277205865,472229962,true +2018,1,15,68997808,1167388141,false +2018,1,15,692728623,747715511,true +2018,1,15,1508457041,475665601,true +2018,1,15,1048679624,1977536769,true +2018,1,15,465541011,1334140535,false +2018,1,16,1328187139,511122584,true +2018,1,16,44683705,1338807180,false +2018,1,16,1192620271,631613613,true +2018,1,16,926459784,637460486,true +2018,1,16,1870357599,1089060152,true +2018,1,16,463914588,1000823756,true +2018,1,16,1714064770,1428707068,false +2018,1,16,415338385,1490496307,true +2018,1,16,1951135899,1951153466,true +2018,1,16,185524636,380870194,true +2018,1,17,1154999557,806127841,true +2018,1,17,268848998,754467063,true +2018,1,17,686351111,180689228,true +2018,1,17,936156104,81656530,false +2018,1,17,980471329,473453046,true +2018,1,17,458711094,1846161591,false +2018,1,17,247130074,425746855,true +2018,1,17,1160727804,1391851206,false +2018,1,17,1495430965,587206416,true +2018,1,17,14007968,1937490065,false +2018,1,18,1357311837,1459677128,true +2018,1,18,1176520507,1401897505,true +2018,1,18,2073381921,221087268,true +2018,1,18,924685296,277373926,false +2018,1,18,1457626456,1590869870,true +2018,1,18,94606794,445635095,false +2018,1,18,963151,576700250,false +2018,1,18,1016158878,1817062441,false +2018,1,18,2102856893,1210232531,false +2018,1,18,2132237776,2026394474,false +2018,1,19,1162856194,1223316336,true +2018,1,19,1971362309,2118557670,true +2018,1,19,1889631124,1860921902,true +2018,1,19,586423340,1211226392,true +2018,1,19,1282771939,1398680401,true +2018,1,19,1530565985,985768982,true +2018,1,19,617607975,978499177,false +2018,1,19,1509637270,945269570,false +2018,1,19,934489546,2105863413,true +2018,1,19,334400011,479847330,false +2018,1,20,556808303,2146815776,false +2018,1,20,705218041,155724384,true +2018,1,20,35606986,286495100,true +2018,1,20,1950898424,1535009077,true +2018,1,20,1085116606,126200794,false +2018,1,20,1918900239,1047247524,true +2018,1,20,2000262271,2079318150,false +2018,1,20,1262597999,454893383,true +2018,1,20,694167213,2031157388,true +2018,1,20,1787491902,1376652204,false +2018,1,21,59090844,601345238,false 
+2018,1,21,635781898,2104953447,true +2018,1,21,1688148809,1421022538,true +2018,1,21,603806643,1786810476,false +2018,1,21,1088272816,1187573287,true +2018,1,21,1191318768,871207369,true +2018,1,21,168398713,997445968,false +2018,1,21,2095568062,1656959834,true +2018,1,21,456004664,1240568910,true +2018,1,21,1296805970,1299094032,true +2018,1,22,59322803,1916179761,false +2018,1,22,792025453,248681179,true +2018,1,22,2070420459,58470289,false +2018,1,22,1012623395,516142685,false +2018,1,22,1116412896,1608695902,false +2018,1,22,1658905964,928600558,true +2018,1,22,551755585,325651015,true +2018,1,22,968726556,850776984,false +2018,1,22,1339166092,1924791770,true +2018,1,22,333072667,499927319,true +2018,1,23,861249350,103387057,true +2018,1,23,300305647,1420739071,false +2018,1,23,2073513555,223720508,true +2018,1,23,1524166641,19899279,false +2018,1,23,1263664798,1040948247,true +2018,1,23,1624651722,268847340,false +2018,1,23,1936573576,1833317606,true +2018,1,23,587110417,824521898,false +2018,1,23,1258144472,2012910119,true +2018,1,23,1117915408,578334955,false +2018,1,24,180685863,453740165,true +2018,1,24,1833645866,682791305,false +2018,1,24,651351031,1643065738,false +2018,1,24,271814228,127987708,false +2018,1,24,1111529981,1750656712,false +2018,1,24,333351060,1621605121,false +2018,1,24,1558079039,1456892616,true +2018,1,24,398274454,1073275941,false +2018,1,24,1301886742,1971937125,false +2018,1,24,588972200,1257595850,true +2018,1,25,711031357,1510124639,false +2018,1,25,1169532865,102123581,false +2018,1,25,128802919,756877897,true +2018,1,25,724640097,13901548,true +2018,1,25,1513299738,1151493406,true +2018,1,25,2114655333,853233396,false +2018,1,25,83842749,994304286,false +2018,1,25,849964081,899416679,true +2018,1,25,1007477537,1736030151,false +2018,1,25,1020377266,1935544358,false +2018,1,26,644213478,992228355,true +2018,1,26,1064201509,467044898,false +2018,1,26,1413357198,1266457573,true +2018,1,26,490999733,732050529,false +2018,1,26,476533278,988816264,false +2018,1,26,1222613820,162187031,true +2018,1,26,644023449,438429675,true +2018,1,26,1555499478,1579453108,true +2018,1,26,1876135613,1178515944,true +2018,1,26,400099844,1233528508,false +2018,1,27,451454659,1671364750,false +2018,1,27,1886294417,1273525434,true +2018,1,27,570308894,1790607242,false +2018,1,27,1654335607,1813143221,true +2018,1,27,144660065,183445861,true +2018,1,27,138654580,1098077416,true +2018,1,27,1588442323,290498234,true +2018,1,27,1331235093,1969440952,false +2018,1,27,1872826470,651735742,true +2018,1,27,1880032602,1563061011,true +2018,1,28,666377735,350300836,true +2018,1,28,1915416387,17676992,false +2018,1,28,732028790,1900904643,false +2018,1,28,1577527526,689687365,false +2018,1,28,1049918259,440499602,false +2018,1,28,125181410,600483702,false +2018,1,28,2006736936,885761998,false +2018,1,28,1744602435,1872452290,true +2018,1,28,1327562501,1399890535,false +2018,1,28,1271538483,1571295964,true +2018,1,29,1576326042,252058122,false +2018,1,29,1003525146,1971912533,true +2018,1,29,1086678669,1684327772,true +2018,1,29,1392451051,1641260859,false +2018,1,29,1558277179,1516020299,false +2018,1,29,1408785876,550333664,false +2018,1,29,1680523178,169863162,false +2018,1,29,189249427,553711184,true +2018,1,29,157728981,1506125871,true +2018,1,29,1451688210,1407139108,true +2018,1,30,205530037,1073321047,false +2018,1,30,1395445794,1032827571,true +2018,1,30,700682713,369326453,false +2018,1,30,392585578,1781677517,true +2018,1,30,627978860,1519029141,false 
+2018,1,30,1414010589,86273416,true +2018,1,30,486493424,747771037,true +2018,1,30,1816746339,730800322,false +2018,1,30,2112968993,916353944,false +2018,1,30,223218798,1856021609,false +2018,1,31,1588787903,300527810,false +2018,1,31,365169587,1337588395,true +2018,1,31,714420772,340166663,true +2018,1,31,1976511044,1837968274,true +2018,1,31,643072762,649952948,false +2018,1,31,68431800,1187134403,true +2018,1,31,196708462,1171829382,true +2018,1,31,38480683,962347087,false +2018,1,31,243823276,604331866,true +2018,1,31,448558388,387387104,false +2018,2,1,1139002255,1603530290,true +2018,2,1,419475755,1202114467,true +2018,2,1,226930885,824478535,false +2018,2,1,1399544654,1464289984,false +2018,2,1,325451738,1242305117,true +2018,2,1,210683046,284343611,true +2018,2,1,1907543269,1402890138,true +2018,2,1,569841349,1483497878,false +2018,2,1,806778574,1453613195,true +2018,2,1,977594541,2011575954,true +2018,2,2,1466487912,1841693723,false +2018,2,2,1069900663,769915605,true +2018,2,2,1290918856,1837199789,false +2018,2,2,2082586787,256794213,false +2018,2,2,1558960562,1256980116,false +2018,2,2,1744788754,963388220,true +2018,2,2,537616616,1857576604,true +2018,2,2,994018069,206429274,true +2018,2,2,429627836,419546989,false +2018,2,2,112638745,103956258,false +2018,2,3,497358913,408567769,true +2018,2,3,1409856984,60955884,false +2018,2,3,1549462496,2058912651,false +2018,2,3,1055067162,2083334916,true +2018,2,3,837600581,2005950596,false +2018,2,3,1665593189,57953690,false +2018,2,3,1822484449,520947562,false +2018,2,3,1696255022,762263117,false +2018,2,3,669495919,1066320027,false +2018,2,3,1865455985,450715984,false +2018,2,4,1615523390,221882233,false +2018,2,4,1785033394,993846373,false +2018,2,4,1473402675,351257204,false +2018,2,4,592048337,627222806,true +2018,2,4,539885241,1533663670,true +2018,2,4,448320324,115449852,false +2018,2,4,702470691,173350781,false +2018,2,4,779665037,2138754296,true +2018,2,4,1826228515,344646795,false +2018,2,4,1222995457,884198552,false +2018,2,5,1031011,1798275122,true +2018,2,5,1766950466,836841781,false +2018,2,5,1178739818,594548456,true +2018,2,5,1696762621,79553118,true +2018,2,5,187315023,378491744,true +2018,2,5,614999485,1080038315,false +2018,2,5,440981477,678970785,false +2018,2,5,1196702978,1716682782,true +2018,2,5,1336283498,1159643289,false +2018,2,5,1057825238,101182851,false +2018,2,6,1372054238,1614121482,false +2018,2,6,30058382,414031777,false +2018,2,6,1616324490,773706231,true +2018,2,6,6036618,1489298713,false +2018,2,6,1741631073,1720155601,false +2018,2,6,266019847,1961711671,true +2018,2,6,1368296770,1051668756,false +2018,2,6,1444360285,1361665071,false +2018,2,6,96916804,392392732,true +2018,2,6,1955668545,1667292027,true +2018,2,7,685787024,3478346,true +2018,2,7,926821614,1431758418,true +2018,2,7,1864134264,2121093091,false +2018,2,7,202235955,1772918889,true +2018,2,7,668213449,2099982632,false +2018,2,7,65007835,2137863052,true +2018,2,7,1248593295,1439123344,true +2018,2,7,893605109,114208932,true +2018,2,7,583974081,1796465973,true +2018,2,7,823090392,1292942155,true +2018,2,8,1793682829,610092500,true +2018,2,8,885517053,937896961,false +2018,2,8,229782847,854838761,false +2018,2,8,107781131,649147281,false +2018,2,8,762239476,2041376717,true +2018,2,8,63825307,1930139224,false +2018,2,8,1028901536,330862313,true +2018,2,8,1683406250,913973342,true +2018,2,8,1061284362,1776400470,false +2018,2,8,1125289395,1050538891,false +2018,2,9,1412330698,436505986,false +2018,2,9,978901976,8783707,true 
+2018,2,9,1536725614,1582307502,false +2018,2,9,2015265209,1373320663,true +2018,2,9,2045796436,1920124008,true +2018,2,9,29445052,205984712,true +2018,2,9,1181454555,489301580,true +2018,2,9,649184498,282960475,true +2018,2,9,1961075699,901450415,true +2018,2,9,835228388,1381906088,false +2018,2,10,1098143291,1693647362,true +2018,2,10,1878223337,1499713510,true +2018,2,10,1778432552,477129227,true +2018,2,10,1853955545,915148587,true +2018,2,10,294687472,1711135825,true +2018,2,10,808842288,1082993670,true +2018,2,10,1626573609,161505348,true +2018,2,10,1530393603,323299164,false +2018,2,10,963498334,1265142539,true +2018,2,10,1108790617,1252602446,true +2018,2,11,1824658189,1805854717,false +2018,2,11,1765401791,1207783692,true +2018,2,11,278121149,1723919408,true +2018,2,11,498100532,795787751,false +2018,2,11,1414918203,137520294,false +2018,2,11,509077951,615788286,false +2018,2,11,1038377501,723489282,true +2018,2,11,465827147,884292954,false +2018,2,11,1613260308,933023167,false +2018,2,11,1212725922,1357854293,false +2018,2,12,817131963,792294506,true +2018,2,12,845721498,173629596,true +2018,2,12,804954080,1240118607,true +2018,2,12,73715156,307999802,false +2018,2,12,2113459843,1593680784,false +2018,2,12,573849748,1100216667,false +2018,2,12,488259575,457213349,true +2018,2,12,1265975975,1410180050,true +2018,2,12,1978099174,1308833717,false +2018,2,12,858829907,183561608,true +2018,2,13,796165937,570528333,false +2018,2,13,681930930,49700959,false +2018,2,13,1947974109,1771502164,true +2018,2,13,1603399454,1451834406,false +2018,2,13,1441308815,1147590741,false +2018,2,13,2119833707,1258323893,true +2018,2,13,1594621758,1155499129,true +2018,2,13,171292961,1818420306,true +2018,2,13,958651691,1672120770,false +2018,2,13,1291874560,1383882976,false +2018,2,14,899822143,1217515106,true +2018,2,14,667860328,757456530,false +2018,2,14,1142484061,64127053,false +2018,2,14,605196155,594754324,false +2018,2,14,157447497,323972180,true +2018,2,14,1986795549,669881442,true +2018,2,14,169172015,533263203,false +2018,2,14,97903382,877443285,true +2018,2,14,585592318,1833104575,true +2018,2,14,2132561923,1059226109,false +2018,2,15,14085636,1158942044,true +2018,2,15,402375387,792030055,true +2018,2,15,65655949,1915112342,true +2018,2,15,66311986,238546537,false +2018,2,15,420252548,1348090874,false +2018,2,15,2139834496,975348345,false +2018,2,15,1740677430,1439813352,true +2018,2,15,553938709,1797645766,false +2018,2,15,1772414923,1060788261,true +2018,2,15,2056713346,745174029,true +2018,2,16,395019289,973250755,false +2018,2,16,471075745,717218264,false +2018,2,16,1047468242,1276379366,false +2018,2,16,1735907200,854393003,false +2018,2,16,1548325699,2104592971,true +2018,2,16,341774808,484232402,true +2018,2,16,287290653,275629474,false +2018,2,16,1692089562,1291730886,true +2018,2,16,1966758651,1171993622,false +2018,2,16,1152060797,249907813,true +2018,2,17,2030518695,2046032792,false +2018,2,17,18813579,48095343,true +2018,2,17,1313848394,226340810,false +2018,2,17,1034172953,2131325705,true +2018,2,17,1911356427,265575069,true +2018,2,17,931433027,2127891001,true +2018,2,17,1063994825,1913556730,true +2018,2,17,674331439,710372747,false +2018,2,17,164842740,297778454,false +2018,2,17,1743237065,1548225132,false +2018,2,18,199357103,1686635325,false +2018,2,18,509059004,1383109336,true +2018,2,18,1747545798,1918672694,true +2018,2,18,1865281679,678181230,false +2018,2,18,1083014171,530760592,false +2018,2,18,950930674,921372681,true +2018,2,18,458757262,1135753594,false 
+2018,2,18,1824313494,522232942,false +2018,2,18,801810590,732279535,false +2018,2,18,1703685173,1892285103,true +2018,2,19,1542266152,1749897826,true +2018,2,19,826624133,400705362,true +2018,2,19,1416119820,1654883374,true +2018,2,19,825768021,1981737825,false +2018,2,19,992779951,1725244273,true +2018,2,19,1715964166,456844004,true +2018,2,19,821737861,1300879134,true +2018,2,19,582829582,1192931971,false +2018,2,19,96196138,448261480,true +2018,2,19,153493344,77314872,true +2018,2,20,377770431,969706205,true +2018,2,20,1659347902,248851288,true +2018,2,20,380946293,1254365119,false +2018,2,20,699316059,2140392033,true +2018,2,20,237684630,909853053,false +2018,2,20,1367075942,790540619,true +2018,2,20,1274517869,1764489060,true +2018,2,20,132773456,1039872852,true +2018,2,20,1275500758,1559122137,true +2018,2,20,458123747,1107745658,false +2018,2,21,776010336,2128058552,true +2018,2,21,957033358,1732509981,true +2018,2,21,523839815,1299422387,false +2018,2,21,296591982,61350532,false +2018,2,21,1722489012,2034837702,false +2018,2,21,750506485,1944223297,false +2018,2,21,1000527354,1455101458,true +2018,2,21,198978897,1836604565,true +2018,2,21,201915350,1307957702,true +2018,2,21,66985678,854438002,false +2018,2,22,503998164,552983516,false +2018,2,22,895167211,404372399,true +2018,2,22,1027754342,319708882,true +2018,2,22,649927145,1878180986,false +2018,2,22,982223449,1038760111,false +2018,2,22,1791491047,1806973499,false +2018,2,22,2125972129,1401379560,false +2018,2,22,1578629882,867989090,false +2018,2,22,1492273082,795474437,false +2018,2,22,1515000761,1600496080,true +2018,2,23,1342637582,1711496992,false +2018,2,23,617203765,415351266,true +2018,2,23,562402081,2027936904,false +2018,2,23,257884762,1355978819,false +2018,2,23,1635672429,1800229934,true +2018,2,23,234713640,458057486,false +2018,2,23,954953787,931924402,true +2018,2,23,183767365,187457263,false +2018,2,23,1970782173,748383003,false +2018,2,23,44689624,846860094,false +2018,2,24,562384647,105308841,true +2018,2,24,215355313,1272131654,false +2018,2,24,501481548,1394078016,false +2018,2,24,259581472,706027070,true +2018,2,24,1445170911,1522308186,true +2018,2,24,1599810064,1072916377,false +2018,2,24,1104035877,879666052,false +2018,2,24,914254150,43809719,true +2018,2,24,192631749,1279996116,false +2018,2,24,1207691464,344973092,true +2018,2,25,1305011764,580541655,true +2018,2,25,1374035935,874111198,true +2018,2,25,1034033647,1161841127,true +2018,2,25,1277276721,663203934,true +2018,2,25,357333469,455621614,true +2018,2,25,1819752421,1188591196,true +2018,2,25,322352021,1868712189,false +2018,2,25,896166863,1183695487,true +2018,2,25,911381118,1121623105,true +2018,2,25,1671845654,1243074173,false +2018,2,26,1259314426,1773994040,true +2018,2,26,985339601,788321106,false +2018,2,26,203778850,289898050,false +2018,2,26,1109439376,1362200476,true +2018,2,26,1635963577,684732622,false +2018,2,26,1460962250,1496507965,true +2018,2,26,1052173927,1274755807,true +2018,2,26,445286826,593830606,false +2018,2,26,300127827,818021025,true +2018,2,26,728922543,434849166,true +2018,2,27,1628966577,756129338,true +2018,2,27,232155925,367516366,false +2018,2,27,1843842293,72788259,true +2018,2,27,827255825,237404680,true +2018,2,27,136076945,626836904,true +2018,2,27,1617044276,1261209385,true +2018,2,27,1098835086,1214886248,false +2018,2,27,2129665180,1221860526,false +2018,2,27,1108499824,1230134504,true +2018,2,27,304688313,165271466,false +2018,2,28,346843922,1404876173,true +2018,2,28,1219609551,786548431,true 
+2018,2,28,1711627393,1901401765,true +2018,2,28,1511781505,299070472,true +2018,2,28,939049992,56783176,true +2018,2,28,440311037,483419219,false +2018,2,28,477257701,348748846,false +2018,2,28,1749444017,1515506639,false +2018,2,28,840214619,102527952,true +2018,2,28,1265693061,1253522501,false +2018,2,29,687636512,855273588,false +2018,2,29,656898833,1503830669,false +2018,2,29,396647518,1911352048,true +2018,2,29,1703804750,1190265760,true +2018,2,29,79384508,1482883653,true +2018,2,29,1336857001,1233822782,false +2018,2,29,937412641,1747272551,true +2018,2,29,1376343686,1456909919,true +2018,2,29,2004357436,968911304,false +2018,2,29,1980004506,67538425,true +2018,2,30,731176435,312318595,true +2018,2,30,724732572,234631871,true +2018,2,30,332712259,328748345,false +2018,2,30,1825208895,1428338395,true +2018,2,30,915340088,566246654,false +2018,2,30,610325115,437496535,false +2018,2,30,1395292421,870623462,false +2018,2,30,573495940,184419636,false +2018,2,30,933637997,602910487,true +2018,2,30,429749256,693731420,false +2018,2,31,1646922991,1603794537,false +2018,2,31,1395132038,1050385393,true +2018,2,31,805672313,214235637,false +2018,2,31,699864798,779390638,true +2018,2,31,1664469907,1441361704,true +2018,2,31,2127884652,814533706,false +2018,2,31,2026908471,1123498552,true +2018,2,31,541940297,1014327078,true +2018,2,31,792678615,319981066,true +2018,2,31,58022533,393484205,false +2018,3,1,1030738753,1068280801,true +2018,3,1,453331585,449848228,false +2018,3,1,771252573,1251734651,false +2018,3,1,1493563556,608718314,true +2018,3,1,449473790,515944293,true +2018,3,1,217354536,1105163091,true +2018,3,1,1116938805,1640625827,false +2018,3,1,1216317112,357634976,true +2018,3,1,513093358,1584497676,false +2018,3,1,950697222,385863104,true +2018,3,2,392724674,476353250,false +2018,3,2,457633199,223632615,false +2018,3,2,1078374198,455848847,false +2018,3,2,861093123,1373501467,true +2018,3,2,685889258,742525072,false +2018,3,2,432539532,748070975,false +2018,3,2,754730599,1929931506,false +2018,3,2,629336678,1970030796,true +2018,3,2,993335170,2111750643,true +2018,3,2,1404811207,224789945,true +2018,3,3,1799862420,1229929631,false +2018,3,3,126041677,609637690,false +2018,3,3,522260217,1697604993,false +2018,3,3,1568494821,1110062692,true +2018,3,3,1770410131,908265412,false +2018,3,3,583536237,1828386264,false +2018,3,3,337740867,1157030487,false +2018,3,3,760389465,262303739,true +2018,3,3,458938771,515178735,true +2018,3,3,1739844875,2050018417,false +2018,3,4,1120486921,1535349754,false +2018,3,4,249872794,794005709,false +2018,3,4,1869095647,1344804839,true +2018,3,4,962037818,379697332,true +2018,3,4,1414688783,285989197,false +2018,3,4,866129285,208654316,true +2018,3,4,1471805937,1341798093,false +2018,3,4,1603511382,1050535683,true +2018,3,4,2101572837,1496562386,false +2018,3,4,1904454390,1425225114,true +2018,3,5,61261499,831526516,true +2018,3,5,1715391937,1646749049,true +2018,3,5,1552005074,1909619749,true +2018,3,5,2019085217,1502504944,true +2018,3,5,329004593,648258557,false +2018,3,5,1961172446,753101626,false +2018,3,5,378585854,1612199094,true +2018,3,5,345372726,360945617,true +2018,3,5,1014865691,823407543,false +2018,3,5,564768587,232177151,true +2018,3,6,1649370591,2046348231,false +2018,3,6,1754798773,605002255,false +2018,3,6,1799032090,355663159,false +2018,3,6,167681379,462873057,true +2018,3,6,1770145766,1602417574,false +2018,3,6,2107632143,1067153627,true +2018,3,6,1481907704,626891955,false +2018,3,6,277981939,1294597989,true 
+2018,3,6,270908266,1975975980,true +2018,3,6,1331739391,1342245235,false +2018,3,7,542252340,801871267,false +2018,3,7,405859388,980134857,false +2018,3,7,1511447764,107656309,false +2018,3,7,1546205735,1021792774,false +2018,3,7,1344866766,790827674,true +2018,3,7,1657540649,1119158159,false +2018,3,7,996559373,54289377,false +2018,3,7,705022830,803326955,false +2018,3,7,2099212880,183787106,false +2018,3,7,1901339858,335193927,true +2018,3,8,1825898021,1464402507,true +2018,3,8,737118873,1003474436,true +2018,3,8,2107200286,1608700503,false +2018,3,8,1213526833,1104403105,true +2018,3,8,936779077,107347849,false +2018,3,8,308947241,1412934509,true +2018,3,8,1563058943,1430599174,true +2018,3,8,1559776803,19490227,true +2018,3,8,876002519,1544096293,false +2018,3,8,639509877,71520873,false +2018,3,9,1357412843,1487326214,false +2018,3,9,167047330,1961111687,true +2018,3,9,1367274484,1777145178,false +2018,3,9,440669959,433747853,false +2018,3,9,190250519,1275955223,true +2018,3,9,1123534658,1880551656,true +2018,3,9,930988046,1235566161,false +2018,3,9,1431899195,978884910,true +2018,3,9,263226175,1669409401,true +2018,3,9,715112681,84865647,false +2018,3,10,276531678,1226534045,false +2018,3,10,30618902,466315017,true +2018,3,10,1829505076,1849834638,true +2018,3,10,711557843,1823744201,false +2018,3,10,2089113233,206686708,true +2018,3,10,195647333,2076203570,false +2018,3,10,1763671718,1773087411,false +2018,3,10,136944582,954777674,false +2018,3,10,476377391,97742128,true +2018,3,10,1001834519,1135511872,false +2018,3,11,309430578,205584219,false +2018,3,11,556864146,1030586552,false +2018,3,11,2087229515,374461574,true +2018,3,11,1553524188,121373878,true +2018,3,11,1491639215,2140160878,false +2018,3,11,993094604,795792977,false +2018,3,11,1599536113,1782188806,true +2018,3,11,785535230,1648117684,false +2018,3,11,1007544758,104245217,false +2018,3,11,631931595,579650846,false +2018,3,12,17995725,1691718995,false +2018,3,12,420174975,2005895391,true +2018,3,12,1448005488,249535639,false +2018,3,12,2002637283,1697024636,false +2018,3,12,1791079832,1259372963,true +2018,3,12,510555252,2134964109,true +2018,3,12,910663844,909600887,false +2018,3,12,1764754913,1530384616,false +2018,3,12,940483529,2139043578,true +2018,3,12,497186210,228864310,true +2018,3,13,686754305,1860032206,false +2018,3,13,1716189383,291854859,true +2018,3,13,444460153,754704291,false +2018,3,13,268710487,1862302487,true +2018,3,13,328298664,643149287,false +2018,3,13,930574282,2134067975,true +2018,3,13,758028438,1080855279,false +2018,3,13,263954571,265456680,true +2018,3,13,1898970110,2035884777,true +2018,3,13,1757886601,570918656,true +2018,3,14,81447756,876110386,true +2018,3,14,1835035959,823391041,false +2018,3,14,1313289648,1313261297,false +2018,3,14,2080735631,13550573,false +2018,3,14,1574781361,1944292603,true +2018,3,14,1750397172,1720280456,true +2018,3,14,2122148871,1656858992,true +2018,3,14,1396476427,739984800,false +2018,3,14,1889174192,594976915,false +2018,3,14,17678808,722960670,true +2018,3,15,864109166,660777926,false +2018,3,15,1144325326,1184918292,true +2018,3,15,741526328,1476376249,true +2018,3,15,1734827662,1489489931,true +2018,3,15,1907694658,1253589536,true +2018,3,15,459590278,273054200,true +2018,3,15,1276379813,1675659164,false +2018,3,15,1110617781,2008946534,false +2018,3,15,677662169,1896684943,false +2018,3,15,965719865,1089885305,true +2018,3,16,656491375,905847576,true +2018,3,16,1032082554,488652489,false +2018,3,16,1718304725,566590314,false 
+2018,3,16,111595269,1374487892,true +2018,3,16,2062533650,2105812277,false +2018,3,16,877792121,1092705268,true +2018,3,16,1672736939,47584742,true +2018,3,16,1702299159,723734043,false +2018,3,16,1666133053,939950558,false +2018,3,16,1690603729,1580904128,false +2018,3,17,1911324783,539423220,true +2018,3,17,968509417,945660871,false +2018,3,17,1110811198,211960799,false +2018,3,17,1721487494,1935907144,true +2018,3,17,363705058,81757786,true +2018,3,17,475699648,1292715391,true +2018,3,17,2121669806,1066326817,true +2018,3,17,1163285531,1410064466,false +2018,3,17,857969598,551614485,true +2018,3,17,326136934,1099651496,false +2018,3,18,614581337,496948777,true +2018,3,18,1572613744,49922052,false +2018,3,18,812820801,1515612142,true +2018,3,18,872067364,228801837,true +2018,3,18,1566274160,1505239758,true +2018,3,18,1564008665,758620357,true +2018,3,18,1925734648,1132629339,true +2018,3,18,2109711874,907637008,true +2018,3,18,594974450,1855009339,false +2018,3,18,1784083840,137590896,false +2018,3,19,327525069,1963793048,false +2018,3,19,628378947,2065748530,false +2018,3,19,1119425368,4299813,true +2018,3,19,905136894,911483840,false +2018,3,19,1870575188,1045014022,true +2018,3,19,1500484635,1321207647,false +2018,3,19,306304384,1655840077,false +2018,3,19,496848180,265765880,true +2018,3,19,1159980627,746919108,true +2018,3,19,1897213107,440750494,false +2018,3,20,61619933,519114041,false +2018,3,20,2119473007,1996778706,true +2018,3,20,1707773584,1234551436,true +2018,3,20,1510244192,1397697207,true +2018,3,20,2118820828,2032510766,true +2018,3,20,883669948,921218731,true +2018,3,20,612270047,234218192,true +2018,3,20,120952607,1373920773,false +2018,3,20,1551921049,1452130930,true +2018,3,20,562768776,1595918573,false +2018,3,21,805299819,81188630,false +2018,3,21,281486676,1809221502,true +2018,3,21,345946052,1440667072,true +2018,3,21,1922989437,1978568892,false +2018,3,21,404364868,1847555728,true +2018,3,21,513145028,611722496,false +2018,3,21,1055673549,801923596,true +2018,3,21,807505049,408704997,false +2018,3,21,2123637290,610697983,true +2018,3,21,117933009,79144415,false +2018,3,22,1945408087,1620779129,true +2018,3,22,2022321295,1716477955,false +2018,3,22,735805593,1358284490,false +2018,3,22,1040076276,495789228,false +2018,3,22,1628875740,1640599447,false +2018,3,22,1655598947,590521843,false +2018,3,22,628556566,519673946,true +2018,3,22,553151616,648591175,false +2018,3,22,1910032797,724810722,false +2018,3,22,969324609,1251578235,false +2018,3,23,1518760888,2121514680,false +2018,3,23,1411210902,1700233512,true +2018,3,23,1526309325,2029084535,true +2018,3,23,1766201246,267408694,false +2018,3,23,1331244282,1889006442,false +2018,3,23,331475044,2079192041,true +2018,3,23,215769598,901952793,true +2018,3,23,842706066,953027268,true +2018,3,23,1588944831,1461943691,false +2018,3,23,98945442,1351739928,false +2018,3,24,1039770828,208771684,true +2018,3,24,502990708,1731188527,true +2018,3,24,335952807,1615372841,true +2018,3,24,44591951,1817151548,true +2018,3,24,263862216,1146740048,true +2018,3,24,60082615,1946330016,false +2018,3,24,1603233169,774497764,true +2018,3,24,1625743527,2140278822,true +2018,3,24,1220984232,1234489577,true +2018,3,24,201064197,791361091,true +2018,3,25,1933076593,81812421,true +2018,3,25,1483134825,1100092026,false +2018,3,25,1269397676,1330218317,false +2018,3,25,362933146,1223866231,false +2018,3,25,499321941,434339163,true +2018,3,25,1549751111,860793470,false +2018,3,25,668846420,1839480551,false +2018,3,25,270355427,1839198011,true 
+2018,3,25,294371250,211830812,false +2018,3,25,723079342,645378082,true +2018,3,26,1246965056,607455467,false +2018,3,26,1160593751,525584085,false +2018,3,26,1837858547,1331731451,true +2018,3,26,151497400,1804274768,true +2018,3,26,423610983,1345224465,false +2018,3,26,584804501,272227013,false +2018,3,26,1232528964,177411909,false +2018,3,26,1784078446,177183605,false +2018,3,26,1780535932,326314646,true +2018,3,26,337171647,1745103081,true +2018,3,27,1254513052,944696258,true +2018,3,27,2094175554,629951056,false +2018,3,27,1211452062,1624313456,true +2018,3,27,522513430,1203447538,true +2018,3,27,779054857,1138125065,false +2018,3,27,1079503053,89807098,false +2018,3,27,31064895,41434833,false +2018,3,27,748446167,510250247,false +2018,3,27,744490816,102643747,true +2018,3,27,1830879686,788394039,false +2018,3,28,1278463063,1961009118,true +2018,3,28,343525717,475488439,true +2018,3,28,77578341,641631331,false +2018,3,28,1883727939,998798588,true +2018,3,28,1791439306,1900554161,false +2018,3,28,1624800855,966200356,false +2018,3,28,1822207768,275680645,true +2018,3,28,120064819,1961641537,false +2018,3,28,1681607702,1922842509,true +2018,3,28,1714282165,1408732526,true +2018,3,29,1809091008,956456710,true +2018,3,29,354225435,156881160,false +2018,3,29,718238017,1848516705,false +2018,3,29,522981678,671844959,true +2018,3,29,604221980,1343596739,false +2018,3,29,1803990032,49948334,true +2018,3,29,1312936450,1352615551,true +2018,3,29,618452364,761091157,true +2018,3,29,277334279,2002646230,false +2018,3,29,1973295350,1298366834,true +2018,3,30,1783058223,406345886,true +2018,3,30,452598267,1193748596,false +2018,3,30,888387723,1128904365,true +2018,3,30,1061777911,188244804,false +2018,3,30,2021856103,32906766,true +2018,3,30,844976550,1968912396,false +2018,3,30,573530154,475590507,true +2018,3,30,727913543,524434128,false +2018,3,30,1801895168,775249978,false +2018,3,30,684574867,900007240,false +2018,3,31,2116374377,2059363204,false +2018,3,31,165251350,1262795466,false +2018,3,31,1086989363,1072550755,false +2018,3,31,1539403969,362274841,false +2018,3,31,305478301,1981783443,true +2018,3,31,469782998,100618107,false +2018,3,31,1536952711,1026646683,false +2018,3,31,67822720,1723109219,false +2018,3,31,983700986,2060008063,false +2018,3,31,345388975,2006343308,true +2018,4,1,67844219,879131393,false +2018,4,1,674404881,1855091302,false +2018,4,1,993985241,1587296975,false +2018,4,1,1018118720,1647197458,true +2018,4,1,288483960,1320235580,false +2018,4,1,1670939774,2038867567,false +2018,4,1,2113305418,2126927178,false +2018,4,1,907391383,35522524,false +2018,4,1,1726182902,1350067257,false +2018,4,1,649502140,417201343,false +2018,4,2,78342107,1983809235,true +2018,4,2,1242994547,940100089,false +2018,4,2,436114925,2083071017,true +2018,4,2,1781798399,258264252,false +2018,4,2,2074348455,141450116,true +2018,4,2,1615588281,362782092,false +2018,4,2,2111100287,478944437,true +2018,4,2,137110620,197783127,false +2018,4,2,1651176097,891268361,false +2018,4,2,118437941,1443869396,true +2018,4,3,1779559395,174963588,false +2018,4,3,509260966,682125501,false +2018,4,3,1629113891,2009649727,false +2018,4,3,1846884242,2070636919,false +2018,4,3,1381242005,138626032,false +2018,4,3,1309437570,569221442,false +2018,4,3,899312562,1441823757,false +2018,4,3,914621340,2096490511,false +2018,4,3,2073604028,622601477,false +2018,4,3,754373069,1999115914,false +2018,4,4,2042376724,870169474,false +2018,4,4,165850804,1432149496,true +2018,4,4,805222108,835727034,false 
+2018,4,4,1237345327,105706490,false +2018,4,4,243585173,90172720,false +2018,4,4,1573545108,769416024,true +2018,4,4,1265291709,2134409843,true +2018,4,4,1228942336,317786462,false +2018,4,4,481823894,795706824,true +2018,4,4,478572102,574942063,false +2018,4,5,730797979,1167743671,true +2018,4,5,328660060,1057747676,false +2018,4,5,619895127,535297014,true +2018,4,5,1846654303,472145185,false +2018,4,5,1341835407,58721838,true +2018,4,5,58463873,341998873,true +2018,4,5,79177222,1346259840,false +2018,4,5,6136448,1730536898,true +2018,4,5,1935969086,2018357727,false +2018,4,5,61523057,1290244521,false +2018,4,6,133041671,523160986,false +2018,4,6,1019704078,1885243937,true +2018,4,6,1385953984,1546444455,true +2018,4,6,1871450309,1157707709,true +2018,4,6,699520902,1925997196,false +2018,4,6,978739761,333904343,true +2018,4,6,800428243,917675517,false +2018,4,6,306524111,820606216,true +2018,4,6,173894715,661268615,false +2018,4,6,873349387,303843537,false +2018,4,7,1968584769,1865019877,false +2018,4,7,1803070964,1587092210,true +2018,4,7,89733246,1869364165,false +2018,4,7,932141626,1315255068,true +2018,4,7,113382287,2012015666,false +2018,4,7,1485608776,470600705,true +2018,4,7,1310825611,1819491295,true +2018,4,7,1571142801,246011633,true +2018,4,7,700204008,2029283584,false +2018,4,7,1164364907,780322180,false +2018,4,8,1392466967,690721395,false +2018,4,8,59722003,990384386,false +2018,4,8,1067170889,671055985,false +2018,4,8,898367652,382351746,false +2018,4,8,60209171,1702409975,true +2018,4,8,1541480496,1794565891,false +2018,4,8,1870359790,900271015,true +2018,4,8,1738874172,1763919074,true +2018,4,8,1541480128,1241508824,false +2018,4,8,408046147,507489403,true +2018,4,9,2086294268,1639110914,false +2018,4,9,1591871004,281071068,false +2018,4,9,2141298164,2105356503,false +2018,4,9,227194700,1060359629,false +2018,4,9,1536833647,83585947,true +2018,4,9,281017258,133944992,false +2018,4,9,959218603,1662019236,true +2018,4,9,1319825898,1318219138,false +2018,4,9,449982535,675045889,true +2018,4,9,1604654561,49418309,true +2018,4,10,783210705,504356796,false +2018,4,10,2060591516,1659453401,true +2018,4,10,731650799,722788403,true +2018,4,10,1959367855,853522932,false +2018,4,10,625123919,858315392,true +2018,4,10,313866297,1663062161,false +2018,4,10,2032320238,38050015,true +2018,4,10,994771348,558307413,false +2018,4,10,659100756,418226327,false +2018,4,10,1118004909,814784246,true +2018,4,11,33571281,1033458137,false +2018,4,11,1950637688,1931874618,false +2018,4,11,2113676262,1487161221,true +2018,4,11,1431909622,1455537195,false +2018,4,11,565382488,1093271518,true +2018,4,11,362929612,39106443,true +2018,4,11,858596009,343428749,false +2018,4,11,285530463,2146621367,false +2018,4,11,1540451552,1901057418,false +2018,4,11,1549656037,1173214065,false +2018,4,12,2046185710,794596804,true +2018,4,12,1092332283,2082989650,false +2018,4,12,1081241361,1621025175,false +2018,4,12,1680355734,647279303,true +2018,4,12,302092799,909273718,true +2018,4,12,811191141,1725157104,true +2018,4,12,1497159625,1512904792,false +2018,4,12,1667745804,281660862,false +2018,4,12,1261350570,242066622,true +2018,4,12,2109367222,463784463,false +2018,4,13,963849800,1140843638,false +2018,4,13,1213363694,1653214159,true +2018,4,13,2098713100,1437405793,true +2018,4,13,145232071,397434686,false +2018,4,13,359019420,1153276649,true +2018,4,13,1270699445,40547558,false +2018,4,13,2037591200,81373931,true +2018,4,13,1263688361,1736486127,false +2018,4,13,1194675536,671182394,false 
+2018,4,13,1166796105,1636195021,true +2018,4,14,266135944,160844045,true +2018,4,14,1497256802,1452598911,true +2018,4,14,65974352,489536759,true +2018,4,14,1182994142,119359190,true +2018,4,14,661447194,391963958,true +2018,4,14,1930279145,537080962,false +2018,4,14,758601760,1754388343,true +2018,4,14,285252365,1090926541,true +2018,4,14,752860078,1607830896,false +2018,4,14,710489700,1951556436,true +2018,4,15,484775487,1567463632,true +2018,4,15,1458489398,791837725,true +2018,4,15,894131111,1560188540,true +2018,4,15,2023202976,1150684125,false +2018,4,15,1235437013,1007578902,false +2018,4,15,1014956821,651867202,false +2018,4,15,1133698497,149209649,false +2018,4,15,1571113426,278405796,false +2018,4,15,169144228,959360726,true +2018,4,15,369491107,866634095,true +2018,4,16,1280735386,724391699,true +2018,4,16,483641485,195500311,false +2018,4,16,657287725,1042894290,true +2018,4,16,2029951280,781575296,false +2018,4,16,2133851267,768828720,false +2018,4,16,1706806788,196607257,true +2018,4,16,183157282,867787006,true +2018,4,16,1308652735,732262690,true +2018,4,16,1324144776,1299493951,true +2018,4,16,530048897,254821069,false +2018,4,17,511407884,371063226,false +2018,4,17,546961569,485521729,true +2018,4,17,1667420606,262280955,true +2018,4,17,100157222,286991522,true +2018,4,17,1583568958,2027437650,false +2018,4,17,1941782101,61400242,false +2018,4,17,1890900172,135929788,true +2018,4,17,1130688482,671794592,true +2018,4,17,480873977,756210100,false +2018,4,17,587873182,1914595618,true +2018,4,18,799123901,1843688024,true +2018,4,18,222617783,2074968562,true +2018,4,18,884742881,928456305,true +2018,4,18,200799206,1567700543,false +2018,4,18,1527128581,1613705017,true +2018,4,18,1667111722,1450449332,false +2018,4,18,1120100768,1254455542,false +2018,4,18,1391569619,910731813,false +2018,4,18,240744657,165098432,false +2018,4,18,1203478795,96838142,true +2018,4,19,1960475353,1490291980,true +2018,4,19,153900072,771987979,false +2018,4,19,970048916,1457990617,false +2018,4,19,1895548263,1382705201,true +2018,4,19,1212639056,132094605,true +2018,4,19,1837350111,899808046,false +2018,4,19,424902863,1784197343,false +2018,4,19,1681550539,1045377921,true +2018,4,19,60844631,1205863785,false +2018,4,19,643180932,895350782,false +2018,4,20,1569647392,1899884979,true +2018,4,20,1815615675,1925615812,false +2018,4,20,1493833852,1452481753,true +2018,4,20,156788179,1405926793,false +2018,4,20,419071630,1282514968,true +2018,4,20,1145559665,490607738,true +2018,4,20,899933809,44924386,false +2018,4,20,838566293,1236110629,true +2018,4,20,1190268436,138288977,true +2018,4,20,1580435675,104447773,true +2018,4,21,219104316,1801104653,false +2018,4,21,1685632696,679992929,false +2018,4,21,409311084,1801203709,true +2018,4,21,501667279,1234169770,false +2018,4,21,637505696,2110440293,false +2018,4,21,607073454,1713347885,true +2018,4,21,1903479581,2037683507,true +2018,4,21,193928849,2106888304,false +2018,4,21,388985035,285035185,true +2018,4,21,1290980616,970510408,true +2018,4,22,1951293128,37187289,false +2018,4,22,3609480,862087705,false +2018,4,22,1462917185,1470506227,false +2018,4,22,964279634,2074227690,false +2018,4,22,167001519,1650410826,true +2018,4,22,2029687612,1660799222,true +2018,4,22,1570721032,1815255841,true +2018,4,22,444797489,1646826675,true +2018,4,22,1868106631,416612905,false +2018,4,22,754996371,216260751,true +2018,4,23,1162097685,316272419,true +2018,4,23,601251772,673904340,false +2018,4,23,1860955519,484790911,false +2018,4,23,2041855053,234789227,true 
+2018,4,23,1571762679,1686506765,true +2018,4,23,1663093528,1124717949,true +2018,4,23,307479953,1414475292,false +2018,4,23,1384544701,1831122205,false +2018,4,23,1789258691,2055092126,true +2018,4,23,842178903,928646790,false +2018,4,24,1611877385,908492844,true +2018,4,24,1425589962,1351251088,false +2018,4,24,39367743,1364642809,true +2018,4,24,1065088914,262016609,false +2018,4,24,861123411,592784664,false +2018,4,24,38037285,994452,true +2018,4,24,729936210,2029942947,false +2018,4,24,371068996,630124608,true +2018,4,24,593740297,338686368,true +2018,4,24,480297677,2103997609,true +2018,4,25,1936737708,722633661,false +2018,4,25,398098922,2077214339,true +2018,4,25,1792380733,895211318,false +2018,4,25,1248573376,386653194,true +2018,4,25,628378290,120650883,false +2018,4,25,781736666,1288023908,false +2018,4,25,499993860,1919170974,false +2018,4,25,342490883,614748414,false +2018,4,25,1502668461,1510536197,false +2018,4,25,201506400,936892902,true +2018,4,26,38260036,991992259,false +2018,4,26,424100053,907585115,true +2018,4,26,64981811,1856903460,false +2018,4,26,1982576415,1758576379,true +2018,4,26,1847569598,1201977820,false +2018,4,26,111850544,921494363,false +2018,4,26,555214805,1299600336,false +2018,4,26,530660006,298046197,true +2018,4,26,61100127,1982643904,false +2018,4,26,182347136,1498958244,true +2018,4,27,861199626,1738881920,true +2018,4,27,1640077906,1053637423,false +2018,4,27,15605955,210391686,true +2018,4,27,409764007,1829125522,true +2018,4,27,1824963420,1319273714,false +2018,4,27,1490541924,552817660,false +2018,4,27,100642827,474416225,false +2018,4,27,1920416306,660634823,true +2018,4,27,549788369,531481380,true +2018,4,27,618643555,1775375056,true +2018,4,28,1416869345,859529658,false +2018,4,28,472724437,1421168374,false +2018,4,28,652269212,87234451,false +2018,4,28,178542003,1103270062,true +2018,4,28,523505097,184648847,false +2018,4,28,1523939373,348953836,true +2018,4,28,460388613,1951509940,false +2018,4,28,1866171254,491997394,false +2018,4,28,1002316714,1974276875,false +2018,4,28,130247093,1087052256,true +2018,4,29,1977328535,1527651760,false +2018,4,29,2030820004,567793567,false +2018,4,29,118323869,919727207,true +2018,4,29,1498693897,635103996,true +2018,4,29,1926119305,558895716,true +2018,4,29,1322353510,330057163,true +2018,4,29,1621441013,1784545025,true +2018,4,29,254080603,238385452,true +2018,4,29,1207337582,1600900760,true +2018,4,29,2075802296,801778633,true +2018,4,30,1956506767,677512123,false +2018,4,30,1765323316,2065324326,true +2018,4,30,64492320,1339387262,false +2018,4,30,1894751550,1572616567,false +2018,4,30,206888540,316152470,true +2018,4,30,2064458092,1010529722,false +2018,4,30,793828073,1911880302,false +2018,4,30,771898214,1796664841,false +2018,4,30,312394796,973505678,false +2018,4,30,366692459,779020380,true +2018,4,31,461393652,1495776224,false +2018,4,31,64959168,1587776565,true +2018,4,31,1584372157,1384642232,false +2018,4,31,532722082,978004975,false +2018,4,31,1716796916,1611419051,true +2018,4,31,512575643,1402771105,true +2018,4,31,335929860,512377498,true +2018,4,31,59118405,905090517,false +2018,4,31,1580723731,746874493,true +2018,4,31,2048770528,1828282506,true +2018,5,1,319796666,1874284491,false +2018,5,1,1567475786,167723316,true +2018,5,1,176723742,933258395,true +2018,5,1,1279816755,303070902,true +2018,5,1,1983138859,163291852,false +2018,5,1,1747302567,175691914,true +2018,5,1,1187284206,489553370,false +2018,5,1,1733413960,1494164636,false +2018,5,1,1210903099,1807281446,true 
+2018,5,1,1357221064,1749150068,false +2018,5,2,1781034522,1680125661,true +2018,5,2,291292547,1696971217,true +2018,5,2,1516528570,443179717,false +2018,5,2,1043023951,800763580,false +2018,5,2,1222167259,1793574289,true +2018,5,2,510267092,1981297590,true +2018,5,2,1529649897,580112532,true +2018,5,2,1353558032,481524897,true +2018,5,2,655046720,865018598,false +2018,5,2,1749080273,1767571296,true +2018,5,3,339585613,1690546265,false +2018,5,3,1832705867,1953577834,false +2018,5,3,908473235,475860943,true +2018,5,3,529860885,978028488,true +2018,5,3,2007147700,1035711955,false +2018,5,3,2141687852,1387961390,true +2018,5,3,1874573834,578339197,true +2018,5,3,976342744,1940235316,true +2018,5,3,397678469,859476241,true +2018,5,3,1301486465,1357181917,false +2018,5,4,786274483,1077045464,false +2018,5,4,1140774096,408255205,true +2018,5,4,1216920396,413458966,true +2018,5,4,600548282,2078533779,false +2018,5,4,634419354,1203693886,true +2018,5,4,564235335,1903141891,false +2018,5,4,1419312773,1131059627,false +2018,5,4,1277255463,850877727,false +2018,5,4,685044540,1589547647,false +2018,5,4,1820544381,1521457349,false +2018,5,5,1836209833,1094796164,true +2018,5,5,213259629,796346062,false +2018,5,5,1994144561,1766029849,false +2018,5,5,1047021889,312392421,true +2018,5,5,1250132855,1805841815,false +2018,5,5,250012746,1156405629,true +2018,5,5,1925719317,1403604913,true +2018,5,5,1203931083,129269828,true +2018,5,5,1568878725,1102373048,true +2018,5,5,721659786,1198044638,false +2018,5,6,864853175,749324498,false +2018,5,6,1690792156,1822068441,false +2018,5,6,871801095,889699961,false +2018,5,6,1241948233,1535116810,false +2018,5,6,191550086,1810759657,false +2018,5,6,189110553,730191071,false +2018,5,6,1110331870,1540331585,false +2018,5,6,112998251,520795209,false +2018,5,6,248294712,2016305792,true +2018,5,6,327930020,1200089407,true +2018,5,7,334151140,225944492,true +2018,5,7,2139354968,435824669,true +2018,5,7,776031164,33557179,false +2018,5,7,983674620,1394229308,false +2018,5,7,201568375,1377447970,false +2018,5,7,139008253,1411921301,true +2018,5,7,1694606179,1895276561,true +2018,5,7,954013719,1630470238,false +2018,5,7,891343428,1134813627,true +2018,5,7,616710144,887630544,false +2018,5,8,443975242,1729043276,true +2018,5,8,1321694910,168395680,true +2018,5,8,336747606,1166325074,false +2018,5,8,78987843,1064052688,true +2018,5,8,968334859,360516034,false +2018,5,8,1619568681,591581405,true +2018,5,8,1189420383,118000104,false +2018,5,8,572855964,1310764162,false +2018,5,8,1378166437,900279253,false +2018,5,8,1055708379,1257019769,false +2018,5,9,77937990,302263099,true +2018,5,9,859129555,87853322,true +2018,5,9,1079276968,1004767913,false +2018,5,9,1909830335,352290636,true +2018,5,9,925330161,615717287,true +2018,5,9,1383039196,1680579668,false +2018,5,9,1657379237,792224038,true +2018,5,9,331279492,141027028,false +2018,5,9,73424980,1873644348,false +2018,5,9,523576150,611257560,false +2018,5,10,44473813,1030517806,true +2018,5,10,640723510,844495259,true +2018,5,10,906946746,822436385,true +2018,5,10,281010500,491644845,false +2018,5,10,835927560,52082207,false +2018,5,10,1288337182,1127578853,true +2018,5,10,923063636,1897665577,true +2018,5,10,1332336907,1149063633,true +2018,5,10,241571537,1364545931,true +2018,5,10,1564213468,1293562015,false +2018,5,11,1936690176,1038270388,false +2018,5,11,88521388,1593817016,true +2018,5,11,1250598384,1397847863,true +2018,5,11,2001234882,130378372,false +2018,5,11,392240219,1127279048,false +2018,5,11,933550402,958310163,false 
+2018,5,11,16609024,180783954,false +2018,5,11,1712601203,787153298,true +2018,5,11,583468809,603300208,true +2018,5,11,974537663,584269538,true +2018,5,12,1153611874,2013695479,false +2018,5,12,1690105533,743054765,true +2018,5,12,1983128388,1434059860,false +2018,5,12,543637731,1030578621,true +2018,5,12,1328600170,380558053,false +2018,5,12,1087717811,1556658065,true +2018,5,12,648498852,1157324954,false +2018,5,12,313051773,1376917087,true +2018,5,12,235625160,1751489661,true +2018,5,12,1997979919,1044829437,false +2018,5,13,168350510,1305374716,false +2018,5,13,2087904230,1291698804,true +2018,5,13,1148794425,933858813,false +2018,5,13,1051667399,529583390,false +2018,5,13,1890889631,71591978,true +2018,5,13,167503803,563555236,false +2018,5,13,693799534,1283909884,true +2018,5,13,2000455661,1022763322,true +2018,5,13,1951411706,1559092322,true +2018,5,13,701966555,1032612576,false +2018,5,14,1814656572,390961702,false +2018,5,14,1108878285,1322562974,false +2018,5,14,664630352,30846030,true +2018,5,14,475955873,1873493484,false +2018,5,14,1533214214,292165944,true +2018,5,14,1647772737,69105354,false +2018,5,14,1883241423,892435487,true +2018,5,14,1496822882,1344955182,false +2018,5,14,1689496169,1434681407,true +2018,5,14,870958168,1304540890,true +2018,5,15,11606542,1809871335,false +2018,5,15,2068816937,99216310,false +2018,5,15,873309990,1698610701,false +2018,5,15,140845608,690671416,true +2018,5,15,138152088,127677499,false +2018,5,15,1263199410,1561112462,false +2018,5,15,61107796,1203765244,false +2018,5,15,225948911,873452160,false +2018,5,15,1862080183,2036950538,true +2018,5,15,453778084,1500860483,false +2018,5,16,2009877447,1199091431,true +2018,5,16,1097369713,43866804,true +2018,5,16,785379715,114826168,true +2018,5,16,1104026900,23426484,true +2018,5,16,791299259,457284454,false +2018,5,16,451778958,1059962786,true +2018,5,16,1444228650,1483736446,false +2018,5,16,1386802487,1946038555,false +2018,5,16,364588998,1433285961,false +2018,5,16,1454927906,540164498,true +2018,5,17,1887404674,49236742,false +2018,5,17,1718477158,1391243865,false +2018,5,17,1748961533,112126552,true +2018,5,17,1417826850,1646261299,false +2018,5,17,1391286221,148984898,false +2018,5,17,685285208,409385863,true +2018,5,17,432524900,830343241,false +2018,5,17,503892980,541619656,true +2018,5,17,603169739,1003355022,false +2018,5,17,123715995,285114042,false +2018,5,18,784953500,1756610050,true +2018,5,18,1615759887,482918031,false +2018,5,18,1768633350,1159259398,true +2018,5,18,12141651,1980092950,false +2018,5,18,1683137480,1341855760,false +2018,5,18,200523732,224826482,true +2018,5,18,105502554,198965033,true +2018,5,18,1588335927,393537046,false +2018,5,18,793550016,444273767,false +2018,5,18,2043975595,764374770,false +2018,5,19,2039812547,936849583,true +2018,5,19,726229705,1439754778,true +2018,5,19,1661077114,442247516,false +2018,5,19,1318283362,1156699984,true +2018,5,19,294405740,121329576,true +2018,5,19,644187862,2074299185,false +2018,5,19,133938101,1579408353,false +2018,5,19,528985027,849121533,false +2018,5,19,1266975732,1457274326,false +2018,5,19,1801562158,485652777,true +2018,5,20,280760552,913129728,true +2018,5,20,1056567924,1449989700,true +2018,5,20,328752899,528127564,true +2018,5,20,21786532,941186517,true +2018,5,20,1766308458,713983642,true +2018,5,20,31478035,1780561207,true +2018,5,20,1166182158,1217697863,true +2018,5,20,995191880,1217589919,false +2018,5,20,620600035,1417731936,false +2018,5,20,800457362,1356191959,false +2018,5,21,469369191,1187507021,false 
+2018,5,21,1016857316,538157134,true +2018,5,21,1681720818,656502290,true +2018,5,21,1120680912,1935624839,true +2018,5,21,2079536776,1116117124,false +2018,5,21,477889717,651842841,false +2018,5,21,352600864,887032824,false +2018,5,21,1305570425,301299635,false +2018,5,21,1065434529,1895038293,false +2018,5,21,620710251,2143255788,true +2018,5,22,1598512389,1935804062,false +2018,5,22,528967968,1013516221,false +2018,5,22,1056379488,1299053378,false +2018,5,22,1788143716,158353401,false +2018,5,22,1535215801,435171540,false +2018,5,22,1895316123,1639730729,true +2018,5,22,1889050880,735174916,false +2018,5,22,367783204,1387850627,false +2018,5,22,1748087166,1468064485,false +2018,5,22,573345438,718103817,false +2018,5,23,367514773,1395902697,false +2018,5,23,358130456,2113980647,false +2018,5,23,1707930796,434226711,false +2018,5,23,1454769563,1912404906,true +2018,5,23,1134695318,1539467372,false +2018,5,23,266392717,1449633516,true +2018,5,23,1886919133,875988836,false +2018,5,23,1938479577,128427392,false +2018,5,23,1024906924,1244695830,true +2018,5,23,1155569899,1099655500,true +2018,5,24,130765657,935513463,false +2018,5,24,1542953939,1092465999,false +2018,5,24,1725973946,1954071188,true +2018,5,24,2135330358,1914827395,true +2018,5,24,1866584772,2039418345,true +2018,5,24,1494155220,1237840028,false +2018,5,24,913508141,946265626,true +2018,5,24,1173330375,595794903,true +2018,5,24,1912667654,1498815932,false +2018,5,24,1463744720,1553311572,false +2018,5,25,1571990918,1229351671,true +2018,5,25,1928301639,2067460009,false +2018,5,25,1788432507,360880409,true +2018,5,25,1439538309,48954298,true +2018,5,25,1375498021,681849978,true +2018,5,25,752174316,1711575547,true +2018,5,25,1468347485,337149511,false +2018,5,25,1027050707,374105750,true +2018,5,25,791286872,1266444549,true +2018,5,25,1316368376,1063420737,true +2018,5,26,129523822,746100716,true +2018,5,26,1224532568,2000077724,false +2018,5,26,1059800996,45486703,false +2018,5,26,1724267631,1487481197,true +2018,5,26,1003259370,1765315207,false +2018,5,26,949539919,1848734235,true +2018,5,26,809219155,1085267871,false +2018,5,26,167159064,2080164661,false +2018,5,26,2021546668,118741293,false +2018,5,26,1064016299,1081234919,true +2018,5,27,1018208062,119634789,true +2018,5,27,1925654711,1941727655,false +2018,5,27,1115704223,1568576900,false +2018,5,27,1434795888,933348489,false +2018,5,27,1645798876,636480668,true +2018,5,27,1306693001,2102862833,true +2018,5,27,1785095016,1910861565,false +2018,5,27,12808274,1353624229,true +2018,5,27,1479970700,1980414847,true +2018,5,27,881363968,433721706,true +2018,5,28,1440297298,1734468950,true +2018,5,28,1339255184,1618048427,false +2018,5,28,91315488,406509640,true +2018,5,28,787250369,216055268,false +2018,5,28,26459258,1424253519,false +2018,5,28,748984594,844533186,true +2018,5,28,1678115498,674987791,false +2018,5,28,1154573787,1580905526,false +2018,5,28,171102012,1925865132,true +2018,5,28,2062524610,900688881,true +2018,5,29,61208459,901813474,true +2018,5,29,2068398920,432849668,false +2018,5,29,1758078685,164116076,false +2018,5,29,2129181480,98277520,false +2018,5,29,1421347001,410335776,false +2018,5,29,1280747178,1043654610,true +2018,5,29,1338921850,926648392,true +2018,5,29,423869769,276372975,false +2018,5,29,657594705,2126548507,true +2018,5,29,1875184191,600050761,true +2018,5,30,359158833,1906588783,true +2018,5,30,228274626,2004237125,true +2018,5,30,492354909,176021486,true +2018,5,30,1461547813,616104392,true +2018,5,30,1174027150,1382518399,true 
+2018,5,30,1347635488,2071225307,false +2018,5,30,1288805095,1209059484,false +2018,5,30,1214922453,1670145839,false +2018,5,30,213542968,2070001351,false +2018,5,30,108305282,967759481,false +2018,5,31,1554103614,613739115,true +2018,5,31,667301850,1322083559,true +2018,5,31,1561664988,107482812,true +2018,5,31,905574258,533027089,true +2018,5,31,1059669640,1419530623,false +2018,5,31,1975012817,2043374390,true +2018,5,31,1044023898,905366714,true +2018,5,31,484948480,2074469547,true +2018,5,31,1110584201,1866379706,false +2018,5,31,1667140775,1079074498,true +2018,6,1,1730249519,538620668,true +2018,6,1,981101600,513166737,true +2018,6,1,1324869042,644283859,true +2018,6,1,1286090510,273342307,true +2018,6,1,1687521059,712527326,true +2018,6,1,2098045254,1414607804,true +2018,6,1,2129282925,1184109493,false +2018,6,1,2006444096,1754043694,true +2018,6,1,1088475188,1734319442,false +2018,6,1,1917539136,1787283326,false +2018,6,2,1914035880,182745626,true +2018,6,2,1291632888,1449439167,true +2018,6,2,1435075010,325338528,false +2018,6,2,855586644,908340848,true +2018,6,2,1473286561,1468296041,true +2018,6,2,1585521581,1582367686,true +2018,6,2,1605058300,272081762,true +2018,6,2,2043725979,983471263,true +2018,6,2,1137271280,1370532301,false +2018,6,2,379384850,360477397,false +2018,6,3,922006923,398679743,true +2018,6,3,157942684,819541490,true +2018,6,3,254181850,1253562518,true +2018,6,3,1545653707,1001752937,false +2018,6,3,1249568893,1895848423,false +2018,6,3,1550847788,1317399519,true +2018,6,3,691349461,2146916052,true +2018,6,3,1542833777,1185709878,false +2018,6,3,900351860,2046760656,false +2018,6,3,1835945606,290589519,true +2018,6,4,1267911767,1040176946,false +2018,6,4,59657928,430900664,false +2018,6,4,1457511951,1196159023,true +2018,6,4,262903974,1916509902,false +2018,6,4,484949477,773402896,false +2018,6,4,211947328,386699511,false +2018,6,4,417539993,730513850,true +2018,6,4,1762955499,1371883768,true +2018,6,4,1904686936,612640485,false +2018,6,4,979884970,2081293978,true +2018,6,5,1825788656,303225756,true +2018,6,5,842357997,2105301792,false +2018,6,5,1020355697,274472404,true +2018,6,5,416684604,607649584,true +2018,6,5,1667302073,1874681145,true +2018,6,5,1088959387,1185868866,false +2018,6,5,122159265,354026124,true +2018,6,5,314784199,648480186,true +2018,6,5,379573699,2101170980,true +2018,6,5,1445298585,1904606251,true +2018,6,6,640056338,1991563109,false +2018,6,6,189600567,1861845372,false +2018,6,6,563692472,695874653,true +2018,6,6,881172185,121237223,false +2018,6,6,943379193,1970513849,true +2018,6,6,754405964,1420506630,false +2018,6,6,1012177419,130194127,false +2018,6,6,750997881,1443228779,false +2018,6,6,1928626976,1391975552,true +2018,6,6,1287012759,239240574,false +2018,6,7,176039198,411456266,true +2018,6,7,1294571996,974712964,true +2018,6,7,1617120175,1343630895,false +2018,6,7,1675400632,595058992,true +2018,6,7,477745586,1927028541,true +2018,6,7,1905165323,206149376,true +2018,6,7,1299161915,268414930,true +2018,6,7,819991511,282334296,false +2018,6,7,918278805,749259128,false +2018,6,7,143653454,839118876,true +2018,6,8,892538897,2051256168,true +2018,6,8,1331834720,845305134,true +2018,6,8,1576801927,638214432,false +2018,6,8,1903613191,2114382125,false +2018,6,8,88002244,932603776,false +2018,6,8,1931410179,943671347,true +2018,6,8,1753193069,770949042,false +2018,6,8,162505679,525407634,false +2018,6,8,622031605,2118630929,false +2018,6,8,714475755,21389755,true +2018,6,9,2038752185,149532860,true +2018,6,9,40889299,1397941209,true 
+2018,6,9,1665953379,271617182,true +2018,6,9,78517821,1277383100,false +2018,6,9,740241177,1255430425,false +2018,6,9,1277286256,241077040,true +2018,6,9,1666099763,97791371,true +2018,6,9,1669176789,2010998230,false +2018,6,9,850796383,1628748140,true +2018,6,9,785355534,966540559,true +2018,6,10,551657089,2126990010,true +2018,6,10,84389437,818875811,true +2018,6,10,3537073,633825248,true +2018,6,10,574804794,1958962980,false +2018,6,10,1245615143,1063004522,false +2018,6,10,207462126,941878593,false +2018,6,10,1811854322,127065374,true +2018,6,10,1059340296,1002443135,false +2018,6,10,1729232076,341831505,false +2018,6,10,1769177290,816589341,true +2018,6,11,839808624,1037103247,false +2018,6,11,156877223,830317823,false +2018,6,11,1239878961,1889246455,false +2018,6,11,256796122,1850990081,false +2018,6,11,1241397711,1183866836,false +2018,6,11,745663046,965649917,true +2018,6,11,331220408,1108261206,false +2018,6,11,1044962372,68803681,true +2018,6,11,829272118,102006146,true +2018,6,11,441841958,512962359,true +2018,6,12,2103563574,397994672,false +2018,6,12,1715157136,960050658,true +2018,6,12,1571664056,1700684579,true +2018,6,12,1543713536,861288703,false +2018,6,12,833749499,964884656,false +2018,6,12,161647388,396174892,true +2018,6,12,1293334533,841761386,false +2018,6,12,284134298,618395813,false +2018,6,12,1036125471,598668228,false +2018,6,12,850025693,99047410,false +2018,6,13,1724074322,1442361782,true +2018,6,13,1430052005,1548346402,false +2018,6,13,430408719,57310650,true +2018,6,13,746753881,1552722751,false +2018,6,13,261897093,639438793,true +2018,6,13,238626265,753608198,true +2018,6,13,492115031,1810722714,true +2018,6,13,1642855393,1377102339,false +2018,6,13,839335590,479156937,false +2018,6,13,1763807828,1020956068,false +2018,6,14,305100210,1346001091,false +2018,6,14,748924435,345904862,true +2018,6,14,1381377410,436541048,false +2018,6,14,1592868655,1246578556,false +2018,6,14,577203792,643666761,false +2018,6,14,328062351,950288940,false +2018,6,14,563068834,1725046193,true +2018,6,14,41156450,475523560,true +2018,6,14,2093659494,2066409870,false +2018,6,14,376883961,83562833,false +2018,6,15,311512940,1524857635,false +2018,6,15,690002138,412197387,true +2018,6,15,1559402492,708380959,true +2018,6,15,959562799,1948148807,false +2018,6,15,253356183,1576772945,true +2018,6,15,212154073,1869581559,true +2018,6,15,783762419,578109137,true +2018,6,15,604640204,1371784549,false +2018,6,15,592397012,588691913,false +2018,6,15,2010385806,1943308574,false +2018,6,16,249226582,1432434717,false +2018,6,16,1613041630,589729650,false +2018,6,16,1928711354,198874020,false +2018,6,16,251489013,660605948,true +2018,6,16,2084045282,455067065,true +2018,6,16,181926883,1864693226,true +2018,6,16,2083261171,1315952959,true +2018,6,16,654291184,12634114,true +2018,6,16,312655576,1764023159,true +2018,6,16,1180078602,321674128,true +2018,6,17,205229990,810966328,true +2018,6,17,1085076647,1945513669,false +2018,6,17,1029504258,1642613090,true +2018,6,17,1069287336,426222388,false +2018,6,17,608179748,788033195,false +2018,6,17,886652701,1514019688,false +2018,6,17,2025420548,715254294,false +2018,6,17,1166649592,635594438,true +2018,6,17,826248918,573437102,false +2018,6,17,2041817876,1487548743,false +2018,6,18,731008486,2115584609,false +2018,6,18,629893144,564150428,false +2018,6,18,236201830,13201675,true +2018,6,18,2033703971,893565669,false +2018,6,18,2009837534,36818456,true +2018,6,18,1088682816,239047425,false +2018,6,18,1903926061,1257129534,true 
+2018,6,18,2077002516,1534268696,false +2018,6,18,1736148323,374896187,true +2018,6,18,2014275141,283343359,false +2018,6,19,670914526,1821433006,false +2018,6,19,919020420,1568535155,true +2018,6,19,1220235375,1015934029,true +2018,6,19,439618382,571528045,true +2018,6,19,1281573489,2069791535,false +2018,6,19,499887798,304481012,false +2018,6,19,1054976532,985458413,false +2018,6,19,1516216021,807831867,true +2018,6,19,1786194001,832841400,true +2018,6,19,1696303032,399625200,true +2018,6,20,1529154114,1872395194,true +2018,6,20,1166973665,1432455677,true +2018,6,20,1926834377,987774305,false +2018,6,20,1797035193,1769454651,false +2018,6,20,1790559703,582619749,true +2018,6,20,1020923266,1731271607,true +2018,6,20,1280418476,1908244640,true +2018,6,20,1979056553,970049692,true +2018,6,20,2073652968,1543875863,false +2018,6,20,1528880782,624981224,true +2018,6,21,1396404805,1367429829,false +2018,6,21,1213118483,787881072,true +2018,6,21,1971979087,1353613830,true +2018,6,21,1412884634,558286810,true +2018,6,21,986286974,834792371,false +2018,6,21,860164151,532458141,false +2018,6,21,149919093,1948852818,true +2018,6,21,139821371,1067920424,false +2018,6,21,306003028,566077665,false +2018,6,21,618559340,764390074,false +2018,6,22,1253268322,941861414,false +2018,6,22,410406014,2054034065,false +2018,6,22,31499131,1949179025,true +2018,6,22,649136052,480726652,true +2018,6,22,844582915,2010887692,false +2018,6,22,2057992102,378154748,true +2018,6,22,218310477,1949218895,true +2018,6,22,833113249,176110436,true +2018,6,22,392549050,655169936,false +2018,6,22,933607206,1685783755,false +2018,6,23,2005661945,1527262893,true +2018,6,23,1187098984,181310632,false +2018,6,23,1813049644,1411486489,false +2018,6,23,1339401182,1146585445,true +2018,6,23,322498537,480700477,true +2018,6,23,2058638403,112304479,false +2018,6,23,642074439,1301608709,true +2018,6,23,364598169,927158526,true +2018,6,23,982403444,7612367,true +2018,6,23,1760246540,1145461488,true +2018,6,24,1425056146,753441795,true +2018,6,24,1075202246,946461098,true +2018,6,24,840308933,1568061932,true +2018,6,24,575153431,1459318716,false +2018,6,24,923061689,1396955829,false +2018,6,24,1476728183,1000474786,true +2018,6,24,1813279500,776555411,true +2018,6,24,856716880,108758539,true +2018,6,24,722683225,641671814,true +2018,6,24,416370072,171335126,false +2018,6,25,730309603,584696883,true +2018,6,25,1382872982,1807010613,true +2018,6,25,58629897,1938938312,true +2018,6,25,401187953,892395955,true +2018,6,25,2036987395,1313226474,true +2018,6,25,2051565655,356457234,false +2018,6,25,711651753,699592467,true +2018,6,25,920015198,305852726,false +2018,6,25,250928168,499744508,true +2018,6,25,1230511469,1049709475,true +2018,6,26,739153391,523176410,false +2018,6,26,771299277,987565106,false +2018,6,26,1364952641,1438776214,false +2018,6,26,670378296,1548773634,true +2018,6,26,1141436975,63346767,true +2018,6,26,15437432,2146636254,false +2018,6,26,1532309213,1790172868,false +2018,6,26,744737020,332000647,false +2018,6,26,624245866,445724280,false +2018,6,26,327909861,1388156834,true +2018,6,27,1058909395,284660656,true +2018,6,27,1635390333,1352417852,false +2018,6,27,155934363,1286285970,false +2018,6,27,1820233243,341394979,false +2018,6,27,479008048,1305722909,false +2018,6,27,210117849,166663044,true +2018,6,27,2026465597,93647527,true +2018,6,27,510348198,1104242829,true +2018,6,27,449673574,189449507,false +2018,6,27,1243562917,1822901767,false +2018,6,28,1783906337,514596715,false +2018,6,28,1469557710,1891310112,true 
+2018,6,28,407375282,606675757,true +2018,6,28,1047837516,1509464033,false +2018,6,28,1027501423,591394907,false +2018,6,28,1464426516,1418378750,true +2018,6,28,232866766,1663262000,true +2018,6,28,1240491940,1440299531,true +2018,6,28,57689850,603694361,false +2018,6,28,2087610606,166047028,false +2018,6,29,1489348057,1968125741,true +2018,6,29,1474440658,1198124553,false +2018,6,29,231002569,468899842,true +2018,6,29,1626941379,290342663,true +2018,6,29,1344241027,557673450,false +2018,6,29,1336204896,44004310,false +2018,6,29,693309410,80557585,true +2018,6,29,1950126121,1206964096,false +2018,6,29,911184765,397293037,true +2018,6,29,1250180252,1990218765,false +2018,6,30,239838297,1335359480,true +2018,6,30,1404195933,1246480442,false +2018,6,30,2146538272,747390906,true +2018,6,30,115287860,463229529,false +2018,6,30,810482074,904691837,false +2018,6,30,674113796,724479852,true +2018,6,30,1783553629,365367118,true +2018,6,30,886054416,765531166,false +2018,6,30,514348531,547218450,false +2018,6,30,349600801,1616269618,true +2018,6,31,254336485,755917019,false +2018,6,31,2001838447,1280952169,true +2018,6,31,250514021,389979577,false +2018,6,31,814496599,1996887841,true +2018,6,31,959846809,2023154884,false +2018,6,31,396043845,1778243094,false +2018,6,31,1349199789,1808461079,true +2018,6,31,284722528,1017250314,false +2018,6,31,730572084,556786908,true +2018,6,31,1956027347,1244983616,true +2018,7,1,2023056781,953915862,true +2018,7,1,2072011469,1673345549,false +2018,7,1,1521921335,1213224156,true +2018,7,1,789429896,883228910,false +2018,7,1,2066098272,607656000,false +2018,7,1,1394860544,328032482,false +2018,7,1,281030678,312095267,false +2018,7,1,835302425,41468924,false +2018,7,1,1628360926,2091797273,true +2018,7,1,638530151,813372004,false +2018,7,2,2046242721,2045286066,true +2018,7,2,882289884,850668615,true +2018,7,2,1798095404,1174494909,false +2018,7,2,1101110011,1725710175,true +2018,7,2,1332471477,948837222,false +2018,7,2,791709289,584958749,true +2018,7,2,103795240,566786427,true +2018,7,2,992438264,520441500,true +2018,7,2,1908461348,490104183,true +2018,7,2,1322036084,547529188,true +2018,7,3,521263090,91831601,false +2018,7,3,1303981022,2047488042,false +2018,7,3,1268548577,1864825650,false +2018,7,3,327199973,165989129,false +2018,7,3,552774847,366564904,false +2018,7,3,233041833,1795215086,false +2018,7,3,1255787376,1956690931,false +2018,7,3,1802260537,727314220,true +2018,7,3,1483371226,1273860914,true +2018,7,3,556478029,2140834926,true +2018,7,4,47273614,1009031014,false +2018,7,4,1435569949,516090506,true +2018,7,4,1001547515,812925698,true +2018,7,4,492746612,783987418,false +2018,7,4,2097348380,748380159,false +2018,7,4,786449480,2031257258,false +2018,7,4,1822097357,910938828,true +2018,7,4,1231279483,848038236,true +2018,7,4,272274258,588895415,false +2018,7,4,1719959962,450120780,false +2018,7,5,1600864186,991291934,false +2018,7,5,1639076523,1606416034,true +2018,7,5,89028897,779042313,true +2018,7,5,201444211,705327503,false +2018,7,5,1119165690,1697808382,false +2018,7,5,228800949,1398136233,true +2018,7,5,1678030313,1277049388,false +2018,7,5,1337701717,1892429921,true +2018,7,5,1445407708,1612031300,true +2018,7,5,185101680,499753164,false +2018,7,6,247801437,118264978,false +2018,7,6,1979685821,661393709,false +2018,7,6,1135343443,20840505,true +2018,7,6,43611094,471103244,false +2018,7,6,952496818,2036352161,false +2018,7,6,1931245375,941094335,true +2018,7,6,620475332,1240544625,true +2018,7,6,483594301,1992523682,false 
+2018,7,6,747735823,266661809,false +2018,7,6,1972962529,1023078523,true +2018,7,7,845494287,1637792230,true +2018,7,7,187992274,868990915,true +2018,7,7,6535218,2033617478,true +2018,7,7,1333542841,1428894655,false +2018,7,7,473917660,1671481942,true +2018,7,7,1754182491,49510067,false +2018,7,7,2133167199,1963652239,false +2018,7,7,974874570,945595798,true +2018,7,7,1768718888,1179026399,true +2018,7,7,1268655406,1961161741,false +2018,7,8,1208369295,730705439,true +2018,7,8,1446777371,27976606,true +2018,7,8,1893526287,1239505368,true +2018,7,8,1017848429,1418157137,true +2018,7,8,269568469,1494613227,false +2018,7,8,97517081,521388860,false +2018,7,8,1548713948,305882355,false +2018,7,8,547825796,2135820207,false +2018,7,8,1894789908,1389437828,true +2018,7,8,40086575,1504549374,false +2018,7,9,1038712399,346245008,true +2018,7,9,628699108,638543635,true +2018,7,9,981534431,881296021,false +2018,7,9,348647844,2055395920,false +2018,7,9,1927839790,1709024100,true +2018,7,9,1931411713,182492017,true +2018,7,9,307254705,1101840800,false +2018,7,9,2093614517,763809892,false +2018,7,9,300232718,789273675,false +2018,7,9,1597716618,2074815315,true +2018,7,10,412623951,598760181,true +2018,7,10,1412744180,1066240859,true +2018,7,10,1696441443,1680811114,true +2018,7,10,1003804393,1123451552,false +2018,7,10,1717784185,138749684,false +2018,7,10,739844063,164839097,false +2018,7,10,1196068039,2075397513,false +2018,7,10,297170682,183445679,false +2018,7,10,673130968,592691844,true +2018,7,10,1874527684,1974315016,true +2018,7,11,635084168,1365908400,true +2018,7,11,1111525391,542234764,true +2018,7,11,1266382116,214232275,true +2018,7,11,448472277,834421963,false +2018,7,11,1549002101,35460352,false +2018,7,11,518203172,902358922,false +2018,7,11,725278614,1099315280,true +2018,7,11,1634557575,2006082725,false +2018,7,11,743111145,343214804,false +2018,7,11,219247456,985572874,true +2018,7,12,532984974,2086447845,false +2018,7,12,536794928,698901451,false +2018,7,12,1243358488,1500993560,false +2018,7,12,932805068,887596682,true +2018,7,12,874674052,92678522,true +2018,7,12,1348330464,1973393831,true +2018,7,12,43713178,2013401321,true +2018,7,12,555816023,1746147057,false +2018,7,12,349862964,991152807,true +2018,7,12,889278848,1791936462,false +2018,7,13,2139174453,1950031727,false +2018,7,13,173627402,913283097,true +2018,7,13,1292551942,1183607992,false +2018,7,13,552845610,1222841637,true +2018,7,13,1908746480,1399294455,false +2018,7,13,143093384,846938014,false +2018,7,13,109429180,1778920167,false +2018,7,13,1361778120,1784930638,true +2018,7,13,544077909,1463197268,true +2018,7,13,965408457,905818087,false +2018,7,14,1203540063,1456639446,true +2018,7,14,695414512,592837721,false +2018,7,14,1781662726,109051511,false +2018,7,14,818891032,1008824946,false +2018,7,14,1715016299,1157272395,false +2018,7,14,357461247,1230311081,false +2018,7,14,2121385347,637717069,true +2018,7,14,1473657418,725745764,true +2018,7,14,1921768506,1296886063,true +2018,7,14,73775732,1082146084,false +2018,7,15,1764810298,1864327898,true +2018,7,15,2089302592,180102481,false +2018,7,15,1797305519,2034628553,true +2018,7,15,1874138767,1137268127,false +2018,7,15,1808500688,1175224979,true +2018,7,15,1919871392,1555474504,false +2018,7,15,983982884,1440126775,false +2018,7,15,2043731242,1741950857,false +2018,7,15,328127151,837527297,false +2018,7,15,1773437648,1904502563,false +2018,7,16,1826412949,832539693,true +2018,7,16,1051412439,1373449430,true +2018,7,16,240017105,681576339,true 
+2018,7,16,846443059,741953549,false +2018,7,16,636421901,1154295795,true +2018,7,16,1869145147,1482131247,true +2018,7,16,231897803,1537643972,false +2018,7,16,1808841128,1687151957,false +2018,7,16,490462801,1664984985,true +2018,7,16,1247148971,1565480280,false +2018,7,17,868603165,329564782,false +2018,7,17,1472068416,328021874,true +2018,7,17,148659404,945434236,false +2018,7,17,1059989509,326251746,true +2018,7,17,1122118585,53937261,true +2018,7,17,225658686,1823310166,true +2018,7,17,998940092,1917024254,true +2018,7,17,971149777,2048181265,true +2018,7,17,1572840914,2146917496,true +2018,7,17,1086359893,1057739512,false +2018,7,18,1863337370,1458084994,true +2018,7,18,992121752,1928648720,true +2018,7,18,1243268070,805261884,false +2018,7,18,448901406,925595165,false +2018,7,18,1515246031,76514726,true +2018,7,18,1801305732,839490388,false +2018,7,18,1189155343,1536173754,true +2018,7,18,1548939735,1158281574,true +2018,7,18,1470011541,931392965,false +2018,7,18,67468345,78717006,true +2018,7,19,1981614214,1531918307,false +2018,7,19,2067379555,1496124832,true +2018,7,19,1720681770,1397425836,true +2018,7,19,1354746355,775238741,false +2018,7,19,137217657,753127793,true +2018,7,19,1132977790,1622647111,false +2018,7,19,674815037,908010395,false +2018,7,19,809748704,1278705362,false +2018,7,19,1171755790,1862199049,false +2018,7,19,2123118145,1383458396,false +2018,7,20,1305973906,2094531200,false +2018,7,20,670834624,1515665582,false +2018,7,20,1088835768,1009546127,true +2018,7,20,1391206614,1468018281,true +2018,7,20,347410424,1661046895,true +2018,7,20,260729276,1234241529,true +2018,7,20,2039697082,646019529,true +2018,7,20,898102652,667994321,true +2018,7,20,1724792941,1107051878,false +2018,7,20,253898371,162778076,false +2018,7,21,937698553,109634103,false +2018,7,21,1808569790,1211191133,false +2018,7,21,715490703,1641509695,false +2018,7,21,514400171,2047567041,false +2018,7,21,1846052358,855421490,true +2018,7,21,2102534128,293080325,true +2018,7,21,502648390,979837941,false +2018,7,21,417175367,643089601,false +2018,7,21,1503212396,1444960324,false +2018,7,21,258005227,326003452,true +2018,7,22,989317808,1780769563,true +2018,7,22,2125697989,1986843726,true +2018,7,22,398622775,1983121863,true +2018,7,22,678078729,908431359,true +2018,7,22,1590704717,1710231194,false +2018,7,22,551791882,238417697,false +2018,7,22,974720453,1773694401,false +2018,7,22,1190709347,1429642718,true +2018,7,22,2110825068,806718123,true +2018,7,22,2039398096,1478901095,false +2018,7,23,156741329,324342014,true +2018,7,23,775583306,1068513611,false +2018,7,23,1068628182,288093461,false +2018,7,23,857490496,1947372840,true +2018,7,23,1590795453,493576047,false +2018,7,23,1166727172,126044131,false +2018,7,23,119213753,1240576936,true +2018,7,23,1297653963,760827895,true +2018,7,23,1463577376,2012421827,true +2018,7,23,2133416649,1113395323,false +2018,7,24,484100055,1295481337,false +2018,7,24,1670632014,1674146016,true +2018,7,24,131687825,91006532,true +2018,7,24,1614886141,1581590956,false +2018,7,24,1954572783,732242527,true +2018,7,24,525879398,2043197071,false +2018,7,24,1815495602,1522552820,true +2018,7,24,142205615,1950999202,true +2018,7,24,229281900,1003746862,false +2018,7,24,2021973265,1821296062,false +2018,7,25,2096717695,1661591273,false +2018,7,25,1692645714,1028002337,false +2018,7,25,666865741,254230835,true +2018,7,25,375663238,285759457,false +2018,7,25,2041960921,13752037,true +2018,7,25,409530042,244807641,true +2018,7,25,1963481173,708136048,false 
+2018,7,25,1124514390,1835037204,false +2018,7,25,114841934,1263927609,true +2018,7,25,1983688538,1354046443,true +2018,7,26,1654200692,364387353,true +2018,7,26,871156164,2089003076,false +2018,7,26,530403115,1702730460,true +2018,7,26,810491837,1715034299,true +2018,7,26,1548829556,1713278160,false +2018,7,26,1982832482,309573789,false +2018,7,26,955117253,1043130311,false +2018,7,26,1876321602,1365474041,false +2018,7,26,868655592,456371228,false +2018,7,26,303672812,506465077,false +2018,7,27,792281860,606021524,false +2018,7,27,1740771868,1452352491,false +2018,7,27,1315284306,1647011888,true +2018,7,27,934805609,1016193527,false +2018,7,27,86183473,1266493271,false +2018,7,27,1672450186,1442306557,true +2018,7,27,1454580884,435745498,true +2018,7,27,883338112,1459700946,true +2018,7,27,225599489,1045556411,false +2018,7,27,1358129491,2084270987,false +2018,7,28,111813273,526676937,false +2018,7,28,141377561,1301935650,false +2018,7,28,1044778466,1007040097,false +2018,7,28,230982021,1117669358,true +2018,7,28,1911186500,455271939,false +2018,7,28,426736437,645266271,true +2018,7,28,813529477,208588150,true +2018,7,28,1405856622,2058638021,true +2018,7,28,2115166874,1803304102,true +2018,7,28,2114039567,798852613,false +2018,7,29,850940332,33088659,true +2018,7,29,445327067,1895478661,false +2018,7,29,41248043,68233951,true +2018,7,29,752329075,2127710403,true +2018,7,29,93337909,1364918234,true +2018,7,29,847178896,465126212,true +2018,7,29,781111494,1414736680,false +2018,7,29,453499742,1582204135,true +2018,7,29,1694204008,2099782222,true +2018,7,29,1455415560,174326466,false +2018,7,30,1016723037,528392646,false +2018,7,30,897361367,1742625802,false +2018,7,30,861953150,234241139,false +2018,7,30,1603881582,59691705,false +2018,7,30,1265167926,983578242,true +2018,7,30,1562849293,1152659960,false +2018,7,30,470042994,1572345894,true +2018,7,30,1344349660,1185782887,false +2018,7,30,1018383188,94618023,false +2018,7,30,836803250,4139746,true +2018,7,31,779321868,1270234559,true +2018,7,31,2059483091,2039464025,false +2018,7,31,729133862,786196541,false +2018,7,31,1557652631,1667043783,true +2018,7,31,293124650,662277733,false +2018,7,31,402125819,1376102778,false +2018,7,31,219784558,69387140,false +2018,7,31,33223817,2076128032,true +2018,7,31,1820690258,691900337,false +2018,7,31,783128605,1621865052,false +2018,8,1,941785203,1020211124,false +2018,8,1,355744875,1837365195,false +2018,8,1,1749033846,123708550,true +2018,8,1,965416541,78497989,true +2018,8,1,1213140331,1539022323,false +2018,8,1,232632725,1662864443,true +2018,8,1,881554951,2130423600,true +2018,8,1,1208855228,1874557821,false +2018,8,1,1656561955,2069464470,false +2018,8,1,497882732,1574838237,true +2018,8,2,1376423741,1404950716,false +2018,8,2,266024910,1469341271,true +2018,8,2,2028412145,386257053,true +2018,8,2,1776079171,406292685,true +2018,8,2,286851493,1965324696,true +2018,8,2,845203127,97579135,false +2018,8,2,1737055196,526245737,false +2018,8,2,1253035121,1846040507,false +2018,8,2,1588970987,792485739,true +2018,8,2,2001061485,420770051,false +2018,8,3,1256183817,1655570368,true +2018,8,3,1072044979,2007509015,false +2018,8,3,1261343978,1603071030,true +2018,8,3,352052777,451431844,true +2018,8,3,75142249,1910698755,false +2018,8,3,1300980420,1573468292,true +2018,8,3,1555424242,1571867308,true +2018,8,3,1626122030,1432491998,true +2018,8,3,744593057,754278318,false +2018,8,3,83310906,1596626812,false +2018,8,4,196472665,658191310,false +2018,8,4,263913918,1235299381,false 
+2018,8,4,1067941178,1657752081,true +2018,8,4,119736228,295596096,true +2018,8,4,1978105669,1766392267,false +2018,8,4,251705689,778663875,true +2018,8,4,505150987,1735454798,true +2018,8,4,954210786,1717470211,true +2018,8,4,1379095224,1958218775,true +2018,8,4,337394745,1122306047,true +2018,8,5,1741543651,47949203,false +2018,8,5,1413210821,1112218455,true +2018,8,5,1349606620,1035372746,true +2018,8,5,978135563,2133217488,false +2018,8,5,766904954,323389592,true +2018,8,5,499100904,901708760,false +2018,8,5,1467039889,829386555,true +2018,8,5,214097391,681661650,true +2018,8,5,528239514,2023371099,false +2018,8,5,628372685,1096625687,true +2018,8,6,1872475701,1581783005,true +2018,8,6,312580019,515513320,false +2018,8,6,1728522378,1164292635,true +2018,8,6,2078861371,501456842,false +2018,8,6,858572724,1370395994,false +2018,8,6,1923096651,1358636035,true +2018,8,6,28851387,95455123,true +2018,8,6,942786263,954199601,false +2018,8,6,1556287184,1394574062,false +2018,8,6,1962476948,1741076529,false +2018,8,7,1930387127,226120825,true +2018,8,7,639508481,1600251084,true +2018,8,7,872964529,739066886,false +2018,8,7,339724618,1336039913,false +2018,8,7,1161152582,1789035575,true +2018,8,7,1091938802,1186036795,true +2018,8,7,1154259376,1492250905,false +2018,8,7,1400610631,1356048534,false +2018,8,7,501516002,530002151,false +2018,8,7,156282387,2045011317,true +2018,8,8,1229676640,1850764742,true +2018,8,8,821466879,1462399842,true +2018,8,8,829931614,1758193813,true +2018,8,8,1271088800,1714515246,false +2018,8,8,851586783,1174430767,false +2018,8,8,616123448,1657004356,true +2018,8,8,386570194,1305882032,false +2018,8,8,575859070,1464746882,true +2018,8,8,640936796,51650965,true +2018,8,8,519076622,277797709,false +2018,8,9,1064156874,1265028818,false +2018,8,9,2131278937,1161720806,true +2018,8,9,1685836008,1983287349,true +2018,8,9,1086001959,849487151,false +2018,8,9,1911357103,1147957681,true +2018,8,9,992636418,1040226641,true +2018,8,9,1726057857,1993001570,true +2018,8,9,6071858,1799933967,true +2018,8,9,1566983797,1102900581,false +2018,8,9,1471582472,1673075367,true +2018,8,10,1652343388,1738408676,true +2018,8,10,1539919270,877261708,false +2018,8,10,913246342,401625053,false +2018,8,10,1357951893,597162854,false +2018,8,10,1998475891,501202576,true +2018,8,10,288711609,614258960,true +2018,8,10,74118383,1843505820,false +2018,8,10,1203024011,922848404,false +2018,8,10,141175632,2140379944,true +2018,8,10,1342483359,1135805782,true +2018,8,11,868834250,181338808,false +2018,8,11,2054157791,1069025768,true +2018,8,11,1267495961,1762336629,false +2018,8,11,1415693573,1493552189,false +2018,8,11,459064300,1553137684,false +2018,8,11,2049096757,2099670840,false +2018,8,11,130109147,946688163,true +2018,8,11,540018381,585456332,true +2018,8,11,1110081999,2083534813,true +2018,8,11,384461758,482021923,false +2018,8,12,637442075,2015693021,true +2018,8,12,1546591499,1591960454,true +2018,8,12,174480066,606023236,true +2018,8,12,1217505870,1975176191,true +2018,8,12,582176923,914246442,true +2018,8,12,256443968,1658575087,false +2018,8,12,835747920,935684986,false +2018,8,12,1941556929,1515962373,true +2018,8,12,503356778,386273415,true +2018,8,12,392186838,1842504287,true +2018,8,13,1603956718,695025649,true +2018,8,13,816100752,1485375932,true +2018,8,13,1766623791,2057783877,false +2018,8,13,1372426716,1037793031,false +2018,8,13,1351713512,1531122641,true +2018,8,13,1991586478,1689182115,false +2018,8,13,2128017847,1665340836,false +2018,8,13,1901124099,844385089,false 
+2018,8,13,2077506189,59137465,true +2018,8,13,1420270761,479786531,true +2018,8,14,659386556,400645297,true +2018,8,14,1053562329,1376058397,true +2018,8,14,822711450,998350007,true +2018,8,14,847981823,134426292,true +2018,8,14,1014671096,1691508829,false +2018,8,14,820634692,1638704424,false +2018,8,14,1831859173,658370954,true +2018,8,14,1156067773,1950277454,true +2018,8,14,1834682987,2020665311,false +2018,8,14,1873474,1797737599,true +2018,8,15,954427723,545534743,false +2018,8,15,427707486,142425619,false +2018,8,15,1230479593,1726484812,false +2018,8,15,1466357472,773443861,false +2018,8,15,1655793884,1842780713,true +2018,8,15,2061580771,1294581050,false +2018,8,15,879555559,1588526964,true +2018,8,15,308318840,588793665,true +2018,8,15,1844590405,1424187502,true +2018,8,15,1244782639,981820223,true +2018,8,16,1993601670,1503125759,false +2018,8,16,1294465327,533688204,true +2018,8,16,49191805,717352623,false +2018,8,16,310680741,631663433,false +2018,8,16,111080929,1687620809,false +2018,8,16,1856485550,270152633,true +2018,8,16,68329652,1650030158,true +2018,8,16,829755397,123582940,true +2018,8,16,1122325396,2013933435,true +2018,8,16,1880148256,1398298133,false +2018,8,17,1708422264,1574507013,true +2018,8,17,1764942723,579727808,true +2018,8,17,1556884109,1283901035,true +2018,8,17,2027383148,18071296,false +2018,8,17,1342745898,1785645533,false +2018,8,17,1933198121,1333146929,true +2018,8,17,87774744,1515348917,false +2018,8,17,338699957,1917195452,true +2018,8,17,2145092769,548148846,true +2018,8,17,1563298159,60088136,false +2018,8,18,2090447653,2094954394,false +2018,8,18,1194527503,1738574198,false +2018,8,18,2110837533,1468251990,true +2018,8,18,1009309721,978633130,true +2018,8,18,1455694812,2100599576,true +2018,8,18,1018187651,1733561235,false +2018,8,18,1933760110,918958331,true +2018,8,18,1939184208,1576120943,false +2018,8,18,383188323,2138019787,true +2018,8,18,144911685,138429501,false +2018,8,19,991564134,666281349,true +2018,8,19,1148660740,975664725,true +2018,8,19,1318177774,627616004,false +2018,8,19,1078766132,1399066669,true +2018,8,19,2049363836,704953622,false +2018,8,19,775561858,1737511718,true +2018,8,19,844732681,1424470568,true +2018,8,19,949582996,1495948986,true +2018,8,19,1893669057,1348690252,false +2018,8,19,1811608145,1465308775,true +2018,8,20,1197875235,1462265420,false +2018,8,20,835412487,620405539,false +2018,8,20,1921655305,358551546,false +2018,8,20,1440083538,390634563,true +2018,8,20,330997083,1537580302,false +2018,8,20,1890551250,546063385,true +2018,8,20,693405010,775612706,false +2018,8,20,1855839255,1051599543,false +2018,8,20,1365975694,1712804021,false +2018,8,20,1111020110,1746974097,false +2018,8,21,61263480,834477111,true +2018,8,21,1886984580,1892545791,true +2018,8,21,1318821537,1889993100,true +2018,8,21,22956459,1664624840,false +2018,8,21,1503799325,590385297,true +2018,8,21,1067749284,183483200,false +2018,8,21,1004405516,1691449980,true +2018,8,21,1742914029,1892744856,false +2018,8,21,1511863506,1639638321,false +2018,8,21,121040590,2102262390,false +2018,8,22,22623144,1350316776,true +2018,8,22,208445472,218785786,true +2018,8,22,2031968395,1757094838,false +2018,8,22,561377273,1491832993,true +2018,8,22,1119005882,1991231950,true +2018,8,22,1649503967,1716969691,true +2018,8,22,383202882,1517732511,false +2018,8,22,1096980316,871643436,true +2018,8,22,1750141760,359683611,true +2018,8,22,619867141,1821077796,false +2018,8,23,435848681,975688387,true +2018,8,23,248558666,1668130206,true 
+2018,8,23,1685447547,965707855,false +2018,8,23,951871922,1917116420,false +2018,8,23,1119184534,550007822,false +2018,8,23,921246285,1875441284,false +2018,8,23,2049020349,2136201554,false +2018,8,23,2047778433,596919680,true +2018,8,23,430661866,1341975934,false +2018,8,23,27086721,714776734,true +2018,8,24,888402293,682312342,false +2018,8,24,230244615,1776854590,false +2018,8,24,1734102467,1669268574,false +2018,8,24,781626418,44108361,false +2018,8,24,445195896,2023177070,true +2018,8,24,1252919952,251364750,false +2018,8,24,2036380962,850030273,true +2018,8,24,315238329,143682861,true +2018,8,24,694805650,2094200749,false +2018,8,24,404880661,895654331,true +2018,8,25,1404549471,1149467970,false +2018,8,25,1213550228,2132068392,true +2018,8,25,634052941,195306370,false +2018,8,25,1375799425,1603341968,false +2018,8,25,23386764,805722106,false +2018,8,25,1440751045,190901626,true +2018,8,25,1531403265,697407636,false +2018,8,25,1047620390,2076639790,false +2018,8,25,285875750,1677617858,false +2018,8,25,1454554989,1974729009,false +2018,8,26,582936605,849528922,true +2018,8,26,92137850,787359756,false +2018,8,26,2084923139,1012063403,true +2018,8,26,842884265,2059189208,true +2018,8,26,382133685,705550268,false +2018,8,26,924006679,938393167,false +2018,8,26,597559672,2044604090,true +2018,8,26,1715189932,468510044,false +2018,8,26,1307857996,454666224,false +2018,8,26,1586379506,1121993037,true +2018,8,27,419555274,1020026345,false +2018,8,27,1394911517,367262782,true +2018,8,27,987950051,1538541711,false +2018,8,27,761651418,1306364252,false +2018,8,27,1617928920,2069636187,true +2018,8,27,612007859,626814208,false +2018,8,27,1821026370,504729334,true +2018,8,27,54843956,1920220896,true +2018,8,27,1572186886,654128606,false +2018,8,27,386828559,1421032905,false +2018,8,28,2020422278,670617317,false +2018,8,28,1080845301,82038932,false +2018,8,28,1734622573,861836199,false +2018,8,28,61261018,41759323,false +2018,8,28,1772672235,1288171602,true +2018,8,28,1924847584,937973785,true +2018,8,28,108173633,89477961,false +2018,8,28,770399008,1289907255,true +2018,8,28,1582228700,968711822,false +2018,8,28,1030608642,1860311229,false +2018,8,29,803600770,1440052471,true +2018,8,29,458543852,971756303,false +2018,8,29,767020857,147426459,false +2018,8,29,1659972459,1795851405,false +2018,8,29,910368487,1321444603,false +2018,8,29,1430472229,1254871404,true +2018,8,29,1496789736,1848380108,true +2018,8,29,1040235823,1096020893,true +2018,8,29,200898056,1869612150,false +2018,8,29,1639059111,733360545,false +2018,8,30,1657189791,779053471,false +2018,8,30,1062549148,1741257111,false +2018,8,30,42542538,340743514,true +2018,8,30,1482860709,860956400,true +2018,8,30,1564376816,470967768,true +2018,8,30,1312830397,2030601602,false +2018,8,30,337875472,608773541,true +2018,8,30,1813825087,1080807162,true +2018,8,30,1975420310,1591744991,true +2018,8,30,809900304,83769301,false +2018,8,31,456138400,1424938464,false +2018,8,31,704696415,1656010604,false +2018,8,31,1480858367,806105179,true +2018,8,31,1965616515,1303288509,false +2018,8,31,347422292,2036897496,false +2018,8,31,1898270937,1340421096,false +2018,8,31,407593242,1993823247,true +2018,8,31,1161352117,934553770,false +2018,8,31,610802306,1293597339,true +2018,8,31,336527811,1920766002,true +2018,9,1,1275893346,1997232372,true +2018,9,1,1717304939,2124041504,false +2018,9,1,1397972653,1714877621,false +2018,9,1,1229356562,1844996764,false +2018,9,1,1403085867,910547608,false +2018,9,1,1146698832,259378109,false 
+2018,9,1,1088418925,1423599020,false +2018,9,1,429772662,515228340,false +2018,9,1,124164909,1141590967,true +2018,9,1,1237677810,1584019846,false +2018,9,2,1506851740,608341434,false +2018,9,2,652432759,1167099948,false +2018,9,2,1619200591,205239063,true +2018,9,2,1137033465,1617163382,true +2018,9,2,1303677635,1414038218,true +2018,9,2,1040630224,1764603402,true +2018,9,2,1708940213,1761448361,true +2018,9,2,1659162537,930235721,false +2018,9,2,1638925252,619906486,false +2018,9,2,827860492,1021199889,true +2018,9,3,1966023693,1640672734,false +2018,9,3,163838875,1880987526,true +2018,9,3,1804273864,63787711,false +2018,9,3,1942507104,393709679,false +2018,9,3,1169657878,1044234311,true +2018,9,3,1991582904,789705675,false +2018,9,3,31710572,1040828749,false +2018,9,3,439150024,1201122216,true +2018,9,3,236739697,500120945,false +2018,9,3,1116476791,1388193886,true +2018,9,4,579616807,311215848,true +2018,9,4,2093096805,1054048450,false +2018,9,4,73270188,790280447,true +2018,9,4,477315416,844909552,false +2018,9,4,1587742719,2131579254,false +2018,9,4,649519381,243058610,true +2018,9,4,1655197324,1926077960,false +2018,9,4,1669921441,1433962164,false +2018,9,4,959910509,1791388854,true +2018,9,4,79670100,1062113287,false +2018,9,5,1229763227,1967339699,false +2018,9,5,717192023,850873761,false +2018,9,5,1365675406,989673214,false +2018,9,5,62372435,1838643670,false +2018,9,5,991963891,1464945937,true +2018,9,5,355163519,1553234947,true +2018,9,5,943712907,556083505,false +2018,9,5,914299991,1437590245,false +2018,9,5,943159114,1478735675,false +2018,9,5,802774678,223934281,true +2018,9,6,1298662474,585963012,false +2018,9,6,197638484,1022118939,true +2018,9,6,969469246,1350357165,true +2018,9,6,900565583,2060765450,false +2018,9,6,267601495,691848306,true +2018,9,6,2057471946,1242202920,false +2018,9,6,1259554786,1753454306,false +2018,9,6,1492330636,1690819417,true +2018,9,6,724951053,1466188157,true +2018,9,6,1956211423,1636981873,true +2018,9,7,1556947337,1974980400,false +2018,9,7,2019953312,1082901322,true +2018,9,7,1708216329,783500583,true +2018,9,7,2000254398,923339071,true +2018,9,7,353436626,1273996262,true +2018,9,7,1507529667,2009417157,true +2018,9,7,70505424,396379824,true +2018,9,7,442798127,1198020136,true +2018,9,7,188103156,1269397155,true +2018,9,7,1317476565,1357152907,true +2018,9,8,995657663,1876171793,true +2018,9,8,1171306616,1950965181,false +2018,9,8,130994508,1052763130,false +2018,9,8,1266185047,491188614,false +2018,9,8,1562198836,790118585,false +2018,9,8,1543921268,500988789,true +2018,9,8,1666675748,1139618884,true +2018,9,8,661185447,874946134,true +2018,9,8,1630895355,730537863,false +2018,9,8,758551537,1415482074,false +2018,9,9,71699142,2126951877,false +2018,9,9,349871740,23481881,true +2018,9,9,215991292,1927216136,false +2018,9,9,765791822,75349746,false +2018,9,9,1028673218,1445968899,true +2018,9,9,800891298,502954874,true +2018,9,9,72194083,570775283,true +2018,9,9,2069968213,625909232,false +2018,9,9,154909820,1408712803,true +2018,9,9,1205316599,191583485,false +2018,9,10,325300111,163703265,false +2018,9,10,853850967,1045163293,false +2018,9,10,1754042105,1185046533,true +2018,9,10,138835110,1995712905,false +2018,9,10,341505022,1282774609,true +2018,9,10,1663400583,950180097,false +2018,9,10,825864683,1737568993,true +2018,9,10,1735982498,2035871542,true +2018,9,10,1914225784,1531110364,false +2018,9,10,1425773464,83682774,false +2018,9,11,392680954,1156757538,false +2018,9,11,1979456869,1760219226,false 
+2018,9,11,1309624124,1304095157,true +2018,9,11,11274649,643724334,false +2018,9,11,1784025941,228578243,true +2018,9,11,1404279787,966847183,false +2018,9,11,240965608,1523236708,false +2018,9,11,1681086224,799860195,false +2018,9,11,811255418,506763624,false +2018,9,11,1319534313,1977789830,true +2018,9,12,1425268204,853987524,true +2018,9,12,1590868054,74919743,false +2018,9,12,1649382210,265969792,false +2018,9,12,321544531,1608876113,true +2018,9,12,873633585,1974645050,false +2018,9,12,1738942259,313516141,true +2018,9,12,1482144304,1131175320,false +2018,9,12,1583608333,1099932141,true +2018,9,12,58499845,1742817848,true +2018,9,12,115884986,1788670752,false +2018,9,13,163851259,1815330050,false +2018,9,13,1761080817,1039034447,true +2018,9,13,1962826058,872579314,true +2018,9,13,1229235815,1215948930,true +2018,9,13,2117928794,1705474727,true +2018,9,13,866740482,766922545,true +2018,9,13,1778661856,1235791710,false +2018,9,13,1265653622,1349469024,true +2018,9,13,1367895311,1420260136,false +2018,9,13,297704174,775389404,true +2018,9,14,1606352976,486640754,false +2018,9,14,259221383,1976871571,true +2018,9,14,2072172087,1607559740,true +2018,9,14,681599690,972145976,false +2018,9,14,1604790202,2095800218,true +2018,9,14,501081456,577847218,true +2018,9,14,1566922960,286392902,false +2018,9,14,679200265,2042561297,false +2018,9,14,1601760001,704158584,false +2018,9,14,1355656545,366474742,false +2018,9,15,42632844,364044598,false +2018,9,15,570404541,1067696044,false +2018,9,15,171183412,620313567,false +2018,9,15,1450298687,1347679598,false +2018,9,15,1472385493,790789794,false +2018,9,15,282955026,1996048523,true +2018,9,15,59307584,46301970,false +2018,9,15,31160890,1817925220,true +2018,9,15,255173921,1456035621,false +2018,9,15,1421672456,588608380,true +2018,9,16,1554444245,1145762362,true +2018,9,16,1688982448,359448847,false +2018,9,16,1510250616,800293111,false +2018,9,16,1511514481,202383592,true +2018,9,16,1960152726,1762236709,true +2018,9,16,667479879,1742126458,true +2018,9,16,1655337283,853859790,true +2018,9,16,1966428459,1796140122,false +2018,9,16,1038455685,1964841286,true +2018,9,16,1711008326,1382924212,true +2018,9,17,2107627607,471426548,false +2018,9,17,1517788063,584300367,true +2018,9,17,2090062254,407893994,true +2018,9,17,1584200756,2112371769,false +2018,9,17,1064038978,96094378,false +2018,9,17,2074744697,1617623613,false +2018,9,17,1404232722,1172118455,false +2018,9,17,530695859,2109100193,false +2018,9,17,573523413,557047228,true +2018,9,17,1227970217,1493370855,true +2018,9,18,915849174,2048954730,true +2018,9,18,2066570911,1744096392,false +2018,9,18,522951922,1305977957,true +2018,9,18,724649627,1601023711,true +2018,9,18,1343188653,1516630994,true +2018,9,18,1340656226,1976673981,false +2018,9,18,1158400920,2035338085,true +2018,9,18,1495931827,1272046940,true +2018,9,18,1467907459,1594299144,false +2018,9,18,1770910107,395979439,false +2018,9,19,2012907821,715718241,true +2018,9,19,1098261515,1692625727,false +2018,9,19,427617132,4896496,true +2018,9,19,976108016,1610181140,true +2018,9,19,1865402379,288257147,false +2018,9,19,1287118192,1238535776,false +2018,9,19,1432200396,1982634229,true +2018,9,19,695815471,1156038363,true +2018,9,19,1522546509,477854771,false +2018,9,19,652798619,424574238,false +2018,9,20,953865927,699199745,true +2018,9,20,945800364,850553173,true +2018,9,20,1047190235,1873477757,true +2018,9,20,664257453,2057884454,true +2018,9,20,1068148459,2125935035,false +2018,9,20,445691351,445945503,true 
+2018,9,20,1092412852,1487717154,false +2018,9,20,540753389,749292675,true +2018,9,20,567417628,1063047173,true +2018,9,20,2078301212,450452307,true +2018,9,21,1730953851,2062501980,true +2018,9,21,160974533,674024167,true +2018,9,21,697515457,659173813,true +2018,9,21,2064735232,1051225712,true +2018,9,21,1159558726,863285165,false +2018,9,21,1665625594,1032001848,true +2018,9,21,755354158,1883316115,true +2018,9,21,2130795386,263565446,true +2018,9,21,1013565859,1092810241,true +2018,9,21,1505041196,1222842719,true +2018,9,22,940043601,1095063187,true +2018,9,22,874310542,1213057830,false +2018,9,22,963044331,1207393540,true +2018,9,22,180390542,1182615629,true +2018,9,22,1971250848,289739860,false +2018,9,22,2004746032,686322825,false +2018,9,22,1399407819,1844966341,false +2018,9,22,1050864870,938525774,false +2018,9,22,722834754,1453374169,false +2018,9,22,947088216,499327871,true +2018,9,23,750733327,646806668,false +2018,9,23,893994696,628048274,false +2018,9,23,1024015741,1179248518,false +2018,9,23,1638345826,525156251,false +2018,9,23,963813298,295382009,true +2018,9,23,1862855439,720410873,false +2018,9,23,409145952,1384356190,false +2018,9,23,1575775401,933304733,false +2018,9,23,1343909924,1752235358,false +2018,9,23,634367133,1068948767,true +2018,9,24,91705265,165298812,true +2018,9,24,2066581603,1344485408,false +2018,9,24,1773173467,1356372062,true +2018,9,24,686170963,1917854121,true +2018,9,24,48820249,1173610704,true +2018,9,24,893225662,1626064005,false +2018,9,24,2056258333,389809095,true +2018,9,24,2085331027,259743951,false +2018,9,24,788745102,1050507803,false +2018,9,24,1001906236,538190880,false +2018,9,25,1103452429,25207212,true +2018,9,25,484486518,1696213801,true +2018,9,25,171518466,1664915244,true +2018,9,25,1994144332,1034226081,true +2018,9,25,585185011,1575497927,true +2018,9,25,389952520,99159486,false +2018,9,25,641930826,1120442972,true +2018,9,25,146006058,971999941,true +2018,9,25,467360976,425512718,false +2018,9,25,1333569227,89446987,true +2018,9,26,300744017,334040392,true +2018,9,26,654895353,735446589,true +2018,9,26,443815458,2017212820,true +2018,9,26,1603259179,648756689,false +2018,9,26,2120574334,51069391,true +2018,9,26,151984927,1529865694,false +2018,9,26,1966380533,826394018,false +2018,9,26,1601764798,1851347298,true +2018,9,26,969315,242410059,false +2018,9,26,374435040,1967491339,false +2018,9,27,569005355,1289080598,false +2018,9,27,699076096,288035116,false +2018,9,27,1051584679,1203129620,true +2018,9,27,482763326,782230369,false +2018,9,27,545382157,822908923,true +2018,9,27,1171262797,2060117446,false +2018,9,27,127272502,167827250,false +2018,9,27,1595458577,1112644513,true +2018,9,27,330938386,222156138,false +2018,9,27,2012392734,1415097275,true +2018,9,28,744915623,1813664353,false +2018,9,28,326724706,464893666,true +2018,9,28,1938180846,77161888,false +2018,9,28,18483336,1142154581,false +2018,9,28,1649097710,534016587,false +2018,9,28,1945248780,1352801272,false +2018,9,28,1567360991,515173217,false +2018,9,28,1139799283,1778713259,true +2018,9,28,1958449987,1566305537,false +2018,9,28,1282461078,1870297887,true +2018,9,29,667755060,2097236621,true +2018,9,29,860829207,697166830,false +2018,9,29,1672093266,323252923,true +2018,9,29,1037916491,447374829,false +2018,9,29,1416402974,1839612416,false +2018,9,29,81277244,252852720,true +2018,9,29,1278824612,591124690,false +2018,9,29,1908686491,1511236471,false +2018,9,29,1182298299,1018011311,false +2018,9,29,103283634,1908132086,false +2018,9,30,1821878605,442635258,false 
+2018,9,30,749893303,1596984514,false +2018,9,30,529095777,619627520,true +2018,9,30,1934024847,1577095616,true +2018,9,30,1675636952,1385225668,false +2018,9,30,2140091495,1782873452,false +2018,9,30,1869488936,1074693482,false +2018,9,30,798538153,1413149910,true +2018,9,30,2114081648,1520902484,true +2018,9,30,2001392943,1380754587,false +2018,9,31,1921879483,291933518,true +2018,9,31,189446099,429777940,true +2018,9,31,1980130128,510757727,false +2018,9,31,1063372707,1987953190,true +2018,9,31,476689503,1073925397,false +2018,9,31,757560924,1330812537,false +2018,9,31,1992445805,508221800,true +2018,9,31,473260807,866163715,false +2018,9,31,899407940,1096462986,false +2018,9,31,119764421,1977463340,false +2018,10,1,662461026,1246479419,true +2018,10,1,1331424298,1095841777,false +2018,10,1,683174668,1312759418,true +2018,10,1,1108957633,407247056,true +2018,10,1,1867505051,1343988138,false +2018,10,1,2130985861,263005543,true +2018,10,1,1714309274,786782346,false +2018,10,1,2119075443,1696580106,false +2018,10,1,195716294,73931144,false +2018,10,1,866187407,492790716,false +2018,10,2,1722165021,658668633,false +2018,10,2,1907751139,970018895,false +2018,10,2,606449015,217811427,true +2018,10,2,1099021923,344666264,true +2018,10,2,199315805,2068978646,false +2018,10,2,2109765314,7031795,false +2018,10,2,1151162139,764868230,true +2018,10,2,1343323542,1662339738,true +2018,10,2,1675673412,1218843363,false +2018,10,2,337539424,276050032,true +2018,10,3,1977962271,324902830,true +2018,10,3,861568850,477459531,false +2018,10,3,2138112176,1895614508,false +2018,10,3,1332295658,157595635,false +2018,10,3,2050508129,1208327179,false +2018,10,3,2138664375,818955075,false +2018,10,3,777514616,1708545857,true +2018,10,3,1098274024,1215566125,false +2018,10,3,882095319,174199291,false +2018,10,3,1712006515,1658554266,true +2018,10,4,2123189693,676880379,false +2018,10,4,1080111862,1157884760,false +2018,10,4,1040272510,880339970,true +2018,10,4,1285592224,742250386,true +2018,10,4,1935659379,1485010701,false +2018,10,4,610034883,1303938184,true +2018,10,4,830719110,57615501,true +2018,10,4,505357023,1123289987,false +2018,10,4,460072495,1275884521,true +2018,10,4,106769479,105981493,false +2018,10,5,1665100732,174851304,false +2018,10,5,418078386,1376630279,true +2018,10,5,857470003,626464726,true +2018,10,5,1374756169,1105837061,true +2018,10,5,1908734264,979694462,true +2018,10,5,2114907166,1627645030,true +2018,10,5,54461367,257346127,false +2018,10,5,1850144529,816738526,true +2018,10,5,1566535239,886091088,false +2018,10,5,997847169,1330361156,true +2018,10,6,2054745978,549283487,true +2018,10,6,183046505,158760492,true +2018,10,6,1361843355,1909065400,false +2018,10,6,1649113009,969966473,true +2018,10,6,1525813642,2032096954,false +2018,10,6,1058601178,1098094960,true +2018,10,6,1490412722,309301086,true +2018,10,6,1525911127,1710742417,true +2018,10,6,641373776,718239582,false +2018,10,6,1181292069,1416080394,false +2018,10,7,1951129732,1502524795,true +2018,10,7,2034568587,1173569686,false +2018,10,7,232448480,762085739,false +2018,10,7,895122688,283296072,false +2018,10,7,2096377457,640042062,false +2018,10,7,1936479152,2092078170,true +2018,10,7,434418559,1554838655,true +2018,10,7,480919059,874993327,true +2018,10,7,234094698,944069084,true +2018,10,7,763061905,688112409,true +2018,10,8,1581036404,1579531990,false +2018,10,8,752539698,1924889225,false +2018,10,8,392598366,1753296876,false +2018,10,8,554908348,798150062,false +2018,10,8,360994580,1557339744,true 
+2018,10,8,346268892,1403086000,false +2018,10,8,990630631,267211842,false +2018,10,8,2061835295,1188002820,true +2018,10,8,1405634312,55736100,false +2018,10,8,2005490675,787641753,false +2018,10,9,1548840246,158280731,false +2018,10,9,1737913314,108807104,false +2018,10,9,1839182818,1665080021,false +2018,10,9,2100824766,1130612644,false +2018,10,9,232743740,816766697,true +2018,10,9,1144809349,320089901,false +2018,10,9,1216169185,1782791603,false +2018,10,9,25696797,1934868130,false +2018,10,9,1749246235,1712129104,true +2018,10,9,1164709178,1535889512,false +2018,10,10,1490193664,1097438372,false +2018,10,10,1470447627,2141471144,true +2018,10,10,1345612492,489009505,false +2018,10,10,805253508,41127117,false +2018,10,10,1351823941,307452050,true +2018,10,10,1003976794,1595247380,true +2018,10,10,1265805586,600228118,true +2018,10,10,1902963216,2054147233,false +2018,10,10,1323119432,90811143,true +2018,10,10,1848860842,1649523421,true +2018,10,11,163406494,1293448206,false +2018,10,11,3174868,764795649,true +2018,10,11,240121602,758159368,false +2018,10,11,1480102826,564238111,true +2018,10,11,402353540,1196213130,false +2018,10,11,678431334,398623243,true +2018,10,11,355209944,181219631,false +2018,10,11,2137120659,395781863,false +2018,10,11,1992336606,1018889128,true +2018,10,11,992534175,2041687883,true +2018,10,12,787608296,1876393484,false +2018,10,12,707032034,1444614971,false +2018,10,12,924135103,991604959,false +2018,10,12,1876173424,1841081448,true +2018,10,12,220057747,2129935805,true +2018,10,12,1796235861,1735785070,true +2018,10,12,1359584859,1339655355,false +2018,10,12,1459940357,1945024527,true +2018,10,12,1377018700,816952780,true +2018,10,12,32902953,1298604450,false +2018,10,13,729103988,244198873,true +2018,10,13,690323992,299240224,false +2018,10,13,680392607,167643721,true +2018,10,13,207711378,996028574,false +2018,10,13,460984328,1636452133,false +2018,10,13,1820203928,2136577560,false +2018,10,13,2033113221,377805609,false +2018,10,13,279256443,1488839635,true +2018,10,13,1632913721,1137404499,false +2018,10,13,1112427642,1965687487,true +2018,10,14,292173543,1888821982,true +2018,10,14,1616173951,854188201,false +2018,10,14,114385020,1648654749,true +2018,10,14,1094556639,2136917523,false +2018,10,14,1361410928,1528795703,false +2018,10,14,1404025896,1594497039,true +2018,10,14,1053618182,1206746300,true +2018,10,14,889126759,2008436340,false +2018,10,14,290991768,505996666,false +2018,10,14,1180184935,1596089154,true +2018,10,15,75496820,662285608,false +2018,10,15,1022769109,1258461713,true +2018,10,15,871752430,1542647487,true +2018,10,15,839967060,396183261,false +2018,10,15,1367962042,775982483,true +2018,10,15,1732070116,80331155,true +2018,10,15,514968998,442651024,false +2018,10,15,222998823,1475676194,true +2018,10,15,849136974,152053924,false +2018,10,15,399742569,1867268872,true +2018,10,16,620804451,1644571583,false +2018,10,16,1415683959,754876552,true +2018,10,16,955671701,2089014001,false +2018,10,16,1447508138,1429805330,true +2018,10,16,610156358,2011309959,true +2018,10,16,481511858,1404918661,true +2018,10,16,2052239502,442240063,false +2018,10,16,1141688830,429517392,true +2018,10,16,1183216184,837989094,false +2018,10,16,1535326428,1255778973,false +2018,10,17,1181777854,1131502589,true +2018,10,17,1144562597,841997581,true +2018,10,17,575498356,1301698618,true +2018,10,17,1596346764,889912236,true +2018,10,17,6034267,896877823,false +2018,10,17,1067986922,1825823528,false +2018,10,17,1531072343,332821531,true 
+2018,10,17,1785725732,844455280,false +2018,10,17,1059288252,1678355576,false +2018,10,17,1746993015,1773359683,true +2018,10,18,614200684,1194640505,false +2018,10,18,538514000,654900826,true +2018,10,18,1700957953,822628410,true +2018,10,18,515809273,112507952,true +2018,10,18,1397290675,443190518,false +2018,10,18,17898178,1616204182,true +2018,10,18,898619647,1063331889,false +2018,10,18,1601662965,338525080,false +2018,10,18,1202059059,1286163589,true +2018,10,18,1419846700,1563223641,true +2018,10,19,1247320942,1912590109,true +2018,10,19,969665513,345823470,false +2018,10,19,663787456,2141995075,true +2018,10,19,1731651198,426728811,true +2018,10,19,317595478,602754703,true +2018,10,19,1461399719,1352018836,false +2018,10,19,260968717,730148757,false +2018,10,19,1138596194,1595091188,true +2018,10,19,455299255,463792679,false +2018,10,19,2128260846,161905484,true +2018,10,20,1674092789,1959148932,false +2018,10,20,1611819765,1335427723,true +2018,10,20,585408889,412075532,false +2018,10,20,290344010,990914017,false +2018,10,20,1107244550,280869717,true +2018,10,20,956824007,1974420349,true +2018,10,20,1210726547,1779729581,false +2018,10,20,1872325934,1374770454,true +2018,10,20,845612367,1109932141,false +2018,10,20,265121301,1305709403,true +2018,10,21,1179040127,719140790,false +2018,10,21,358130767,403722447,true +2018,10,21,825161324,2085112676,true +2018,10,21,1449856582,1396007111,false +2018,10,21,1436757981,1143423061,true +2018,10,21,1504117571,1096481150,false +2018,10,21,185668047,200986727,false +2018,10,21,1713334948,1717352575,true +2018,10,21,1540549478,593361583,false +2018,10,21,1958824027,2086991569,false +2018,10,22,2137414992,727816404,false +2018,10,22,1170046044,1801686698,false +2018,10,22,2108708884,1261064349,true +2018,10,22,1369728446,1981441690,true +2018,10,22,1204146936,2120075280,false +2018,10,22,33379256,136969827,false +2018,10,22,1881546384,2009501612,false +2018,10,22,692854215,548607999,true +2018,10,22,1385651723,1099848108,true +2018,10,22,1604202907,638616452,true +2018,10,23,279320796,537337428,true +2018,10,23,1826205856,2046074595,true +2018,10,23,1634732628,203328696,false +2018,10,23,1539450942,548688565,false +2018,10,23,1616504642,6679672,true +2018,10,23,506058528,222284692,true +2018,10,23,935645622,1000550503,false +2018,10,23,2033229609,572912565,true +2018,10,23,1158623612,1052228382,true +2018,10,23,545031906,385897142,true +2018,10,24,1757787119,1238699492,false +2018,10,24,342775413,123374055,true +2018,10,24,1292099621,1531418480,true +2018,10,24,2114468774,1629204848,true +2018,10,24,373490862,1935797108,true +2018,10,24,1410778433,594310944,false +2018,10,24,1195694688,841360627,false +2018,10,24,283492844,388930005,false +2018,10,24,2090539993,1184163021,true +2018,10,24,299626354,1608533168,false +2018,10,25,93066183,80188166,false +2018,10,25,1592835791,637957013,false +2018,10,25,186045464,762975413,true +2018,10,25,843517614,30342831,true +2018,10,25,246478486,2077514142,true +2018,10,25,1984196895,965861,false +2018,10,25,51348390,312527839,true +2018,10,25,285584027,1775201673,false +2018,10,25,1980790297,43391296,false +2018,10,25,1545824195,1755338667,true +2018,10,26,1425627993,1459278083,false +2018,10,26,1283193941,83614157,false +2018,10,26,1292916051,1542142528,true +2018,10,26,261892429,1317211887,false +2018,10,26,1194940399,1808688700,true +2018,10,26,280934702,314721842,false +2018,10,26,474705393,191440696,true +2018,10,26,492570847,717217919,false +2018,10,26,566636955,981733286,false 
+2018,10,26,1597743274,59329515,true +2018,10,27,843776878,797851340,true +2018,10,27,1746707518,1079826145,true +2018,10,27,1362621904,1183340344,false +2018,10,27,1752787386,1507591469,false +2018,10,27,439578247,1524226204,false +2018,10,27,841991988,1006731134,false +2018,10,27,240793421,281348576,true +2018,10,27,800385848,1627632126,true +2018,10,27,1035513058,1553070474,true +2018,10,27,1981853739,1544898828,false +2018,10,28,1523646569,563915229,true +2018,10,28,133864355,586548032,false +2018,10,28,659136762,1579914030,false +2018,10,28,124842503,317150780,true +2018,10,28,36238241,1662390881,false +2018,10,28,1324401692,579549181,true +2018,10,28,1810304590,989031063,true +2018,10,28,1449291588,555323557,false +2018,10,28,654263904,1111853159,true +2018,10,28,1041760928,4684021,true +2018,10,29,1674419658,469434947,true +2018,10,29,331819601,1730086897,true +2018,10,29,1074692622,1111209170,false +2018,10,29,2132080520,1172774502,false +2018,10,29,1043887734,497951585,true +2018,10,29,422766381,492824903,true +2018,10,29,1058617180,488260551,false +2018,10,29,2111781439,965594139,false +2018,10,29,1800156766,638771782,true +2018,10,29,669637183,197368568,false +2018,10,30,237244366,1453401420,true +2018,10,30,1941369154,1395953592,true +2018,10,30,1609586625,97466925,false +2018,10,30,1120067706,255576173,false +2018,10,30,745859465,1339468578,true +2018,10,30,1791207773,1878748484,false +2018,10,30,1983860350,165977140,false +2018,10,30,371915954,1588685393,false +2018,10,30,916253023,1251475424,true +2018,10,30,863782948,1167525368,true +2018,10,31,176417211,736300435,false +2018,10,31,1120605753,1017583354,false +2018,10,31,833597208,656847697,false +2018,10,31,2108765616,779306311,false +2018,10,31,1689568727,291680543,false +2018,10,31,608187582,272770846,false +2018,10,31,1542592516,40589327,true +2018,10,31,646388124,1670397998,true +2018,10,31,1095031443,1696323583,false +2018,10,31,1126282598,1131838132,true +2018,11,1,1258240958,988048992,true +2018,11,1,1445054271,2111222462,true +2018,11,1,775057788,993719725,false +2018,11,1,1903703340,218556967,true +2018,11,1,719490883,22288465,false +2018,11,1,1517040940,1944438759,false +2018,11,1,180097180,2060391216,false +2018,11,1,1074474102,1479454658,false +2018,11,1,1496472115,846615313,false +2018,11,1,1133747175,896276359,false +2018,11,2,1833772385,514760088,true +2018,11,2,717912057,1901483352,false +2018,11,2,1561193062,1033679249,false +2018,11,2,967907305,1748633267,true +2018,11,2,1951038413,1411886010,true +2018,11,2,1750354639,863040052,true +2018,11,2,1330065843,1111217241,true +2018,11,2,1157524182,933492293,false +2018,11,2,262720992,1954944814,true +2018,11,2,225396183,616875329,false +2018,11,3,1060092412,25303724,true +2018,11,3,1645197309,448719783,false +2018,11,3,1793455939,1395328134,false +2018,11,3,1183035468,1345615737,false +2018,11,3,113056638,1781942496,true +2018,11,3,623204369,465739742,false +2018,11,3,492280118,1004451536,true +2018,11,3,1475218857,165450387,true +2018,11,3,104948056,1235230712,false +2018,11,3,1778262055,413671642,true +2018,11,4,1242053308,2137967183,false +2018,11,4,800099516,390264808,true +2018,11,4,50858073,1834754691,false +2018,11,4,2078521328,1832153341,true +2018,11,4,214839557,1724920437,true +2018,11,4,1624247305,346842023,true +2018,11,4,789087247,1559691961,true +2018,11,4,1127734072,658690282,true +2018,11,4,1881529372,169711397,false +2018,11,4,1741467490,648955962,false +2018,11,5,1758164493,1615099347,false +2018,11,5,902996639,697474959,false 
+2018,11,5,853738450,673705255,true +2018,11,5,426066628,419532280,false +2018,11,5,1809318780,500721835,true +2018,11,5,2101817238,1037630582,true +2018,11,5,776167659,1613260267,true +2018,11,5,1450174068,238355212,false +2018,11,5,1103295717,219730325,true +2018,11,5,1424316317,2124322888,true +2018,11,6,277782151,1622676099,false +2018,11,6,1246347355,929335206,false +2018,11,6,1149865021,1313446781,false +2018,11,6,2018257595,314990498,false +2018,11,6,108945362,1623381467,true +2018,11,6,175287966,466371168,true +2018,11,6,1384004914,1106994754,true +2018,11,6,1708814654,2042227741,false +2018,11,6,174265840,1207468603,false +2018,11,6,1452881794,2116905203,true +2018,11,7,83211190,900274258,true +2018,11,7,494894261,375711641,true +2018,11,7,1671475747,1736982015,false +2018,11,7,635843199,767846683,true +2018,11,7,1263981769,1850703239,false +2018,11,7,664791083,1654705609,false +2018,11,7,2012250175,1089964949,true +2018,11,7,1241441954,440827336,false +2018,11,7,1718658443,1084583745,false +2018,11,7,1882137509,1253488644,false +2018,11,8,1108587856,719818935,false +2018,11,8,523737640,1916730502,true +2018,11,8,2051359873,982663620,false +2018,11,8,453502687,1533878091,true +2018,11,8,1938587274,1507358955,false +2018,11,8,1695534937,1366484783,true +2018,11,8,152133928,1473475284,false +2018,11,8,1060300610,1391706262,false +2018,11,8,684301425,2025066296,false +2018,11,8,260876962,99323343,false +2018,11,9,1467735480,95540623,false +2018,11,9,577531177,405278564,false +2018,11,9,1697830473,962456362,true +2018,11,9,1117601270,1982811352,true +2018,11,9,1372161850,1148303243,true +2018,11,9,1813423190,1516298882,true +2018,11,9,1324536257,469290249,false +2018,11,9,161985895,104969095,true +2018,11,9,1391267789,1779481392,false +2018,11,9,35636143,275793691,true +2018,11,10,1836391131,993920342,false +2018,11,10,267231488,327327382,true +2018,11,10,445888309,237462359,false +2018,11,10,578316349,236365313,false +2018,11,10,1637181471,812450694,true +2018,11,10,803264485,1705906980,false +2018,11,10,757288357,463703076,true +2018,11,10,1942470503,1979361083,false +2018,11,10,397371137,826173443,true +2018,11,10,688500910,894756321,true +2018,11,11,1379826894,1204571425,true +2018,11,11,2002279531,1503494096,true +2018,11,11,173841551,1495741792,true +2018,11,11,1929546497,577272446,true +2018,11,11,1645249321,399966365,false +2018,11,11,402773278,2062783954,true +2018,11,11,351807071,743832795,false +2018,11,11,965469806,1881286484,true +2018,11,11,832163201,1149791299,false +2018,11,11,1029210597,1652926125,true +2018,11,12,523659192,963453465,true +2018,11,12,253198992,1793657172,false +2018,11,12,1086003907,1059866366,false +2018,11,12,1659763956,193387442,true +2018,11,12,864087808,1448251933,false +2018,11,12,1235240033,555701295,false +2018,11,12,202184202,702060556,false +2018,11,12,1747598533,1667266783,true +2018,11,12,1083514726,801016201,true +2018,11,12,2121784541,1026714310,false +2018,11,13,921604839,129608545,false +2018,11,13,696149467,236196337,false +2018,11,13,1348365009,1103119898,false +2018,11,13,1579314395,1311198164,false +2018,11,13,2044450798,1179031359,true +2018,11,13,514305977,46231981,true +2018,11,13,577447944,746110814,true +2018,11,13,916693544,677846747,false +2018,11,13,475374470,117280531,false +2018,11,13,917109021,97886059,false +2018,11,14,602755362,722518712,true +2018,11,14,168298982,319155532,true +2018,11,14,715052785,2145550226,false +2018,11,14,408348546,388804323,false +2018,11,14,1245099569,246484630,false 
+2018,11,14,364418083,174052346,false +2018,11,14,1688466330,414298132,false +2018,11,14,1222618582,335562300,false +2018,11,14,55210027,1933592816,true +2018,11,14,2122679485,670513345,false +2018,11,15,381146783,281765169,false +2018,11,15,408213111,1964574291,false +2018,11,15,850632729,992210181,true +2018,11,15,1922125174,1832439923,false +2018,11,15,347430560,1737277594,true +2018,11,15,156806260,1268859209,true +2018,11,15,60016259,561417298,false +2018,11,15,1318784339,1131446531,true +2018,11,15,113633222,592796218,true +2018,11,15,1155932162,1751886672,false +2018,11,16,67714843,576126676,true +2018,11,16,1155248442,185363445,true +2018,11,16,583584278,337661581,false +2018,11,16,2015838780,482406839,true +2018,11,16,1441311912,1986092670,false +2018,11,16,972550604,1737075974,false +2018,11,16,6309368,849002048,true +2018,11,16,1076605983,1460888627,true +2018,11,16,1703567670,608954877,false +2018,11,16,1437903487,2018547991,true +2018,11,17,238848771,625015563,true +2018,11,17,1763747014,1133517936,true +2018,11,17,895422076,289876651,true +2018,11,17,1784825344,295471456,false +2018,11,17,396215564,1602674123,false +2018,11,17,1136212706,1051283416,true +2018,11,17,1549617806,828504530,false +2018,11,17,1553415113,126713554,false +2018,11,17,2003983214,1420231139,true +2018,11,17,117409149,681949091,false +2018,11,18,1516483780,1410285859,true +2018,11,18,504421118,1089481165,true +2018,11,18,1831784091,765977045,false +2018,11,18,369840112,1262304031,true +2018,11,18,815908586,826577525,false +2018,11,18,1470946956,593699332,true +2018,11,18,1421926372,699732252,false +2018,11,18,1667069218,2006140994,false +2018,11,18,423337642,415101634,true +2018,11,18,1084589700,1602640557,false +2018,11,19,1091563397,1448480588,true +2018,11,19,2115735043,699631214,true +2018,11,19,1894468406,436499551,false +2018,11,19,1372402357,504635116,true +2018,11,19,228113653,731082861,true +2018,11,19,1667307336,1633013636,true +2018,11,19,463235898,1747090556,true +2018,11,19,1941289610,1180770946,false +2018,11,19,2048587116,335435412,true +2018,11,19,1586522728,1952095019,true +2018,11,20,1646601985,985255900,false +2018,11,20,599457315,1342927538,true +2018,11,20,1229143221,1124956356,false +2018,11,20,1514087895,141117538,true +2018,11,20,126997222,917816022,true +2018,11,20,2131717302,1557549941,true +2018,11,20,1392286774,1922970372,true +2018,11,20,1598480514,748548421,false +2018,11,20,2016855550,239937676,false +2018,11,20,292322344,1360221407,false +2018,11,21,450921533,736058305,false +2018,11,21,1542210700,1954854172,false +2018,11,21,1613115652,971038058,true +2018,11,21,271076839,818359504,false +2018,11,21,89443213,1438571111,true +2018,11,21,20180295,553428138,true +2018,11,21,1509818757,1950681313,false +2018,11,21,1869413643,573776095,false +2018,11,21,1753962793,341982010,true +2018,11,21,1134404725,608484532,false +2018,11,22,2122587468,759067237,false +2018,11,22,392516013,1540492419,true +2018,11,22,1275552864,2138946683,true +2018,11,22,635165377,1655282384,true +2018,11,22,1903526417,1857455525,true +2018,11,22,1067832711,1080708996,true +2018,11,22,2009626394,1103401144,false +2018,11,22,318142328,1665849456,true +2018,11,22,747050333,1550967306,true +2018,11,22,1898481101,399782049,false +2018,11,23,1845730914,2003849843,true +2018,11,23,2027772812,175847471,false +2018,11,23,522006721,2109116787,false +2018,11,23,1880923723,1388216451,true +2018,11,23,1839691060,1044178166,true +2018,11,23,805319627,1939280218,false +2018,11,23,737228344,2075476527,false 
+2018,11,23,1607678400,189892835,false +2018,11,23,856159624,997277302,true +2018,11,23,1196324444,17868226,true +2018,11,24,1442494681,1642733627,true +2018,11,24,427082618,2018677751,false +2018,11,24,1295727518,1038407527,true +2018,11,24,558606203,1495420354,true +2018,11,24,1337722305,1179068861,false +2018,11,24,724203058,345544738,true +2018,11,24,740861837,665107231,false +2018,11,24,636043638,1246930572,false +2018,11,24,734533952,2031295169,true +2018,11,24,1178756138,628937692,true +2018,11,25,324012295,103445935,false +2018,11,25,953736167,788475576,false +2018,11,25,1060785219,2134723635,false +2018,11,25,13354117,817391070,true +2018,11,25,438223215,2118277885,true +2018,11,25,1413901875,2025192719,true +2018,11,25,1193690626,146285085,false +2018,11,25,1870430695,17968531,false +2018,11,25,2055538718,990241574,false +2018,11,25,665045316,1569160596,true +2018,11,26,83951083,1316545826,true +2018,11,26,1889551950,161382332,true +2018,11,26,1940241649,1794250771,false +2018,11,26,279159939,1759862996,true +2018,11,26,199875022,608510778,false +2018,11,26,1973190703,1297906530,false +2018,11,26,925857047,73486426,false +2018,11,26,888396312,1364994416,false +2018,11,26,768422986,146469807,true +2018,11,26,394009812,1022561026,false +2018,11,27,510504473,1221243736,false +2018,11,27,217296486,985203688,true +2018,11,27,1943525239,1833089625,false +2018,11,27,1899557878,443670003,false +2018,11,27,380457770,90260108,false +2018,11,27,1305231041,1933431979,false +2018,11,27,1977432021,1058424342,false +2018,11,27,962076822,193521179,true +2018,11,27,1432879282,1430217782,false +2018,11,27,1798730611,681683036,false +2018,11,28,677935989,897738381,true +2018,11,28,1093255969,1239623123,true +2018,11,28,1645064725,511388464,true +2018,11,28,1377505518,419689062,true +2018,11,28,1433573734,2000508195,false +2018,11,28,869525701,1467472407,false +2018,11,28,2004141513,1435162636,false +2018,11,28,341230675,1659717058,false +2018,11,28,811998563,207457598,true +2018,11,28,1065587725,1308052722,false +2018,11,29,808747928,1787318128,false +2018,11,29,1854228259,2016965640,true +2018,11,29,661871547,728362938,true +2018,11,29,1357783748,624352297,true +2018,11,29,297177394,124092846,true +2018,11,29,1644951108,1888173749,false +2018,11,29,1518215552,2036682152,false +2018,11,29,33066248,149553191,false +2018,11,29,2011169014,1447231399,false +2018,11,29,1928165432,1260259003,true +2018,11,30,660614388,635221582,false +2018,11,30,421607852,978083626,false +2018,11,30,1762846466,18168034,false +2018,11,30,1904037529,1261839951,true +2018,11,30,2082077232,404825180,false +2018,11,30,2104255444,1122187435,true +2018,11,30,775266425,417079016,true +2018,11,30,1863544536,1495720175,false +2018,11,30,504737815,2033346155,false +2018,11,30,1397902796,1919595795,false +2018,11,31,245709076,1747200483,false +2018,11,31,161824311,2032735666,false +2018,11,31,571881851,1908039356,false +2018,11,31,1338621225,1469965724,true +2018,11,31,314350316,865532535,false +2018,11,31,1877642500,1814175701,true +2018,11,31,1192154765,1007482415,false +2018,11,31,1749056547,2047770757,true +2018,11,31,724202631,558699744,true +2018,11,31,1303538967,1254527362,true +2019,1,1,1865265803,1387421313,false +2019,1,1,1449581618,2129896826,true +2019,1,1,1978401057,1972990285,false +2019,1,1,25493391,1806021859,false +2019,1,1,1109281189,1271824330,false +2019,1,1,1583511015,1274361410,true +2019,1,1,631971644,1332403579,true +2019,1,1,2028246473,517406567,true +2019,1,1,439266095,1946266935,true 
+2019,1,1,310878733,1841532843,true +2019,1,2,1269981546,1348365206,true +2019,1,2,447857057,1618666324,true +2019,1,2,1205134350,795081988,false +2019,1,2,442525696,352431795,false +2019,1,2,434767194,578382413,true +2019,1,2,514011373,608480142,false +2019,1,2,1280318813,540925885,false +2019,1,2,818601293,2129268216,false +2019,1,2,933035298,276341300,true +2019,1,2,1353412662,34819495,true +2019,1,3,1359690591,1079890614,true +2019,1,3,2140680843,6604642,true +2019,1,3,995981004,452318302,true +2019,1,3,1590423737,2088459815,false +2019,1,3,104663210,111430087,false +2019,1,3,1402005881,507495951,false +2019,1,3,1954844196,1301526075,true +2019,1,3,578153887,676107037,false +2019,1,3,1950628240,97791053,false +2019,1,3,1935274073,1679823999,false +2019,1,4,1102648512,19654320,true +2019,1,4,729222264,1708228839,false +2019,1,4,816567249,791118174,false +2019,1,4,61447820,1372951297,false +2019,1,4,98933840,467769461,false +2019,1,4,1422447870,201882942,false +2019,1,4,1709098624,701835084,false +2019,1,4,617270895,1412997726,true +2019,1,4,1414410024,1665068467,true +2019,1,4,581917704,832623749,true +2019,1,5,328711732,2078126685,true +2019,1,5,1708626038,2011401924,true +2019,1,5,88761099,945963230,false +2019,1,5,950320588,694140243,true +2019,1,5,1246527856,660262601,true +2019,1,5,1385968518,1227867893,true +2019,1,5,432583226,2012993232,true +2019,1,5,467419993,1039447614,true +2019,1,5,494035284,1433256436,false +2019,1,5,116825752,572599216,true +2019,1,6,443303613,280213031,true +2019,1,6,1341214548,1450523114,true +2019,1,6,1673759365,1007085293,true +2019,1,6,386217867,1150813024,false +2019,1,6,437167090,1203937248,false +2019,1,6,1574661619,1995699539,true +2019,1,6,702759513,156844583,true +2019,1,6,1003737929,1635540484,false +2019,1,6,1089657428,643327285,true +2019,1,6,348267575,976219223,false +2019,1,7,388062280,1310886006,true +2019,1,7,162294559,886830431,true +2019,1,7,4731276,2014647988,false +2019,1,7,547438,141310088,false +2019,1,7,1441493903,37827917,false +2019,1,7,889492310,1466291553,true +2019,1,7,1794536638,1688324256,true +2019,1,7,824212969,1046564400,false +2019,1,7,1307882790,1419742262,true +2019,1,7,1181740821,1200445652,false +2019,1,8,55987469,161015949,false +2019,1,8,1034925562,1680143269,false +2019,1,8,1703101925,1535353235,true +2019,1,8,38477179,2083014219,false +2019,1,8,1403803248,263020599,true +2019,1,8,913819697,1629378255,true +2019,1,8,1575260402,1679410836,false +2019,1,8,523885560,1607604552,false +2019,1,8,1262344890,1134231971,true +2019,1,8,11306967,366996726,false +2019,1,9,1292372590,1657795705,true +2019,1,9,1414945261,1006683053,false +2019,1,9,1910660836,2049046649,false +2019,1,9,1560472746,1766918354,false +2019,1,9,1474966113,1360401161,true +2019,1,9,1850669445,80048997,false +2019,1,9,307064477,1801600160,false +2019,1,9,489499867,170742477,false +2019,1,9,386917346,1670395473,true +2019,1,9,1458239954,1458448468,false +2019,1,10,1848660500,781926939,false +2019,1,10,81294918,2059964315,false +2019,1,10,1947483275,1560096674,false +2019,1,10,633718315,1373841020,false +2019,1,10,17013253,1088381476,true +2019,1,10,635314929,1903780093,false +2019,1,10,494512156,423955910,false +2019,1,10,12838597,1366134896,false +2019,1,10,1292152322,821148039,true +2019,1,10,460555464,1447947474,false +2019,1,11,952687726,546334850,false +2019,1,11,1477693072,1104328190,true +2019,1,11,1543227652,227047489,true +2019,1,11,1694016662,1157207647,false +2019,1,11,1657721244,1597150033,false +2019,1,11,834142805,1050272536,true 
+2019,1,11,552800306,1791811161,true +2019,1,11,701993031,981677536,true +2019,1,11,2126316141,1540756318,true +2019,1,11,215962894,304659204,false +2019,1,12,191344625,1567449598,false +2019,1,12,1676874845,660568655,false +2019,1,12,1196261636,620491971,true +2019,1,12,1794323845,1046768582,true +2019,1,12,367668012,1043184354,false +2019,1,12,682425520,1519608180,true +2019,1,12,219589579,12311297,false +2019,1,12,1149454640,2096383434,false +2019,1,12,1272662245,1037291222,false +2019,1,12,13711466,1735848494,true +2019,1,13,1529908892,1493952704,false +2019,1,13,1481691211,886866901,false +2019,1,13,2050444285,400538535,false +2019,1,13,1223982402,939042049,false +2019,1,13,997211186,854383304,false +2019,1,13,334729015,456019757,false +2019,1,13,2020696385,401111609,true +2019,1,13,1565672953,1326851808,false +2019,1,13,772516816,1525150272,true +2019,1,13,1160067347,29117448,false +2019,1,14,377319349,855496417,false +2019,1,14,690964498,1558143547,true +2019,1,14,1370679088,345606694,false +2019,1,14,1106220953,1902197299,true +2019,1,14,1452193572,1901312775,true +2019,1,14,2038086870,621083212,false +2019,1,14,827430063,550269966,false +2019,1,14,1912925243,1504736772,true +2019,1,14,705030376,234488829,true +2019,1,14,1880758663,2095817887,true +2019,1,15,1073621431,221214751,false +2019,1,15,1169245822,1847485826,true +2019,1,15,466200311,129983258,false +2019,1,15,137892580,1556221153,false +2019,1,15,307908972,221221963,false +2019,1,15,341232535,1427185106,true +2019,1,15,1024496944,1490354971,false +2019,1,15,2136541150,1650929386,true +2019,1,15,1125274523,1144968911,true +2019,1,15,1988504388,1559781,true +2019,1,16,247228644,1604054715,true +2019,1,16,35401624,1531663117,true +2019,1,16,1451198431,2047406059,true +2019,1,16,331333100,1375843168,false +2019,1,16,789160615,1668576401,true +2019,1,16,1639065253,1176377503,true +2019,1,16,468984086,1342953186,false +2019,1,16,587270138,1609808588,true +2019,1,16,600301144,2041775146,true +2019,1,16,661567357,2085632710,false +2019,1,17,35970307,1090264628,true +2019,1,17,1555406455,1781396480,false +2019,1,17,1925899631,1246052455,true +2019,1,17,1102379585,1081080708,true +2019,1,17,733963333,344576795,true +2019,1,17,632400103,942705099,false +2019,1,17,1326247814,1985299224,true +2019,1,17,135407536,527655172,true +2019,1,17,700606072,12846148,true +2019,1,17,1523022949,1989800933,false +2019,1,18,1330560299,1318604250,false +2019,1,18,384613226,1784394020,true +2019,1,18,115261097,79580111,true +2019,1,18,1687227394,2075031064,false +2019,1,18,1965193446,2016502830,false +2019,1,18,1157284196,2038552387,true +2019,1,18,2051484672,293402761,false +2019,1,18,1617504859,1849374635,false +2019,1,18,930777733,1869056731,false +2019,1,18,1358654123,93036620,false +2019,1,19,1830709929,1707717225,false +2019,1,19,2005915769,2041937680,true +2019,1,19,1817036976,472033414,true +2019,1,19,339035485,835057923,true +2019,1,19,208255235,44807753,true +2019,1,19,1833967086,1448140238,false +2019,1,19,1608955049,1987134351,true +2019,1,19,423343077,447257232,true +2019,1,19,105911565,1477706982,true +2019,1,19,2046109229,344807600,true +2019,1,20,1615047779,1451530543,true +2019,1,20,498827126,2010894666,true +2019,1,20,884784653,522133786,false +2019,1,20,1739562793,2108741679,false +2019,1,20,634011335,1154992591,false +2019,1,20,1288151499,1836665285,false +2019,1,20,1299564163,1213938008,true +2019,1,20,1232835890,1845325086,true +2019,1,20,582487618,2044335787,false +2019,1,20,211067421,984420382,true 
+2019,1,21,2023604813,1275779177,true +2019,1,21,2011318284,1451242901,true +2019,1,21,510651795,968060072,false +2019,1,21,302129510,1855204742,true +2019,1,21,917101852,2141288806,true +2019,1,21,760919377,1665855008,false +2019,1,21,54665939,366327923,false +2019,1,21,754938324,2105184270,false +2019,1,21,1582556439,704158453,false +2019,1,21,1229902857,1946975016,false +2019,1,22,1121647814,340049883,true +2019,1,22,1711363472,1923754162,true +2019,1,22,1446647245,1446587848,false +2019,1,22,246070664,1351284385,false +2019,1,22,828057185,1091665007,true +2019,1,22,631521457,1947823016,false +2019,1,22,1962192737,661551699,false +2019,1,22,2123775880,2092699125,true +2019,1,22,528573596,850735963,false +2019,1,22,796375675,1873780458,true +2019,1,23,564897265,325900749,false +2019,1,23,315659143,1230945329,true +2019,1,23,1803397754,990103168,false +2019,1,23,1192016348,331981570,true +2019,1,23,786756935,912776810,true +2019,1,23,710489954,1112255599,false +2019,1,23,1010937685,93584083,false +2019,1,23,1308236073,1349132821,false +2019,1,23,1302226115,1082181335,false +2019,1,23,132129586,1131401348,true +2019,1,24,1454076783,654290937,true +2019,1,24,299859221,575426867,true +2019,1,24,771976,251615478,false +2019,1,24,421714023,751761386,true +2019,1,24,351952567,2138916128,true +2019,1,24,1036012275,388542430,true +2019,1,24,1304757478,722239451,false +2019,1,24,706938404,356676336,false +2019,1,24,1239410137,1817093919,true +2019,1,24,1906149934,18837977,false +2019,1,25,520975624,1525106951,false +2019,1,25,742833882,925503009,true +2019,1,25,2003362185,798606289,false +2019,1,25,1094634757,145522483,true +2019,1,25,375938527,2146049252,true +2019,1,25,1496415016,708047346,true +2019,1,25,557122576,179396271,false +2019,1,25,1054833803,2131048083,false +2019,1,25,1327799375,1318014593,false +2019,1,25,1891657202,1710895791,true +2019,1,26,1865816343,350519981,true +2019,1,26,322380766,1887985093,false +2019,1,26,1898694040,1295605391,false +2019,1,26,1514909120,1656429735,true +2019,1,26,1287715764,1842744662,true +2019,1,26,1766889127,1581051014,true +2019,1,26,913263879,370445724,false +2019,1,26,6412121,1680050327,false +2019,1,26,357985971,258281259,false +2019,1,26,1830620883,1081908621,true +2019,1,27,17332701,1463064974,false +2019,1,27,1155616160,1178649685,true +2019,1,27,1198670630,2106719545,false +2019,1,27,786800016,2048094682,true +2019,1,27,2059811200,358269539,true +2019,1,27,383756975,1249500094,true +2019,1,27,472086832,600567801,true +2019,1,27,787385227,1419088451,true +2019,1,27,1941314017,13988392,false +2019,1,27,762672690,1915112930,false +2019,1,28,1919960846,1408002924,false +2019,1,28,51217512,179387196,false +2019,1,28,762574893,1273766967,true +2019,1,28,394810629,171604590,true +2019,1,28,1305817702,220163191,false +2019,1,28,786213250,101883623,false +2019,1,28,1492529080,377972661,false +2019,1,28,307029521,265311065,true +2019,1,28,2043062228,625522557,false +2019,1,28,1037690852,1917577564,false +2019,1,29,1952601382,914233675,false +2019,1,29,196397156,2085428947,true +2019,1,29,1732295859,1466344507,true +2019,1,29,677588360,1868288826,true +2019,1,29,712559068,241578020,false +2019,1,29,1955890832,298538347,false +2019,1,29,1734412073,37610217,true +2019,1,29,1482512847,1126264431,false +2019,1,29,1569960870,1228454439,false +2019,1,29,1698718534,964814975,true +2019,1,30,1157806198,1971385998,true +2019,1,30,1806501185,1401690436,false +2019,1,30,961553578,1321591670,false +2019,1,30,1743668019,106986729,true 
+2019,1,30,1637705086,770269347,false +2019,1,30,1623030918,1239597996,true +2019,1,30,1610514317,387917096,false +2019,1,30,194392090,440559080,false +2019,1,30,1031481749,1939323371,true +2019,1,30,1331923620,72741582,true +2019,1,31,2015978468,1090713411,false +2019,1,31,965882163,1949330509,true +2019,1,31,1123749770,1321940114,false +2019,1,31,1840705147,1551118033,true +2019,1,31,991997661,1815519740,true +2019,1,31,1830909123,241937895,false +2019,1,31,1044442358,249054085,true +2019,1,31,4626303,1632159463,false +2019,1,31,1941986350,1675102770,true +2019,1,31,527566588,1940529456,false +2019,2,1,1085021127,1132028689,true +2019,2,1,781256116,2032736157,true +2019,2,1,752964415,155882792,false +2019,2,1,2049937943,1749518649,false +2019,2,1,1475526152,442777279,true +2019,2,1,455829743,599572321,true +2019,2,1,1027894419,901218179,false +2019,2,1,655156144,242476499,false +2019,2,1,906983411,1605939311,true +2019,2,1,1103059787,1726049621,false +2019,2,2,1402354557,1365984132,false +2019,2,2,294728546,937077759,true +2019,2,2,1334561849,816207028,false +2019,2,2,121656042,451622483,false +2019,2,2,346318848,753330823,false +2019,2,2,1524721315,1545360024,false +2019,2,2,1666811536,1237889135,true +2019,2,2,1101997806,44997697,true +2019,2,2,138650887,310732220,false +2019,2,2,42520034,1845362496,true +2019,2,3,279044223,1064903599,false +2019,2,3,1202930616,1140437911,true +2019,2,3,1123520198,140134949,true +2019,2,3,1286346463,1504272264,true +2019,2,3,2112965734,452011463,true +2019,2,3,1954441823,1004933934,false +2019,2,3,191626608,598772687,true +2019,2,3,1523871819,821623396,true +2019,2,3,2016613073,80643761,false +2019,2,3,1964578675,1241348683,false +2019,2,4,1333368859,294458453,true +2019,2,4,1642076832,400262606,false +2019,2,4,1497902590,1235136931,false +2019,2,4,1083895027,1808226539,false +2019,2,4,84259824,1561481362,true +2019,2,4,1122034242,378735800,true +2019,2,4,1676218285,281357552,true +2019,2,4,1077132172,1916642897,false +2019,2,4,198167447,1826034086,true +2019,2,4,1728593623,305852728,true +2019,2,5,791821607,1050465058,false +2019,2,5,1971783445,974893519,true +2019,2,5,376249058,1853119788,true +2019,2,5,762581988,1506940655,false +2019,2,5,1039658386,1499103573,false +2019,2,5,1224024648,1228528081,false +2019,2,5,1586309817,1392183481,true +2019,2,5,234770089,1329906107,true +2019,2,5,654644852,1245394646,false +2019,2,5,490880450,477451568,true +2019,2,6,325605702,1391148184,true +2019,2,6,459070782,484177054,true +2019,2,6,1644073605,1529274876,false +2019,2,6,903677501,222001845,true +2019,2,6,892054113,65878469,true +2019,2,6,304528111,1347312877,false +2019,2,6,721637050,2037143109,true +2019,2,6,1746120204,838087403,true +2019,2,6,1704160318,288096040,true +2019,2,6,653659605,1504777661,true +2019,2,7,864987534,701638581,true +2019,2,7,1359969751,1400340522,true +2019,2,7,552631381,1880074772,false +2019,2,7,1293849013,1442107274,true +2019,2,7,1675260051,237725548,true +2019,2,7,1504987289,1236682977,true +2019,2,7,600455314,560901631,false +2019,2,7,1988807710,1320560682,false +2019,2,7,323102778,841907591,false +2019,2,7,29515479,1023856675,false +2019,2,8,1416718527,2053259139,false +2019,2,8,227132862,338575090,true +2019,2,8,1379502560,431463481,true +2019,2,8,739695907,1180272757,true +2019,2,8,90614831,855346836,true +2019,2,8,401229060,1984090407,false +2019,2,8,193225753,1193773098,false +2019,2,8,1105162705,1094397143,true +2019,2,8,28391339,1399106564,false +2019,2,8,15713788,525560488,false +2019,2,9,1992706369,1785901583,false 
+2019,2,9,1632368120,1295338513,false +2019,2,9,258535330,616642101,false +2019,2,9,582527847,1923839705,false +2019,2,9,309639232,1643371914,false +2019,2,9,821654955,454472256,true +2019,2,9,2064198284,1101539613,true +2019,2,9,2027185434,565657043,true +2019,2,9,1224125341,712036676,true +2019,2,9,994018169,1144085665,true +2019,2,10,668968863,1435594111,true +2019,2,10,1380681625,1249647050,false +2019,2,10,445168326,1161219145,true +2019,2,10,519172899,1206641565,false +2019,2,10,1934984630,978256744,true +2019,2,10,1115361524,1201828030,true +2019,2,10,1704278430,789036195,true +2019,2,10,1162762141,224429845,true +2019,2,10,572933758,2044912616,false +2019,2,10,320820961,1524252262,false +2019,2,11,1607729736,1912632123,false +2019,2,11,1508761672,1030506735,false +2019,2,11,1276848611,1895298579,true +2019,2,11,805460507,1170815578,false +2019,2,11,1576391988,350965997,false +2019,2,11,1223782766,868532705,false +2019,2,11,1422880186,1714795319,true +2019,2,11,1127512827,138667648,false +2019,2,11,1813740391,349841641,false +2019,2,11,880219134,229350525,true +2019,2,12,178011568,64181855,true +2019,2,12,1355139924,1821446584,false +2019,2,12,1763776271,1470452818,true +2019,2,12,1811749001,1491264125,false +2019,2,12,1615945587,913879816,true +2019,2,12,534870420,1881250507,true +2019,2,12,807030343,1078801169,true +2019,2,12,434669948,400394321,false +2019,2,12,1326573343,501223203,false +2019,2,12,775164184,1622966162,true +2019,2,13,1752972327,511088122,false +2019,2,13,1412548767,655090407,true +2019,2,13,2076868587,1812831308,false +2019,2,13,250121168,803308643,true +2019,2,13,1285444624,1905380899,true +2019,2,13,545485775,617870267,false +2019,2,13,58791189,1452174409,true +2019,2,13,830959892,1347616705,true +2019,2,13,968074039,2079284609,false +2019,2,13,18672888,1484896031,false +2019,2,14,1668540171,88734917,false +2019,2,14,467566022,1083283012,false +2019,2,14,979508335,2069136580,true +2019,2,14,1506484567,1719918796,true +2019,2,14,1776310491,994489933,true +2019,2,14,504271272,1356526278,true +2019,2,14,804117051,1489840756,true +2019,2,14,445082605,131969704,true +2019,2,14,1654129117,848493196,true +2019,2,14,640736768,1747896016,true +2019,2,15,1808433940,163567437,false +2019,2,15,721792838,1108793818,true +2019,2,15,998559647,1284611523,true +2019,2,15,1257476458,37461828,true +2019,2,15,1183661047,904643544,true +2019,2,15,921698924,2053013004,true +2019,2,15,1763918865,644999052,true +2019,2,15,1465482493,1545558347,true +2019,2,15,1206768947,37237578,false +2019,2,15,956271911,340717312,true +2019,2,16,772545522,1306556112,true +2019,2,16,2125426338,400376706,true +2019,2,16,805707122,1549244321,false +2019,2,16,449802103,1761844311,true +2019,2,16,723839354,255829254,true +2019,2,16,202194097,1653487456,true +2019,2,16,2055898885,142097149,false +2019,2,16,1802489939,1078150045,true +2019,2,16,1815703369,338904238,true +2019,2,16,2107096781,131938027,true +2019,2,17,1267698548,1976527421,true +2019,2,17,648310691,2040299672,true +2019,2,17,1859504016,306501390,false +2019,2,17,1718890498,178084508,false +2019,2,17,148076870,1912298715,true +2019,2,17,1033590651,1354443730,false +2019,2,17,1669047167,85628203,true +2019,2,17,70563357,1449925520,true +2019,2,17,1286903822,1665357320,false +2019,2,17,386238618,489763685,false +2019,2,18,1811623662,560785418,true +2019,2,18,1119423051,1973765616,true +2019,2,18,1731710647,1743043339,false +2019,2,18,2064956963,1152244230,true +2019,2,18,823288911,886254632,false +2019,2,18,88964853,1101799330,true 
+2019,2,18,2014066705,2048858839,true +2019,2,18,1597416603,1701480533,false +2019,2,18,1975246036,1038736440,true +2019,2,18,1115545307,636981042,true +2019,2,19,1888244124,1724932844,true +2019,2,19,683089955,1202494564,false +2019,2,19,2080925917,125402908,false +2019,2,19,450421103,2024099579,true +2019,2,19,1218293907,476953969,true +2019,2,19,1140882935,383771526,false +2019,2,19,373095588,728167718,false +2019,2,19,386150516,1507489946,true +2019,2,19,456943933,859624590,true +2019,2,19,859417894,389917274,false +2019,2,20,595115308,1720909969,true +2019,2,20,675298008,103150740,false +2019,2,20,431461488,1663334159,false +2019,2,20,1908923027,1248890648,false +2019,2,20,1235040615,1084761402,true +2019,2,20,488001743,598972998,true +2019,2,20,493395814,947575350,true +2019,2,20,234490925,778928666,true +2019,2,20,1523915687,23804835,false +2019,2,20,866305328,782568462,true +2019,2,21,289362373,1294136545,false +2019,2,21,371365739,1410244782,false +2019,2,21,606330695,1672762154,false +2019,2,21,1090523156,848796601,false +2019,2,21,1561900521,525447133,false +2019,2,21,903581451,18667035,true +2019,2,21,404969114,738377606,false +2019,2,21,2064068269,779928364,false +2019,2,21,2126466932,414640010,false +2019,2,21,174621708,1637969649,false +2019,2,22,1706691860,1491627395,true +2019,2,22,2082782309,414514518,false +2019,2,22,233262105,952466096,true +2019,2,22,223950313,416352918,false +2019,2,22,443011512,1166924543,true +2019,2,22,1101627703,202574677,true +2019,2,22,867801765,414245536,false +2019,2,22,1509821975,1491441422,true +2019,2,22,1258550722,1569803813,false +2019,2,22,285241682,1330639335,false +2019,2,23,969741391,548278415,true +2019,2,23,1076396738,1158497035,false +2019,2,23,1219693461,482420712,false +2019,2,23,1087592940,264649492,true +2019,2,23,33891393,2035846191,true +2019,2,23,1683406019,1735430710,false +2019,2,23,1387278823,267021762,true +2019,2,23,1973598052,1111836699,true +2019,2,23,653367655,990646159,false +2019,2,23,1309351141,1232732092,true +2019,2,24,1015559979,225044142,false +2019,2,24,250376347,1773387674,true +2019,2,24,222006903,80616502,true +2019,2,24,133004378,2084428571,true +2019,2,24,735256895,1738499940,true +2019,2,24,535851924,1330275329,true +2019,2,24,413635649,1716143482,true +2019,2,24,95241336,73485102,true +2019,2,24,1955784023,269494741,true +2019,2,24,783380299,464444985,false +2019,2,25,1954384441,161950758,true +2019,2,25,682670520,2120983416,false +2019,2,25,409230378,2066631654,false +2019,2,25,574301907,1600145552,true +2019,2,25,1308283203,1809409869,true +2019,2,25,430904817,1883600814,true +2019,2,25,1860288091,698405233,true +2019,2,25,1337924019,1677559272,false +2019,2,25,169067710,299161192,false +2019,2,25,27282897,151853447,true +2019,2,26,1026831658,2143864184,false +2019,2,26,972782978,321892641,false +2019,2,26,1551214035,1607735479,false +2019,2,26,322346397,597191880,false +2019,2,26,1142704988,375215788,false +2019,2,26,417318081,598814916,true +2019,2,26,1454531840,455157206,false +2019,2,26,110836010,1077427594,true +2019,2,26,518998632,1519520542,true +2019,2,26,498937593,1698970353,false +2019,2,27,436467168,1067008926,true +2019,2,27,917027611,165250255,false +2019,2,27,1332334563,814704452,true +2019,2,27,466165270,296867211,false +2019,2,27,107288893,2110903615,true +2019,2,27,681258844,2008368965,true +2019,2,27,1778788142,995599408,false +2019,2,27,2055875842,660381346,true +2019,2,27,890882392,915062683,true +2019,2,27,2020005088,27261530,true +2019,2,28,2029990750,1891520780,true 
+2019,2,28,1362641798,2043734304,true +2019,2,28,359249015,557735653,false +2019,2,28,333181591,1609931649,true +2019,2,28,428799340,304282585,true +2019,2,28,744694621,1514535352,true +2019,2,28,1585672625,2136885627,false +2019,2,28,1622105011,1398269379,false +2019,2,28,1535769246,2053389673,false +2019,2,28,865741466,1299869708,true +2019,2,29,595549174,157768704,true +2019,2,29,2091679515,1110987145,false +2019,2,29,527908661,1111606426,true +2019,2,29,1351829308,301475870,true +2019,2,29,75841378,1605290686,false +2019,2,29,922689072,347589288,false +2019,2,29,921588057,1969306837,true +2019,2,29,1034835984,1775902714,false +2019,2,29,1013990596,735917115,false +2019,2,29,1156056767,16926894,false +2019,2,30,1453047874,1050880966,false +2019,2,30,1542289095,1642671521,true +2019,2,30,321963881,985395482,false +2019,2,30,2123459655,87340396,true +2019,2,30,1963860903,2041193053,true +2019,2,30,970057559,764055864,true +2019,2,30,618397998,1434721588,false +2019,2,30,827636844,374328772,true +2019,2,30,1167869804,1017634054,true +2019,2,30,1648404631,802008946,true +2019,2,31,2048243909,797316599,false +2019,2,31,920455867,1073408181,false +2019,2,31,1871660935,267464484,false +2019,2,31,1843454486,1176797243,true +2019,2,31,1384668743,1040411134,false +2019,2,31,191262625,623235236,true +2019,2,31,260155443,1673713108,true +2019,2,31,932293436,1161592231,true +2019,2,31,1538540028,632583666,true +2019,2,31,1949038246,1211177709,false +2019,3,1,668382809,1061588448,true +2019,3,1,1188816804,71573274,false +2019,3,1,666816870,820914849,true +2019,3,1,1763899374,1751689642,false +2019,3,1,1569438780,1367935234,false +2019,3,1,115881145,183184568,true +2019,3,1,1196885652,334739548,false +2019,3,1,1034680562,1955349443,false +2019,3,1,1018091108,1290706522,false +2019,3,1,918561935,47871759,false +2019,3,2,1367111077,2012421014,false +2019,3,2,1780904059,1428057911,true +2019,3,2,331183309,1930285618,true +2019,3,2,1706902020,41533188,false +2019,3,2,842704027,700106427,false +2019,3,2,798362080,393869738,true +2019,3,2,458150992,954994929,true +2019,3,2,2118097432,81844731,true +2019,3,2,1215994211,1822549326,false +2019,3,2,101008257,713218863,true +2019,3,3,1073544273,363477341,true +2019,3,3,1314440842,1443061323,false +2019,3,3,287885857,1571025990,true +2019,3,3,692749443,1902600023,true +2019,3,3,867765418,519819798,true +2019,3,3,457684403,449078241,true +2019,3,3,1834343361,1267968938,false +2019,3,3,227149261,1720893562,true +2019,3,3,1776150488,713885263,false +2019,3,3,1185721255,787096883,false +2019,3,4,2012011195,818443627,false +2019,3,4,1350576707,48931141,false +2019,3,4,1520862565,2030093774,false +2019,3,4,209991818,1730120808,false +2019,3,4,140106278,235203970,true +2019,3,4,1032659522,703926405,false +2019,3,4,929790101,162636839,true +2019,3,4,1283137584,2006186182,true +2019,3,4,4728064,1053566794,true +2019,3,4,1810101853,886375219,true +2019,3,5,2104850261,1293767477,true +2019,3,5,156004089,949471324,true +2019,3,5,498560445,841921568,false +2019,3,5,929322326,40333504,true +2019,3,5,1099492705,169596121,false +2019,3,5,1792423915,906916336,false +2019,3,5,1850451982,1904715577,true +2019,3,5,147308895,74677666,true +2019,3,5,1590749108,385400823,true +2019,3,5,883472400,620051243,true +2019,3,6,120289523,1978471108,true +2019,3,6,1613232274,1460953119,false +2019,3,6,1891492917,108410030,true +2019,3,6,541514515,1339021059,true +2019,3,6,339568900,1719581248,false +2019,3,6,146650287,1921248089,true +2019,3,6,1023889823,955145175,false 
+2019,3,6,828267798,1607104618,false +2019,3,6,475809020,444659232,true +2019,3,6,223092245,1316547739,false +2019,3,7,1367657394,1584500161,true +2019,3,7,2015954390,1091249461,false +2019,3,7,1550801031,1133577373,true +2019,3,7,356421123,1537525475,true +2019,3,7,88498072,1998879787,true +2019,3,7,1412413699,759321935,true +2019,3,7,1668509517,1400587227,false +2019,3,7,38501391,2042624176,false +2019,3,7,2104583322,914553730,true +2019,3,7,1829459929,609036831,false +2019,3,8,464373823,1603473727,false +2019,3,8,280539879,558200845,false +2019,3,8,288470276,1808353116,false +2019,3,8,12496899,266979895,false +2019,3,8,1270507298,102713872,false +2019,3,8,409086024,1572409567,true +2019,3,8,2093306710,551476789,true +2019,3,8,1428157361,1364766015,true +2019,3,8,1663791193,1119725901,true +2019,3,8,1839600700,576532843,false +2019,3,9,1586153388,150163561,true +2019,3,9,79656696,646702112,false +2019,3,9,1413446391,2043313870,false +2019,3,9,1989551001,366006571,true +2019,3,9,1648935280,2077138545,false +2019,3,9,1024241258,1551346036,true +2019,3,9,1964049669,438401950,true +2019,3,9,1569120694,1936612819,true +2019,3,9,987744574,434373218,false +2019,3,9,74033910,494257710,false +2019,3,10,952727057,221220760,false +2019,3,10,93042031,1236327070,true +2019,3,10,1501368088,717176995,false +2019,3,10,816353026,1505203846,true +2019,3,10,1125538807,1217205312,false +2019,3,10,182390108,228367838,true +2019,3,10,2138546380,1252645871,false +2019,3,10,1112914119,481868584,false +2019,3,10,1335873012,1869791460,true +2019,3,10,1206097741,40720580,false +2019,3,11,1837077666,1621543051,true +2019,3,11,74751469,2043069085,true +2019,3,11,2060132841,1184048116,false +2019,3,11,177512201,1289442668,true +2019,3,11,871429193,174095048,true +2019,3,11,1010078823,994072314,true +2019,3,11,720199660,1064672670,true +2019,3,11,1965092937,409343788,true +2019,3,11,1966240772,2073980092,false +2019,3,11,1037960336,1617476551,false +2019,3,12,393551893,43980718,false +2019,3,12,1649203183,226150686,true +2019,3,12,1078608401,562853693,false +2019,3,12,1024442503,886220850,false +2019,3,12,1355005575,1323300516,true +2019,3,12,576004236,2000758508,false +2019,3,12,1333885530,1926208030,false +2019,3,12,999783624,1991756149,false +2019,3,12,1034555837,1766195223,false +2019,3,12,1789273617,161074140,false +2019,3,13,1095305824,324079534,false +2019,3,13,620733964,1867544965,false +2019,3,13,1963265815,431239754,false +2019,3,13,1762614484,1369773084,true +2019,3,13,1893273956,286377837,false +2019,3,13,983044194,1974040061,false +2019,3,13,1225460133,759286029,false +2019,3,13,1970071494,1061289039,false +2019,3,13,343279831,1434297361,false +2019,3,13,1032207246,1103864649,false +2019,3,14,280655666,1316809686,false +2019,3,14,1515067891,311995224,false +2019,3,14,824525060,668407994,true +2019,3,14,1200433314,2005674007,true +2019,3,14,1892116373,1304740127,false +2019,3,14,2032729915,674533990,false +2019,3,14,1260682749,414253425,false +2019,3,14,16514268,1850305787,true +2019,3,14,2057116212,107078403,true +2019,3,14,544229456,1722699584,false +2019,3,15,1341046979,256234215,true +2019,3,15,399371711,158583906,false +2019,3,15,1003314649,957575444,false +2019,3,15,346174960,1689107667,false +2019,3,15,1715112121,1058045291,false +2019,3,15,1119695477,1430716089,false +2019,3,15,1154983610,129121151,true +2019,3,15,6578019,459333114,false +2019,3,15,1114380619,698130221,false +2019,3,15,673052260,2009829086,true +2019,3,16,701368003,1083360099,false +2019,3,16,104427576,715427469,true 
+2019,3,16,112234459,1028710834,true +2019,3,16,621119758,408281799,true +2019,3,16,474379302,548539503,false +2019,3,16,1871919420,1330128556,true +2019,3,16,362599917,960470679,false +2019,3,16,1523402543,1803723889,true +2019,3,16,2054883878,1656257467,true +2019,3,16,875868347,1397587021,false +2019,3,17,553499539,1997481866,true +2019,3,17,1084750097,661034764,true +2019,3,17,901525536,1404649448,false +2019,3,17,826588667,34270980,true +2019,3,17,2093270554,705449350,true +2019,3,17,570663475,1602888805,false +2019,3,17,1487131355,915144314,true +2019,3,17,475786229,1088776723,true +2019,3,17,1005027062,884409167,true +2019,3,17,1376845338,311897748,true +2019,3,18,2115050513,821150326,false +2019,3,18,1669095254,2134149299,true +2019,3,18,182275819,2010265390,false +2019,3,18,239412855,10088934,true +2019,3,18,1662538814,61171768,true +2019,3,18,1921297334,252397729,true +2019,3,18,1433022846,376782829,false +2019,3,18,641801850,1889272589,false +2019,3,18,1107483305,1430454935,true +2019,3,18,277570584,1666447084,false +2019,3,19,332582044,487487765,false +2019,3,19,1551777654,728937863,false +2019,3,19,427253680,1222178107,false +2019,3,19,1592427171,271467261,false +2019,3,19,850442376,1693706855,true +2019,3,19,1638777526,941127244,false +2019,3,19,1529531736,423149630,true +2019,3,19,1071055774,386573287,false +2019,3,19,1218708490,1850218137,false +2019,3,19,1512367263,2014504636,true +2019,3,20,391023124,586987778,true +2019,3,20,779243441,47610865,true +2019,3,20,575750358,1440632147,true +2019,3,20,305552647,276677576,false +2019,3,20,529596886,99822674,false +2019,3,20,495997275,267429474,true +2019,3,20,1337566267,2007702633,false +2019,3,20,691874005,311747935,true +2019,3,20,1144103252,745973181,false +2019,3,20,1614172591,701389708,false +2019,3,21,306513933,1198177369,false +2019,3,21,1285284058,1564860955,false +2019,3,21,1503768477,695097391,true +2019,3,21,643561555,1937020700,true +2019,3,21,1628166101,1539347464,true +2019,3,21,424614383,212607789,true +2019,3,21,867152348,226550737,true +2019,3,21,1291389627,175271073,true +2019,3,21,1297167927,459812794,true +2019,3,21,1562937935,480277639,true +2019,3,22,1854621035,1588524322,true +2019,3,22,1361561250,1545401946,false +2019,3,22,1273751203,1398884176,true +2019,3,22,1838936119,1425497416,true +2019,3,22,1548459035,206731606,true +2019,3,22,777843657,78498373,false +2019,3,22,1037485888,932032453,true +2019,3,22,1383462332,552062060,true +2019,3,22,290129603,295855087,true +2019,3,22,1990985972,2110857573,true +2019,3,23,1116202773,1212240874,true +2019,3,23,69242348,463332789,true +2019,3,23,282999303,2073706907,true +2019,3,23,218038306,1080630611,false +2019,3,23,910132346,1175031905,true +2019,3,23,1618970409,846461653,true +2019,3,23,1894660459,1908413517,true +2019,3,23,861320716,974116872,false +2019,3,23,1434353599,347246297,false +2019,3,23,1681084098,502136029,true +2019,3,24,1141380248,1828336798,true +2019,3,24,366080224,208808425,true +2019,3,24,2038304283,512654473,true +2019,3,24,1232019923,1168241130,false +2019,3,24,953017069,1773810790,true +2019,3,24,899574335,1631146516,true +2019,3,24,806926551,1014740975,false +2019,3,24,1187719007,485636878,true +2019,3,24,295217482,208731799,false +2019,3,24,140015733,1126296810,true +2019,3,25,274134204,1163678153,true +2019,3,25,550645675,656717437,true +2019,3,25,2043352601,231452762,true +2019,3,25,1476080376,1649599709,true +2019,3,25,1677712826,1944223060,true +2019,3,25,767431902,624167417,false +2019,3,25,1443970973,14673773,false 
+2019,3,25,1486630435,424102268,false +2019,3,25,1651040985,721603524,true +2019,3,25,1562704087,1324786048,true +2019,3,26,735170687,91324859,true +2019,3,26,184756788,116013245,false +2019,3,26,1770352229,54939396,true +2019,3,26,444914765,1210134068,false +2019,3,26,1084837526,488311356,true +2019,3,26,1090474372,1679574666,true +2019,3,26,204068438,1289486255,false +2019,3,26,1122126638,1887482250,false +2019,3,26,1730701303,1084051216,false +2019,3,26,1902699013,217976462,true +2019,3,27,1741889790,115999311,true +2019,3,27,1750211121,2043306167,true +2019,3,27,1215984190,338812119,true +2019,3,27,1497849345,1665915969,true +2019,3,27,89090550,58501077,false +2019,3,27,1132818877,1556999256,true +2019,3,27,1028753435,196041556,true +2019,3,27,618242066,2098369842,true +2019,3,27,1587082059,868894041,false +2019,3,27,948609535,1878063201,true +2019,3,28,1102704928,1121847106,true +2019,3,28,1501458013,1951289749,true +2019,3,28,1566060260,832061203,true +2019,3,28,1915518023,1642123336,false +2019,3,28,965132234,1355138422,false +2019,3,28,1744398611,2128039693,true +2019,3,28,1762610231,731832986,false +2019,3,28,1139617187,563716978,true +2019,3,28,2082853326,2001111862,false +2019,3,28,2022273066,1151478203,false +2019,3,29,506205741,588704859,false +2019,3,29,784914111,8443358,true +2019,3,29,152149669,137304025,true +2019,3,29,251321431,721717127,false +2019,3,29,140717394,56939613,false +2019,3,29,1039621371,1620717841,false +2019,3,29,1477148762,1871002528,false +2019,3,29,1469130575,547856814,false +2019,3,29,1191832130,1096498740,true +2019,3,29,537669838,404095297,false +2019,3,30,470377107,84813779,false +2019,3,30,630251328,1834375309,false +2019,3,30,836924496,1896818755,true +2019,3,30,1263928666,2027449029,true +2019,3,30,462291935,245138464,true +2019,3,30,614203787,1562688941,true +2019,3,30,1278728312,1324725814,true +2019,3,30,58701137,107512765,true +2019,3,30,926627135,1853703472,true +2019,3,30,1312299203,1976254479,false +2019,3,31,1132342407,507294399,true +2019,3,31,1466880264,1261436414,true +2019,3,31,101827575,114000011,false +2019,3,31,267827691,297264825,false +2019,3,31,439159433,574891152,true +2019,3,31,1114870233,794455361,true +2019,3,31,1883437463,1354676655,false +2019,3,31,68566251,2089567617,false +2019,3,31,2138632308,1594921625,true +2019,3,31,1481715391,2083878283,true +2019,4,1,737402437,1869505643,false +2019,4,1,1498008659,1790908441,false +2019,4,1,1160400915,1404752772,false +2019,4,1,431279660,1861316063,true +2019,4,1,1696246093,1162385537,true +2019,4,1,1307387989,941928284,false +2019,4,1,1180130100,93029486,false +2019,4,1,1268551981,155992868,true +2019,4,1,150146027,1809396348,true +2019,4,1,444184829,1098489048,false +2019,4,2,1350031382,164987623,false +2019,4,2,2033677249,507946833,false +2019,4,2,816202327,1002036687,false +2019,4,2,232770483,1508853917,true +2019,4,2,1953008141,990119468,false +2019,4,2,1781137014,1757796576,false +2019,4,2,1945114386,802866223,true +2019,4,2,263198707,1768578372,true +2019,4,2,1388182136,2045653924,true +2019,4,2,270730566,305303153,true +2019,4,3,637682801,1896275507,false +2019,4,3,1862723403,819934786,false +2019,4,3,243052005,849037977,false +2019,4,3,998698929,1140839101,false +2019,4,3,854319207,1305878825,true +2019,4,3,1831566486,977321071,true +2019,4,3,791613826,1517798200,false +2019,4,3,1089422875,1355472062,true +2019,4,3,1016774295,493553211,true +2019,4,3,2131990136,923426002,true +2019,4,4,1201646791,684522699,false +2019,4,4,1435467880,1555509277,true 
+2019,4,4,1516827802,1369450179,true +2019,4,4,1555310873,1305347504,true +2019,4,4,628646754,1616533301,false +2019,4,4,2021609942,688427484,false +2019,4,4,1330703502,1424878060,true +2019,4,4,1199418150,1251694036,false +2019,4,4,1609421668,2105004975,true +2019,4,4,800452123,1225170245,true +2019,4,5,1737929184,982482826,false +2019,4,5,2111017477,835545632,true +2019,4,5,481303466,1987025089,true +2019,4,5,443089856,1579943186,false +2019,4,5,1630889050,25797691,true +2019,4,5,922474839,655422805,true +2019,4,5,1792579213,1190983044,true +2019,4,5,80046012,1673907422,true +2019,4,5,552336752,1852631742,true +2019,4,5,1026339252,1837403679,true +2019,4,6,1203131597,727498774,false +2019,4,6,1675013716,917833420,true +2019,4,6,43832263,1094747795,true +2019,4,6,1631826595,1609654116,false +2019,4,6,829514940,106179422,false +2019,4,6,1227762994,423616886,true +2019,4,6,1024952583,1043452885,true +2019,4,6,143862744,864200399,true +2019,4,6,331297191,1464365058,false +2019,4,6,1303454405,1963266584,false +2019,4,7,831868387,1303386196,true +2019,4,7,139424177,576105879,false +2019,4,7,596674442,624092326,true +2019,4,7,1059651289,443999998,false +2019,4,7,476817510,214120951,false +2019,4,7,1321156851,1745102741,false +2019,4,7,1006372596,942301543,false +2019,4,7,369100918,234982955,true +2019,4,7,688007296,1710188113,true +2019,4,7,128423361,1593517190,false +2019,4,8,552473912,547621527,true +2019,4,8,759160062,2070568012,false +2019,4,8,86750929,387377025,true +2019,4,8,1522307550,1287716765,true +2019,4,8,1603900630,951246522,false +2019,4,8,799966338,1481383900,false +2019,4,8,31746485,2126946895,false +2019,4,8,393104932,850065741,true +2019,4,8,1367536168,867083201,false +2019,4,8,789190501,1394003096,false +2019,4,9,1331760272,1148254596,false +2019,4,9,1953608517,590047061,true +2019,4,9,294211665,1124601734,true +2019,4,9,1730294068,1827080998,true +2019,4,9,1385547100,1083052682,true +2019,4,9,1915401664,1866573581,true +2019,4,9,1254613603,2129217416,false +2019,4,9,53062852,505706753,true +2019,4,9,1092453488,1151842118,false +2019,4,9,1088327975,1852128752,true +2019,4,10,197901520,1708594152,true +2019,4,10,290612621,140223656,true +2019,4,10,402401939,356667698,false +2019,4,10,898752144,194565225,false +2019,4,10,1571334932,186147235,true +2019,4,10,483916359,1736118791,false +2019,4,10,139563329,429636030,true +2019,4,10,636461656,779220706,false +2019,4,10,2050086304,796323858,true +2019,4,10,441306308,1773720358,false +2019,4,11,754704701,1016847336,true +2019,4,11,1540074783,2098120971,false +2019,4,11,524370953,506807711,true +2019,4,11,445627149,338723089,false +2019,4,11,1134289751,2018665242,false +2019,4,11,1996745729,1986729832,false +2019,4,11,852199282,2015053770,true +2019,4,11,1063013823,1929355527,false +2019,4,11,1689139072,121107268,false +2019,4,11,1743881234,294286111,true +2019,4,12,213588279,729551452,false +2019,4,12,379738023,579387879,false +2019,4,12,1740152332,895526458,true +2019,4,12,1753470237,532497407,true +2019,4,12,746142171,1632954858,true +2019,4,12,808141469,752082391,true +2019,4,12,69386054,2099300944,false +2019,4,12,1438957469,1294138580,true +2019,4,12,283139124,811492749,false +2019,4,12,298223006,905759798,false +2019,4,13,58072378,798236212,false +2019,4,13,975620604,347332074,true +2019,4,13,374761685,375270368,true +2019,4,13,967966765,1166462917,false +2019,4,13,1488001982,1329792524,true +2019,4,13,165347595,315683296,false +2019,4,13,809199036,293134138,true +2019,4,13,2062431980,1820232977,true 
+2019,4,13,1884637890,1661498832,false +2019,4,13,818261953,1020866735,true +2019,4,14,485707326,1441862957,true +2019,4,14,2092739958,835843364,false +2019,4,14,2186971,111054057,true +2019,4,14,1690192441,2077160699,true +2019,4,14,920135471,65631552,true +2019,4,14,317839744,862974060,false +2019,4,14,1648067594,2115444491,true +2019,4,14,1507338214,2112958361,false +2019,4,14,747871062,61168361,true +2019,4,14,155937584,1004000747,true +2019,4,15,674297554,1866507857,true +2019,4,15,952035609,898631008,false +2019,4,15,1895672848,1084528333,false +2019,4,15,501147324,620952225,true +2019,4,15,81530745,2026855063,true +2019,4,15,737846197,76858165,true +2019,4,15,1508393539,1813261115,true +2019,4,15,120834887,188802359,false +2019,4,15,322579358,1079339548,false +2019,4,15,930427697,175199314,false +2019,4,16,1047899014,682041318,true +2019,4,16,2130166520,1626927597,true +2019,4,16,1774133756,1323601751,true +2019,4,16,1763395535,531867350,false +2019,4,16,217730958,205506787,false +2019,4,16,1538431069,126513862,true +2019,4,16,268708989,2111733786,false +2019,4,16,2033258779,1569565331,true +2019,4,16,775432668,572726499,true +2019,4,16,2142432511,1442224114,false +2019,4,17,1980527746,323023334,false +2019,4,17,808165286,1786959379,true +2019,4,17,339530662,167573170,false +2019,4,17,2061233478,1675471996,true +2019,4,17,1258401138,64827277,false +2019,4,17,97868292,1582849516,false +2019,4,17,1183918595,1342529008,true +2019,4,17,491224463,182678325,true +2019,4,17,1132423553,598375438,false +2019,4,17,1836392709,1218284103,false +2019,4,18,1744559194,542913294,true +2019,4,18,1259838604,704950241,false +2019,4,18,1870927043,368950958,true +2019,4,18,1138556720,216389291,true +2019,4,18,739667724,1450302092,false +2019,4,18,1268053724,705308832,true +2019,4,18,69911153,1244148819,true +2019,4,18,1374584709,1482606606,false +2019,4,18,547257953,273728369,true +2019,4,18,806873066,945007518,true +2019,4,19,2007578747,840049190,false +2019,4,19,1736968051,1409787166,true +2019,4,19,216497,672457995,true +2019,4,19,661939211,1508893951,true +2019,4,19,1827792206,420857192,true +2019,4,19,724553751,2008644812,true +2019,4,19,1953623764,722702313,false +2019,4,19,2046330835,1553731845,false +2019,4,19,1242147815,1055521940,false +2019,4,19,1768839863,44181275,false +2019,4,20,214994379,1499934230,true +2019,4,20,364955833,1377359492,false +2019,4,20,1994240537,1175676342,true +2019,4,20,1591966847,1209347111,true +2019,4,20,2079567024,1224724119,true +2019,4,20,343147243,75486600,true +2019,4,20,1550687515,72740363,false +2019,4,20,821452155,1239530873,false +2019,4,20,773741531,809670695,true +2019,4,20,1201987699,338671173,true +2019,4,21,1806886997,926944987,false +2019,4,21,1842480853,1173695882,false +2019,4,21,2110831190,1961328434,false +2019,4,21,1767724427,1639629887,false +2019,4,21,51266156,610546579,true +2019,4,21,2087541840,439718294,false +2019,4,21,1330006660,392551184,true +2019,4,21,268185149,1145463226,true +2019,4,21,779099276,1567360294,false +2019,4,21,1004969221,158366115,false +2019,4,22,504358669,1092466308,true +2019,4,22,2026629439,1597497189,true +2019,4,22,1546246133,287610534,false +2019,4,22,633325404,1161161425,true +2019,4,22,1647010829,1187313524,true +2019,4,22,1624595470,1160086594,false +2019,4,22,77151689,1429319454,false +2019,4,22,1808432206,482346979,false +2019,4,22,2115840125,548714985,false +2019,4,22,816307273,1510527449,true +2019,4,23,1743793429,897690803,false +2019,4,23,806305241,786174991,true +2019,4,23,1310608353,1388608074,false 
+2019,4,23,727968336,2145472670,false +2019,4,23,1366731550,539147514,true +2019,4,23,1710564548,107592707,false +2019,4,23,632030260,1018030805,false +2019,4,23,210016542,1066758303,false +2019,4,23,1426964973,989685404,true +2019,4,23,262433173,347197752,false +2019,4,24,216756241,1841139807,true +2019,4,24,938422115,405359218,true +2019,4,24,950917406,376938537,false +2019,4,24,609180879,751690619,false +2019,4,24,1856440611,303661645,false +2019,4,24,2113388005,2023312478,false +2019,4,24,1515943141,22999097,false +2019,4,24,1526377073,98460857,false +2019,4,24,2053720782,2020141522,false +2019,4,24,1586276121,1091117001,false +2019,4,25,2044299105,2028950217,false +2019,4,25,348587180,421348107,true +2019,4,25,681728287,862198891,false +2019,4,25,215634941,1724207842,true +2019,4,25,511229990,10477525,true +2019,4,25,149131748,322381058,true +2019,4,25,1092400114,118386072,false +2019,4,25,1421054521,1880134288,true +2019,4,25,185042437,1238200113,true +2019,4,25,1450293738,1476244532,false +2019,4,26,981942882,1280316381,false +2019,4,26,102913910,823885978,true +2019,4,26,777422622,1755755950,false +2019,4,26,218115703,45720072,true +2019,4,26,1288258166,1117240568,false +2019,4,26,116909566,1327002135,true +2019,4,26,71437097,1260689258,false +2019,4,26,1042041580,1180914366,false +2019,4,26,862906634,89697636,false +2019,4,26,1866632853,693805593,false +2019,4,27,1543009281,1704766327,false +2019,4,27,151922505,1846893250,true +2019,4,27,1107541499,98797326,false +2019,4,27,830114711,119186582,true +2019,4,27,1384814707,802892976,true +2019,4,27,1885665900,23345714,true +2019,4,27,2111963287,410850135,false +2019,4,27,1262399827,990913943,true +2019,4,27,250799429,2062341079,true +2019,4,27,1724295051,673907373,false +2019,4,28,672964676,1224191051,true +2019,4,28,484241319,1356021924,true +2019,4,28,1693216905,874010355,false +2019,4,28,2025755380,1879888735,true +2019,4,28,1123259759,1454448927,true +2019,4,28,232906973,321256559,true +2019,4,28,409692964,1782221290,false +2019,4,28,55384579,296179530,false +2019,4,28,955819704,1425027569,false +2019,4,28,281403845,572569279,false +2019,4,29,445422745,1896307802,false +2019,4,29,368724580,1987752407,true +2019,4,29,2095900252,41025211,false +2019,4,29,1623551763,84053152,false +2019,4,29,2090980635,383594685,true +2019,4,29,577483915,689931677,true +2019,4,29,682456035,1556678040,false +2019,4,29,809878523,1364789961,false +2019,4,29,618496467,1461668029,false +2019,4,29,2117498225,1371799920,false +2019,4,30,1132482058,1094487100,true +2019,4,30,487978507,546353742,false +2019,4,30,1548210031,1236222676,false +2019,4,30,622580624,525672366,true +2019,4,30,754467059,1885376686,true +2019,4,30,262670920,571088263,false +2019,4,30,1432094444,1230824744,true +2019,4,30,491784994,1685160902,false +2019,4,30,2109291660,2061737288,false +2019,4,30,340049805,1925566784,false +2019,4,31,942477477,608088486,true +2019,4,31,2081957270,210215434,false +2019,4,31,451190589,1347492849,true +2019,4,31,1719667232,1624966691,false +2019,4,31,1786530203,1528627037,true +2019,4,31,1244985427,1536236982,true +2019,4,31,1612660949,1258554520,false +2019,4,31,1897102486,142341730,true +2019,4,31,839259911,1974244637,false +2019,4,31,74884784,268011840,false +2019,5,1,389630391,1920926453,false +2019,5,1,1275009513,2122611625,true +2019,5,1,1672209438,830800208,true +2019,5,1,548750132,212268259,true +2019,5,1,1874993729,773091196,true +2019,5,1,564677847,28707139,true +2019,5,1,760444452,2127178520,true +2019,5,1,1770548235,655292263,false 
+2019,5,1,344000126,142194594,false +2019,5,1,1860305022,2081836683,false +2019,5,2,814727860,10999979,false +2019,5,2,380698313,1023614259,true +2019,5,2,757402124,1355747668,true +2019,5,2,1519916597,1369705878,false +2019,5,2,1269806158,104933099,false +2019,5,2,1283276642,1705765269,false +2019,5,2,801478736,302880181,false +2019,5,2,897823917,1962930719,false +2019,5,2,870549794,198520394,true +2019,5,2,79761001,725734992,false +2019,5,3,205438168,1576214295,true +2019,5,3,1511337033,771585203,true +2019,5,3,805943324,1758810332,true +2019,5,3,1334381193,1840774503,true +2019,5,3,1800585232,2007270972,false +2019,5,3,635661499,630330909,false +2019,5,3,245401648,2009132478,true +2019,5,3,867744066,872175913,false +2019,5,3,2030363690,1402739406,true +2019,5,3,565200864,780631829,true +2019,5,4,1016186344,353203622,true +2019,5,4,931730149,267241086,true +2019,5,4,1293372999,1364423453,false +2019,5,4,1621884381,1995395108,false +2019,5,4,2085765931,477174987,false +2019,5,4,295570897,1123038576,true +2019,5,4,1513646059,1319956248,true +2019,5,4,1683607162,870165979,false +2019,5,4,526323024,483501560,true +2019,5,4,610731248,1683731453,false +2019,5,5,75121824,1622313614,false +2019,5,5,1569360831,1537332533,false +2019,5,5,901678435,1545601772,true +2019,5,5,2046236013,1914912540,true +2019,5,5,1315493865,1477237414,false +2019,5,5,1704028621,1523729481,false +2019,5,5,2048116407,986498549,false +2019,5,5,1706545079,1698465294,true +2019,5,5,1908017001,1558074012,true +2019,5,5,1919059967,575423284,false +2019,5,6,1397163949,1400938701,true +2019,5,6,2122962875,1086983143,false +2019,5,6,1097131349,347351997,false +2019,5,6,2113007425,134011206,true +2019,5,6,1090904259,1419734195,false +2019,5,6,777277533,722293597,true +2019,5,6,925605655,20941990,true +2019,5,6,36503984,1445689964,false +2019,5,6,78292448,2131982286,false +2019,5,6,1458384441,737607887,false +2019,5,7,1921903684,1187433401,false +2019,5,7,1753716595,1967978718,true +2019,5,7,380779998,640816062,true +2019,5,7,1592380824,1426529945,true +2019,5,7,1659898305,1492642016,false +2019,5,7,2073117913,1331660554,false +2019,5,7,671884377,813277572,false +2019,5,7,243820469,727183255,false +2019,5,7,381979952,1396349575,true +2019,5,7,1671333301,565630400,false +2019,5,8,1753252280,1572116117,false +2019,5,8,1287668529,1835138040,false +2019,5,8,1039506593,34703827,true +2019,5,8,310692342,925997727,true +2019,5,8,1449428345,1710503737,true +2019,5,8,836746196,612910150,true +2019,5,8,1793349239,2093455912,true +2019,5,8,251109258,290458303,false +2019,5,8,1780313793,51970601,true +2019,5,8,1328840071,889702066,true +2019,5,9,1666938264,483350490,false +2019,5,9,1535145332,2127369632,false +2019,5,9,1606872868,1165887849,true +2019,5,9,491490958,1337836196,true +2019,5,9,1308917209,975170098,false +2019,5,9,1217508396,1920101459,false +2019,5,9,1022481329,286766179,true +2019,5,9,2074346762,916219523,false +2019,5,9,1905065129,1287398565,true +2019,5,9,572073783,421634279,true +2019,5,10,1847337026,606971717,true +2019,5,10,678986671,1221972929,false +2019,5,10,1425254909,2058195793,true +2019,5,10,410317832,2059069975,false +2019,5,10,427559170,688573340,false +2019,5,10,1288231362,1291953451,false +2019,5,10,1823176059,441884344,false +2019,5,10,629270,853825330,true +2019,5,10,423327296,567129650,true +2019,5,10,994992726,537486192,false +2019,5,11,983878634,1381245632,true +2019,5,11,53393470,620396618,false +2019,5,11,814874227,1177478693,false +2019,5,11,2143485640,1841075184,true 
+2019,5,11,1209770202,628626667,false +2019,5,11,1559991131,635829921,true +2019,5,11,596083342,2010749257,true +2019,5,11,1645987960,768429510,true +2019,5,11,461572240,1643386323,true +2019,5,11,1508612644,1735342416,true +2019,5,12,852613253,1209479615,true +2019,5,12,911070665,65175832,false +2019,5,12,55406132,1909721577,false +2019,5,12,446587462,2134135354,false +2019,5,12,1517563936,709650646,false +2019,5,12,941928028,900029599,true +2019,5,12,199406142,1739461998,false +2019,5,12,1448991426,2145074993,false +2019,5,12,2093424127,392350498,false +2019,5,12,1793018504,2087943538,true +2019,5,13,1410579742,674142130,true +2019,5,13,1138274185,565992294,false +2019,5,13,849115375,1341143516,false +2019,5,13,2029107742,66049303,false +2019,5,13,1409634859,790888845,true +2019,5,13,2145262166,290454807,false +2019,5,13,1317696876,184642937,false +2019,5,13,2024722011,1363702220,true +2019,5,13,2132946638,1881645851,true +2019,5,13,555982027,1499716468,true +2019,5,14,726510455,656127373,false +2019,5,14,972972845,720344338,false +2019,5,14,1852228080,523619516,false +2019,5,14,2121555919,96686290,true +2019,5,14,904000154,261452258,false +2019,5,14,1726096299,1677378684,true +2019,5,14,1032259489,129556300,true +2019,5,14,391984167,349160723,false +2019,5,14,1370374200,587989204,false +2019,5,14,933488079,1402650807,false +2019,5,15,1330205962,1610098590,false +2019,5,15,532115540,1942277205,false +2019,5,15,1781003291,1707155184,true +2019,5,15,1264449910,1449889497,true +2019,5,15,1681334421,1992597823,false +2019,5,15,832579500,1984808410,false +2019,5,15,1608924984,222884813,true +2019,5,15,1096746651,1056178159,true +2019,5,15,2125927003,149093481,true +2019,5,15,1756395196,787023672,false +2019,5,16,1233031365,393037866,true +2019,5,16,1995986639,1120953025,false +2019,5,16,1895950542,777225029,true +2019,5,16,703520989,2123816898,true +2019,5,16,1894724990,1036099914,false +2019,5,16,718185980,351554536,true +2019,5,16,678297000,1953872486,false +2019,5,16,523303701,666959926,true +2019,5,16,1863685427,123157900,false +2019,5,16,839571035,1097572371,true +2019,5,17,646149907,207542403,false +2019,5,17,801103921,1046549160,true +2019,5,17,368163478,1267674702,false +2019,5,17,940123058,1984646294,true +2019,5,17,2003661082,1003450739,true +2019,5,17,1011427872,1450413742,false +2019,5,17,2055803204,963906307,false +2019,5,17,497201619,240540213,true +2019,5,17,2129308496,507900015,true +2019,5,17,247032978,1039537613,true +2019,5,18,1180075538,1871444371,false +2019,5,18,1160155730,1987855919,true +2019,5,18,1562019519,2046344579,false +2019,5,18,952921297,424600612,false +2019,5,18,413037978,636946419,false +2019,5,18,1389141382,1577176461,false +2019,5,18,1940480019,357313109,true +2019,5,18,858428127,608510296,false +2019,5,18,670110981,556602118,true +2019,5,18,1780333252,740136620,false +2019,5,19,1177653768,1272807963,true +2019,5,19,470962129,921181631,false +2019,5,19,770064666,336180915,false +2019,5,19,381318103,55993609,false +2019,5,19,1717873681,834525398,true +2019,5,19,1561294188,421906385,true +2019,5,19,149544840,1028391751,true +2019,5,19,357691330,1698743117,false +2019,5,19,972728650,255096276,true +2019,5,19,610384638,295626187,false +2019,5,20,738737776,478344205,true +2019,5,20,1874899129,1989562265,false +2019,5,20,1572549537,1922692454,true +2019,5,20,498002814,1757536165,false +2019,5,20,424549943,1171186476,false +2019,5,20,141537629,1179194900,true +2019,5,20,1920417600,668106205,false +2019,5,20,1203009768,278603063,false 
+2019,5,20,448811402,271283905,true +2019,5,20,342474040,275359154,true +2019,5,21,1186433597,417244107,false +2019,5,21,2132766930,274496679,true +2019,5,21,2120949682,6079293,true +2019,5,21,1282809977,699845446,false +2019,5,21,645844101,271047032,false +2019,5,21,717295187,1352320278,true +2019,5,21,902944026,274571008,false +2019,5,21,320670449,1757028043,false +2019,5,21,33858566,114722211,false +2019,5,21,1883333910,384563596,true +2019,5,22,1046162974,1770808119,true +2019,5,22,319504974,2142031599,true +2019,5,22,16812317,786767227,true +2019,5,22,58253754,1428520992,true +2019,5,22,829917207,873371123,false +2019,5,22,773324695,1027888866,false +2019,5,22,842684223,889511389,true +2019,5,22,1123851116,1715278188,false +2019,5,22,1047135396,423311755,false +2019,5,22,343479272,741804637,true +2019,5,23,450901483,1493565391,false +2019,5,23,728281800,821871664,false +2019,5,23,1058409274,378453300,false +2019,5,23,1018605678,268739574,true +2019,5,23,114814415,1603175484,false +2019,5,23,902773229,1133840709,false +2019,5,23,590796015,1408944863,true +2019,5,23,1257336230,2073510306,false +2019,5,23,1059321402,213906162,true +2019,5,23,1839817183,1927128006,false +2019,5,24,1278668420,680488530,true +2019,5,24,1883925970,1373993868,true +2019,5,24,499304130,809347290,true +2019,5,24,620488993,151579002,true +2019,5,24,1304026289,482496906,true +2019,5,24,1962325692,874414357,true +2019,5,24,1265187126,1326616951,true +2019,5,24,1531905939,300603268,false +2019,5,24,772951732,255593040,false +2019,5,24,536616623,1783982124,true +2019,5,25,190777286,41838497,true +2019,5,25,451236745,1526712142,true +2019,5,25,1955826460,693286170,false +2019,5,25,168257715,358161104,true +2019,5,25,550941554,720886146,false +2019,5,25,679762989,458543026,true +2019,5,25,2083324627,1215584548,false +2019,5,25,1014524936,1084919523,false +2019,5,25,664297136,1878141905,true +2019,5,25,174388503,1201020523,false +2019,5,26,1967863403,586863055,false +2019,5,26,1172988645,882062385,true +2019,5,26,1409378402,510801999,false +2019,5,26,1136684843,18208188,true +2019,5,26,1154570666,779499834,true +2019,5,26,1097055346,410633409,false +2019,5,26,1685115639,494396287,false +2019,5,26,153828019,131611232,false +2019,5,26,1835229529,480907295,true +2019,5,26,540675588,519824219,true +2019,5,27,477807584,1190553002,false +2019,5,27,2143596221,534265265,false +2019,5,27,107505481,1350868720,true +2019,5,27,1741946776,373765210,true +2019,5,27,820750133,1103561490,true +2019,5,27,1668739510,2097786908,false +2019,5,27,2003407014,511659059,true +2019,5,27,1863077779,2045566132,false +2019,5,27,1862124457,1542393718,false +2019,5,27,1213284168,1616684996,false +2019,5,28,1895424640,1767498778,true +2019,5,28,38685112,506893127,false +2019,5,28,1097484972,402348230,true +2019,5,28,1571948401,2121262966,true +2019,5,28,208122007,373427558,false +2019,5,28,349523305,35434598,false +2019,5,28,795947231,2069386536,true +2019,5,28,1891476196,1056330189,true +2019,5,28,1666749031,125003646,false +2019,5,28,1208368930,1923147684,true +2019,5,29,80829116,1638903852,true +2019,5,29,1807849637,508933287,false +2019,5,29,106760820,1668550091,true +2019,5,29,1927479603,1254201313,false +2019,5,29,37022086,2041647947,true +2019,5,29,988000480,32733383,false +2019,5,29,1606503384,1048869974,false +2019,5,29,1784500157,79027602,false +2019,5,29,336311408,1166362022,true +2019,5,29,286740746,926810572,true +2019,5,30,346343946,350818725,false +2019,5,30,1796185072,1978323797,false +2019,5,30,1987066378,1270873852,false 
+2019,5,30,1007344955,2114439762,false +2019,5,30,1408401137,1956555219,false +2019,5,30,484426372,1301195957,true +2019,5,30,130127344,943065580,false +2019,5,30,2144475850,1417381990,false +2019,5,30,527438634,1158876910,true +2019,5,30,76069237,1335269178,false +2019,5,31,223823542,1038782829,false +2019,5,31,1153552426,1980687920,false +2019,5,31,151147669,282933910,true +2019,5,31,1027444995,1113099701,true +2019,5,31,1448726542,932165004,false +2019,5,31,1752964613,1671666396,false +2019,5,31,465151782,1366336723,true +2019,5,31,1221157865,1028295114,true +2019,5,31,77432124,35373925,true +2019,5,31,1741966743,1773989834,false +2019,6,1,1601788906,2090998839,false +2019,6,1,100471517,175947711,true +2019,6,1,714803132,1878371428,true +2019,6,1,1060376653,244448758,false +2019,6,1,760814168,1892410165,true +2019,6,1,656729928,1412747396,true +2019,6,1,1149978161,712546492,true +2019,6,1,734751902,85143466,false +2019,6,1,611829723,196218561,false +2019,6,1,736459079,183246338,true +2019,6,2,159582097,1850192168,false +2019,6,2,105894953,670290530,false +2019,6,2,689674653,1798791720,false +2019,6,2,511735010,538684841,true +2019,6,2,1684819600,1818262497,false +2019,6,2,1504059152,1487848703,true +2019,6,2,1290484885,2059281458,false +2019,6,2,1935612556,502547764,true +2019,6,2,472523082,1095461088,true +2019,6,2,1908828308,915377713,false +2019,6,3,352862164,1844947318,false +2019,6,3,370775771,899788224,true +2019,6,3,1649246516,74934614,true +2019,6,3,985139442,289892127,true +2019,6,3,1008413964,1193840637,true +2019,6,3,809995657,1799637261,true +2019,6,3,147365701,1727442537,false +2019,6,3,1911498877,1707083352,true +2019,6,3,691403618,617015322,true +2019,6,3,796357547,312763668,false +2019,6,4,1832884321,1236790064,false +2019,6,4,1859954684,183777023,false +2019,6,4,1796327541,772982409,true +2019,6,4,1073190192,1208416713,false +2019,6,4,1949582854,322483697,false +2019,6,4,1734629958,1943570482,false +2019,6,4,265880495,1763769287,true +2019,6,4,1961157021,263961906,false +2019,6,4,2073358138,1791681690,true +2019,6,4,1399638813,1527554235,false +2019,6,5,1319598027,604171441,false +2019,6,5,1652087816,1396226473,false +2019,6,5,191620532,1193279614,false +2019,6,5,1830227784,560464634,false +2019,6,5,929947304,2005256289,false +2019,6,5,420563538,1042272878,true +2019,6,5,1490049866,1982016849,false +2019,6,5,983849460,1102794002,true +2019,6,5,976677709,296654661,false +2019,6,5,1697218473,647946865,false +2019,6,6,1392966555,1350246246,true +2019,6,6,1222792012,647053651,false +2019,6,6,1982330429,836755328,false +2019,6,6,95837757,2105237853,true +2019,6,6,1631262205,714800424,true +2019,6,6,1919315030,1654203112,false +2019,6,6,893368316,364436102,false +2019,6,6,1447783490,1896266472,true +2019,6,6,1889493930,592594902,true +2019,6,6,774614033,1307456195,false +2019,6,7,34192866,92108786,true +2019,6,7,945854503,1567671387,false +2019,6,7,1129821734,596715395,false +2019,6,7,1351463373,702961861,true +2019,6,7,1249740064,1987160491,true +2019,6,7,1983744021,582089821,true +2019,6,7,814422515,265795149,true +2019,6,7,291432651,112477489,true +2019,6,7,1565288464,306820408,true +2019,6,7,1706594474,2128052484,true +2019,6,8,1504196985,1086596829,true +2019,6,8,708258069,1647810131,true +2019,6,8,1060751810,1373931104,false +2019,6,8,1348251383,18278430,false +2019,6,8,813435952,548200992,true +2019,6,8,1448296864,168808044,true +2019,6,8,1258105795,808730374,false +2019,6,8,216736184,553149885,false +2019,6,8,24976912,795408452,false 
+2019,6,8,1119168415,883762658,false +2019,6,9,1641945648,1975812164,false +2019,6,9,1389002195,2069490945,true +2019,6,9,1440480059,1855619318,true +2019,6,9,652724010,300713382,true +2019,6,9,737548796,1727342790,false +2019,6,9,235821587,1464743722,false +2019,6,9,1036225512,926664410,true +2019,6,9,1578676497,470431205,true +2019,6,9,923600254,469992553,false +2019,6,9,1637346854,1965481182,true +2019,6,10,217738387,1880709707,true +2019,6,10,979706362,429985011,true +2019,6,10,1217954623,1536895252,false +2019,6,10,2007056596,210824631,false +2019,6,10,1997049443,1669883732,false +2019,6,10,981409474,960637556,true +2019,6,10,1019708162,337837270,false +2019,6,10,1058128258,1909436945,false +2019,6,10,1013099484,1011477231,false +2019,6,10,1232249361,859015885,true +2019,6,11,133098772,512893249,true +2019,6,11,411330791,2014413586,true +2019,6,11,1143430598,1298703229,false +2019,6,11,1116870463,1985085130,true +2019,6,11,1130688469,613138438,true +2019,6,11,942456077,366404858,false +2019,6,11,1576000605,1647892614,false +2019,6,11,1437376103,640504721,false +2019,6,11,1763481479,605025489,true +2019,6,11,1963545739,44847021,true +2019,6,12,1498986682,834033074,true +2019,6,12,1390925258,2054928071,false +2019,6,12,2086902871,712724584,true +2019,6,12,1364178177,1498108286,false +2019,6,12,1299520130,1452160263,true +2019,6,12,1328161895,1328784129,true +2019,6,12,1077973674,135639295,true +2019,6,12,116828373,1222113343,false +2019,6,12,93933005,443503717,true +2019,6,12,1798423000,557441133,false +2019,6,13,1465379242,844393438,true +2019,6,13,888760364,26979462,true +2019,6,13,319794522,319132223,true +2019,6,13,2105538415,299591105,false +2019,6,13,1009935376,242418742,true +2019,6,13,832516396,1872161601,false +2019,6,13,144520433,948514204,true +2019,6,13,459757201,1803625829,false +2019,6,13,360707891,1835068224,false +2019,6,13,1660955504,231180669,true +2019,6,14,1031358487,1414325582,true +2019,6,14,1695473746,544084700,true +2019,6,14,381566034,377095576,true +2019,6,14,208253643,1649503859,false +2019,6,14,96893177,410169969,false +2019,6,14,1817926226,1093004409,true +2019,6,14,616685599,223539673,false +2019,6,14,342451522,739504084,false +2019,6,14,1190157203,1814575655,true +2019,6,14,749423832,1737856319,true +2019,6,15,88521868,2035371130,false +2019,6,15,392663108,1205746775,false +2019,6,15,1792323873,549540498,true +2019,6,15,1785292333,1316752153,false +2019,6,15,1394247898,689138920,false +2019,6,15,1964431661,346652800,false +2019,6,15,2106357704,1593325330,false +2019,6,15,1256996666,237323579,false +2019,6,15,200428511,17979364,false +2019,6,15,383270138,222278067,false +2019,6,16,1262725765,1204307905,true +2019,6,16,1479130469,1485211352,false +2019,6,16,1136852119,1638654750,true +2019,6,16,969392974,1086466937,false +2019,6,16,2115288405,2097246252,false +2019,6,16,2108657064,863367686,false +2019,6,16,969538798,1445086738,false +2019,6,16,867664037,1289456401,true +2019,6,16,1938760102,571969875,false +2019,6,16,1717138676,139290350,false +2019,6,17,1066556109,156865572,true +2019,6,17,908837847,1783395642,false +2019,6,17,2125356798,1494387427,true +2019,6,17,1101099462,1762327352,true +2019,6,17,275776355,1735580663,false +2019,6,17,414485200,422785600,false +2019,6,17,1372687338,1495318451,true +2019,6,17,493153031,1815057846,false +2019,6,17,39998645,303556627,false +2019,6,17,1812306574,204098629,true +2019,6,18,1828270890,1734816124,false +2019,6,18,718628904,1316387408,true +2019,6,18,1850980713,1905908919,true 
+2019,6,18,338679387,1247976402,true +2019,6,18,1131671835,1048547293,true +2019,6,18,1816301182,1145492016,false +2019,6,18,125582032,70266620,true +2019,6,18,429634125,1064398494,true +2019,6,18,1097880933,820369039,true +2019,6,18,582901724,951221294,true +2019,6,19,1049179442,1609347371,false +2019,6,19,1360911611,1575728286,true +2019,6,19,1642386662,228873957,true +2019,6,19,260830831,489927038,false +2019,6,19,257202781,815575136,false +2019,6,19,318012546,1314086734,true +2019,6,19,1613027294,1612073096,false +2019,6,19,722745822,1751468696,true +2019,6,19,1974103665,1376673632,false +2019,6,19,2045549240,1083465534,false +2019,6,20,176959579,48119153,true +2019,6,20,990071202,1122480232,false +2019,6,20,689662693,1045236407,true +2019,6,20,782015662,697494496,true +2019,6,20,1831372008,2121209126,true +2019,6,20,1484036259,80135608,false +2019,6,20,2137519070,396218902,true +2019,6,20,298443116,2052915600,true +2019,6,20,410155890,1006848825,false +2019,6,20,519239189,860050651,true +2019,6,21,1927728050,345629420,false +2019,6,21,1140122600,430784997,true +2019,6,21,557756538,30915838,false +2019,6,21,1120731025,1528614298,false +2019,6,21,883896503,1928491074,false +2019,6,21,32013384,752198199,true +2019,6,21,407946882,1546928779,true +2019,6,21,169195639,1855872868,false +2019,6,21,58610001,7726044,false +2019,6,21,642128986,152867953,true +2019,6,22,1772689841,1266530038,false +2019,6,22,752632451,1157938097,true +2019,6,22,1049379387,1796539534,false +2019,6,22,1101144913,609719833,false +2019,6,22,704635849,695880651,true +2019,6,22,133493678,1385172151,true +2019,6,22,151839510,1160383965,true +2019,6,22,756589761,559567262,true +2019,6,22,1948703819,1918518220,true +2019,6,22,525944136,301984849,true +2019,6,23,1848466736,1684206806,false +2019,6,23,563363548,1693040097,false +2019,6,23,1149018345,1346138349,false +2019,6,23,1928026196,294479660,true +2019,6,23,406792342,346773043,true +2019,6,23,932043759,679929862,true +2019,6,23,914059820,621220239,false +2019,6,23,663554851,1660508212,false +2019,6,23,246268764,512346540,false +2019,6,23,454392600,466965372,true +2019,6,24,737611879,634054908,true +2019,6,24,3791924,1479780414,false +2019,6,24,1872004908,785880730,false +2019,6,24,767122506,433921099,false +2019,6,24,118736092,398438178,true +2019,6,24,1416500810,280641015,true +2019,6,24,160319757,365495763,false +2019,6,24,379726723,925976478,true +2019,6,24,463659820,679359380,true +2019,6,24,1406992847,1196730893,true +2019,6,25,317239158,573033665,true +2019,6,25,1345849164,1465314961,true +2019,6,25,2057260804,1420677055,true +2019,6,25,1888163689,2021907356,false +2019,6,25,1661037204,480792005,true +2019,6,25,1696431729,350891836,false +2019,6,25,1866387974,422052969,true +2019,6,25,688943067,404449499,false +2019,6,25,461093591,1537583563,true +2019,6,25,729997571,254414848,false +2019,6,26,1890799094,1574854256,false +2019,6,26,1381166367,1682192392,false +2019,6,26,393277891,1385160494,true +2019,6,26,688226443,1124909329,true +2019,6,26,2055804711,1590951616,true +2019,6,26,330225856,98473105,true +2019,6,26,1761532150,1597141455,false +2019,6,26,1102822288,67233861,true +2019,6,26,130941732,1635606933,true +2019,6,26,1871180669,907803780,true +2019,6,27,1071852405,1659983985,true +2019,6,27,637005900,727245596,true +2019,6,27,924252066,1930838259,true +2019,6,27,1306276380,2004300345,false +2019,6,27,141266235,611134295,true +2019,6,27,1928442538,1394420495,false +2019,6,27,391120294,1245097571,true +2019,6,27,736928652,153837118,false 
+2019,6,27,388345092,938397728,false +2019,6,27,1579540868,1013139486,false +2019,6,28,54877427,1516617712,true +2019,6,28,1822964799,1479626115,false +2019,6,28,631995479,801513187,false +2019,6,28,1494972186,397737365,false +2019,6,28,1279370167,920218721,false +2019,6,28,511148742,467408839,false +2019,6,28,98806226,44973190,false +2019,6,28,244592854,852204830,false +2019,6,28,1463320182,692089528,false +2019,6,28,899111341,514636354,true +2019,6,29,894880187,311273060,true +2019,6,29,843409983,1733536781,false +2019,6,29,469390390,1570531564,false +2019,6,29,1134322329,303985300,false +2019,6,29,1074649284,717695776,false +2019,6,29,2085013307,223168231,true +2019,6,29,388012780,1276595682,false +2019,6,29,641709729,641983982,false +2019,6,29,873399581,135023811,false +2019,6,29,280070538,385937204,false +2019,6,30,1845508135,279001778,false +2019,6,30,1182330858,1884250240,true +2019,6,30,976873920,1568196637,true +2019,6,30,498380159,1991290759,true +2019,6,30,70894076,893390616,true +2019,6,30,241746054,1036957986,true +2019,6,30,1631365112,1760705882,false +2019,6,30,1575430657,90816971,false +2019,6,30,615885873,43267384,false +2019,6,30,905089678,431859014,false +2019,6,31,1373604660,345388515,false +2019,6,31,2099441602,149103568,true +2019,6,31,1077718446,963303575,true +2019,6,31,1683680637,618105118,true +2019,6,31,993338123,699138936,true +2019,6,31,1257276350,1248445973,true +2019,6,31,178973216,193408086,true +2019,6,31,504601808,537137162,false +2019,6,31,1872297872,72823947,false +2019,6,31,969167095,1018904792,true +2019,7,1,1630972911,1079614640,false +2019,7,1,978179037,513284420,true +2019,7,1,990870488,191535584,true +2019,7,1,222563695,681327452,false +2019,7,1,547994211,124061623,true +2019,7,1,422955760,1252967631,true +2019,7,1,785203912,1853561186,false +2019,7,1,1320930918,573567354,true +2019,7,1,905877522,81221963,false +2019,7,1,1931645356,1213508765,true +2019,7,2,1588427300,317508168,false +2019,7,2,1159178197,1910145900,false +2019,7,2,1456551309,2023499356,false +2019,7,2,2143818400,1893832192,false +2019,7,2,845791189,1617696361,false +2019,7,2,1080029995,487062485,false +2019,7,2,364700747,1814897505,true +2019,7,2,64521257,2129340243,true +2019,7,2,2009032059,995613916,true +2019,7,2,521902407,1003370516,false +2019,7,3,1633129942,1658688024,false +2019,7,3,1757266568,150302185,false +2019,7,3,1116837428,1632686548,false +2019,7,3,22870966,1514675915,false +2019,7,3,1732031284,43188864,true +2019,7,3,912936772,1437351990,true +2019,7,3,1992839851,917346234,false +2019,7,3,819681582,1174196394,true +2019,7,3,398989465,1393492756,false +2019,7,3,804581825,876791869,false +2019,7,4,459829092,205147532,true +2019,7,4,1251199307,48611201,true +2019,7,4,1996226891,1733991497,false +2019,7,4,33633569,1859437114,true +2019,7,4,23658822,1621986933,false +2019,7,4,876751753,1324537734,true +2019,7,4,601026517,1120777034,true +2019,7,4,857882459,126827408,false +2019,7,4,2036261994,31202990,false +2019,7,4,1104389506,1485055716,false +2019,7,5,892395120,1758572531,false +2019,7,5,598523291,1815928309,true +2019,7,5,326584611,1412415893,true +2019,7,5,604081257,1249161092,false +2019,7,5,855983022,1998586380,true +2019,7,5,1762648538,1658065628,true +2019,7,5,1806070606,75298060,true +2019,7,5,617403846,1765953311,false +2019,7,5,255440745,1025139143,false +2019,7,5,839691709,756255523,true +2019,7,6,849876955,1447464468,false +2019,7,6,1339517812,1994058117,true +2019,7,6,1694450667,590604156,false +2019,7,6,1648232114,1431566078,false 
+2019,7,6,1236468892,1879929513,true +2019,7,6,1054473944,1608466728,false +2019,7,6,1244573535,712464036,false +2019,7,6,1838894921,599521828,false +2019,7,6,1557265998,95808736,false +2019,7,6,1290942307,872425964,false +2019,7,7,1308510584,2034776084,false +2019,7,7,624067024,1295362104,true +2019,7,7,1425015014,395531071,false +2019,7,7,964581451,633012786,true +2019,7,7,243927392,248094587,false +2019,7,7,243664947,1025709434,true +2019,7,7,1359891998,1888354909,true +2019,7,7,737196422,979803683,true +2019,7,7,466683876,1830729757,false +2019,7,7,872795411,739887023,false +2019,7,8,336524863,1616307131,false +2019,7,8,1488367624,343544122,true +2019,7,8,924688862,188642514,true +2019,7,8,1212301861,421729102,false +2019,7,8,608094001,1559690722,true +2019,7,8,456900925,1702560277,false +2019,7,8,965463778,1016397626,true +2019,7,8,1543198669,1281470700,true +2019,7,8,1368772032,981518782,false +2019,7,8,1836793289,1431401597,false +2019,7,9,1875605548,644530754,true +2019,7,9,709665275,835393258,false +2019,7,9,968742193,897337658,true +2019,7,9,297246591,807089588,true +2019,7,9,882039754,1547840476,true +2019,7,9,993207877,220029789,true +2019,7,9,1536332038,1444289308,false +2019,7,9,974145184,1145132798,false +2019,7,9,1119307782,2085782011,false +2019,7,9,1824558918,94429758,true +2019,7,10,960453901,610600995,false +2019,7,10,1259041365,1187406450,false +2019,7,10,600881943,657942221,false +2019,7,10,734183734,483818451,true +2019,7,10,781893551,713082206,true +2019,7,10,755208303,1036269059,true +2019,7,10,1558655752,286910525,false +2019,7,10,381029323,1731368797,true +2019,7,10,1932008056,2095572944,false +2019,7,10,1140605022,790409418,false +2019,7,11,1747892712,708916904,true +2019,7,11,2126738767,1974353412,false +2019,7,11,444347618,370296130,true +2019,7,11,955557231,632856214,true +2019,7,11,361331251,274620127,true +2019,7,11,1133763330,982419968,true +2019,7,11,1424067943,474496616,true +2019,7,11,43639883,2023001527,true +2019,7,11,1269131137,340239197,true +2019,7,11,623108787,1334095368,true +2019,7,12,725982406,36613688,false +2019,7,12,679864236,915110662,false +2019,7,12,492128110,812994762,true +2019,7,12,465828250,662118656,false +2019,7,12,384858004,1881832070,false +2019,7,12,1262209121,915111993,true +2019,7,12,1249584841,1051404720,false +2019,7,12,256564344,924760542,false +2019,7,12,898398727,403571172,true +2019,7,12,521434245,1072230502,true +2019,7,13,767330049,370066813,true +2019,7,13,1551394034,493406174,false +2019,7,13,785639936,1252909753,true +2019,7,13,1888186876,672698069,false +2019,7,13,814141388,1639743161,true +2019,7,13,802789418,2072081127,false +2019,7,13,1360444247,692180897,true +2019,7,13,1366063,1819921419,false +2019,7,13,434426138,1912196232,false +2019,7,13,158216863,1959260220,false +2019,7,14,1190322795,538551556,true +2019,7,14,840621320,1850094868,false +2019,7,14,713843001,1655513955,false +2019,7,14,689646919,1022908263,false +2019,7,14,1364058345,2121094138,false +2019,7,14,942838065,1853063440,true +2019,7,14,1167942473,671660307,true +2019,7,14,450689003,2127373155,false +2019,7,14,253655260,892024414,false +2019,7,14,440706953,93972737,false +2019,7,15,2047459811,966182895,true +2019,7,15,1072653507,461155337,true +2019,7,15,1730261494,1172140080,true +2019,7,15,1552966631,318813651,true +2019,7,15,1086859782,458435927,false +2019,7,15,687898962,1946493885,false +2019,7,15,544120093,676594225,false +2019,7,15,22285586,1678717101,true +2019,7,15,537498178,930578288,true +2019,7,15,977197187,1282663769,true 
+2019,7,16,205554234,1525649850,true +2019,7,16,1170701825,885547383,false +2019,7,16,401332776,1096021537,true +2019,7,16,1038801413,1736121946,true +2019,7,16,1870619710,460614108,false +2019,7,16,1228059216,552355757,true +2019,7,16,298978801,1967178574,false +2019,7,16,1380344548,1106977098,false +2019,7,16,600508277,376730046,true +2019,7,16,897804802,1444510051,false +2019,7,17,1424300653,1520481876,true +2019,7,17,361538409,853732462,true +2019,7,17,1616176655,383915059,false +2019,7,17,169492606,1746293912,true +2019,7,17,1619393839,1851213510,false +2019,7,17,1665654324,137736368,true +2019,7,17,1625931500,1199395995,false +2019,7,17,920179396,1400975039,false +2019,7,17,753712375,624517734,false +2019,7,17,723631255,911828175,true +2019,7,18,2137762473,164072700,true +2019,7,18,1174917360,785067659,false +2019,7,18,722318944,975280742,true +2019,7,18,1735275573,257208517,true +2019,7,18,2146034408,1531023713,false +2019,7,18,1119164145,533838844,false +2019,7,18,1936263372,162368480,false +2019,7,18,40909276,1150336377,true +2019,7,18,733718500,254489714,false +2019,7,18,1307519114,736539304,false +2019,7,19,937308044,25403283,false +2019,7,19,1240645023,1129165990,true +2019,7,19,851963956,436889940,false +2019,7,19,1124379786,1390395978,false +2019,7,19,457184083,697662689,true +2019,7,19,579316792,712965388,true +2019,7,19,498323167,1512799940,false +2019,7,19,1357628232,2048502161,true +2019,7,19,846471133,778357765,true +2019,7,19,1943138717,343310037,false +2019,7,20,1313717154,911421295,false +2019,7,20,1483625570,75249442,true +2019,7,20,1520511451,1394418585,true +2019,7,20,921614560,1877245002,true +2019,7,20,796141950,1597932436,false +2019,7,20,317566902,112737519,true +2019,7,20,1050226979,1822375240,false +2019,7,20,1419601229,1853606583,false +2019,7,20,1227495301,982237149,false +2019,7,20,1203838369,1970332092,false +2019,7,21,1736133398,849268176,false +2019,7,21,1131217086,876640156,false +2019,7,21,449611749,2072173947,true +2019,7,21,94386807,67501295,false +2019,7,21,1640579409,1693817915,true +2019,7,21,2084117314,224994837,false +2019,7,21,2057832757,690702208,true +2019,7,21,762526019,1753573528,true +2019,7,21,1024897175,680244318,false +2019,7,21,1172264942,1779019916,true +2019,7,22,163954176,2121259393,true +2019,7,22,1938319809,36009409,true +2019,7,22,692958207,602787987,true +2019,7,22,630334211,716692126,false +2019,7,22,988515984,1644256109,true +2019,7,22,114269593,1171451639,true +2019,7,22,1837590618,1864390323,true +2019,7,22,147082562,675356490,true +2019,7,22,1455124733,359855784,true +2019,7,22,446253277,370234649,true +2019,7,23,1910821864,1605272662,false +2019,7,23,1436321931,513468257,true +2019,7,23,316273525,385048683,true +2019,7,23,1236302277,854344289,false +2019,7,23,737082939,950183503,false +2019,7,23,1215234493,1488237610,false +2019,7,23,827094221,742962528,false +2019,7,23,2025587625,306519984,false +2019,7,23,1275806827,1228765026,false +2019,7,23,1831244746,1274939988,false +2019,7,24,236173129,1234453113,false +2019,7,24,618953024,849385503,false +2019,7,24,2126722428,606601800,false +2019,7,24,604917785,797452578,true +2019,7,24,229095849,1612625681,false +2019,7,24,705286680,1224439490,true +2019,7,24,641565452,1186767590,true +2019,7,24,1062164513,2146294226,false +2019,7,24,2108031605,1917090887,false +2019,7,24,480040790,526310079,true +2019,7,25,1670369857,1411449545,false +2019,7,25,586564675,383378254,true +2019,7,25,1274792329,650980759,false +2019,7,25,230011823,472378768,false 
+2019,7,25,751270130,1568943334,false +2019,7,25,1773048807,1465839127,true +2019,7,25,240757732,281571002,true +2019,7,25,125977254,1803519893,false +2019,7,25,535393045,388568208,true +2019,7,25,1574560475,261743726,true +2019,7,26,1712248237,1186306752,true +2019,7,26,1944320775,1133670110,false +2019,7,26,754457705,876030905,true +2019,7,26,1316741097,1082792644,false +2019,7,26,29740679,2105739371,true +2019,7,26,2051969188,50011191,true +2019,7,26,92764773,1893196909,false +2019,7,26,451100855,1994968164,false +2019,7,26,692409367,789897166,false +2019,7,26,1542302031,767177332,false +2019,7,27,1070539360,1848719756,true +2019,7,27,742600419,1196669024,false +2019,7,27,495709907,1640152358,false +2019,7,27,1896431897,1918047230,true +2019,7,27,297469954,1704106696,false +2019,7,27,1658317407,630516409,false +2019,7,27,329067788,735485177,true +2019,7,27,1204656585,879116959,true +2019,7,27,242925547,854251487,true +2019,7,27,1643420938,242482643,true +2019,7,28,6533183,1154403693,false +2019,7,28,2118439327,380859786,false +2019,7,28,179896993,966453017,true +2019,7,28,1152883600,2106286103,true +2019,7,28,1382250649,1396170987,true +2019,7,28,1588952395,168399279,false +2019,7,28,691764159,2100470533,true +2019,7,28,2036174724,559311338,true +2019,7,28,14424776,446881269,false +2019,7,28,2046484776,328396861,true +2019,7,29,258814301,1490672496,true +2019,7,29,612583100,793989071,false +2019,7,29,564841482,1724328004,true +2019,7,29,101230014,1231069048,false +2019,7,29,662966202,1331254694,false +2019,7,29,358163436,753286360,true +2019,7,29,100167791,948363046,false +2019,7,29,317006711,441944509,false +2019,7,29,2026661169,1512347628,true +2019,7,29,286944390,1340678482,false +2019,7,30,1674057586,881831190,true +2019,7,30,1027252640,1827581651,true +2019,7,30,510351963,581868606,false +2019,7,30,1548087249,666763697,false +2019,7,30,1846448824,49366173,true +2019,7,30,1181820067,952490868,true +2019,7,30,1843114624,100189766,true +2019,7,30,878494536,1032835513,false +2019,7,30,945558312,1350436456,true +2019,7,30,1716496260,675318556,false +2019,7,31,1757831236,1508685981,false +2019,7,31,1787760694,74459119,true +2019,7,31,344782892,258051243,false +2019,7,31,481090713,1665196049,false +2019,7,31,635615887,1978590534,false +2019,7,31,2147364495,1349408064,false +2019,7,31,581909829,1095505845,false +2019,7,31,972870030,689626547,false +2019,7,31,1115708861,661714279,true +2019,7,31,1146521873,853454063,true +2019,8,1,1536045526,1411454933,false +2019,8,1,382341785,1210077446,true +2019,8,1,108987656,1197499466,false +2019,8,1,1085998747,1779370181,false +2019,8,1,1868982057,2005028973,false +2019,8,1,1940269305,721032062,true +2019,8,1,1460070450,962521492,true +2019,8,1,546239694,2010633587,true +2019,8,1,1247805638,1461211220,true +2019,8,1,135696566,698586122,false +2019,8,2,361654647,2030014149,false +2019,8,2,1088073698,622277451,true +2019,8,2,30790390,408697712,false +2019,8,2,1186496916,59363940,false +2019,8,2,417341362,386698224,true +2019,8,2,1125286859,2087581472,true +2019,8,2,1449393043,472261012,true +2019,8,2,310847320,71776407,false +2019,8,2,421955790,1616857019,true +2019,8,2,439255693,751605627,false +2019,8,3,24699154,1369333400,false +2019,8,3,1813849407,1660550865,true +2019,8,3,113307761,1817515674,true +2019,8,3,1854229468,1757434368,true +2019,8,3,1206895571,1520943031,true +2019,8,3,959676603,662898027,false +2019,8,3,974083811,833131847,true +2019,8,3,1190855159,574559519,true +2019,8,3,139742632,246114862,true +2019,8,3,197480304,1867993399,false 
+2019,8,4,1756980292,145799814,true +2019,8,4,1108404166,1747646975,false +2019,8,4,1628621731,1092534034,true +2019,8,4,1818281071,523671049,false +2019,8,4,259437551,1492909080,false +2019,8,4,1752785971,1798068903,true +2019,8,4,1934163493,40286322,false +2019,8,4,1938520242,642612105,true +2019,8,4,239928361,422519548,true +2019,8,4,440886982,387174511,false +2019,8,5,2020059940,1159953325,true +2019,8,5,469306072,1548388407,true +2019,8,5,617813519,1648143085,false +2019,8,5,417294724,1425254055,false +2019,8,5,1302159036,212999445,true +2019,8,5,1911058978,217636543,false +2019,8,5,747456959,498256153,false +2019,8,5,1203473417,348818822,false +2019,8,5,509019642,756142277,false +2019,8,5,1049469102,1934375296,true +2019,8,6,1294680773,1631811103,false +2019,8,6,1981514725,472881493,true +2019,8,6,21188580,227869888,true +2019,8,6,1211781362,699256919,true +2019,8,6,697049288,1259018060,true +2019,8,6,555507252,1177672157,false +2019,8,6,527565005,673062988,false +2019,8,6,1413618300,134878385,true +2019,8,6,1899567222,968336150,false +2019,8,6,1639140993,667686414,true +2019,8,7,2107679703,995812493,true +2019,8,7,2107320809,266039249,true +2019,8,7,1757182125,1959484632,true +2019,8,7,568461020,1076328432,true +2019,8,7,1254770377,1757456997,true +2019,8,7,1044264995,360083294,true +2019,8,7,2015193067,1259045840,true +2019,8,7,963794233,481178831,false +2019,8,7,1725967380,204929419,true +2019,8,7,1753586911,1785728048,false +2019,8,8,169272824,989630806,true +2019,8,8,1775895052,1390509475,false +2019,8,8,1332584224,1285895402,false +2019,8,8,1283758166,827783004,true +2019,8,8,1956306385,1650123359,false +2019,8,8,198650915,1564754878,false +2019,8,8,403789532,926636092,false +2019,8,8,180695107,1835342206,true +2019,8,8,68703192,285730866,true +2019,8,8,220241348,1236704697,true +2019,8,9,1214976645,747438824,false +2019,8,9,625753118,864464966,false +2019,8,9,1661860923,1764310122,true +2019,8,9,40687368,244570273,false +2019,8,9,1385979062,1001475097,true +2019,8,9,1483766935,1800975942,true +2019,8,9,250660259,1148728364,false +2019,8,9,928369014,621248861,true +2019,8,9,721920177,1976638495,true +2019,8,9,574282325,1221624058,true +2019,8,10,2123899934,356372137,false +2019,8,10,1803452643,1520169571,true +2019,8,10,1573977581,1092683530,false +2019,8,10,938158064,2019213593,true +2019,8,10,2105026638,1936726742,true +2019,8,10,807089546,1581199772,false +2019,8,10,1057523168,779281541,true +2019,8,10,1897792301,1906856220,false +2019,8,10,104050386,1662742635,false +2019,8,10,1943241575,1579231892,true +2019,8,11,429248526,2107023600,false +2019,8,11,1982706041,122912268,false +2019,8,11,1589588981,1676875802,true +2019,8,11,1378468569,673222887,true +2019,8,11,356061267,2078868418,false +2019,8,11,788599720,1028175504,true +2019,8,11,883365782,1641168658,true +2019,8,11,1080147576,2117727976,true +2019,8,11,528358253,586311454,false +2019,8,11,265358575,1269419062,false +2019,8,12,424828247,1717522867,false +2019,8,12,25969203,399959187,false +2019,8,12,345114061,873897139,true +2019,8,12,1438513440,972763906,false +2019,8,12,814560522,1277761939,false +2019,8,12,790904750,1211169461,false +2019,8,12,2110539935,525499353,true +2019,8,12,211553828,1561258805,true +2019,8,12,528799576,1589000881,false +2019,8,12,550339547,897543770,true +2019,8,13,849910040,1729811690,false +2019,8,13,761709758,328270175,true +2019,8,13,1407746740,408742905,true +2019,8,13,360987095,2090035320,true +2019,8,13,1613159936,1071891934,false +2019,8,13,1083221119,664129704,false 
+2019,8,13,1650041984,965871352,false +2019,8,13,1881650952,488773082,true +2019,8,13,1078652800,803646506,true +2019,8,13,69603077,827487067,false +2019,8,14,1559722622,217303777,true +2019,8,14,1510100084,2088858446,false +2019,8,14,436604204,11022298,true +2019,8,14,313976440,377228724,false +2019,8,14,1189132409,1927671488,true +2019,8,14,620960100,458170899,true +2019,8,14,1601189460,1892099053,true +2019,8,14,947137350,1424678636,false +2019,8,14,1681915444,107077373,true +2019,8,14,1175329594,2123812989,true +2019,8,15,2132791432,117454834,true +2019,8,15,1864625438,608846978,false +2019,8,15,404629907,1843872276,false +2019,8,15,1666315919,181928683,false +2019,8,15,1004685418,1884403424,true +2019,8,15,582953771,1142365131,true +2019,8,15,1443352840,1853759635,false +2019,8,15,1681536400,922831083,false +2019,8,15,1180211384,1879428922,false +2019,8,15,1190821779,254946749,false +2019,8,16,1037136449,784157781,true +2019,8,16,362280858,727263915,true +2019,8,16,2085020570,308355172,true +2019,8,16,618902063,1729925368,true +2019,8,16,665854224,1946477309,false +2019,8,16,775573013,2090556067,true +2019,8,16,1543408168,1057392960,true +2019,8,16,1986236044,1557147158,false +2019,8,16,1722991259,265684918,false +2019,8,16,1806353038,847968129,false +2019,8,17,359558719,672166925,false +2019,8,17,387225890,2119094836,false +2019,8,17,329916613,1436954665,true +2019,8,17,895074841,1797319254,true +2019,8,17,139807366,1403722625,true +2019,8,17,364675028,481721281,true +2019,8,17,1392271836,1308467391,true +2019,8,17,192539109,2089790719,true +2019,8,17,906906812,149062948,false +2019,8,17,1930387880,556387719,true +2019,8,18,1358606468,291370009,true +2019,8,18,1331196193,282330325,false +2019,8,18,666394180,1654271216,true +2019,8,18,544819656,1969738744,true +2019,8,18,39198692,1066846669,false +2019,8,18,980167694,2041871768,true +2019,8,18,1699530483,1451466279,false +2019,8,18,2142992825,566659258,false +2019,8,18,2105743375,2115237986,false +2019,8,18,685499493,1546104986,true +2019,8,19,1641682457,1990489460,false +2019,8,19,2037931738,2074591349,true +2019,8,19,913888949,1016192654,false +2019,8,19,2032141036,1247519784,true +2019,8,19,717520334,1943242821,true +2019,8,19,1485983520,800091868,false +2019,8,19,1283772427,1119989679,false +2019,8,19,1843578452,1011273534,false +2019,8,19,1441074302,63450472,true +2019,8,19,700868880,541486330,false +2019,8,20,35650552,637589384,false +2019,8,20,916292437,1678085483,false +2019,8,20,719679672,1462100709,false +2019,8,20,2104374013,761343423,false +2019,8,20,2098617217,1134285117,true +2019,8,20,2145954317,1032654594,false +2019,8,20,294007603,278514971,true +2019,8,20,1141888388,617483467,true +2019,8,20,2048868983,493227409,false +2019,8,20,1665633429,481235331,true +2019,8,21,249140002,1994836766,false +2019,8,21,547228997,468263802,false +2019,8,21,645869919,2063451230,true +2019,8,21,1890067173,1497885768,false +2019,8,21,1629843642,1843403856,false +2019,8,21,473837540,62378032,false +2019,8,21,347229090,1427885397,false +2019,8,21,1192904682,335199604,true +2019,8,21,1397944833,521261886,true +2019,8,21,710664256,1217027046,true +2019,8,22,179353078,1226103800,true +2019,8,22,799615974,911590169,true +2019,8,22,399691225,1601278397,true +2019,8,22,2147438066,219048438,true +2019,8,22,1365010700,1953675990,true +2019,8,22,302315528,601566377,true +2019,8,22,904489292,86592070,false +2019,8,22,1644436488,139206180,true +2019,8,22,1498563707,310993413,false +2019,8,22,134864716,913895584,false 
+2019,8,23,1437479706,2090926734,false +2019,8,23,1379158787,614737133,false +2019,8,23,1777233389,1399129317,true +2019,8,23,678238356,1286894191,false +2019,8,23,1827531137,1872483178,true +2019,8,23,1588314639,1171065363,true +2019,8,23,538331273,195261203,false +2019,8,23,858750472,1651836075,false +2019,8,23,332379724,1774690786,true +2019,8,23,1485198209,323127657,true +2019,8,24,193499296,1392273580,false +2019,8,24,824757675,1932170103,true +2019,8,24,567889183,1019628947,false +2019,8,24,1240684695,1977206236,false +2019,8,24,256796571,1252633028,false +2019,8,24,1682818879,357303087,false +2019,8,24,1425438721,1606045771,false +2019,8,24,798965867,947491523,true +2019,8,24,1823366181,1355351334,false +2019,8,24,1702798170,1074990934,true +2019,8,25,1253883435,83888863,false +2019,8,25,557100601,797850472,true +2019,8,25,1005894293,829921869,true +2019,8,25,1207082953,758232514,false +2019,8,25,1250169104,1094204239,false +2019,8,25,935592658,519337175,false +2019,8,25,63091102,621215395,true +2019,8,25,385391910,88046086,false +2019,8,25,2069474222,2053068830,true +2019,8,25,2053729632,621241831,true +2019,8,26,580917050,1804213753,false +2019,8,26,33819061,118164769,false +2019,8,26,92966931,1101057259,false +2019,8,26,1624170332,24768902,false +2019,8,26,662821332,1146155590,true +2019,8,26,1243125124,873932133,false +2019,8,26,449269555,136598612,true +2019,8,26,1667792845,300658505,false +2019,8,26,1629055413,587165348,true +2019,8,26,1004432322,529502889,false +2019,8,27,1811089129,1535044365,true +2019,8,27,164067780,855522868,false +2019,8,27,21997233,2130524553,true +2019,8,27,675016031,892721843,true +2019,8,27,382452299,184031292,false +2019,8,27,995639828,1699732934,true +2019,8,27,1720196626,742986306,false +2019,8,27,1400973542,1667426784,true +2019,8,27,570599474,146245400,false +2019,8,27,764850261,409762121,true +2019,8,28,1869587653,1904475574,false +2019,8,28,1715243927,1250272278,true +2019,8,28,844014602,1248734657,true +2019,8,28,1735624476,1958646104,false +2019,8,28,1013684675,1032689240,true +2019,8,28,871860168,2109497836,true +2019,8,28,1757705432,1500085602,false +2019,8,28,1402226548,1997385649,true +2019,8,28,1132289219,176954221,true +2019,8,28,278735998,394896920,true +2019,8,29,1753960073,594958213,true +2019,8,29,2105502624,1718112573,true +2019,8,29,81249566,1408768896,true +2019,8,29,1315695176,1400018712,true +2019,8,29,438139266,701285558,false +2019,8,29,804021525,1186165342,false +2019,8,29,800761091,343725603,true +2019,8,29,2035220827,1976568101,true +2019,8,29,317234277,313305180,false +2019,8,29,1871706546,952248696,false +2019,8,30,1554687196,779333864,false +2019,8,30,311645428,1317086293,true +2019,8,30,1804704800,1386147285,false +2019,8,30,219864130,2064382915,true +2019,8,30,873071360,1907514271,false +2019,8,30,1180606442,1572926223,false +2019,8,30,194788925,79030992,true +2019,8,30,1845074997,1578686544,true +2019,8,30,2058421147,1116766162,true +2019,8,30,610223762,744383669,false +2019,8,31,2029975909,1122693871,true +2019,8,31,2096777370,1405478233,true +2019,8,31,1907039450,867374974,true +2019,8,31,1544962520,1715507340,false +2019,8,31,1803526198,35018909,false +2019,8,31,361587296,1256098765,true +2019,8,31,653273946,1167116051,true +2019,8,31,1680072603,666322159,false +2019,8,31,1221972550,2100975404,false +2019,8,31,1831139790,929656092,false +2019,9,1,2064595667,1728304814,true +2019,9,1,274945097,802827957,false +2019,9,1,1035755384,1932567092,false +2019,9,1,427692864,364457792,false 
+2019,9,1,2139795201,1356572309,true +2019,9,1,765483452,342303967,true +2019,9,1,59967870,1971164712,true +2019,9,1,262516516,1762478860,false +2019,9,1,2069616852,1450752756,false +2019,9,1,1083004756,2053792903,false +2019,9,2,406212527,1482376762,true +2019,9,2,1770787125,1620527289,true +2019,9,2,1928741848,1314110167,false +2019,9,2,2039992149,1522649349,true +2019,9,2,2021863692,1211753328,false +2019,9,2,372646646,223155887,true +2019,9,2,1869694951,706805352,true +2019,9,2,93064579,645811299,false +2019,9,2,1943605389,1765423367,false +2019,9,2,1582697842,844195473,false +2019,9,3,1635298061,911181930,true +2019,9,3,1290294012,763714564,false +2019,9,3,1108617137,919222792,true +2019,9,3,59336242,1864842380,false +2019,9,3,546004924,522946293,true +2019,9,3,1586166613,2111688080,false +2019,9,3,1680814918,499833898,false +2019,9,3,921471871,815668448,false +2019,9,3,2049026865,1043445620,true +2019,9,3,391010041,1544192035,false +2019,9,4,1815823988,1069661159,true +2019,9,4,1398803417,1235780527,false +2019,9,4,517095462,660752860,false +2019,9,4,1013462718,1358362038,false +2019,9,4,1776873489,718292450,false +2019,9,4,738604216,1247712006,false +2019,9,4,565713338,1675566895,true +2019,9,4,1396860105,1082683597,true +2019,9,4,1739988512,230515064,true +2019,9,4,1868833922,126559017,false +2019,9,5,544285341,135954225,false +2019,9,5,1681688228,1092641379,false +2019,9,5,272133035,2056116388,false +2019,9,5,242416589,1518969124,false +2019,9,5,72348800,1200660841,false +2019,9,5,1604265117,8598547,true +2019,9,5,1428931779,1379072280,false +2019,9,5,1410009559,1848600463,false +2019,9,5,1745435581,297248090,false +2019,9,5,574280290,1911077512,false +2019,9,6,1361677524,570305178,false +2019,9,6,1336676418,473660833,false +2019,9,6,1551862594,46061059,false +2019,9,6,721269978,599383079,true +2019,9,6,1265939247,1963685532,true +2019,9,6,1845164564,93579385,true +2019,9,6,731294915,1109349525,false +2019,9,6,1098178887,1441347578,false +2019,9,6,1633378770,1106525472,false +2019,9,6,1956553199,1890581160,false +2019,9,7,2104080672,539330164,true +2019,9,7,1609978266,1814747360,false +2019,9,7,747582074,1293325003,false +2019,9,7,364020621,1127435779,true +2019,9,7,1662878996,364634737,false +2019,9,7,810773013,1573994739,false +2019,9,7,2143360492,718603335,true +2019,9,7,1276081279,2016236455,true +2019,9,7,1192439329,102562117,false +2019,9,7,1353927719,1476679994,true +2019,9,8,1694914944,1487256286,false +2019,9,8,611303463,166569549,true +2019,9,8,1823790008,1944835409,false +2019,9,8,617575045,2057376145,false +2019,9,8,978351855,2027372116,true +2019,9,8,1395999475,1462078160,true +2019,9,8,1589168300,164040085,true +2019,9,8,1736725125,1284947157,true +2019,9,8,298949214,685527045,false +2019,9,8,1788174556,1969441532,false +2019,9,9,298174681,83634158,true +2019,9,9,1991966399,642348372,false +2019,9,9,1153698474,905419584,false +2019,9,9,2087187231,823416474,false +2019,9,9,1327300369,2127439509,true +2019,9,9,1633683175,396301684,false +2019,9,9,15917046,1703098083,true +2019,9,9,1426819386,593974969,true +2019,9,9,798287687,1493916884,true +2019,9,9,1101467877,2527394,false +2019,9,10,47625331,1578054711,false +2019,9,10,218274089,1793040190,true +2019,9,10,386029823,1823729419,false +2019,9,10,243394358,981448830,false +2019,9,10,1364802416,2054010638,false +2019,9,10,1280998114,1114614447,true +2019,9,10,325797366,410214883,true +2019,9,10,859109511,2097920581,false +2019,9,10,1134488226,1757041435,true +2019,9,10,14043919,650153020,false 
+2019,9,11,642316558,535674065,false +2019,9,11,1150452879,1362944127,false +2019,9,11,516825525,582855314,false +2019,9,11,1300041995,405308431,false +2019,9,11,16114044,2140685261,true +2019,9,11,903934510,886465769,false +2019,9,11,1609530387,1006471681,true +2019,9,11,1389085727,603660758,true +2019,9,11,1550136884,307051604,true +2019,9,11,352848580,1565744805,true +2019,9,12,653474036,563412334,false +2019,9,12,1772686334,2006032782,false +2019,9,12,1471217045,691646876,true +2019,9,12,1804867283,2042501281,false +2019,9,12,2044978598,567745201,true +2019,9,12,865196159,1021996689,false +2019,9,12,1160070482,594693143,false +2019,9,12,967645144,1254054981,true +2019,9,12,748074527,184712467,true +2019,9,12,1066085530,977991907,false +2019,9,13,1651873441,598036524,true +2019,9,13,646683495,534215777,false +2019,9,13,2147366849,170569256,false +2019,9,13,410433701,640108729,true +2019,9,13,318501543,1329930288,true +2019,9,13,8119174,734604373,false +2019,9,13,249842285,188217322,true +2019,9,13,498409424,1157037132,true +2019,9,13,176087842,2120736171,false +2019,9,13,1113808971,763391874,true +2019,9,14,1383893963,162880535,false +2019,9,14,633996510,1372560349,false +2019,9,14,809921368,2109081033,false +2019,9,14,1329313465,1718496498,true +2019,9,14,305701079,492285133,true +2019,9,14,1396390437,1792569352,true +2019,9,14,1836967233,1734026526,true +2019,9,14,1783746775,397095747,true +2019,9,14,2110523824,637010531,false +2019,9,14,1231401161,1916028493,false +2019,9,15,860034801,742456990,false +2019,9,15,512668434,1685829850,true +2019,9,15,949840797,510069387,false +2019,9,15,1086325557,689033180,true +2019,9,15,881921096,254722524,false +2019,9,15,2095256277,792634437,true +2019,9,15,1765768545,356904101,false +2019,9,15,1664857819,125011617,false +2019,9,15,179273949,250817667,true +2019,9,15,190704231,1169303792,true +2019,9,16,1077150145,473218918,false +2019,9,16,23826733,774653072,true +2019,9,16,3532327,1724494611,true +2019,9,16,1489077765,2130805773,false +2019,9,16,493845587,694537130,true +2019,9,16,453417947,734394674,true +2019,9,16,1531138542,1542458589,true +2019,9,16,1423033920,428205501,false +2019,9,16,1846840457,725620886,true +2019,9,16,9221198,1993091441,true +2019,9,17,1267392754,1478672192,false +2019,9,17,1901695897,197579396,false +2019,9,17,1642732286,577961202,true +2019,9,17,2071642725,1921263009,true +2019,9,17,1082022361,1102349349,true +2019,9,17,1279573851,1146390764,false +2019,9,17,879215872,1911534022,true +2019,9,17,98693743,1958289871,true +2019,9,17,1333206111,1718462393,false +2019,9,17,359058873,852572024,false +2019,9,18,1712861789,653074995,false +2019,9,18,479628905,640326734,true +2019,9,18,1971300537,227916139,false +2019,9,18,1675519272,1780747979,false +2019,9,18,46219635,1923445908,false +2019,9,18,801657465,966770302,false +2019,9,18,747522890,582691878,true +2019,9,18,1421369086,1151564826,false +2019,9,18,1971205491,354683093,false +2019,9,18,267344043,31630303,false +2019,9,19,1423685770,1161413071,true +2019,9,19,667202891,121771983,true +2019,9,19,671109877,1904076819,false +2019,9,19,2009987051,2066920279,true +2019,9,19,2026284339,459922945,true +2019,9,19,1840957944,1957958783,true +2019,9,19,1539217090,1366155637,true +2019,9,19,1662595650,727021249,false +2019,9,19,547858093,1795693576,false +2019,9,19,85050143,1163475471,false +2019,9,20,490566365,499699787,false +2019,9,20,996576853,2056363191,false +2019,9,20,1012328380,1577436188,false +2019,9,20,513079466,1220131106,true +2019,9,20,419675603,466995045,true 
+2019,9,20,1703376924,835719328,true +2019,9,20,506428776,1389362143,true +2019,9,20,1358671663,433218245,false +2019,9,20,1502232645,1787370933,true +2019,9,20,1472169183,1696290045,false +2019,9,21,1867671598,1282723353,true +2019,9,21,551984138,1552355422,true +2019,9,21,1538755764,1462802472,true +2019,9,21,1887712886,805145584,false +2019,9,21,1124961637,1361208755,true +2019,9,21,165473981,1352157138,true +2019,9,21,1854541196,1015688386,true +2019,9,21,888281538,689035790,true +2019,9,21,668546291,1737365489,false +2019,9,21,2054544261,296445942,true +2019,9,22,1622046067,46537464,false +2019,9,22,958159974,1572736870,true +2019,9,22,2073614129,756947444,true +2019,9,22,1259004733,1418914105,false +2019,9,22,640220627,1205359955,false +2019,9,22,571553227,1958068946,false +2019,9,22,1592936219,1249508720,true +2019,9,22,677008503,474906256,true +2019,9,22,1596727205,1996257346,true +2019,9,22,891207508,1742752637,true +2019,9,23,1960871100,296439978,false +2019,9,23,1672004056,1299260065,true +2019,9,23,704098945,564841023,true +2019,9,23,550574045,717178037,true +2019,9,23,6669274,1668873653,true +2019,9,23,2104715701,1358781747,false +2019,9,23,40345060,1018429735,true +2019,9,23,1435233914,1136887796,true +2019,9,23,1319793374,1214596334,true +2019,9,23,1877265132,144067798,true +2019,9,24,1693550785,789366995,false +2019,9,24,885444718,56281385,false +2019,9,24,1613581830,2146397877,true +2019,9,24,1480057605,559825103,false +2019,9,24,645215050,373376763,true +2019,9,24,1242879594,1831354700,false +2019,9,24,741001934,1303351405,true +2019,9,24,994955657,423751494,true +2019,9,24,1353934592,869537868,true +2019,9,24,2004716807,1427051297,false +2019,9,25,777021339,47448246,false +2019,9,25,388816034,1476498355,false +2019,9,25,448506105,137593543,true +2019,9,25,713438342,213210700,false +2019,9,25,1147841450,1253398250,false +2019,9,25,305074932,709329286,false +2019,9,25,929634930,1488116060,true +2019,9,25,400536518,489878678,true +2019,9,25,407036729,366940630,false +2019,9,25,1855063738,1573254934,false +2019,9,26,274412437,1599425736,true +2019,9,26,1144969863,1844954495,false +2019,9,26,1435755508,928006587,false +2019,9,26,2132827685,569130189,false +2019,9,26,603407500,175285677,true +2019,9,26,145870558,1406702596,true +2019,9,26,1964729762,1461244292,false +2019,9,26,1964061938,629496977,false +2019,9,26,388753607,1486201840,false +2019,9,26,1661911544,1810037426,false +2019,9,27,949467921,1393904457,true +2019,9,27,1094224714,1992359285,true +2019,9,27,1109907593,396820311,false +2019,9,27,1998407700,1666466284,false +2019,9,27,540538687,1839477764,true +2019,9,27,1753169375,1157705633,true +2019,9,27,540656825,784811362,false +2019,9,27,1042083771,320318585,true +2019,9,27,1155975882,1241864850,true +2019,9,27,244335428,1787704447,true +2019,9,28,1356802280,1835339882,true +2019,9,28,797504577,1069866578,false +2019,9,28,69641626,300201453,true +2019,9,28,1325148813,1145278248,false +2019,9,28,1759788958,1768182777,false +2019,9,28,984754049,695683498,true +2019,9,28,1043327181,916434601,true +2019,9,28,1979914024,1319627022,false +2019,9,28,10454250,1187953257,false +2019,9,28,1547529941,390319677,true +2019,9,29,1307063523,1851584429,false +2019,9,29,872461818,1189844715,true +2019,9,29,1980199722,1455706048,true +2019,9,29,2013687636,315225822,true +2019,9,29,677126187,1213471282,false +2019,9,29,1784473533,2067778025,true +2019,9,29,685860375,243201482,false +2019,9,29,110314832,1738703164,false +2019,9,29,1323491694,1340340481,true 
+2019,9,29,1411849370,1696250131,false +2019,9,30,1658114068,594697592,false +2019,9,30,275735311,712842229,true +2019,9,30,406698714,580062953,true +2019,9,30,811703554,1099648764,true +2019,9,30,654716344,338002297,true +2019,9,30,2024338120,1926989533,false +2019,9,30,974860593,1695521919,false +2019,9,30,364723154,1408837550,true +2019,9,30,994141081,2140266007,true +2019,9,30,764237793,583041424,true +2019,9,31,1772641075,234453274,false +2019,9,31,274541991,1456728789,false +2019,9,31,2045473372,333439572,true +2019,9,31,182365032,261960758,true +2019,9,31,663278920,1307663118,true +2019,9,31,795738495,1611091501,false +2019,9,31,1714400506,981423686,false +2019,9,31,1985302480,1233649303,true +2019,9,31,770188658,1381395240,true +2019,9,31,1261041297,80170025,true +2019,10,1,1164238165,1612559023,true +2019,10,1,227932142,798218591,false +2019,10,1,513645427,1988378475,true +2019,10,1,1502620526,900881125,true +2019,10,1,1203438331,740009440,false +2019,10,1,2132131123,1010283955,false +2019,10,1,667810107,92838153,false +2019,10,1,1775985241,1369288527,false +2019,10,1,1495439417,1243496912,false +2019,10,1,1559437610,693733108,true +2019,10,2,1465151381,661225892,false +2019,10,2,1508398819,115464337,false +2019,10,2,1078626660,711954296,false +2019,10,2,1020532277,318825171,true +2019,10,2,1153874705,1215407046,true +2019,10,2,242933963,1357660097,true +2019,10,2,516573796,676692850,true +2019,10,2,1100901252,1613849494,true +2019,10,2,1734241588,1641151134,false +2019,10,2,619470869,1182029772,false +2019,10,3,22132354,697667537,false +2019,10,3,245446742,1697986729,true +2019,10,3,884190687,1350382845,false +2019,10,3,1938108009,1681492384,false +2019,10,3,1934141043,356929347,false +2019,10,3,247669364,1207210344,true +2019,10,3,792985854,2092066705,false +2019,10,3,852890031,1960913050,true +2019,10,3,1558472868,2065171814,true +2019,10,3,952277995,941018667,false +2019,10,4,209312882,964291169,true +2019,10,4,1192618984,1621752623,true +2019,10,4,1738639304,882897000,false +2019,10,4,1925870601,2144773012,false +2019,10,4,294656576,312507785,false +2019,10,4,34808911,1400185822,true +2019,10,4,964987809,1899079650,true +2019,10,4,785274779,1911200789,true +2019,10,4,1098814202,1891766582,false +2019,10,4,99583408,1431991893,false +2019,10,5,902078475,2007827702,false +2019,10,5,764419173,1390912453,true +2019,10,5,1281059950,553045221,false +2019,10,5,333483528,1172341987,false +2019,10,5,802115150,1906200487,false +2019,10,5,1125643234,1263766116,false +2019,10,5,1183246479,401748440,true +2019,10,5,1775070046,205451806,false +2019,10,5,1579988133,2103553400,true +2019,10,5,1410261310,1501145004,true +2019,10,6,869122209,1109226713,false +2019,10,6,2065637857,873190101,true +2019,10,6,200271883,1122635081,true +2019,10,6,1124863633,1899761115,false +2019,10,6,615536808,44788531,false +2019,10,6,1705940803,1436232760,true +2019,10,6,1666149445,1167527023,true +2019,10,6,1992482141,986707582,false +2019,10,6,174806802,773170925,true +2019,10,6,1487896405,1617549895,true +2019,10,7,2029907509,376213778,false +2019,10,7,827903562,433345920,false +2019,10,7,360213667,1881120330,false +2019,10,7,168191460,1013672568,false +2019,10,7,1881386078,37885387,true +2019,10,7,1698667552,717078130,true +2019,10,7,1245993122,631634558,false +2019,10,7,1541624275,1631368877,true +2019,10,7,11352499,6218365,false +2019,10,7,805381638,929761309,false +2019,10,8,1695118658,1449486770,false +2019,10,8,33668474,1750289493,true +2019,10,8,1732544165,1555279476,false 
+2019,10,8,1723335929,857365062,false +2019,10,8,1616664659,499486779,true +2019,10,8,282189886,1032055777,false +2019,10,8,1173707884,918120381,true +2019,10,8,1169746064,874942913,false +2019,10,8,1044534314,47314198,true +2019,10,8,709789041,803777642,true +2019,10,9,1618392724,137370566,false +2019,10,9,343283439,623978169,true +2019,10,9,568321230,1780745639,false +2019,10,9,1906175085,411668384,true +2019,10,9,1972807372,248003894,true +2019,10,9,2066197513,971309374,true +2019,10,9,1949259108,2010549427,false +2019,10,9,152268268,879792630,true +2019,10,9,2056695532,1986773774,false +2019,10,9,276853786,865265680,true +2019,10,10,553527773,1189886108,false +2019,10,10,1392173738,1497662220,false +2019,10,10,1403174213,1218698497,true +2019,10,10,1441824100,2043180459,true +2019,10,10,1726617822,1310723292,true +2019,10,10,772147005,1315189341,true +2019,10,10,540358445,1975204112,false +2019,10,10,401325353,820952980,true +2019,10,10,2105319920,2065685188,false +2019,10,10,1602181428,1426439209,false +2019,10,11,292470217,1190925930,true +2019,10,11,2023515114,792075234,false +2019,10,11,1142328298,1945509237,true +2019,10,11,952969583,408158886,false +2019,10,11,1835571486,1812242819,false +2019,10,11,1838529758,1155051792,true +2019,10,11,1532680091,1579799839,true +2019,10,11,1754455999,925278051,true +2019,10,11,865400147,1491293555,false +2019,10,11,667268277,978172930,false +2019,10,12,494072507,1930968442,true +2019,10,12,1931324350,67033340,true +2019,10,12,1301831241,1153457809,false +2019,10,12,1330144143,329496661,false +2019,10,12,1344026388,1418034050,true +2019,10,12,1594171121,699296005,true +2019,10,12,1139549534,982006352,false +2019,10,12,639776311,639620844,false +2019,10,12,573637101,1788255863,false +2019,10,12,322923548,464723504,false +2019,10,13,675913085,60231770,false +2019,10,13,344101541,1521839800,false +2019,10,13,1435365304,1486393364,false +2019,10,13,1218201573,623158977,true +2019,10,13,1469733226,1762361579,false +2019,10,13,1112150963,1059607495,false +2019,10,13,839780348,660918286,true +2019,10,13,1525196460,1615859508,false +2019,10,13,204593229,1080747120,true +2019,10,13,996456478,1593393296,true +2019,10,14,656336260,1579030225,true +2019,10,14,53007273,1208402220,false +2019,10,14,363416701,1423980640,false +2019,10,14,307863744,114539901,false +2019,10,14,56633033,1000618871,false +2019,10,14,1570610504,444403359,true +2019,10,14,940992654,484062554,false +2019,10,14,92224012,1062810298,true +2019,10,14,1562358430,1337981193,true +2019,10,14,418157250,680957249,true +2019,10,15,1724014237,1701071488,false +2019,10,15,852225227,1360334910,true +2019,10,15,797875483,568618746,false +2019,10,15,1224904445,1944913454,false +2019,10,15,691961307,109672979,true +2019,10,15,1535160853,1231173655,false +2019,10,15,528529612,264075488,false +2019,10,15,1432554661,1039670747,true +2019,10,15,1257773657,530535248,false +2019,10,15,1865164435,827463419,false +2019,10,16,732635162,1591163253,false +2019,10,16,281935755,1136908351,false +2019,10,16,2004724499,1422034314,true +2019,10,16,2080504052,1383733003,false +2019,10,16,1743012601,1169978574,false +2019,10,16,750333758,1070869822,true +2019,10,16,2126727743,425052297,true +2019,10,16,75222957,204232938,false +2019,10,16,2099397844,825660890,true +2019,10,16,1598639848,661905750,false +2019,10,17,1058866135,53941794,false +2019,10,17,2129527285,1414323966,true +2019,10,17,18527925,1562606366,true +2019,10,17,1669484139,905328269,true +2019,10,17,479829605,819884311,true 
+2019,10,17,778813440,736925446,true +2019,10,17,1555626425,886728367,true +2019,10,17,579707068,1202293146,false +2019,10,17,690539022,1995748657,false +2019,10,17,1950582395,525856221,true +2019,10,18,1079956916,1754967409,false +2019,10,18,930640767,1671112531,true +2019,10,18,1153688034,1231563932,true +2019,10,18,1774582088,362207630,true +2019,10,18,1951207457,1799094671,true +2019,10,18,2112558052,119203415,true +2019,10,18,1256762211,1574430339,true +2019,10,18,1512997243,1656259992,false +2019,10,18,1525157708,680621114,false +2019,10,18,357956802,1147621628,true +2019,10,19,156876884,178129645,true +2019,10,19,1202080405,351079811,false +2019,10,19,1980683715,1625655974,true +2019,10,19,1311263218,1330735996,false +2019,10,19,361270980,1410115543,true +2019,10,19,1404533504,212070820,false +2019,10,19,950591994,870777409,true +2019,10,19,722898158,1933140571,false +2019,10,19,1753815898,1507308434,true +2019,10,19,1449116619,98972204,true +2019,10,20,1982447165,613097526,false +2019,10,20,1317551785,1036234012,true +2019,10,20,1731719651,241844318,true +2019,10,20,1854317479,590298232,false +2019,10,20,995052764,991143971,false +2019,10,20,1630358193,223375652,false +2019,10,20,59699225,1917066568,true +2019,10,20,103610657,1247811220,true +2019,10,20,840047761,2043401751,false +2019,10,20,1136947075,433402094,false +2019,10,21,2135283382,1404592193,false +2019,10,21,1668857890,692876561,false +2019,10,21,517820725,259061711,false +2019,10,21,2021552175,1987777393,false +2019,10,21,2097663161,683447680,false +2019,10,21,151590565,454141705,false +2019,10,21,1257106596,517505328,false +2019,10,21,1634292525,982667215,false +2019,10,21,520044792,304844826,false +2019,10,21,319022263,1881375856,true +2019,10,22,1601143790,25081864,true +2019,10,22,908733055,1296402751,false +2019,10,22,2105398760,120461279,false +2019,10,22,2119156660,2063846045,true +2019,10,22,1791929536,561926913,true +2019,10,22,1698898184,1497245501,true +2019,10,22,22430632,1504112759,true +2019,10,22,195690688,1496093464,false +2019,10,22,1845997329,1245882343,true +2019,10,22,695650603,1616705008,true +2019,10,23,1230658644,46593470,false +2019,10,23,1140971140,85381286,true +2019,10,23,1454524177,491996444,false +2019,10,23,600369320,1007196892,false +2019,10,23,194485896,1989232989,true +2019,10,23,771177003,162659987,true +2019,10,23,1953596239,220271192,true +2019,10,23,862469555,278768073,false +2019,10,23,396427826,525782441,true +2019,10,23,147642269,324693568,false +2019,10,24,293669673,1375884453,true +2019,10,24,2008283478,713188851,true +2019,10,24,699722653,1414228417,true +2019,10,24,2127173882,1688198893,false +2019,10,24,1989630329,281817498,true +2019,10,24,181594889,1031961459,false +2019,10,24,1964667067,1502532323,true +2019,10,24,859002201,1614695368,false +2019,10,24,1853798270,1307938263,true +2019,10,24,909220240,1000684782,true +2019,10,25,820599987,831903502,false +2019,10,25,1843626690,1271340086,false +2019,10,25,1172112393,388522690,false +2019,10,25,49792276,602616188,true +2019,10,25,1943313836,823157982,false +2019,10,25,1668548458,971055712,true +2019,10,25,1139053289,1859905408,false +2019,10,25,1885146757,1132855004,true +2019,10,25,1673464024,208915620,true +2019,10,25,1140622894,771363842,true +2019,10,26,2035724440,1027811638,false +2019,10,26,1576517680,2037382102,false +2019,10,26,857645252,578287301,false +2019,10,26,1057866322,805595504,false +2019,10,26,1486525991,919137922,true +2019,10,26,1777333148,1963390704,false +2019,10,26,1253755358,2119797590,false 
+2019,10,26,283113083,1135136490,true +2019,10,26,1945229671,96141876,false +2019,10,26,1873580202,146275668,true +2019,10,27,1885739336,1355958005,true +2019,10,27,1442724586,1298830369,true +2019,10,27,709207385,1777090898,false +2019,10,27,1146544620,840969729,true +2019,10,27,87337436,213121257,true +2019,10,27,1329977083,266520074,true +2019,10,27,1877362393,557189195,false +2019,10,27,1219803718,629728562,true +2019,10,27,1302078336,86632061,false +2019,10,27,1259968990,1839587042,false +2019,10,28,1126226310,371121018,true +2019,10,28,1821927363,60017676,true +2019,10,28,2116214956,890736537,false +2019,10,28,158242452,1206139010,false +2019,10,28,218439924,835020511,true +2019,10,28,213970874,2038798476,true +2019,10,28,1309138489,169545891,false +2019,10,28,2048450349,412986597,true +2019,10,28,156752406,180629785,false +2019,10,28,1639731234,1193576879,false +2019,10,29,282943852,933535126,false +2019,10,29,381426496,664080632,false +2019,10,29,2109477094,659791272,false +2019,10,29,2054210098,1825696739,true +2019,10,29,917274691,1083953760,false +2019,10,29,524096939,2102424993,false +2019,10,29,1004029526,1690114656,true +2019,10,29,1755396669,1369715118,false +2019,10,29,572520447,1433326721,true +2019,10,29,264674701,1973377810,false +2019,10,30,618820220,1820038193,true +2019,10,30,105685197,1209101469,true +2019,10,30,61406706,442925537,false +2019,10,30,94721665,764953168,true +2019,10,30,852504313,661294870,false +2019,10,30,963543294,141570672,false +2019,10,30,30471818,1677076722,false +2019,10,30,1403737117,235096319,true +2019,10,30,1151767979,1818656090,true +2019,10,30,1034627100,527907460,true +2019,10,31,3620511,1840547334,true +2019,10,31,2058722474,485997424,true +2019,10,31,1951375252,1761489427,true +2019,10,31,571503232,1412579201,true +2019,10,31,307393274,973013491,false +2019,10,31,1855123237,911132909,false +2019,10,31,1160843273,59673304,false +2019,10,31,1905246632,551350597,false +2019,10,31,1127277094,1429695973,true +2019,10,31,908879990,481674820,false +2019,11,1,1069758495,1570155718,false +2019,11,1,7133078,1157301008,false +2019,11,1,845419643,358070403,false +2019,11,1,1588454383,1121888914,false +2019,11,1,399777363,780431538,false +2019,11,1,1825608555,18260465,true +2019,11,1,1581225576,108943430,true +2019,11,1,264581897,533574226,false +2019,11,1,1564333688,166058672,false +2019,11,1,1446890308,1309466480,true +2019,11,2,1054568704,744056598,false +2019,11,2,438601818,818893990,false +2019,11,2,1373443558,1125497029,false +2019,11,2,873578324,2130304125,true +2019,11,2,1914633416,943041319,false +2019,11,2,1015394778,2038036196,true +2019,11,2,1070026721,459195624,false +2019,11,2,264299074,636272793,false +2019,11,2,834206400,691144978,true +2019,11,2,1438187095,1283656293,true +2019,11,3,1214113352,829548095,false +2019,11,3,23327429,1154337091,false +2019,11,3,378937264,1492065079,true +2019,11,3,1729358390,157253770,false +2019,11,3,1680226944,1361669528,true +2019,11,3,1705550462,361692581,true +2019,11,3,1818197429,1346759251,true +2019,11,3,1629024108,1222885647,true +2019,11,3,1015327391,875325733,true +2019,11,3,1486112278,1528451712,false +2019,11,4,420378636,1986003805,false +2019,11,4,1202918589,1621069460,true +2019,11,4,1205520092,106589053,true +2019,11,4,167660185,1471564373,false +2019,11,4,452209950,238446246,false +2019,11,4,2065504307,1338410345,true +2019,11,4,106384182,2000249295,true +2019,11,4,880503826,1644977622,false +2019,11,4,15631052,1279527984,true +2019,11,4,1774538978,1532371098,true 
+2019,11,5,1581295972,1261196649,false +2019,11,5,249686375,400210061,true +2019,11,5,1462583548,492566862,false +2019,11,5,1165272026,269906236,false +2019,11,5,568930046,390280297,false +2019,11,5,1759129975,1625626058,true +2019,11,5,220063332,22160254,false +2019,11,5,233842797,1917514742,false +2019,11,5,202249194,272170434,true +2019,11,5,760661362,640600530,false +2019,11,6,1091904406,1572969604,false +2019,11,6,2088392611,581775571,false +2019,11,6,291568905,1811344498,false +2019,11,6,1737575545,757744453,true +2019,11,6,2052663373,535937991,false +2019,11,6,289530618,685445447,true +2019,11,6,1806024482,871398914,false +2019,11,6,1099568246,345542865,false +2019,11,6,747122423,1575975153,true +2019,11,6,1295093347,1258231935,true +2019,11,7,744779278,1495625790,true +2019,11,7,591559741,898479893,true +2019,11,7,1446703747,1470008257,true +2019,11,7,1928702329,1935731182,true +2019,11,7,1861354533,1204556342,false +2019,11,7,2034369252,649388760,true +2019,11,7,1114831468,1846732084,false +2019,11,7,279512068,486590997,true +2019,11,7,868232681,1675068534,false +2019,11,7,1286142048,1589756780,false +2019,11,8,1864427782,187180501,false +2019,11,8,2075425199,1117848978,true +2019,11,8,1572799986,536528002,false +2019,11,8,1348668257,1295685980,true +2019,11,8,219625481,1783739744,false +2019,11,8,1497460997,1844210284,true +2019,11,8,1449250074,28483477,false +2019,11,8,590819439,447792477,true +2019,11,8,357699420,524058526,false +2019,11,8,712042929,151076766,false +2019,11,9,835316458,897968282,false +2019,11,9,247045701,1263788916,true +2019,11,9,1457332771,878107261,true +2019,11,9,462812663,701261538,true +2019,11,9,1123664452,172789098,false +2019,11,9,80206702,1959640676,true +2019,11,9,1577893387,1676300655,false +2019,11,9,1681636206,244356287,false +2019,11,9,1811621458,271279944,false +2019,11,9,158734623,583247707,true +2019,11,10,192379093,392581637,true +2019,11,10,1376016128,871484118,true +2019,11,10,1742437294,1531699867,true +2019,11,10,1399871316,2020867549,true +2019,11,10,1835149346,1393161676,true +2019,11,10,1974419492,1092976745,true +2019,11,10,813814227,1382799851,false +2019,11,10,1913196078,2045572080,true +2019,11,10,1801457534,226911591,true +2019,11,10,1192429402,1555834157,false +2019,11,11,552698051,561201182,false +2019,11,11,734156939,411818633,false +2019,11,11,2013622447,884642029,true +2019,11,11,1037011943,1578691154,true +2019,11,11,1103501050,529167860,true +2019,11,11,2068700135,1561494071,false +2019,11,11,1509091903,260848005,false +2019,11,11,942867667,861470159,false +2019,11,11,1239131028,1375037120,true +2019,11,11,1091234163,434797583,false +2019,11,12,846483528,174164032,false +2019,11,12,1534949664,1533256691,true +2019,11,12,1416145082,1753008233,true +2019,11,12,355570804,40126167,true +2019,11,12,1588456320,1159310903,false +2019,11,12,1067156300,241228585,false +2019,11,12,2022523475,816847521,true +2019,11,12,140294622,1948431479,false +2019,11,12,1059015301,1742359014,true +2019,11,12,1876589577,1445230092,false +2019,11,13,818236648,973619556,false +2019,11,13,1849963825,380703644,true +2019,11,13,442175850,215649736,true +2019,11,13,1765632060,309843883,false +2019,11,13,184654039,1227609779,false +2019,11,13,645119635,649914798,false +2019,11,13,1635218305,566518835,true +2019,11,13,2033927732,81051166,true +2019,11,13,682522044,1804308370,true +2019,11,13,1373777455,248662261,false +2019,11,14,1898660715,1888423029,false +2019,11,14,67933740,1113048906,true +2019,11,14,819733020,25055297,false 
+2019,11,14,1472209101,587992839,false +2019,11,14,1208468831,1864232240,true +2019,11,14,1938242052,1001495537,true +2019,11,14,307751765,187812791,false +2019,11,14,1491351987,136812405,true +2019,11,14,50835383,733059060,false +2019,11,14,639436859,1072295165,true +2019,11,15,806029762,1735248484,true +2019,11,15,1160150459,622177974,true +2019,11,15,603981249,1268267547,true +2019,11,15,1277341232,2048896782,false +2019,11,15,1000768819,428700711,false +2019,11,15,973308854,1756942676,false +2019,11,15,1000323268,1992685289,true +2019,11,15,2001845658,1515601985,true +2019,11,15,1371647372,158546317,true +2019,11,15,1799583232,675480405,true +2019,11,16,14105424,257222372,false +2019,11,16,1354891589,940532804,false +2019,11,16,593163196,333002828,false +2019,11,16,1449140580,768855021,true +2019,11,16,1274639051,1285535856,false +2019,11,16,731608335,1361986199,false +2019,11,16,727789281,377042688,true +2019,11,16,1698729267,248586594,false +2019,11,16,1659614010,1341730224,true +2019,11,16,298024226,1118294821,true +2019,11,17,936221113,1901892645,true +2019,11,17,1708644511,775704895,true +2019,11,17,1066053964,1512960921,false +2019,11,17,587544691,1494836483,false +2019,11,17,1428493389,544588885,true +2019,11,17,260594014,355872957,true +2019,11,17,541497005,910528087,false +2019,11,17,971901564,1740516231,true +2019,11,17,1697740249,364002242,true +2019,11,17,1713059708,54796654,false +2019,11,18,388015241,1016747749,true +2019,11,18,1795995702,1789459853,true +2019,11,18,380238590,926736119,false +2019,11,18,300417664,1156691500,false +2019,11,18,719917150,44025319,true +2019,11,18,515028058,2011149062,true +2019,11,18,1603542059,1324242635,false +2019,11,18,649001923,1382089793,true +2019,11,18,1679558408,482969548,false +2019,11,18,1510242008,943312066,true +2019,11,19,530864201,573170599,true +2019,11,19,1944850957,1283321048,true +2019,11,19,541655559,37557793,true +2019,11,19,1162804120,1190417023,true +2019,11,19,1326497702,1138920441,true +2019,11,19,2068302449,1326901924,true +2019,11,19,1187502867,445305400,true +2019,11,19,847103088,1409880882,false +2019,11,19,157109028,1216201208,false +2019,11,19,1770426987,1817718735,false +2019,11,20,1709394402,903535866,true +2019,11,20,1248721510,1270854549,false +2019,11,20,1458326435,1006492826,true +2019,11,20,1160656695,55320614,false +2019,11,20,1926281273,1426401288,true +2019,11,20,704154151,333657841,false +2019,11,20,466320812,1502609362,false +2019,11,20,156008052,190215043,false +2019,11,20,1728682033,1647541273,false +2019,11,20,52844488,1321659949,true +2019,11,21,1519422815,1736855567,true +2019,11,21,597236731,979209338,false +2019,11,21,1965205708,825192592,true +2019,11,21,1398203836,470939928,true +2019,11,21,660477599,2133888242,true +2019,11,21,416841977,2126499655,false +2019,11,21,743560376,832409693,true +2019,11,21,1332939542,1096861172,false +2019,11,21,1704750460,1143631695,true +2019,11,21,841760852,1367934200,true +2019,11,22,1723045731,1247946774,false +2019,11,22,1255314688,795884355,true +2019,11,22,852474593,1440245350,false +2019,11,22,253822012,618534615,false +2019,11,22,21828672,1971550550,false +2019,11,22,1870983035,1442215894,true +2019,11,22,914307454,1730711651,true +2019,11,22,1275678991,780381723,true +2019,11,22,19544,1011896149,true +2019,11,22,376582489,1858688275,true +2019,11,23,1117232889,260508180,false +2019,11,23,1473220979,452887503,false +2019,11,23,829169172,1973498057,false +2019,11,23,1049307000,1253775238,true +2019,11,23,492109136,1535767018,false 
+2019,11,23,1361127343,861039874,false +2019,11,23,557972297,821406460,false +2019,11,23,2067544896,367655339,true +2019,11,23,500082834,124434994,true +2019,11,23,1172799561,437743408,false +2019,11,24,288850309,305558063,false +2019,11,24,2127581280,140475643,false +2019,11,24,721022143,1613735572,false +2019,11,24,1790140902,884631793,false +2019,11,24,1537368217,1346917166,true +2019,11,24,212202949,17345845,false +2019,11,24,715038717,632393933,true +2019,11,24,2135553631,1602820833,false +2019,11,24,222378855,1302016564,false +2019,11,24,1682605531,1049149935,false +2019,11,25,243981564,649023657,false +2019,11,25,1126844703,1054842063,true +2019,11,25,1475636974,1522242305,false +2019,11,25,2029785608,1500236381,false +2019,11,25,1251045803,1143038788,false +2019,11,25,2112173765,2041547562,true +2019,11,25,1417014163,417621562,true +2019,11,25,1140602214,135221088,true +2019,11,25,2043203101,1094561154,true +2019,11,25,348195319,1374124200,true +2019,11,26,532122802,1524766721,false +2019,11,26,180562624,879957970,false +2019,11,26,89554870,1297380812,false +2019,11,26,245186947,1513844793,true +2019,11,26,1739823718,2122783701,true +2019,11,26,1069793743,1498184465,true +2019,11,26,1997898906,858059020,false +2019,11,26,1503804349,917869355,false +2019,11,26,1298882965,1975349676,false +2019,11,26,72224694,879422922,true +2019,11,27,1322517335,297336556,false +2019,11,27,2033959701,847781790,true +2019,11,27,1691094590,2099765327,true +2019,11,27,1870132924,184333576,false +2019,11,27,1174352594,344256842,true +2019,11,27,1198307319,437807915,true +2019,11,27,24785927,1955127542,false +2019,11,27,1611129160,1456401277,true +2019,11,27,1268416111,1186036479,true +2019,11,27,938824680,1482989009,true +2019,11,28,1196994818,335837220,false +2019,11,28,1768901667,1014777104,false +2019,11,28,361837189,1868243410,false +2019,11,28,390906334,1123825952,true +2019,11,28,692049600,772634565,false +2019,11,28,489683945,266484903,true +2019,11,28,621821185,1433653175,false +2019,11,28,1481770976,1141225385,true +2019,11,28,369493401,698738309,true +2019,11,28,634386896,140810294,false +2019,11,29,1306512985,1762680653,true +2019,11,29,1796701663,1987694757,false +2019,11,29,770909754,2026735731,false +2019,11,29,369472414,532901641,false +2019,11,29,1156245172,142950295,true +2019,11,29,1250381268,571943908,false +2019,11,29,121417072,348701911,false +2019,11,29,155638501,516955726,true +2019,11,29,210049880,1896516346,false +2019,11,29,1729542427,1926730233,true +2019,11,30,24647220,1251304785,true +2019,11,30,1772216192,828495846,true +2019,11,30,768417876,1206988151,true +2019,11,30,126951182,1998709924,false +2019,11,30,340655607,648669747,false +2019,11,30,1828921683,1221244781,false +2019,11,30,33874542,167769641,true +2019,11,30,1467154662,1490185089,false +2019,11,30,929058607,490696736,true +2019,11,30,1224488255,1341819272,false +2019,11,31,488521039,1838723074,false +2019,11,31,295206606,1807565039,false +2019,11,31,915023394,623207142,false +2019,11,31,991222116,548712814,false +2019,11,31,5085678,1493867970,false +2019,11,31,1869220425,1623518018,true +2019,11,31,1755339762,465140120,true +2019,11,31,1943696539,335730874,false +2019,11,31,812777786,1278801209,false +2019,11,31,1702185414,188837420,true +2020,1,1,635823082,710260720,true +2020,1,1,797719859,1805538672,true +2020,1,1,1154248460,1924740542,false +2020,1,1,1606379905,1468162153,false +2020,1,1,146761480,2107207908,false +2020,1,1,1739023083,346452557,true +2020,1,1,1863741772,1292269039,false 
+2020,1,1,2136561053,150000042,true +2020,1,1,1695926731,1587644039,true +2020,1,1,2146949168,2011301204,true +2020,1,2,1523212129,1671169194,true +2020,1,2,689281408,352062661,true +2020,1,2,327649343,1389323042,true +2020,1,2,795842200,792658070,false +2020,1,2,1453176287,150665548,true +2020,1,2,401793134,30310939,true +2020,1,2,860165190,2036688787,true +2020,1,2,337607326,897771842,false +2020,1,2,1560830573,789286697,false +2020,1,2,773673833,1829135945,true +2020,1,3,310157864,1413663461,true +2020,1,3,1770132128,309027053,true +2020,1,3,1601833972,2083809356,true +2020,1,3,982798408,731369746,false +2020,1,3,533598046,1671964465,false +2020,1,3,471465239,1695952162,true +2020,1,3,1607508750,1948717937,true +2020,1,3,550057495,13920397,false +2020,1,3,1772081052,426323892,true +2020,1,3,1953673750,783051053,false +2020,1,4,1560397486,1661104448,false +2020,1,4,1211056814,1939268747,true +2020,1,4,2023361565,1065831474,false +2020,1,4,689689605,977063461,false +2020,1,4,1727187212,373871819,true +2020,1,4,1962130837,1442772556,true +2020,1,4,591749427,1445205387,false +2020,1,4,1249665950,984431929,false +2020,1,4,577855700,2038496074,false +2020,1,4,1714482199,977559650,false +2020,1,5,537847311,1984556955,true +2020,1,5,2140959529,653523004,true +2020,1,5,2103530127,728643349,true +2020,1,5,170232484,774785765,true +2020,1,5,318991053,2138324055,true +2020,1,5,1602303220,1570583667,true +2020,1,5,1958846034,937257443,true +2020,1,5,999147373,451813337,true +2020,1,5,61872323,1179116581,true +2020,1,5,581992726,893514654,true +2020,1,6,1315306917,273787458,true +2020,1,6,1030355514,1220246492,true +2020,1,6,536797451,1758474108,false +2020,1,6,1529228900,1312902980,true +2020,1,6,124239286,1050980438,false +2020,1,6,965383567,148438804,true +2020,1,6,1893295439,420687293,true +2020,1,6,431413940,1560650308,true +2020,1,6,44253631,1463051734,true +2020,1,6,190844160,2111834713,true +2020,1,7,1461536469,388769105,false +2020,1,7,1170800209,1235434649,true +2020,1,7,274053557,2008560964,false +2020,1,7,17870033,2106832030,false +2020,1,7,1631127499,831637763,false +2020,1,7,716596927,602836334,true +2020,1,7,492362035,1402336365,false +2020,1,7,1509269201,21595301,true +2020,1,7,232843988,704720544,true +2020,1,7,877491377,813952867,true +2020,1,8,1708169894,2043119137,false +2020,1,8,1083938936,2121317092,false +2020,1,8,217021651,16106189,false +2020,1,8,889895239,233014045,false +2020,1,8,681663161,1247252275,true +2020,1,8,1752886848,271005618,true +2020,1,8,1593160951,1440855592,false +2020,1,8,738250381,1593295892,false +2020,1,8,385013763,2104285876,true +2020,1,8,705562689,1633503671,false +2020,1,9,477812487,683777955,false +2020,1,9,20809521,1952301977,false +2020,1,9,698278640,1258899819,false +2020,1,9,1140618257,1560801427,false +2020,1,9,565478384,2137608221,true +2020,1,9,1561383299,1842492966,false +2020,1,9,99666413,743073944,false +2020,1,9,1526468570,1938513868,false +2020,1,9,953394317,1598493506,false +2020,1,9,1009445298,1132379883,true +2020,1,10,399298008,1818168299,false +2020,1,10,1964051132,840814077,true +2020,1,10,774052534,278256037,true +2020,1,10,1867822918,29828502,true +2020,1,10,963979943,1137084095,true +2020,1,10,1161344613,1990515774,true +2020,1,10,1952539787,323880988,false +2020,1,10,1825468889,1541825122,false +2020,1,10,1827438679,319512732,false +2020,1,10,1804232175,12460494,false +2020,1,11,1194761730,634906715,false +2020,1,11,1919281434,409447523,true +2020,1,11,1621862424,1849991404,true +2020,1,11,1354028539,1400743944,true 
+2020,1,11,464293040,1255274196,true +2020,1,11,89590788,1888482832,false +2020,1,11,1430471095,259919356,false +2020,1,11,1941297675,1798529669,false +2020,1,11,1700700628,728271310,false +2020,1,11,256281435,1734378514,false +2020,1,12,1674544460,1459157814,true +2020,1,12,318589732,1669001101,true +2020,1,12,1760829098,1342410180,false +2020,1,12,962573159,674990668,true +2020,1,12,304031013,1011122845,true +2020,1,12,1075540900,1144466628,true +2020,1,12,1402914542,1991398145,true +2020,1,12,1595136245,1587542827,true +2020,1,12,1364478961,1437943178,true +2020,1,12,333870177,1277520914,false +2020,1,13,1510788527,1325181049,true +2020,1,13,74118539,304739632,false +2020,1,13,1121518385,1544163494,false +2020,1,13,1240030980,95048915,false +2020,1,13,566028374,1390133226,true +2020,1,13,1199722219,843676347,true +2020,1,13,100568540,558628648,false +2020,1,13,707985828,446479681,false +2020,1,13,70235126,232036148,false +2020,1,13,1157580235,506552346,false +2020,1,14,46760786,1131805007,false +2020,1,14,2038013863,1583417374,true +2020,1,14,1447680214,1522590453,false +2020,1,14,1931585490,2107415506,true +2020,1,14,344110724,1080601122,true +2020,1,14,105124821,383898278,false +2020,1,14,899586109,1745008233,true +2020,1,14,33693910,1081107757,true +2020,1,14,484914529,1617250407,true +2020,1,14,799020989,756612788,true +2020,1,15,193638525,1441424512,false +2020,1,15,1755752277,339070212,false +2020,1,15,796912772,1912262370,false +2020,1,15,69250125,690906420,true +2020,1,15,21218115,758994497,false +2020,1,15,1496318002,958774479,true +2020,1,15,244507,1832812496,false +2020,1,15,1739513773,217038025,true +2020,1,15,982034974,558879327,true +2020,1,15,1773712817,1934526442,true +2020,1,16,787284763,2140262308,false +2020,1,16,1499622493,634355902,true +2020,1,16,1957913933,1933767193,true +2020,1,16,919543221,561150762,false +2020,1,16,551420322,341644032,true +2020,1,16,1096740239,2112807986,true +2020,1,16,1741525198,2012430168,false +2020,1,16,520178489,2028148385,true +2020,1,16,944602029,1564969974,true +2020,1,16,651535558,1715019136,true +2020,1,17,892636068,1362868227,false +2020,1,17,1341851404,1577901485,false +2020,1,17,293571323,1959965312,true +2020,1,17,1359678639,847549933,true +2020,1,17,764591464,849621669,true +2020,1,17,746269649,316926280,false +2020,1,17,390371137,605541437,false +2020,1,17,1624945984,1078512022,true +2020,1,17,347579923,1520525379,false +2020,1,17,1780941110,1898555531,true +2020,1,18,1201325967,1593599854,true +2020,1,18,468395279,553509785,false +2020,1,18,1063822346,1366812535,true +2020,1,18,373256820,2094562179,false +2020,1,18,1918662658,1422044465,true +2020,1,18,1691794421,787309186,false +2020,1,18,1541446763,115037393,false +2020,1,18,1135950856,1428296761,false +2020,1,18,1440776108,1017741052,false +2020,1,18,624808675,643387640,true +2020,1,19,2073883031,1422187631,true +2020,1,19,1666084017,1587137020,true +2020,1,19,1041715905,1788306269,false +2020,1,19,1833014533,1504304809,false +2020,1,19,176845824,33368283,true +2020,1,19,1925240233,295455474,true +2020,1,19,1048726787,1842497009,true +2020,1,19,1754156565,1309107571,false +2020,1,19,1860516149,380613413,false +2020,1,19,689063463,856315871,false +2020,1,20,513847408,382103202,true +2020,1,20,1763650049,447293140,true +2020,1,20,745178412,710238274,true +2020,1,20,1215868437,490501684,false +2020,1,20,2136482858,1880921125,true +2020,1,20,210018753,812600724,true +2020,1,20,1460941263,2039963556,false +2020,1,20,530154511,1240581212,true 
+2020,1,20,1320468390,102245811,true +2020,1,20,116527475,760807344,true +2020,1,21,330907849,1035634458,false +2020,1,21,950602396,1445223490,true +2020,1,21,173993497,1823221767,true +2020,1,21,15610808,1347259531,false +2020,1,21,836731794,1525589993,false +2020,1,21,197611697,2086793255,false +2020,1,21,546551328,1411793305,false +2020,1,21,108363347,1371669328,true +2020,1,21,1476822457,1891932460,false +2020,1,21,1850870114,1808240035,true +2020,1,22,126235055,1157419545,false +2020,1,22,434936831,1833090475,false +2020,1,22,1023157618,1297150826,true +2020,1,22,838309611,1196143353,false +2020,1,22,1142374972,1215705390,false +2020,1,22,531398545,1796385449,false +2020,1,22,1537181280,1499511158,false +2020,1,22,392750303,2094670495,false +2020,1,22,2130890028,1824813168,true +2020,1,22,293129295,1747477996,false +2020,1,23,2098678436,669018763,false +2020,1,23,550825391,1948852494,false +2020,1,23,725617790,722615460,true +2020,1,23,18124687,1913874195,true +2020,1,23,1445740070,1030568359,true +2020,1,23,186779486,1818233031,true +2020,1,23,1704771511,1276820023,true +2020,1,23,665356180,335664489,false +2020,1,23,1769070627,2084267560,false +2020,1,23,2036224899,1138974886,true +2020,1,24,751913229,1736407293,false +2020,1,24,302609318,2121911245,false +2020,1,24,2003465976,1967911334,false +2020,1,24,1063382656,153280620,false +2020,1,24,1991115479,174381867,false +2020,1,24,1633954160,614406369,true +2020,1,24,975477294,2140614417,true +2020,1,24,1717221890,878317447,true +2020,1,24,1657821493,1445890962,true +2020,1,24,1235774367,743491447,true +2020,1,25,556555974,697886475,false +2020,1,25,187883192,320104996,true +2020,1,25,1372091388,2143477534,true +2020,1,25,612023736,1514666809,true +2020,1,25,257069510,1724779911,false +2020,1,25,985162072,1966225528,false +2020,1,25,956202213,1573207439,false +2020,1,25,96514361,1022094743,true +2020,1,25,2080419076,1114850466,false +2020,1,25,1969055439,1919124554,true +2020,1,26,1949879798,536669559,true +2020,1,26,658663412,268769014,true +2020,1,26,936407603,544123073,false +2020,1,26,731251655,1959975109,false +2020,1,26,765629488,786358131,false +2020,1,26,1621051304,85708823,false +2020,1,26,1177180203,1711710701,false +2020,1,26,1199682399,340375677,true +2020,1,26,1625076346,2133742766,true +2020,1,26,1496521441,1885140933,false +2020,1,27,1174812285,991606379,true +2020,1,27,1341378855,448067008,true +2020,1,27,1832448205,639728787,true +2020,1,27,264510226,77162691,false +2020,1,27,818797368,1665309344,false +2020,1,27,1330623066,1507285278,true +2020,1,27,453100450,363619794,true +2020,1,27,287940592,1251691509,true +2020,1,27,1645311733,566426148,true +2020,1,27,2102016607,1215119931,false +2020,1,28,803157913,1640357526,false +2020,1,28,1489924113,1692308174,true +2020,1,28,1803160361,1559148068,false +2020,1,28,549193346,1430531797,true +2020,1,28,54775147,635904005,false +2020,1,28,1832971974,1358439929,false +2020,1,28,1617839788,713987326,true +2020,1,28,1032097007,387273745,false +2020,1,28,361910802,1357930744,true +2020,1,28,623813430,400659531,false +2020,1,29,577928325,2022219543,false +2020,1,29,1297074928,982169099,false +2020,1,29,664796983,474065916,false +2020,1,29,1735382204,1848987113,true +2020,1,29,346857721,1081844992,false +2020,1,29,2015190633,2063231389,false +2020,1,29,747925468,1954059401,false +2020,1,29,812333186,1634318262,false +2020,1,29,811694291,999992009,true +2020,1,29,1576085136,1750330479,true +2020,1,30,842216779,1847633793,true +2020,1,30,827911263,1602724608,false 
+2020,1,30,536372981,856110245,false +2020,1,30,2103101425,62373695,true +2020,1,30,1402017854,838709997,false +2020,1,30,939997933,1072887147,true +2020,1,30,408388640,561557695,false +2020,1,30,1841475489,1182379635,true +2020,1,30,1415139920,1655455987,false +2020,1,30,1895490799,1440910622,false +2020,1,31,460975360,1919580764,true +2020,1,31,1877504450,1394767069,false +2020,1,31,1196772260,2070274587,true +2020,1,31,2002250722,1817619756,false +2020,1,31,155087483,1251029554,true +2020,1,31,234331960,1738908877,true +2020,1,31,1445177556,1292116676,false +2020,1,31,1999575522,268140101,false +2020,1,31,1993676860,842754055,false +2020,1,31,239955626,564986717,false +2020,2,1,1647424797,640042977,false +2020,2,1,349684415,28274590,true +2020,2,1,807800187,1975570830,true +2020,2,1,465228988,1674756601,false +2020,2,1,1415345352,350433016,true +2020,2,1,2390624,614171678,false +2020,2,1,860687374,394168958,true +2020,2,1,452716427,992494125,true +2020,2,1,1633479644,519171420,false +2020,2,1,2042936407,1361958858,true +2020,2,2,2064665013,995498485,true +2020,2,2,1787210089,1102532532,true +2020,2,2,831350018,2012285610,true +2020,2,2,2101657798,320137587,true +2020,2,2,1280468898,980957452,true +2020,2,2,385390298,1258814654,false +2020,2,2,1793836216,799896194,false +2020,2,2,1477966450,1876194462,false +2020,2,2,572722248,1085511355,false +2020,2,2,378246095,2103282817,true +2020,2,3,1354708154,859797896,true +2020,2,3,1328675365,636676020,true +2020,2,3,511841126,1607144687,false +2020,2,3,811078689,328275970,true +2020,2,3,221623214,1321570715,true +2020,2,3,1713681225,777044587,true +2020,2,3,1499824212,1388198634,true +2020,2,3,2080611153,1248909605,true +2020,2,3,1525175567,1102154067,false +2020,2,3,902454722,1111058548,false +2020,2,4,1985650317,1605490907,false +2020,2,4,722464639,1201096692,true +2020,2,4,112934236,1749447361,true +2020,2,4,309096716,1701262438,false +2020,2,4,1903099004,75897022,true +2020,2,4,831560617,419788333,true +2020,2,4,306226818,290098531,false +2020,2,4,325088406,1674237675,true +2020,2,4,539896972,1029863542,false +2020,2,4,668572965,850416911,true +2020,2,5,1407481990,1112657765,false +2020,2,5,628531551,1403538986,false +2020,2,5,1772976234,383215912,false +2020,2,5,2097371233,601936229,false +2020,2,5,284181111,2124605983,false +2020,2,5,1872072910,1170866548,false +2020,2,5,1528230557,1944621018,true +2020,2,5,35991770,671281619,false +2020,2,5,1177395091,131196364,false +2020,2,5,1924756694,862244146,false +2020,2,6,187273802,211936077,true +2020,2,6,187261018,1736991584,false +2020,2,6,1120883897,184594567,false +2020,2,6,1837405515,202246927,true +2020,2,6,1895056704,2030379800,false +2020,2,6,190441846,1410801733,true +2020,2,6,1184346639,629619852,true +2020,2,6,1558318719,840360546,false +2020,2,6,250487883,1734632454,false +2020,2,6,1015005511,1369251061,true +2020,2,7,1949107444,517081516,false +2020,2,7,673910048,821980382,false +2020,2,7,1840657847,1776397745,false +2020,2,7,786599015,172894061,false +2020,2,7,1534273065,2127509009,true +2020,2,7,218760022,213285757,false +2020,2,7,393832995,369812831,false +2020,2,7,434670384,1078551742,false +2020,2,7,1909682235,1065841204,false +2020,2,7,524194585,23922602,true +2020,2,8,8241032,1084523037,false +2020,2,8,130257201,2062031064,false +2020,2,8,400206388,281113628,true +2020,2,8,1336420816,1185848494,false +2020,2,8,1625142922,669828634,false +2020,2,8,2109809649,164078745,true +2020,2,8,1891545730,1829488089,true +2020,2,8,232556848,239355229,false 
+2020,2,8,2035779070,1429502278,true +2020,2,8,632888552,1790764566,false +2020,2,9,1098118696,1669466945,false +2020,2,9,357218643,996788737,false +2020,2,9,2138552507,1423532332,true +2020,2,9,787966651,1011312425,true +2020,2,9,1178402851,1616386793,true +2020,2,9,734142487,840911495,true +2020,2,9,379003302,1186591244,false +2020,2,9,756731319,168756796,true +2020,2,9,927771935,719750758,false +2020,2,9,409516474,749437160,false +2020,2,10,1365266937,1650521405,false +2020,2,10,352037776,381093204,false +2020,2,10,1668140892,568421830,false +2020,2,10,2029694644,388118491,true +2020,2,10,1830354387,1051402433,true +2020,2,10,570644058,1679666017,true +2020,2,10,2062110427,818657865,true +2020,2,10,618185428,1781318074,false +2020,2,10,1795160838,1573379114,false +2020,2,10,1286431433,1664607173,false +2020,2,11,156635597,1675500645,false +2020,2,11,1261590942,262699953,false +2020,2,11,653313565,1232187053,false +2020,2,11,209316261,358385698,true +2020,2,11,719075645,347760889,false +2020,2,11,1891306468,1093773613,false +2020,2,11,1131002284,156373636,false +2020,2,11,1067503761,1547911273,false +2020,2,11,1157594620,966865360,false +2020,2,11,1662716579,312423300,false +2020,2,12,474637276,1140304265,false +2020,2,12,949712382,41362720,true +2020,2,12,1148893645,364426586,true +2020,2,12,1642384160,219027938,true +2020,2,12,962831926,987041552,false +2020,2,12,1885035625,499486708,true +2020,2,12,1968761042,713654694,true +2020,2,12,1469167684,1117887860,true +2020,2,12,1787005871,413778693,false +2020,2,12,587313373,780519039,true +2020,2,13,545664823,2087085731,true +2020,2,13,2115691828,63927767,false +2020,2,13,910360821,1086188103,true +2020,2,13,2003093669,1404463963,true +2020,2,13,1724755741,1010996120,true +2020,2,13,594672949,355310019,true +2020,2,13,633592569,968679492,false +2020,2,13,652603984,186245479,true +2020,2,13,1423915721,1421999073,false +2020,2,13,531229887,880895988,false +2020,2,14,939644862,838456790,false +2020,2,14,776680939,1609764185,false +2020,2,14,1925852942,1700093676,true +2020,2,14,440795103,1971731546,false +2020,2,14,295273824,1908734301,false +2020,2,14,1086185595,822616453,true +2020,2,14,1939407352,2016710516,false +2020,2,14,1529442770,516144513,false +2020,2,14,1748253544,1416553501,false +2020,2,14,1056437945,291594220,true +2020,2,15,195252667,1715918332,true +2020,2,15,2000726495,488271427,true +2020,2,15,1969263183,1189824730,false +2020,2,15,760212242,1561722505,false +2020,2,15,100827099,1908574862,true +2020,2,15,1316528502,1437028268,false +2020,2,15,432211281,83042226,false +2020,2,15,1056125662,1050778072,true +2020,2,15,798434460,1287799222,true +2020,2,15,556806023,1367711534,false +2020,2,16,561687156,1213319994,false +2020,2,16,1518180063,1341186471,true +2020,2,16,544990893,125774010,true +2020,2,16,1413625125,1107819782,true +2020,2,16,328346840,1123021533,true +2020,2,16,661936893,651455521,false +2020,2,16,1742504720,1636658459,true +2020,2,16,492234587,1182828270,false +2020,2,16,312309057,812781169,true +2020,2,16,71407937,1293212440,true +2020,2,17,36573517,765512529,true +2020,2,17,1385542221,1389998853,true +2020,2,17,1920511202,824172315,false +2020,2,17,1549626384,1664860051,true +2020,2,17,989319128,2000986306,false +2020,2,17,39350675,139982587,true +2020,2,17,1354597996,1384468837,false +2020,2,17,1973404585,1753802648,true +2020,2,17,1643775167,1490296068,false +2020,2,17,241433043,1008633729,false +2020,2,18,691009879,68453533,true +2020,2,18,226427617,1958478396,false +2020,2,18,1405795992,2099015253,false 
+2020,2,18,195637149,297778161,true +2020,2,18,794549104,1279521649,false +2020,2,18,981778205,529974760,true +2020,2,18,1345749418,246154583,true +2020,2,18,418195340,1333518905,false +2020,2,18,958465188,1987246537,false +2020,2,18,247676816,277087469,true +2020,2,19,286600539,1579633680,false +2020,2,19,892210074,102207245,false +2020,2,19,1966860556,1888105444,true +2020,2,19,483628983,1558469708,false +2020,2,19,1930536410,877423650,false +2020,2,19,258851267,2074392141,true +2020,2,19,724882745,1913942710,false +2020,2,19,1293777170,1063348415,true +2020,2,19,995954190,316473415,true +2020,2,19,100306910,2004359093,false +2020,2,20,700987585,1387282817,false +2020,2,20,203364388,76852748,true +2020,2,20,1504484798,1957547280,false +2020,2,20,10041528,437177284,false +2020,2,20,336505611,746141874,true +2020,2,20,1197260974,730307722,false +2020,2,20,1098080258,1608637663,false +2020,2,20,1750814323,639499502,false +2020,2,20,1454085719,1168349694,false +2020,2,20,214758696,2035704052,false +2020,2,21,809985513,689412898,false +2020,2,21,1588573725,1688803589,false +2020,2,21,190166289,2084303293,false +2020,2,21,1799304576,416246382,true +2020,2,21,369899884,172739282,false +2020,2,21,479893652,1158219493,true +2020,2,21,1952763223,84916615,true +2020,2,21,10537239,931517142,false +2020,2,21,1175624851,2007987243,true +2020,2,21,1883061402,931736672,false +2020,2,22,1911838408,745748493,false +2020,2,22,1751394296,2075465037,false +2020,2,22,455074022,1323509166,false +2020,2,22,2066581044,98914409,false +2020,2,22,2027615888,438573626,false +2020,2,22,132964187,667467525,true +2020,2,22,326984471,1564157012,true +2020,2,22,886052680,616932740,false +2020,2,22,796065201,170885514,false +2020,2,22,1945832794,1939539532,true +2020,2,23,1913321197,656238356,true +2020,2,23,1513573751,2090109038,false +2020,2,23,389567892,1623557034,true +2020,2,23,1608962035,650736602,true +2020,2,23,1175801716,276003224,true +2020,2,23,775199031,1525923593,true +2020,2,23,852788944,1468225362,true +2020,2,23,424493886,353831511,false +2020,2,23,161722211,1540334139,false +2020,2,23,26274357,1357726602,false +2020,2,24,814859834,1563994719,false +2020,2,24,2145250031,1945606193,true +2020,2,24,1162767822,512091974,false +2020,2,24,1912876266,1895688166,true +2020,2,24,2140812333,918805812,false +2020,2,24,209756943,1901076380,true +2020,2,24,1572477709,316067692,false +2020,2,24,667096445,864443769,true +2020,2,24,1025554971,1435770707,true +2020,2,24,1280406519,1165263082,false +2020,2,25,701059564,1235417476,false +2020,2,25,1385542616,1414098169,true +2020,2,25,1674916437,2146412855,false +2020,2,25,103867790,531813133,false +2020,2,25,230261013,1781786502,true +2020,2,25,2040565005,547646107,false +2020,2,25,370525719,1569165172,true +2020,2,25,69512644,596711018,false +2020,2,25,1055733946,330799304,true +2020,2,25,1369257451,1482764932,false +2020,2,26,543927768,1231678226,false +2020,2,26,2008929870,1065496033,true +2020,2,26,979421319,1348063097,false +2020,2,26,762227421,715078002,true +2020,2,26,4833232,910578470,true +2020,2,26,292468132,1786989082,true +2020,2,26,1558985159,2024742239,false +2020,2,26,1778966787,1100709291,false +2020,2,26,1358395594,1014490035,true +2020,2,26,1768707896,1863004946,true +2020,2,27,2093478357,514528205,true +2020,2,27,432688275,1263887399,true +2020,2,27,676555564,262360512,false +2020,2,27,614739704,1698241510,false +2020,2,27,931629254,752628456,true +2020,2,27,300505202,376491514,true +2020,2,27,1021591620,355380410,false 
+2020,2,27,213398534,377547998,true +2020,2,27,1126735957,140866587,true +2020,2,27,934742798,444854788,false +2020,2,28,1757919189,1521229069,true +2020,2,28,1981333175,1617899453,false +2020,2,28,606717711,1030130648,false +2020,2,28,544429183,226084572,true +2020,2,28,1874022882,1824477181,true +2020,2,28,2121582026,482806344,true +2020,2,28,1914680256,897138321,true +2020,2,28,1585889758,1024985337,false +2020,2,28,866815674,2035390666,true +2020,2,28,924210922,1500574485,false +2020,2,29,1479370013,836461814,true +2020,2,29,124293605,273295651,true +2020,2,29,42660165,609169157,true +2020,2,29,1570399846,749800598,true +2020,2,29,1398665650,1577745264,false +2020,2,29,1736926920,1875947290,false +2020,2,29,1347825549,703308333,false +2020,2,29,554193456,136860261,true +2020,2,29,912593935,1195301086,false +2020,2,29,175509843,94297220,false +2020,2,30,1702313360,1976220301,true +2020,2,30,1021228051,427558098,false +2020,2,30,486391599,1889134735,false +2020,2,30,1051013445,902208541,false +2020,2,30,1331633733,413549402,true +2020,2,30,779640633,2019753643,true +2020,2,30,1899775785,238814080,false +2020,2,30,1381345004,1737820442,false +2020,2,30,855811821,1465940488,true +2020,2,30,242590941,831054155,false +2020,2,31,921954843,974374103,false +2020,2,31,879694337,1907366622,false +2020,2,31,507158684,69607149,false +2020,2,31,1591426125,846414566,true +2020,2,31,218482003,525464992,true +2020,2,31,1690360262,350240993,false +2020,2,31,755758088,2106745626,false +2020,2,31,2056874531,1721584404,false +2020,2,31,1448182463,742550470,false +2020,2,31,757824094,116849373,true +2020,3,1,78498705,1576152949,true +2020,3,1,1936409285,395471260,true +2020,3,1,743328406,958101589,false +2020,3,1,763564909,1312143674,true +2020,3,1,1845483408,1594589467,false +2020,3,1,448209946,590418818,true +2020,3,1,273466696,1013104879,false +2020,3,1,132871153,234262972,true +2020,3,1,239804645,1114691317,false +2020,3,1,1485750926,313176659,false +2020,3,2,105462180,1980735628,true +2020,3,2,1614356319,1801622636,true +2020,3,2,500962747,749202585,true +2020,3,2,1415472457,1821258160,true +2020,3,2,692523849,1759013751,true +2020,3,2,1338640385,754437316,true +2020,3,2,1921301663,260456036,true +2020,3,2,1820863915,309253913,false +2020,3,2,1974295887,1482240645,true +2020,3,2,227277456,96680734,true +2020,3,3,379730369,928846549,false +2020,3,3,139560460,298405884,true +2020,3,3,507903286,1111482775,true +2020,3,3,2026239757,1170297439,false +2020,3,3,1981354633,1143080236,true +2020,3,3,158208946,1843732990,false +2020,3,3,882358799,866309459,false +2020,3,3,1375007461,538035680,true +2020,3,3,1478482782,1022381379,false +2020,3,3,2069153811,534309257,true +2020,3,4,966089705,1499218975,true +2020,3,4,283361350,1574024093,true +2020,3,4,14695347,776618156,true +2020,3,4,631827788,1489769111,true +2020,3,4,407531391,2113279582,true +2020,3,4,123279035,901871265,false +2020,3,4,1999795282,195676429,true +2020,3,4,1201293794,592508861,false +2020,3,4,1837660704,815729135,false +2020,3,4,1837496591,1973005746,true +2020,3,5,994038765,1081199544,true +2020,3,5,57980856,1769902971,false +2020,3,5,1977461214,1242250467,true +2020,3,5,1577710688,375210553,false +2020,3,5,1491912163,1550958222,false +2020,3,5,1189661264,1753821123,false +2020,3,5,848317947,1162312574,false +2020,3,5,1948906625,1766620020,false +2020,3,5,1993041743,841475606,false +2020,3,5,1160011846,1715322824,false +2020,3,6,1194894225,514321776,false +2020,3,6,888777110,60778975,false +2020,3,6,105107032,705034457,false 
+2020,3,6,1054555146,1798557705,true +2020,3,6,1075626624,1901592149,true +2020,3,6,754367211,1948006111,true +2020,3,6,1842835380,1399792641,false +2020,3,6,1361379294,470169757,false +2020,3,6,770225821,132816235,false +2020,3,6,1949319957,859317425,true +2020,3,7,959008381,724938034,true +2020,3,7,701966608,2125135296,false +2020,3,7,704581913,1289695378,false +2020,3,7,1422676300,83380883,true +2020,3,7,100983352,894021032,false +2020,3,7,1040115321,1700203167,false +2020,3,7,262962735,547411594,true +2020,3,7,265317384,393828657,true +2020,3,7,1882840867,1834066578,false +2020,3,7,1083696693,1752871521,false +2020,3,8,910371602,1381160040,true +2020,3,8,599871706,1109652306,true +2020,3,8,973819071,2101923501,false +2020,3,8,64124580,1133519734,false +2020,3,8,1053362198,1795574361,true +2020,3,8,892956045,744318660,true +2020,3,8,1516541599,1530833189,true +2020,3,8,1768724129,787298596,true +2020,3,8,565588765,1593478043,false +2020,3,8,1950142077,1597690245,false +2020,3,9,524780400,1697502633,false +2020,3,9,1846464770,30299983,true +2020,3,9,1853822533,562738617,false +2020,3,9,1997831735,2023685939,true +2020,3,9,1211252297,1155031501,true +2020,3,9,1804009297,1033911433,true +2020,3,9,59770576,900227847,true +2020,3,9,1713017426,1199147702,false +2020,3,9,601153076,183124900,true +2020,3,9,1029722057,1147860570,true +2020,3,10,439775316,830248030,false +2020,3,10,408641933,2094534700,false +2020,3,10,76683260,965891486,false +2020,3,10,905521811,979919825,true +2020,3,10,318101540,1866280472,true +2020,3,10,8888106,341867285,true +2020,3,10,574031417,1351437215,true +2020,3,10,387240514,746910283,true +2020,3,10,70599934,1387701327,false +2020,3,10,1504073610,1826471104,true +2020,3,11,55718810,1082482018,true +2020,3,11,859822988,1963556995,true +2020,3,11,994825974,1102492241,false +2020,3,11,881134090,1460041462,false +2020,3,11,1071272819,518824461,false +2020,3,11,1148442699,1396282332,false +2020,3,11,117021602,682506599,true +2020,3,11,1733254174,1238329061,true +2020,3,11,933844104,660035861,false +2020,3,11,58184116,1149848554,false +2020,3,12,780921479,1002808008,true +2020,3,12,2034180523,1622252623,false +2020,3,12,1005632024,1763089491,false +2020,3,12,1640525271,1416806471,true +2020,3,12,579524036,176185560,true +2020,3,12,976795306,1166985460,false +2020,3,12,561538860,1219180392,true +2020,3,12,278649943,1696773268,false +2020,3,12,2132245246,2714534,true +2020,3,12,178710010,1972122684,false +2020,3,13,1550927905,1373347918,false +2020,3,13,1914365546,1836109518,false +2020,3,13,1968783500,2027672741,false +2020,3,13,1749342597,1117200169,false +2020,3,13,875136628,1241992511,true +2020,3,13,1097448980,35634131,false +2020,3,13,1018841827,827313577,false +2020,3,13,435933079,713794685,true +2020,3,13,121647822,1341973452,true +2020,3,13,1507354784,1467346484,false +2020,3,14,780252740,1776664357,false +2020,3,14,80580988,1164534056,true +2020,3,14,642945360,1751121205,true +2020,3,14,416014431,119530302,true +2020,3,14,450125988,452348324,false +2020,3,14,2134107992,501182709,false +2020,3,14,1304785953,550403087,true +2020,3,14,132532645,54073923,false +2020,3,14,2106281947,1819758213,false +2020,3,14,1286112587,1191414644,true +2020,3,15,2105626677,2041884438,false +2020,3,15,1357240065,479275566,false +2020,3,15,1205927953,1019960315,true +2020,3,15,829435333,1065241898,false +2020,3,15,1098381407,1146797391,true +2020,3,15,299841724,772186439,false +2020,3,15,301278823,894572613,false +2020,3,15,1820788007,169408760,true 
+2020,3,15,1171729050,1366103906,true +2020,3,15,2049029013,1398905667,true +2020,3,16,1490072546,642615338,false +2020,3,16,2079019676,2108112330,true +2020,3,16,1344494484,559213110,false +2020,3,16,1031289531,1974745488,false +2020,3,16,1663706268,1915309240,true +2020,3,16,1002383926,1593953409,true +2020,3,16,1499098237,879021774,false +2020,3,16,1474046218,2083482079,false +2020,3,16,2095867694,426128445,true +2020,3,16,202423372,1600965154,true +2020,3,17,653145870,1248408420,true +2020,3,17,1582003595,349087474,false +2020,3,17,934340696,1067301510,true +2020,3,17,1221346835,613902911,true +2020,3,17,1730354141,13842884,false +2020,3,17,495812390,1987944360,false +2020,3,17,423071476,645813335,true +2020,3,17,346677115,1705000358,true +2020,3,17,862387936,834143979,false +2020,3,17,344378282,1286008902,true +2020,3,18,950907930,1485480491,false +2020,3,18,269787725,1316225354,false +2020,3,18,248077887,494780384,false +2020,3,18,1351439871,1823693320,true +2020,3,18,1820812180,1779099182,true +2020,3,18,1498408579,1890777455,true +2020,3,18,1243443038,1653256146,false +2020,3,18,1446791082,925181160,true +2020,3,18,1292660868,104946039,false +2020,3,18,683839463,1263215897,true +2020,3,19,1622896171,719475571,true +2020,3,19,1808742786,1911265198,true +2020,3,19,2009410132,1087323275,true +2020,3,19,76523758,1751373238,true +2020,3,19,1116637861,479151508,true +2020,3,19,1112692658,1183232569,false +2020,3,19,2112404958,43106063,false +2020,3,19,515819223,1347516170,true +2020,3,19,1751195822,1346914562,true +2020,3,19,1949697263,1323013644,false +2020,3,20,403803869,760421068,true +2020,3,20,846066432,2018415452,false +2020,3,20,2020156946,1730800545,true +2020,3,20,156358906,493766940,true +2020,3,20,1141511572,1572699624,true +2020,3,20,337253769,1357471711,true +2020,3,20,288553688,1441103558,false +2020,3,20,1968969305,2069216985,true +2020,3,20,180292798,1179295843,false +2020,3,20,406461740,1655922019,true +2020,3,21,443615176,1722376409,true +2020,3,21,1032844113,636179323,true +2020,3,21,1418329013,330571985,false +2020,3,21,527720539,1141561661,false +2020,3,21,1009440573,210097196,true +2020,3,21,1609454120,1244603791,true +2020,3,21,1968793830,1525548119,false +2020,3,21,702346139,822893172,true +2020,3,21,2139526189,1751965462,true +2020,3,21,426886297,1650391490,true +2020,3,22,1358577781,1295381992,true +2020,3,22,808382753,1260144429,true +2020,3,22,926161782,1247998292,false +2020,3,22,2079848088,764253420,false +2020,3,22,103824172,125771968,true +2020,3,22,202501547,1749503577,true +2020,3,22,1654197099,1044990625,true +2020,3,22,1070062664,1600436320,false +2020,3,22,631329194,1388036641,false +2020,3,22,209196704,1479696798,true +2020,3,23,2088861088,68837012,false +2020,3,23,314392377,837037171,true +2020,3,23,1411024841,964349517,false +2020,3,23,171488619,1555647363,false +2020,3,23,1720276722,482062777,false +2020,3,23,662669031,1914996043,false +2020,3,23,1587715152,2017543365,false +2020,3,23,1082004501,334310879,false +2020,3,23,737040857,1072688116,true +2020,3,23,486725743,1579453930,true +2020,3,24,1315530970,540282816,false +2020,3,24,632094498,1344435331,false +2020,3,24,457402098,1119296530,true +2020,3,24,1227352468,179602811,true +2020,3,24,1395195398,1112574702,false +2020,3,24,290939053,2103969831,true +2020,3,24,349280200,1355442353,false +2020,3,24,601891570,267033308,true +2020,3,24,1346357216,1003949423,true +2020,3,24,735228103,710397330,true +2020,3,25,1838424035,1025223250,true +2020,3,25,1107881345,1450201708,true 
+2020,3,25,870925968,904663822,true +2020,3,25,963524365,284343566,false +2020,3,25,1659379732,1838721724,false +2020,3,25,460998477,715702806,false +2020,3,25,1153771330,705814722,false +2020,3,25,256137608,1533615628,true +2020,3,25,1937889279,455411092,true +2020,3,25,1976171380,980144030,true +2020,3,26,1351797078,775274599,false +2020,3,26,1602057418,747221779,true +2020,3,26,356311530,1494017594,true +2020,3,26,1346771297,152755500,false +2020,3,26,1005157320,1960914874,true +2020,3,26,131201350,1280893422,true +2020,3,26,1731028165,1683889895,true +2020,3,26,1155958600,1382518806,false +2020,3,26,1174666978,871407251,false +2020,3,26,1550633455,1799562263,false +2020,3,27,1379166329,1692161284,true +2020,3,27,145673185,277192179,true +2020,3,27,1504712224,87988326,true +2020,3,27,1010702357,834423646,false +2020,3,27,1518450806,333987533,false +2020,3,27,862874473,485280733,true +2020,3,27,1029256804,742094076,false +2020,3,27,2140813436,1260873438,true +2020,3,27,1881276536,720449063,true +2020,3,27,1044345814,227052550,false +2020,3,28,741842011,1890511981,false +2020,3,28,1877421502,433673250,false +2020,3,28,1043861401,1210279103,false +2020,3,28,1490031706,1366746494,true +2020,3,28,1915442991,1159866978,true +2020,3,28,341763043,589973651,false +2020,3,28,638348700,1581051235,false +2020,3,28,1252875291,579375683,false +2020,3,28,787813230,636370058,false +2020,3,28,1573607743,773381925,true +2020,3,29,1056353913,956692133,false +2020,3,29,648707120,164659083,false +2020,3,29,1553248827,1340340314,false +2020,3,29,1994207372,960198211,true +2020,3,29,1283993637,337177262,true +2020,3,29,375466207,133766222,false +2020,3,29,332921012,37205001,false +2020,3,29,294938696,1440934391,false +2020,3,29,999304867,1655816288,true +2020,3,29,31782422,1812603414,false +2020,3,30,433185699,973893322,false +2020,3,30,157692752,715388510,true +2020,3,30,1657625168,1583563184,false +2020,3,30,1063045928,789169163,true +2020,3,30,1431137128,367477836,true +2020,3,30,781856445,926600780,false +2020,3,30,799628395,622084931,false +2020,3,30,1317718473,1588621370,false +2020,3,30,76858406,264062833,false +2020,3,30,1459908863,1395044924,false +2020,3,31,812578276,2110130503,true +2020,3,31,1352418380,2139000614,true +2020,3,31,374892077,1416831051,false +2020,3,31,738799161,2008126330,true +2020,3,31,1582015590,1688229267,false +2020,3,31,1814601499,86502801,true +2020,3,31,1549757097,1657211228,true +2020,3,31,586804700,1905921940,true +2020,3,31,1482291623,1524030210,true +2020,3,31,2068789624,1773714572,false +2020,4,1,1882523785,419104781,true +2020,4,1,479956359,511987710,false +2020,4,1,1344934128,2070876685,true +2020,4,1,900315506,1314149310,false +2020,4,1,1635597745,460225648,false +2020,4,1,272600350,1928370692,false +2020,4,1,100781800,1409067688,false +2020,4,1,1418286891,1580605937,false +2020,4,1,2134700131,1307272242,true +2020,4,1,529686569,251435766,false +2020,4,2,1503582117,89528158,false +2020,4,2,1693647689,1139898339,false +2020,4,2,1768953067,419228900,false +2020,4,2,1867759727,250002574,true +2020,4,2,165701365,1162472880,true +2020,4,2,1665631417,1812797771,true +2020,4,2,1258464139,594686316,false +2020,4,2,400375251,1080626597,true +2020,4,2,1420261098,1153505194,true +2020,4,2,1069148588,88326185,false +2020,4,3,635923110,561594053,true +2020,4,3,468092783,1003195159,false +2020,4,3,1051713700,2126058184,false +2020,4,3,1273489355,1344994899,true +2020,4,3,131729712,365149183,true +2020,4,3,1033775599,524539124,false +2020,4,3,191193955,157071368,true 
+2020,4,3,810878909,1812343101,true +2020,4,3,647470421,1898434931,false +2020,4,3,657506095,1897569257,false +2020,4,4,1828961741,611928400,false +2020,4,4,527611868,526072841,false +2020,4,4,709096297,292367174,false +2020,4,4,225954584,1633506029,false +2020,4,4,2011776380,618708995,true +2020,4,4,1652859444,1186716623,true +2020,4,4,75972880,1135555361,false +2020,4,4,1459929528,987558192,true +2020,4,4,388475084,876189918,false +2020,4,4,1179771324,1834805806,false +2020,4,5,1458747567,1228070191,true +2020,4,5,5501601,220254273,false +2020,4,5,2119091939,361208088,true +2020,4,5,695268056,1472942612,true +2020,4,5,598245650,1912579160,false +2020,4,5,1155945860,1107751550,true +2020,4,5,594135553,564152507,true +2020,4,5,659913487,1670183626,true +2020,4,5,1683873119,1312330172,false +2020,4,5,1435530254,38354369,true +2020,4,6,510586890,553721405,false +2020,4,6,1499218953,1695092035,true +2020,4,6,232034619,962292594,false +2020,4,6,599256322,675903049,false +2020,4,6,1060547921,727488354,true +2020,4,6,220399053,569046950,true +2020,4,6,1696255291,812738178,false +2020,4,6,281037976,2023491047,false +2020,4,6,1252988328,636402518,true +2020,4,6,1556536376,395827993,true +2020,4,7,410988429,1897789473,false +2020,4,7,1707816469,1745607530,true +2020,4,7,296531877,830521862,false +2020,4,7,1701675969,92164049,true +2020,4,7,208708839,1519755820,false +2020,4,7,460266155,662695500,false +2020,4,7,431117878,1832311999,false +2020,4,7,570603884,944024407,false +2020,4,7,1287140785,1579144935,false +2020,4,7,677575059,1401238174,false +2020,4,8,1754082408,1964137482,false +2020,4,8,1184127500,123876637,false +2020,4,8,819792751,35409911,true +2020,4,8,1117054313,528877873,true +2020,4,8,1760798580,1069395655,true +2020,4,8,192224445,2026108225,false +2020,4,8,1733376625,1943368861,true +2020,4,8,1827949256,1344960518,true +2020,4,8,1991295756,1073673865,false +2020,4,8,830932129,1476266401,true +2020,4,9,981474237,1626778421,false +2020,4,9,111686694,1295693561,true +2020,4,9,1049523172,2018585905,true +2020,4,9,1375987162,1987489055,true +2020,4,9,1027425085,747024879,false +2020,4,9,1135937266,99081385,true +2020,4,9,2137377923,65214414,false +2020,4,9,1191942600,573594934,false +2020,4,9,1839554085,67853004,false +2020,4,9,1820095980,584365994,false +2020,4,10,1220266060,886473395,false +2020,4,10,1181531780,2006763966,false +2020,4,10,1585481128,859653852,false +2020,4,10,507880645,1809522671,true +2020,4,10,1717430472,89227994,true +2020,4,10,549854621,41057020,true +2020,4,10,1004523996,1185905846,false +2020,4,10,744143095,77089906,true +2020,4,10,1103761718,755415161,false +2020,4,10,327423507,270227950,true +2020,4,11,2017868010,1665660472,false +2020,4,11,209998822,1719990566,false +2020,4,11,269569109,680054499,true +2020,4,11,798926225,909805269,true +2020,4,11,409929338,412844417,true +2020,4,11,1141027384,1241803038,false +2020,4,11,1066756209,273200546,false +2020,4,11,278523118,1597778435,true +2020,4,11,888305502,577802540,false +2020,4,11,685206642,229085438,true +2020,4,12,1253938604,1808092882,true +2020,4,12,1784957926,1478634215,false +2020,4,12,920864126,1763301327,true +2020,4,12,260183252,1948251555,false +2020,4,12,231109263,527863488,false +2020,4,12,1484943381,1947007318,false +2020,4,12,793128964,1413663411,true +2020,4,12,1954360950,709364495,false +2020,4,12,476459730,178079938,true +2020,4,12,952824629,630707484,true +2020,4,13,490130376,124610517,false +2020,4,13,605043108,528293783,false +2020,4,13,576572519,43787609,true 
+2020,4,13,922772633,440680388,true +2020,4,13,2079008766,369220381,true +2020,4,13,1471074679,260644688,false +2020,4,13,1316771533,756383110,true +2020,4,13,1612831210,1260422308,false +2020,4,13,215094046,1974329053,false +2020,4,13,928782004,919124919,false +2020,4,14,72485011,503125866,false +2020,4,14,328164128,1574011691,false +2020,4,14,1392380949,472576909,true +2020,4,14,1688527850,458539935,false +2020,4,14,1679064875,1683750946,true +2020,4,14,1214315972,208676719,true +2020,4,14,2016488362,278383302,true +2020,4,14,1694140103,16496481,false +2020,4,14,474365169,1693449953,false +2020,4,14,1305207652,2004827378,false +2020,4,15,23892531,1805812093,true +2020,4,15,1607137978,1846207807,true +2020,4,15,1406110076,240606542,true +2020,4,15,963333975,1306429365,false +2020,4,15,85992825,202278273,true +2020,4,15,585294031,1417976344,false +2020,4,15,1604263972,1549599009,false +2020,4,15,83302323,290213760,false +2020,4,15,4041448,1105871415,false +2020,4,15,1789964487,1118515795,true +2020,4,16,1658673364,142086905,false +2020,4,16,832812428,218024304,true +2020,4,16,2110032379,424538260,true +2020,4,16,757587831,2130646789,false +2020,4,16,377272082,703442560,true +2020,4,16,22321263,1700989890,true +2020,4,16,1578746599,1358833244,false +2020,4,16,42333795,1127215973,false +2020,4,16,1522469709,1635407810,false +2020,4,16,808640798,912226270,false +2020,4,17,216655239,420085913,false +2020,4,17,542824603,1330622067,true +2020,4,17,626954854,809251273,false +2020,4,17,1801445261,651413159,false +2020,4,17,229184648,2092303710,false +2020,4,17,1311813845,1058010339,true +2020,4,17,128892801,2030449278,true +2020,4,17,1050406940,2126062338,true +2020,4,17,877093078,16390334,false +2020,4,17,1632060620,641294160,false +2020,4,18,1131977741,1401933756,false +2020,4,18,706188311,1735982757,true +2020,4,18,698222918,147464456,false +2020,4,18,2068425177,1277001887,false +2020,4,18,2043865618,861659574,false +2020,4,18,780001644,1980861730,true +2020,4,18,364145285,428750404,false +2020,4,18,1341429455,2143512753,false +2020,4,18,1593734331,368359052,true +2020,4,18,651994540,1252725044,true +2020,4,19,1434561345,534280435,true +2020,4,19,613338908,339240781,true +2020,4,19,1738750756,377131816,true +2020,4,19,1637865541,1939433545,true +2020,4,19,1786933398,1842080771,false +2020,4,19,454845213,235215692,true +2020,4,19,552296805,1153244510,false +2020,4,19,1386428413,1219067984,true +2020,4,19,278406843,1292430864,true +2020,4,19,1872282050,1686706694,false +2020,4,20,1277498335,1299147588,false +2020,4,20,195440562,961192901,false +2020,4,20,1584507374,881929698,true +2020,4,20,1484969827,1009629949,true +2020,4,20,168609990,1958327247,false +2020,4,20,494100048,316655610,true +2020,4,20,376658845,1798190627,true +2020,4,20,987870689,2101007305,false +2020,4,20,1877314544,767220911,false +2020,4,20,772693523,1183664551,false +2020,4,21,1979117526,532343905,true +2020,4,21,1306493441,1188982852,true +2020,4,21,55918813,95904108,true +2020,4,21,511158015,1298908085,true +2020,4,21,1191804657,763469636,false +2020,4,21,1483448838,585814775,false +2020,4,21,2010580743,1912229826,false +2020,4,21,407873693,1097395272,true +2020,4,21,835233660,1525514257,false +2020,4,21,356805005,86788548,true +2020,4,22,1941843051,1494242617,true +2020,4,22,1880585976,1317774347,true +2020,4,22,107334759,1062708063,false +2020,4,22,1821621933,222470869,false +2020,4,22,1501059672,1816402441,true +2020,4,22,2000594647,638100416,false +2020,4,22,1139325809,419226904,true 
+2020,4,22,2133947471,65031620,false +2020,4,22,2117986423,1381917997,false +2020,4,22,1088107836,1759081479,true +2020,4,23,306792402,158026675,false +2020,4,23,467752395,7566604,true +2020,4,23,803963420,2044970438,true +2020,4,23,2014418461,211262318,false +2020,4,23,1688315178,1546831536,false +2020,4,23,1136795588,852204463,false +2020,4,23,1922599372,480034227,true +2020,4,23,1326990946,1903832763,true +2020,4,23,1917947562,1956537428,true +2020,4,23,1037951558,696814377,true +2020,4,24,1642194703,23374189,true +2020,4,24,883129631,1304178550,true +2020,4,24,674006658,720394918,false +2020,4,24,1000572958,440769942,true +2020,4,24,1007757944,155640104,false +2020,4,24,1923274771,933651642,false +2020,4,24,1702607243,55345067,false +2020,4,24,1313111929,935768095,false +2020,4,24,1417950194,991481640,false +2020,4,24,131342177,506960444,false +2020,4,25,536236853,527043806,false +2020,4,25,2025897920,1563052459,false +2020,4,25,826394445,843249505,true +2020,4,25,2097380494,2134012316,false +2020,4,25,102321586,1401866738,true +2020,4,25,1360439475,663667139,false +2020,4,25,196852488,1938435155,true +2020,4,25,170049723,1651519980,true +2020,4,25,347736676,914029260,true +2020,4,25,768833220,837710789,true +2020,4,26,413117793,1386573093,true +2020,4,26,367957540,1140605311,false +2020,4,26,1267547781,235842747,false +2020,4,26,1365526710,1165576781,true +2020,4,26,344157083,1585719551,false +2020,4,26,1340192985,2062300660,false +2020,4,26,1088815125,1859010298,true +2020,4,26,1665593975,2074095329,true +2020,4,26,2104653712,228320173,false +2020,4,26,1325109947,409249410,true +2020,4,27,392625929,1850032744,true +2020,4,27,232069435,446769628,true +2020,4,27,1780897655,1889761137,false +2020,4,27,329307834,587410566,true +2020,4,27,620902066,786173063,false +2020,4,27,2132497625,1105804956,true +2020,4,27,289643510,369979433,false +2020,4,27,1993996109,1429880394,true +2020,4,27,431072107,1917163955,false +2020,4,27,1229017706,894652400,true +2020,4,28,865192044,211434402,true +2020,4,28,1085051954,726388760,false +2020,4,28,1510852743,91789988,true +2020,4,28,615986208,1969854652,false +2020,4,28,1309444472,1555966780,false +2020,4,28,964836199,230418304,false +2020,4,28,244126330,1500498214,false +2020,4,28,283617893,463045871,false +2020,4,28,975234510,546734783,false +2020,4,28,1388321129,1636601062,false +2020,4,29,671703367,770891798,true +2020,4,29,1758822964,872412839,false +2020,4,29,1571378491,369365116,false +2020,4,29,1478522029,1318635343,false +2020,4,29,627525418,1789579937,true +2020,4,29,487498940,659964834,true +2020,4,29,1468525045,393808552,false +2020,4,29,961024092,1479839341,true +2020,4,29,556058502,1308696249,false +2020,4,29,1157481523,517371625,true +2020,4,30,1935013929,190524917,true +2020,4,30,951505491,454287080,true +2020,4,30,1215013393,67327705,true +2020,4,30,1008677368,367502008,true +2020,4,30,1287972799,1087002994,false +2020,4,30,1574193242,2007194863,false +2020,4,30,340096574,687423226,true +2020,4,30,489986540,1246318256,true +2020,4,30,1144790129,1214718885,false +2020,4,30,985000099,1605447936,false +2020,4,31,2029931243,919410901,false +2020,4,31,2072420114,931614272,true +2020,4,31,3940364,161228007,false +2020,4,31,378939782,1405422324,false +2020,4,31,1406944844,18021210,true +2020,4,31,164795750,1576897361,true +2020,4,31,609712901,1177547520,false +2020,4,31,587779196,772962126,true +2020,4,31,207200792,749031793,true +2020,4,31,2095961339,1692242225,false +2020,5,1,90564415,264571395,false +2020,5,1,789182553,684257989,true 
+2020,5,1,272639986,347200496,true +2020,5,1,1856987736,566092738,false +2020,5,1,2100742428,1118314092,false +2020,5,1,257097422,166291472,true +2020,5,1,1101251098,1123275485,false +2020,5,1,429123604,863858404,false +2020,5,1,213129922,1076139128,false +2020,5,1,1244195103,569967601,false +2020,5,2,1594092629,318046693,false +2020,5,2,940522928,1911790112,true +2020,5,2,1549743535,1976178884,false +2020,5,2,895363105,1529187205,true +2020,5,2,1897009434,868809757,true +2020,5,2,551155637,1279249558,true +2020,5,2,1855268480,1339921961,true +2020,5,2,851575671,636985547,true +2020,5,2,2063408894,134923000,true +2020,5,2,1560329487,1825664297,true +2020,5,3,928544510,1356530903,false +2020,5,3,1089840142,1036231026,false +2020,5,3,1911612002,1144421512,true +2020,5,3,575561128,1219734249,false +2020,5,3,1891277419,2008338096,false +2020,5,3,936891862,1407847399,true +2020,5,3,1391587163,520492282,true +2020,5,3,102295113,394697153,true +2020,5,3,1000826891,1738589415,true +2020,5,3,1739447490,1675925151,true +2020,5,4,1269114850,1233184588,true +2020,5,4,1321545133,387542881,false +2020,5,4,688476150,1744954074,false +2020,5,4,1948317955,626353570,true +2020,5,4,522463805,741591376,true +2020,5,4,1857787623,1091562607,true +2020,5,4,289224220,244931116,true +2020,5,4,1196783233,319949401,false +2020,5,4,1409168302,1741166311,false +2020,5,4,256640246,379534826,false +2020,5,5,2111381401,1927324711,true +2020,5,5,1493787546,1219770777,true +2020,5,5,1882145626,1076729974,false +2020,5,5,1180580775,108847838,false +2020,5,5,1579228731,478254161,false +2020,5,5,1494108959,1909875696,true +2020,5,5,57510268,906256557,false +2020,5,5,2080844202,1741972046,true +2020,5,5,929703506,1771087949,false +2020,5,5,114462091,1091214383,false +2020,5,6,2079289034,153681273,false +2020,5,6,2121455608,314498399,true +2020,5,6,442153656,414781684,false +2020,5,6,810300806,1594783801,false +2020,5,6,100319272,1953929286,true +2020,5,6,1615198821,1334430145,false +2020,5,6,1656068896,556871860,true +2020,5,6,1689281434,654235658,true +2020,5,6,1466419852,88072570,true +2020,5,6,283729281,39346275,false +2020,5,7,1533206691,1934723880,true +2020,5,7,1647815988,1754971915,true +2020,5,7,1711895346,1005118255,true +2020,5,7,1952230836,76909656,false +2020,5,7,1778398344,38232569,false +2020,5,7,218172127,272411541,false +2020,5,7,1323145217,2108757808,true +2020,5,7,1651856281,964847108,false +2020,5,7,782306644,785873653,true +2020,5,7,1328689596,727062723,false +2020,5,8,474432572,2103335990,false +2020,5,8,1762746779,1395623563,false +2020,5,8,1137974803,1841089389,true +2020,5,8,2063430725,573118077,true +2020,5,8,664700269,1884726884,false +2020,5,8,1929760462,2040381307,false +2020,5,8,1656447973,656540884,true +2020,5,8,494569027,1920476098,true +2020,5,8,1448602767,1393526528,false +2020,5,8,1269463824,719127225,true +2020,5,9,1833467340,558120598,false +2020,5,9,2021703692,757905206,false +2020,5,9,1227360123,1909450612,false +2020,5,9,371046497,419858089,false +2020,5,9,1069497012,1538799926,true +2020,5,9,19840258,1355358700,false +2020,5,9,1679122069,80843653,false +2020,5,9,753685888,1847685794,true +2020,5,9,280791492,475199147,false +2020,5,9,331641162,1578809461,true +2020,5,10,363184823,807235283,true +2020,5,10,1811572166,1855954833,true +2020,5,10,644364517,1469686780,true +2020,5,10,1932137161,709717545,false +2020,5,10,1903818590,994472337,true +2020,5,10,1351893575,896153836,false +2020,5,10,1047276944,1444679706,false +2020,5,10,1714385252,461000373,true 
+2020,5,10,469566740,313313414,true +2020,5,10,1270251559,1219250650,false +2020,5,11,1594622940,2071473149,true +2020,5,11,897358456,961446374,false +2020,5,11,25870366,779334723,true +2020,5,11,2115950650,686523041,false +2020,5,11,2021932999,1547456515,false +2020,5,11,781575063,79218100,true +2020,5,11,1205555046,34735014,true +2020,5,11,1037515743,617464400,true +2020,5,11,149978308,313606790,false +2020,5,11,2038085766,99060370,false +2020,5,12,81248171,631129488,true +2020,5,12,1458017055,526659757,true +2020,5,12,144459655,822361020,false +2020,5,12,847759476,1099570678,true +2020,5,12,1182658279,653069795,true +2020,5,12,233955329,1429927009,true +2020,5,12,454200845,915694291,true +2020,5,12,681733756,192646948,false +2020,5,12,1684560897,1602808255,false +2020,5,12,1492108081,1375654243,true +2020,5,13,549179887,2005827661,true +2020,5,13,59535709,1610496575,false +2020,5,13,19441332,1824055037,true +2020,5,13,1498914257,1300773794,true +2020,5,13,1067682024,382001561,true +2020,5,13,919893624,1673209234,true +2020,5,13,1881787328,1936954092,true +2020,5,13,1893387422,801483255,false +2020,5,13,777082788,722512372,true +2020,5,13,644400139,1745919259,true +2020,5,14,1350658383,1250063434,false +2020,5,14,1475721288,2006267210,true +2020,5,14,63115979,1220873343,false +2020,5,14,27209389,1126152281,false +2020,5,14,1312135082,1819903110,false +2020,5,14,820885259,761253891,true +2020,5,14,29405774,1175727008,true +2020,5,14,869158410,196654679,true +2020,5,14,1715937821,1086692406,false +2020,5,14,1809316754,796303795,false +2020,5,15,1564189425,1140365726,false +2020,5,15,362370271,1593990137,true +2020,5,15,1317415976,1244415699,true +2020,5,15,1847594253,1477268228,true +2020,5,15,24060442,1569044088,true +2020,5,15,1527851065,1925234823,false +2020,5,15,1561956072,552440252,false +2020,5,15,1257997457,1534219742,false +2020,5,15,1749520345,715941638,true +2020,5,15,1074434538,2041853048,true +2020,5,16,1693416132,847866474,false +2020,5,16,663489456,1964537307,false +2020,5,16,192078648,1254678867,true +2020,5,16,952281676,300494464,true +2020,5,16,1749559644,272066544,true +2020,5,16,1449121881,56835310,false +2020,5,16,2007280836,1945209402,true +2020,5,16,408933690,1295936221,true +2020,5,16,24226734,2057903030,true +2020,5,16,936124914,288487503,false +2020,5,17,1833718620,255729727,false +2020,5,17,877588044,1159637214,false +2020,5,17,1071521931,1291076982,true +2020,5,17,1753500596,1360751018,true +2020,5,17,561805114,1968811952,true +2020,5,17,707362898,1638904500,false +2020,5,17,222027578,1019446903,true +2020,5,17,1662238791,359559304,true +2020,5,17,1968088895,585698301,true +2020,5,17,284894228,80255870,false +2020,5,18,526475883,174682122,true +2020,5,18,348109470,26884094,false +2020,5,18,1541854340,786614868,false +2020,5,18,497115720,136142735,true +2020,5,18,2029322078,2070612290,true +2020,5,18,1611757588,1104852116,true +2020,5,18,362688542,1596258689,true +2020,5,18,266515668,108417794,true +2020,5,18,2147136852,1020012536,false +2020,5,18,2015399880,86802040,false +2020,5,19,2081251120,379589380,true +2020,5,19,704021317,935201409,false +2020,5,19,119578934,1394161919,true +2020,5,19,2095606582,1339904996,true +2020,5,19,1558851895,764797300,false +2020,5,19,1775318671,1768991314,true +2020,5,19,553443717,1813468067,true +2020,5,19,1539513367,1523371981,false +2020,5,19,53822610,1961777172,false +2020,5,19,1104986119,1724106659,true +2020,5,20,1404434050,1953281128,false +2020,5,20,2093943674,2027626702,false +2020,5,20,759711057,314808244,true 
+2020,5,20,152514463,938887191,false +2020,5,20,1946217118,431553281,true +2020,5,20,2144091683,425315412,true +2020,5,20,355840576,759274698,false +2020,5,20,1209939297,752610192,true +2020,5,20,138333989,313902990,false +2020,5,20,1042360868,802474506,true +2020,5,21,366433331,319640272,true +2020,5,21,1152887342,103312447,false +2020,5,21,470636838,1918210843,false +2020,5,21,1970690491,393773959,true +2020,5,21,1008784276,316279009,false +2020,5,21,791444407,159015147,false +2020,5,21,840215205,464510805,false +2020,5,21,603150187,116622391,false +2020,5,21,135272490,1340310415,false +2020,5,21,1871484860,1341017471,true +2020,5,22,1635685381,1780423055,true +2020,5,22,1977524062,960030087,true +2020,5,22,1025859372,602810375,true +2020,5,22,871346821,206998998,false +2020,5,22,1318181296,1854262517,false +2020,5,22,561353210,2084225877,true +2020,5,22,1307128710,309722216,true +2020,5,22,1463813278,1733625546,false +2020,5,22,1185303048,2138963983,false +2020,5,22,578855873,416433303,true +2020,5,23,988722282,957905249,true +2020,5,23,1766939607,1522518395,false +2020,5,23,1464643475,1635703162,false +2020,5,23,548619887,1121256522,false +2020,5,23,708105755,1212442890,false +2020,5,23,1345854331,1295127773,false +2020,5,23,1264053090,889841634,false +2020,5,23,328091470,101574302,true +2020,5,23,839485686,1645629188,false +2020,5,23,121702352,548364677,false +2020,5,24,1722991554,921872707,true +2020,5,24,1367252701,280077920,false +2020,5,24,1759376066,1027544531,true +2020,5,24,617871568,772694965,false +2020,5,24,1742222119,1628450144,false +2020,5,24,127004065,909485736,true +2020,5,24,68363833,1277558139,false +2020,5,24,1697697830,1757294502,true +2020,5,24,1939489622,727838596,true +2020,5,24,949297739,40490180,true +2020,5,25,1491797207,510003035,false +2020,5,25,798445163,1973414933,false +2020,5,25,1508229329,908477305,true +2020,5,25,1862862091,1956423267,true +2020,5,25,1186543330,697457478,false +2020,5,25,893887394,442744592,false +2020,5,25,547312264,131199430,false +2020,5,25,441164852,1235386374,true +2020,5,25,913073595,118776296,true +2020,5,25,287495921,2069356802,true +2020,5,26,147898814,1703272237,false +2020,5,26,1843715601,840437668,false +2020,5,26,1674585187,624529496,false +2020,5,26,2059887010,2049245694,false +2020,5,26,845381177,482925411,true +2020,5,26,391087111,341927837,false +2020,5,26,1857827409,174465273,false +2020,5,26,1410832730,852192585,true +2020,5,26,131574446,1899033362,false +2020,5,26,1142936300,2013128162,false +2020,5,27,1208353907,2015180182,true +2020,5,27,1399113112,2073597738,true +2020,5,27,1845837163,998826253,true +2020,5,27,88816807,1846227810,true +2020,5,27,1706410115,1365647038,false +2020,5,27,1782774897,184353014,false +2020,5,27,406782279,823884499,false +2020,5,27,1574223625,25159608,true +2020,5,27,1710757516,977912925,true +2020,5,27,9380491,23502679,false +2020,5,28,1734705863,745026631,true +2020,5,28,947667947,1157668114,true +2020,5,28,1992991932,99773340,false +2020,5,28,1068608087,1070345948,false +2020,5,28,2128229778,869865133,false +2020,5,28,310822536,343441509,true +2020,5,28,867225964,61036026,false +2020,5,28,719711891,1269441223,true +2020,5,28,621895191,1113835240,true +2020,5,28,378026525,1108388303,false +2020,5,29,1494959982,1501885867,false +2020,5,29,696730623,1360140818,false +2020,5,29,184257405,9205490,false +2020,5,29,1518824895,553634125,false +2020,5,29,13926550,1684386609,true +2020,5,29,525999711,1364854467,true +2020,5,29,1977986508,88792733,true +2020,5,29,1713334652,959837086,false 
+2020,5,29,1375275660,105807929,false +2020,5,29,934864050,456160114,true +2020,5,30,1325470809,1427543945,false +2020,5,30,2024341686,895392631,false +2020,5,30,335704528,677753785,true +2020,5,30,1119441073,362779416,false +2020,5,30,1656938615,2099345509,true +2020,5,30,1648006588,760546024,false +2020,5,30,1031274255,810675015,false +2020,5,30,1778939275,756614939,false +2020,5,30,1156888668,1220791146,false +2020,5,30,1168853674,1916121358,true +2020,5,31,1714467493,996273596,true +2020,5,31,765512158,910198678,false +2020,5,31,1290192465,1922530325,true +2020,5,31,587340675,720484347,false +2020,5,31,1760963186,191744778,true +2020,5,31,2044645625,1880349166,false +2020,5,31,1811238943,1089771556,true +2020,5,31,1913691532,1626265315,false +2020,5,31,319315426,1609610586,false +2020,5,31,1343635281,1954726290,false +2020,6,1,130183862,2041757652,false +2020,6,1,1907122156,538035336,true +2020,6,1,129915960,578965353,true +2020,6,1,1407366722,97113874,false +2020,6,1,1432276511,1652580697,true +2020,6,1,843655553,1557016040,false +2020,6,1,763918619,1037659173,false +2020,6,1,558134023,1445949795,false +2020,6,1,325201316,1426709406,false +2020,6,1,400658437,1371682085,false +2020,6,2,774673251,186743683,true +2020,6,2,2104134908,1767509361,false +2020,6,2,451377213,533718079,true +2020,6,2,1617077347,624524468,true +2020,6,2,410346021,401023140,false +2020,6,2,1025089299,815521767,true +2020,6,2,180188478,825085123,false +2020,6,2,779268179,569772746,false +2020,6,2,1006236669,947367208,false +2020,6,2,2066151572,1618967867,false +2020,6,3,1361588544,820871966,false +2020,6,3,766132540,772509956,true +2020,6,3,1247464603,1015184850,true +2020,6,3,1724231961,2078131197,false +2020,6,3,1602555725,1629514087,true +2020,6,3,1517419595,45343830,true +2020,6,3,1922452625,2136634461,true +2020,6,3,1604598729,571307360,false +2020,6,3,1011208784,671703255,false +2020,6,3,99547088,1493807414,true +2020,6,4,86396764,2141260446,false +2020,6,4,878073300,814819338,true +2020,6,4,457517755,636855298,false +2020,6,4,1257936345,902346761,true +2020,6,4,548744112,883669723,true +2020,6,4,1904241712,1195933977,false +2020,6,4,1454478458,1915702361,false +2020,6,4,1163009041,1909530062,false +2020,6,4,27296265,1071102315,true +2020,6,4,753849491,785768310,true +2020,6,5,334598425,620601055,false +2020,6,5,624264411,1383439626,true +2020,6,5,1349813615,511654799,false +2020,6,5,1517581744,635561310,true +2020,6,5,473336369,1764523051,true +2020,6,5,1490306475,198353651,false +2020,6,5,1598728841,597608335,true +2020,6,5,1616493237,1097501742,true +2020,6,5,1345447960,18190239,false +2020,6,5,1600784036,2093917409,false +2020,6,6,1103292645,934300185,false +2020,6,6,160296453,278638190,false +2020,6,6,97134793,685971943,true +2020,6,6,1359775608,1077033457,true +2020,6,6,1365703303,1595206870,true +2020,6,6,1945261875,1007367202,false +2020,6,6,1040450321,352154293,false +2020,6,6,1462185340,1153484027,true +2020,6,6,787170736,81128014,true +2020,6,6,1681373247,886787405,false +2020,6,7,886432252,529112507,false +2020,6,7,1438119690,1067498167,true +2020,6,7,36789488,1842338558,true +2020,6,7,1310983700,2021222086,true +2020,6,7,1552565376,231771882,false +2020,6,7,836030492,1156894763,false +2020,6,7,1701593455,1891841976,true +2020,6,7,183466050,1538311487,true +2020,6,7,1995427176,1301543401,true +2020,6,7,1208976423,981835096,false +2020,6,8,2106512248,1065880891,false +2020,6,8,1862093816,299102867,false +2020,6,8,270274435,503236824,false +2020,6,8,1379518773,1409646629,true 
+2020,6,8,562109767,1442979091,false +2020,6,8,1472583238,1025010000,true +2020,6,8,221798925,1598330438,false +2020,6,8,957489222,1208929163,true +2020,6,8,616129557,1638786995,true +2020,6,8,563397009,841230981,false +2020,6,9,99502429,845465201,false +2020,6,9,1623557864,1824427779,false +2020,6,9,1459759969,32922576,true +2020,6,9,1443336790,1417488846,true +2020,6,9,2094219217,925047741,false +2020,6,9,674313749,855409197,true +2020,6,9,883857024,859186945,true +2020,6,9,121491235,306625437,false +2020,6,9,1391061751,2086724215,false +2020,6,9,262859341,1549459650,true +2020,6,10,646108264,496427924,false +2020,6,10,1135437779,1036087755,true +2020,6,10,540084477,634851275,true +2020,6,10,1175286004,1765753532,true +2020,6,10,1231454649,724279219,false +2020,6,10,181090313,1888883295,true +2020,6,10,1468441179,1157820419,true +2020,6,10,1729787630,1207418052,false +2020,6,10,701705485,2129361336,true +2020,6,10,913359088,307532957,true +2020,6,11,991972287,1763983362,true +2020,6,11,909235321,968742108,true +2020,6,11,1665409,768888270,false +2020,6,11,257783345,1056723027,false +2020,6,11,1757381690,1454689793,true +2020,6,11,19518027,884152142,false +2020,6,11,122362756,1975300297,false +2020,6,11,1067661877,102419296,true +2020,6,11,1977318313,554106249,true +2020,6,11,33235949,299456468,true +2020,6,12,1796860896,1467462727,false +2020,6,12,729474461,491966853,false +2020,6,12,1282919781,1543097008,true +2020,6,12,808883030,1881860873,true +2020,6,12,1468337415,757346311,true +2020,6,12,1663497727,606978673,true +2020,6,12,1569323823,925771928,true +2020,6,12,504994251,176514064,true +2020,6,12,1913241247,118787209,false +2020,6,12,1987035106,1630880263,false +2020,6,13,1038464086,1529009957,true +2020,6,13,366109174,22258307,true +2020,6,13,1567505232,988498664,true +2020,6,13,1730527228,62633714,true +2020,6,13,1525298474,978390014,false +2020,6,13,190508870,1258962079,false +2020,6,13,2037316813,1318307316,true +2020,6,13,1530128299,1890468716,true +2020,6,13,1826179750,2014677703,true +2020,6,13,864446087,735809965,false +2020,6,14,1557299745,1827006612,true +2020,6,14,1801781425,495658059,false +2020,6,14,2048575065,1813973148,false +2020,6,14,984265665,365367476,true +2020,6,14,1156580412,1095161179,false +2020,6,14,156533813,291742285,false +2020,6,14,1089826454,1585706706,true +2020,6,14,2046560598,177307485,true +2020,6,14,2143929534,1404549992,false +2020,6,14,1481933046,317260197,false +2020,6,15,1622718173,902238442,true +2020,6,15,1937175639,965365030,false +2020,6,15,2008802051,1690260611,true +2020,6,15,2040491218,36182982,false +2020,6,15,405851273,674449877,true +2020,6,15,1314456103,1468284619,true +2020,6,15,934077295,1907476597,true +2020,6,15,255196535,923242351,false +2020,6,15,1774764986,1683289199,true +2020,6,15,182924115,542506482,false +2020,6,16,1592087804,1263399336,false +2020,6,16,294544488,1092137720,false +2020,6,16,1043656837,1009730358,true +2020,6,16,1759295408,584145174,true +2020,6,16,1517830401,1404452609,false +2020,6,16,1442548275,68400347,false +2020,6,16,287422555,1935833931,false +2020,6,16,1100197814,1252409270,false +2020,6,16,1370902853,448179962,true +2020,6,16,1034953969,1152095878,false +2020,6,17,2020698040,84440748,false +2020,6,17,1393356573,2069415615,true +2020,6,17,1774311492,827367427,true +2020,6,17,1119536639,1434905952,false +2020,6,17,575526925,1239044527,true +2020,6,17,480967863,881836342,true +2020,6,17,1816274880,1961390618,true +2020,6,17,1619868411,710317594,true +2020,6,17,955838959,1971019714,true 
+2020,6,17,654297997,1724627242,true +2020,6,18,1957224844,7353769,false +2020,6,18,1808981505,1232848440,true +2020,6,18,18438017,566936417,false +2020,6,18,112453948,1749832921,false +2020,6,18,951856789,1673913248,false +2020,6,18,818327847,1907566559,true +2020,6,18,1484702495,605963609,false +2020,6,18,76619153,2026664994,false +2020,6,18,1650788402,812833491,true +2020,6,18,1529045303,185265121,true +2020,6,19,1547953740,1636190011,false +2020,6,19,1935259492,1611371051,false +2020,6,19,72941486,206413462,false +2020,6,19,376212397,1195769351,true +2020,6,19,503960798,1341386019,false +2020,6,19,2089878194,1662334803,true +2020,6,19,660287506,2137920151,true +2020,6,19,871490544,2146879390,false +2020,6,19,755714152,1905253035,false +2020,6,19,686859262,1327335259,false +2020,6,20,482362936,145731579,true +2020,6,20,1154094036,642788421,true +2020,6,20,1496598505,1627463467,false +2020,6,20,1966724330,2061252560,true +2020,6,20,1927138061,333737917,true +2020,6,20,610038608,1680583925,false +2020,6,20,277151667,403778867,true +2020,6,20,832618001,2026732220,false +2020,6,20,190127142,1545496131,false +2020,6,20,858125180,894426449,false +2020,6,21,1779226014,1900214602,true +2020,6,21,1182493224,1776986509,false +2020,6,21,1207877255,878005314,false +2020,6,21,477550798,1171953456,false +2020,6,21,1241318765,400071656,true +2020,6,21,1298468437,1635957830,true +2020,6,21,333117359,2120773844,false +2020,6,21,875831446,890673257,true +2020,6,21,1072714112,463737521,true +2020,6,21,2146018254,337434852,false +2020,6,22,806934760,492097752,false +2020,6,22,1584442807,355759839,false +2020,6,22,399479740,1042087156,false +2020,6,22,223153846,1694336787,false +2020,6,22,1144858829,11161482,true +2020,6,22,1094162600,647698124,false +2020,6,22,519125688,936275813,false +2020,6,22,1968983604,608261139,true +2020,6,22,1240913692,370822242,true +2020,6,22,1281506987,319557458,true +2020,6,23,1987313450,797260566,true +2020,6,23,128872918,314962661,false +2020,6,23,1132390079,1202000433,false +2020,6,23,454159436,1877780800,true +2020,6,23,1503878545,1257784251,true +2020,6,23,1385058341,898442426,true +2020,6,23,1690171925,2058541946,false +2020,6,23,1637760797,1508140065,false +2020,6,23,2007932963,573686047,false +2020,6,23,1412774537,539162013,true +2020,6,24,1176104326,1267230735,true +2020,6,24,683580080,1156459915,true +2020,6,24,593593593,1326024674,true +2020,6,24,290073183,689641823,true +2020,6,24,436350022,38166509,true +2020,6,24,111561941,87144102,true +2020,6,24,1981795449,1204139381,false +2020,6,24,905978713,1160737987,false +2020,6,24,508668777,943587162,true +2020,6,24,2012757269,1824429051,true +2020,6,25,761478149,45636440,false +2020,6,25,1089586965,1764499179,false +2020,6,25,1990833960,253544508,true +2020,6,25,1746944879,450040704,false +2020,6,25,665182434,2007893394,true +2020,6,25,154621441,2028168963,false +2020,6,25,1195441778,746861756,true +2020,6,25,2047181394,687142113,false +2020,6,25,1731764075,1048197233,true +2020,6,25,426744175,639543852,false +2020,6,26,914786487,1431414555,false +2020,6,26,711614428,1393192976,false +2020,6,26,1665686135,790905220,true +2020,6,26,818639604,908527019,false +2020,6,26,162038257,828773076,false +2020,6,26,65796689,135673359,false +2020,6,26,2038578951,1265104854,false +2020,6,26,1457458439,2069049683,true +2020,6,26,1657473267,476322547,true +2020,6,26,1017918430,729372669,false +2020,6,27,1503789109,967526253,false +2020,6,27,1615522006,1111887408,false +2020,6,27,1908632567,1457430965,true 
+2020,6,27,1049014671,1223311456,true +2020,6,27,1744506713,1920115714,false +2020,6,27,1450853739,1685304694,true +2020,6,27,1203797858,1326515554,true +2020,6,27,797075854,1508387506,false +2020,6,27,16784510,1241759490,false +2020,6,27,888904704,1398346665,true +2020,6,28,1034484442,766871143,true +2020,6,28,272538782,680969518,false +2020,6,28,908749572,1804016837,false +2020,6,28,1620782495,2111069198,true +2020,6,28,845482123,1948619447,true +2020,6,28,1285773015,870181084,true +2020,6,28,279758486,2027316493,true +2020,6,28,547806718,1937432657,true +2020,6,28,872444597,434303588,true +2020,6,28,526570404,27628507,false +2020,6,29,1884562511,2105445058,true +2020,6,29,1994250138,887298167,true +2020,6,29,987175802,171446649,true +2020,6,29,1041916154,2018415087,true +2020,6,29,421149736,59909869,true +2020,6,29,801486098,333440775,false +2020,6,29,1930275559,872997658,true +2020,6,29,1499373019,106725195,true +2020,6,29,410822310,16360179,false +2020,6,29,1622289350,1339843957,false +2020,6,30,213163840,393550303,true +2020,6,30,1559769098,417730977,false +2020,6,30,283462337,1128009235,true +2020,6,30,872117169,1898300623,true +2020,6,30,1079782289,600813934,false +2020,6,30,1883983120,487379815,false +2020,6,30,764880919,293911073,false +2020,6,30,199944323,238751403,false +2020,6,30,584290418,233768280,true +2020,6,30,1083552368,682999147,false +2020,6,31,283429202,674167685,true +2020,6,31,1393336322,812700240,true +2020,6,31,831445786,137621813,false +2020,6,31,2107369767,2034260634,true +2020,6,31,1362356951,161654895,false +2020,6,31,706212259,362351140,true +2020,6,31,1207580328,1813886466,false +2020,6,31,1623715943,1984586253,false +2020,6,31,1941709232,352924508,false +2020,6,31,1774201700,2066597206,false +2020,7,1,790248365,638820539,true +2020,7,1,1843440532,2028598101,true +2020,7,1,1984204476,1872526543,false +2020,7,1,1052372973,1761042893,true +2020,7,1,1652353534,1081912884,true +2020,7,1,245542782,259055880,false +2020,7,1,531032598,944093256,true +2020,7,1,425261543,1435329104,false +2020,7,1,1374632120,814262333,false +2020,7,1,115362529,285193612,false +2020,7,2,30175939,976486537,true +2020,7,2,362627228,1252207863,true +2020,7,2,484927636,1853192183,true +2020,7,2,816309787,2100256292,true +2020,7,2,1802735263,20936458,true +2020,7,2,899004880,2047884753,false +2020,7,2,1968997485,299805041,false +2020,7,2,1048909357,824485944,true +2020,7,2,1553147529,1301744313,false +2020,7,2,1407857887,590191063,true +2020,7,3,1645722639,1107582087,true +2020,7,3,276954868,1857617130,true +2020,7,3,1373833934,43794934,true +2020,7,3,543492888,716738065,true +2020,7,3,111969603,657596155,false +2020,7,3,911990952,2051540694,false +2020,7,3,590277588,1110549455,true +2020,7,3,163094873,1512900359,false +2020,7,3,1601220376,113601492,true +2020,7,3,564587670,1247948328,true +2020,7,4,14922504,22912121,true +2020,7,4,1874589420,649596011,true +2020,7,4,608555891,1678167031,true +2020,7,4,77774106,1885345509,true +2020,7,4,1744652298,1883044111,true +2020,7,4,1429519777,2008886532,false +2020,7,4,1409929946,1985467397,true +2020,7,4,2038566112,870026610,false +2020,7,4,1146949143,353719950,true +2020,7,4,20756268,1236264670,true +2020,7,5,2119452285,1315142517,false +2020,7,5,19237212,1940295018,false +2020,7,5,1491051959,1546199057,false +2020,7,5,1268995009,1744066480,true +2020,7,5,1610814598,1766164722,true +2020,7,5,990043271,121196190,true +2020,7,5,1451409063,452720713,true +2020,7,5,1587269102,1205387662,true +2020,7,5,591434403,927010806,false 
+2020,7,5,291916252,1241285741,true +2020,7,6,282199561,2034215488,true +2020,7,6,2010333112,1405333456,false +2020,7,6,1264706257,1155349415,true +2020,7,6,2024700562,277691206,true +2020,7,6,1105493195,1390869839,false +2020,7,6,217148665,11061314,true +2020,7,6,1920759402,2119885554,true +2020,7,6,1825527067,1432092794,true +2020,7,6,1806811733,1226639453,false +2020,7,6,1355431674,888706190,true +2020,7,7,1478353778,665625062,true +2020,7,7,160562141,1746485320,false +2020,7,7,1001463671,1247600730,false +2020,7,7,275066189,2144194534,false +2020,7,7,1483307039,330200147,true +2020,7,7,116215600,1081935395,false +2020,7,7,1489657233,668909051,true +2020,7,7,381358061,357893366,false +2020,7,7,658716839,1435045283,false +2020,7,7,2125696664,1398836038,false +2020,7,8,786609668,1061484215,false +2020,7,8,350361240,2056357897,false +2020,7,8,912274254,1144547832,true +2020,7,8,988738884,1863597947,false +2020,7,8,1727948250,371483907,false +2020,7,8,127066475,1257576826,false +2020,7,8,89151151,1532253320,false +2020,7,8,1340108231,1537030184,false +2020,7,8,457814450,1626414026,false +2020,7,8,1270937280,1711127646,false +2020,7,9,1914582602,851238709,false +2020,7,9,2028517654,1532047771,true +2020,7,9,463627758,188842287,false +2020,7,9,1470894661,2121764208,false +2020,7,9,372599119,1374422897,false +2020,7,9,2012587202,1918930584,true +2020,7,9,1840062239,1266625671,false +2020,7,9,630655326,676767293,true +2020,7,9,404778922,682165328,false +2020,7,9,312093815,1048707616,false +2020,7,10,325052081,1262363537,false +2020,7,10,1316241597,1503615121,false +2020,7,10,1436240207,250673646,true +2020,7,10,302097076,2103693187,false +2020,7,10,276164563,1188057848,false +2020,7,10,513208127,1723944481,true +2020,7,10,1152396232,777784836,false +2020,7,10,1521695948,1862454440,false +2020,7,10,1924563114,1703173643,false +2020,7,10,1377020425,1734629498,false +2020,7,11,878182908,1901402566,true +2020,7,11,433888798,2060423836,true +2020,7,11,647713718,932454843,false +2020,7,11,1530000867,103546153,true +2020,7,11,665363010,358970624,false +2020,7,11,383205009,1105102938,false +2020,7,11,1240530308,1649968854,false +2020,7,11,274979284,1890126380,false +2020,7,11,1211557542,1323144817,true +2020,7,11,1008060499,528462810,true +2020,7,12,751182486,1311080323,false +2020,7,12,1374138112,617528985,true +2020,7,12,1866142587,1384547687,true +2020,7,12,1239140563,2059894109,false +2020,7,12,1185188240,1048202238,false +2020,7,12,1321445823,2064339854,true +2020,7,12,125450273,1460099227,false +2020,7,12,575885563,1493815543,false +2020,7,12,518084004,1412915861,true +2020,7,12,913430526,400159065,false +2020,7,13,909604377,814038607,false +2020,7,13,419520691,1766672989,true +2020,7,13,1527772479,629970658,false +2020,7,13,1033277658,25582824,false +2020,7,13,1706842695,117164454,false +2020,7,13,1652767123,2106349297,true +2020,7,13,982356213,126835787,true +2020,7,13,1236963706,1462298994,true +2020,7,13,983544966,136216900,true +2020,7,13,69465320,1047951884,false +2020,7,14,1777403563,1696743351,true +2020,7,14,2004697901,148601447,false +2020,7,14,1544984946,159640359,false +2020,7,14,679147753,730603403,true +2020,7,14,153992452,894170258,true +2020,7,14,424339141,923705750,false +2020,7,14,562788977,1151701444,false +2020,7,14,1445013819,1780721821,false +2020,7,14,61201165,1401786628,false +2020,7,14,395682100,1408771486,false +2020,7,15,1924366340,354926182,true +2020,7,15,1524887203,1818664651,false +2020,7,15,109228076,1182061933,true +2020,7,15,404727141,1586658455,true 
+2020,7,15,1705153101,270460099,false +2020,7,15,401764874,1809987913,true +2020,7,15,1471174726,1590729509,false +2020,7,15,223344476,919537409,false +2020,7,15,1177730017,166198240,true +2020,7,15,1569892023,337551484,false +2020,7,16,2064377905,601015922,true +2020,7,16,1376945297,246617837,false +2020,7,16,1035195087,216285223,true +2020,7,16,1849229256,1997839308,false +2020,7,16,737259097,692869066,true +2020,7,16,128421257,1482297195,true +2020,7,16,1065653233,926948038,true +2020,7,16,749137138,428447147,false +2020,7,16,2074301366,1145927754,true +2020,7,16,1246076087,226830387,false +2020,7,17,1362009202,102771274,false +2020,7,17,507083386,378937830,false +2020,7,17,1899227373,51158390,true +2020,7,17,2093778578,515613079,false +2020,7,17,1227025824,1381935229,false +2020,7,17,1561505907,975401516,true +2020,7,17,1913269189,1582112820,false +2020,7,17,1547644663,1581544616,true +2020,7,17,1887758069,2075044221,true +2020,7,17,75861305,541091036,true +2020,7,18,1792512206,802087683,true +2020,7,18,1988956097,45403048,true +2020,7,18,953016928,1002565662,false +2020,7,18,409319214,755964436,true +2020,7,18,475848013,299754368,true +2020,7,18,916700327,1571606193,true +2020,7,18,1069276889,1630622593,true +2020,7,18,215040437,1157118711,false +2020,7,18,1438342506,1126290203,true +2020,7,18,1728831574,568758435,true +2020,7,19,1764877604,1118237173,false +2020,7,19,1794908524,1896439614,true +2020,7,19,329515677,1989350166,false +2020,7,19,1035759311,1613834323,false +2020,7,19,1370559205,1399106830,true +2020,7,19,934423470,1579695362,true +2020,7,19,146011734,1832628159,false +2020,7,19,1553074435,429146610,true +2020,7,19,454916069,469281540,false +2020,7,19,887995720,1632931529,true +2020,7,20,603483372,1164457710,false +2020,7,20,1396166061,233223355,true +2020,7,20,450507172,1778702671,true +2020,7,20,1883661434,1171184591,false +2020,7,20,791266505,1550157250,true +2020,7,20,1988760431,2111629537,true +2020,7,20,707555615,1101900928,true +2020,7,20,1063004722,904040086,true +2020,7,20,317734355,1545464216,true +2020,7,20,668125232,1849301432,true +2020,7,21,1652905699,667916642,true +2020,7,21,1627477636,176555441,false +2020,7,21,2092094749,1478045772,false +2020,7,21,15888821,597803042,true +2020,7,21,1829687635,1723990865,false +2020,7,21,1627245678,1665370828,true +2020,7,21,1531646159,349209524,false +2020,7,21,1524108159,1673834813,true +2020,7,21,1910499638,1567726554,false +2020,7,21,471619786,757750784,true +2020,7,22,247384147,1949314137,false +2020,7,22,383073764,1838136651,true +2020,7,22,1037117573,456880020,true +2020,7,22,1040421011,1998533462,true +2020,7,22,1729311938,1640404857,true +2020,7,22,1118022257,1054333666,false +2020,7,22,810157939,527244350,true +2020,7,22,1231624796,1242036378,false +2020,7,22,531289486,734084154,true +2020,7,22,529746303,1593977429,false +2020,7,23,583830461,585493530,false +2020,7,23,146243342,238979426,true +2020,7,23,1806300529,1253391408,true +2020,7,23,1217991956,194272610,true +2020,7,23,616334899,843567754,true +2020,7,23,545512930,464700254,false +2020,7,23,886299541,1552603921,true +2020,7,23,1712356605,1639206685,false +2020,7,23,1696864903,1622574569,false +2020,7,23,963780511,1500265698,false +2020,7,24,1031075077,1308109239,true +2020,7,24,1245274153,1453578268,true +2020,7,24,1950842198,193868459,false +2020,7,24,1134789909,703616945,false +2020,7,24,259818068,2097709143,false +2020,7,24,1812957000,1836591844,true +2020,7,24,234230901,268600248,true +2020,7,24,2110041305,1454479884,true 
+2020,7,24,992122252,1165730816,false +2020,7,24,1883757732,1374307476,true +2020,7,25,1844405629,964295685,false +2020,7,25,344288322,861501948,true +2020,7,25,1587375193,9483096,true +2020,7,25,2001395007,459816401,false +2020,7,25,1202093722,1408818548,true +2020,7,25,536221604,1823868636,false +2020,7,25,1388016247,1146500754,false +2020,7,25,1680539438,147163122,true +2020,7,25,74368400,776607983,false +2020,7,25,1513297356,1956191530,true +2020,7,26,2113239129,773451443,true +2020,7,26,689506614,541745772,false +2020,7,26,2095667470,1829503936,false +2020,7,26,1630591676,1984368783,true +2020,7,26,485943835,1867530246,true +2020,7,26,989304741,833688704,false +2020,7,26,1050832952,261738035,true +2020,7,26,60794281,1550776364,false +2020,7,26,1961890475,931419100,false +2020,7,26,1017793138,874100515,false +2020,7,27,1131570287,591001455,false +2020,7,27,838835762,72288551,false +2020,7,27,495499059,399986057,false +2020,7,27,1423202456,560175917,true +2020,7,27,665211236,1162164880,true +2020,7,27,5975266,2145102710,true +2020,7,27,1714919113,545664668,false +2020,7,27,1261274288,250451911,true +2020,7,27,745568093,727757066,false +2020,7,27,1576446696,750793964,true +2020,7,28,1633524314,1578845790,false +2020,7,28,1319349966,1908198671,false +2020,7,28,535350214,380184122,true +2020,7,28,232375573,1042052738,true +2020,7,28,923386068,2107197582,true +2020,7,28,812285854,1572054797,true +2020,7,28,2033869819,2097993806,true +2020,7,28,547892759,624592871,false +2020,7,28,1979519892,1414752090,true +2020,7,28,2065819839,353971419,true +2020,7,29,549103189,810295895,false +2020,7,29,1963752436,2019327306,false +2020,7,29,1671704335,1149701281,true +2020,7,29,1570731614,1668474895,true +2020,7,29,1859349875,896584620,false +2020,7,29,859917514,678556519,true +2020,7,29,260451268,174657128,true +2020,7,29,669363735,1289875565,true +2020,7,29,203524554,1866397826,true +2020,7,29,155848390,222404048,false +2020,7,30,1012855649,1366391451,true +2020,7,30,330299144,1712187339,true +2020,7,30,388844912,1131542474,false +2020,7,30,862848866,1033585534,true +2020,7,30,620580405,243684870,true +2020,7,30,972080952,1349448260,false +2020,7,30,49483301,1254255178,false +2020,7,30,1187345891,1824183704,false +2020,7,30,625685718,1685514898,false +2020,7,30,552800807,129627809,false +2020,7,31,726518179,804945663,false +2020,7,31,1298263314,1768262787,true +2020,7,31,1424044563,1279688900,true +2020,7,31,1824470198,1603318532,true +2020,7,31,1937659944,671105202,false +2020,7,31,873480250,347485995,true +2020,7,31,1751900445,894345331,true +2020,7,31,1513019961,1218539991,false +2020,7,31,633198913,1694314713,true +2020,7,31,1411620707,407251088,false +2020,8,1,104649713,1233532106,false +2020,8,1,906139354,39348981,false +2020,8,1,592070361,1323674544,true +2020,8,1,661044941,463761299,false +2020,8,1,954101256,1266429822,true +2020,8,1,428815153,510074134,false +2020,8,1,417880706,1557778840,true +2020,8,1,1819878832,840779011,false +2020,8,1,1957461827,219252560,true +2020,8,1,2138563874,1442065281,false +2020,8,2,1588474114,443347376,true +2020,8,2,1170473010,435149795,false +2020,8,2,730442513,318221766,true +2020,8,2,430305783,1501596653,false +2020,8,2,1498306593,588231090,false +2020,8,2,538120361,1064894463,true +2020,8,2,671868899,11393061,false +2020,8,2,260404670,2092241079,true +2020,8,2,961229490,1508142864,false +2020,8,2,2130418706,1204629167,true +2020,8,3,673751762,644396979,true +2020,8,3,96619492,1922931531,false +2020,8,3,1471602400,107779407,false 
+2020,8,3,399797713,1492981226,false +2020,8,3,1120097166,1564310827,false +2020,8,3,412046611,362119885,true +2020,8,3,1655442108,738256292,true +2020,8,3,1867992747,17540494,false +2020,8,3,734497930,22168449,true +2020,8,3,338531913,1418918663,true +2020,8,4,734028548,1277553613,true +2020,8,4,1565887233,1102388974,true +2020,8,4,334091110,620445144,true +2020,8,4,2117070137,97879960,false +2020,8,4,43092021,1741233201,false +2020,8,4,1819698255,348835485,true +2020,8,4,604556701,1036862656,true +2020,8,4,1914394087,1232979802,false +2020,8,4,1266267603,445273915,false +2020,8,4,1560250853,223829114,false +2020,8,5,1957858672,1561042612,true +2020,8,5,1858272639,1568875540,false +2020,8,5,1488029688,1789447668,false +2020,8,5,409793765,1396432381,true +2020,8,5,818416402,716185556,true +2020,8,5,878354038,1544958561,false +2020,8,5,1088981787,825852896,true +2020,8,5,1935590269,197007041,true +2020,8,5,719001109,484572384,true +2020,8,5,1995497956,906284512,false +2020,8,6,2079228451,1580788251,true +2020,8,6,593555301,802460498,true +2020,8,6,938560276,1749085557,false +2020,8,6,1662435872,1080507806,false +2020,8,6,1520668332,1544572851,true +2020,8,6,428473243,1662880312,false +2020,8,6,1410390435,657947019,true +2020,8,6,875047581,402386112,false +2020,8,6,223192985,998974280,true +2020,8,6,1763898718,177844036,true +2020,8,7,1839343220,1389399598,false +2020,8,7,305484037,828474229,false +2020,8,7,1491470811,1563902434,false +2020,8,7,1169974377,1253860634,true +2020,8,7,1082365387,361870716,true +2020,8,7,1110672276,1008036892,false +2020,8,7,53579648,2097598227,false +2020,8,7,1434519154,1197930751,false +2020,8,7,1089145380,1406963168,false +2020,8,7,1260349622,587141297,false +2020,8,8,1958942975,1375818318,true +2020,8,8,1317040611,942986469,true +2020,8,8,488301346,748117134,true +2020,8,8,1868164069,1486773374,true +2020,8,8,1503622339,321197447,false +2020,8,8,1867578327,1180357557,false +2020,8,8,63342780,999675710,false +2020,8,8,1476678589,854478335,false +2020,8,8,1398698230,1435042059,false +2020,8,8,269057367,159654876,true +2020,8,9,1610927779,883982637,true +2020,8,9,699874657,603104534,false +2020,8,9,1955922831,1268412577,true +2020,8,9,393270229,348873753,true +2020,8,9,589085686,99235665,true +2020,8,9,1138981138,1677326704,false +2020,8,9,55620767,1530510819,true +2020,8,9,1016740074,1181673094,false +2020,8,9,345211690,800114652,true +2020,8,9,2008744972,1109116215,false +2020,8,10,1505015512,1841003479,true +2020,8,10,1849347404,1248544228,true +2020,8,10,1508423219,1735307507,true +2020,8,10,501980069,1789145081,false +2020,8,10,110975582,1790156753,true +2020,8,10,571863827,2080356651,true +2020,8,10,1704985336,2039046832,true +2020,8,10,620471713,2050336217,true +2020,8,10,382649811,1512772493,false +2020,8,10,1865110318,2055459154,true +2020,8,11,2027291638,801677398,false +2020,8,11,1107126744,598378229,false +2020,8,11,669000094,230124680,false +2020,8,11,1209707040,1965460828,true +2020,8,11,535294259,1329650900,true +2020,8,11,466344204,1315928855,false +2020,8,11,856465787,2132788383,true +2020,8,11,666286079,1835724964,true +2020,8,11,569130342,2020847228,false +2020,8,11,325016836,929641887,false +2020,8,12,1838684232,1611198444,false +2020,8,12,457306031,733906794,false +2020,8,12,1478257509,1554953733,false +2020,8,12,725642733,1740551600,true +2020,8,12,1757741583,1842252764,true +2020,8,12,2072822106,2056947101,true +2020,8,12,907978703,939693247,false +2020,8,12,972089586,1867730587,true +2020,8,12,1790324815,1215669141,true 
+2020,8,12,1667669044,1031826411,true +2020,8,13,1788343035,1320734229,false +2020,8,13,949249049,882508937,false +2020,8,13,507413648,1355343239,true +2020,8,13,768082797,1477402375,false +2020,8,13,685106976,725135432,false +2020,8,13,225114977,1559628140,true +2020,8,13,1876256039,1903011409,true +2020,8,13,67981943,2024250682,false +2020,8,13,754017989,2098241642,false +2020,8,13,303649562,542271015,false +2020,8,14,602331205,1851534551,false +2020,8,14,1673989242,131448365,false +2020,8,14,2076819308,379217306,true +2020,8,14,1836829573,607984636,true +2020,8,14,1019375943,517800865,false +2020,8,14,2135806741,877635830,true +2020,8,14,1675758714,195972991,false +2020,8,14,1226973612,1725975095,true +2020,8,14,650411065,1161841526,false +2020,8,14,195911766,954821729,false +2020,8,15,1270933098,743592966,false +2020,8,15,1355544963,82715674,true +2020,8,15,1178145708,2117290809,false +2020,8,15,1692082172,550528384,true +2020,8,15,1647618611,1770487904,false +2020,8,15,529927488,1060815624,false +2020,8,15,103353530,757315844,false +2020,8,15,2110407660,1959747702,true +2020,8,15,353166583,1669945654,true +2020,8,15,1480261112,563159285,false +2020,8,16,1878844759,80181379,false +2020,8,16,1074294770,211722112,true +2020,8,16,531869678,1329595951,true +2020,8,16,890896764,1687078669,false +2020,8,16,1407852762,937609875,false +2020,8,16,1432574137,1334768709,true +2020,8,16,752355473,1200579749,false +2020,8,16,786491762,2084306084,false +2020,8,16,1068744395,1611016731,false +2020,8,16,954790413,1115117407,true +2020,8,17,424440074,1439849891,true +2020,8,17,922632118,1827439198,true +2020,8,17,916895728,1534612010,true +2020,8,17,1520502863,66103849,false +2020,8,17,1520133340,913513251,false +2020,8,17,1923314675,871835040,true +2020,8,17,153595282,1645750610,false +2020,8,17,1978163513,635145932,false +2020,8,17,1709010383,1785591942,false +2020,8,17,1700681342,1514190902,false +2020,8,18,769603349,804621892,false +2020,8,18,1822080981,1915257477,false +2020,8,18,55764255,803621460,true +2020,8,18,1282169971,934361158,true +2020,8,18,1758886328,888095148,false +2020,8,18,1551525816,186820153,false +2020,8,18,493202662,1242126645,false +2020,8,18,1193371923,1756378229,true +2020,8,18,1474688681,1208969551,true +2020,8,18,666958723,1532092731,false +2020,8,19,1984261574,915221552,true +2020,8,19,1112479837,1710733400,true +2020,8,19,234142541,1961322596,false +2020,8,19,415158939,2024360540,true +2020,8,19,2055939291,1281752610,false +2020,8,19,1707656939,1269517597,true +2020,8,19,1374576589,1578043696,true +2020,8,19,170074664,1885537077,false +2020,8,19,561450464,1595308674,false +2020,8,19,690737935,1793776251,false +2020,8,20,1498054629,594162661,true +2020,8,20,1429771596,115536305,true +2020,8,20,11416992,461643092,true +2020,8,20,1631245102,1364161495,true +2020,8,20,448829929,976465596,true +2020,8,20,340734117,1602094046,false +2020,8,20,1109082272,2042253930,true +2020,8,20,1855831388,539123128,true +2020,8,20,848265241,692233041,true +2020,8,20,1156947206,1968433653,false +2020,8,21,1795174922,29634531,false +2020,8,21,452475961,1106040685,true +2020,8,21,148359206,1523836223,true +2020,8,21,1523135857,1992651431,false +2020,8,21,194854621,1274931165,true +2020,8,21,1061334284,1014592764,true +2020,8,21,707585285,1603220989,false +2020,8,21,331707481,1119483306,true +2020,8,21,995293524,1043615102,true +2020,8,21,527008542,1204163919,true +2020,8,22,1693505267,30147063,true +2020,8,22,693672982,941258885,false +2020,8,22,1041522836,299501859,true 
+2020,8,22,1353321560,1140310170,true +2020,8,22,160079305,1300255472,true +2020,8,22,2036119209,652835224,true +2020,8,22,1771544730,1271355366,true +2020,8,22,1395839776,962591504,true +2020,8,22,307222915,1577579416,true +2020,8,22,1051281585,1433046383,false +2020,8,23,993693283,1990357590,true +2020,8,23,1366890864,1649235911,true +2020,8,23,1481524491,2018574441,false +2020,8,23,226710445,891632200,false +2020,8,23,2000825437,1031009906,true +2020,8,23,1102850361,1190615016,true +2020,8,23,1581381450,322331080,true +2020,8,23,1140753268,24296931,true +2020,8,23,1463468814,1733030120,false +2020,8,23,898790903,748151323,false +2020,8,24,422680178,1244130136,false +2020,8,24,346792690,771139978,false +2020,8,24,67656196,1048142220,false +2020,8,24,962317206,1240653362,true +2020,8,24,413264261,1693654177,false +2020,8,24,2046145573,1293844576,false +2020,8,24,1760402684,149903131,true +2020,8,24,1062143620,828817685,false +2020,8,24,357987370,869187298,false +2020,8,24,1301301080,533513198,true +2020,8,25,1941010774,1799880241,false +2020,8,25,684790039,1265046766,false +2020,8,25,2003110065,1173806132,true +2020,8,25,546762142,1843927787,false +2020,8,25,2069495737,1067925384,false +2020,8,25,1242964513,1077168533,true +2020,8,25,2034138212,1148046304,true +2020,8,25,1370940050,1709669964,true +2020,8,25,1055741443,348899699,false +2020,8,25,1895659730,1157781822,false +2020,8,26,2138499601,2132430850,true +2020,8,26,1714506061,1172347020,true +2020,8,26,2022688116,187080858,false +2020,8,26,1977815502,1808403810,false +2020,8,26,1254895317,28967007,false +2020,8,26,1542822056,1565901630,false +2020,8,26,728248603,353801208,true +2020,8,26,352981391,636940267,false +2020,8,26,1030327595,221937283,true +2020,8,26,633410228,1514491600,true +2020,8,27,875724928,262549523,true +2020,8,27,1720049661,2101752995,false +2020,8,27,1542693429,1380134687,true +2020,8,27,1096377206,1208746146,true +2020,8,27,1636944531,698879354,true +2020,8,27,1650929995,1136207583,true +2020,8,27,12589386,944534689,false +2020,8,27,1927135976,298382152,false +2020,8,27,1649036501,1795593860,false +2020,8,27,1420654613,1471670120,false +2020,8,28,1375783360,1255188245,true +2020,8,28,436942381,1314473243,false +2020,8,28,684554949,1288806471,false +2020,8,28,1364491448,346523306,false +2020,8,28,215624443,1093551420,false +2020,8,28,1584253971,402014470,false +2020,8,28,1160819085,2001040087,false +2020,8,28,839456241,780715808,true +2020,8,28,630803126,653925099,true +2020,8,28,17952975,1598090639,true +2020,8,29,2097241070,1156848832,false +2020,8,29,2111782585,1626543103,true +2020,8,29,1254470033,1459837094,true +2020,8,29,77477108,525479460,false +2020,8,29,867144298,1419980230,true +2020,8,29,487523680,1797879842,false +2020,8,29,601874908,610982013,false +2020,8,29,1930372021,1484710568,true +2020,8,29,851084208,362099932,false +2020,8,29,814026729,2086510902,true +2020,8,30,2004704419,1344932903,true +2020,8,30,1373373532,1881387327,true +2020,8,30,111685965,118988131,false +2020,8,30,1412870565,817314599,false +2020,8,30,1969182971,1602048093,false +2020,8,30,760521482,1597793878,true +2020,8,30,2036092048,730515365,true +2020,8,30,2095631891,430190538,false +2020,8,30,307861900,1712890232,false +2020,8,30,30244017,1040652022,true +2020,8,31,64503742,468940945,false +2020,8,31,1135871222,378574540,true +2020,8,31,61017182,790424910,true +2020,8,31,2146495055,716927305,false +2020,8,31,1024503671,45743288,true +2020,8,31,71820216,1588290011,false +2020,8,31,1874469096,1856994889,false 
+2020,8,31,680077165,582149997,false +2020,8,31,2039677033,768383623,false +2020,8,31,738334719,1963921683,false +2020,9,1,1117978710,1427913674,true +2020,9,1,1318080472,847159418,true +2020,9,1,879734068,556281799,false +2020,9,1,1755176846,247724971,false +2020,9,1,1278088219,1087290301,false +2020,9,1,461942942,1486117839,false +2020,9,1,1971615900,907220562,false +2020,9,1,986394974,2044459233,false +2020,9,1,1734969016,661479805,false +2020,9,1,2102462540,968077985,false +2020,9,2,1209255926,408661016,true +2020,9,2,1956928509,1721595171,false +2020,9,2,211856075,490538475,true +2020,9,2,1293259365,1680434117,true +2020,9,2,1322561996,1827478098,true +2020,9,2,1123816743,731266943,true +2020,9,2,1905816667,501933258,false +2020,9,2,1072167462,300324285,false +2020,9,2,2089913555,1263155205,false +2020,9,2,1813853859,599563085,false +2020,9,3,5271651,405654365,true +2020,9,3,573411449,1195260674,true +2020,9,3,574688016,20423610,true +2020,9,3,202417248,247423515,true +2020,9,3,565438109,1872675907,true +2020,9,3,850682168,658703506,true +2020,9,3,428354886,2085843067,false +2020,9,3,1595217872,663503321,false +2020,9,3,2039943569,452246674,true +2020,9,3,2015167396,494177977,false +2020,9,4,1560386211,868876818,true +2020,9,4,1944571012,1076192070,false +2020,9,4,39360733,565567027,true +2020,9,4,1319125541,2028647460,true +2020,9,4,1925261483,604304510,true +2020,9,4,973883912,878576794,false +2020,9,4,1791474118,682924926,true +2020,9,4,1951200579,498742051,true +2020,9,4,160472932,1515858614,false +2020,9,4,1504495361,322950262,false +2020,9,5,247712256,428714007,false +2020,9,5,703311332,136712511,true +2020,9,5,2056586691,1185308834,false +2020,9,5,607379846,1760631049,false +2020,9,5,1688862907,471292679,true +2020,9,5,1443039944,1888839933,true +2020,9,5,1068106327,2141183062,true +2020,9,5,2074323132,355856259,true +2020,9,5,1028829679,1787442718,true +2020,9,5,289461570,158698992,false +2020,9,6,520821813,1808472721,false +2020,9,6,204288949,1565863871,true +2020,9,6,1549232104,392007304,false +2020,9,6,1460748600,392999685,false +2020,9,6,1595467812,1066701430,true +2020,9,6,484917642,1065599020,false +2020,9,6,1222608661,933401262,false +2020,9,6,1437527134,986283787,true +2020,9,6,1701473801,6810762,true +2020,9,6,184131254,440362492,true +2020,9,7,2122409997,1532176720,true +2020,9,7,900954287,1160788866,false +2020,9,7,455692948,535709934,true +2020,9,7,1801866166,661937553,true +2020,9,7,2233315,399599072,false +2020,9,7,896130605,845580710,true +2020,9,7,1110544684,1227518174,false +2020,9,7,67674336,304564005,true +2020,9,7,241569451,754421282,false +2020,9,7,1192808410,84593083,false +2020,9,8,1389613593,2117275497,false +2020,9,8,967699328,300364728,false +2020,9,8,645520461,2077748481,true +2020,9,8,107794422,1111269530,false +2020,9,8,1303095537,537939315,false +2020,9,8,546455609,1237442111,true +2020,9,8,1593387914,1558638260,true +2020,9,8,1867960186,1610631036,false +2020,9,8,1073301317,1040411880,true +2020,9,8,860847770,85351691,false +2020,9,9,231199564,453067282,false +2020,9,9,1130198229,733385223,false +2020,9,9,798519180,92028739,true +2020,9,9,1940636160,2075187063,false +2020,9,9,1560809011,1725102898,true +2020,9,9,281306598,1587533463,true +2020,9,9,776254544,729387839,true +2020,9,9,1039121324,1012141014,false +2020,9,9,87924829,1809052486,false +2020,9,9,1280192826,1481123023,false +2020,9,10,1855803287,306640347,true +2020,9,10,1002746404,964407709,true +2020,9,10,1633538933,2056839610,true +2020,9,10,1912255112,1211070047,false 
+2020,9,10,680074901,112943764,false +2020,9,10,1011687887,1140130764,false +2020,9,10,88634622,618654345,true +2020,9,10,373100137,457229847,false +2020,9,10,980782480,361348831,false +2020,9,10,1434073804,1512636322,true +2020,9,11,439239394,331521503,true +2020,9,11,1056547164,1559705901,true +2020,9,11,1639022611,1729969381,true +2020,9,11,1674295315,2111710081,true +2020,9,11,620589622,823567605,false +2020,9,11,627932666,698926795,true +2020,9,11,1243054423,505479554,true +2020,9,11,183421904,1829616159,true +2020,9,11,1263352912,260151337,true +2020,9,11,657348214,2106722781,true +2020,9,12,211339502,359901568,true +2020,9,12,1213471995,1634401775,false +2020,9,12,1638768140,1784396381,false +2020,9,12,133885273,1966024985,true +2020,9,12,21290930,1562675729,true +2020,9,12,2048135740,1705550031,true +2020,9,12,1828238683,1953535377,false +2020,9,12,794388859,1030357327,true +2020,9,12,1549791469,644991115,false +2020,9,12,764667390,1940095987,true +2020,9,13,361195789,1278981316,true +2020,9,13,390211908,1869479411,false +2020,9,13,1128010698,869508455,false +2020,9,13,1402290130,1268755474,true +2020,9,13,1362839264,1521153800,false +2020,9,13,462614821,1120291817,false +2020,9,13,85465340,2134021756,true +2020,9,13,267210502,922140111,false +2020,9,13,644596681,670513124,false +2020,9,13,268407122,1795904144,true +2020,9,14,578857133,1555238008,false +2020,9,14,860856966,1748340330,false +2020,9,14,1882016386,369576175,true +2020,9,14,1525309121,319856402,false +2020,9,14,1255586986,122355616,true +2020,9,14,849623539,737186984,false +2020,9,14,2052899924,844134570,true +2020,9,14,306191208,1484565994,true +2020,9,14,2146484941,1960484809,false +2020,9,14,1250397186,818759168,false +2020,9,15,1268418599,1870865054,true +2020,9,15,459286963,2117615286,false +2020,9,15,2072638378,486720406,true +2020,9,15,1410760215,1856023559,false +2020,9,15,455859143,180946103,false +2020,9,15,1401104196,2087919167,true +2020,9,15,631089534,934343412,false +2020,9,15,55019810,206683678,true +2020,9,15,47346553,1580094944,true +2020,9,15,574049006,1213764294,true +2020,9,16,371376980,1771186387,false +2020,9,16,1947043769,111692634,false +2020,9,16,659397566,457038303,true +2020,9,16,1051402740,1983239826,false +2020,9,16,2082300025,76720166,false +2020,9,16,731375991,127369396,true +2020,9,16,799603047,1260613234,false +2020,9,16,836536447,1963543735,false +2020,9,16,1609269283,1954709534,false +2020,9,16,1534543764,572872562,false +2020,9,17,972622810,1712311723,true +2020,9,17,1721432414,2087803945,false +2020,9,17,592138589,1126407075,false +2020,9,17,377263813,453933014,false +2020,9,17,2112740771,2136703370,false +2020,9,17,736203009,872291585,true +2020,9,17,1059077942,1318025235,true +2020,9,17,1343517271,770726917,true +2020,9,17,1492684607,138822873,false +2020,9,17,295591864,524370958,false +2020,9,18,2041447723,1883758398,false +2020,9,18,2011643015,1247395642,false +2020,9,18,518039588,1701530956,true +2020,9,18,986310440,1658583199,true +2020,9,18,1612666833,1571242146,true +2020,9,18,300552229,266577196,false +2020,9,18,852646420,1549417176,false +2020,9,18,1023074062,511344784,false +2020,9,18,2012736967,184714724,true +2020,9,18,269205353,614696380,true +2020,9,19,1114055332,1282910292,false +2020,9,19,614964537,1684663956,false +2020,9,19,1892152540,1021804175,true +2020,9,19,1803836436,841050339,false +2020,9,19,204371720,1234671951,true +2020,9,19,1229438146,2071740098,false +2020,9,19,945373052,1035484944,false +2020,9,19,29689035,46333660,false 
+2020,9,19,537683489,1424965937,false +2020,9,19,1959125133,830704994,true +2020,9,20,467449440,1152716529,false +2020,9,20,1096348592,277807246,true +2020,9,20,664820938,1665326998,false +2020,9,20,157986053,1577931769,false +2020,9,20,194566775,2106835542,false +2020,9,20,1861749314,1317692476,true +2020,9,20,1376680030,584560102,false +2020,9,20,1876907284,74635410,true +2020,9,20,1686135567,676108982,true +2020,9,20,668358263,335229552,true +2020,9,21,926552974,166795833,true +2020,9,21,794252897,738999120,false +2020,9,21,539640905,1862929114,true +2020,9,21,757288216,617809701,true +2020,9,21,1426094433,1555811178,true +2020,9,21,1134444750,1402457650,true +2020,9,21,270707131,661256731,false +2020,9,21,451851613,950373172,false +2020,9,21,59697804,1220553337,true +2020,9,21,183477553,1408738085,true +2020,9,22,2042016103,733766360,false +2020,9,22,327713225,1109225985,false +2020,9,22,1724342517,1636396327,true +2020,9,22,392464210,2088991824,false +2020,9,22,1936913116,467953718,false +2020,9,22,2100040263,1209845606,false +2020,9,22,1745599874,660430285,true +2020,9,22,1084419877,1117096603,true +2020,9,22,1238310571,188104477,true +2020,9,22,194943878,1608227296,true +2020,9,23,1073171495,2062310524,true +2020,9,23,295269705,82617985,false +2020,9,23,892289549,744326767,true +2020,9,23,127643538,1353338105,true +2020,9,23,281725488,554194267,true +2020,9,23,111797953,1279192856,false +2020,9,23,616397523,1296362122,false +2020,9,23,733619124,1991974902,true +2020,9,23,987192011,620013629,true +2020,9,23,167468017,857996127,true +2020,9,24,1877288846,643903488,false +2020,9,24,1327419276,2126558401,false +2020,9,24,172561881,1610628482,true +2020,9,24,419258076,2056653255,false +2020,9,24,262856703,2045182357,true +2020,9,24,991551852,1082499865,true +2020,9,24,1804737782,1326859312,false +2020,9,24,2125659293,1944967684,false +2020,9,24,167428545,1939086715,true +2020,9,24,1995359223,253977592,true +2020,9,25,799236807,1141462354,true +2020,9,25,1835392548,799874325,false +2020,9,25,2140200245,2014428618,false +2020,9,25,231067898,1382883653,false +2020,9,25,1968738436,1037039760,true +2020,9,25,1999029298,266839954,false +2020,9,25,563465902,1187993666,true +2020,9,25,325209483,985478588,true +2020,9,25,1480635845,435734031,false +2020,9,25,1111622294,1742983734,true +2020,9,26,313249758,1103761732,true +2020,9,26,770599777,1054501146,true +2020,9,26,142454417,1871159692,true +2020,9,26,1288427177,383693997,false +2020,9,26,137809281,2135326222,true +2020,9,26,591547251,1063007372,false +2020,9,26,1376807845,1098173108,false +2020,9,26,1809954056,335619743,true +2020,9,26,409226568,961516087,false +2020,9,26,1203387859,1544788826,false +2020,9,27,1842383105,443137734,false +2020,9,27,1467148946,599700221,false +2020,9,27,1978377,1494668394,false +2020,9,27,1631304396,46687184,true +2020,9,27,969006777,1297557356,false +2020,9,27,1167338658,223340940,false +2020,9,27,1575328805,2018981479,false +2020,9,27,1135595835,324058222,false +2020,9,27,1404762107,73028311,true +2020,9,27,936309901,805157551,true +2020,9,28,1003592761,1407003704,false +2020,9,28,37553708,1072404817,true +2020,9,28,513514588,1918890265,false +2020,9,28,1690747486,96717177,false +2020,9,28,802635273,2140397232,true +2020,9,28,1797906869,2048153056,true +2020,9,28,971691880,423957191,true +2020,9,28,1882432208,430694387,false +2020,9,28,106380713,1093702953,true +2020,9,28,1087153431,939550395,false +2020,9,29,332933737,1462150351,false +2020,9,29,500339853,242020779,false +2020,9,29,334465261,1913303678,true 
+2020,9,29,1501816339,1843275744,true +2020,9,29,1485558902,1062921623,false +2020,9,29,684304550,1133271166,true +2020,9,29,2029856532,773841223,false +2020,9,29,1487342473,66982688,false +2020,9,29,1571708096,259493720,true +2020,9,29,324467283,1589655462,false +2020,9,30,1733644464,82922324,true +2020,9,30,1542252376,2022566594,true +2020,9,30,945277721,2081076154,true +2020,9,30,1639373293,1390293255,false +2020,9,30,969065433,1410332781,true +2020,9,30,1362225522,1555539062,false +2020,9,30,1282095814,1015776432,true +2020,9,30,1782499112,1870256310,true +2020,9,30,1359388819,34844744,false +2020,9,30,957261886,1779179644,true +2020,9,31,1627632519,130254485,false +2020,9,31,1928861333,45781707,false +2020,9,31,1807374716,1855519690,true +2020,9,31,680704563,40106261,true +2020,9,31,339965155,1925703560,true +2020,9,31,1439587928,1008781161,true +2020,9,31,1371178170,671097219,false +2020,9,31,687283378,1376712234,false +2020,9,31,198123449,1909323781,false +2020,9,31,250283558,1362008331,true +2020,10,1,1090033957,1027619172,false +2020,10,1,477537315,1451553084,true +2020,10,1,79849989,620036293,true +2020,10,1,5805698,376473873,false +2020,10,1,754181040,1484174275,false +2020,10,1,217198932,1527854476,true +2020,10,1,675039471,1892348375,true +2020,10,1,163947320,1853273739,false +2020,10,1,2078818590,422375314,false +2020,10,1,1893312527,1029941387,false +2020,10,2,1862057242,1813554944,true +2020,10,2,1584950554,1731175237,false +2020,10,2,1899718626,1658502107,false +2020,10,2,910287610,1976649349,false +2020,10,2,959884587,1873132880,false +2020,10,2,1836685051,1392618259,false +2020,10,2,605539926,1614143615,true +2020,10,2,811278515,1989656457,true +2020,10,2,1307343511,308491781,false +2020,10,2,1005423081,3709278,true +2020,10,3,129399073,1302596143,true +2020,10,3,1105372716,1257128091,true +2020,10,3,556678322,1376215717,false +2020,10,3,1589732527,891357424,false +2020,10,3,1087847907,132494763,false +2020,10,3,1961726532,1313846558,true +2020,10,3,1620889078,1323095638,true +2020,10,3,1249481354,1136359932,true +2020,10,3,1838237218,1326315067,true +2020,10,3,1053433511,108596348,true +2020,10,4,1542264720,1318672513,false +2020,10,4,223700266,1133333105,true +2020,10,4,305913466,1951675243,false +2020,10,4,1391446504,1397772311,false +2020,10,4,1510688561,515970216,true +2020,10,4,744246765,1560149121,true +2020,10,4,200400384,713177611,false +2020,10,4,2068719325,1200592245,false +2020,10,4,1697504019,413500292,true +2020,10,4,1451659863,237205356,true +2020,10,5,2137010679,804404326,true +2020,10,5,66932026,534845607,false +2020,10,5,384086090,548850018,false +2020,10,5,1565300693,387985042,false +2020,10,5,225073590,1702731202,false +2020,10,5,1973561201,1280838394,false +2020,10,5,590905165,1709278656,true +2020,10,5,1723671220,1652710073,true +2020,10,5,1190518489,1520870680,false +2020,10,5,1778653641,1042014690,true +2020,10,6,1837331141,1163022372,true +2020,10,6,2114724956,111915420,false +2020,10,6,304092583,1005351701,true +2020,10,6,43564733,1492267556,false +2020,10,6,487310485,678649404,false +2020,10,6,60355793,376260532,false +2020,10,6,1837085124,400152402,true +2020,10,6,1986901144,1112892226,true +2020,10,6,685083946,261099324,false +2020,10,6,1609831276,989038667,false +2020,10,7,1042152216,1765597391,false +2020,10,7,1293400142,1446299436,false +2020,10,7,198783237,410867827,false +2020,10,7,1035061973,1176137662,true +2020,10,7,1791602432,1817068982,false +2020,10,7,1818855837,1620369957,true +2020,10,7,319821958,2134200112,false 
+2020,10,7,1058559045,76182546,false +2020,10,7,1334303155,1084554830,false +2020,10,7,1475105892,994897651,false +2020,10,8,2010750459,1673481172,false +2020,10,8,372594171,1892861883,true +2020,10,8,913650679,1575202621,false +2020,10,8,1487313732,72411979,false +2020,10,8,559308265,373645881,true +2020,10,8,1658837244,240509608,false +2020,10,8,883216644,744144642,false +2020,10,8,720232309,60143739,true +2020,10,8,146822554,219884354,false +2020,10,8,702616079,445218194,false +2020,10,9,162474168,1084437170,true +2020,10,9,1946845675,1273437274,true +2020,10,9,1438284775,972626883,false +2020,10,9,1580177912,251882375,true +2020,10,9,2079771134,12299762,true +2020,10,9,1268554154,943566445,false +2020,10,9,38977608,822797460,false +2020,10,9,806744308,760282546,true +2020,10,9,2053521785,1840394806,true +2020,10,9,2050210662,998571355,false +2020,10,10,1228774426,1994823664,false +2020,10,10,1636198773,954654061,false +2020,10,10,1926561909,497187129,false +2020,10,10,279637598,1938847891,false +2020,10,10,233480751,123047956,false +2020,10,10,302241311,1104382089,false +2020,10,10,2056736783,705300156,true +2020,10,10,1205160570,91744828,true +2020,10,10,62240924,1866945301,true +2020,10,10,44537603,1183243229,false +2020,10,11,1497341090,493374851,true +2020,10,11,1193212836,972721502,true +2020,10,11,1911089343,361885697,false +2020,10,11,187359771,976574477,true +2020,10,11,1149609334,1142064013,true +2020,10,11,1480173156,1746537550,true +2020,10,11,1405022607,1904906554,false +2020,10,11,112354054,308153501,false +2020,10,11,1004690596,1313866131,true +2020,10,11,2130340287,320664415,false +2020,10,12,811534376,423903247,true +2020,10,12,699436549,923161910,false +2020,10,12,1391821645,1753041347,false +2020,10,12,1954216529,1540514759,false +2020,10,12,1025405590,1688343964,false +2020,10,12,482624054,278320439,true +2020,10,12,1199328764,1750667710,false +2020,10,12,748065450,1938232015,true +2020,10,12,913631458,1000690916,true +2020,10,12,575051536,1646878080,true +2020,10,13,851352920,1610607568,false +2020,10,13,440317158,501080384,true +2020,10,13,1346168923,1589062189,false +2020,10,13,277082399,1103603835,false +2020,10,13,1126340367,1589144431,true +2020,10,13,455276944,1173133066,true +2020,10,13,1313957159,610339739,false +2020,10,13,1084340736,1473714283,true +2020,10,13,1095543551,1355473143,false +2020,10,13,1258448824,249007926,false +2020,10,14,1369343498,1513486096,false +2020,10,14,1295210399,1307960313,true +2020,10,14,1360285862,1366624419,false +2020,10,14,1945375314,955080718,true +2020,10,14,1703630454,2077886787,true +2020,10,14,1253981873,438459645,true +2020,10,14,1239681228,1071335631,false +2020,10,14,407369426,1580583818,false +2020,10,14,75406035,341649951,false +2020,10,14,321272941,1734688138,false +2020,10,15,538275695,15096685,false +2020,10,15,1569741062,1690665982,true +2020,10,15,1919496976,1623247454,false +2020,10,15,1919964047,2007246652,true +2020,10,15,1774217711,1736801989,false +2020,10,15,1546590260,372412675,false +2020,10,15,804566890,801107559,false +2020,10,15,45861006,1559871686,false +2020,10,15,1502002555,974293361,true +2020,10,15,1823062389,1652252835,true +2020,10,16,1383414652,1074528868,false +2020,10,16,1845377308,573906359,false +2020,10,16,828611003,165051800,true +2020,10,16,558506832,1305792713,false +2020,10,16,306799496,1013461928,false +2020,10,16,423156448,1681022092,true +2020,10,16,1387324351,2783082,false +2020,10,16,1349600922,207731943,false +2020,10,16,1697396741,1637894976,true 
+2020,10,16,96615792,1591387900,false +2020,10,17,531254568,782897885,false +2020,10,17,1203515534,286317545,true +2020,10,17,1784701801,1183270360,false +2020,10,17,1830890385,453116031,false +2020,10,17,1509458837,1017226406,true +2020,10,17,1257456168,1443101275,true +2020,10,17,1707918904,1463100624,true +2020,10,17,1376946099,1011715873,false +2020,10,17,1367147601,1049879163,true +2020,10,17,2028898503,1899355350,false +2020,10,18,190745933,65189023,false +2020,10,18,1626161063,302860934,false +2020,10,18,1659985979,2090849927,false +2020,10,18,627667971,2010273456,false +2020,10,18,529348201,1313162344,true +2020,10,18,166507427,432028989,false +2020,10,18,389217269,620313134,true +2020,10,18,1986749643,706633956,false +2020,10,18,317796090,661359322,true +2020,10,18,607857784,1477624905,false +2020,10,19,1451661393,936023042,false +2020,10,19,1739706819,287587647,false +2020,10,19,129890605,286798925,false +2020,10,19,539180306,2076866798,true +2020,10,19,251473657,351909126,true +2020,10,19,1135344861,1524232653,false +2020,10,19,1457324855,1236129307,false +2020,10,19,886582065,2139336889,false +2020,10,19,1722675037,1235513634,true +2020,10,19,693597750,1634117764,true +2020,10,20,295607519,1177764668,true +2020,10,20,34257766,966836179,false +2020,10,20,1777297378,906276583,true +2020,10,20,367519796,2059779887,false +2020,10,20,1924839248,872840741,true +2020,10,20,574684241,1871931783,true +2020,10,20,2002528073,738208388,true +2020,10,20,1507584221,1334612893,false +2020,10,20,803953438,952091618,true +2020,10,20,400878873,1895639825,true +2020,10,21,528310877,1580828680,true +2020,10,21,1479789690,1984583690,true +2020,10,21,185008279,869100658,true +2020,10,21,527899437,37964969,true +2020,10,21,2024827361,507875546,true +2020,10,21,1563342188,622296013,true +2020,10,21,1885943372,666147939,true +2020,10,21,747082553,1801446608,true +2020,10,21,159513785,435206263,false +2020,10,21,1836932430,951350340,false +2020,10,22,1592541663,1776419759,false +2020,10,22,1883816300,2033269211,true +2020,10,22,1888272984,981877867,false +2020,10,22,1350652749,676196861,true +2020,10,22,1844881166,1154557398,true +2020,10,22,519904488,864890301,true +2020,10,22,824605033,285979537,false +2020,10,22,758295492,1524551206,false +2020,10,22,510378240,1380443965,true +2020,10,22,2020194954,858693658,true +2020,10,23,1813096130,111165583,true +2020,10,23,988110108,527842359,false +2020,10,23,2112281040,800545734,false +2020,10,23,907879131,107934345,true +2020,10,23,2067779717,1676254092,false +2020,10,23,1654834658,950209503,false +2020,10,23,785149016,2126647984,false +2020,10,23,603653975,1155930920,true +2020,10,23,598905240,1424766773,false +2020,10,23,1415780394,1026230060,false +2020,10,24,2108521536,1421040560,true +2020,10,24,458928861,247460198,true +2020,10,24,1626870088,1528743152,false +2020,10,24,1516611107,1363654910,true +2020,10,24,326447175,916078929,true +2020,10,24,174018934,655728909,false +2020,10,24,1188670266,1161149535,false +2020,10,24,1836469244,69216173,true +2020,10,24,2116098768,704525449,true +2020,10,24,1732668235,1999007630,false +2020,10,25,2010764178,1045236247,false +2020,10,25,1146712822,291590138,true +2020,10,25,362339504,1560454557,false +2020,10,25,226111201,354631570,true +2020,10,25,642810726,1811628450,true +2020,10,25,1478453706,1458400308,true +2020,10,25,970591798,1368000056,false +2020,10,25,1199728279,1386512062,false +2020,10,25,982759781,1471868595,false +2020,10,25,1677921854,1994150827,true +2020,10,26,22649468,650113222,true 
+2020,10,26,1354249570,1997695430,true +2020,10,26,1323555524,322167208,true +2020,10,26,49742851,649428570,true +2020,10,26,1282668782,681931030,true +2020,10,26,455472839,662950581,false +2020,10,26,1242169450,327652166,false +2020,10,26,878703825,1780906192,true +2020,10,26,1665790233,1191815728,false +2020,10,26,880664090,1243600251,false +2020,10,27,402160732,756612016,true +2020,10,27,1039242693,163501152,false +2020,10,27,12047541,1241798003,false +2020,10,27,1875849462,220407480,false +2020,10,27,1015817067,1251571881,true +2020,10,27,1199009275,699195274,true +2020,10,27,493968373,1389287155,false +2020,10,27,198916348,1804200135,true +2020,10,27,872883996,1857134347,true +2020,10,27,1606523496,1901376169,true +2020,10,28,1140101134,2046550802,true +2020,10,28,249579727,1229665039,false +2020,10,28,121932760,581309320,false +2020,10,28,1161366266,1059413213,false +2020,10,28,1866100742,684519773,true +2020,10,28,60977232,1900787590,false +2020,10,28,1932887893,488490656,true +2020,10,28,973468915,1015430460,true +2020,10,28,1561082815,1610610857,false +2020,10,28,1202109463,770781831,false +2020,10,29,473453380,2112892898,false +2020,10,29,7763586,676910132,true +2020,10,29,670077251,874983535,true +2020,10,29,725556275,1557172437,false +2020,10,29,578961469,1709139047,false +2020,10,29,2041751459,60397892,false +2020,10,29,1845217362,1404031844,true +2020,10,29,1483361826,613218287,true +2020,10,29,2104300775,1050354827,true +2020,10,29,604410936,660906204,false +2020,10,30,800274474,1584144221,true +2020,10,30,1089234451,1810242711,false +2020,10,30,1144653164,530529519,true +2020,10,30,945937257,776543889,true +2020,10,30,115561366,1593839202,true +2020,10,30,1168233307,1483248542,false +2020,10,30,1141559068,582014636,true +2020,10,30,2103439417,279515014,true +2020,10,30,1594148666,2100790544,true +2020,10,30,1277688885,477300978,false +2020,10,31,1219216481,237736665,false +2020,10,31,578500458,891328791,true +2020,10,31,701826589,1783624245,false +2020,10,31,332123327,1301252330,false +2020,10,31,1888474339,926453056,true +2020,10,31,1698201234,1725622215,false +2020,10,31,1096478840,1994300326,true +2020,10,31,847238449,532170704,false +2020,10,31,1884413436,967746140,true +2020,10,31,761235582,1418762860,true +2020,11,1,250642622,1383601251,true +2020,11,1,900256608,1715704752,false +2020,11,1,1198313310,1375507607,false +2020,11,1,972597960,669910426,false +2020,11,1,1532406508,1087674351,false +2020,11,1,1276691315,2049104992,true +2020,11,1,451309571,375057602,true +2020,11,1,1745270964,1989008205,true +2020,11,1,1083234835,1814054633,false +2020,11,1,366334948,1696398801,true +2020,11,2,60739414,719106517,false +2020,11,2,1259321168,1473184474,false +2020,11,2,1636415523,968196335,true +2020,11,2,244046010,1693726869,false +2020,11,2,2031660704,520836072,false +2020,11,2,2008946379,768781592,false +2020,11,2,1200827386,854251445,false +2020,11,2,1879316997,1581024432,false +2020,11,2,1193111221,248211057,false +2020,11,2,2004315413,404877514,true +2020,11,3,2028976557,941971297,false +2020,11,3,771266970,927410852,true +2020,11,3,1624853983,580191680,true +2020,11,3,1375006137,1276672848,false +2020,11,3,402370838,373497898,false +2020,11,3,860695569,1177865911,false +2020,11,3,1731396107,1070859192,true +2020,11,3,485825271,1987846543,false +2020,11,3,511760031,1995541808,false +2020,11,3,359268604,607607432,false +2020,11,4,1718876979,1489582032,true +2020,11,4,994405413,1658805534,false +2020,11,4,1048539725,14509214,true +2020,11,4,1684043369,1636921554,true 
+2020,11,4,18143755,1724354122,false +2020,11,4,1223025992,214892979,false +2020,11,4,797176080,1310903550,true +2020,11,4,1357665179,2083608447,false +2020,11,4,1458300656,1633989298,false +2020,11,4,52943855,1553577677,true +2020,11,5,2008581194,1530266588,false +2020,11,5,1804125956,1143047153,true +2020,11,5,693212429,74346906,false +2020,11,5,1722049871,435814818,true +2020,11,5,1082551197,1743589791,true +2020,11,5,1962001287,1471231372,true +2020,11,5,1112628688,342568340,true +2020,11,5,1526606245,1381100363,false +2020,11,5,1510659322,1688073892,false +2020,11,5,1741134060,1399573128,true +2020,11,6,116097164,291308400,false +2020,11,6,1792383176,1522110254,true +2020,11,6,2099216103,1470893469,true +2020,11,6,937681827,569752309,false +2020,11,6,1862693488,145521079,true +2020,11,6,107330573,1344282029,false +2020,11,6,327125265,1369612073,false +2020,11,6,92549054,1542612495,false +2020,11,6,1093995176,1959319400,true +2020,11,6,908796256,627646221,true +2020,11,7,786767634,676979380,false +2020,11,7,1409280809,1651383749,false +2020,11,7,1361622738,1785822706,true +2020,11,7,2046663584,1627783849,false +2020,11,7,1830611332,702474384,false +2020,11,7,1955365676,1026301060,true +2020,11,7,1438991267,737993330,true +2020,11,7,493409339,491669080,false +2020,11,7,1178104134,523118933,false +2020,11,7,1625016611,1676169545,true +2020,11,8,2073429466,697238063,true +2020,11,8,1292823050,2030486044,true +2020,11,8,1300291854,1652799094,false +2020,11,8,1852177020,2086219239,false +2020,11,8,840593254,1699376723,false +2020,11,8,800823663,1013113868,false +2020,11,8,1406128432,240338980,false +2020,11,8,624035825,1892017045,true +2020,11,8,1916951763,727468778,true +2020,11,8,700819886,714139779,true +2020,11,9,1422133678,774218856,true +2020,11,9,1261172744,848935355,true +2020,11,9,1254139643,1922358560,false +2020,11,9,2007997321,1960073867,true +2020,11,9,1311008384,719330172,true +2020,11,9,1792180424,1270704894,false +2020,11,9,1339344106,172721335,false +2020,11,9,652727582,43018372,false +2020,11,9,1676364325,1839076714,false +2020,11,9,1752839357,1189852298,false +2020,11,10,1611827730,1356217544,false +2020,11,10,240371461,1169021209,false +2020,11,10,1110002282,1753476329,false +2020,11,10,534109999,1424917408,true +2020,11,10,1388399792,35782587,true +2020,11,10,758695590,1824435301,false +2020,11,10,1919812474,420320115,true +2020,11,10,589360031,955673952,true +2020,11,10,787087486,322611849,false +2020,11,10,1800968664,1348711777,false +2020,11,11,720518439,1687337643,true +2020,11,11,158463106,1800297687,false +2020,11,11,542511515,1963866928,false +2020,11,11,972484544,537311122,false +2020,11,11,2090258040,1531805422,false +2020,11,11,1503712578,1841931084,true +2020,11,11,1909918648,1236305527,true +2020,11,11,1748517608,1194710256,true +2020,11,11,540011379,1767886647,false +2020,11,11,2114771826,313710280,true +2020,11,12,1584486787,2065285348,true +2020,11,12,1998708894,1226004505,true +2020,11,12,365218093,1969163832,true +2020,11,12,1521283920,1945512676,true +2020,11,12,1437771865,2081126139,false +2020,11,12,1874203024,1855923895,false +2020,11,12,1470612495,169126071,true +2020,11,12,91024257,922890199,true +2020,11,12,785824807,716492984,true +2020,11,12,1098293157,2002610223,false +2020,11,13,1104376939,1386585382,false +2020,11,13,1211116552,279323583,false +2020,11,13,1520680865,202914895,false +2020,11,13,1313027359,1716006027,false +2020,11,13,437083455,1181888647,false +2020,11,13,1596161159,223167024,false +2020,11,13,425286824,1046076451,true 
+2020,11,13,1263554799,927997866,false +2020,11,13,690954568,33180798,false +2020,11,13,1419031995,463126011,false +2020,11,14,149694709,1111446533,true +2020,11,14,186043479,3314855,false +2020,11,14,1206086219,220873400,true +2020,11,14,1217428018,333074753,false +2020,11,14,206648074,1753665967,false +2020,11,14,450571718,1113159047,false +2020,11,14,56356511,182907260,true +2020,11,14,1430999631,1010143160,true +2020,11,14,1729346394,1905552512,true +2020,11,14,487015665,387179302,true +2020,11,15,1598701823,1388283279,true +2020,11,15,1830602624,174483260,false +2020,11,15,1292985784,210385016,false +2020,11,15,692571182,1140625764,false +2020,11,15,871716495,1368806733,true +2020,11,15,1926257341,2113108633,true +2020,11,15,1028091645,24160375,false +2020,11,15,829623061,387263371,true +2020,11,15,534900525,1243189509,true +2020,11,15,881422917,1814250135,false +2020,11,16,2085991166,803347378,true +2020,11,16,1743566426,1631341828,false +2020,11,16,241978362,1866974357,true +2020,11,16,1135385237,1862213959,false +2020,11,16,1744998062,1140175479,true +2020,11,16,382109924,1181604412,false +2020,11,16,1847975976,456877638,true +2020,11,16,611432106,2100810638,false +2020,11,16,886893199,1294648866,false +2020,11,16,1736725043,1978910160,true +2020,11,17,84535301,273936106,false +2020,11,17,1563425655,264301807,true +2020,11,17,1247279454,1377280095,false +2020,11,17,795702769,1301253632,false +2020,11,17,1944588873,1003108148,true +2020,11,17,1130222249,1573472670,true +2020,11,17,1151760127,1054559650,true +2020,11,17,1291131094,1509097043,false +2020,11,17,805087132,202446424,false +2020,11,17,1508114586,1184015042,true +2020,11,18,1450275750,789292772,false +2020,11,18,490567362,148439430,false +2020,11,18,802282604,1280474596,false +2020,11,18,1560586644,1288876297,false +2020,11,18,327743304,1028517536,false +2020,11,18,642915861,1979027387,false +2020,11,18,427178221,103059165,false +2020,11,18,49549896,1325173106,false +2020,11,18,555419969,909745421,false +2020,11,18,413666385,1342571744,false +2020,11,19,839926641,343499806,true +2020,11,19,815413133,1431380055,false +2020,11,19,1947723847,1758607261,true +2020,11,19,302114803,2127129437,true +2020,11,19,337631467,2081394157,true +2020,11,19,789158237,941760186,false +2020,11,19,191921807,1343115190,false +2020,11,19,574739343,433227522,true +2020,11,19,680648638,1736997599,true +2020,11,19,1934152973,143291857,true +2020,11,20,1577999934,231146672,true +2020,11,20,20551210,1396419631,false +2020,11,20,292241610,755321326,true +2020,11,20,106556109,2051300564,true +2020,11,20,1006817810,458007823,false +2020,11,20,1886196957,881241926,false +2020,11,20,427578041,1644282328,true +2020,11,20,708004448,678733062,false +2020,11,20,1807011410,1039813378,true +2020,11,20,775575975,933328974,true +2020,11,21,1971725191,1297639867,true +2020,11,21,820249373,2004848642,true +2020,11,21,1728826867,565940252,false +2020,11,21,641049491,912393961,false +2020,11,21,1052724305,271187694,false +2020,11,21,815641382,1032033352,false +2020,11,21,1710745612,2144121938,true +2020,11,21,637484043,1241720,true +2020,11,21,1190542175,1913504802,false +2020,11,21,18616807,1613413384,true +2020,11,22,1132833148,1347343413,false +2020,11,22,61607291,1260106380,true +2020,11,22,1642799220,615998906,false +2020,11,22,1209741777,56855086,false +2020,11,22,891715290,819399280,false +2020,11,22,71907114,682507153,true +2020,11,22,1344530261,341995368,false +2020,11,22,1921176920,1870565807,false +2020,11,22,20757149,1143511173,false 
+2020,11,22,1531845205,502119350,true +2020,11,23,1538321560,1980927753,true +2020,11,23,1373532971,1617308434,true +2020,11,23,780960653,1326412484,true +2020,11,23,943168653,1635288096,true +2020,11,23,1499027631,2018713794,false +2020,11,23,1357301171,1231680578,true +2020,11,23,573650794,252824461,false +2020,11,23,726309715,1373727311,true +2020,11,23,1991396660,2144845915,true +2020,11,23,1233846735,1000867474,false +2020,11,24,1902185947,682392007,false +2020,11,24,1457179890,833826256,false +2020,11,24,167530615,611507164,false +2020,11,24,307040462,1857687314,true +2020,11,24,1071581081,1598510752,false +2020,11,24,1152289436,1146269221,false +2020,11,24,1770431302,629572454,false +2020,11,24,1260743085,2117383441,false +2020,11,24,1472034268,2017717894,true +2020,11,24,678379465,213581042,true +2020,11,25,1074294012,1956573780,false +2020,11,25,1522558354,1090255603,false +2020,11,25,54633990,2008686950,false +2020,11,25,1841920716,937465961,true +2020,11,25,1746659789,1787661714,true +2020,11,25,1459628937,1538229557,true +2020,11,25,1788050831,803442213,true +2020,11,25,1009703606,13064695,false +2020,11,25,1158678353,610161469,true +2020,11,25,1183027070,1807610814,false +2020,11,26,1823085105,522184028,false +2020,11,26,1750554224,4252976,true +2020,11,26,1893276495,1126233980,true +2020,11,26,667123835,658946409,false +2020,11,26,1238927956,723615575,true +2020,11,26,1709611284,236954361,true +2020,11,26,944349387,2029660831,false +2020,11,26,1952834083,1024859600,false +2020,11,26,1992085948,494063952,true +2020,11,26,926773132,1504482549,true +2020,11,27,1982342994,143180918,true +2020,11,27,1206775150,2046252378,true +2020,11,27,341569138,761437673,true +2020,11,27,398361454,865897778,true +2020,11,27,1299177605,1990054131,false +2020,11,27,1597469862,60087175,true +2020,11,27,640549360,146668776,true +2020,11,27,1793646358,1088132627,true +2020,11,27,1498249188,2060827139,true +2020,11,27,627812003,1197964156,true +2020,11,28,217989776,1993643574,false +2020,11,28,300021510,1401656961,false +2020,11,28,544874717,1191971497,true +2020,11,28,1654336835,1812509386,false +2020,11,28,907114807,55880730,false +2020,11,28,513553530,1657110094,false +2020,11,28,1471350549,287999837,false +2020,11,28,723698477,786258564,true +2020,11,28,88067902,1473892530,false +2020,11,28,1626750288,1112180457,true +2020,11,29,1450413302,733452197,true +2020,11,29,1286905031,364495488,true +2020,11,29,1188362401,651322034,false +2020,11,29,742028567,362632265,true +2020,11,29,269288570,688340209,true +2020,11,29,1771192436,1578674925,false +2020,11,29,2002519660,2145725793,true +2020,11,29,462254965,372310547,true +2020,11,29,334941751,5051067,true +2020,11,29,460302867,671270934,false +2020,11,30,1144679778,679593938,false +2020,11,30,1868393407,1052666126,false +2020,11,30,553519786,1816466400,true +2020,11,30,283235765,1007991925,false +2020,11,30,879234287,1375965023,true +2020,11,30,1040433619,782117066,true +2020,11,30,2048149960,1979457849,true +2020,11,30,2036338077,544060668,false +2020,11,30,484983376,1292392811,true +2020,11,30,1002784825,327172492,true +2020,11,31,388031683,78677373,true +2020,11,31,1879086974,1241060255,true +2020,11,31,489313281,1318139992,true +2020,11,31,1802822522,1581271676,false +2020,11,31,1521020442,1259205211,false +2020,11,31,997629168,1421464614,true +2020,11,31,1684390247,41217991,false +2020,11,31,1852804800,1160220690,true +2020,11,31,1822286820,1802482388,true +2020,11,31,1437625140,952014642,false \ No newline at end of file diff --git 
a/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleCompositeHandler.java b/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleCompositeHandler.java new file mode 100644 index 0000000000..a9a6cb199a --- /dev/null +++ b/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-example + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.example; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose ExampleMetadataHandler and ExampleRecordHandler. + */ +public class ExampleCompositeHandler + extends CompositeHandler +{ + public ExampleCompositeHandler() + { + super(new ExampleMetadataHandler(), new ExampleRecordHandler()); + } +} diff --git a/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleMetadataHandler.java b/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleMetadataHandler.java new file mode 100644 index 0000000000..e2a1462ea5 --- /dev/null +++ b/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleMetadataHandler.java @@ -0,0 +1,283 @@ +/*- + * #%L + * athena-example + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.connectors.athena.example;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockWriter;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+//DO NOT REMOVE - this will not be _unused_ when customers go through the tutorial and uncomment
+//the TODOs
+import org.apache.arrow.vector.types.Types;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Set;
+
+/**
+ * This class is part of a tutorial that will walk you through how to build a connector for your
+ * custom data source. The README for this module (athena-example) will guide you through preparing
+ * your development environment, modifying this example MetadataHandler, building, deploying, and then
+ * using your new source in an Athena query.
+ *

+ * More specifically, this class is responsible for providing Athena with metadata about the schemas (aka databases), + * tables, and table partitions that your source contains. Lastly, this class tells Athena how to split up reads against + * this source. This gives you control over the level of performance and parallelism your source can support. + *
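+ * As a rough sketch of the call sequence (an assumption based on how the ConnectorValidator tool added later in
+ * this change exercises connectors, not a documented contract), Athena will typically invoke doListSchemaNames,
+ * then doListTables, then doGetTable, and finally getPartitions and doGetSplits when planning a query against
+ * your source.
+ *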

+ * For more examples, please see the other connectors in this repository (e.g. athena-cloudwatch, athena-docdb, etc...)
+ */
+public class ExampleMetadataHandler
+        extends MetadataHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(ExampleMetadataHandler.class);
+
+    /**
+     * Used to aid in debugging. Athena will use this name in conjunction with your catalog id
+     * to correlate relevant query errors.
+     */
+    private static final String SOURCE_TYPE = "example";
+
+    public ExampleMetadataHandler()
+    {
+        super(SOURCE_TYPE);
+    }
+
+    /**
+     * Used to get the list of schemas (aka databases) that this source contains.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @return A ListSchemasResponse which primarily contains a Set<String> of schema names and a catalog name
+     * corresponding to the Athena catalog that was queried.
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator allocator, ListSchemasRequest request)
+    {
+        logger.info("doListSchemaNames: enter - " + request);
+
+        Set<String> schemas = new HashSet<>();
+
+        /**
+         * TODO: Add schemas, example below
+         *
+         schemas.add("schema1");
+         *
+         */
+
+        return new ListSchemasResponse(request.getCatalogName(), schemas);
+    }
+
+    /**
+     * Used to get the list of tables that this source contains.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog and database they are querying.
+     * @return A ListTablesResponse which primarily contains a List<TableName> enumerating the tables in this
+     * catalog, database tuple. It also contains the catalog name corresponding to the Athena catalog that was queried.
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator allocator, ListTablesRequest request)
+    {
+        logger.info("doListTables: enter - " + request);
+
+        List<TableName> tables = new ArrayList<>();
+
+        /**
+         * TODO: Add tables for the requested schema, example below
+         *
+         tables.add(new TableName(request.getSchemaName(), "table1"));
+         *
+         */
+
+        return new ListTablesResponse(request.getCatalogName(), tables);
+    }
+
+    /**
+     * Used to get the definition (field names, types, descriptions, etc...) of a Table.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog, database, and table they are querying.
+     * @return A GetTableResponse which primarily contains:
+     * 1. An Apache Arrow Schema object describing the table's columns, types, and descriptions.
+     * 2. A Set<String> of partition column names (or empty if the table isn't partitioned).
+     * 3. A TableName object confirming the schema and table name the response is for.
+     * 4. A catalog name corresponding to the Athena catalog that was queried.
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator allocator, GetTableRequest request)
+    {
+        logger.info("doGetTable: enter - " + request);
+
+        Set<String> partitionColNames = new HashSet<>();
+
+        /**
+         * TODO: Add partition columns, example below.
+         *
+         partitionColNames.add("year");
+         partitionColNames.add("month");
+         partitionColNames.add("day");
+         *
+         */
+
+        SchemaBuilder tableSchemaBuilder = SchemaBuilder.newBuilder();
+
+        /**
+         * TODO: Generate a schema for the requested table.
+         *
+         tableSchemaBuilder.addIntField("year")
+                 .addIntField("month")
+                 .addIntField("day")
+                 .addStringField("account_id")
+                 .addStructField("transaction")
+                 .addChildField("transaction", "id", Types.MinorType.INT.getType())
+                 .addChildField("transaction", "completed", Types.MinorType.BIT.getType())
+                 //Metadata whose name matches a column name
+                 //is interpreted as the description of that
+                 //column when you run "show tables" queries.
+                 .addMetadata("year", "The year that the payment took place in.")
+                 .addMetadata("month", "The month that the payment took place in.")
+                 .addMetadata("day", "The day that the payment took place in.")
+                 .addMetadata("account_id", "The account_id used for this payment.")
+                 .addMetadata("transaction", "The payment transaction details.")
+                 //This metadata field is for our own use, Athena will ignore and pass along fields it doesn't expect.
+                 //We will use this later when we implement doGetTableLayout(...)
+                 .addMetadata("partitionCols", "year,month,day");
+         *
+         */
+
+        return new GetTableResponse(request.getCatalogName(),
+                request.getTableName(),
+                tableSchemaBuilder.build(),
+                partitionColNames);
+    }
+
+    /**
+     * Used to get the partitions that must be read from the requested table in order to satisfy the requested predicate.
+     *
+     * @param blockWriter Used to write rows (partitions) into the Apache Arrow response.
+     * @param request Provides details of the catalog, database, and table being queried as well as any filter predicate.
+     * @param queryStatusChecker A QueryStatusChecker that you can use to stop doing work for a query that has already terminated
+     * @note Partitions are partially opaque to Amazon Athena in that it only understands your partition columns and
+     * how to filter out partitions that do not meet the query's constraints. Any additional columns you add to the
+     * partition data are ignored by Athena but passed on to calls on GetSplits.
+     */
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+            throws Exception
+    {
+        for (int year = 2000; year < 2018; year++) {
+            for (int month = 1; month < 12; month++) {
+                for (int day = 1; day < 31; day++) {
+
+                    final int yearVal = year;
+                    final int monthVal = month;
+                    final int dayVal = day;
+                    /**
+                     * TODO: For the partition represented by this year, month, and day, offer the values to the block
+                     * and check if they all passed constraints. The Block has been configured to automatically
+                     * apply our partition pruning constraints.
+                     *
+                    blockWriter.writeRows((Block block, int row) -> {
+                        boolean matched = true;
+                        matched &= block.setValue("year", row, yearVal);
+                        matched &= block.setValue("month", row, monthVal);
+                        matched &= block.setValue("day", row, dayVal);
+                        //If all fields matched then we wrote 1 row during this call so we return 1
+                        return matched ? 1 : 0;
+                    });
+                     *
+                     */
+                }
+            }
+        }
+    }
+
+    /**
+     * Used to split up the reads required to scan the requested batch of partition(s).
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details of the catalog, database, table, and partition(s) being queried as well as
+     * any filter predicate.
+     * @return A GetSplitsResponse which primarily contains:
+     * 1. A Set<Split> which represents the read operations Amazon Athena must perform by calling your read function.
+     * 2. (Optional) A continuation token which allows you to paginate the generation of splits for large queries.
+     * @note A Split is a mostly opaque object to Amazon Athena.
Amazon Athena will use the optional SpillLocation and
+     * optional EncryptionKey for pipelined reads but all properties you set on the Split are passed to your read
+     * function to help you perform the read.
+     */
+    @Override
+    public GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request)
+    {
+        logger.info("doGetSplits: enter - " + request);
+
+        String catalogName = request.getCatalogName();
+        Set<Split> splits = new HashSet<>();
+
+        Block partitions = request.getPartitions();
+
+        FieldReader day = partitions.getFieldReader("day");
+        FieldReader month = partitions.getFieldReader("month");
+        FieldReader year = partitions.getFieldReader("year");
+        for (int i = 0; i < partitions.getRowCount(); i++) {
+            //Set the readers to the partition row we are on
+            year.setPosition(i);
+            month.setPosition(i);
+            day.setPosition(i);
+
+            /**
+             * TODO: For each partition in the request, create 1 or more splits. Splits
+             * are parallelizable units of work. Each represents a part of your table
+             * that needs to be read for the query. Splits are opaque to Athena aside from the
+             * spill location and encryption key. All properties added to a split are solely
+             * for your use when Athena calls your readWithConstraints(...) function to perform
+             * the read. In this example we just need to know the partition details (year, month, day).
+             *
+            Split split = Split.newBuilder(makeSpillLocation(request), makeEncryptionKey())
+                    .add("year", String.valueOf(year.readInteger()))
+                    .add("month", String.valueOf(month.readInteger()))
+                    .add("day", String.valueOf(day.readInteger()))
+                    .build();
+
+            splits.add(split);
+             *
+             */
+        }
+
+        logger.info("doGetSplits: exit - " + splits.size());
+        return new GetSplitsResponse(catalogName, splits);
+    }
+}
diff --git a/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleRecordHandler.java b/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleRecordHandler.java
new file mode 100644
index 0000000000..605350612b
--- /dev/null
+++ b/athena-example/src/main/java/com/amazonaws/connectors/athena/example/ExampleRecordHandler.java
@@ -0,0 +1,191 @@
+/*-
+ * #%L
+ * athena-example
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.connectors.athena.example;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.handlers.RecordHandler;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.services.athena.AmazonAthenaClientBuilder;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.AmazonS3ClientBuilder;
+import com.amazonaws.services.s3.model.S3Object;
+import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.BufferedReader;
+import java.io.IOException;
+import java.io.InputStreamReader;
+
+import static java.lang.String.format;
+
+/**
+ * This class is part of a tutorial that will walk you through how to build a connector for your
+ * custom data source. The README for this module (athena-example) will guide you through preparing
+ * your development environment, modifying this example RecordHandler, building, deploying, and then
+ * using your new source in an Athena query.
+ *

+ * More specifically, this class is responsible for providing Athena with actual row-level data from your source. Athena
+ * will call readWithConstraint(...) on this class for each 'Split' you generated in ExampleMetadataHandler.
+ *
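+ * For example (an illustration based on this tutorial's year/month/day partitioning, not a requirement of the SDK),
+ * a Split carrying year=2020, month=11, day=1 would lead readWithConstraint(...) to read
+ * s3://<data_bucket>/2020/11/1/sample_data.csv, where <data_bucket> is the bucket configured via the data_bucket
+ * environment variable.
+ *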

+ * For more examples, please see the other connectors in this repository (e.g. athena-cloudwatch, athena-docdb, etc...)
+ */
+public class ExampleRecordHandler
+        extends RecordHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(ExampleRecordHandler.class);
+
+    /**
+     * Used to aid in debugging. Athena will use this name in conjunction with your catalog id
+     * to correlate relevant query errors.
+     */
+    private static final String SOURCE_TYPE = "example";
+
+    private AmazonS3 amazonS3;
+
+    public ExampleRecordHandler()
+    {
+        super(AmazonS3ClientBuilder.defaultClient(), AWSSecretsManagerClientBuilder.defaultClient(), AmazonAthenaClientBuilder.defaultClient(), SOURCE_TYPE);
+        this.amazonS3 = AmazonS3ClientBuilder.standard().build();
+    }
+
+    /**
+     * Used to read the row data associated with the provided Split.
+     *
+     * @param spiller A BlockSpiller that should be used to write the row data associated with this Split.
+     * The BlockSpiller automatically handles chunking the response, encrypting, and spilling to S3.
+     * @param recordsRequest Details of the read request, including:
+     * 1. The Split
+     * 2. The Catalog, Database, and Table the read request is for.
+     * 3. The filtering predicate (if any)
+     * 4. The columns required for projection.
+     * @param queryStatusChecker A QueryStatusChecker that you can use to stop doing work for a query that has already terminated
+     * @throws IOException If there is an error reading the sample data from S3.
+     * @note Avoid writing >10 rows per-call to BlockSpiller.writeRow(...) because this will limit the BlockSpiller's
+     * ability to control Block size. The resulting increase in Block size may cause failures and reduced performance.
+     */
+    @Override
+    protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+            throws IOException
+    {
+        logger.info("readWithConstraint: enter - " + recordsRequest.getSplit());
+
+        Split split = recordsRequest.getSplit();
+        int splitYear = 0;
+        int splitMonth = 0;
+        int splitDay = 0;
+
+        /**
+         * TODO: Extract information about what we need to read from the split. If you are following the tutorial
+         * this is basically the partition column values for year, month, day.
+         *
+         splitYear = split.getPropertyAsInt("year");
+         splitMonth = split.getPropertyAsInt("month");
+         splitDay = split.getPropertyAsInt("day");
+         *
+         */
+
+        String dataBucket = null;
+        /**
+         * TODO: Get the data bucket from the env variable set by athena-example.yaml
+         *
+         dataBucket = System.getenv("data_bucket");
+         *
+         */
+
+        String dataKey = format("%s/%s/%s/sample_data.csv", splitYear, splitMonth, splitDay);
+
+        BufferedReader s3Reader = openS3File(dataBucket, dataKey);
+        if (s3Reader == null) {
+            //There is no data to read for this split.
+            return;
+        }
+
+        //We read the transaction data line by line from our S3 object.
+        String line;
+        while ((line = s3Reader.readLine()) != null) {
+            logger.info("readWithConstraint: processing line " + line);
+
+            //The sample_data.csv file is structured as year,month,day,account_id,transaction.id,transaction.completed
+            String[] lineParts = line.split(",");
+
+            //We use the provided BlockSpiller to write our row data into the response. This utility is provided by
+            //the Amazon Athena Query Federation SDK and automatically handles breaking the data into reasonably sized
+            //chunks, encrypting it, and spilling to S3 if we've enabled these features.
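+            //Note: the lambda we pass to writeRows below writes at most one logical row per invocation and returns
+            //the number of rows it actually wrote: 1 if the row passed all constraints, otherwise 0.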
+            spiller.writeRows((Block block, int rowNum) -> {
+                boolean rowMatched = true;
+
+                int year = Integer.parseInt(lineParts[0]);
+                int month = Integer.parseInt(lineParts[1]);
+                int day = Integer.parseInt(lineParts[2]);
+                String accountId = lineParts[3];
+                int transactionId = Integer.parseInt(lineParts[4]);
+                boolean transactionComplete = Boolean.parseBoolean(lineParts[5]);
+
+                /**
+                 * TODO: Write the data using the supplied Block and check if the writes passed all constraints
+                 * before returning how many rows we wrote.
+                 *
+                rowMatched &= block.offerValue("year", rowNum, year);
+                rowMatched &= block.offerValue("month", rowNum, month);
+                rowMatched &= block.offerValue("day", rowNum, day);
+
+                //For complex types like List and Struct, we can build a Map to conveniently set nested values
+                Map<String, Object> eventMap = new HashMap<>();
+                eventMap.put("id", transactionId);
+                eventMap.put("completed", transactionComplete);
+
+                rowMatched &= block.offerComplexValue("transaction", rowNum, FieldResolver.DEFAULT, eventMap);
+                 *
+                 */
+
+                /**
+                 * TODO: The account_id field is a sensitive field, so we'd like to mask it to the last 4 before
+                 * returning it to Athena. Note that this will mean you can only filter (where/having)
+                 * on the masked value from Athena.
+                 *
+                String maskedAcctId = accountId.length() > 4 ? accountId.substring(accountId.length() - 4) : accountId;
+                rowMatched &= block.offerValue("account_id", rowNum, maskedAcctId);
+                 *
+                 */
+
+                //We return the number of rows written for this invocation. In our case 1 or 0.
+                return rowMatched ? 1 : 0;
+            });
+        }
+    }
+
+    /**
+     * Helper function for checking the existence of and opening S3 Objects for read.
+     */
+    private BufferedReader openS3File(String bucket, String key)
+    {
+        logger.info("openS3File: opening file " + bucket + ":" + key);
+        if (amazonS3.doesObjectExist(bucket, key)) {
+            S3Object obj = amazonS3.getObject(bucket, key);
+            logger.info("openS3File: opened file " + bucket + ":" + key);
+            return new BufferedReader(new InputStreamReader(obj.getObjectContent()));
+        }
+        return null;
+    }
+}
diff --git a/athena-federation-sdk-tools/LICENSE.txt b/athena-federation-sdk-tools/LICENSE.txt
new file mode 100644
index 0000000000..418de4c108
--- /dev/null
+++ b/athena-federation-sdk-tools/LICENSE.txt
@@ -0,0 +1,174 @@
+Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
\ No newline at end of file
diff --git a/athena-federation-sdk-tools/README.md b/athena-federation-sdk-tools/README.md
new file mode 100644
index 0000000000..5f775a8aa6
--- /dev/null
+++ b/athena-federation-sdk-tools/README.md
@@ -0,0 +1,9 @@
+# Amazon Athena Query Federation SDK Tools
+
+This module contains a collection of tools that are helpful in developing and testing Athena Query Federation components such as connectors. A detailed list
+of the tools in this module can be found below.
+
+* **Connector Validator** - A runnable class which emulates the calls that Athena will make to your Lambda function as part of executing a
+select * from <database>.<table> where <constraints>.

+The goal of this tool is to help you troubleshoot connectors by giving you visibility into what Athena would
+see. You can run this tool by using the helper script in the tools directory
+../tools/validate_connector.sh --lambda-func lambda_func [--record-func record_func] [--catalog catalog] [--schema schema [--table table [--constraints constraints]]] [--planning-only] [--help]
\ No newline at end of file
diff --git a/athena-federation-sdk-tools/pom.xml b/athena-federation-sdk-tools/pom.xml
new file mode 100644
index 0000000000..6e7572af53
--- /dev/null
+++ b/athena-federation-sdk-tools/pom.xml
@@ -0,0 +1,92 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-federation-sdk-tools</artifactId>
+    <packaging>jar</packaging>
+    <name>Amazon Athena Query Federation SDK Tools</name>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>log4j</groupId>
+            <artifactId>log4j</artifactId>
+            <version>1.2.17</version>
+        </dependency>
+        <dependency>
+            <groupId>commons-cli</groupId>
+            <artifactId>commons-cli</artifactId>
+            <version>1.4</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-checkstyle-plugin</artifactId>
+                <version>3.1.0</version>
+                <configuration>
+                    <configLocation>checkstyle.xml</configLocation>
+                    <encoding>UTF-8</encoding>
+                    <consoleOutput>true</consoleOutput>
+                    <failsOnError>false</failsOnError>
+                    <failOnViolation>false</failOnViolation>
+                </configuration>
+                <executions>
+                    <execution>
+                        <id>validate</id>
+                        <phase>validate</phase>
+                        <goals>
+                            <goal>check</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.1.1</version>
+                <configuration>
+                    <createDependencyReducedPom>true</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <id>withdep</id>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                        <configuration>
+                            <shadedClassifierName>withdep</shadedClassifierName>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/ConnectorValidator.java b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/ConnectorValidator.java
new file mode 100644
index 0000000000..91a0966446
--- /dev/null
+++ b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/ConnectorValidator.java
@@ -0,0 +1,528 @@
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK Tools
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connector.validation;
+
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.google.common.collect.Sets;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.apache.commons.cli.CommandLine;
+import org.apache.commons.cli.DefaultParser;
+import org.apache.commons.cli.HelpFormatter;
+import org.apache.commons.cli.Options;
+import org.apache.commons.cli.ParseException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Optional;
+import java.util.Random;
+import java.util.Set;
+import java.util.stream.Collectors;
+
+import static com.amazonaws.athena.connector.lambda.data.BlockUtils.rowToString;
+import static com.amazonaws.athena.connector.validation.ConstraintParser.parseConstraints;
+import static com.google.common.base.Preconditions.checkArgument;
+import static com.google.common.base.Preconditions.checkState;
+import static java.util.Objects.requireNonNull;
+
+/**
+ * This class should be used to validate deployed Lambda functions that use Athena's Federation SDK.
+ * It simulates the basic query patterns that Athena will send to Lambda throughout its usage so that
+ * broader and sometimes more subtle logical issues can be discovered before the connector is used through Athena.
+ *
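+ * Concretely, the main method below mirrors SHOW DATABASES, then SHOW TABLES IN <database>, then DESCRIBE <table>,
+ * and finally the planning and (optionally) record-reading phases of SELECT * FROM <table>.
+ *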

+ * You can run this tool using the following command: + * mvn exec:java -Dexec.mainClass=com.amazonaws.athena.connector.validation.ConnectorValidator -Dexec.args="[args]" + *
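+ * For example, to validate a deployed function named my-connector (a hypothetical name) against a specific table:
+ * mvn exec:java -Dexec.mainClass=com.amazonaws.athena.connector.validation.ConnectorValidator -Dexec.args="--lambda-func my-connector --schema my_schema --table my_table"
+ *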

+ * This tool can also be run using the validate_connector.sh script in the tools directory under the package root: + * tools/validate_connector.sh [args] + */ +public class ConnectorValidator +{ + private static final Logger log = LoggerFactory.getLogger(ConnectorValidator.class); + + private static final Random RAND = new Random(); + + static final BlockAllocator BLOCK_ALLOCATOR = new BlockAllocatorImpl(); + + private ConnectorValidator() + { + // Intentionally left blank. + } + + /** + * The main method of this class allows the following argument pattern: + * --lambda-func lambda_func [--record-func record_func] [--catalog catalog] + * [--schema schema [--table table [--constraints constraints]]] [--planning-only] [--help] + *

+ * Run with the -h or --help options to see full argument descriptions, or see {@link TestConfig} below.
+ */
+    public static void main(String[] args)
+    {
+        try {
+            TestConfig testConfig = TestConfig.fromArgs(args);
+
+            /*
+             * SHOW DATABASES
+             */
+            logTestQuery("SHOW DATABASES");
+            Collection<String> schemas = showDatabases(testConfig);
+
+            // SHOW TABLES
+            final String db = testConfig.getSchemaId().isPresent()
+                    ? testConfig.getSchemaId().get()
+                    : getRandomElement(schemas);
+            log.info("Using database {}", db);
+            logTestQuery("SHOW TABLES IN " + db);
+            Collection<TableName> tables = showTables(testConfig, db);
+
+            /*
+             * DESCRIBE TABLE
+             */
+            final TableName table = testConfig.getTableId().isPresent()
+                    ? new TableName(db, testConfig.getTableId().get())
+                    : getRandomElement(tables);
+            log.info("Using table {}", toQualifiedTableName(table));
+            logTestQuery("DESCRIBE " + toQualifiedTableName(table));
+            GetTableResponse tableResponse = describeTable(testConfig, table);
+            final Schema schema = tableResponse.getSchema();
+            final Set<String> partitionColumns = tableResponse.getPartitionColumns();
+
+            /*
+             * SELECT
+             */
+            logTestQuery("SELECT * FROM " + toQualifiedTableName(table));
+            GetTableLayoutResponse tableLayout = getTableLayout(testConfig, table, schema, partitionColumns);
+
+            GetSplitsResponse splitsResponse = getSplits(testConfig,
+                    table,
+                    schema,
+                    tableLayout,
+                    partitionColumns,
+                    null);
+
+            if (!testConfig.isPlanningOnly()) {
+                readRecords(testConfig, table, schema, splitsResponse.getSplits());
+            }
+            else {
+                log.info("Skipping record reading because the arguments indicated that only the planning should be validated.");
+            }
+
+            if (splitsResponse.getContinuationToken() == null) {
+                logSuccess();
+                return;
+            }
+
+            log.info("GetSplits included a continuation token: " + splitsResponse.getContinuationToken());
+            log.info("Testing next batch of splits.");
+
+            splitsResponse = getSplits(testConfig,
+                    table,
+                    schema,
+                    tableLayout,
+                    partitionColumns,
+                    splitsResponse.getContinuationToken());
+
+            if (!testConfig.isPlanningOnly()) {
+                readRecords(testConfig, table, schema, splitsResponse.getSplits());
+            }
+            else {
+                log.info("Skipping record reading because the arguments indicated that only the planning should be validated.");
+            }
+
+            logSuccess();
+        }
+        catch (Exception ex) {
+            logFailure(ex);
+            System.exit(1);
+        }
+
+        System.exit(0);
+    }
+
+    private static Collection<String> showDatabases(TestConfig testConfig)
+    {
+        ListSchemasResponse schemasResponse = LambdaMetadataProvider.listSchemas(testConfig.getCatalogId(),
+                testConfig.getMetadataFunction(),
+                testConfig.getIdentity());
+        final Collection<String> schemas = schemasResponse.getSchemas();
+        log.info("Found databases: " + schemas);
+        requireNonNull(schemas, "Returned collection of schemas was null!");
+        checkState(!schemas.isEmpty(), "No schemas were returned!");
+        List<String> notLower = schemas.stream().filter(s -> !s.equals(s.toLowerCase())).collect(Collectors.toList());
+        checkState(notLower.isEmpty(),
+                "All returned schemas must be lowercase!
Found these non-lowercase schemas: " + notLower);
+        return schemas;
+    }
+
+    private static Collection<TableName> showTables(TestConfig testConfig, String db)
+    {
+        ListTablesResponse tablesResponse = LambdaMetadataProvider.listTables(testConfig.getCatalogId(),
+                db,
+                testConfig.getMetadataFunction(),
+                testConfig.getIdentity());
+        final Collection<TableName> tables = tablesResponse.getTables();
+        log.info("Found tables: " + tables.stream()
+                .map(t -> toQualifiedTableName(t))
+                .collect(Collectors.toList()));
+        requireNonNull(tables, "Returned collection of tables was null!");
+        checkState(!tables.isEmpty(), "No tables were returned!");
+        List<String> notLower = tables.stream().filter(t -> !t.equals(new TableName(t.getSchemaName().toLowerCase(),
+                t.getTableName().toLowerCase()))).limit(5)
+                .map(t -> toQualifiedTableName(t))
+                .collect(Collectors.toList());
+        checkState(notLower.isEmpty(),
+                "All returned tables must be lowercase! Found these non-lowercase tables: " + notLower);
+        return tables;
+    }
+
+    private static GetTableResponse describeTable(TestConfig testConfig, TableName table)
+    {
+        GetTableResponse tableResponse = LambdaMetadataProvider.getTable(testConfig.getCatalogId(),
+                table,
+                testConfig.getMetadataFunction(),
+                testConfig.getIdentity());
+        TableName returnedTableName = tableResponse.getTableName();
+        checkState(table.equals(returnedTableName), "Returned table name did not match the requested table name!"
+                + " Expected " + toQualifiedTableName(table)
+                + " but found " + toQualifiedTableName(returnedTableName));
+        List<String> notLower = tableResponse.getSchema().getFields()
+                .stream()
+                .map(Field::getName)
+                .filter(f -> !f.equals(f.toLowerCase()))
+                .collect(Collectors.toList());
+        checkState(notLower.isEmpty(),
+                "All returned columns must be lowercase! Found these non-lowercase columns: " + notLower);
+        checkState(tableResponse.getSchema().getFields()
+                        .stream().map(Field::getName)
+                        .anyMatch(f -> !tableResponse.getPartitionColumns().contains(f)),
+                "Table must have at least one non-partition column!");
+        Set<String> fields = tableResponse.getSchema().getFields().stream().map(Field::getName).collect(Collectors.toSet());
+        Sets.SetView<String> difference = Sets.difference(tableResponse.getPartitionColumns(), fields);
+        checkState(difference.isEmpty(), "Table column list must include all partition columns! "
+                + "Found these partition columns which are not in the table's fields: "
+                + difference);
+        return tableResponse;
+    }
+
+    private static GetTableLayoutResponse getTableLayout(TestConfig testConfig,
+            TableName table,
+            Schema schema,
+            Set<String> partitionColumns)
+    {
+        Constraints constraints = parseConstraints(schema, testConfig.getConstraints());
+        GetTableLayoutResponse tableLayout = LambdaMetadataProvider.getTableLayout(testConfig.getCatalogId(),
+                table,
+                constraints,
+                schema,
+                partitionColumns,
+                testConfig.getMetadataFunction(),
+                testConfig.getIdentity());
+        log.info("Found " + tableLayout.getPartitions().getRowCount() + " partitions.");
+        checkState(tableLayout.getPartitions().getRowCount() > 0,
+                "Table " + toQualifiedTableName(table)
+                        + " did not return any partitions. This can happen if the table"
+                        + " is empty but could also indicate an issue."
+ + " Please populate the table or specify a different table."); + return tableLayout; + } + + private static GetSplitsResponse getSplits(TestConfig testConfig, + TableName table, + Schema schema, + GetTableLayoutResponse tableLayout, + Set partitionColumns, + String continuationToken) + { + Constraints constraints = parseConstraints(schema, testConfig.getConstraints()); + GetSplitsResponse splitsResponse = LambdaMetadataProvider.getSplits(testConfig.getCatalogId(), + table, + constraints, + tableLayout.getPartitions(), + new ArrayList<>(partitionColumns), + continuationToken, + testConfig.getMetadataFunction(), + testConfig.getIdentity()); + log.info("Found " + splitsResponse.getSplits().size() + " splits in batch."); + if (continuationToken == null) { + checkState(!splitsResponse.getSplits().isEmpty(), + "Table " + toQualifiedTableName(table) + + " did not return any splits. This can happen if the table" + + " is empty but could also indicate an issue." + + " Please populate the table or specify a different table."); + } + else { + checkState(!splitsResponse.getSplits().isEmpty(), + "Table " + toQualifiedTableName(table) + + " did not return any splits in the second batch despite returning" + + " a continuation token with the first batch."); + } + return splitsResponse; + } + + private static ReadRecordsResponse readRecords(TestConfig testConfig, + TableName table, + Schema schema, + Collection splits) + { + Constraints constraints = parseConstraints(schema, testConfig.getConstraints()); + Split split = getRandomElement(splits); + log.info("Executing randomly selected split with properties: {}", split.getProperties()); + ReadRecordsResponse records = LambdaRecordProvider.readRecords(testConfig.getCatalogId(), + table, + constraints, + schema, + split, + testConfig.getRecordFunction(), + testConfig.getIdentity()); + log.info("Received " + records.getRecordCount() + " records."); + checkState(records.getRecordCount() > 0, + "Table " + toQualifiedTableName(table) + + " did not return any rows in the tested split, even though an empty constraint was used." + + " This can happen if the table is empty but could also indicate an issue." 
+ + " Please populate the table or specify a different table."); + log.info("Discovered columns: " + + records.getSchema().getFields() + .stream() + .map(f -> f.getName() + ":" + f.getType().getTypeID()) + .collect(Collectors.toList())); + + if (records.getRecordCount() == 0) { + return records; + } + + log.info("First row of split: " + rowToString(records.getRecords(), 0)); + + return records; + } + + private static T getRandomElement(Collection elements) + { + int i = RAND.nextInt(elements.size()); + Iterator iter = elements.iterator(); + T elem; + do { + elem = iter.next(); + i--; + } while (i >= 0); + return elem; + } + + private static void logTestQuery(String query) + { + log.info("=================================================="); + log.info("Testing " + query); + log.info("=================================================="); + } + + private static void logSuccess() + { + log.info("=================================================="); + log.info("Successfully Passed Validation!"); + log.info("=================================================="); + } + + private static void logFailure(Exception ex) + { + log.error("=================================================="); + log.error("Error Encountered During Validation!", ex); + log.error("=================================================="); + } + + private static String toQualifiedTableName(TableName name) + { + return name.getSchemaName() + "." + name.getTableName(); + } + + private static class TestConfig + { + private static final String LAMBDA_METADATA_FUNCTION_ARG = "lambda-func"; + private static final String LAMBDA_RECORD_FUNCTION_ARG = "record-func"; + private static final String CATALOG_ID_ARG = "catalog"; + private static final String SCHEMA_ID_ARG = "schema"; + private static final String TABLE_ID_ARG = "table"; + private static final String CONSTRAINTS_ARG = "constraints"; + private static final String PLANNING_ONLY_ARG = "planning-only"; + private static final String HELP_ARG = "help"; + + private final FederatedIdentity identity; + private final String metadataFunction; + private final String recordFunction; + private final String catalogId; + private final Optional schemaId; + private final Optional tableId; + private final Optional constraints; + private final boolean planningOnly; + + private TestConfig(String metadataFunction, + String recordFunction, + String catalogId, + Optional schemaId, + Optional tableId, + Optional constraints, + boolean planningOnly) + { + this.metadataFunction = metadataFunction; + this.recordFunction = recordFunction; + this.catalogId = catalogId; + this.schemaId = schemaId; + this.tableId = tableId; + this.constraints = constraints; + this.planningOnly = planningOnly; + this.identity = new FederatedIdentity("VALIDATION_ACCESS_KEY", + "VALIDATION_PRINCIPAL", + "VALIDATION_ACCOUNT"); + } + + public FederatedIdentity getIdentity() + { + return identity; + } + + String getMetadataFunction() + { + return metadataFunction; + } + + String getRecordFunction() + { + return recordFunction; + } + + String getCatalogId() + { + return catalogId; + } + + Optional getSchemaId() + { + return schemaId; + } + + Optional getTableId() + { + return tableId; + } + + Optional getConstraints() + { + return constraints; + } + + boolean isPlanningOnly() + { + return planningOnly; + } + + static TestConfig fromArgs(String[] args) throws ParseException + { + log.info("Received arguments: {}", args); + + requireNonNull(args); + + Options options = new Options(); + options.addOption("f", 
LAMBDA_METADATA_FUNCTION_ARG, true, + "The name of the Lambda function to be validated. " + + "Uses your configured default AWS region."); + options.addOption("r", LAMBDA_RECORD_FUNCTION_ARG, true, + "The name of the Lambda function to be used to read data records. " + + "If not provided, this defaults to the value provided for lambda-func. " + + "Uses your configured default AWS region."); + options.addOption("c", CATALOG_ID_ARG, true, + "The catalog name to pass to the Lambda function to be validated."); + options.addOption("s", SCHEMA_ID_ARG, true, + "The schema name to be used when validating the Lambda function. " + + "If not provided, a random existing schema will be chosen."); + options.addOption("t", TABLE_ID_ARG, true, + "The table name to be used when validating the Lambda function. " + + "If not provided, a random existing table will be chosen."); + options.addOption("c", CONSTRAINTS_ARG, true, + "A comma-separated list of field/value pair constraints to be applied " + + "when reading metadata and records from the Lambda function to be validated"); + options.addOption("p", PLANNING_ONLY_ARG, false, + "If this option is set, then the validator will not attempt to read" + + " any records after calling GetSplits."); + options.addOption("h", HELP_ARG, false, "Prints usage information."); + DefaultParser argParser = new DefaultParser(); + CommandLine parsedArgs = argParser.parse(options, args); + + if (parsedArgs.hasOption(HELP_ARG)) { + new HelpFormatter().printHelp(150, "./validate_connector.sh --" + LAMBDA_METADATA_FUNCTION_ARG + + " lambda_func [--" + LAMBDA_RECORD_FUNCTION_ARG + + " record_func] [--" + CATALOG_ID_ARG + + " catalog] [--" + SCHEMA_ID_ARG + + " schema [--" + TABLE_ID_ARG + + " table [--" + CONSTRAINTS_ARG + + " constraints]]] [--" + PLANNING_ONLY_ARG + "] [--" + HELP_ARG + "]", + null, + options, + null); + System.exit(0); + } + + checkArgument(parsedArgs.hasOption(LAMBDA_METADATA_FUNCTION_ARG), + "Lambda function must be provided via the --lambda-func or -l args!"); + String metadataFunction = parsedArgs.getOptionValue(LAMBDA_METADATA_FUNCTION_ARG); + checkArgument(metadataFunction.equals(metadataFunction.toLowerCase()), + "Lambda function name must be lowercase."); + + if (parsedArgs.hasOption(TABLE_ID_ARG)) { + checkArgument(parsedArgs.hasOption(SCHEMA_ID_ARG), + "The --schema argument must be provided if the --table argument is provided."); + } + + if (parsedArgs.hasOption(CONSTRAINTS_ARG)) { + checkArgument(parsedArgs.hasOption(TABLE_ID_ARG), + "The --table argument must be provided if the --constraints argument is provided."); + } + + String catalog = metadataFunction; + if (parsedArgs.hasOption(CATALOG_ID_ARG)) { + catalog = parsedArgs.getOptionValue(CATALOG_ID_ARG); + checkArgument(catalog.equals(catalog.toLowerCase()), + "Catalog name must be lowercase."); + } + + return new TestConfig(metadataFunction, + parsedArgs.hasOption(LAMBDA_RECORD_FUNCTION_ARG) + ? 
parsedArgs.getOptionValue(LAMBDA_RECORD_FUNCTION_ARG) + : metadataFunction, + catalog, + Optional.ofNullable(parsedArgs.getOptionValue(SCHEMA_ID_ARG)), + Optional.ofNullable(parsedArgs.getOptionValue(TABLE_ID_ARG)), + Optional.ofNullable(parsedArgs.getOptionValue(CONSTRAINTS_ARG)), + parsedArgs.hasOption(PLANNING_ONLY_ARG)); + } + } +} diff --git a/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/ConstraintParser.java b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/ConstraintParser.java new file mode 100644 index 0000000000..239039ebd3 --- /dev/null +++ b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/ConstraintParser.java @@ -0,0 +1,219 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK Tools + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connector.validation; + +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.google.common.base.Splitter; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.util.Arrays; +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; +import java.util.Optional; +import java.util.stream.Collectors; + +import static com.amazonaws.athena.connector.validation.ConnectorValidator.BLOCK_ALLOCATOR; +import static com.google.common.base.Preconditions.checkArgument; +import static com.google.common.base.Preconditions.checkState; + +/** + * This class provides the ability to transform a simple constraint grammar into an instance of {@link Constraints}. + */ +public class ConstraintParser +{ + private static final Splitter CONSTRAINT_SPLITTER = Splitter.on(','); + + private ConstraintParser() + { + // Intentionally left blank. 
+ } + + private enum LogicalOperator + { + EQ("=", 0) + { + ValueSet createValueSet(ArrowType type, Object operand) + { + if (operand == null) { + return SortedRangeSet.newBuilder(type, true).build(); + } + return SortedRangeSet.of(false, Range.equal(BLOCK_ALLOCATOR, type, operand)); + } + }, + NEQ("!=", 1) + { + ValueSet createValueSet(ArrowType type, Object operand) + { + return EQ.createValueSet(type, operand).complement(BLOCK_ALLOCATOR); + } + }, + GT(">", 0) + { + ValueSet createValueSet(ArrowType type, Object operand) + { + return SortedRangeSet.of(false, Range.greaterThan(BLOCK_ALLOCATOR, type, operand)); + } + }, + GTE(">=", 1) + { + ValueSet createValueSet(ArrowType type, Object operand) + { + return SortedRangeSet.of(false, Range.greaterThanOrEqual(BLOCK_ALLOCATOR, type, operand)); + } + }, + LT("<", 0) + { + ValueSet createValueSet(ArrowType type, Object operand) + { + return SortedRangeSet.of(false, Range.lessThan(BLOCK_ALLOCATOR, type, operand)); + } + }, + LTE("<=", 1) + { + ValueSet createValueSet(ArrowType type, Object operand) + { + return SortedRangeSet.of(false, Range.lessThanOrEqual(BLOCK_ALLOCATOR, type, operand)); + } + }; + + private final String operator; + private final int rank; + + LogicalOperator(String operator, int rank) + { + this.operator = operator; + this.rank = rank; + } + + public String getOperator() + { + return operator; + } + + public int getRank() + { + return rank; + } + + abstract ValueSet createValueSet(ArrowType type, Object operand); + } + + /** + * This method takes in a table schema and a String representing the set of + * simple contraints to be ANDed together and applied to that table. + * + * @param schema The schema of the table in question + * @param input A comma-separated constraint String in the form of {field_name}{operator}{value}. + * The operators must be one of those available in {@link LogicalOperator}. + * Currently, we only support Boolean, Integer, Floating Point, Decimal, and String operands + * for this validator's constraints. + * @return a {@link Constraints} object populated from the input string, un-constrained if input is not present + */ + public static Constraints parseConstraints(Schema schema, Optional input) + { + if (!input.isPresent() || input.get().trim().isEmpty()) { + return new Constraints(Collections.EMPTY_MAP); + } + + Map fieldTypes = schema.getFields().stream() + .collect(Collectors.toMap(Field::getName, Field::getType)); + + Map constraints = new HashMap<>(); + Iterable constraintStrings = CONSTRAINT_SPLITTER.split(input.get()); + constraintStrings.forEach(str -> parseAndAddConstraint(fieldTypes, constraints, str)); + return new Constraints(constraints); + } + + private static void parseAndAddConstraint(Map fieldTypes, + Map constraints, + String constraintString) + { + LogicalOperator matchedOperator = null; + int bestMatchRank = Integer.MIN_VALUE; + for (LogicalOperator operator : LogicalOperator.values()) { + if (constraintString.contains(operator.getOperator()) && operator.getRank() > bestMatchRank) { + matchedOperator = operator; + bestMatchRank = operator.getRank(); + } + } + checkState(matchedOperator != null, + String.format("No operators found in constraint string '%s'!" 
+ + " Allowable operators are %s", constraintString, + Arrays.stream(LogicalOperator.values()) + .map(LogicalOperator::getOperator) + .collect(Collectors.toList()))); + + String[] operands = constraintString.split(matchedOperator.getOperator()); + String fieldName = operands[0].trim(); + + if (fieldName == null) { + throw new IllegalArgumentException( + String.format("Constraint segment %s could not be parsed into a valid expression!" + + " Please use the form for each constraint.", + constraintString)); + } + + // If there is no reference value for this field, then we treat the right operand as null. + Object correctlyTypedOperand = null; + if (operands.length > 1) { + checkArgument(operands.length == 2, + String.format("Constraint argument %s contains multiple occurrences of operator %s", + constraintString, matchedOperator.getOperator())); + correctlyTypedOperand = tryParseOperand(fieldTypes.get(fieldName), operands[1]); + } + + constraints.put(fieldName, matchedOperator.createValueSet(fieldTypes.get(fieldName), correctlyTypedOperand)); + } + + /* + * Currently, we only support Boolean, Integer, Floating Point, Decimal, and String operands. + */ + private static Object tryParseOperand(ArrowType type, String operand) + { + switch (type.getTypeID()) { + case Bool: + return Boolean.valueOf(operand); + case FloatingPoint: + case Decimal: + try { + return Float.valueOf(operand); + } + catch (NumberFormatException floatEx) { + return Double.valueOf(operand); + } + case Int: + try { + return Integer.valueOf(operand); + } + catch (NumberFormatException floatEx) { + return Long.valueOf(operand); + } + default: + // For anything else, we try passing the operand as provided. + return operand; + } + } +} diff --git a/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/LambdaMetadataProvider.java b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/LambdaMetadataProvider.java new file mode 100644 index 0000000000..1f7a42c755 --- /dev/null +++ b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/LambdaMetadataProvider.java @@ -0,0 +1,259 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK Tools + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.athena.connector.validation;
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest;
+import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest;
+import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse;
+import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest;
+import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory;
+import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
+import com.amazonaws.services.lambda.invoke.LambdaFunction;
+import com.amazonaws.services.lambda.invoke.LambdaFunctionNameResolver;
+import com.amazonaws.services.lambda.invoke.LambdaInvokerFactory;
+import com.amazonaws.services.lambda.invoke.LambdaInvokerFactoryConfig;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.Method;
+import java.util.List;
+import java.util.Set;
+import java.util.UUID;
+
+import static com.amazonaws.athena.connector.validation.ConnectorValidator.BLOCK_ALLOCATOR;
+
+/**
+ * This class offers multiple convenience methods to retrieve metadata from a deployed Lambda.
+ */
+public class LambdaMetadataProvider
+{
+    private static final Logger log = LoggerFactory.getLogger(LambdaMetadataProvider.class);
+    private static final String UNKNOWN_SUFFIX = "_unknown";
+
+    private LambdaMetadataProvider()
+    {
+        // Intentionally left blank.
+    }
+
+    /**
+     * This method builds and executes a ListSchemasRequest against the specified Lambda function.
+     *
+     * @param catalog the catalog name to be passed to Lambda
+     * @param metadataFunction the name of the Lambda function to call
+     * @param identity the identity of the caller
+     * @return the response
+     */
+    public static ListSchemasResponse listSchemas(String catalog,
+            String metadataFunction,
+            FederatedIdentity identity)
+    {
+        String queryId = UUID.randomUUID().toString() + UNKNOWN_SUFFIX;
+        log.info("Submitting ListSchemasRequest with ID " + queryId);
+
+        try (ListSchemasRequest request =
+                new ListSchemasRequest(identity, queryId, catalog)) {
+            log.info("Submitting request: {}", request);
+            ListSchemasResponse response = (ListSchemasResponse) getService(metadataFunction).getMetadata(request);
+            log.info("Received response: {}", response);
+            return response;
+        }
+        catch (Exception e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    /**
+     * This method builds and executes a ListTablesRequest against the specified Lambda function.
+     *
+     * @param catalog the catalog name to be passed to Lambda
+     * @param schema the name of the contextual schema for the request
+     * @param metadataFunction the name of the Lambda function to call
+     * @param identity the identity of the caller
+     * @return the response
+     */
+    public static ListTablesResponse listTables(String catalog,
+            String schema,
+            String metadataFunction,
+            FederatedIdentity identity)
+    {
+        String queryId = UUID.randomUUID().toString() + UNKNOWN_SUFFIX;
+        log.info("Submitting ListTablesRequest with ID " + queryId);
+
+        try (ListTablesRequest request =
+                new ListTablesRequest(identity, queryId, catalog, schema)) {
+            log.info("Submitting request: {}", request);
+            ListTablesResponse response = (ListTablesResponse) getService(metadataFunction).getMetadata(request);
+            log.info("Received response: {}", response);
+            return response;
+        }
+        catch (Exception e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    /**
+     * This method builds and executes a GetTableRequest against the specified Lambda function.
+     *
+     * @param catalog the catalog name to be passed to Lambda
+     * @param tableName the schema-qualified table name indicating which table should be retrieved
+     * @param metadataFunction the name of the Lambda function to call
+     * @param identity the identity of the caller
+     * @return the response
+     */
+    public static GetTableResponse getTable(String catalog,
+            TableName tableName,
+            String metadataFunction,
+            FederatedIdentity identity)
+    {
+        String queryId = UUID.randomUUID().toString() + UNKNOWN_SUFFIX;
+        log.info("Submitting GetTableRequest with ID " + queryId);
+
+        try (GetTableRequest request =
+                new GetTableRequest(identity, queryId, catalog, tableName)) {
+            log.info("Submitting request: {}", request);
+            GetTableResponse response = (GetTableResponse) getService(metadataFunction).getMetadata(request);
+            log.info("Received response: {}", response);
+            return response;
+        }
+        catch (Exception e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    /**
+     * This method builds and executes a GetTableLayoutRequest against the specified Lambda function.
+     *
+     * @param catalog the catalog name to be passed to Lambda
+     * @param tableName the schema-qualified table name indicating the table whose layout should be retrieved
+     * @param constraints the constraints to be applied to the request
+     * @param schema the schema of the table in question
+     * @param partitionCols the partition column names for the table in question
+     * @param metadataFunction the name of the Lambda function to call
+     * @param identity the identity of the caller
+     * @return the response
+     */
+    public static GetTableLayoutResponse getTableLayout(String catalog,
+            TableName tableName,
+            Constraints constraints,
+            Schema schema,
+            Set<String> partitionCols,
+            String metadataFunction,
+            FederatedIdentity identity)
+    {
+        String queryId = UUID.randomUUID().toString() + UNKNOWN_SUFFIX;
+        log.info("Submitting GetTableLayoutRequest with ID " + queryId);
+
+        try (GetTableLayoutRequest request =
+                new GetTableLayoutRequest(identity, queryId, catalog, tableName, constraints, schema, partitionCols)) {
+            log.info("Submitting request: {}", request);
+            GetTableLayoutResponse response = (GetTableLayoutResponse) getService(metadataFunction).getMetadata(request);
+            log.info("Received response: {}", response);
+            return response;
+        }
+        catch (Exception e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    /**
+     * This method builds and executes a GetSplitsRequest against the specified Lambda function.
+     *
+     * @param catalog the catalog name to be passed to Lambda
+     * @param tableName the schema-qualified table name indicating the table for which splits should be retrieved
+     * @param constraints the constraints to be applied to the request
+     * @param partitions the block of partitions to be provided with the request
+     * @param partitionCols the partition column names for the table in question
+     * @param contToken a continuation token to be provided with the request, or null
+     * @param metadataFunction the name of the Lambda function to call
+     * @param identity the identity of the caller
+     * @return the response
+     */
+    public static GetSplitsResponse getSplits(String catalog,
+            TableName tableName,
+            Constraints constraints,
+            Block partitions,
+            List<String> partitionCols,
+            String contToken,
+            String metadataFunction,
+            FederatedIdentity identity)
+    {
+        String queryId = UUID.randomUUID().toString() + UNKNOWN_SUFFIX;
+        log.info("Submitting GetSplitsRequest with ID " + queryId);
+
+        try (GetSplitsRequest request =
+                new GetSplitsRequest(identity, queryId, catalog, tableName, partitions, partitionCols, constraints, contToken)) {
+            log.info("Submitting request: {}", request);
+            GetSplitsResponse response = (GetSplitsResponse) getService(metadataFunction).getMetadata(request);
+            log.info("Received response: {}", response);
+            return response;
+        }
+        catch (Exception e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    public interface MetadataService
+    {
+        @LambdaFunction
+        MetadataResponse getMetadata(final MetadataRequest request);
+    }
+
+    public static final class Mapper
+            implements LambdaFunctionNameResolver
+    {
+        private final String metadataLambda;
+
+        private Mapper(String metadataLambda)
+        {
+            this.metadataLambda = metadataLambda;
+        }
+
+        @Override
+        public String getFunctionName(Method method, LambdaFunction lambdaFunction,
+                LambdaInvokerFactoryConfig lambdaInvokerFactoryConfig)
+        {
+            return metadataLambda;
+        }
+    }
+
+    private static MetadataService getService(String lambdaFunction)
+    {
+        return LambdaInvokerFactory.builder()
+                // Use the default region provider chain so the validator honors the caller's configured AWS region,
+                // as documented in the CLI help, rather than a hardcoded region.
+                .lambdaClient(AWSLambdaClientBuilder.standard().build())
+                .objectMapper(ObjectMapperFactory.create(BLOCK_ALLOCATOR))
+                .lambdaFunctionNameResolver(new Mapper(lambdaFunction))
+                .build(MetadataService.class);
+    }
+}
diff --git a/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/LambdaRecordProvider.java b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/LambdaRecordProvider.java
new file mode 100644
index 0000000000..8a4ea1e28d
--- /dev/null
+++ b/athena-federation-sdk-tools/src/main/java/com/amazonaws/athena/connector/validation/LambdaRecordProvider.java
@@ -0,0 +1,137 @@
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK Tools
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connector.validation;
+
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse;
+import com.amazonaws.athena.connector.lambda.records.RecordRequest;
+import com.amazonaws.athena.connector.lambda.records.RecordResponse;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory;
+import com.amazonaws.services.lambda.AWSLambdaClientBuilder;
+import com.amazonaws.services.lambda.invoke.LambdaFunction;
+import com.amazonaws.services.lambda.invoke.LambdaFunctionNameResolver;
+import com.amazonaws.services.lambda.invoke.LambdaInvokerFactory;
+import com.amazonaws.services.lambda.invoke.LambdaInvokerFactoryConfig;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.Method;
+import java.util.UUID;
+
+import static com.amazonaws.athena.connector.validation.ConnectorValidator.BLOCK_ALLOCATOR;
+
+/**
+ * This class offers a convenience method to retrieve records from a deployed Lambda.
+ */
+public class LambdaRecordProvider
+{
+    // Use this class's own logger rather than LambdaMetadataProvider's.
+    private static final Logger log = LoggerFactory.getLogger(LambdaRecordProvider.class);
+    private static final String UNKNOWN_SUFFIX = "_unknown";
+
+    private static final long MAX_BLOCK_SIZE = 16000000;
+    private static final long MAX_INLINE_BLOCK_SIZE = 5242880;
+
+    private LambdaRecordProvider()
+    {
+        // Intentionally left blank.
+    }
+
+    /**
+     * This method builds and executes a ReadRecordsRequest against the specified Lambda function.
+     *
+     * @param catalog the catalog name to be passed to Lambda
+     * @param tableName the schema-qualified table name indicating the table from which records should be read
+     * @param constraints the constraints to be applied to the request
+     * @param schema the schema of the table in question
+     * @param split the split to be read in this request
+     * @param recordFunction the name of the Lambda function to call
+     * @param identity the identity of the caller
+     * @return the response
+     */
+    public static ReadRecordsResponse readRecords(String catalog,
+            TableName tableName,
+            Constraints constraints,
+            Schema schema,
+            Split split,
+            String recordFunction,
+            FederatedIdentity identity)
+    {
+        String queryId = UUID.randomUUID().toString() + UNKNOWN_SUFFIX;
+        log.info("Submitting ReadRecordsRequest with ID " + queryId);
+
+        try (ReadRecordsRequest request =
+                new ReadRecordsRequest(identity,
+                        queryId,
+                        catalog,
+                        tableName,
+                        schema,
+                        split,
+                        constraints,
+                        MAX_BLOCK_SIZE,
+                        MAX_INLINE_BLOCK_SIZE)) {
+            log.info("Submitting request: {}", request);
+            ReadRecordsResponse response = (ReadRecordsResponse) getService(recordFunction).readRecords(request);
+            log.info("Received response: {}", response);
+            return response;
+        }
+        catch (Exception e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    public interface RecordService
+    {
+        @LambdaFunction
+        RecordResponse readRecords(final RecordRequest request);
+    }
+
+    public static final class Mapper
+            implements LambdaFunctionNameResolver
+    {
+        private final String recordLambda;
+
+        private Mapper(String recordLambda)
+        {
+            this.recordLambda = recordLambda;
+        }
+
+        @Override
+        public String getFunctionName(Method method, LambdaFunction lambdaFunction,
+                LambdaInvokerFactoryConfig lambdaInvokerFactoryConfig)
+        {
+            return recordLambda;
+        }
+    }
+
+    private static RecordService getService(String lambdaFunction)
+    {
+        return LambdaInvokerFactory.builder()
+                // Use the default region provider chain so the validator honors the caller's configured AWS region.
+                .lambdaClient(AWSLambdaClientBuilder.standard().build())
+                .objectMapper(ObjectMapperFactory.create(BLOCK_ALLOCATOR))
+                .lambdaFunctionNameResolver(new Mapper(lambdaFunction))
+                .build(RecordService.class);
+    }
+}
diff --git a/athena-federation-sdk-tools/src/main/resources/log4j.properties b/athena-federation-sdk-tools/src/main/resources/log4j.properties
new file mode 100644
index 0000000000..750e867d2b
--- /dev/null
+++ b/athena-federation-sdk-tools/src/main/resources/log4j.properties
@@ -0,0 +1,26 @@
+###
+# #%L
+# Amazon Athena Query Federation SDK
+# %%
+# Copyright (C) 2019 Amazon Web Services
+# %%
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#      http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# #L%
+###
+log4j.rootLogger = INFO, CONSOLE + +#Define the CONSOLE appender +log4j.appender.CONSOLE=org.apache.log4j.ConsoleAppender +log4j.appender.CONSOLE.layout=org.apache.log4j.PatternLayout +log4j.appender.CONSOLE.layout.conversionPattern=%d{yyyy-MM-dd HH:mm:ss} <%X{AWSRequestId}> %-5p %c{1}:%m%n diff --git a/athena-federation-sdk/LICENSE.txt b/athena-federation-sdk/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-federation-sdk/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. 
\ No newline at end of file
diff --git a/athena-federation-sdk/README.md b/athena-federation-sdk/README.md
new file mode 100644
index 0000000000..35ef092723
--- /dev/null
+++ b/athena-federation-sdk/README.md
@@ -0,0 +1,185 @@
+# Amazon Athena Query Federation SDK
+
+The Athena Query Federation SDK defines a set of interfaces and wire protocols that you can implement to enable Athena to delegate portions of its query execution plan to code that you deploy/write.
+
+This essentially allows you to customize Athena's core execution engine with your own functionality while still taking advantage of Athena's ease of use and fully managed nature.
+
+You can find a collection of ready made modules that allow Athena to connect to various data sources by going to [Serverless Application Repository](https://console.aws.amazon.com/serverlessrepo/). Serverless Application Repository will allow you to search for and 1-Click deploy Athena connectors.
+
+Alternatively, you can explore [the Amazon Athena Query Federation GitHub repository](https://github.com/awslabs/aws-athena-query-federation) for many of those same ready made connectors, modify them as you see fit, or write your own connector using the included example project.
+
+For those seeking to write their own connectors, we recommend you begin by going through the [tutorial in athena-example](https://github.com/awslabs/aws-athena-query-federation/tree/master/athena-example).
+
+## Features
+
+* **Federated Metadata** - It is not always practical to store table metadata in a centralized meta-store. As such, this SDK allows Athena to delegate portions of its query planning to your connector in order to retrieve metadata about your data source.
+* **Glue DataCatalog Support** - You can optionally enable a pre-built Glue MetadataHandler in your connector which will first attempt to fetch metadata from Glue about any table being queried before giving you an opportunity to modify or re-write the retrieved metadata. This can be handy when you are using a custom format in S3 or if your data source doesn't have its own source of metadata (e.g. redis).
+* **AWS Secrets Manager Integration** - If your connectors need passwords or other sensitive information, you can optionally use the SDK's built in tooling to resolve secrets. For example, if you have a config with a jdbc connection string you can do: "jdbc://${username}:${password}@hostname:port?options" and the SDK will automatically replace ${username} and ${password} with AWS Secrets Manager secrets of the same name.
+* **Federated Identity** - When Athena federates a query to your connector, you may want to perform authorization based on the identity of the entity that executed the Athena Query.
+* **Partition Pruning** - Athena will call your connector to understand how the table being queried is partitioned as well as to obtain which partitions need to be read for a given query. If your source supports partitioning, this gives you an opportunity to use the query predicate to perform partition pruning.
+* **Parallelized & Pipelined Reads** - Athena will parallelize reading your tables based on the partitioning information you provide. You also have the opportunity to tell Athena how (and if) it should split each partition into multiple (potentially concurrent) read operations. Behind the scenes Athena will parallelize reading the splits (work units) you've created and pipeline reads to reduce the performance impact of reading a remote source.
+* **Predicate Pushdown** - (Associative Predicates) Where relevant, Athena will supply you with the associative portion of the query predicate so that you can perform filtering or push the predicate into your source system for even better performance. It is important to note that the predicate is not always the query's full predicate. For example, if the query's predicate was "where (col0 < 1 or col1 < 10) and col2 + 10 < 100" only the "col0 < 1 or col1 < 10" will be supplied to you at this time. We are still considering the best form for supplying connectors with a more complete view of the query and its predicate, and expect a future release to provide this to connectors that are capable of utilizing it.
+* **Column Projection** - Where relevant, Athena will supply you with the columns that need to be projected so that you can reduce data scanned.
+* **Limited Scans** - While Athena is not yet able to push down limits to your connector, the SDK does expose a mechanism by which you can abandon a scan early. Athena will already avoid scanning partitions and splits that are not needed once a limit, failure, or user cancellation occurs, but this functionality will allow connectors that are in the middle of processing a split to stop regardless of the cause. This works even when the query's limit can not be semantically pushed down (e.g. limit happens after a filtered join). In a future release we may also introduce traditional limit pushdown for the simple cases that would support that.
+* **Congestion Control** - Some of the sources you may wish to federate to may not be as scalable as Athena or may be running performance sensitive workloads that you wish to protect from an overzealous federated query. Athena will automatically detect congestion by listening for FederationThrottleException(s) as well as many other AWS service exceptions that indicate your source is overwhelmed. When Athena detects congestion it reduces parallelism against your source. Within the SDK you can make use of ThrottlingInvoker to more tightly control congestion yourself. Lastly, you can reduce the concurrency your Lambda functions are allowed to achieve in the Lambda console and Athena will respect that setting.
+
+### DataTypes
+
+The wire protocol between your connector(s) and Athena is built on Apache Arrow with JSON for request/response structures. As such we make use of Apache Arrow's type system. At this time we support the below Apache Arrow types with plans to add more (e.g. timestamp w/TZ and Map are some of the upcoming additions).
+
+The below table lists the supported Apache Arrow types as well as the corresponding Java type you can use to 'set' values via Block.setValue(...) or BlockUtils.setValue(...), as shown in the sketch after the table. It is important to remember that while this SDK offers a number of convenience helpers to make working with Apache Arrow easier for the beginner, you always have the option of using Apache Arrow directly. Using Arrow directly can offer improved performance as well as more options for how you handle type conversion and coercion.
+
+|Apache Arrow Data Type|Java Type|
+|-------------|-----------------|
+|BIT|int, boolean|
+|DATEMILLI|Date, long, int|
+|DATEDAY|Date, long, int|
+|FLOAT8|double|
+|FLOAT4|float|
+|INT|int, long|
+|TINYINT|int|
+|SMALLINT|int|
+|BIGINT|long|
+|VARBINARY|byte[]|
+|DECIMAL|double, BigDecimal|
+|VARCHAR|String, Text|
+|STRUCT|Object (w/ FieldResolver)|
+|LIST|iterable (w/Optional FieldResolver)|
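+
+To make the type mapping concrete, here is a minimal, illustrative sketch of writing one row from inside a RecordHandler using the Block.setValue(...) helper mentioned above. The column names are hypothetical, `spiller` stands in for the BlockSpiller your handler receives, and exact signatures may differ slightly; see the examples in src/com/amazonaws/athena/connector/lambda/examples for authoritative usage.
+
+```java
+spiller.writeRows((Block block, int rowNum) -> {
+    //Each call pairs an Arrow column type from the table above with a compatible Java value.
+    block.setValue("my_bit_col", rowNum, true);        //BIT <- boolean
+    block.setValue("my_bigint_col", rowNum, 1234L);    //BIGINT <- long
+    block.setValue("my_float8_col", rowNum, 3.14D);    //FLOAT8 <- double
+    block.setValue("my_varchar_col", rowNum, "text");  //VARCHAR <- String
+    return 1;   //the number of rows written by this invocation
+});
+```
+
+
+## What is a 'Connector'?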
+
+A 'Connector' is a piece of code that understands how to execute portions of an Athena query outside of Athena's core engine. Connectors must satisfy a few basic requirements.
+
+1. Your connector must provide a source of meta-data for Athena to get schema information about what databases, tables, and columns your connector has. This is done by building and deploying a Lambda function that extends or composes com.amazonaws.athena.connector.lambda.handlers.MetadataHandler in the athena-federation-sdk module.
+2. Your connector must provide a way for Athena to read the data stored in your tables. This is done by building and deploying a Lambda function that extends or composes com.amazonaws.athena.connector.lambda.handlers.RecordHandler in the athena-federation-sdk module.
+
+Alternatively, you can deploy a single Lambda function which combines the two above requirements by using com.amazonaws.athena.connector.lambda.handlers.CompositeHandler or com.amazonaws.athena.connector.lambda.handlers.UnifiedHandler. While breaking this into two separate Lambda functions allows you to independently control the cost and timeout of your Lambda functions, using a single Lambda function can be simpler and higher performance due to fewer cold starts.
+
+In the next section we take a closer look at the methods we must implement on the MetadataHandler and RecordHandler.
+
+Included with this SDK is a set of examples in src/com/amazonaws/athena/connector/lambda/examples. You can deploy the examples using the included athena-federation-sdk.yaml file. Run `../tools/publish.sh S3_BUCKET_NAME athena-federation-sdk` to publish the connector to your private AWS Serverless Application Repository. This will allow users with permission to do so to deploy instances of the connector via the 1-Click form in the Serverless Application Repository console. After you've used the Serverless Application Repository UX to deploy an instance of your connector, you can run the validation script `../tools/validate_connector.sh --lambda-func <function_name>` to ensure your connector is valid before running an Athena query. Be sure to replace <function_name> with the name you gave to your function/catalog when you deployed it via Serverless Application Repository. For detailed steps on building and deploying please view the README.md in the athena-example module of this repository.
+
+You can then run `SELECT count(*) from "lambda:<function_name>"."custom_source"."fake_table" where year > 2010` from your Athena console. Be sure you replace <function_name> with the catalog name you gave your connector when you deployed it.
+
+
+### MetadataHandler Details
+
+Below we have the basic functions we need to implement when using the Amazon Athena Query Federation SDK's MetadataHandler to satisfy the boilerplate work of serialization and initialization. The abstract class we are extending takes care of all the Lambda interface bits and delegates only the discrete operations that are relevant to the task at hand, querying our new data source.
+
+All schema names, table names, and column names must be lower case at this time. Any entities that are uppercase or mixed case will not be accessible in queries and will be lower cased by Athena's engine to ensure consistency across sources. As such you may need to handle this when integrating with a source that supports mixed case. As an example, you can look at the CloudwatchTableResolver in the athena-cloudwatch module for one potential approach to this challenge, illustrated by the sketch below.
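+
+As a rough illustration of that pattern (this sketch is not the actual CloudwatchTableResolver, and every name in it is hypothetical), a connector can index its source's mixed-case names by their lower-cased form and translate back when Athena calls in:
+
+```java
+import java.util.HashMap;
+import java.util.Map;
+
+public class CaseInsensitiveTableResolver
+{
+    //Maps the lower-case name Athena will use to the source system's original casing.
+    private final Map<String, String> index = new HashMap<>();
+
+    //Call this for each table discovered in the source system.
+    public void addTable(String sourceTableName)
+    {
+        index.put(sourceTableName.toLowerCase(), sourceTableName);
+    }
+
+    //Translates the lower-case name supplied by Athena back to the source's casing.
+    public String resolve(String athenaTableName)
+    {
+        String resolved = index.get(athenaTableName.toLowerCase());
+        if (resolved == null) {
+            throw new IllegalArgumentException("Unknown table: " + athenaTableName);
+        }
+        return resolved;
+    }
+}
+```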
+
+```java
+public class MyMetadataHandler extends MetadataHandler
+{
+    @Override
+    protected ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest request)
+    {
+        //Return a list of Schema names (strings) for the requested catalog
+    }
+
+    @Override
+    protected ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest request)
+    {
+        //Return a list of tables (strings) for the requested catalog and schema
+    }
+
+    @Override
+    protected GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request)
+    {
+        //Return a table (column names, types, descriptions and table properties)
+    }
+
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request) {
+        //Generates the partitions of the requested table that need to be read
+        //to satisfy the supplied predicate. This is meant to be a fast pruning operation.
+        //Sources that don't support partitioning can return a single partition. Partitions
+        //are opaque to Athena and are just used to call the next method, doGetSplits(...)
+        //Partition Pruning is automatically handled by BlockWriter which creates
+        //Blocks that are constrained to filter out values that do not match
+        //the request's constraints. You can optionally get a ConstraintEvaluator
+        //from the BlockWriter or get constraints directly from the request if you
+        //need to do some custom filtering for performance reasons or to push
+        //down into your source system.
+    }
+
+    @Override
+    protected GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest request)
+    {
+        //Return the Split(s) that define how reading the requested table can be parallelized.
+        //Think of this method as a work-producer. Athena will call this paginated API while also
+        //scheduling each Split for execution. Sources that don't support parallelism can return
+        //a single split. Splits are mostly opaque to Athena and are just used to call your RecordHandler.
+    }
+}
+```
+
+You can find example MetadataHandlers by looking at some of the connectors in the repository. athena-cloudwatch and athena-tpcds are fairly easy to follow along with.
+
+Alternatively, if you wish to use AWS Glue DataCatalog as the authoritative (or supplemental) source of meta-data for your connector you can extend com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler instead of com.amazonaws.athena.connector.lambda.handlers.MetadataHandler. GlueMetadataHandler comes with implementations for doListSchemas(...), doListTables(...), and doGetTable(...), leaving you to implement only 2 methods. The Amazon Athena DocumentDB Connector in the athena-docdb module is an example of using GlueMetadataHandler.
+
+### RecordHandler Details
+
+Let's take a closer look at what is required for a RecordHandler. Below we have the basic functions we need to implement when using the Amazon Athena Query Federation SDK's RecordHandler to satisfy the boilerplate work of serialization and initialization. The abstract class we are extending takes care of all the Lambda interface bits and delegates only the discrete operations that are relevant to the task at hand, querying our new data source.
+
+```java
+public class MyRecordHandler
+        extends RecordHandler
+{
+    @Override
+    protected void readWithConstraint(ConstraintEvaluator constraintEvaluator,
+            BlockSpiller blockSpiller,
+            ReadRecordsRequest request)
+    {
+        //read the data represented by the Split in the request and use blockSpiller.writeRows()
+        //to write rows into the response. The Amazon Athena Query Federation SDK handles all the
+        //boilerplate of spilling large responses to S3, and optionally encrypting any spilled data.
+        //If your source supports filtering, use the Constraints objects on the request to push the predicate
+        //down into your source. You can also use the provided ConstraintEvaluator to perform filtering
+        //in this code block.
+    }
+}
+```
+
+## Performance
+
+Federated queries may run more slowly than queries which are 100% localized to Athena's execution engine, however much of this is dependent upon the source you are interacting with.
+When running a federated query, Athena makes use of a deep execution pipeline as well as various data pre-fetch techniques to hide the performance impact of doing remote reads. If
+your source supports parallel scans and predicate push-down it is possible to achieve performance that is close to that of native Athena.
+
+To put some real-world context around this, we tested this SDK as well as the usage of AWS Lambda by re-creating Athena's S3 + AWS Glue integration as a federated connector. We then
+ran 2 tests using a highly (~3000 files totaling 350GB) parallelizable dataset on S3. The tests were a select count(*) from test_table where our test table had 4 columns of
+primitive types (int, bigint, float4, float8). This query was purposely simple because we wanted to stress test the TABLE_SCAN operation which corresponds very closely to the
+current capabilities of our connector. We expect most workloads, for parallelizable source tables, to bottleneck on other areas of query execution before running into constraints
+associated with federated TABLE_SCAN performance.
+
+|Test|GB/Sec|Rows/Sec|
+|-------------|-----------------|-------------|
+|Federated S3 Query w/Apache Arrow|102 Gbps|1.5B rows/sec|
+|Athena + TextCSV on S3 Query|115 Gbps|120M rows/sec|
+|Athena + Parquet on S3 Query|30Gbps*|2.7B rows/sec|
+
+*Parquet's run-length encoding makes the GB/sec number somewhat irrelevant for comparison testing but since it is more compact than Apache Arrow it does mean lower network utilization.
+**These are not exhaustive tests but rather represent the point at which we stopped validation testing.
+
+
+### Throttling & Rate Limiting
+
+If your Lambda function(s) throw a FederationThrottleException or if Lambda/EC2 throws a limit-exceeded exception, Athena will use that as an indication that your Lambda function(s)
+or the source they talk to are under too much load and trigger Athena's Additive-Increase/Multiplicative-Decrease based Congestion Control mechanism. Some sources may generate
+throttling events in the middle of a Lambda invocation, after some data has already been returned. In these cases, Athena can not always automatically apply congestion control
+because retrying the call may lead to incorrect query results. We recommend using ThrottlingInvoker to handle calls to dependent services in your connector. The ThrottlingInvoker
+has hooks to see if you've already written rows to the response and thus decide how best to handle a Throttling event either by: sleeping and retrying in your Lambda function or
+by bubbling up a FederationThrottleException to Athena.
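+
+For illustration, here is a minimal sketch of wrapping a dependent-service call in a ThrottlingInvoker. The builder method names below are assumptions inferred from the configuration knobs described in the next paragraph, and callDependentService() is a hypothetical stand-in; consult the ThrottlingInvoker source for the exact API:
+
+```java
+//Assumed builder API: method names mirror the throttle_* settings listed below.
+ThrottlingInvoker invoker = ThrottlingInvoker.newDefaultBuilder(
+        //Tell the invoker which exceptions from our source mean "congested".
+        (Exception ex) -> ex instanceof FederationThrottleException)
+        .withInitialDelayMs(10)    //throttle_initial_delay_ms
+        .withMaxDelayMs(1000)      //throttle_max_delay_ms
+        .withDecrease(0.5)         //throttle_decrease_factor
+        .withIncrease(10)          //throttle_increase_ms
+        .build();
+
+//Calls routed through the invoker are paced down after congestion events and sped back up over time.
+//invoke(...) may throw if the configured delays are exhausted, so handle that in real code.
+String result = invoker.invoke(() -> callDependentService());
+```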
+ +You can configure ThrottlingInvoker via its builder or for pre-built connectors like athena-cloudwatch by setting the following environment variables: + +1. **throttle_initial_delay_ms** - (Default: 10ms) This is the initial call delay applied after the first congestion event. +1. **throttle_max_delay_ms** - (Default: 1000ms) This is the max delay between calls. You can derive TPS by dividing it into 1000ms. +1. **throttle_decrease_factor** - (Default: 0.5) This is the factor by which we reduce our call rate. +1. **throttle_increase_ms** - (Default: 10ms) This is the rate at which we decrease the call delay. + +## License + +This project is licensed under the Apache-2.0 License. \ No newline at end of file diff --git a/athena-federation-sdk/athena-federation-sdk.yaml b/athena-federation-sdk/athena-federation-sdk.yaml new file mode 100644 index 0000000000..538bab6327 --- /dev/null +++ b/athena-federation-sdk/athena-federation-sdk.yaml @@ -0,0 +1,58 @@ +Transform: 'AWS::Serverless-2016-10-31' +Metadata: + 'AWS::ServerlessRepo::Application': + Name: AthenaDataGeneratorConnector + Description: 'This connector enables Amazon Athena to communicate with a randomly generated data source.' + Author: 'Amazon Athena' + SpdxLicenseId: Apache-2.0 + LicenseUrl: LICENSE.txt + ReadmeUrl: README.md + Labels: + - athena-federation + HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation' + SemanticVersion: 1.0.0 + SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation' +Parameters: + AthenaCatalogName: + Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.' + Type: String + SpillBucket: + Description: 'The bucket where this function can spill data.' + Type: String + SpillPrefix: + Description: 'The bucket prefix where this function can spill large responses.' + Type: String + Default: athena-spill + LambdaTimeout: + Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)' + Default: 900 + Type: Number + LambdaMemory: + Description: 'Lambda memory in MB (min 128 - 3008 max).' + Default: 3008 + Type: Number + DisableSpillEncryption: + Description: "WARNING: If set to 'true' encryption for spilled data is disabled." + Default: 'false' + Type: String +Resources: + ConnectorConfig: + Type: 'AWS::Serverless::Function' + Properties: + Environment: + Variables: + disable_spill_encryption: !Ref DisableSpillEncryption + spill_bucket: !Ref SpillBucket + spill_prefix: !Ref SpillPrefix + FunctionName: !Ref AthenaCatalogName + Handler: "com.amazonaws.athena.connector.lambda.examples.ExampleCompositeHandler" + CodeUri: "./target/aws-athena-federation-sdk-2019.46.1-withdep.jar" + Description: "This connector enables Amazon Athena to communicate with a randomly generated data source." + Runtime: java8 + Timeout: !Ref LambdaTimeout + MemorySize: !Ref LambdaMemory + Policies: + #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy + #with one that is more restrictive and can only 'put' but not read,delete, or overwrite files. 
+ - S3CrudPolicy: + BucketName: !Ref SpillBucket \ No newline at end of file diff --git a/athena-federation-sdk/checkstyle.xml b/athena-federation-sdk/checkstyle.xml new file mode 100644 index 0000000000..c79e1ac630 --- /dev/null +++ b/athena-federation-sdk/checkstyle.xml @@ -0,0 +1,167 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + diff --git a/athena-federation-sdk/pom.xml b/athena-federation-sdk/pom.xml new file mode 100644 index 0000000000..bf8a2fae57 --- /dev/null +++ b/athena-federation-sdk/pom.xml @@ -0,0 +1,235 @@ + + + + 4.0.0 + + com.amazonaws + aws-athena-federation-sdk + 2019.47.1 + jar + Amazon Athena Query Federation SDK + + + 1.8 + 1.8 + 1.7.1 + + + + Amazon Web Services + https://https://aws.amazon.com// + + + 2019 + + + + Apache License 2.0 + http://www.apache.org/licenses/LICENSE-2.0 + repo + + + + + + com.amazonaws + aws-java-sdk-secretsmanager + 1.11.490 + + + com.amazonaws + aws-java-sdk-glue + 1.11.490 + + + com.amazonaws + aws-java-sdk-athena + 1.11.490 + + + org.apache.arrow + arrow-vector + 0.11.0 + + + org.apache.arrow + arrow-memory + 0.11.0 + + + com.amazonaws + aws-lambda-java-core + 1.2.0 + + + com.amazonaws + aws-java-sdk-lambda + 1.11.490 + + + com.amazonaws + aws-java-sdk-s3 + 1.11.490 + + + com.amazonaws + aws-java-sdk-kms + 1.11.490 + + + com.google.guava + guava + 21.0 + + + + org.slf4j + slf4j-api + ${slf4jVersion} + + + + org.slf4j + jcl-over-slf4j + ${slf4jVersion} + + + org.slf4j + slf4j-log4j12 + 1.7.25 + + + com.amazonaws + aws-lambda-java-log4j + 1.0.0 + + + + org.bouncycastle + bcprov-jdk15on + 1.61 + + + + junit + junit + 4.12 + test + + + + org.mockito + mockito-all + 1.10.19 + test + + + + + + + + + org.apache.maven.plugins + maven-checkstyle-plugin + 3.1.0 + + checkstyle.xml + UTF-8 + true + false + false + + + + validate + validate + + check + + + + + + + org.apache.maven.plugins + maven-shade-plugin + 3.1.1 + + true + + + *:* + + META-INF/*.SF + META-INF/*.DSA + META-INF/*.RSA + + + + + + + withdep + package + + shade + + + + withdep + + + + with-arrow + package + + shade + + + + with-arrow + true + + + com.amazonaws.athena:* + org.apache.arrow:* + + + + + + + + + org.codehaus.mojo + license-maven-plugin + 2.0.0 + + false + false + false + + + + first + + update-file-header + + process-sources + + apache_v2 + + src/main/java + src/test + + + + + + + + \ No newline at end of file diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/CollectionsUtils.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/CollectionsUtils.java new file mode 100644 index 0000000000..3710372442 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/CollectionsUtils.java @@ -0,0 +1,45 @@ +package com.amazonaws.athena.connector.lambda; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import java.util.Collection;
+
+/**
+ * Provides a null-safe, unordered equality check for Collections of equal size.
+ */
+public class CollectionsUtils
+{
+    private CollectionsUtils() {}
+
+    public static <T> boolean equals(Collection<T> lhs, Collection<T> rhs)
+    {
+        if (lhs == null && rhs == null) {
+            return true;
+        }
+
+        //exactly one side is null; the both-null case was handled above
+        if (lhs == null || rhs == null) {
+            return false;
+        }
+
+        if (lhs.size() != rhs.size()) {
+            return false;
+        }
+
+        return lhs.containsAll(rhs);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/QueryStatusChecker.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/QueryStatusChecker.java
new file mode 100644
index 0000000000..62ee98b71a
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/QueryStatusChecker.java
@@ -0,0 +1,134 @@
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connector.lambda;
+
+import com.amazonaws.services.athena.AmazonAthena;
+import com.amazonaws.services.athena.model.GetQueryExecutionRequest;
+import com.amazonaws.services.athena.model.GetQueryExecutionResult;
+import com.amazonaws.services.athena.model.InvalidRequestException;
+import com.google.common.collect.ImmutableSet;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Set;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+import static java.lang.String.format;
+
+/**
+ * This class provides a mechanism for callers to terminate in-progress work if the upstream Athena query waiting for that work has
+ * already terminated. Callers using the SDK as-is should only need to call #isQueryRunning, as the checker itself is
+ * created and closed by {@link com.amazonaws.athena.connector.lambda.handlers.MetadataHandler} or
+ * {@link com.amazonaws.athena.connector.lambda.handlers.RecordHandler}.
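+ * <p>
+ * A rough usage sketch; the results iterator and spill(...) helper are illustrative assumptions, not part of this SDK:
+ * <pre>
+ *     try (QueryStatusChecker checker = new QueryStatusChecker(athena, athenaInvoker, queryId)) {
+ *         while (results.hasNext() && checker.isQueryRunning()) {
+ *             spill(results.next());
+ *         }
+ *     }
+ * </pre>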
+ */
+public class QueryStatusChecker
+        implements AutoCloseable
+{
+    private static final Logger logger = LoggerFactory.getLogger(QueryStatusChecker.class);
+
+    //progressively longer delays at which to poll
+    private static final int[] FIBONACCI = new int[] {1, 1, 2, 3, 5, 8, 13, 21, 34, 55};
+    //Athena terminal states
+    private static final Set<String> TERMINAL_STATES = ImmutableSet.of("SUCCEEDED", "FAILED", "CANCELLED");
+
+    private boolean wasStarted = false;
+    private final AtomicBoolean isRunning = new AtomicBoolean(true);
+    private final AmazonAthena athena;
+    private final ThrottlingInvoker athenaInvoker;
+    private final String queryId;
+    private final Thread checkerThread;
+
+    public QueryStatusChecker(AmazonAthena athena, ThrottlingInvoker athenaInvoker, String queryId)
+    {
+        this.athena = athena;
+        this.athenaInvoker = athenaInvoker;
+        this.queryId = queryId;
+        this.checkerThread = new Thread(() -> runQueryStatusChecker(queryId), "QueryStatusCheckerThread-" + queryId);
+    }
+
+    /**
+     * Returns whether the query is still running.
+     */
+    public boolean isQueryRunning()
+    {
+        //start the checker thread if it hasn't started already
+        if (!wasStarted) {
+            synchronized (this) {
+                if (!wasStarted) {
+                    checkerThread.start();
+                    wasStarted = true;
+                }
+            }
+        }
+        return isRunning.get();
+    }
+
+    /**
+     * Stops the status checker thread.
+     */
+    @Override
+    public void close()
+    {
+        //fine if the thread isn't running
+        checkerThread.interrupt();
+        logger.debug("Interrupt signal sent to status checker thread");
+    }
+
+    private void runQueryStatusChecker(String queryId)
+    {
+        int attempt = 0;
+        while (isRunning.get()) {
+            int delay = FIBONACCI[Math.min(attempt, FIBONACCI.length - 1)];
+            try {
+                Thread.sleep(delay * 1000);
+                checkStatus(queryId, attempt);
+            }
+            catch (InterruptedException e) {
+                logger.debug("Checker thread interrupted. Ceasing status polling");
+                return;
+            }
+            attempt++;
+        }
+        logger.debug("Query terminated. Ceasing status polling");
+    }
+
+    private void checkStatus(String queryId, int attempt)
+            throws InterruptedException
+    {
+        logger.debug(format("Background thread checking status of Athena query %s, attempt %d", queryId, attempt));
+        try {
+            GetQueryExecutionResult queryExecution = athenaInvoker.invoke(() -> athena.getQueryExecution(new GetQueryExecutionRequest().withQueryExecutionId(queryId)));
+            String state = queryExecution.getQueryExecution().getStatus().getState();
+            if (TERMINAL_STATES.contains(state)) {
+                logger.debug("Query {} has terminated with state {}", queryId, state);
+                isRunning.set(false);
+            }
+        }
+        catch (RuntimeException | TimeoutException e) {
+            logger.warn("Exception {} thrown when calling Athena for query status: {}", e.getClass().getSimpleName(), e.getMessage());
+            if (e instanceof InvalidRequestException) {
+                //query does not exist, so no need to keep calling Athena
+                logger.debug("Athena reports query {} not found. Interrupting checker thread", queryId);
+                throw new InterruptedException();
+            }
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/ThrottlingInvoker.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/ThrottlingInvoker.java
new file mode 100644
index 0000000000..fc1fa99628
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/ThrottlingInvoker.java
@@ -0,0 +1,278 @@
+package com.amazonaws.athena.connector.lambda;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.exceptions.FederationThrottleException;
+import org.apache.arrow.util.VisibleForTesting;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.concurrent.Callable;
+import java.util.concurrent.TimeoutException;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.concurrent.atomic.AtomicReference;
+
+/**
+ * Wraps calls to dependent services with Additive-Increase/Multiplicative-Decrease (AIMD) congestion control,
+ * sleeping between calls once a congestion event (as identified by the supplied ExceptionFilter) has been observed.
+ */
+public class ThrottlingInvoker
+{
+    private static final Logger logger = LoggerFactory.getLogger(ThrottlingInvoker.class);
+
+    private static final String THROTTLE_INITIAL_DELAY_MS = "throttle_initial_delay_ms";
+    private static final String THROTTLE_MAX_DELAY_MS = "throttle_max_delay_ms";
+    private static final String THROTTLE_DECREASE_FACTOR = "throttle_decrease_factor";
+    private static final String THROTTLE_INCREASE_MS = "throttle_increase_ms";
+
+    private static final long DEFAULT_INITIAL_DELAY_MS = 10;
+    private static final long DEFAULT_MAX_DELAY_MS = 1_000;
+    private static final double DEFAULT_DECREASE_FACTOR = 0.5D;
+    private static final long DEFAULT_INCREASE_MS = 10;
+
+    private final long initialDelayMs;
+    private final long maxDelayMs;
+    private final double decrease;
+    private final long increase;
+    private final ExceptionFilter filter;
+    private final AtomicReference<BlockSpiller> spillerRef;
+    private final AtomicLong delay = new AtomicLong(0);
+    private volatile State state = State.FAST_START;
+
+    public enum State
+    {FAST_START, CONGESTED, AVOIDANCE}
+
+    public interface ExceptionFilter
+    {
+        boolean isMatch(Exception ex);
+    }
+
+    public ThrottlingInvoker(Builder builder)
+    {
+        this(builder.initialDelayMs,
+                builder.maxDelayMs,
+                builder.decrease,
+                builder.increase,
+                builder.filter,
+                builder.spiller);
+    }
+
+    public ThrottlingInvoker(long initialDelayMs,
+            long maxDelayMs,
+            double decrease,
+            long increase,
+            ExceptionFilter filter,
+            BlockSpiller spiller)
+    {
+        if (decrease > 1 || decrease < .001) {
+            throw new IllegalArgumentException("decrease was " + decrease + " but should be between .001 and 1");
+        }
+
+        if (maxDelayMs < 1) {
+            throw new IllegalArgumentException("maxDelayMs was " + maxDelayMs + " but must be >= 1");
+        }
+
+        if (increase < 1) {
+            throw new IllegalArgumentException("increase was " + increase + " but must be >= 1");
+        }
+
+        this.initialDelayMs = initialDelayMs;
+        this.maxDelayMs = maxDelayMs;
+        this.decrease = decrease;
+        this.increase = increase;
+        this.filter = filter;
+        this.spillerRef = new AtomicReference<>(spiller);
+    }
+
+    public static Builder newBuilder()
+    {
+        return new Builder();
+    }
+
+    public static Builder newDefaultBuilder(ExceptionFilter filter)
+    {
+        long initialDelayMs = (System.getenv(THROTTLE_INITIAL_DELAY_MS) != null) ?
+                Long.parseLong(System.getenv(THROTTLE_INITIAL_DELAY_MS)) : DEFAULT_INITIAL_DELAY_MS;
+        long maxDelayMs = (System.getenv(THROTTLE_MAX_DELAY_MS) != null) ?
+                Long.parseLong(System.getenv(THROTTLE_MAX_DELAY_MS)) : DEFAULT_MAX_DELAY_MS;
+        //Parse the decrease factor as a double; parsing it as a long would fail for fractional values like the 0.5 default.
+        double decreaseFactor = (System.getenv(THROTTLE_DECREASE_FACTOR) != null) ?
+                Double.parseDouble(System.getenv(THROTTLE_DECREASE_FACTOR)) : DEFAULT_DECREASE_FACTOR;
+        long increase = (System.getenv(THROTTLE_INCREASE_MS) != null) ?
+                Long.parseLong(System.getenv(THROTTLE_INCREASE_MS)) : DEFAULT_INCREASE_MS;
+
+        return newBuilder()
+                .withInitialDelayMs(initialDelayMs)
+                .withMaxDelayMs(maxDelayMs)
+                .withDecrease(decreaseFactor)
+                .withIncrease(increase)
+                .withFilter(filter);
+    }
+
+    public <T> T invoke(Callable<T> callable)
+            throws TimeoutException
+    {
+        return invoke(callable, 0);
+    }
+
+    public <T> T invoke(Callable<T> callable, long timeoutMillis)
+            throws TimeoutException
+    {
+        long startTime = System.currentTimeMillis();
+        do {
+            try {
+                applySleep();
+                T result = callable.call();
+                handleAvoidance();
+                return result;
+            }
+            catch (Exception ex) {
+                if (!filter.isMatch(ex)) {
+                    //The exception did not match our filter for congestion, throw
+                    throw (ex instanceof RuntimeException) ? (RuntimeException) ex : new RuntimeException(ex);
+                }
+                handleThrottle(ex);
+            }
+        }
+        while (!isTimedOut(startTime, timeoutMillis));
+
+        throw new TimeoutException("Timed out before call succeeded after " + (System.currentTimeMillis() - startTime) + " ms");
+    }
+
+    public void setBlockSpiller(BlockSpiller spiller)
+    {
+        spillerRef.set(spiller);
+    }
+
+    public State getState()
+    {
+        return state;
+    }
+
+    @VisibleForTesting
+    protected long getDelay()
+    {
+        return delay.get();
+    }
+
+    private synchronized void handleThrottle(Exception ex)
+    {
+        if (spillerRef.get() != null && !spillerRef.get().spilled()) {
+            //If no blocks have spilled, it is better to signal the Throttle to Athena by propagating.
+            throw new FederationThrottleException("ThrottlingInvoker requesting slow down due to " + ex, ex);
+        }
+
+        long newDelay = (long) Math.ceil(delay.get() / decrease);
+        if (newDelay == 0) {
+            newDelay = initialDelayMs;
+        }
+        else if (newDelay > maxDelayMs) {
+            newDelay = maxDelayMs;
+        }
+        logger.info("handleThrottle: Encountered a Throttling event[{}] adjusting delay to {} ms @ {} TPS",
+                ex, newDelay, 1000D / newDelay);
+        state = State.CONGESTED;
+        delay.set(newDelay);
+    }
+
+    private synchronized void handleAvoidance()
+    {
+        long newDelay = delay.get() - increase;
+        if (newDelay <= 0) {
+            newDelay = 0;
+        }
+
+        if (delay.get() > 0) {
+            state = State.AVOIDANCE;
+            logger.info("handleAvoidance: Congestion AVOIDANCE active, decreasing delay to {} ms @ {} TPS",
+                    newDelay, (newDelay > 0) ? 1000 / newDelay : "unlimited");
+            delay.set(newDelay);
+        }
+    }
+
+    private void applySleep()
+    {
+        if (delay.get() > 0) {
+            try {
+                Thread.sleep(delay.get());
+            }
+            catch (InterruptedException ex) {
+                Thread.currentThread().interrupt();
+                throw new RuntimeException(ex);
+            }
+        }
+    }
+
+    private boolean isTimedOut(long startTime, long timeoutMillis)
+    {
+        return (timeoutMillis > 0) ?
System.currentTimeMillis() - startTime > timeoutMillis : false; + } + + public static class Builder + { + private long initialDelayMs; + private long maxDelayMs; + private double decrease; + private long increase; + private ExceptionFilter filter; + private BlockSpiller spiller; + + public Builder withInitialDelayMs(long initialDelayMs) + { + this.initialDelayMs = initialDelayMs; + return this; + } + + public Builder withMaxDelayMs(long maxDelayMs) + { + this.maxDelayMs = maxDelayMs; + return this; + } + + public Builder withDecrease(double decrease) + { + this.decrease = decrease; + return this; + } + + public Builder withIncrease(long increase) + { + this.increase = increase; + return this; + } + + public Builder withFilter(ExceptionFilter filter) + { + this.filter = filter; + return this; + } + + public Builder withSpiller(BlockSpiller spiller) + { + this.spiller = spiller; + return this; + } + + public ThrottlingInvoker build() + { + return new ThrottlingInvoker(this); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/ArrowTypeComparator.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/ArrowTypeComparator.java new file mode 100644 index 0000000000..dcca79545d --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/ArrowTypeComparator.java @@ -0,0 +1,109 @@ +package com.amazonaws.athena.connector.lambda.data; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.bouncycastle.util.Arrays; +import org.joda.time.LocalDateTime; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.math.BigDecimal; + +/** + * This utility class can be used to implement a comparator for various Apache Arrow typed values. It is mostly + * used as part of our testing harness and notably does not support certain complex types (e.g. STRUCTs). 
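+ * <p>
+ * A small usage sketch (the values are chosen purely for illustration):
+ * <pre>
+ *     //yields a negative number because 1 is less than 2
+ *     int result = ArrowTypeComparator.compare(new ArrowType.Int(32, true), 1, 2);
+ * </pre>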
+ */ +public class ArrowTypeComparator +{ + private static final Logger logger = LoggerFactory.getLogger(ArrowTypeComparator.class); + + private ArrowTypeComparator() {} + + public static int compare(FieldReader reader, Object lhs, Object rhs) + { + return compare(reader.getField().getType(), lhs, rhs); + } + + //TODO: Add support for Struct + public static int compare(ArrowType arrowType, Object lhs, Object rhs) + { + if (lhs == null && rhs == null) { + return 0; + } + else if (lhs == null) { + return 1; + } + else if (rhs == null) { + return -1; + } + + Types.MinorType type = Types.getMinorTypeForArrowType(arrowType); + switch (type) { + case INT: + case UINT4: + return Integer.compare((int) lhs, (int) rhs); + case TINYINT: + case UINT1: + return Byte.compare((byte) lhs, (byte) rhs); + case SMALLINT: + return Short.compare((short) lhs, (short) rhs); + case UINT2: + return Character.compare((char) lhs, (char) rhs); + case BIGINT: + case UINT8: + return Long.compare((long) lhs, (long) rhs); + case FLOAT8: + return Double.compare((double) lhs, (double) rhs); + case FLOAT4: + return Float.compare((float) lhs, (float) rhs); + case VARCHAR: + return lhs.toString().compareTo(rhs.toString()); + case VARBINARY: + return Arrays.compareUnsigned((byte[]) lhs, (byte[]) rhs); + case DECIMAL: + return ((BigDecimal) lhs).compareTo((BigDecimal) rhs); + case BIT: + return Boolean.compare((boolean) lhs, (boolean) rhs); + case DATEMILLI: + return ((LocalDateTime) lhs).compareTo((LocalDateTime) rhs); + case DATEDAY: + return ((Integer) lhs).compareTo((Integer) rhs); + case LIST: + //This could lead to thrashing if used to sort a collection + if (lhs.equals(rhs)) { + return 0; + } + else if (lhs.hashCode() < rhs.hashCode()) { + return -1; + } + else { + return 1; + } + default: + //logging because throwing in a comparator gets swallowed in many unit tests that use equality asserts + logger.warn("compare: Unknown type " + type + " object: " + lhs.getClass()); + throw new IllegalArgumentException("Unknown type " + type + " object: " + lhs.getClass()); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/Block.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/Block.java new file mode 100644 index 0000000000..0377d6866e --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/Block.java @@ -0,0 +1,518 @@ +package com.amazonaws.athena.connector.lambda.data; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+import com.google.common.base.MoreObjects;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.VectorLoader;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.VectorUnloader;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.beans.Transient;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.Map;
+import java.util.Objects;
+
+import static com.amazonaws.athena.connector.lambda.data.BlockUtils.fieldToString;
+import static java.util.Objects.requireNonNull;
+
+/**
+ * This class is used to provide a convenient interface for working with (reading/writing) Apache Arrow Batches. As such,
+ * this class is mostly a holder for an Apache Arrow Schema and the associated VectorSchema (used for read/write).
+ * The class also includes helper functions for easily loading/unloading data in the form of Arrow Batches.
+ *
+ * @note While using this class as a holder to encapsulate nuances of Apache Arrow can simplify your programming model
+ * and make it easier to get started, using setValue(...), setComplexValue(...), and any of the related helpers to
+ * write data to the Apache Arrow structures is less performant than using Apache Arrow's native interfaces. If your use case
+ * and source data can be read in a columnar fashion, you can achieve significantly (50% - 200%) better performance by
+ * avoiding setValue(...) and setComplexValue(...). In our testing, conversion to Apache Arrow was not a significant
+ * bottleneck and instead represented extra latency which could be hidden through parallelism and pipelining. This is why
+ * we opted to offer these convenience methods.
+ * <p>
+ * Remember to always close your Block(s) when you are done with them. If you are using a BlockAllocator it is still
+ * recommended that you close() Blocks explicitly wherever possible vs. depending on BlockAllocator.close() to free
+ * resources. Closing Blocks earlier will reduce peak memory demands and reduce the chance that you exhaust your Apache
+ * Arrow memory pool.
+ */
+public class Block
+        extends SchemaAware
+        implements AutoCloseable
+{
+    private static final Logger logger = LoggerFactory.getLogger(Block.class);
+
+    //Used to identify which BlockAllocator owns the underlying memory resources used in this Block, for debugging purposes.
+    //Not included in equality or hashcode.
+    private final String allocatorId;
+    //The schema of the block
+    private final Schema schema;
+    //The VectorSchemaRoot which can be used to read/write values to/from the underlying Apache Arrow buffers that
+    //form the Arrow Batch of rows.
+    private final VectorSchemaRoot vectorSchema;
+    //Used to constrain writes to the block; by default we use an emptyEvaluator that allows all writes.
+    //Note that we will _NOT_ close this ConstraintEvaluator because we may not own it and the emptyEvaluator
+    //has no resources that could leak.
+    private ConstraintEvaluator constraintEvaluator = ConstraintEvaluator.emptyEvaluator();
+
+    /**
+     * Used by a BlockAllocator to construct a block by setting the key values that a Block 'holds'. Most of the meaningful
+     * construction actually takes place within the BlockAllocator that calls this constructor.
+     *
+     * @param allocatorId Identifier of the BlockAllocator that owns the Block's memory resources.
+     * @param schema The schema of the data that can be read/written to the provided VectorSchema.
+     * @param vectorSchema Used to read/write values from the Apache Arrow memory buffers owned by this object.
+     */
+    protected Block(String allocatorId, Schema schema, VectorSchemaRoot vectorSchema)
+    {
+        requireNonNull(allocatorId, "allocatorId is null");
+        requireNonNull(schema, "schema is null");
+        requireNonNull(vectorSchema, "vectorSchema is null");
+        this.allocatorId = allocatorId;
+        this.schema = schema;
+        this.vectorSchema = vectorSchema;
+    }
+
+    /**
+     * Used to constrain writes to the Block.
+     *
+     * @param constraintEvaluator The ConstraintEvaluator to use to check if we should allow a value to be written to the Block.
+     * @note Setting the ConstraintEvaluator to null disables constraints.
+     */
+    public void constrain(ConstraintEvaluator constraintEvaluator)
+    {
+        this.constraintEvaluator = (constraintEvaluator != null) ? constraintEvaluator : ConstraintEvaluator.emptyEvaluator();
+    }
+
+    /**
+     * Returns the ConstraintEvaluator used by the block.
+     */
+    public ConstraintEvaluator getConstraintEvaluator()
+    {
+        return constraintEvaluator;
+    }
+
+    public String getAllocatorId()
+    {
+        return allocatorId;
+    }
+
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    /**
+     * Writes the provided value to the specified field on the specified row. This method does _not_ update the
+     * row count on the underlying Apache Arrow VectorSchema. You must call setRowCount(...) to ensure the values
+     * you have written are considered 'valid rows' and thus available when you attempt to serialize this Block. This
+     * method relies on BlockUtils' field conversion/coercion logic to convert the provided value into a type that
+     * matches Apache Arrow's supported serialization format.
+     * For more details on coercion, please see BlockUtils.
+     *
+     * @param fieldName The name of the field you wish to write to.
+     * @param row The row number to write to. Note that Apache Arrow Blocks begin with row 0 just like a typical array.
+     * @param value The value you wish to write.
+     * @return True if the value was written to the Block, False if the value was not written due to failing a constraint.
+     * @note This method will throw an NPE if you call it with a non-existent field. You can use offerValue(...)
+     * to ignore non-existent fields. This can be useful when you are writing results and want to avoid checking
+     * if a field has been requested. One such example is when a query projects only a subset of columns and your
+     * underlying data store is not columnar.
+     */
+    public boolean setValue(String fieldName, int row, Object value)
+    {
+        if (constraintEvaluator.apply(fieldName, value)) {
+            BlockUtils.setValue(getFieldVector(fieldName), row, value);
+            return true;
+        }
+        return false;
+    }
+
+    /**
+     * Attempts to write the provided value to the specified field on the specified row. This method does _not_ update the
+     * row count on the underlying Apache Arrow VectorSchema. You must call setRowCount(...) to ensure the values
+     * you have written are considered 'valid rows' and thus available when you attempt to serialize this Block. This
+     * method relies on BlockUtils' field conversion/coercion logic to convert the provided value into a type that
+     * matches Apache Arrow's supported serialization format. For more details on coercion, please see BlockUtils.
+     *
+     * @param fieldName The name of the field you wish to write to.
+     * @param row The row number to write to. Note that Apache Arrow Blocks begin with row 0 just like a typical array.
+     * @param value The value you wish to write.
+     * @return True if the value was written to the Block (even if the field is missing from the Block),
+     * False if the value was not written due to failing a constraint.
+     * @note This method will take no action if the provided fieldName is not a valid field in this Block's Schema.
+     * In such cases the method will still return true.
+     */
+    public boolean offerValue(String fieldName, int row, Object value)
+    {
+        if (constraintEvaluator.apply(fieldName, value)) {
+            FieldVector vector = getFieldVector(fieldName);
+            if (vector != null) {
+                BlockUtils.setValue(vector, row, value);
+            }
+            return true;
+        }
+        return false;
+    }
+
+    /**
+     * Sets the provided value for the given field name and row. Like setValue(...), this method does _not_ update
+     * the row count on the underlying Apache Arrow VectorSchema.
+     *
+     * @param fieldName The name of the field you wish to write to.
+     * @param row The row number to write to. Note that Apache Arrow Blocks begin with row 0 just like a typical array.
+     * @param fieldResolver The FieldResolver used to extract child values from the provided, possibly nested, value.
+     * @param value The value you wish to write.
+     * @return True if the value was written to the Block.
+     * @note This method will throw an NPE if you call it with a non-existent field. You can use offerComplexValue(...)
+     * to ignore non-existent fields. This can be useful when you are writing results and want to avoid checking
+     * if a field has been requested. One such example is when a query projects only a subset of columns and your
+     * underlying data store is not columnar.
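+     * <p>
+     * A sketch of writing a struct-typed field; the "address" field and POJO are illustrative assumptions, and a
+     * FieldResolver (FieldResolver.DEFAULT here) is assumed to map the POJO's fields to the struct's children:
+     * <pre>
+     *     block.setComplexValue("address", rowNum, FieldResolver.DEFAULT, addressPojo);
+     * </pre>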
+     */
+    public boolean setComplexValue(String fieldName, int row, FieldResolver fieldResolver, Object value)
+    {
+        FieldVector vector = getFieldVector(fieldName);
+        BlockUtils.setComplexValue(vector, row, fieldResolver, value);
+        return true;
+    }
+
+    /**
+     * Attempts to set the provided value for the given field name and row. If the Block's schema does not
+     * contain such a field, this method does nothing.
+     *
+     * @param fieldName The name of the field you wish to write to.
+     * @param row The row number to write to. Note that Apache Arrow Blocks begin with row 0 just like a typical array.
+     * @param fieldResolver The FieldResolver used to extract child values from the provided, possibly nested, value.
+     * @param value The value you wish to write.
+     * @return True, even if the field is missing from the Block.
+     * @note This method will take no action if the provided fieldName is not a valid field in this Block's Schema.
+     * In such cases the method will still return true.
+     */
+    public boolean offerComplexValue(String fieldName, int row, FieldResolver fieldResolver, Object value)
+    {
+        FieldVector vector = getFieldVector(fieldName);
+        if (vector != null) {
+            BlockUtils.setComplexValue(vector, row, fieldResolver, value);
+        }
+        return true;
+    }
+
+    /**
+     * Provides access to the Apache Arrow Vector Schema when direct access to Apache Arrow is required.
+     *
+     * @return The Apache Arrow Vector Schema.
+     */
+    protected VectorSchemaRoot getVectorSchema()
+    {
+        return vectorSchema;
+    }
+
+    /**
+     * Sets the valid row count on the underlying Apache Arrow Vector Schema.
+     *
+     * @param rowCount The row count to set.
+     * @note If you do not set this value then the block may not serialize correctly (too few rows) or rows may
+     * not be readable.
+     */
+    public void setRowCount(int rowCount)
+    {
+        vectorSchema.setRowCount(rowCount);
+    }
+
+    /**
+     * Returns the current row count as set by calling setRowCount(...).
+     *
+     * @return The current valid row count for the Apache Arrow Vector Schema.
+     */
+    public int getRowCount()
+    {
+        return vectorSchema.getRowCount();
+    }
+
+    /**
+     * Provides access to the Apache Arrow FieldReader for the given field name.
+     *
+     * @param fieldName The name of the field to retrieve.
+     * @return The FieldReader that can be used to read values from the Block for the specified field.
+     * @note This method throws an NPE if the requested field name is not a valid field name in the block's Schema.
+     * Additionally, for accessing nested fields you must request the parent field and then call reader(String fieldName)
+     * on the parent FieldReader. You can find some examples of how to use Apache Arrow for complex/nested types in
+     * the UnitTest for this class or BlockUtils.java.
+     */
+    public FieldReader getFieldReader(String fieldName)
+    {
+        return vectorSchema.getVector(fieldName).getReader();
+    }
+
+    /**
+     * Provides access to the Apache Arrow FieldVector which can be used to write values for the given field name.
+     *
+     * @param fieldName The name of the field to retrieve.
+     * @return The FieldVector that can be used to read values from the Block for the specified field, or NULL if the field
+     * is not in this Block's Schema.
+     * @note Additionally, for accessing nested fields you must request the parent field and then call the appropriate
+     * method (based on type) to get the child field's FieldVector. You can find some examples of how to use Apache Arrow
+     * for complex/nested types in the UnitTest for this class or BlockUtils.java.
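+     * <p>
+     * For example, a sketch of accessing a struct child's vector (the field names are illustrative assumptions):
+     * <pre>
+     *     StructVector address = (StructVector) block.getFieldVector("address");
+     *     FieldVector city = address.getChild("city");
+     * </pre>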
+     */
+    public FieldVector getFieldVector(String fieldName)
+    {
+        return vectorSchema.getVector(fieldName);
+    }
+
+    /**
+     * Provides access to the list of all top-level FieldReaders in this Block.
+     *
+     * @return List containing the top-level FieldReaders for this block.
+     */
+    public List<FieldReader> getFieldReaders()
+    {
+        List<FieldReader> readers = new ArrayList<>();
+        for (FieldVector next : vectorSchema.getFieldVectors()) {
+            readers.add(next.getReader());
+        }
+        return readers;
+    }
+
+    /**
+     * Calculates the current used size in 'bytes' for all Apache Arrow Buffers that comprise the row data for
+     * this Block.
+     *
+     * @return The used bytes of row data in this Block.
+     * @note This value is likely smaller than the actual memory held by this Block as it only counts the 'used' portion
+     * of the pre-allocated Apache Arrow Buffers. It is generally safer to think about this value as the size of the Block
+     * if you serialize it, and thus it is useful for controlling the size of the Block responses sent to Athena.
+     */
+    @Transient
+    public long getSize()
+    {
+        long size = 0;
+        for (FieldVector next : vectorSchema.getFieldVectors()) {
+            size += next.getBufferSize();
+        }
+        return size;
+    }
+
+    /**
+     * Provides access to the list of all top-level FieldVectors in this Block.
+     *
+     * @return List containing the top-level FieldVectors for this block.
+     */
+    public List<FieldVector> getFieldVectors()
+    {
+        return vectorSchema.getFieldVectors();
+    }
+
+    /**
+     * Used to unload the Apache Arrow data in this Block in preparation for Serialization.
+     *
+     * @return An ArrowRecordBatch containing all row data in this Block for use in serializing the Block.
+     */
+    public ArrowRecordBatch getRecordBatch()
+    {
+        VectorUnloader vectorUnloader = new VectorUnloader(vectorSchema);
+        return vectorUnloader.getRecordBatch();
+    }
+
+    /**
+     * Used to load Apache Arrow data into this Block after it has been deserialized.
+     *
+     * @param batch An ArrowRecordBatch containing all row data you'd like to load into this Block.
+     * @note The batch is closed after being loaded to avoid memory leaks or data corruption since the buffers
+     * associated with the batch are now owned by this Block. Closing the batch essentially decrements the reference
+     * count in the Arrow Allocator.
+     */
+    public void loadRecordBatch(ArrowRecordBatch batch)
+    {
+        VectorLoader vectorLoader = new VectorLoader(vectorSchema);
+        vectorLoader.load(batch);
+        batch.close();
+    }
+
+    /**
+     * Frees all Apache Arrow Buffers and resources associated with this block.
+     *
+     * @throws Exception
+     */
+    @Override
+    public void close()
+            throws Exception
+    {
+        this.vectorSchema.close();
+    }
+
+    @Override
+    protected Schema internalGetSchema()
+    {
+        return schema;
+    }
+
+    /**
+     * Provides some basic equality checking for a Block. This method has some drawbacks in that it isn't a deep equality
+     * and will not work for some large complex blocks. At present this method is useful for testing purposes but may be refactored
+     * in a future release.
+     */
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        Block that = (Block) o;
+
+        if (this.schema.getFields().size() != that.schema.getFields().size()) {
+            return false;
+        }
+
+        if (this.vectorSchema.getRowCount() != that.vectorSchema.getRowCount()) {
+            return false;
+        }
+
+        try {
+            for (Field next : this.schema.getFields()) {
+                FieldReader thisReader = vectorSchema.getVector(next.getName()).getReader();
+                FieldReader thatReader = that.vectorSchema.getVector(next.getName()).getReader();
+                for (int i = 0; i < this.vectorSchema.getRowCount(); i++) {
+                    thisReader.setPosition(i);
+                    thatReader.setPosition(i);
+                    if (ArrowTypeComparator.compare(thisReader, thisReader.readObject(), thatReader.readObject()) != 0) {
+                        return false;
+                    }
+                }
+            }
+        }
+        catch (IllegalArgumentException ex) {
+            //can happen when comparator doesn't support the type
+            throw ex;
+        }
+        catch (RuntimeException ex) {
+            //There are many differences which can cause an exception, easier to handle them this way
+            logger.warn("equals: ", ex);
+            return false;
+        }
+
+        return true;
+    }
+
+    /**
+     * Provides some basic equality checking for a Block, ignoring ordering. This method has some drawbacks in that it
+     * isn't a deep equality and will not work for some large complex blocks. At present this method is useful for testing
+     * purposes but may be refactored in a future release.
+     */
+    public boolean equalsAsSet(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        Block that = (Block) o;
+
+        if (this.schema.getFields().size() != that.schema.getFields().size()) {
+            return false;
+        }
+
+        if (this.vectorSchema.getRowCount() != that.vectorSchema.getRowCount()) {
+            return false;
+        }
+
+        try {
+            for (Field next : this.schema.getFields()) {
+                FieldReader thisReader = vectorSchema.getVector(next.getName()).getReader();
+                FieldReader thatReader = that.vectorSchema.getVector(next.getName()).getReader();
+                for (int i = 0; i < this.vectorSchema.getRowCount(); i++) {
+                    thisReader.setPosition(i);
+                    Types.MinorType type = thisReader.getMinorType();
+                    Object val = thisReader.readObject();
+                    boolean matched = false;
+                    for (int j = 0; j < that.vectorSchema.getRowCount(); j++) {
+                        thatReader.setPosition(j);
+                        if (ArrowTypeComparator.compare(thatReader, val, thatReader.readObject()) == 0) {
+                            matched = true;
+                        }
+                    }
+                    if (!matched) {
+                        return false;
+                    }
+                }
+            }
+        }
+        catch (RuntimeException ex) {
+            //There are many differences which can cause an exception, easier to handle them this way
+            return false;
+        }
+
+        return true;
+    }
+
+    /**
+     * Provides some basic hashcode capabilities for the Block. This method has some drawbacks in that it is difficult
+     * to maintain as we add new types, and it becomes error prone and slow if misused. This challenge is compounded by
+     * the fact that the right and wrong ways to use it are not easy to convey.
+     */
+    @Override
+    public int hashCode()
+    {
+        int hashcode = 0;
+        for (Map.Entry<String, String> next : this.schema.getCustomMetadata().entrySet()) {
+            hashcode = hashcode + Objects.hashCode(next);
+        }
+
+        for (Field next : this.schema.getFields()) {
+            FieldReader thisReader = vectorSchema.getVector(next.getName()).getReader();
+            for (int i = 0; i < this.vectorSchema.getRowCount(); i++) {
+                thisReader.setPosition(i);
+                hashcode = 31 * hashcode + Objects.hashCode(thisReader.readObject());
+            }
+        }
+        return hashcode;
+    }
+
+    @Override
+    public String toString()
+    {
+        MoreObjects.ToStringHelper helper = MoreObjects.toStringHelper(this);
+        helper.add("rows", getRowCount());
+
+        int rowsToPrint = this.vectorSchema.getRowCount() > 10 ? 10 : this.vectorSchema.getRowCount();
+        for (Field next : this.schema.getFields()) {
+            FieldReader thisReader = vectorSchema.getVector(next.getName()).getReader();
+            List<String> values = new ArrayList<>();
+            for (int i = 0; i < rowsToPrint; i++) {
+                thisReader.setPosition(i);
+                values.add(fieldToString(thisReader));
+            }
+            helper.add(next.getName(), values);
+        }
+
+        return helper.toString();
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocator.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocator.java
new file mode 100644
index 0000000000..3d9ce27e68
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocator.java
@@ -0,0 +1,109 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import io.netty.buffer.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+/**
+ * Defines the interface that should be implemented by all reference counting Apache Arrow resource allocators
+ * that are provided by this SDK. You should use a BlockAllocator over an Apache Arrow BufferAllocator if the lifecycle
+ * of your Apache Arrow resources is not fully contained in a narrow code path. In practice we've found that ensuring
+ * proper lifecycle for Apache Arrow resources led us to change the structure of our code in ways that made it less
+ * maintainable than if we had a mechanism to control the lifecycle of Apache Arrow resources that cross-cut our
+ * request lifecycle.
+ */
+public interface BlockAllocator
+        extends AutoCloseable
+{
+    /**
+     * Creates an empty Apache Arrow Block with the provided Schema.
+     *
+     * @param schema The schema of the Apache Arrow Block.
+     * @return The resulting Block.
+     * @note Once created the Block is also registered with this BlockAllocator such that closing this BlockAllocator
+     * also closes this Block, freeing its Apache Arrow resources.
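+     * <p>
+     * A typical usage sketch (schema construction elided):
+     * <pre>
+     *     try (Block block = allocator.createBlock(schema)) {
+     *         //write rows into the block, then serialize it or hand it off
+     *     }
+     * </pre>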
+     */
+    Block createBlock(Schema schema);
+
+    /**
+     * Creates an empty Apache Arrow Buffer of the requested size. This is useful when working with certain Apache Arrow
+     * types directly.
+     *
+     * @param size The number of bytes to reserve for the requested buffer.
+     * @return The resulting Apache Arrow Buffer.
+     * @note Once created the buffer is also registered with this BlockAllocator such that closing this BlockAllocator
+     * also closes this buffer, freeing its Apache Arrow resources.
+     */
+    ArrowBuf createBuffer(int size);
+
+    /**
+     * Allows for a leak-free way to create Apache Arrow Batches. At first glance this method's signature may seem awkward
+     * when compared to createBuffer(...) or createBlock(...), but ArrowRecordBatches are typically created as part of serialization
+     * and as such are prone to leakage when you serialize or deserialize an invalid Block. With this approach the
+     * BlockAllocator is able to capture any exceptions from your BatchGenerator and perform the necessary clean up without
+     * your code having to implement the boilerplate for handling those edge cases.
+     *
+     * @param generator The generator which is expected to create an ArrowRecordBatch.
+     * @return The resulting Apache Arrow Batch.
+     * @note Once created the batch is also registered with this BlockAllocator such that closing this BlockAllocator
+     * also closes this batch, freeing its Apache Arrow resources.
+     */
+    ArrowRecordBatch registerBatch(BatchGenerator generator);
+
+    /**
+     * Provides access to the current memory pool usage on the underlying Apache Arrow BufferAllocator.
+     *
+     * @return The number of bytes that have been used (e.g. assigned to an Apache Arrow resource like a block, batch, or buffer).
+     */
+    long getUsage();
+
+    /**
+     * Closes all Apache Arrow resources tracked by this BlockAllocator, freeing their memory.
+     */
+    void close();
+
+    /**
+     * Provides access to the current state of this BlockAllocator.
+     *
+     * @return True if close has been called, False otherwise.
+     */
+    boolean isClosed();
+
+    /**
+     * Used to generate a batch in a leak-free way, using the BlockAllocator to handle
+     * the boilerplate aspects of error detection and rollback.
+     */
+    interface BatchGenerator
+    {
+        /**
+         * When called by the BlockAllocator you can generate your batch.
+         *
+         * @param allocator A reference to the BlockAllocator you can use to create your batch.
+         * @return The resulting ArrowRecordBatch.
+         * @throws Exception
+         */
+        ArrowRecordBatch generate(BufferAllocator allocator)
+                throws Exception;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocatorImpl.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocatorImpl.java
new file mode 100644
index 0000000000..5bcece0b93
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocatorImpl.java
@@ -0,0 +1,287 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import io.netty.buffer.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.memory.RootAllocator;
+import org.apache.arrow.util.VisibleForTesting;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.VectorSchemaRoot;
+import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+import java.util.UUID;
+import java.util.concurrent.atomic.AtomicBoolean;
+
+/**
+ * Basic BlockAllocator which uses reference counting to perform garbage collection of Apache Arrow resources.
+ *
+ * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+ */
+public class BlockAllocatorImpl
+        implements BlockAllocator
+{
+    private static final Logger logger = LoggerFactory.getLogger(BlockAllocatorImpl.class);
+
+    //Identifier for this block allocator, mostly used by BlockAllocatorRegistry.
+    private final String id;
+    //The Apache Arrow Buffer Allocator that we are wrapping with reference counting and clean up.
+    private final BufferAllocator rootAllocator;
+    //The Blocks that have been allocated via this BlockAllocator
+    private final List<Block> blocks = new ArrayList<>();
+    //The record batches that have been allocated via this BlockAllocator
+    private final List<ArrowRecordBatch> recordBatches = new ArrayList<>();
+    //The arrow buffers that have been allocated via this BlockAllocator
+    private final List<ArrowBuf> arrowBufs = new ArrayList<>();
+    //Flag indicating if this allocator has been closed.
+    private final AtomicBoolean isClosed = new AtomicBoolean(false);
+
+    /**
+     * Default constructor.
+     */
+    public BlockAllocatorImpl()
+    {
+        this(UUID.randomUUID().toString(), Integer.MAX_VALUE);
+    }
+
+    /**
+     * Constructs a BlockAllocatorImpl with the given id.
+     *
+     * @param id The id used to identify this BlockAllocatorImpl
+     */
+    public BlockAllocatorImpl(String id)
+    {
+        this(id, Integer.MAX_VALUE);
+    }
+
+    /**
+     * Constructs a BlockAllocatorImpl with the given id and memory byte limit.
+     *
+     * @param id The id used to identify this BlockAllocatorImpl
+     * @param memoryLimit The max memory, in bytes, that this BlockAllocator is allowed to use.
+     */
+    public BlockAllocatorImpl(String id, long memoryLimit)
+    {
+        this.rootAllocator = new RootAllocator(memoryLimit);
+        this.id = id;
+    }
+
+    /**
+     * Creates a block and registers it for later clean up if the block isn't explicitly closed by the caller.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    public synchronized Block createBlock(Schema schema)
+    {
+        Block block = null;
+        VectorSchemaRoot vectorSchemaRoot = null;
+        List<FieldVector> vectors = new ArrayList<>();
+        try {
+            for (Field next : schema.getFields()) {
+                vectors.add(next.createVector(rootAllocator));
+            }
+            vectorSchemaRoot = new VectorSchemaRoot(schema, vectors, 0);
+            block = new Block(id, schema, vectorSchemaRoot);
+            blocks.add(block);
+        }
+        catch (Exception ex) {
+            if (block != null) {
+                try {
+                    block.close();
+                }
+                catch (Exception ex2) {
+                    logger.error("createBlock: error while closing block during previous error.", ex2);
+                }
+            }
+
+            if (vectorSchemaRoot != null) {
+                vectorSchemaRoot.close();
+            }
+
+            for (FieldVector next : vectors) {
+                next.close();
+            }
+
+            throw ex;
+        }
+        return block;
+    }
+
+    /**
+     * Creates an ArrowBuf and registers it for later clean up if the ArrowBuf isn't explicitly closed by the caller.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    public ArrowBuf createBuffer(int size)
+    {
+        ArrowBuf buffer = null;
+        try {
+            buffer = rootAllocator.buffer(size);
+            arrowBufs.add(buffer);
+            return buffer;
+        }
+        catch (Exception ex) {
+            if (buffer != null) {
+                buffer.close();
+            }
+            throw ex;
+        }
+    }
+
+    /**
+     * Creates an ArrowRecordBatch and registers it for later clean up if the ArrowRecordBatch isn't explicitly closed
+     * by the caller.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    public synchronized ArrowRecordBatch registerBatch(BatchGenerator generator)
+    {
+        try {
+            logger.debug("registerBatch: {}", recordBatches.size());
+            ArrowRecordBatch batch = generator.generate(getRawAllocator());
+            recordBatches.add(batch);
+            return batch;
+        }
+        catch (org.apache.arrow.memory.OutOfMemoryException ex) {
+            //Must not wrap or we may break resource management logic elsewhere
+            throw ex;
+        }
+        catch (RuntimeException ex) {
+            throw ex;
+        }
+        catch (Exception ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Provides access to the underlying Apache Arrow Allocator.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    protected synchronized BufferAllocator getRawAllocator()
+    {
+        logger.debug("getRawAllocator: enter");
+        return rootAllocator;
+    }
+
+    /**
+     * Attempts to close all Blocks allocated by this BlockAllocator.
+     */
+    @VisibleForTesting
+    protected synchronized void closeBlocks()
+    {
+        logger.debug("closeBlocks: {}", blocks.size());
+        for (Block next : blocks) {
+            try {
+                next.close();
+            }
+            catch (Exception ex) {
+                logger.warn("closeBlocks: Error closing block", ex);
+            }
+        }
+        blocks.clear();
+    }
+
+    /**
+     * Attempts to close all buffers allocated by this BlockAllocator.
+     */
+    @VisibleForTesting
+    protected synchronized void closeBuffers()
+    {
+        logger.debug("closeBuffers: {}", arrowBufs.size());
+        for (ArrowBuf next : arrowBufs) {
+            try {
+                next.close();
+            }
+            catch (Exception ex) {
+                logger.warn("closeBuffers: Error closing buffer", ex);
+            }
+        }
+        arrowBufs.clear();
+    }
+
+    /**
+     * Attempts to close all batches allocated by this BlockAllocator.
+     */
+    @VisibleForTesting
+    protected synchronized void closeBatches()
+    {
+        logger.debug("closeBatches: {}", recordBatches.size());
+        for (ArrowRecordBatch next : recordBatches) {
+            try {
+                next.close();
+            }
+            catch (Exception ex) {
+                logger.warn("closeBatches: Error closing batch", ex);
+            }
+        }
+        recordBatches.clear();
+    }
+
+    /**
+     * Returns the number of bytes in the Apache Arrow Pool that are used.
+     * This is not the same as the actual reserved memory usage you may be familiar with from your operating system.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    public long getUsage()
+    {
+        return rootAllocator.getAllocatedMemory();
+    }
+
+    /**
+     * Closes all Apache Arrow Resources allocated via this BlockAllocator and then attempts to
+     * close the underlying Apache Arrow Allocator, which would actually free memory. This operation may
+     * fail if the underlying Apache Arrow Allocator was used to allocate resources without registering
+     * them to this BlockAllocator and those resources were not freed prior to calling close.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    @Override
+    public synchronized void close()
+    {
+        if (!isClosed.get()) {
+            isClosed.set(true);
+            closeBatches();
+            closeBlocks();
+            closeBuffers();
+            rootAllocator.close();
+        }
+    }
+
+    /**
+     * Indicates if this BlockAllocator has been closed.
+     *
+     * @see com.amazonaws.athena.connector.lambda.data.BlockAllocator
+     */
+    @Override
+    public boolean isClosed()
+    {
+        return isClosed.get();
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocatorRegistry.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocatorRegistry.java
new file mode 100644
index 0000000000..7eede65e9c
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockAllocatorRegistry.java
@@ -0,0 +1,41 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+/**
+ * Used to track BlockAllocators in transactional environments where you want tighter control over
+ * how much memory a particular transaction uses. By using the same BlockAllocator for all resources
+ * in a given transaction you can limit the total memory used by that transaction. This also proves
+ * to be a useful mechanism for injecting BlockAllocators in contexts which are difficult to access with
+ * more traditional dependency injection mechanisms. One such example is an ObjectMapper that is deserializing
+ * an incoming request.
+ */
+public interface BlockAllocatorRegistry
+{
+    /**
+     * Gets or creates a new Block Allocator for the given context (id).
+     *
+     * @param id The id of the context for which you'd like a BlockAllocator.
+     * @return The BlockAllocator associated with that id, or a new BlockAllocator for that id which will then
+     * be vended for any future calls for that id.
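+     * <p>
+     * For example, scoping all allocations for a query to a single allocator (using the queryId as the
+     * context id is an assumption of this sketch):
+     * <pre>
+     *     BlockAllocator allocator = registry.getOrCreateAllocator(queryId);
+     * </pre>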
+     */
+    BlockAllocator getOrCreateAllocator(String id);
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockSpiller.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockSpiller.java
new file mode 100644
index 0000000000..535a867a56
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockSpiller.java
@@ -0,0 +1,64 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation;
+
+import java.util.List;
+
+/**
+ * Used to write blocks which may require chunking and optionally spilling via a secondary communication channel.
+ */
+public interface BlockSpiller
+        extends BlockWriter
+{
+    /**
+     * Indicates if any part of the response written thus far has been spilled.
+     *
+     * @return True if at least 1 block has spilled, False otherwise.
+     */
+    boolean spilled();
+
+    /**
+     * Provides access to the single buffered block in the event that spilled() is false.
+     *
+     * @return The single buffered Block.
+     */
+    Block getBlock();
+
+    /**
+     * Provides access to the manifest of SpillLocation(s) if spilled is true.
+     *
+     * @return The list of SpillLocations that were written.
+     */
+    List<SpillLocation> getSpillLocations();
+
+    /**
+     * Frees any resources associated with the BlockSpiller.
+     */
+    void close();
+
+    /**
+     * Provides access to the ConstraintEvaluator that will be applied to the generated Blocks.
+     */
+    ConstraintEvaluator getConstraintEvaluator();
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockUtils.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockUtils.java
new file mode 100644
index 0000000000..6dd9b16e4a
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockUtils.java
@@ -0,0 +1,1108 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import io.netty.buffer.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.util.VisibleForTesting;
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateDayVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.UInt1Vector;
+import org.apache.arrow.vector.UInt2Vector;
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.UInt8Vector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+import org.apache.arrow.vector.complex.ListVector;
+import org.apache.arrow.vector.complex.StructVector;
+import org.apache.arrow.vector.complex.impl.UnionListWriter;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.complex.writer.BaseWriter.StructWriter;
+import org.apache.arrow.vector.complex.writer.FieldWriter;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.apache.arrow.vector.util.Text;
+import org.apache.commons.codec.Charsets;
+
+import java.math.BigDecimal;
+import java.math.RoundingMode;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.time.ZoneId;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.Date;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+
+/**
+ * This utility class abstracts many facets of reading and writing values into Apache Arrow's FieldReader and FieldVector
+ * objects.
+ *
+ * @note This class encourages a row-wise approach to writing results. These interfaces are often viewed as simpler than
+ * their would-be columnar equivalents. Even though many of the systems that we've integrated with using this SDK do not
+ * themselves support columnar access patterns, there is value in offering a variant of these mechanisms that provides
+ * the skeleton for columnar writing/reading of results.
+ *

+ * The current SDK version takes the approach that experts can drop into 'native' Apache Arrow mode and simply not use
+ * this abstraction. This approach of making common things easy while still enabling access to a 'power user' mode is
+ * one we'd like to stick with, but we'd also like to make it easier for customers who want a more columnar
+ * experience, and the performance that comes with it, to achieve that.
+ *

+ * In general, the abstractions provided by this utility class also come with a performance hit when compared with native,
+ * columnar, Apache Arrow access patterns. The performance overhead primarily results from Object overhead related to boxing
+ * and/or type conversion. The second source of overhead is the constant lookup and branching on field types, vectors, readers,
+ * etc. Some of this second category of overhead can be mitigated by being mindful of how you use this class, but a more
+ * ideal solution would be to offer an interface that steers you in a better direction.
+ *
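+ * For illustration, a minimal row-wise write using this class (a hypothetical sketch; the allocator, column
+ * name, and value are assumptions):
+ * <pre>{@code
+ * SchemaBuilder schemaBuilder = new SchemaBuilder();
+ * schemaBuilder.addField("year", Types.MinorType.INT.getType());
+ * Block block = allocator.createBlock(schemaBuilder.build());
+ * BlockUtils.setValue(block.getFieldVector("year"), 0, 2019);
+ * block.setRowCount(1);
+ * }</pre>
+ *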

+ * An issue has been opened to track the creation of a columnar variant of this utility:
+ * https://github.com/awslabs/aws-athena-query-federation/issues/1
+ */
+public class BlockUtils
+{
+    public static final ZoneId UTC_ZONE_ID = ZoneId.of("UTC");
+
+    /**
+     * Creates a new Block with a single column and populated with the provided values.
+     *
+     * @param allocator The BlockAllocator to use when creating the Block.
+     * @param columnName The name of the single column in the Block's Schema.
+     * @param type The Apache Arrow Type of the column.
+     * @param values The values to write to the new Block. Each value will be its own row.
+     * @return The newly created Block with a single column Schema and populated with the provided values.
+     */
+    public static Block newBlock(BlockAllocator allocator, String columnName, ArrowType type, Object... values)
+    {
+        return newBlock(allocator, columnName, type, Arrays.asList(values));
+    }
+
+    /**
+     * Creates a new Block with a single column and populated with the provided values.
+     *
+     * @param allocator The BlockAllocator to use when creating the Block.
+     * @param columnName The name of the single column in the Block's Schema.
+     * @param type The Apache Arrow Type of the column.
+     * @param values The values to write to the new Block. Each value will be its own row.
+     * @return The newly created Block with a single column Schema and populated with the provided values.
+     */
+    public static Block newBlock(BlockAllocator allocator, String columnName, ArrowType type, Collection<Object> values)
+    {
+        SchemaBuilder schemaBuilder = new SchemaBuilder();
+        schemaBuilder.addField(columnName, type);
+        Schema schema = schemaBuilder.build();
+        Block block = allocator.createBlock(schema);
+        int count = 0;
+        for (Object next : values) {
+            try {
+                setValue(block.getFieldVector(columnName), count++, next);
+            }
+            catch (Exception ex) {
+                throw new RuntimeException("Error for " + type + " " + columnName + " " + next, ex);
+            }
+        }
+        block.setRowCount(count);
+        return block;
+    }
+
+    /**
+     * Creates a new, empty, Block with a single column.
+     *
+     * @param allocator The BlockAllocator to use when creating the Block.
+     * @param columnName The name of the single column in the Block's Schema.
+     * @param type The Apache Arrow Type of the column.
+     * @return The newly created, empty, Block with a single column Schema.
+     */
+    public static Block newEmptyBlock(BlockAllocator allocator, String columnName, ArrowType type)
+    {
+        SchemaBuilder schemaBuilder = new SchemaBuilder();
+        schemaBuilder.addField(columnName, type);
+        Schema schema = schemaBuilder.build();
+        return allocator.createBlock(schema);
+    }
+
+    /**
+     * Used to set complex values (Struct, List, etc...) on the provided FieldVector.
+     *
+     * @param vector The FieldVector into which we should write the provided value.
+     * @param pos The row number that the value should be written to.
+     * @param resolver The FieldResolver that can be used to map your value to the complex type (mostly for Structs, Maps).
+     * @param value The value to write.
+     * @note This method incurs more Object overhead (heap churn) than using Arrow's native interface. Users of this Utility
+     * should weigh their performance needs vs. the readability / ease of use.
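+     * <p>A hypothetical sketch of writing one Struct row from a Map (the Block and field names are assumptions):
+     * <pre>{@code
+     * Map<String, Object> row = new HashMap<>();
+     * row.put("name", "athena");
+     * row.put("age", 7);
+     * BlockUtils.setComplexValue(block.getFieldVector("person"), 0, FieldResolver.DEFAULT, row);
+     * }</pre>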
+     */
+    public static void setComplexValue(FieldVector vector, int pos, FieldResolver resolver, Object value)
+    {
+        if (vector instanceof ListVector) {
+            if (value != null) {
+                UnionListWriter writer = ((ListVector) vector).getWriter();
+                writer.setPosition(pos);
+                writeList(vector.getAllocator(),
+                        writer,
+                        vector.getField(),
+                        pos,
+                        ((List) value).iterator(),
+                        resolver);
+                ((ListVector) vector).setNotNull(pos);
+            }
+        }
+        else if (vector instanceof StructVector) {
+            StructWriter writer = ((StructVector) vector).getWriter();
+            writer.setPosition(pos);
+            writeStruct(vector.getAllocator(),
+                    writer,
+                    vector.getField(),
+                    pos,
+                    value,
+                    resolver);
+        }
+        else {
+            throw new RuntimeException("Unsupported 'Complex' vector " +
+                    vector.getClass().getSimpleName() + " for field " + vector.getField().getName());
+        }
+    }
+
+    /**
+     * Used to set values (Int, BigInt, Bit, etc...) on the provided FieldVector.
+     *
+     * @param vector The FieldVector into which we should write the provided value.
+     * @param pos The row number that the value should be written to.
+     * @param value The value to write.
+     * @note This method incurs more Object overhead (heap churn) than using Arrow's native interface. Users of this Utility
+     * should weigh their performance needs vs. the readability / ease of use.
+     */
+    public static void setValue(FieldVector vector, int pos, Object value)
+    {
+        try {
+            if (value == null) {
+                setNullValue(vector, pos);
+                return;
+            }
+
+            //TODO: add all types
+            switch (vector.getMinorType()) {
+                case DATEMILLI:
+                    if (value instanceof Date) {
+                        ((DateMilliVector) vector).setSafe(pos, ((Date) value).getTime());
+                    }
+                    else if (value instanceof LocalDateTime) {
+                        ((DateMilliVector) vector).setSafe(
+                                pos,
+                                ((LocalDateTime) value).atZone(UTC_ZONE_ID).toInstant().toEpochMilli());
+                    }
+                    else {
+                        ((DateMilliVector) vector).setSafe(pos, (long) value);
+                    }
+                    break;
+                case DATEDAY:
+                    if (value instanceof Date) {
+                        org.joda.time.Days days = org.joda.time.Days.daysBetween(EPOCH,
+                                new org.joda.time.DateTime(((Date) value).getTime()));
+                        ((DateDayVector) vector).setSafe(pos, days.getDays());
+                    }
+                    else if (value instanceof LocalDate) {
+                        int days = (int) ((LocalDate) value).toEpochDay();
+                        ((DateDayVector) vector).setSafe(pos, days);
+                    }
+                    else if (value instanceof Long) {
+                        ((DateDayVector) vector).setSafe(pos, ((Long) value).intValue());
+                    }
+                    else {
+                        ((DateDayVector) vector).setSafe(pos, (int) value);
+                    }
+                    break;
+                case FLOAT8:
+                    ((Float8Vector) vector).setSafe(pos, (double) value);
+                    break;
+                case FLOAT4:
+                    ((Float4Vector) vector).setSafe(pos, (float) value);
+                    break;
+                case INT:
+                    if (value != null && value instanceof Long) {
+                        //This may seem odd at first but many frameworks (like Presto) use long as the preferred
+                        //native java type for representing integers. We do this to keep type conversions simple.
+ ((IntVector) vector).setSafe(pos, ((Long) value).intValue()); + } + else { + ((IntVector) vector).setSafe(pos, (int) value); + } + break; + case TINYINT: + if (value instanceof Byte) { + ((TinyIntVector) vector).setSafe(pos, (byte) value); + } + else { + ((TinyIntVector) vector).setSafe(pos, (int) value); + } + break; + case SMALLINT: + if (value instanceof Short) { + ((SmallIntVector) vector).setSafe(pos, (short) value); + } + else { + ((SmallIntVector) vector).setSafe(pos, (int) value); + } + break; + case UINT1: + ((UInt1Vector) vector).setSafe(pos, (int) value); + break; + case UINT2: + ((UInt2Vector) vector).setSafe(pos, (int) value); + break; + case UINT4: + ((UInt4Vector) vector).setSafe(pos, (int) value); + break; + case UINT8: + ((UInt8Vector) vector).setSafe(pos, (int) value); + break; + case BIGINT: + ((BigIntVector) vector).setSafe(pos, (long) value); + break; + case VARBINARY: + ((VarBinaryVector) vector).setSafe(pos, (byte[]) value); + break; + case DECIMAL: + DecimalVector dVector = ((DecimalVector) vector); + if (value instanceof Double) { + BigDecimal bdVal = new BigDecimal((double) value); + bdVal = bdVal.setScale(dVector.getScale(), RoundingMode.HALF_UP); + dVector.setSafe(pos, bdVal); + } + else { + BigDecimal scaledValue = ((BigDecimal) value).setScale(dVector.getScale(), RoundingMode.HALF_UP); + ((DecimalVector) vector).setSafe(pos, scaledValue); + } + break; + case VARCHAR: + if (value instanceof String) { + ((VarCharVector) vector).setSafe(pos, ((String) value).getBytes(Charsets.UTF_8)); + } + else { + ((VarCharVector) vector).setSafe(pos, (Text) value); + } + break; + case BIT: + if (value instanceof Integer && (int) value > 0) { + ((BitVector) vector).setSafe(pos, 1); + } + else if (value instanceof Boolean && (boolean) value) { + ((BitVector) vector).setSafe(pos, 1); + } + else { + ((BitVector) vector).setSafe(pos, 0); + } + break; + default: + throw new IllegalArgumentException("Unknown type " + vector.getMinorType()); + } + } + catch (RuntimeException ex) { + String fieldName = (vector != null) ? vector.getField().getName() : "null_vector"; + throw new RuntimeException("Unable to set value for field " + fieldName + " using value " + value, ex); + } + } + + /** + * Used to convert a specific row in the provided Block to a human readable string. This is useful for diagnostic + * logging. + * + * @param block The Block to read the row from. + * @param row The row number to read. + * @return The human readable String representation of the requested row. + */ + public static String rowToString(Block block, int row) + { + if (row > block.getRowCount()) { + throw new IllegalArgumentException(row + " exceeds available rows " + block.getRowCount()); + } + + StringBuilder sb = new StringBuilder(); + for (FieldReader nextReader : block.getFieldReaders()) { + try { + nextReader.setPosition(row); + if (sb.length() > 0) { + sb.append(", "); + } + sb.append("["); + sb.append(nextReader.getField().getName()); + sb.append(" : "); + sb.append(fieldToString(nextReader)); + sb.append("]"); + } + catch (RuntimeException ex) { + throw new RuntimeException("Error processing field " + nextReader.getField().getName(), ex); + } + } + + return sb.toString(); + } + + /** + * Used to convert a single cell for the given FieldReader to a human readable string. + * + * @param reader The FieldReader from which we should read the current cell. This means the position to be read should + * have been set on the reader before calling this method. 
+     * @return The human readable String representation of the value at the FieldReader's current position.
+     */
+    public static String fieldToString(FieldReader reader)
+    {
+        switch (reader.getMinorType()) {
+            case DATEDAY:
+                return String.valueOf(reader.readInteger());
+            case DATEMILLI:
+                return String.valueOf(reader.readLocalDateTime());
+            case FLOAT8:
+            case FLOAT4:
+            case UINT4:
+            case UINT8:
+            case INT:
+            case BIGINT:
+            case VARCHAR:
+            case BIT:
+                return String.valueOf(reader.readObject());
+            case DECIMAL:
+                return String.valueOf(reader.readBigDecimal());
+            case SMALLINT:
+                return String.valueOf(reader.readShort());
+            case TINYINT:
+            case UINT1:
+                return Integer.valueOf(reader.readByte()).toString();
+            case UINT2:
+                return Integer.valueOf(reader.readCharacter()).toString();
+            case VARBINARY:
+                return bytesToHex(reader.readByteArray());
+            case STRUCT:
+                StringBuilder sb = new StringBuilder();
+                sb.append("{");
+                for (Field child : reader.getField().getChildren()) {
+                    if (sb.length() > 3) {
+                        sb.append(",");
+                    }
+                    sb.append("[");
+                    sb.append(child.getName());
+                    sb.append(" : ");
+                    sb.append(fieldToString(reader.reader(child.getName())));
+                    sb.append("]");
+                }
+                sb.append("}");
+                return sb.toString();
+            case LIST:
+                StringBuilder sbList = new StringBuilder();
+                sbList.append("{");
+                while (reader.next()) {
+                    if (sbList.length() > 1) {
+                        sbList.append(",");
+                    }
+                    sbList.append(fieldToString(reader.reader()));
+                }
+                sbList.append("}");
+                return sbList.toString();
+            default:
+                Object obj = reader.readObject();
+                return reader.getMinorType() + " - " + ((obj != null) ? obj.getClass().toString() : "null") +
+                        "[ " + String.valueOf(obj) + " ]";
+        }
+    }
+
+    /**
+     * Copies an inclusive range of rows from one block to another.
+     *
+     * @param srcBlock The source Block to copy the range of rows from.
+     * @param dstBlock The destination Block to copy the range of rows to.
+     * @param firstRow The first row we'd like to copy.
+     * @param lastRow The last row we'd like to copy.
+     * @return The number of rows that were copied.
+     */
+    public static int copyRows(Block srcBlock, Block dstBlock, int firstRow, int lastRow)
+    {
+        if (firstRow > lastRow || lastRow > srcBlock.getRowCount() - 1) {
+            throw new RuntimeException("src has " + srcBlock.getRowCount() +
+                    " rows but requested copy of " + firstRow + " to " + lastRow);
+        }
+
+        for (FieldReader src : srcBlock.getFieldReaders()) {
+            int dstOffset = dstBlock.getRowCount();
+            for (int i = firstRow; i <= lastRow; i++) {
+                FieldVector dst = dstBlock.getFieldVector(src.getField().getName());
+                src.setPosition(i);
+                setValue(dst, dstOffset++, src.readObject());
+            }
+        }
+
+        int rowsCopied = 1 + (lastRow - firstRow);
+        dstBlock.setRowCount(dstBlock.getRowCount() + rowsCopied);
+        return rowsCopied;
+    }
+
+    /**
+     * Checks if a row is null by checking that all fields in that row are null (aka not set).
+     *
+     * @param block The Block we'd like to check.
+     * @param row The row number we'd like to check.
+     * @return True if the entire row is null (aka all fields null/unset), False if any field has a non-null value.
+     */
+    public static boolean isNullRow(Block block, int row)
+    {
+        if (row > block.getRowCount() - 1) {
+            throw new RuntimeException("block has " + block.getRowCount() +
+                    " rows but requested to check " + row);
+        }
+
+        //If any column is non-null then return false
+        for (FieldReader src : block.getFieldReaders()) {
+            src.setPosition(row);
+            if (src.isSet()) {
+                return false;
+            }
+        }
+
+        return true;
+    }
+
+    /**
+     * Used to write a List value.
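+     * <p>Callers typically reach this via {@link #setComplexValue} rather than calling it directly. A hypothetical
+     * sketch (the Block and field name are assumptions):
+     * <pre>{@code
+     * BlockUtils.setComplexValue(block.getFieldVector("tags"), 0, FieldResolver.DEFAULT, Arrays.asList("a", "b"));
+     * }</pre>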
+     *
+     * @param allocator The BlockAllocator which can be used to generate Apache Arrow Buffers for types
+     * which require conversion to an Arrow Buffer before they can be written using the FieldWriter.
+     * @param writer The FieldWriter for the List field we'd like to write into.
+     * @param field The Schema details of the List Field we are writing into.
+     * @param pos The position (row) in the Apache Arrow batch we are writing to.
+     * @param value An iterator to the collection of values we want to write into the row.
+     * @param resolver The field resolver that can be used to extract individual values from the value iterator.
+     */
+    @VisibleForTesting
+    protected static void writeList(BufferAllocator allocator,
+            FieldWriter writer,
+            Field field,
+            int pos,
+            Iterator value,
+            FieldResolver resolver)
+    {
+        //Apache Arrow List types have a single 'special' child field which gives us the concrete type of the values
+        //stored in the list.
+        Field child = null;
+        if (field.getChildren() != null && !field.getChildren().isEmpty()) {
+            child = field.getChildren().get(0);
+        }
+
+        //Mark the beginning of the list; this is essentially how Apache Arrow handles the variable length nature
+        //of lists.
+        writer.startList();
+
+        Iterator itr = value;
+        while (itr.hasNext()) {
+            //For each item in the iterator, attempt to write it to the list.
+            Object val = itr.next();
+            if (val != null) {
+                switch (Types.getMinorTypeForArrowType(child.getType())) {
+                    case LIST:
+                        writeList(allocator, (FieldWriter) writer.list(), child, pos, ((List) val).iterator(), resolver);
+                        break;
+                    case STRUCT:
+                        writeStruct(allocator, writer.struct(), child, pos, val, resolver);
+                        break;
+                    default:
+                        writeListValue(writer, child.getType(), allocator, val);
+                        break;
+                }
+            }
+        }
+        writer.endList();
+    }
+
+    /**
+     * Used to write a Struct value.
+     *
+     * @param allocator The BlockAllocator which can be used to generate Apache Arrow Buffers for types
+     * which require conversion to an Arrow Buffer before they can be written using the FieldWriter.
+     * @param writer The FieldWriter for the Struct field we'd like to write into.
+     * @param field The Schema details of the Struct Field we are writing into.
+     * @param pos The position (row) in the Apache Arrow batch we are writing to.
+     * @param value The value we'd like to write as a struct.
+     * @param resolver The field resolver that can be used to extract individual Struct fields from the value.
+     */
+    @VisibleForTesting
+    protected static void writeStruct(BufferAllocator allocator,
+            StructWriter writer,
+            Field field,
+            int pos,
+            Object value,
+            FieldResolver resolver)
+    {
+        //We expect null writes to have been handled earlier so this is a no-op.
+        if (value == null) {
+            return;
+        }
+
+        //Indicate the beginning of the struct value; this is how Apache Arrow handles the variable length of Struct types.
+        writer.start();
+        for (Field nextChild : field.getChildren()) {
+            //For each child field that comprises the struct, attempt to extract and write the corresponding value
+            //using the FieldResolver.
+            Object childValue = resolver.getFieldValue(nextChild, value);
+            switch (Types.getMinorTypeForArrowType(nextChild.getType())) {
+                case LIST:
+                    writeList(allocator,
+                            (FieldWriter) writer.list(nextChild.getName()),
+                            nextChild,
+                            pos,
+                            ((List) childValue).iterator(),
+                            resolver);
+                    break;
+                case STRUCT:
+                    writeStruct(allocator,
+                            writer.struct(nextChild.getName()),
+                            nextChild,
+                            pos,
+                            childValue,
+                            resolver);
+                    break;
+                default:
+                    writeStructValue(writer, nextChild, allocator, childValue);
+                    break;
+            }
+        }
+        writer.end();
+    }
+
+    /**
+     * Maps an Arrow Type to a Java class.
+     *
+     * @param minorType The Arrow minor type to map.
+     * @return The Java class that values of the given Arrow type are represented by.
+     */
+    @VisibleForTesting
+    public static Class getJavaType(Types.MinorType minorType)
+    {
+        switch (minorType) {
+            case DATEMILLI:
+                return LocalDateTime.class;
+            case TINYINT:
+            case UINT1:
+                return Byte.class;
+            case SMALLINT:
+                return Short.class;
+            case UINT2:
+                return Character.class;
+            case DATEDAY:
+                return LocalDate.class;
+            case INT:
+            case UINT4:
+                return Integer.class;
+            case UINT8:
+            case BIGINT:
+                return Long.class;
+            case DECIMAL:
+                return BigDecimal.class;
+            case FLOAT4:
+                return Float.class;
+            case FLOAT8:
+                return Double.class;
+            case VARCHAR:
+                return String.class;
+            case VARBINARY:
+                return byte[].class;
+            case BIT:
+                return Boolean.class;
+            case LIST:
+                return List.class;
+            case STRUCT:
+                return Map.class;
+            default:
+                throw new IllegalArgumentException("Unknown type " + minorType);
+        }
+    }
+
+    /**
+     * Used to write an individual value into a List field; multiple calls to this method per-cell are expected in order
+     * to write the N values of a list of size N.
+     *
+     * @param writer The FieldWriter (already positioned at the row and list entry number) that we want to write into.
+     * @param type The concrete type of the List's values.
+     * @param allocator The BlockAllocator that can be used for allocating Arrow Buffers for fields which require conversion
+     * to an Arrow Buffer before being written.
+     * @param value The value to write.
+     * @note This method and its Struct complement violate the DRY mantra because ListWriter and StructWriter don't share
+     * a meaningful ancestor despite having identical methods. This requires us to either further wrap and abstract the writer
+     * or duplicate some code. In a future release we hope to have contributed a better option to Apache Arrow which allows
+     * us to simplify this method.
+ */ + protected static void writeListValue(FieldWriter writer, ArrowType type, BufferAllocator allocator, Object value) + { + try { + //TODO: add all types + switch (Types.getMinorTypeForArrowType(type)) { + case DATEMILLI: + if (value instanceof Date) { + writer.writeDateMilli(((Date) value).getTime()); + } + else { + writer.writeDateMilli((long) value); + } + break; + case DATEDAY: + if (value instanceof Date) { + org.joda.time.Days days = org.joda.time.Days.daysBetween(EPOCH, + new org.joda.time.DateTime(((Date) value).getTime())); + writer.writeDateDay(days.getDays()); + } + else if (value instanceof LocalDate) { + int days = (int) ((LocalDate) value).toEpochDay(); + writer.writeDateDay(days); + } + else if (value instanceof Long) { + writer.writeDateDay(((Long) value).intValue()); + } + else { + writer.writeDateDay((int) value); + } + break; + case FLOAT8: + writer.float8().writeFloat8((double) value); + break; + case FLOAT4: + writer.float4().writeFloat4((float) value); + break; + case INT: + if (value != null && value instanceof Long) { + //This may seem odd at first but many frameworks (like Presto) use long as the preferred + //native java type for representing integers. We do this to keep type conversions simple. + writer.integer().writeInt(((Long) value).intValue()); + } + else { + writer.integer().writeInt((int) value); + } + break; + case TINYINT: + writer.tinyInt().writeTinyInt((byte) value); + break; + case SMALLINT: + writer.smallInt().writeSmallInt((short) value); + break; + case UINT1: + writer.uInt1().writeUInt1((byte) value); + break; + case UINT2: + writer.uInt2().writeUInt2((char) value); + break; + case UINT4: + writer.uInt4().writeUInt4((int) value); + break; + case UINT8: + writer.uInt8().writeUInt8((long) value); + break; + case BIGINT: + writer.bigInt().writeBigInt((long) value); + break; + case VARBINARY: + if (value instanceof ArrowBuf) { + ArrowBuf buf = (ArrowBuf) value; + writer.varBinary().writeVarBinary(0, buf.capacity(), buf); + } + else if (value instanceof byte[]) { + byte[] bytes = (byte[]) value; + try (ArrowBuf buf = allocator.buffer(bytes.length)) { + buf.writeBytes(bytes); + writer.varBinary().writeVarBinary(0, buf.readableBytes(), buf); + } + } + break; + case DECIMAL: + int scale = ((ArrowType.Decimal) type).getScale(); + if (value instanceof Double) { + int precision = ((ArrowType.Decimal) type).getPrecision(); + BigDecimal bdVal = new BigDecimal((double) value); + bdVal = bdVal.setScale(scale, RoundingMode.HALF_UP); + writer.decimal().writeDecimal(bdVal); + } + else { + BigDecimal scaledValue = ((BigDecimal) value).setScale(scale, RoundingMode.HALF_UP); + writer.decimal().writeDecimal(scaledValue); + } + break; + case VARCHAR: + if (value instanceof String) { + byte[] bytes = ((String) value).getBytes(Charsets.UTF_8); + try (ArrowBuf buf = allocator.buffer(bytes.length)) { + buf.writeBytes(bytes); + writer.varChar().writeVarChar(0, buf.readableBytes(), buf); + } + } + else if (value instanceof ArrowBuf) { + ArrowBuf buf = (ArrowBuf) value; + writer.varChar().writeVarChar(0, buf.readableBytes(), buf); + } + else if (value instanceof byte[]) { + byte[] bytes = (byte[]) value; + try (ArrowBuf buf = allocator.buffer(bytes.length)) { + buf.writeBytes(bytes); + writer.varChar().writeVarChar(0, buf.readableBytes(), buf); + } + } + break; + case BIT: + if (value instanceof Integer && (int) value > 0) { + writer.bit().writeBit(1); + } + else if (value instanceof Boolean && (boolean) value) { + writer.bit().writeBit(1); + } + else { + 
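//Anything else (a non-positive Integer, Boolean.FALSE, or any other type) is written as false; nulls were skipped by the caller.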
+                        writer.bit().writeBit(0);
+                    }
+                    break;
+                default:
+                    throw new IllegalArgumentException("Unknown type " + type);
+            }
+        }
+        catch (RuntimeException ex) {
+            String fieldName = (writer.getField() != null) ? writer.getField().getName() : "null_vector";
+            throw new RuntimeException("Unable to write value for field " + fieldName + " using value " + value, ex);
+        }
+    }
+
+    /**
+     * Used to write a value into a specific child field within a Struct. Multiple calls to this method per-cell are
+     * expected in order to write to all N fields of a Struct.
+     *
+     * @param writer The FieldWriter (already positioned at the row and list entry number) that we want to write into.
+     * @param field The child field we are attempting to write into.
+     * @param allocator The BlockAllocator that can be used for allocating Arrow Buffers for fields which require conversion
+     * to an Arrow Buffer before being written.
+     * @param value The value to write.
+     * @note This method and its List complement violate the DRY mantra because ListWriter and StructWriter don't share
+     * a meaningful ancestor despite having identical methods. This requires us to either further wrap and abstract the writer
+     * or duplicate some code. In a future release we hope to have contributed a better option to Apache Arrow which allows
+     * us to simplify this method.
+     */
+    @VisibleForTesting
+    protected static void writeStructValue(StructWriter writer, Field field, BufferAllocator allocator, Object value)
+    {
+        if (value == null) {
+            return;
+        }
+
+        ArrowType type = field.getType();
+        try {
+            switch (Types.getMinorTypeForArrowType(type)) {
+                case DATEMILLI:
+                    if (value instanceof Date) {
+                        writer.dateMilli(field.getName()).writeDateMilli(((Date) value).getTime());
+                    }
+                    else {
+                        writer.dateMilli(field.getName()).writeDateMilli((long) value);
+                    }
+                    break;
+
+                case DATEDAY:
+                    if (value instanceof Date) {
+                        org.joda.time.Days days = org.joda.time.Days.daysBetween(EPOCH,
+                                new org.joda.time.DateTime(((Date) value).getTime()));
+                        writer.dateDay(field.getName()).writeDateDay(days.getDays());
+                    }
+                    else if (value instanceof LocalDate) {
+                        int days = (int) ((LocalDate) value).toEpochDay();
+                        writer.dateDay(field.getName()).writeDateDay(days);
+                    }
+                    else if (value instanceof Long) {
+                        writer.dateDay(field.getName()).writeDateDay(((Long) value).intValue());
+                    }
+                    else {
+                        writer.dateDay(field.getName()).writeDateDay((int) value);
+                    }
+                    break;
+                case FLOAT8:
+                    writer.float8(field.getName()).writeFloat8((double) value);
+                    break;
+                case FLOAT4:
+                    writer.float4(field.getName()).writeFloat4((float) value);
+                    break;
+                case INT:
+                    if (value != null && value instanceof Long) {
+                        //This may seem odd at first but many frameworks (like Presto) use long as the preferred
+                        //native java type for representing integers. We do this to keep type conversions simple.
+ writer.integer(field.getName()).writeInt(((Long) value).intValue()); + } + else { + writer.integer(field.getName()).writeInt((int) value); + } + break; + case TINYINT: + writer.tinyInt(field.getName()).writeTinyInt((byte) value); + break; + case SMALLINT: + writer.smallInt(field.getName()).writeSmallInt((short) value); + break; + case UINT1: + writer.uInt1(field.getName()).writeUInt1((byte) value); + break; + case UINT2: + writer.uInt2(field.getName()).writeUInt2((char) value); + break; + case UINT4: + writer.uInt4(field.getName()).writeUInt4((int) value); + break; + case UINT8: + writer.uInt8(field.getName()).writeUInt8((long) value); + break; + case BIGINT: + writer.bigInt(field.getName()).writeBigInt((long) value); + break; + case VARBINARY: + if (value instanceof ArrowBuf) { + ArrowBuf buf = (ArrowBuf) value; + writer.varBinary(field.getName()).writeVarBinary(0, buf.capacity(), buf); + } + else if (value instanceof byte[]) { + byte[] bytes = (byte[]) value; + try (ArrowBuf buf = allocator.buffer(bytes.length)) { + buf.writeBytes(bytes); + writer.varBinary(field.getName()).writeVarBinary(0, buf.readableBytes(), buf); + } + } + break; + case DECIMAL: + int scale = ((ArrowType.Decimal) type).getScale(); + if (value instanceof Double) { + int precision = ((ArrowType.Decimal) type).getPrecision(); + BigDecimal bdVal = new BigDecimal((double) value); + bdVal = bdVal.setScale(scale, RoundingMode.HALF_UP); + writer.decimal(field.getName(), scale, precision).writeDecimal(bdVal); + } + else { + BigDecimal scaledValue = ((BigDecimal) value).setScale(scale, RoundingMode.HALF_UP); + writer.decimal(field.getName()).writeDecimal(scaledValue); + } + break; + case VARCHAR: + if (value instanceof String) { + byte[] bytes = ((String) value).getBytes(Charsets.UTF_8); + try (ArrowBuf buf = allocator.buffer(bytes.length)) { + buf.writeBytes(bytes); + writer.varChar(field.getName()).writeVarChar(0, buf.readableBytes(), buf); + } + } + else if (value instanceof ArrowBuf) { + ArrowBuf buf = (ArrowBuf) value; + writer.varChar(field.getName()).writeVarChar(0, buf.readableBytes(), buf); + } + else if (value instanceof byte[]) { + byte[] bytes = (byte[]) value; + try (ArrowBuf buf = allocator.buffer(bytes.length)) { + buf.writeBytes(bytes); + writer.varChar(field.getName()).writeVarChar(0, buf.readableBytes(), buf); + } + } + break; + case BIT: + if (value instanceof Integer && (int) value > 0) { + writer.bit(field.getName()).writeBit(1); + } + else if (value instanceof Boolean && (boolean) value) { + writer.bit(field.getName()).writeBit(1); + } + else { + writer.bit(field.getName()).writeBit(0); + } + break; + default: + throw new IllegalArgumentException("Unknown type " + type); + } + } + catch (RuntimeException ex) { + throw new RuntimeException("Unable to write value for field " + field.getName() + " using value " + value, ex); + } + } + + /** + * Used to mark a particular cell as null. + * + * @param vector The FieldVector to write the null value to. + * @param pos The position (row) in the FieldVector to mark as null. 
+ */ + private static void setNullValue(FieldVector vector, int pos) + { + switch (vector.getMinorType()) { + case DATEMILLI: + ((DateMilliVector) vector).setNull(pos); + break; + case DATEDAY: + ((DateDayVector) vector).setNull(pos); + break; + case FLOAT8: + ((Float8Vector) vector).setNull(pos); + break; + case FLOAT4: + ((Float4Vector) vector).setNull(pos); + break; + case INT: + ((IntVector) vector).setNull(pos); + break; + case TINYINT: + ((TinyIntVector) vector).setNull(pos); + break; + case SMALLINT: + ((SmallIntVector) vector).setNull(pos); + break; + case UINT1: + ((UInt1Vector) vector).setNull(pos); + break; + case UINT2: + ((UInt2Vector) vector).setNull(pos); + break; + case UINT4: + ((UInt4Vector) vector).setNull(pos); + break; + case UINT8: + ((UInt8Vector) vector).setNull(pos); + break; + case BIGINT: + ((BigIntVector) vector).setNull(pos); + break; + case VARBINARY: + ((VarBinaryVector) vector).setNull(pos); + break; + case DECIMAL: + ((DecimalVector) vector).setNull(pos); + break; + case VARCHAR: + ((VarCharVector) vector).setNull(pos); + break; + case BIT: + ((BitVector) vector).setNull(pos); + break; + default: + throw new IllegalArgumentException("Unknown type " + vector.getMinorType()); + } + } + + /** + * In some filtering situations it can be useful to 'unset' a row as an indication to a later processing stage + * that the row is irrelevant. The mechanism by which we 'unset' a row is actually field type specific and as such + * this method is not supported for all field types. + * + * @param row The row number to unset in the provided Block. + * @param block The Block where we'd like to unset the specified row. + */ + public static void unsetRow(int row, Block block) + { + for (FieldVector vector : block.getFieldVectors()) { + switch (vector.getMinorType()) { + case DATEDAY: + ((DateDayVector) vector).setNull(row); + break; + case DATEMILLI: + ((DateMilliVector) vector).setNull(row); + break; + case TINYINT: + ((TinyIntVector) vector).setNull(row); + break; + case UINT1: + ((UInt1Vector) vector).setNull(row); + break; + case SMALLINT: + ((SmallIntVector) vector).setNull(row); + break; + case UINT2: + ((UInt2Vector) vector).setNull(row); + break; + case UINT4: + ((UInt4Vector) vector).setNull(row); + break; + case INT: + ((IntVector) vector).setNull(row); + break; + case UINT8: + ((UInt8Vector) vector).setNull(row); + break; + case BIGINT: + ((BigIntVector) vector).setNull(row); + break; + case FLOAT4: + ((Float4Vector) vector).setNull(row); + break; + case FLOAT8: + ((Float8Vector) vector).setNull(row); + break; + case DECIMAL: + ((DecimalVector) vector).setNull(row); + break; + case VARBINARY: + ((VarBinaryVector) vector).setNull(row); + break; + case VARCHAR: + ((VarCharVector) vector).setNull(row); + break; + case BIT: + ((BitVector) vector).setNull(row); + break; + case STRUCT: + ((StructVector) vector).setNull(row); + break; + case LIST: + UnionListWriter writer = ((ListVector) vector).getWriter(); + writer.setPosition(row); + writer.startList(); + writer.endList(); + writer.setValueCount(0); + break; + default: + throw new IllegalArgumentException("Unknown type " + vector.getMinorType()); + } + } + } + + public static final org.joda.time.MutableDateTime EPOCH = new org.joda.time.MutableDateTime(); + + static { + EPOCH.setDate(0); + } + + private BlockUtils() {} + + private static final char[] HEX_ARRAY = "0123456789ABCDEF".toCharArray(); + + private static String bytesToHex(byte[] bytes) + { + char[] hexChars = new char[bytes.length * 2]; + for (int j = 0; j < 
bytes.length; j++) {
+            int v = bytes[j] & 0xFF;
+            hexChars[j * 2] = HEX_ARRAY[v >>> 4];
+            hexChars[j * 2 + 1] = HEX_ARRAY[v & 0x0F];
+        }
+        return new String(hexChars);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockWriter.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockWriter.java
new file mode 100644
index 0000000000..26423bd39a
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/BlockWriter.java
@@ -0,0 +1,61 @@
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connector.lambda.data;
+
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+
+/**
+ * Defines an abstraction that can be used to write to a Block without owning the lifecycle of the
+ * Block.
+ */
+public interface BlockWriter
+{
+    /**
+     * The interface you should implement for writing to a Block via
+     * the inverted ownership model offered by BlockWriter.
+     */
+    interface RowWriter
+    {
+        /**
+         * Used to accumulate rows as part of a block.
+         *
+         * @param block The block you can add your row to.
+         * @param rowNum The row number in that block that the next row represents.
+         * @return The number of rows that were added.
+         * @note We do not recommend writing more than 1 row per call. There are some use-cases which
+         * are made much simpler by being able to write a small number (<100) of rows per call. These often
+         * relate to batched operators, scan side joins, or field expansions. Writing too many rows
+         * will result in errors related to Block size management that are implementation specific.
+         */
+        int writeRows(Block block, int rowNum);
+    }
+
+    /**
+     * Used to write rows via the BlockWriter.
+     *
+     * @param rowWriter The RowWriter that the BlockWriter should use to write rows into the Block(s) it is managing.
+     */
+    void writeRows(RowWriter rowWriter);
+
+    /**
+     * Provides access to the ConstraintEvaluator that will be applied to the generated Blocks.
+     */
+    ConstraintEvaluator getConstraintEvaluator();
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/FieldBuilder.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/FieldBuilder.java
new file mode 100644
index 0000000000..99f7b8d10e
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/FieldBuilder.java
@@ -0,0 +1,250 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+
+import java.util.ArrayList;
+import java.util.Collections;
+import java.util.List;
+
+/**
+ * Convenience builder that can be used to create new Apache Arrow fields for common
+ * types more easily than alternative methods of construction, especially for complex types.
+ */
+public class FieldBuilder
+{
+    private final String name;
+    private final ArrowType type;
+    private final List<Field> children = new ArrayList<>();
+
+    /**
+     * Creates a FieldBuilder for a Field with the given name and type.
+     *
+     * @param name The name to use for the Field being built.
+     * @param type The type to use for the Field being built, most often one of STRUCT or LIST.
+     */
+    private FieldBuilder(String name, ArrowType type)
+    {
+        this.name = name;
+        this.type = type;
+    }
+
+    /**
+     * Creates a FieldBuilder for a Field with the given name and type.
+     *
+     * @param name The name to use for the Field being built.
+     * @param type The type to use for the Field being built, most often one of STRUCT or LIST.
+     * @return A new FieldBuilder for the specified name and type.
+     */
+    public static FieldBuilder newBuilder(String name, ArrowType type)
+    {
+        return new FieldBuilder(name, type);
+    }
+
+    /**
+     * Adds a new child field with the requested attributes.
+     *
+     * @param fieldName The name of the child field.
+     * @param type The type of the child field.
+     * @param children The children to add to the child field (empty list if no children desired).
+     * @return This FieldBuilder itself.
+     */
+    public FieldBuilder addField(String fieldName, ArrowType type, List<Field> children)
+    {
+        this.children.add(new Field(fieldName, FieldType.nullable(type), children));
+        return this;
+    }
+
+    /**
+     * Adds the provided field as a child to the builder.
+     *
+     * @param child The child to add to the Field being built.
+     * @return This FieldBuilder itself.
+     */
+    public FieldBuilder addField(Field child)
+    {
+        this.children.add(child);
+        return this;
+    }
+
+    /**
+     * Adds a new VARCHAR child field with the given name to the builder.
+     *
+     * @param fieldName The name to use for the newly added child field.
+     * @return This FieldBuilder itself.
+     */
+    public FieldBuilder addStringField(String fieldName)
+    {
+        this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.VARCHAR.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds a new LIST child field with the given name to the builder.
+     *
+     * @param fieldName The name to use for the newly added child field.
+     * @param type The concrete type for values in the List.
+     * @return This FieldBuilder itself.
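+     * <p>A hypothetical sketch of building a struct with a list child (the field names are assumptions):
+     * <pre>{@code
+     * Field person = FieldBuilder.newBuilder("person", Types.MinorType.STRUCT.getType())
+     *         .addStringField("name")
+     *         .addListField("nicknames", Types.MinorType.VARCHAR.getType())
+     *         .build();
+     * }</pre>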
+ */ + public FieldBuilder addListField(String fieldName, ArrowType type) + { + Field baseField = new Field("", FieldType.nullable(type), null); + Field field = new Field(fieldName, + FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(baseField)); + this.children.add(field); + return this; + } + + /** + * Adds a new INT child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addIntField(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.INT.getType()), null)); + return this; + } + + /** + * Adds a new FLOAT8 child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addFloat8Field(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.FLOAT8.getType()), null)); + return this; + } + + /** + * Adds a new BIGINT child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addBigIntField(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.BIGINT.getType()), null)); + return this; + } + + /** + * Adds a new BIT child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addBitField(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.BIT.getType()), null)); + return this; + } + + /** + * Adds a new TinyInt child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addTinyIntField(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.TINYINT.getType()), null)); + return this; + } + + /** + * Adds a new SmallInt child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addSmallIntField(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.SMALLINT.getType()), null)); + return this; + } + + /** + * Adds a new Float4 child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addFloat4Field(String fieldName) + { + this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.FLOAT4.getType()), null)); + return this; + } + + /** + * Adds a new Decimal child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. + */ + public FieldBuilder addDecimalField(String fieldName, int precision, int scale) + { + this.children.add(new Field(fieldName, FieldType.nullable(new ArrowType.Decimal(precision, scale)), null)); + return this; + } + + /** + * Adds a new DateDay child field with the given name to the builder. + * + * @param fieldName The name to use for the newly added child field. + * @return This FieldBuilder itself. 
+     */
+    public FieldBuilder addDateDayField(String fieldName)
+    {
+        this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.DATEDAY.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds a new DateMilli child field with the given name to the builder.
+     *
+     * @param fieldName The name to use for the newly added child field.
+     * @return This FieldBuilder itself.
+     */
+    public FieldBuilder addDateMilliField(String fieldName)
+    {
+        this.children.add(new Field(fieldName, FieldType.nullable(Types.MinorType.DATEMILLI.getType()), null));
+        return this;
+    }
+
+    /**
+     * Builds the Field.
+     *
+     * @return The newly constructed Field.
+     */
+    public Field build()
+    {
+        return new Field(name, FieldType.nullable(type), children);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/FieldResolver.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/FieldResolver.java
new file mode 100644
index 0000000000..8a6bb5d4b9
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/FieldResolver.java
@@ -0,0 +1,69 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * Assists in writing values for complex types like List and Struct by providing a way to extract child field
+ * values from the provided complex value.
+ */
+public interface FieldResolver
+{
+    /**
+     * Basic FieldResolver capable of resolving nested (or single level) Lists and Structs
+     * if the List values are iterable and the Struct values are represented as a Map.
+     *
+     * @note This approach is relatively simple and convenient in terms of programming
+     * interface but sacrifices some performance due to Object overhead vs. using
+     * Apache Arrow directly. It is provided for basic use cases which don't have a high
+     * row count and also as a way to teach by example. For better performance, provide
+     * your own FieldResolver. And for even better performance, use Apache Arrow directly.
+     */
+    FieldResolver DEFAULT = new FieldResolver()
+    {
+        public Object getFieldValue(Field field, Object value)
+        {
+            Types.MinorType minorType = Types.getMinorTypeForArrowType(field.getType());
+            if (minorType == Types.MinorType.LIST) {
+                return ((List) value).iterator();
+            }
+            else if (value instanceof Map) {
+                return ((Map) value).get(field.getName());
+            }
+            throw new RuntimeException("Expected LIST type but found " + minorType);
+        }
+    };
+
+    /**
+     * Used to extract a value for the given Field from the provided value.
+     *
+     * @param field The field that we would like to extract from the provided value.
+     * @param value The complex value we'd like to extract the provided field from.
+     * @return The value to use for the given field.
+     */
+    Object getFieldValue(Field field, Object value);
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/RecordBatchSerDe.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/RecordBatchSerDe.java
new file mode 100644
index 0000000000..7982161ed6
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/RecordBatchSerDe.java
@@ -0,0 +1,90 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.ipc.ReadChannel;
+import org.apache.arrow.vector.ipc.WriteChannel;
+import org.apache.arrow.vector.ipc.message.ArrowRecordBatch;
+import org.apache.arrow.vector.ipc.message.MessageSerializer;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.io.OutputStream;
+import java.nio.channels.Channels;
+
+/**
+ * Used to serialize and deserialize ArrowRecordBatch.
+ */
+public class RecordBatchSerDe
+{
+    private final BlockAllocator allocator;
+
+    public RecordBatchSerDe(BlockAllocator allocator)
+    {
+        this.allocator = allocator;
+    }
+
+    /**
+     * Serializes the provided ArrowRecordBatch to the provided OutputStream and closes the batch once
+     * it is fully written to the OutputStream.
+     *
+     * @param batch The ArrowRecordBatch to serialize.
+     * @param out The OutputStream to write to.
+     * @throws IOException If an error occurs while writing to the OutputStream.
+     */
+    public void serialize(ArrowRecordBatch batch, OutputStream out)
+            throws IOException
+    {
+        try {
+            MessageSerializer.serialize(new WriteChannel(Channels.newChannel(out)), batch);
+        }
+        finally {
+            batch.close();
+        }
+    }
+
+    /**
+     * Attempts to deserialize the provided byte[] into an ArrowRecordBatch.
+     *
+     * @param in The byte[] that is expected to contain a serialized ArrowRecordBatch.
+     * @return The resulting ArrowRecordBatch if the byte[] contains a valid ArrowRecordBatch.
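+     * A hypothetical round trip (the allocator and batch instances are assumptions; note that serialize
+     * closes the batch it is given):
+     * <pre>{@code
+     * RecordBatchSerDe serDe = new RecordBatchSerDe(allocator);
+     * ByteArrayOutputStream out = new ByteArrayOutputStream();
+     * serDe.serialize(batch, out);
+     * ArrowRecordBatch roundTripped = serDe.deserialize(out.toByteArray());
+     * }</pre>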
+     * @throws IOException If an error occurs while deserializing the byte[].
+     */
+    public ArrowRecordBatch deserialize(byte[] in)
+            throws IOException
+    {
+        return allocator.registerBatch((BufferAllocator root) ->
+                (ArrowRecordBatch) MessageSerializer.deserializeMessageBatch(
+                        new ReadChannel(Channels.newChannel(new ByteArrayInputStream(in))),
+                        root)
+        );
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpillReader.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpillReader.java
new file mode 100644
index 0000000000..48806b99dc
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpillReader.java
@@ -0,0 +1,118 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation;
+import com.amazonaws.athena.connector.lambda.security.AesGcmBlockCrypto;
+import com.amazonaws.athena.connector.lambda.security.BlockCrypto;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKey;
+import com.amazonaws.athena.connector.lambda.security.NoOpBlockCrypto;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.model.S3Object;
+import com.google.common.io.ByteStreams;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+
+import static java.util.Objects.requireNonNull;
+
+public class S3BlockSpillReader
+{
+    private static final Logger logger = LoggerFactory.getLogger(S3BlockSpillReader.class);
+
+    private final AmazonS3 amazonS3;
+    private final BlockAllocator allocator;
+
+    public S3BlockSpillReader(AmazonS3 amazonS3, BlockAllocator allocator)
+    {
+        this.amazonS3 = requireNonNull(amazonS3, "amazonS3 was null");
+        this.allocator = requireNonNull(allocator, "allocator was null");
+    }
+
+    /**
+     * Reads a spilled block.
+     *
+     * @param spillLocation The location to read the spilled Block from.
+     * @param key The encryption key to use when reading the spilled Block.
+     * @param schema The Schema to use when deserializing the spilled Block.
+     * @return The Block stored at the spill location.
+     */
+    public Block read(S3SpillLocation spillLocation, EncryptionKey key, Schema schema)
+    {
+        S3Object fullObject = null;
+        try {
+            logger.debug("read: Started reading block from S3");
+            fullObject = amazonS3.getObject(spillLocation.getBucket(), spillLocation.getKey());
+            logger.debug("read: Completed reading block from S3");
+                    new AesGcmBlockCrypto(allocator) : new NoOpBlockCrypto(allocator);
+            Block block = blockCrypto.decrypt(key, ByteStreams.toByteArray(fullObject.getObjectContent()), schema);
+            logger.debug("read: Completed decrypting block");
+            return block;
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+        finally {
+            if (fullObject != null) {
+                try {
+                    fullObject.close();
+                }
+                catch (IOException ex) {
+                    logger.warn("read: Exception while closing S3 object", ex);
+                }
+            }
+        }
+    }
+
+    /**
+     * Reads spilled data as a byte[].
+     *
+     * @param spillLocation The location to read the spilled Block from.
+     * @param key The encryption key to use when reading the spilled Block.
+     * @return The raw bytes stored at the spill location.
+     */
+    public byte[] read(S3SpillLocation spillLocation, EncryptionKey key)
+    {
+        S3Object fullObject = null;
+        try {
+            logger.debug("read: Started reading block from S3");
+            fullObject = amazonS3.getObject(spillLocation.getBucket(), spillLocation.getKey());
+            logger.debug("read: Completed reading block from S3");
+            BlockCrypto blockCrypto = (key != null) ? new AesGcmBlockCrypto(allocator) : new NoOpBlockCrypto(allocator);
+            return blockCrypto.decrypt(key, ByteStreams.toByteArray(fullObject.getObjectContent()));
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+        finally {
+            if (fullObject != null) {
+                try {
+                    fullObject.close();
+                }
+                catch (IOException ex) {
+                    logger.warn("read: Exception while closing S3 object", ex);
+                }
+            }
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpiller.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpiller.java
new file mode 100644
index 0000000000..85d44117de
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpiller.java
@@ -0,0 +1,422 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation;
+import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation;
+import com.amazonaws.athena.connector.lambda.security.AesGcmBlockCrypto;
+import com.amazonaws.athena.connector.lambda.security.BlockCrypto;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKey;
+import com.amazonaws.athena.connector.lambda.security.NoOpBlockCrypto;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.model.ObjectMetadata;
+import com.amazonaws.services.s3.model.S3Object;
+import com.google.common.io.ByteStreams;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.List;
+import java.util.concurrent.ExecutorService;
+import java.util.concurrent.Executors;
+import java.util.concurrent.TimeUnit;
+import java.util.concurrent.atomic.AtomicLong;
+import java.util.concurrent.atomic.AtomicReference;
+import java.util.concurrent.locks.Lock;
+import java.util.concurrent.locks.ReadWriteLock;
+import java.util.concurrent.locks.StampedLock;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * Implementation of BlockSpiller which spills Blocks from large results to S3 with optional AES-GCM encryption.
+ *
+ * @note The sizes at which this implementation will spill to S3 are configured using SpillConfig.
+ */
+public class S3BlockSpiller
+        implements AutoCloseable, BlockSpiller
+{
+    private static final Logger logger = LoggerFactory.getLogger(S3BlockSpiller.class);
+    //Used to control how long we will wait for background spill threads to exit.
+    private static final long ASYNC_SHUTDOWN_MILLIS = 10_000;
+    //The default max number of rows that are allowed to be written per call to writeRows(...)
+    private static final int MAX_ROWS_PER_CALL = 100;
+
+    //Used to write to S3
+    private final AmazonS3 amazonS3;
+    //Used to optionally encrypt Blocks.
+    private final BlockCrypto blockCrypto;
+    //Used to create new blocks.
+    private final BlockAllocator allocator;
+    //Controls how/when/where/if this implementation will spill to S3.
+    private final SpillConfig spillConfig;
+    //The schema to use for Blocks.
+    private final Schema schema;
+    //The max number of rows that are allowed to be written per call to writeRows(...)
+    private final long maxRowsPerCall;
+    //If we spilled, the spill locations are kept here.
+    private final List<SpillLocation> spillLocations = new ArrayList<>();
+    //Reference to the in-progress Block.
+    private final AtomicReference<Block> inProgressBlock = new AtomicReference<>();
+    //Allows a degree of pipelining to take place so we don't block reading from the source
+    //while we are spilling.
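+    //If numSpillThreads is <= 0 no pool is created and spills happen synchronously on the calling thread.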
+    private final ExecutorService asyncSpillPool;
+    //Allows us to provide thread safety between async spill completion and calls to get spill status.
+    private final ReadWriteLock spillLock = new StampedLock().asReadWriteLock();
+    //Used to create monotonically increasing spill locations. If the locations are not
+    //monotonically increasing then read performance may suffer, since the engine relies on
+    //this characteristic of the writes to pre-fetch/pipeline reads before all writes
+    //are completed.
+    private final AtomicLong spillNumber = new AtomicLong(0);
+    //Holder that is used to surface any exceptions encountered in our background spill threads.
+    private final AtomicReference<RuntimeException> asyncException = new AtomicReference<>(null);
+    //The ConstraintEvaluator used to constrain the Blocks written via this spiller.
+    private final ConstraintEvaluator constraintEvaluator;
+
+    /**
+     * Constructor which uses the default maxRowsPerCall.
+     *
+     * @param amazonS3 AmazonS3 client to use for writing to S3.
+     * @param spillConfig The spill config for this instance. Includes things like encryption key, s3 path, etc...
+     * @param allocator The BlockAllocator to use when creating blocks.
+     * @param schema The schema for blocks that should be written.
+     * @param constraintEvaluator The ConstraintEvaluator that should be used to constrain writes.
+     */
+    public S3BlockSpiller(AmazonS3 amazonS3,
+            SpillConfig spillConfig,
+            BlockAllocator allocator,
+            Schema schema,
+            ConstraintEvaluator constraintEvaluator)
+    {
+        this(amazonS3, spillConfig, allocator, schema, constraintEvaluator, MAX_ROWS_PER_CALL);
+    }
+
+    /**
+     * Constructs a new S3BlockSpiller.
+     *
+     * @param amazonS3 AmazonS3 client to use for writing to S3.
+     * @param spillConfig The spill config for this instance. Includes things like encryption key, s3 path, etc...
+     * @param allocator The BlockAllocator to use when creating blocks.
+     * @param schema The schema for blocks that should be written.
+     * @param constraintEvaluator The ConstraintEvaluator that should be used to constrain writes.
+     * @param maxRowsPerCall The max number of rows to allow callers to write in one call.
+     */
+    public S3BlockSpiller(AmazonS3 amazonS3,
+            SpillConfig spillConfig,
+            BlockAllocator allocator,
+            Schema schema,
+            ConstraintEvaluator constraintEvaluator,
+            int maxRowsPerCall)
+    {
+        this.amazonS3 = requireNonNull(amazonS3, "amazonS3 was null");
+        this.spillConfig = requireNonNull(spillConfig, "spillConfig was null");
+        this.allocator = requireNonNull(allocator, "allocator was null");
+        this.schema = requireNonNull(schema, "schema was null");
+        this.blockCrypto = (spillConfig.getEncryptionKey() != null) ? new AesGcmBlockCrypto(allocator) : new NoOpBlockCrypto(allocator);
+        asyncSpillPool = (spillConfig.getNumSpillThreads() <= 0) ? null :
+                Executors.newFixedThreadPool(spillConfig.getNumSpillThreads());
+        this.maxRowsPerCall = maxRowsPerCall;
+        this.constraintEvaluator = constraintEvaluator;
+    }
+
+    /**
+     * Provides access to the constraint evaluator used to constrain blocks written via this BlockSpiller.
+     *
+     * @return The ConstraintEvaluator used by this BlockSpiller.
+     */
+    @Override
+    public ConstraintEvaluator getConstraintEvaluator()
+    {
+        return constraintEvaluator;
+    }
+
+    /**
+     * Used to write rows via the BlockWriter.
+     *
+     * @param rowWriter The RowWriter that the BlockWriter should use to write rows into the Block(s) it is managing.
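+     *                  For illustration only, a RowWriter that writes one row per call (matching the
+     *                  writeRows(block, rowCount) shape used below) might look like:
+     *                  <pre>{@code
+     *                  spiller.writeRows((Block block, int rowNum) -> {
+     *                      //... populate the row's field vectors at position rowNum ...
+     *                      return 1; //rows written; must not exceed maxRowsPerCall
+     *                  });
+     *                  }</pre>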
+     * @see BlockSpiller
+     */
+    public void writeRows(RowWriter rowWriter)
+    {
+        ensureInit();
+
+        Block block = inProgressBlock.get();
+        int rowCount = block.getRowCount();
+
+        int rows = rowWriter.writeRows(block, rowCount);
+
+        if (rows > maxRowsPerCall) {
+            throw new RuntimeException("Call generated more than " + maxRowsPerCall + " rows. Generating " +
+                    "too many rows per call to writeRows(...) can result in blocks that exceed the max size.");
+        }
+        if (rows > 0) {
+            block.setRowCount(rowCount + rows);
+        }
+
+        if (block.getSize() > spillConfig.getMaxBlockBytes()) {
+            logger.info("writeRows: Spilling block with {} rows and {} bytes and config {} bytes",
+                    new Object[] {block.getRowCount(), block.getSize(), spillConfig.getMaxBlockBytes()});
+            spillBlock(block);
+            inProgressBlock.set(this.allocator.createBlock(this.schema));
+            inProgressBlock.get().constrain(constraintEvaluator);
+        }
+    }
+
+    /**
+     * Used to tell if any blocks were spilled or if the response can be inline.
+     *
+     * @return True if a spill occurred, false otherwise.
+     */
+    public boolean spilled()
+    {
+        if (asyncException.get() != null) {
+            throw asyncException.get();
+        }
+
+        //We use the write lock because we want this to have exclusive access to the state
+        Lock lock = spillLock.writeLock();
+        try {
+            lock.lock();
+            ensureInit();
+            Block block = inProgressBlock.get();
+            return !spillLocations.isEmpty() || block.getSize() >= spillConfig.getMaxInlineBlockSize();
+        }
+        finally {
+            lock.unlock();
+        }
+    }
+
+    /**
+     * If spilled() returns false this can be used to access the block.
+     *
+     * @return Block to be inlined in the response.
+     * @throws RuntimeException if blocks were spilled and this method is called.
+     */
+    public Block getBlock()
+    {
+        if (spilled()) {
+            throw new RuntimeException("Blocks have spilled, calls to getBlock not permitted. Use getSpillLocations instead.");
+        }
+
+        logger.info("getBlock: Inline Block size[{}] bytes vs {}", inProgressBlock.get().getSize(), spillConfig.getMaxInlineBlockSize());
+        return inProgressBlock.get();
+    }
+
+    /**
+     * If spilled() returns true this can be used to access the spill locations of all blocks.
+     *
+     * @return List of spill locations.
+     * @throws RuntimeException if blocks were not spilled and this method is called.
+     */
+    public List<SpillLocation> getSpillLocations()
+    {
+        if (!spilled()) {
+            throw new RuntimeException("Blocks have not spilled, calls to getSpillLocations not permitted. Use getBlock instead.");
+        }
+
+        Lock lock = spillLock.writeLock();
+        try {
+            /**
+             * Flush the in-progress block if necessary.
+             */
+            Block block = inProgressBlock.get();
+            if (block.getRowCount() > 0) {
+                logger.info("getSpillLocations: Spilling final block with {} rows and {} bytes and config {} bytes",
+                        new Object[] {block.getRowCount(), block.getSize(), spillConfig.getMaxBlockBytes()});
+
+                spillBlock(block);
+
+                inProgressBlock.set(this.allocator.createBlock(this.schema));
+                inProgressBlock.get().constrain(constraintEvaluator);
+            }
+
+            lock.lock();
+            return spillLocations;
+        }
+        finally {
+            lock.unlock();
+        }
+    }
+
+    /**
+     * Frees any resources held by this BlockSpiller.
+     *
+     * @see BlockSpiller
+     */
+    public void close()
+    {
+        if (asyncSpillPool == null) {
+            return;
+        }
+
+        asyncSpillPool.shutdown();
+        try {
+            if (!asyncSpillPool.awaitTermination(ASYNC_SHUTDOWN_MILLIS, TimeUnit.MILLISECONDS)) {
+                asyncSpillPool.shutdownNow();
+            }
+        }
+        catch (InterruptedException e) {
+            Thread.currentThread().interrupt();
+            asyncSpillPool.shutdownNow();
+        }
+    }
+
+    /**
+     * Writes (aka spills) a Block.
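+     * The Block is encrypted with the configured BlockCrypto (AES-GCM, or a no-op when no key is set)
+     * and then uploaded to the next monotonically increasing spill location, see makeSpillLocation().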
+     */
+    protected SpillLocation write(Block block)
+    {
+        try {
+            S3SpillLocation spillLocation = makeSpillLocation();
+            EncryptionKey encryptionKey = spillConfig.getEncryptionKey();
+
+            logger.info("write: Started encrypting block for write to {}", spillLocation);
+            byte[] bytes = blockCrypto.encrypt(encryptionKey, block);
+
+            logger.info("write: Started spilling block of size {} bytes", bytes.length);
+            amazonS3.putObject(spillLocation.getBucket(),
+                    spillLocation.getKey(),
+                    new ByteArrayInputStream(bytes),
+                    new ObjectMetadata());
+            logger.info("write: Completed spilling block of size {} bytes", bytes.length);
+
+            return spillLocation;
+        }
+        catch (RuntimeException ex) {
+            asyncException.compareAndSet(null, ex);
+            logger.warn("write: Encountered error while writing block.", ex);
+            throw ex;
+        }
+    }
+
+    /**
+     * Reads a spilled block.
+     *
+     * @param spillLocation The location to read the spilled Block from.
+     * @param key The encryption key to use when reading the spilled Block.
+     * @param schema The Schema to use when deserializing the spilled Block.
+     * @return The Block stored at the spill location.
+     */
+    protected Block read(S3SpillLocation spillLocation, EncryptionKey key, Schema schema)
+    {
+        try {
+            logger.debug("read: Started reading block from S3");
+            S3Object fullObject = amazonS3.getObject(spillLocation.getBucket(), spillLocation.getKey());
+            logger.debug("read: Completed reading block from S3");
+            Block block = blockCrypto.decrypt(key, ByteStreams.toByteArray(fullObject.getObjectContent()), schema);
+            logger.debug("read: Completed decrypting block");
+            return block;
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Spills a block, potentially asynchronously depending on the settings.
+     *
+     * @param block The Block to spill.
+     */
+    private void spillBlock(Block block)
+    {
+        if (asyncSpillPool != null) {
+            //We use the read lock here because we want to allow these in parallel, it's a bit counterintuitive
+            Lock lock = spillLock.readLock();
+            try {
+                //We lock before going async but unlock after spilling in the async thread, this makes it easy to use
+                //the ReadWrite lock to tell if all spills are completed without killing the thread pool.
+                lock.lock();
+                asyncSpillPool.submit(() -> {
+                    try {
+                        SpillLocation spillLocation = write(block);
+                        spillLocations.add(spillLocation);
+                        //Free the memory from the previous block since it has been spilled
+                        safeClose(block);
+                    }
+                    finally {
+                        lock.unlock();
+                    }
+                });
+            }
+            catch (Exception ex) {
+                //If we hit an exception, make sure we unlock to avoid a deadlock before throwing.
+                lock.unlock();
+                throw ex;
+            }
+        }
+        else {
+            SpillLocation spillLocation = write(block);
+            spillLocations.add(spillLocation);
+            safeClose(block);
+        }
+    }
+
+    /**
+     * Ensures that the initial Block is initialized.
+     */
+    private void ensureInit()
+    {
+        if (inProgressBlock.get() == null) {
+            //Create the initial block
+            inProgressBlock.set(this.allocator.createBlock(this.schema));
+            inProgressBlock.get().constrain(constraintEvaluator);
+        }
+    }
+
+    /**
+     * This needs to be thread safe and generate locations in a format of:
+     * location.0
+     * location.1
+     * location.2
+     *

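+     * i.e. the Split's spill key with a monotonically increasing numeric suffix appended.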
+     * The read engine may elect to exploit this naming convention to speed up the pipelining of
+     * reads while the spiller is still writing. Violating this convention may reduce performance
+     * or increase calls to S3.
+     */
+    private S3SpillLocation makeSpillLocation()
+    {
+        S3SpillLocation splitSpillLocation = (S3SpillLocation) spillConfig.getSpillLocation();
+        if (!splitSpillLocation.isDirectory()) {
+            throw new RuntimeException("Split's SpillLocation must be a directory because multiple blocks may be spilled.");
+        }
+        String blockKey = splitSpillLocation.getKey() + "." + spillNumber.getAndIncrement();
+        return new S3SpillLocation(splitSpillLocation.getBucket(), blockKey, false);
+    }
+
+    /**
+     * Closes the supplied AutoCloseable and remaps any exceptions to RuntimeException.
+     *
+     * @param block The Block to close.
+     */
+    private void safeClose(AutoCloseable block)
+    {
+        try {
+            block.close();
+        }
+        catch (Exception ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaAware.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaAware.java
new file mode 100644
index 0000000000..d688d1733a
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaAware.java
@@ -0,0 +1,71 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.List;
+import java.util.Map;
+
+/**
+ * Defines a component that is aware of Apache Arrow Schema.
+ */
+public abstract class SchemaAware
+{
+    /**
+     * Provides access to the Schema object.
+     *
+     * @return The Schema currently being used by this object.
+     */
+    protected abstract Schema internalGetSchema();
+
+    /**
+     * Provides access to the Fields on the Schema currently being used by this Object.
+     *
+     * @return The list of fields.
+     */
+    public List<Field> getFields()
+    {
+        return internalGetSchema().getFields();
+    }
+
+    /**
+     * Provides access to metadata stored on the Schema currently being used by this Object.
+     *
+     * @param key The metadata key to lookup.
+     * @return The value associated with that key in the Schema's metadata, null if no such key exists.
+     */
+    public String getMetaData(String key)
+    {
+        return internalGetSchema().getCustomMetadata().get(key);
+    }
+
+    /**
+     * Provides access to all available metadata on the Schema.
+     *
+     * @return All metadata key-value pairs as a map.
+     */
+    public Map<String, String> getMetaData()
+    {
+        return internalGetSchema().getCustomMetadata();
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaBuilder.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaBuilder.java
new file mode 100644
index 0000000000..f1892ebccd
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaBuilder.java
@@ -0,0 +1,304 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.google.common.collect.ImmutableList;
+import com.google.common.collect.ImmutableMap;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+/**
+ * Convenience builder that can be used to create new Apache Arrow Schema for common
+ * types more easily than alternative methods of construction, especially for complex types.
+ */
+public class SchemaBuilder
+{
+    private final ImmutableList.Builder<Field> fields = ImmutableList.builder();
+    private final ImmutableMap.Builder<String, String> metadata = ImmutableMap.builder();
+    private final Map<String, FieldBuilder> nestedFieldBuilderMap = new HashMap<>();
+
+    public SchemaBuilder addField(Field field)
+    {
+        fields.add(field);
+        return this;
+    }
+
+    /**
+     * Adds a new Field with the provided details to the Schema as a top-level Field with no children.
+     *
+     * @param fieldName The name of the field to add.
+     * @param type The type of the field to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addField(String fieldName, ArrowType type)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(type), null));
+        return this;
+    }
+
+    /**
+     * Adds a new Field with the provided details to the Schema as a top-level Field.
+     *
+     * @param fieldName The name of the field to add.
+     * @param type The type of the field to add.
+     * @param children The list of child fields to add to the new Field.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addField(String fieldName, ArrowType type, List<Field> children)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(type), children));
+        return this;
+    }
+
+    /**
+     * Adds a new STRUCT Field to the Schema as a top-level Field.
+     *
+     * @param fieldName The name of the field to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addStructField(String fieldName)
+    {
+        nestedFieldBuilderMap.put(fieldName, FieldBuilder.newBuilder(fieldName, Types.MinorType.STRUCT.getType()));
+        return this;
+    }
+
+    /**
+     * Adds a new LIST Field to the Schema as a top-level Field.
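+     * The list's values are represented as a single, unnamed child Field of the given type.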
+ * + * @param fieldName The name of the field to add. + * @param type The concrete type of the values that are held within the list. + * @return This SchemaBuilder itself. + */ + public SchemaBuilder addListField(String fieldName, ArrowType type) + { + fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.LIST.getType()), + Collections.singletonList(new Field("", FieldType.nullable(type), null)))); + return this; + } + + /** + * Adds a new Field as a child of the requested top-level parent field. + * + * @param parent The name of the pre-existing top-level parent field to add a child field to. + * @param child The name of the new child field. + * @param type The type of the new child field to add. + * @return This SchemaBuilder itself. + * @note For more complex nesting, please use FieldBuilder. + */ + public SchemaBuilder addChildField(String parent, String child, ArrowType type) + { + nestedFieldBuilderMap.get(parent).addField(child, type, null); + return this; + } + + /** + * Adds a new Field as a child of the requested top-level parent field. + * + * @param parent The name of the pre-existing top-level parent field to add a child field to. + * @param child The child field to add to the parent. + * @return This SchemaBuilder itself. + * @note For more complex nesting, please use FieldBuilder. + */ + public SchemaBuilder addChildField(String parent, Field child) + { + nestedFieldBuilderMap.get(parent).addField(child); + return this; + } + + /** + * Adds a new VARCHAR Field to the Schema as a top-level Field with no children. + * + * @param fieldName The name of the field to add. + * @return This SchemaBuilder itself. + */ + public SchemaBuilder addStringField(String fieldName) + { + fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.VARCHAR.getType()), null)); + return this; + } + + /** + * Adds a new INT Field to the Schema as a top-level Field with no children. + * + * @param fieldName The name of the field to add. + * @return This SchemaBuilder itself. + */ + public SchemaBuilder addIntField(String fieldName) + { + fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.INT.getType()), null)); + return this; + } + + /** + * Adds a new TINYINT Field to the Schema as a top-level Field with no children. + * + * @param fieldName The name of the field to add. + * @return This SchemaBuilder itself. + */ + public SchemaBuilder addTinyIntField(String fieldName) + { + fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.TINYINT.getType()), null)); + return this; + } + + /** + * Adds a new SMALLINT Field to the Schema as a top-level Field with no children. + * + * @param fieldName The name of the field to add. + * @return This SchemaBuilder itself. + */ + public SchemaBuilder addSmallIntField(String fieldName) + { + fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.SMALLINT.getType()), null)); + return this; + } + + /** + * Adds a new FLOAT8 Field to the Schema as a top-level Field with no children. + * + * @param fieldName The name of the field to add. + * @return This SchemaBuilder itself. + */ + public SchemaBuilder addFloat8Field(String fieldName) + { + fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.FLOAT8.getType()), null)); + return this; + } + + /** + * Adds a new FLOAT4 Field to the Schema as a top-level Field with no children. + * + * @param fieldName The name of the field to add. + * @return This SchemaBuilder itself. 
+     */
+    public SchemaBuilder addFloat4Field(String fieldName)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.FLOAT4.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds a new BIGINT Field to the Schema as a top-level Field with no children.
+     *
+     * @param fieldName The name of the field to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addBigIntField(String fieldName)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.BIGINT.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds a new BIT Field to the Schema as a top-level Field with no children.
+     *
+     * @param fieldName The name of the field to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addBitField(String fieldName)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.BIT.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds a new DECIMAL Field to the Schema as a top-level Field with no children.
+     *
+     * @param fieldName The name of the field to add.
+     * @param precision The precision to use for the new decimal field.
+     * @param scale The scale to use for the new decimal field.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addDecimalField(String fieldName, int precision, int scale)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(new ArrowType.Decimal(precision, scale)), null));
+        return this;
+    }
+
+    /**
+     * Adds a new DateDay Field to the Schema as a top-level Field with no children.
+     *
+     * @param fieldName The name of the field to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addDateDayField(String fieldName)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.DATEDAY.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds a new DateMilli Field to the Schema as a top-level Field with no children.
+     *
+     * @param fieldName The name of the field to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addDateMilliField(String fieldName)
+    {
+        fields.add(new Field(fieldName, FieldType.nullable(Types.MinorType.DATEMILLI.getType()), null));
+        return this;
+    }
+
+    /**
+     * Adds the provided metadata to the Schema.
+     *
+     * @param key The key of the metadata to add.
+     * @param value The value of the metadata to add.
+     * @return This SchemaBuilder itself.
+     */
+    public SchemaBuilder addMetadata(String key, String value)
+    {
+        metadata.put(key, value);
+        return this;
+    }
+
+    /**
+     * Creates a new SchemaBuilder.
+     *
+     * @return A new SchemaBuilder.
+     */
+    public static SchemaBuilder newBuilder()
+    {
+        return new SchemaBuilder();
+    }
+
+    /**
+     * Builds an Apache Arrow Schema from the collected metadata and fields.
+     *
+     * @return A new Apache Arrow Schema.
+     * @note Attempting to reuse this builder will have unexpected side effects.
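+     * <p>
+     * For illustration only (hypothetical field names), a typical construction might be:
+     * <pre>{@code
+     * Schema schema = SchemaBuilder.newBuilder()
+     *         .addStringField("name")
+     *         .addIntField("year")
+     *         .addMetadata("source", "example")
+     *         .build();
+     * }</pre>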
+     */
+    public Schema build()
+    {
+        for (FieldBuilder next : nestedFieldBuilderMap.values()) {
+            fields.add(next.build());
+        }
+        return new Schema(fields.build(), metadata.build());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaSerDe.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaSerDe.java
new file mode 100644
index 0000000000..0b81a61157
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SchemaSerDe.java
@@ -0,0 +1,64 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.vector.ipc.ReadChannel;
+import org.apache.arrow.vector.ipc.WriteChannel;
+import org.apache.arrow.vector.ipc.message.MessageSerializer;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+import java.nio.channels.Channels;
+
+/**
+ * Used to serialize and deserialize Apache Arrow Schema objects.
+ */
+public class SchemaSerDe
+{
+    /**
+     * Serializes the provided Schema to the provided OutputStream.
+     *
+     * @param schema The Schema to serialize.
+     * @param out The OutputStream to write to.
+     * @throws IOException
+     */
+    public void serialize(Schema schema, OutputStream out)
+            throws IOException
+    {
+        MessageSerializer.serialize(new WriteChannel(Channels.newChannel(out)), schema);
+    }
+
+    /**
+     * Attempts to deserialize a Schema from the provided InputStream.
+     *
+     * @param in The InputStream that is expected to contain a serialized Schema.
+     * @return The resulting Schema if the InputStream contains a valid Schema.
+     * @throws IOException
+     * @note This method does _not_ close the input stream and also reads the InputStream to the end.
+     */
+    public Schema deserialize(InputStream in)
+            throws IOException
+    {
+        return MessageSerializer.deserializeSchema(new ReadChannel(Channels.newChannel(in)));
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SimpleBlockWriter.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SimpleBlockWriter.java
new file mode 100644
index 0000000000..e8d38128d4
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SimpleBlockWriter.java
@@ -0,0 +1,71 @@
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connector.lambda.data;
+
+import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator;
+
+/**
+ * Used to write a single Block using the BlockWriter programming model.
+ *
+ * @see BlockWriter
+ */
+public class SimpleBlockWriter
+        implements BlockWriter
+{
+    private final Block block;
+
+    /**
+     * Basic constructor using a pre-allocated Block.
+     *
+     * @param block The Block to write into.
+     */
+    public SimpleBlockWriter(Block block)
+    {
+        this.block = block;
+    }
+
+    /**
+     * Used to write rows into the Block that is managed by this BlockWriter.
+     *
+     * @param rowWriter The RowWriter that the BlockWriter should use to write rows into the Block(s) it is managing.
+     * @see BlockWriter
+     */
+    public void writeRows(BlockWriter.RowWriter rowWriter)
+    {
+        int rowCount = block.getRowCount();
+
+        int rows = rowWriter.writeRows(block, rowCount);
+
+        if (rows > 0) {
+            block.setRowCount(rowCount + rows);
+        }
+    }
+
+    /**
+     * Provides access to the ConstraintEvaluator that will be applied to the generated Blocks.
+     *
+     * @see BlockWriter
+     */
+    @Override
+    public ConstraintEvaluator getConstraintEvaluator()
+    {
+        return block.getConstraintEvaluator();
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SpillConfig.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SpillConfig.java
new file mode 100644
index 0000000000..21ad0f7d5e
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/SpillConfig.java
@@ -0,0 +1,180 @@
+package com.amazonaws.athena.connector.lambda.data;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKey;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * Used to configure Spill functionality.
+ */
+public class SpillConfig
+{
+    //The default number of threads to use for async spill operations. 0 indicates that the calling thread should be used.
+    private static final int DEFAULT_SPILL_THREADS = 1;
+    //The encryption key that should be used to read/write spilled data. If null, encryption is disabled.
+    private final EncryptionKey encryptionKey;
+    //The location where the data is spilled.
+    private final SpillLocation spillLocation;
+    //The id of the request.
+ private final String requestId; + //The max bytes that can be in a single Block. + private final long maxBlockBytes; + //The max bytes that can be in an inline (non-spilled) Block. + private final long maxInlineBlockSize; + //The default number of threads to use for async spill operations. 0 indicates that the calling thread should be used. + private final int numSpillThreads; + + private SpillConfig(Builder builder) + { + encryptionKey = builder.encryptionKey; + spillLocation = requireNonNull(builder.spillLocation, "spillLocation was null"); + requestId = requireNonNull(builder.requestId, "requestId was null"); + maxBlockBytes = builder.maxBlockBytes; + maxInlineBlockSize = builder.maxInlineBlockSize; + numSpillThreads = builder.numSpillThreads; + } + + /** + * Gets the Encryption key to use when reading/writing data to the spill location. + * + * @return The EncryptionKey. + * @note If null, spill encryption is disabled. + */ + public EncryptionKey getEncryptionKey() + { + return encryptionKey; + } + + /** + * Gets the SpillLocation, if spill is enabled. + * + * @return The SpillLocation + */ + public SpillLocation getSpillLocation() + { + return spillLocation; + } + + /** + * Gets the request Id, typically the Athena query ID. + * @return The request id. + */ + public String getRequestId() + { + return requestId; + } + + /** + * Gets max number of bytes a spilled Block can contain. + * @return The number of bytes. + */ + public long getMaxBlockBytes() + { + return maxBlockBytes; + } + + /** + * Gets max number of bytes an inline Block can contain. + * @return The number of bytes. + */ + public long getMaxInlineBlockSize() + { + return maxInlineBlockSize; + } + + /** + * Gets the number of threads the BlockSpiller can use. + * @return The number of threads. 
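+     *         A value of 0 or less disables the async spill pool, and spills happen on the calling thread.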
+     */
+    public int getNumSpillThreads()
+    {
+        return numSpillThreads;
+    }
+
+    public static Builder newBuilder()
+    {
+        return new Builder();
+    }
+
+    public static Builder newBuilder(SpillConfig copy)
+    {
+        Builder builder = new Builder();
+        builder.encryptionKey = copy.getEncryptionKey();
+        builder.requestId = copy.getRequestId();
+        builder.spillLocation = copy.getSpillLocation();
+        builder.maxBlockBytes = copy.getMaxBlockBytes();
+        builder.maxInlineBlockSize = copy.getMaxInlineBlockSize();
+        builder.numSpillThreads = copy.getNumSpillThreads();
+        return builder;
+    }
+
+    public static final class Builder
+    {
+        private EncryptionKey encryptionKey;
+        private String requestId;
+        private SpillLocation spillLocation;
+        private long maxBlockBytes;
+        private long maxInlineBlockSize;
+        private int numSpillThreads = DEFAULT_SPILL_THREADS;
+
+        private Builder() {}
+
+        public Builder withEncryptionKey(EncryptionKey val)
+        {
+            encryptionKey = val;
+            return this;
+        }
+
+        public Builder withRequestId(String val)
+        {
+            requestId = val;
+            return this;
+        }
+
+        public Builder withSpillLocation(SpillLocation val)
+        {
+            spillLocation = val;
+            return this;
+        }
+
+        public Builder withNumSpillThreads(int val)
+        {
+            numSpillThreads = val;
+            return this;
+        }
+
+        public Builder withMaxBlockBytes(long val)
+        {
+            maxBlockBytes = val;
+            return this;
+        }
+
+        public Builder withMaxInlineBlockBytes(long val)
+        {
+            maxInlineBlockSize = val;
+            return this;
+        }
+
+        public SpillConfig build()
+        {
+            return new SpillConfig(this);
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ArrowValueProjector.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ArrowValueProjector.java
new file mode 100644
index 0000000000..61d5185480
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ArrowValueProjector.java
@@ -0,0 +1,36 @@
+package com.amazonaws.athena.connector.lambda.data.projectors;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+/**
+ * Implementations of this interface are expected to project Arrow data into Java objects. An implementation
+ * is expected to take in a FieldReader during initialization. Each call of {@link #project(int) project}
+ * projects one Arrow datum to a Java object.
+ */
+public interface ArrowValueProjector
+{
+    /**
+     * Projects an Arrow datum into a matching Java object.
+     *
+     * @param pos the position/row to project
+     * @return The corresponding Java object matching the Arrow datum.
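+     *         For example, a projector over a VARCHAR field returns a {@code String}, while a projector
+     *         over a LIST field returns a {@code java.util.List} of projected values (or null if unset).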
+     */
+    Object project(int pos);
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ArrowValueProjectorImpl.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ArrowValueProjectorImpl.java
new file mode 100644
index 0000000000..4bfac587d5
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ArrowValueProjectorImpl.java
@@ -0,0 +1,133 @@
+package com.amazonaws.athena.connector.lambda.data.projectors;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.util.Text;
+
+import java.time.Instant;
+import java.time.LocalDate;
+import java.util.Objects;
+
+/**
+ * Abstract class that shares common logic to create the {@link ArrowValueProjectorImpl.Projection Projection} instance.
+ * {@link ArrowValueProjectorImpl.Projection}'s implementation is decided at runtime based on the
+ * input {@link Types.MinorType Arrow minor type}.
+ */
+public abstract class ArrowValueProjectorImpl
+        implements ArrowValueProjector
+{
+    /**
+     * Concrete implementations of ArrowValueProjectorImpl should invoke this method to get the Projection instance.
+     *
+     * @param minorType The Arrow minor type of the field being projected.
+     * @return Projection used by child class to do actual projection work.
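+     *         LIST and STRUCT types are handled by delegating to a nested List/Struct projector;
+     *         all other supported minor types map to a simple read from the FieldReader.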
+ */ + protected Projection createValueProjection(Types.MinorType minorType) + { + switch (minorType) { + case LIST: + case STRUCT: + return createComplexValueProjection(minorType); + default: + return createSimpleValueProjection(minorType); + } + } + + private Projection createSimpleValueProjection(Types.MinorType minorType) + { + switch (minorType) { + case DATEMILLI: + return (fieldReader) -> { + if (Objects.isNull(fieldReader.readLocalDateTime())) { + return null; + } + long millis = fieldReader.readLocalDateTime().toDateTime(org.joda.time.DateTimeZone.UTC).getMillis(); + return Instant.ofEpochMilli(millis).atZone(BlockUtils.UTC_ZONE_ID).toLocalDateTime(); + }; + case TINYINT: + case UINT1: + return (fieldReader) -> fieldReader.readByte(); + case UINT2: + return (fieldReader) -> fieldReader.readCharacter(); + case SMALLINT: + return (fieldReader) -> fieldReader.readShort(); + case DATEDAY: + return (fieldReader) -> { + Integer intVal = fieldReader.readInteger(); + if (Objects.isNull(intVal)) { + return null; + } + return LocalDate.ofEpochDay(intVal); + }; + case INT: + case UINT4: + return (fieldReader) -> fieldReader.readInteger(); + case UINT8: + case BIGINT: + return (fieldReader) -> fieldReader.readLong(); + case DECIMAL: + return (fieldReader) -> fieldReader.readBigDecimal(); + case FLOAT4: + return (fieldReader) -> fieldReader.readFloat(); + case FLOAT8: + return (fieldReader) -> fieldReader.readDouble(); + case VARCHAR: + return (fieldReader) -> { + Text text = fieldReader.readText(); + if (Objects.isNull(text)) { + return null; + } + return text.toString(); + }; + case VARBINARY: + return (fieldReader) -> fieldReader.readByteArray(); + case BIT: + return (fieldReader) -> fieldReader.readBoolean(); + default: + throw new IllegalArgumentException("Unsupported type " + minorType); + } + } + + private Projection createComplexValueProjection(Types.MinorType minorType) + { + switch (minorType) { + case LIST: + return (fieldReader) -> { + ListArrowValueProjector subListProjector = new ListArrowValueProjector(fieldReader); + return subListProjector.doProject(); + }; + case STRUCT: + return (fieldReader) -> { + StructArrowValueProjector subStructProjector = new StructArrowValueProjector(fieldReader); + return subStructProjector.doProject(); + }; + default: + throw new IllegalArgumentException("Unsupported type " + minorType); + } + } + + interface Projection + { + Object doProjection(FieldReader fieldReader); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ListArrowValueProjector.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ListArrowValueProjector.java new file mode 100644 index 0000000000..ba788598a1 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ListArrowValueProjector.java @@ -0,0 +1,78 @@ +package com.amazonaws.athena.connector.lambda.data.projectors; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static java.util.Objects.requireNonNull;
+
+public class ListArrowValueProjector
+        extends ArrowValueProjectorImpl
+{
+    private final FieldReader listReader;
+    private final Projection projection;
+
+    public ListArrowValueProjector(FieldReader listReader)
+    {
+        this.listReader = requireNonNull(listReader, "listReader is null");
+
+        List<Field> children = listReader.getField().getChildren();
+        if (children.size() != 1) {
+            throw new RuntimeException("Unexpected number of children for ListProjector field " +
+                    listReader.getField().getName());
+        }
+        Types.MinorType minorType = Types.getMinorTypeForArrowType(children.get(0).getType());
+        projection = createValueProjection(minorType);
+    }
+
+    @Override
+    public Object project(int pos)
+    {
+        listReader.setPosition(pos);
+        if (!listReader.isSet()) {
+            return null;
+        }
+
+        return doProject();
+    }
+
+    protected Object doProject()
+    {
+        List<Object> list = new ArrayList<>();
+
+        while (listReader.next()) {
+            FieldReader subReader = listReader.reader(); // same reader with different idx
+            if (!subReader.isSet()) {
+                list.add(null);
+                continue;
+            }
+
+            Object value = projection.doProjection(subReader);
+            list.add(value);
+        }
+        return list;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ProjectorUtils.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ProjectorUtils.java
new file mode 100644
index 0000000000..391f3fd94a
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/ProjectorUtils.java
@@ -0,0 +1,42 @@
+package com.amazonaws.athena.connector.lambda.data.projectors;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ + +import org.apache.arrow.vector.complex.reader.FieldReader; + +public class ProjectorUtils +{ + private ProjectorUtils() + { + } + + public static ArrowValueProjector createArrowValueProjector(FieldReader fieldReader) + { + switch (fieldReader.getMinorType()) { + case LIST: + return new ListArrowValueProjector(fieldReader); + case STRUCT: + return new StructArrowValueProjector(fieldReader); + default: + return new SimpleArrowValueProjector(fieldReader); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/SimpleArrowValueProjector.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/SimpleArrowValueProjector.java new file mode 100644 index 0000000000..91f74e193d --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/SimpleArrowValueProjector.java @@ -0,0 +1,45 @@ +package com.amazonaws.athena.connector.lambda.data.projectors; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import org.apache.arrow.vector.complex.reader.FieldReader; + +import static java.util.Objects.requireNonNull; + +public class SimpleArrowValueProjector + extends ArrowValueProjectorImpl +{ + private final Projection projection; + private final FieldReader fieldReader; + + public SimpleArrowValueProjector(FieldReader fieldReader) + { + this.fieldReader = requireNonNull(fieldReader, "fieldReader is null"); + this.projection = createValueProjection(fieldReader.getMinorType()); + } + + @Override + public Object project(int pos) + { + fieldReader.setPosition(pos); + return projection.doProjection(fieldReader); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/StructArrowValueProjector.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/StructArrowValueProjector.java new file mode 100644 index 0000000000..3374ada54e --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/projectors/StructArrowValueProjector.java @@ -0,0 +1,81 @@ +package com.amazonaws.athena.connector.lambda.data.projectors; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.google.common.collect.ImmutableMap;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static java.util.Objects.requireNonNull;
+
+public class StructArrowValueProjector
+        extends ArrowValueProjectorImpl
+{
+    private final Map<String, Projection> projectionsMap;
+    private final FieldReader structReader;
+
+    public StructArrowValueProjector(FieldReader structReader)
+    {
+        this.structReader = requireNonNull(structReader, "structReader is null");
+
+        ImmutableMap.Builder<String, Projection> projectionMapBuilder = ImmutableMap.builder();
+
+        List<Field> children = structReader.getField().getChildren();
+
+        for (Field child : children) {
+            String childName = child.getName();
+            Types.MinorType minorType = Types.getMinorTypeForArrowType(child.getType());
+            Projection projection = createValueProjection(minorType);
+            projectionMapBuilder.put(childName, projection);
+        }
+
+        this.projectionsMap = projectionMapBuilder.build();
+    }
+
+    @Override
+    public Object project(int pos)
+    {
+        structReader.setPosition(pos);
+        if (!structReader.isSet()) {
+            return null;
+        }
+
+        return doProject();
+    }
+
+    protected Map<String, Object> doProject()
+    {
+        List<Field> fields = structReader.getField().getChildren();
+        Map<String, Object> nameToValues = new HashMap<>();
+        for (Field child : fields) {
+            String childName = child.getName();
+            FieldReader subReader = structReader.reader(childName);
+            Projection childProjection = projectionsMap.get(childName);
+            nameToValues.put(childName, childProjection.doProjection(subReader));
+        }
+        return nameToValues;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/ArrowValueWriter.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/ArrowValueWriter.java
new file mode 100644
index 0000000000..e368ccd076
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/ArrowValueWriter.java
@@ -0,0 +1,36 @@
+package com.amazonaws.athena.connector.lambda.data.writers;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+/**
+ * Implementations of this interface are expected to write Java values into Arrow vectors. An implementation
+ * is expected to take in a vector during initialization. Each call of {@link #write(int, Object) write}
+ * writes one Java value into the Arrow vector.
+ */
+public interface ArrowValueWriter
+{
+    /**
+     * Writes the Java value into Arrow's vector.
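+     * For illustration only (hypothetical {@code vector} variable), using the SimpleArrowValueWriter
+     * defined later in this module:
+     * <pre>{@code
+     * ArrowValueWriter writer = new SimpleArrowValueWriter(vector); //vector is a VarCharVector
+     * writer.write(0, "hello");
+     * writer.write(1, null); //nulls are written as null values
+     * }</pre>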
+     *
+     * @param pos the position/row to write to
+     * @param value the original Java value to be written
+     */
+    void write(int pos, Object value);
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/ComplexArrowValueWriter.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/ComplexArrowValueWriter.java
new file mode 100644
index 0000000000..3a4776269c
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/ComplexArrowValueWriter.java
@@ -0,0 +1,47 @@
+package com.amazonaws.athena.connector.lambda.data.writers;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import org.apache.arrow.vector.FieldVector;
+
+import static java.util.Objects.requireNonNull;
+
+public class ComplexArrowValueWriter
+        implements ArrowValueWriter
+{
+    private final FieldResolver resolver;
+    private final FieldVector fieldVector;
+
+    public ComplexArrowValueWriter(FieldVector fieldVector, FieldResolver resolver)
+    {
+        this.resolver = requireNonNull(resolver, "resolver is null");
+        this.fieldVector = requireNonNull(fieldVector, "fieldVector is null");
+    }
+
+    @Override
+    public void write(int pos, Object value)
+    {
+        //todo: use projection pattern
+        BlockUtils.setComplexValue(fieldVector, pos, resolver, value);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/SimpleArrowValueWriter.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/SimpleArrowValueWriter.java
new file mode 100644
index 0000000000..73d46748bd
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/SimpleArrowValueWriter.java
@@ -0,0 +1,312 @@
+package com.amazonaws.athena.connector.lambda.data.writers;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import org.apache.arrow.vector.BigIntVector;
+import org.apache.arrow.vector.BitVector;
+import org.apache.arrow.vector.DateDayVector;
+import org.apache.arrow.vector.DateMilliVector;
+import org.apache.arrow.vector.DecimalVector;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.Float4Vector;
+import org.apache.arrow.vector.Float8Vector;
+import org.apache.arrow.vector.IntVector;
+import org.apache.arrow.vector.SmallIntVector;
+import org.apache.arrow.vector.TinyIntVector;
+import org.apache.arrow.vector.UInt1Vector;
+import org.apache.arrow.vector.UInt2Vector;
+import org.apache.arrow.vector.UInt4Vector;
+import org.apache.arrow.vector.UInt8Vector;
+import org.apache.arrow.vector.VarBinaryVector;
+import org.apache.arrow.vector.VarCharVector;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.util.Text;
+import org.apache.commons.codec.Charsets;
+
+import java.math.BigDecimal;
+import java.math.RoundingMode;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.Date;
+
+import static com.amazonaws.athena.connector.lambda.data.BlockUtils.UTC_ZONE_ID;
+import static java.util.Objects.requireNonNull;
+
+public class SimpleArrowValueWriter
+        implements ArrowValueWriter
+{
+    private final Writer writer;
+    private final FieldVector fieldVector;
+
+    public SimpleArrowValueWriter(FieldVector fieldVector)
+    {
+        this.fieldVector = requireNonNull(fieldVector, "fieldVector is null");
+        this.writer = createWriter(fieldVector.getMinorType());
+    }
+
+    private Writer createWriter(Types.MinorType minorType)
+    {
+        switch (minorType) {
+            case DATEMILLI:
+                return (vector, pos, value) -> {
+                    DateMilliVector dateMilliVector = ((DateMilliVector) vector);
+                    if (value == null) {
+                        dateMilliVector.setNull(pos);
+                        return;
+                    }
+
+                    if (value instanceof Date) {
+                        dateMilliVector.setSafe(pos, ((Date) value).getTime());
+                    }
+                    else if (value instanceof LocalDateTime) {
+                        dateMilliVector.setSafe(
+                                pos,
+                                ((LocalDateTime) value).atZone(UTC_ZONE_ID).toInstant().toEpochMilli());
+                    }
+                    else {
+                        dateMilliVector.setSafe(pos, (long) value);
+                    }
+                };
+            case DATEDAY:
+                return (vector, pos, value) -> {
+                    DateDayVector dateDayVector = ((DateDayVector) vector);
+                    if (value == null) {
+                        dateDayVector.setNull(pos);
+                        return;
+                    }
+
+                    if (value instanceof Date) {
+                        org.joda.time.Days days = org.joda.time.Days.daysBetween(BlockUtils.EPOCH,
+                                new org.joda.time.DateTime(((Date) value).getTime()));
+                        dateDayVector.setSafe(pos, days.getDays());
+                    }
+                    else if (value instanceof LocalDate) {
+                        //else-if so that a Date value does not fall through to the casts below
+                        int days = (int) ((LocalDate) value).toEpochDay();
+                        dateDayVector.setSafe(pos, days);
+                    }
+                    else if (value instanceof Long) {
+                        dateDayVector.setSafe(pos, ((Long) value).intValue());
+                    }
+                    else {
+                        dateDayVector.setSafe(pos, (int) value);
+                    }
+                };
+            case FLOAT8:
+                return (vector, pos, value) -> {
+                    Float8Vector float8Vector = ((Float8Vector) vector);
+                    if (value == null) {
+                        float8Vector.setNull(pos);
+                        return;
+                    }
+
+                    float8Vector.setSafe(pos, (double) value);
+                };
+            case FLOAT4:
+                return (vector, pos, value) -> {
+                    Float4Vector float4Vector = ((Float4Vector) vector);
+                    if (value == null) {
+                        float4Vector.setNull(pos);
+                        return;
+                    }
+
+                    float4Vector.setSafe(pos, (float) value);
+                };
+            case INT:
+                return (vector, pos, value) -> {
+                    IntVector intVector = ((IntVector) vector);
+                    if (value == null) {
+                        intVector.setNull(pos);
+                        return;
+                    }
+
+                    if (value instanceof Long) {
//This may seem odd at first but many frameworks (like Presto) use long as the preferred + //native java type for representing integers. We do this to keep type conversions simple. + intVector.setSafe(pos, ((Long) value).intValue()); + } + else { + intVector.setSafe(pos, (int) value); + } + }; + case TINYINT: + return (vector, pos, value) -> { + TinyIntVector tinyIntVector = ((TinyIntVector) vector); + if (value == null) { + tinyIntVector.setNull(pos); + return; + } + + if (value instanceof Byte) { + tinyIntVector.setSafe(pos, (byte) value); + } + else { + tinyIntVector.setSafe(pos, (int) value); + } + }; + case SMALLINT: + return (vector, pos, value) -> { + SmallIntVector smallIntVector = ((SmallIntVector) vector); + if (value == null) { + smallIntVector.setNull(pos); + return; + } + + if (value instanceof Short) { + smallIntVector.setSafe(pos, (short) value); + } + else { + smallIntVector.setSafe(pos, (int) value); + } + }; + case UINT1: + return (vector, pos, value) -> { + UInt1Vector uInt1Vector = ((UInt1Vector) vector); + if (value == null) { + uInt1Vector.setNull(pos); + return; + } + + ((UInt1Vector) vector).setSafe(pos, (int) value); + }; + case UINT2: + return (vector, pos, value) -> { + UInt2Vector uInt2Vector = ((UInt2Vector) vector); + if (value == null) { + uInt2Vector.setNull(pos); + return; + } + + ((UInt2Vector) vector).setSafe(pos, (int) value); + }; + case UINT4: + return (vector, pos, value) -> { + UInt4Vector uInt4Vector = ((UInt4Vector) vector); + if (value == null) { + uInt4Vector.setNull(pos); + return; + } + + ((UInt4Vector) vector).setSafe(pos, (int) value); + }; + case UINT8: + return (vector, pos, value) -> { + UInt8Vector uInt8Vector = ((UInt8Vector) vector); + if (value == null) { + uInt8Vector.setNull(pos); + return; + } + + ((UInt8Vector) vector).setSafe(pos, (int) value); + }; + case BIGINT: + return (vector, pos, value) -> { + BigIntVector bigIntVector = ((BigIntVector) vector); + if (value == null) { + bigIntVector.setNull(pos); + return; + } + + ((BigIntVector) vector).setSafe(pos, (long) value); + }; + case VARBINARY: + return (vector, pos, value) -> { + VarBinaryVector varBinaryVector = ((VarBinaryVector) vector); + if (value == null) { + varBinaryVector.setNull(pos); + return; + } + + ((VarBinaryVector) vector).setSafe(pos, (byte[]) value); + }; + case DECIMAL: + return (vector, pos, value) -> { + DecimalVector dVector = ((DecimalVector) vector); + if (value == null) { + dVector.setNull(pos); + return; + } + + if (value instanceof Double) { + BigDecimal bdVal = new BigDecimal((double) value); + bdVal = bdVal.setScale(dVector.getScale(), RoundingMode.HALF_UP); + dVector.setSafe(pos, bdVal); + } + else { + BigDecimal scaledValue = ((BigDecimal) value).setScale(dVector.getScale(), RoundingMode.HALF_UP); + ((DecimalVector) vector).setSafe(pos, scaledValue); + } + }; + case VARCHAR: + return (vector, pos, value) -> { + VarCharVector varCharVector = ((VarCharVector) vector); + if (value == null) { + varCharVector.setNull(pos); + return; + } + + if (value instanceof String) { + varCharVector.setSafe(pos, ((String) value).getBytes(Charsets.UTF_8)); + } + else { + varCharVector.setSafe(pos, (Text) value); + } + }; + case BIT: + return (vector, pos, value) -> { + BitVector bitVector = ((BitVector) vector); + if (value == null) { + bitVector.setNull(pos); + return; + } + + if (value instanceof Integer && (int) value > 0) { + bitVector.setSafe(pos, 1); + } + else if (value instanceof Boolean && (boolean) value) { + bitVector.setSafe(pos, 1); + } + else { + 
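+                        //any other non-null value (e.g. a non-positive Integer or a false Boolean) is written as false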
bitVector.setSafe(pos, 0); + } + }; + default: + throw new IllegalArgumentException("Unknown type " + fieldVector.getMinorType()); + } + } + + @Override + public void write(int pos, Object value) + { + try { + writer.write(fieldVector, pos, value); + } + catch (RuntimeException ex) { + String fieldName = (fieldVector != null) ? fieldVector.getField().getName() : "null_vector"; + throw new RuntimeException("Unable to set value for field " + fieldName + " using value " + value, ex); + } + } + + interface Writer + { + void write(FieldVector vector, int pos, Object value); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/WriterUtils.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/WriterUtils.java new file mode 100644 index 0000000000..73cf5dbb2c --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/data/writers/WriterUtils.java @@ -0,0 +1,42 @@ +package com.amazonaws.athena.connector.lambda.data.writers; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.FieldResolver; +import org.apache.arrow.vector.FieldVector; + +public class WriterUtils +{ + private WriterUtils() + { + } + + public static ArrowValueWriter createArrowValueWriter(FieldVector fieldVector) + { + switch (fieldVector.getMinorType()) { + case LIST: + case STRUCT: + return new ComplexArrowValueWriter(fieldVector, FieldResolver.DEFAULT); + default: + return new SimpleArrowValueWriter(fieldVector); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/Split.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/Split.java new file mode 100644 index 0000000000..e6f0925df9 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/Split.java @@ -0,0 +1,241 @@ +package com.amazonaws.athena.connector.lambda.domain; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKey;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import java.beans.Transient;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+import java.util.Objects;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * A Split is best thought of as a unit of work that is part of a larger activity. For example, if we needed to
+ * read a 100GB table stored in S3, we might want to improve performance by parallelizing the reads of this large
+ * table by _splitting_ it up into 100MB pieces. You could think of each piece as a split. In general, Splits are
+ * opaque to Athena with the exception of the SpillLocation and EncryptionKey, which are used by Athena to find any
+ * data that was spilled by the processing of the split. All properties on the split are solely produced by and
+ * consumed by the connector.
+ */
+public class Split
+{
+    //The optional SpillLocation this Split can write to.
+    private final SpillLocation spillLocation;
+    //The optional EncryptionKey this Split can use to encrypt/decrypt data.
+    private final EncryptionKey encryptionKey;
+    //The properties that define what this split is meant to do.
+    private final Map<String, String> properties;
+
+    /**
+     * Basic constructor.
+     *
+     * @param spillLocation The optional SpillLocation this Split can write to.
+     * @param encryptionKey The optional EncryptionKey this Split can use to encrypt/decrypt data.
+     * @param properties The properties that define what this split is meant to do.
+     */
+    @JsonCreator
+    public Split(@JsonProperty("spillLocation") SpillLocation spillLocation,
+            @JsonProperty("encryptionKey") EncryptionKey encryptionKey,
+            @JsonProperty("properties") Map<String, String> properties)
+    {
+        requireNonNull(properties, "properties is null");
+        this.spillLocation = spillLocation;
+        this.encryptionKey = encryptionKey;
+        this.properties = Collections.unmodifiableMap(properties);
+    }
+
+    private Split(Builder builder)
+    {
+        this.properties = Collections.unmodifiableMap(builder.properties);
+        this.spillLocation = builder.spillLocation;
+        this.encryptionKey = builder.encryptionKey;
+    }
+
+    /**
+     * Retrieves the value of the requested property.
+     *
+     * @param key The name of the property to retrieve.
+     * @return The value for that property or null if there is no such property.
+     */
+    @Transient
+    public String getProperty(String key)
+    {
+        return properties.get(key);
+    }
+
+    /**
+     * Retrieves the value of the requested property and attempts to parse the value into an int.
+     *
+     * @param key The name of the property to retrieve.
+     * @return The value for that property; throws if there is no such property.
+     */
+    @Transient
+    public int getPropertyAsInt(String key)
+    {
+        return Integer.parseInt(properties.get(key));
+    }
+
+    /**
+     * Retrieves the value of the requested property and attempts to parse the value into a long.
+     *
+     * @param key The name of the property to retrieve.
+     * @return The value for that property; throws if there is no such property.
+     */
+    @Transient
+    public long getPropertyAsLong(String key)
+    {
+        return Long.parseLong(properties.get(key));
+    }
+
+    /**
+     * Retrieves the value of the requested property and attempts to parse the value into a double.
+     *
+     * @param key The name of the property to retrieve.
+     * @return The value for that property; throws if there is no such property.
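+     *
+     * <p>For example (a sketch with a hypothetical property name), a split built via
+     * {@code Split.newBuilder(spillLocation, encryptionKey).add("sampling-rate", "0.5").build()}
+     * would return {@code 0.5} from {@code getPropertyAsDouble("sampling-rate")}.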
+ */ + @Transient + public double getPropertyAsDouble(String key) + { + return Double.parseDouble(properties.get(key)); + } + + /** + * Provides access to all properties on this Split. + * + * @return Map containing all properties on the split. + */ + @JsonProperty + public Map getProperties() + { + return properties; + } + + /** + * The optional SpillLocation this Split can write to. + * + * @return The SpillLocation. + */ + @JsonProperty + public SpillLocation getSpillLocation() + { + return spillLocation; + } + + /** + * The optional EncryptionKey this Split can use to encrypt/decrypt data. + * + * @return The EncryptionKey. + */ + @JsonProperty + public EncryptionKey getEncryptionKey() + { + return encryptionKey; + } + + @Transient + public static Builder newBuilder(SpillLocation spillLocation, EncryptionKey encryptionKey) + { + return new Builder().withSpillLocation(spillLocation).withEncryptionKey(encryptionKey); + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + Split split = (Split) o; + return Objects.equals(spillLocation, split.spillLocation) && + Objects.equals(encryptionKey, split.encryptionKey) && + Objects.equals(getProperties(), split.getProperties()); + } + + @Override + public int hashCode() + { + return Objects.hash(spillLocation, encryptionKey, getProperties()); + } + + public static class Builder + { + private final Map properties = new HashMap<>(); + private SpillLocation spillLocation; + private EncryptionKey encryptionKey; + + private Builder() {} + + /** + * Adds the provided property key,value pair to the Split, overwriting any previous value for the key. + * + * @param key The key for the property. + * @param value The value for the property. + * @return The Builder itself. + */ + public Builder add(String key, String value) + { + properties.put(key, value); + return this; + } + + /** + * Sets the optional SpillLocation this Split can write to. + * + * @param val The SpillLocation + * @return The Builder itself. + */ + public Builder withSpillLocation(SpillLocation val) + { + this.spillLocation = val; + return this; + } + + /** + * Sets the optional EncryptionKey this Split can use to encrypt/decrypt data. + * + * @param key The EncryptionKey + * @return The Builder itself. + */ + public Builder withEncryptionKey(EncryptionKey key) + { + this.encryptionKey = key; + return this; + } + + /** + * Builds the Split + * + * @return A newly constructed Split using the attributes collected by this builder. + */ + public Split build() + { + return new Split(this); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/TableName.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/TableName.java new file mode 100644 index 0000000000..e74ce5d289 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/TableName.java @@ -0,0 +1,106 @@ +package com.amazonaws.athena.connector.lambda.domain; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.google.common.base.MoreObjects; +import com.google.common.base.Objects; + +import static java.util.Objects.requireNonNull; + +/** + * Represents a fully qualified TableName. + */ +public class TableName +{ + //The schema name that the table belongs to. + private final String schemaName; + //The name of the table. + private final String tableName; + + /** + * Constructs a fully qualified TableName. + * + * @param schemaName The name of the schema that the table belongs to. + * @param tableName The name of the table. + */ + @JsonCreator + public TableName(@JsonProperty("schemaName") String schemaName, + @JsonProperty("tableName") String tableName) + { + this.schemaName = requireNonNull(schemaName, "schemaName is null"); + this.tableName = requireNonNull(tableName, "tableName is null"); + } + + /** + * Gets the name of the schema the table belongs to. + * + * @return A String containing the schema name for the table. + */ + @JsonProperty + public String getSchemaName() + { + return schemaName; + } + + /** + * Gets the name of the table. + * + * @return A String containing the name of the table. + */ + @JsonProperty + public String getTableName() + { + return tableName; + } + + @Override + public String toString() + { + return MoreObjects.toStringHelper(this) + .add("schemaName", schemaName) + .add("tableName", tableName) + .toString(); + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + TableName that = (TableName) o; + + return Objects.equal(this.schemaName, that.schemaName) && + Objects.equal(this.tableName, that.tableName); + } + + @Override + public int hashCode() + { + return Objects.hashCode(schemaName, tableName); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/AllOrNoneValueSet.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/AllOrNoneValueSet.java new file mode 100644 index 0000000000..f794100e9e --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/AllOrNoneValueSet.java @@ -0,0 +1,202 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+
+import java.util.Objects;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * Describes a constraint as a ValueSet which can have one of several states:
+ * 1. No value can match
+ * 2. Only NULL values can match
+ * 3. Only non-null values can match
+ * 4. All values can match
+ *
+ * @see ValueSet
+ */
+public class AllOrNoneValueSet
+        implements ValueSet
+{
+    private final ArrowType type;
+    private final boolean all;
+    private final boolean nullAllowed;
+
+    @JsonCreator
+    public AllOrNoneValueSet(@JsonProperty("type") ArrowType type,
+            @JsonProperty("all") boolean all,
+            @JsonProperty("nullAllowed") boolean nullAllowed)
+    {
+        this.type = requireNonNull(type, "type is null");
+        this.all = all;
+        this.nullAllowed = nullAllowed;
+    }
+
+    static AllOrNoneValueSet all(ArrowType type)
+    {
+        return new AllOrNoneValueSet(type, true, true);
+    }
+
+    static AllOrNoneValueSet none(ArrowType type)
+    {
+        return new AllOrNoneValueSet(type, false, false);
+    }
+
+    static AllOrNoneValueSet onlyNull(ArrowType type)
+    {
+        return new AllOrNoneValueSet(type, false, true);
+    }
+
+    static AllOrNoneValueSet notNull(ArrowType type)
+    {
+        return new AllOrNoneValueSet(type, true, false);
+    }
+
+    @Override
+    @JsonProperty("nullAllowed")
+    public boolean isNullAllowed()
+    {
+        return nullAllowed;
+    }
+
+    @Override
+    @JsonProperty
+    public ArrowType getType()
+    {
+        return type;
+    }
+
+    @Override
+    public boolean isNone()
+    {
+        return !all && !nullAllowed;
+    }
+
+    @Override
+    @JsonProperty
+    public boolean isAll()
+    {
+        return all && nullAllowed;
+    }
+
+    @Override
+    public boolean isSingleValue()
+    {
+        return false;
+    }
+
+    @Override
+    public Object getSingleValue()
+    {
+        throw new UnsupportedOperationException();
+    }
+
+    @Override
+    public boolean containsValue(Marker value)
+    {
+        if (value.isNullValue() && nullAllowed) {
+            return true;
+        }
+        else if (value.isNullValue() && !nullAllowed) {
+            return false;
+        }
+
+        return all;
+    }
+
+    @Override
+    public ValueSet intersect(BlockAllocator allocator, ValueSet other)
+    {
+        AllOrNoneValueSet otherValueSet = checkCompatibility(other);
+        return new AllOrNoneValueSet(type, all && otherValueSet.all, nullAllowed && other.isNullAllowed());
+    }
+
+    @Override
+    public ValueSet union(BlockAllocator allocator, ValueSet other)
+    {
+        AllOrNoneValueSet otherValueSet = checkCompatibility(other);
+        return new AllOrNoneValueSet(type, all || otherValueSet.all, nullAllowed || other.isNullAllowed());
+    }
+
+    @Override
+    public ValueSet complement(BlockAllocator allocator)
+    {
+        return new AllOrNoneValueSet(type, !all, !nullAllowed);
+    }
+
+    @Override
+    public String toString()
+    {
+        return "[" + (all ? "ALL" : "NONE") + " nullAllowed:" + isNullAllowed() + "]";
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hash(type, all);
+    }
+
+    @Override
+    public boolean equals(Object obj)
+    {
+        if (this == obj) {
+            return true;
+        }
+        if (obj == null || getClass() != obj.getClass()) {
+            return false;
+        }
+        final AllOrNoneValueSet other = (AllOrNoneValueSet) obj;
+        return Objects.equals(this.type, other.type)
+                && this.all == other.all;
+    }
+
+    private AllOrNoneValueSet checkCompatibility(ValueSet other)
+    {
+        if (!getType().equals(other.getType())) {
+            throw new IllegalArgumentException(String.format("Mismatched types: %s vs %s",
+                    getType(), other.getType()));
+        }
+        if (!(other instanceof AllOrNoneValueSet)) {
+            throw new IllegalArgumentException(String.format("ValueSet is not an AllOrNoneValueSet: %s",
+                    other.getClass()));
+        }
+        return (AllOrNoneValueSet) other;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+    }
+
+    private void checkTypeCompatibility(Marker marker)
+    {
+        if (!getType().equals(marker.getType())) {
+            throw new IllegalStateException(String.format("Marker of %s does not match SortedRangeSet of %s",
+                    marker.getType(), getType()));
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/ConstraintEvaluator.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/ConstraintEvaluator.java
new file mode 100644
index 0000000000..8cb487740d
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/ConstraintEvaluator.java
@@ -0,0 +1,105 @@
+package com.amazonaws.athena.connector.lambda.domain.predicate;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Used to apply predicates to values inside your connector. Ideally you would also be able to push
+ * constraints into your source system (e.g. RDBMS via SQL). For each value you'd like to write for a given row,
+ * you call the 'apply' function on this class; if the calls for all columns in the row return 'true', the row
+ * passes the constraints.
+ *
+ * After being used, a ConstraintEvaluator instance must be closed to ensure that no Apache Arrow resources used by
+ * the Markers it creates as part of evaluation are leaked.
+ *
+ * @note This abstraction works well for the associative predicates that are made available to your connector
+ * today but will likely require enhancement as we expose more sophisticated predicates (e.g. col1 + col2 < 100)
+ * in the future. Additionally, we do not support constraints on complex types at this time.
+ *
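+ * (For example, single-column predicates such as {@code col1 > 10} or {@code col1 IN ('a', 'b')} can be
+ * expressed as a ValueSet, while an expression relating two columns cannot.)
+ *
+ * A sketch of the intended call pattern when writing rows (hypothetical field name and value):
+ * <pre>{@code
+ * try (ConstraintEvaluator evaluator = new ConstraintEvaluator(allocator, schema, constraints)) {
+ *     if (evaluator.apply("year", 2019)) {
+ *         //the value for column "year" satisfies the constraints; keep writing the row
+ *     }
+ * }
+ * }</pre>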

+ * For usage examples, please see the ExampleRecordHandler or connectors like athena-redis.
+ *
+ * TODO: We can improve the filtering performance of ConstraintEvaluator by refactoring how ValueSets work.
+ */
+public class ConstraintEvaluator
+        implements AutoCloseable
+{
+    private static final Logger logger = LoggerFactory.getLogger(ConstraintEvaluator.class);
+
+    private final Constraints constraints;
+
+    //Used to reduce the object overhead of constraints by sharing blocks across Markers.
+    //This is a byproduct of the way we are using Apache Arrow to hold Markers, which are essentially
+    //single-value blocks. This factory allows us to represent a Marker (aka a single value) as
+    //a row in a shared block to improve memory and perf.
+    private final MarkerFactory markerFactory;
+    //Holds the type for each field.
+    private final Map<String, ArrowType> typeMap = new HashMap<>();
+
+    public ConstraintEvaluator(BlockAllocator allocator, Schema schema, Constraints constraints)
+    {
+        this.constraints = constraints;
+        for (Field next : schema.getFields()) {
+            typeMap.put(next.getName(), next.getType());
+        }
+        markerFactory = new MarkerFactory(allocator);
+    }
+
+    public static ConstraintEvaluator emptyEvaluator()
+    {
+        return new ConstraintEvaluator(null, SchemaBuilder.newBuilder().build(), new Constraints(new HashMap<>()));
+    }
+
+    public boolean apply(String fieldName, Object value)
+    {
+        try {
+            ValueSet constraint = constraints.getSummary().get(fieldName);
+            if (constraint != null && typeMap.get(fieldName) != null) {
+                try (Marker marker = markerFactory.createNullable(typeMap.get(fieldName),
+                        value,
+                        Marker.Bound.EXACTLY)) {
+                    return constraint.containsValue(marker);
+                }
+            }
+
+            return true;
+        }
+        catch (Exception ex) {
+            throw (ex instanceof RuntimeException) ? (RuntimeException) ex : new RuntimeException(ex);
+        }
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        markerFactory.close();
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Constraints.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Constraints.java
new file mode 100644
index 0000000000..5b07363c69
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Constraints.java
@@ -0,0 +1,87 @@
+package com.amazonaws.athena.connector.lambda.domain.predicate;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Objects;
+
+import java.util.Map;
+
+public class Constraints
+        implements AutoCloseable
+{
+    private Map<String, ValueSet> summary;
+
+    @JsonCreator
+    public Constraints(@JsonProperty("summary") Map<String, ValueSet> summary)
+    {
+        this.summary = summary;
+    }
+
+    public Map<String, ValueSet> getSummary()
+    {
+        return summary;
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        Constraints that = (Constraints) o;
+
+        return Objects.equal(this.summary, that.summary);
+    }
+
+    @Override
+    public String toString()
+    {
+        return "Constraints{" +
+                "summary=" + summary +
+                '}';
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(summary);
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        for (ValueSet next : summary.values()) {
+            try {
+                next.close();
+            }
+            catch (Exception ex) {
+                throw new RuntimeException(ex);
+            }
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/EquatableValueSet.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/EquatableValueSet.java
new file mode 100644
index 0000000000..7448333b76
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/EquatableValueSet.java
@@ -0,0 +1,458 @@
+package com.amazonaws.athena.connector.lambda.domain.predicate;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.ArrowTypeComparator;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.beans.Transient;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.List;
+import java.util.Objects;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * A set containing values that are uniquely identifiable.
+ * Assumes an infinite number of possible values. The values may be collectively included (aka whitelist)
+ * or collectively excluded (aka !whitelist).
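+ *
+ * For example, a whitelist of two strings could be built as follows (a sketch; assumes a
+ * {@code BlockAllocator} named {@code allocator}):
+ * <pre>{@code
+ * ValueSet values = EquatableValueSet.newBuilder(allocator, new ArrowType.Utf8(), true, false)
+ *         .add("us-east-1")
+ *         .add("us-west-2")
+ *         .build();
+ * //matches exactly the two listed values; nulls are not allowed
+ * }</pre>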
+ */
+public class EquatableValueSet
+        implements ValueSet
+{
+    private static final String DEFAULT_COLUMN = "col1";
+    private final boolean whiteList;
+    private final Block valueBlock;
+    public final boolean nullAllowed;
+
+    @JsonCreator
+    public EquatableValueSet(
+            @JsonProperty("valueBlock") Block valueBlock,
+            @JsonProperty("whiteList") boolean whiteList,
+            @JsonProperty("nullAllowed") boolean nullAllowed)
+    {
+        requireNonNull(valueBlock, "valueBlock is null");
+        this.valueBlock = valueBlock;
+        this.whiteList = whiteList;
+        this.nullAllowed = nullAllowed;
+    }
+
+    static EquatableValueSet none(BlockAllocator allocator, ArrowType type)
+    {
+        return new EquatableValueSet(BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, type), true, false);
+    }
+
+    static EquatableValueSet all(BlockAllocator allocator, ArrowType type)
+    {
+        return new EquatableValueSet(BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, type), false, true);
+    }
+
+    static EquatableValueSet onlyNull(BlockAllocator allocator, ArrowType type)
+    {
+        //an empty whitelist with nulls allowed matches only null
+        return new EquatableValueSet(BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, type), true, true);
+    }
+
+    static EquatableValueSet notNull(BlockAllocator allocator, ArrowType type)
+    {
+        return new EquatableValueSet(BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, type), false, false);
+    }
+
+    static EquatableValueSet of(BlockAllocator allocator, ArrowType type, Object... values)
+    {
+        return new EquatableValueSet(BlockUtils.newBlock(allocator, DEFAULT_COLUMN, type, values), true, false);
+    }
+
+    static EquatableValueSet of(BlockAllocator allocator, ArrowType type, boolean nullAllowed, Collection<Object> values)
+    {
+        return new EquatableValueSet(BlockUtils.newBlock(allocator, DEFAULT_COLUMN, type, values), true, nullAllowed);
+    }
+
+    @JsonProperty("nullAllowed")
+    @Override
+    public boolean isNullAllowed()
+    {
+        return nullAllowed;
+    }
+
+    @Transient
+    public Schema getSchema()
+    {
+        return valueBlock.getSchema();
+    }
+
+    @JsonProperty
+    public Block getValueBlock()
+    {
+        return valueBlock;
+    }
+
+    @Transient
+    @Override
+    public ArrowType getType()
+    {
+        return valueBlock.getFieldReader(DEFAULT_COLUMN).getField().getType();
+    }
+
+    @JsonProperty
+    public boolean isWhiteList()
+    {
+        return whiteList;
+    }
+
+    public Block getValues()
+    {
+        return valueBlock;
+    }
+
+    public Object getValue(int pos)
+    {
+        FieldReader reader = valueBlock.getFieldReader(DEFAULT_COLUMN);
+        reader.setPosition(pos);
+        return reader.readObject();
+    }
+
+    @Override
+    public boolean isNone()
+    {
+        return whiteList && valueBlock.getRowCount() == 0 && !nullAllowed;
+    }
+
+    @Override
+    public boolean isAll()
+    {
+        return !whiteList && valueBlock.getRowCount() == 0 && nullAllowed;
+    }
+
+    @Override
+    public boolean isSingleValue()
+    {
+        return (whiteList && valueBlock.getRowCount() == 1 && !nullAllowed) ||
+                (whiteList && valueBlock.getRowCount() == 0 && nullAllowed);
+    }
+
+    @Override
+    public Object getSingleValue()
+    {
+        if (!isSingleValue()) {
+            throw new IllegalStateException("EquatableValueSet does not have just a single value");
+        }
+
+        if (nullAllowed && valueBlock.getRowCount() == 0) {
+            return null;
+        }
+
+        FieldReader reader = valueBlock.getFieldReader(DEFAULT_COLUMN);
+        reader.setPosition(0);
+        return reader.readObject();
+    }
+
+    @Override
+    public boolean containsValue(Marker marker)
+    {
+        if (marker.isNullValue() && nullAllowed) {
+            return true;
+        }
+        else if (marker.isNullValue() && !nullAllowed) {
+            return false;
+        }
+
+        Object value = marker.getValue();
+        boolean result = false;
+        FieldReader reader = valueBlock.getFieldReader(DEFAULT_COLUMN);
+        for (int i = 0; i < valueBlock.getRowCount() && !result; i++) {
+            reader.setPosition(i);
+            result = ArrowTypeComparator.compare(reader, value, reader.readObject()) == 0;
+        }
+        return whiteList == result;
+    }
+
+    protected boolean containsValue(Object value)
+    {
+        if (value == null && nullAllowed) {
+            return true;
+        }
+
+        boolean result = false;
+        FieldReader reader = valueBlock.getFieldReader(DEFAULT_COLUMN);
+        for (int i = 0; i < valueBlock.getRowCount() && !result; i++) {
+            reader.setPosition(i);
+            result = ArrowTypeComparator.compare(reader, value, reader.readObject()) == 0;
+        }
+        return whiteList == result;
+    }
+
+    @Override
+    public EquatableValueSet intersect(BlockAllocator allocator, ValueSet other)
+    {
+        EquatableValueSet otherValueSet = checkCompatibility(other);
+        boolean intersectNullAllowed = this.isNullAllowed() && other.isNullAllowed();
+
+        if (whiteList && otherValueSet.isWhiteList()) {
+            return new EquatableValueSet(intersect(allocator, this, otherValueSet), true, intersectNullAllowed);
+        }
+        else if (whiteList) {
+            return new EquatableValueSet(subtract(allocator, this, otherValueSet), true, intersectNullAllowed);
+        }
+        else if (otherValueSet.isWhiteList()) {
+            return new EquatableValueSet(subtract(allocator, otherValueSet, this), true, intersectNullAllowed);
+        }
+        else {
+            return new EquatableValueSet(union(allocator, otherValueSet, this), false, intersectNullAllowed);
+        }
+    }
+
+    @Override
+    public EquatableValueSet union(BlockAllocator allocator, ValueSet other)
+    {
+        EquatableValueSet otherValueSet = checkCompatibility(other);
+        boolean unionNullAllowed = this.isNullAllowed() || other.isNullAllowed();
+
+        if (whiteList && otherValueSet.isWhiteList()) {
+            return new EquatableValueSet(union(allocator, otherValueSet, this), true, unionNullAllowed);
+        }
+        else if (whiteList) {
+            return new EquatableValueSet(subtract(allocator, otherValueSet, this), false, unionNullAllowed);
+        }
+        else if (otherValueSet.isWhiteList()) {
+            return new EquatableValueSet(subtract(allocator, this, otherValueSet), false, unionNullAllowed);
+        }
+        else {
+            return new EquatableValueSet(intersect(allocator, otherValueSet, this), false, unionNullAllowed);
+        }
+    }
+
+    @Override
+    public EquatableValueSet complement(BlockAllocator allocator)
+    {
+        return new EquatableValueSet(valueBlock, !whiteList, !nullAllowed);
+    }
+
+    @Override
+    public String toString()
+    {
+        return "EquatableValueSet{" +
+                "whiteList=" + whiteList +
+                ", nullAllowed=" + nullAllowed +
+                ", valueBlock=" + valueBlock +
+                '}';
+    }
+
+    private static Block intersect(BlockAllocator allocator, EquatableValueSet left, EquatableValueSet right)
+    {
+        Block resultBlock = BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, left.getType());
+        FieldVector result = resultBlock.getFieldVector(DEFAULT_COLUMN);
+
+        Block lhsBlock = left.getValues();
+
+        FieldReader lhs = lhsBlock.getFieldReader(DEFAULT_COLUMN);
+
+        int count = 0;
+        for (int i = 0; i < lhsBlock.getRowCount(); i++) {
+            lhs.setPosition(i);
+            if (isPresent(lhs.readObject(), right.valueBlock)) {
+                BlockUtils.setValue(result, count++, lhs.readObject());
+            }
+        }
+        resultBlock.setRowCount(count);
+        return resultBlock;
+    }
+
+    private static Block union(BlockAllocator allocator, EquatableValueSet left, EquatableValueSet right)
+    {
+        Block resultBlock = BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, left.getType());
+        FieldVector result = resultBlock.getFieldVector(DEFAULT_COLUMN);
+
+        Block lhsBlock = left.getValues();
+        FieldReader lhs = lhsBlock.getFieldReader(DEFAULT_COLUMN);
+
+        int count = 0;
+        for (int i = 0; i < lhsBlock.getRowCount(); i++) {
+            lhs.setPosition(i);
+            BlockUtils.setValue(result, count++, lhs.readObject());
+        }
+
+        Block rhsBlock = right.getValues();
+        FieldReader rhs = rhsBlock.getFieldReader(DEFAULT_COLUMN);
+        for (int i = 0; i < rhsBlock.getRowCount(); i++) {
+            rhs.setPosition(i);
+            if (!isPresent(rhs.readObject(), left.valueBlock)) {
+                BlockUtils.setValue(result, count++, rhs.readObject());
+            }
+        }
+
+        resultBlock.setRowCount(count);
+        return resultBlock;
+    }
+
+    private static Block subtract(BlockAllocator allocator, EquatableValueSet left, EquatableValueSet right)
+    {
+        Block resultBlock = BlockUtils.newEmptyBlock(allocator, DEFAULT_COLUMN, left.getType());
+        FieldVector result = resultBlock.getFieldVector(DEFAULT_COLUMN);
+
+        Block lhsBlock = left.getValues();
+
+        FieldReader lhs = lhsBlock.getFieldReader(DEFAULT_COLUMN);
+
+        int count = 0;
+        for (int i = 0; i < lhsBlock.getRowCount(); i++) {
+            lhs.setPosition(i);
+            if (!isPresent(lhs.readObject(), right.valueBlock)) {
+                BlockUtils.setValue(result, count++, lhs.readObject());
+            }
+        }
+        resultBlock.setRowCount(count);
+        return resultBlock;
+    }
+
+    private static boolean isPresent(Object lhs, Block right)
+    {
+        FieldReader rhs = right.getFieldReader(DEFAULT_COLUMN);
+        Types.MinorType type = rhs.getMinorType();
+        for (int j = 0; j < right.getRowCount(); j++) {
+            rhs.setPosition(j);
+            if (ArrowTypeComparator.compare(rhs, lhs, rhs.readObject()) == 0) {
+                return true;
+            }
+        }
+        return false;
+    }
+
+    private EquatableValueSet checkCompatibility(ValueSet other)
+    {
+        if (!getType().equals(other.getType())) {
+            throw new IllegalStateException(String.format("Mismatched types: %s vs %s",
+                    getType(), other.getType()));
+        }
+        if (!(other instanceof EquatableValueSet)) {
+            throw new IllegalStateException(String.format("ValueSet is not an EquatableValueSet: %s", other.getClass()));
+        }
+        return (EquatableValueSet) other;
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hash(getType(), whiteList, valueBlock, nullAllowed);
+    }
+
+    @Override
+    public boolean equals(Object obj)
+    {
+        if (this == obj) {
+            return true;
+        }
+        if (obj == null || getClass() != obj.getClass()) {
+            return false;
+        }
+        final EquatableValueSet other = (EquatableValueSet) obj;
+
+        if (this.getType() != null && other.getType() != null && Types.getMinorTypeForArrowType(this.getType()) == Types.getMinorTypeForArrowType(other.getType())) {
+            //some arrow types require checking the minor type only, like Decimal.
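+            //(e.g. two Decimal fields with different precision/scale params are treated as the same type here)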
+ //We ignore any params though we may want to reconsider that in the future + } + else if (this.getType() != other.getType()) { + return false; + } + + if (this.whiteList != other.whiteList) { + return false; + } + + if (this.nullAllowed != other.nullAllowed) { + return false; + } + + if (this.valueBlock == null && other.valueBlock != null) { + return false; + } + + if (this.valueBlock != null && !this.valueBlock.equalsAsSet(other.valueBlock)) { + return false; + } + + return true; + } + + @Override + public void close() + throws Exception + { + valueBlock.close(); + } + + public static Builder newBuilder(BlockAllocator allocator, ArrowType type, boolean isWhiteList, boolean nullAllowed) + { + return new Builder(allocator, type, isWhiteList, nullAllowed); + } + + public static class Builder + { + private ArrowType type; + private boolean isWhiteList; + private boolean nullAllowed; + private List values = new ArrayList<>(); + private BlockAllocator allocator; + + Builder(BlockAllocator allocator, ArrowType type, boolean isWhiteList, boolean nullAllowed) + { + requireNonNull(type, "minorType is null"); + this.allocator = allocator; + this.type = type; + this.isWhiteList = isWhiteList; + this.nullAllowed = nullAllowed; + } + + public Builder add(Object value) + { + values.add(value); + return this; + } + + public Builder addAll(Collection value) + { + values.addAll(value); + return this; + } + + public EquatableValueSet build() + { + return new EquatableValueSet(BlockUtils.newBlock(allocator, DEFAULT_COLUMN, type, values), isWhiteList, nullAllowed); + } + } + + private void checkTypeCompatibility(Marker marker) + { + if (!getType().equals(marker.getType())) { + throw new IllegalStateException(String.format("Marker of %s does not match SortedRangeSet of %s", + marker.getType(), getType())); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Marker.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Marker.java new file mode 100644 index 0000000000..3b6acab61a --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Marker.java @@ -0,0 +1,368 @@ +/* + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.ArrowTypeComparator;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.MoreObjects;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.beans.Transient;
+
+import static java.util.Objects.requireNonNull;
+
+/**
+ * A point on the continuous space defined by the specified type.
+ * Each point may be just below, exact, or just above the specified value according to the Bound.
+ */
+public class Marker
+        implements Comparable<Marker>, AutoCloseable
+{
+    protected static final String DEFAULT_COLUMN = "col1";
+
+    public enum Bound
+    {
+        BELOW, // lower than the value, but infinitesimally close to the value
+        EXACTLY, // exactly the value
+        ABOVE // higher than the value, but infinitesimally close to the value
+    }
+
+    private final int valuePosition;
+    private final Block valueBlock;
+    private final Bound bound;
+    private final boolean nullValue;
+
+    /**
+     * LOWER UNBOUNDED is specified with an empty value and an ABOVE bound.
+     * UPPER UNBOUNDED is specified with an empty value and a BELOW bound.
+     */
+    @JsonCreator
+    public Marker(
+            @JsonProperty("valueBlock") Block valueBlock,
+            @JsonProperty("bound") Bound bound,
+            @JsonProperty("nullValue") boolean nullValue)
+    {
+        requireNonNull(valueBlock, "valueBlock is null");
+        requireNonNull(bound, "bound is null");
+
+        this.valueBlock = valueBlock;
+        this.bound = bound;
+        this.nullValue = nullValue;
+        this.valuePosition = 0;
+    }
+
+    protected Marker(Block block,
+            int valuePosition,
+            Bound bound,
+            boolean nullValue)
+    {
+        requireNonNull(block, "block is null");
+        requireNonNull(bound, "bound is null");
+
+        this.valueBlock = block;
+        this.bound = bound;
+        this.nullValue = nullValue;
+        this.valuePosition = valuePosition;
+    }
+
+    public boolean isNullValue()
+    {
+        return nullValue;
+    }
+
+    @Transient
+    public ArrowType getType()
+    {
+        return valueBlock.getFieldReader(DEFAULT_COLUMN).getField().getType();
+    }
+
+    @Transient
+    public Object getValue()
+    {
+        if (nullValue) {
+            throw new IllegalStateException("No value to get");
+        }
+
+        FieldReader reader = valueBlock.getFieldReader(DEFAULT_COLUMN);
+        reader.setPosition(valuePosition);
+        return reader.readObject();
+    }
+
+    @JsonProperty
+    public Bound getBound()
+    {
+        return bound;
+    }
+
+    @Transient
+    public Schema getSchema()
+    {
+        return valueBlock.getSchema();
+    }
+
+    @JsonProperty
+    public Block getValueBlock()
+    {
+        if (valueBlock.getRowCount() > 1) {
+            throw new RuntimeException("Attempting to get batch for a marker that appears to have a shared block");
+        }
+        return valueBlock;
+    }
+
+    @Transient
+    public boolean isUpperUnbounded()
+    {
+        return nullValue && bound == Bound.BELOW;
+    }
+
+    @Transient
+    public boolean isLowerUnbounded()
+    {
+        return nullValue && bound == Bound.ABOVE;
+    }
+
+    /**
+     * Adjacency is defined by two Markers being infinitesimally close to each other.
+     * This means they must share the same value and have adjacent Bounds.
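+     *
+     * <p>For example (a sketch; assumes a {@code BlockAllocator} and an int Arrow type):
+     * <pre>{@code
+     * Marker ten = Marker.exactly(allocator, intType, 10);
+     * Marker aboveTen = Marker.above(allocator, intType, 10);
+     * ten.isAdjacent(aboveTen); //true: same value, EXACTLY next to ABOVE
+     * ten.isAdjacent(ten);      //false: two EXACTLY bounds are not adjacent
+     * }</pre>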
+ */ + @Transient + public boolean isAdjacent(Marker other) + { + checkTypeCompatibility(other); + if (isUpperUnbounded() || isLowerUnbounded() || other.isUpperUnbounded() || other.isLowerUnbounded()) { + return false; + } + + if (ArrowTypeComparator.compare(getType(), getValue(), other.getValue()) != 0) { + return false; + } + + return (bound == Bound.EXACTLY && other.bound != Bound.EXACTLY) || + (bound != Bound.EXACTLY && other.bound == Bound.EXACTLY); + } + + public Marker greaterAdjacent() + { + if (nullValue) { + throw new IllegalStateException("No marker adjacent to unbounded"); + } + switch (bound) { + case BELOW: + return new Marker(valueBlock, valuePosition, Bound.EXACTLY, nullValue); + case EXACTLY: + return new Marker(valueBlock, valuePosition, Bound.ABOVE, nullValue); + case ABOVE: + throw new IllegalStateException("No greater marker adjacent to an ABOVE bound"); + default: + throw new AssertionError("Unsupported type: " + bound); + } + } + + public Marker lesserAdjacent() + { + if (nullValue) { + throw new IllegalStateException("No marker adjacent to unbounded"); + } + switch (bound) { + case BELOW: + throw new IllegalStateException("No lesser marker adjacent to a BELOW bound"); + case EXACTLY: + return new Marker(valueBlock, valuePosition, Bound.BELOW, nullValue); + case ABOVE: + return new Marker(valueBlock, valuePosition, Bound.EXACTLY, nullValue); + default: + throw new AssertionError("Unsupported type: " + bound); + } + } + + private void checkTypeCompatibility(Marker marker) + { + if (!getType().equals(marker.getType())) { + throw new IllegalArgumentException(String.format("Mismatched Marker types: %s vs %s", getType(), marker.getType())); + } + } + + public int compareTo(Marker o) + { + checkTypeCompatibility(o); + if (isUpperUnbounded()) { + return o.isUpperUnbounded() ? 0 : 1; + } + if (isLowerUnbounded()) { + return o.isLowerUnbounded() ? 0 : -1; + } + if (o.isUpperUnbounded()) { + return -1; + } + if (o.isLowerUnbounded()) { + return 1; + } + + // INVARIANT: value and o.value are present + if (valueBlock.getRowCount() < 1 || o.valueBlock.getRowCount() < 1) { + return Integer.compare(valueBlock.getRowCount(), o.valueBlock.getRowCount()); + } + + int compare = ArrowTypeComparator.compare(getType(), getValue(), o.getValue()); + if (compare == 0) { + if (bound == o.bound) { + return 0; + } + if (bound == Bound.BELOW) { + return -1; + } + if (bound == Bound.ABOVE) { + return 1; + } + // INVARIANT: bound == EXACTLY + return (o.bound == Bound.BELOW) ? 1 : -1; + } + return compare; + } + + public static Marker min(Marker marker1, Marker marker2) + { + return marker1.compareTo(marker2) <= 0 ? marker1 : marker2; + } + + public static Marker max(Marker marker1, Marker marker2) + { + return marker1.compareTo(marker2) >= 0 ? 
marker1 : marker2; + } + + private static Marker create(BlockAllocator allocator, ArrowType type, Object value, Bound bound) + { + return new Marker(BlockUtils.newBlock(allocator, Marker.DEFAULT_COLUMN, type, value), 0, bound, false); + } + + private static Marker create(BlockAllocator allocator, ArrowType type, Bound bound) + { + return new Marker(BlockUtils.newEmptyBlock(allocator, Marker.DEFAULT_COLUMN, type), 0, bound, true); + } + + public static Marker upperUnbounded(BlockAllocator allocator, ArrowType type) + { + requireNonNull(type, "type is null"); + return create(allocator, type, Bound.BELOW); + } + + public static Marker lowerUnbounded(BlockAllocator allocator, ArrowType type) + { + requireNonNull(type, "type is null"); + return create(allocator, type, Bound.ABOVE); + } + + public static Marker above(BlockAllocator allocator, ArrowType type, Object value) + { + requireNonNull(type, "type is null"); + requireNonNull(value, "value is null"); + return create(allocator, type, value, Bound.ABOVE); + } + + public static Marker exactly(BlockAllocator allocator, ArrowType type, Object value) + { + requireNonNull(type, "type is null"); + requireNonNull(value, "value is null"); + return create(allocator, type, value, Bound.EXACTLY); + } + + public static Marker nullMarker(BlockAllocator allocator, ArrowType type) + { + return create(allocator, type, Bound.EXACTLY); + } + + public static Marker below(BlockAllocator allocator, ArrowType type, Object value) + { + requireNonNull(type, "type is null"); + requireNonNull(value, "value is null"); + return create(allocator, type, value, Bound.BELOW); + } + + @Override + public int hashCode() + { + if (nullValue) { + return com.google.common.base.Objects.hashCode(nullValue, getType(), bound); + } + + return com.google.common.base.Objects.hashCode(nullValue, getType(), getValue(), bound); + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + Marker that = (Marker) o; + + boolean result = com.google.common.base.Objects.equal(nullValue, that.nullValue) && + com.google.common.base.Objects.equal(this.getType(), that.getType()) && + com.google.common.base.Objects.equal(this.bound, that.bound); + + if (result && !nullValue) { + result = com.google.common.base.Objects.equal(this.getValue(), that.getValue()); + } + + return result; + } + + @Override + public String toString() + { + return MoreObjects.toStringHelper(this) + .add("valueBlock", getType()) + .add("nullValue", nullValue) + .add("valueBlock", nullValue ? nullValue : getValue()) + .add("bound", bound) + .toString(); + } + + @Override + public void close() + throws Exception + { + valueBlock.close(); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/MarkerFactory.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/MarkerFactory.java new file mode 100644 index 0000000000..68eea57133 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/MarkerFactory.java @@ -0,0 +1,151 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+
+import java.util.HashMap;
+import java.util.Map;
+
+import static com.amazonaws.athena.connector.lambda.data.BlockUtils.setValue;
+
+public class MarkerFactory
+        implements AutoCloseable
+{
+    private final BlockAllocator allocator;
+    private final Map<ArrowType, Block> sharedMarkerBlocks = new HashMap<>();
+    private final Map<ArrowType, Integer> markerLeases = new HashMap<>();
+
+    public MarkerFactory(BlockAllocator allocator)
+    {
+        this.allocator = allocator;
+    }
+
+    public Marker createNullable(ArrowType type, Object value, Marker.Bound bound)
+    {
+        BlockLease lease = getOrCreateBlock(type);
+        if (value != null) {
+            setValue(lease.getBlock().getFieldVector(Marker.DEFAULT_COLUMN), lease.getPos(), value);
+        }
+        return new SharedBlockMarker(this, lease.getBlock(), lease.getPos(), bound, value == null);
+    }
+
+    public Marker create(ArrowType type, Object value, Marker.Bound bound)
+    {
+        BlockLease lease = getOrCreateBlock(type);
+        setValue(lease.getBlock().getFieldVector(Marker.DEFAULT_COLUMN), lease.getPos(), value);
+        return new SharedBlockMarker(this, lease.getBlock(), lease.getPos(), bound, false);
+    }
+
+    public Marker create(ArrowType type, Marker.Bound bound)
+    {
+        BlockLease lease = getOrCreateBlock(type);
+        return new SharedBlockMarker(this, lease.getBlock(), lease.getPos(), bound, true);
+    }
+
+    private synchronized BlockLease getOrCreateBlock(ArrowType type)
+    {
+        Block sharedBlock = sharedMarkerBlocks.get(type);
+        Integer leaseNumber = markerLeases.get(type);
+        if (sharedBlock == null) {
+            sharedBlock = BlockUtils.newEmptyBlock(allocator, Marker.DEFAULT_COLUMN, type);
+            sharedMarkerBlocks.put(type, sharedBlock);
+            leaseNumber = 0;
+        }
+        markerLeases.put(type, ++leaseNumber);
+        BlockLease lease = new BlockLease(sharedBlock, leaseNumber - 1);
+        sharedBlock.setRowCount(leaseNumber);
+        return lease;
+    }
+
+    /**
+     * This leasing strategy optimizes for the create-then-return use case; it does not attempt to handle
+     * fragmentation in any meaningful way beyond what the columnar nature of Arrow provides.
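+     * In practice a lease is only reclaimed when the most recently handed-out position is returned;
+     * returning any earlier position is a no-op and that slot is simply not reused.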
+ */ + private synchronized void returnBlockLease(ArrowType type, int pos) + { + Block sharedBlock = sharedMarkerBlocks.get(type); + Integer leaseNumber = markerLeases.get(type); + + if (sharedBlock != null && leaseNumber > 0 && leaseNumber == pos + 1) { + markerLeases.put(type, leaseNumber - 1); + } + } + + @Override + public void close() + throws Exception + { + for (Block next : sharedMarkerBlocks.values()) { + next.close(); + } + + sharedMarkerBlocks.clear(); + markerLeases.clear(); + } + + private static class BlockLease + { + private final Block block; + private final int pos; + + public BlockLease(Block block, int pos) + { + this.block = block; + this.pos = pos; + } + + public Block getBlock() + { + return block; + } + + public int getPos() + { + return pos; + } + } + + public class SharedBlockMarker + extends Marker + { + private final MarkerFactory factory; + private final int valuePosition; + + public SharedBlockMarker(MarkerFactory factory, Block block, int valuePosition, Bound bound, boolean nullValue) + { + super(block, valuePosition, bound, nullValue); + this.factory = factory; + this.valuePosition = valuePosition; + } + + @Override + public void close() + throws Exception + { + //Don't call close on the super since we don't own the block, it shared. + factory.returnBlockLease(getType(), valuePosition); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Range.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Range.java new file mode 100644 index 0000000000..71170a0f35 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Range.java @@ -0,0 +1,231 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.arrow.vector.types.pojo.ArrowType; + +import java.beans.Transient; +import java.util.Objects; + +import static java.util.Objects.requireNonNull; + +public class Range + implements AutoCloseable +{ + private final Marker low; + private final Marker high; + + @JsonCreator + public Range( + @JsonProperty("low") Marker low, + @JsonProperty("high") Marker high) + { + requireNonNull(low, "low value is null"); + requireNonNull(high, "high value is null"); + if (!low.getType().equals(high.getType())) { + throw new IllegalArgumentException( + String.format("Marker types do not match: %s vs %s", low.getType(), high.getType())); + } + if (low.getBound() == Marker.Bound.BELOW) { + throw new IllegalArgumentException("low bound must be EXACTLY or ABOVE"); + } + if (high.getBound() == Marker.Bound.ABOVE) { + throw new IllegalArgumentException("high bound must be EXACTLY or BELOW"); + } + if (low.compareTo(high) > 0) { + throw new IllegalArgumentException("low must be less than or equal to high"); + } + this.low = low; + this.high = high; + } + + public static Range all(BlockAllocator allocator, ArrowType type) + { + return new Range(Marker.lowerUnbounded(allocator, type), Marker.upperUnbounded(allocator, type)); + } + + public static Range greaterThan(BlockAllocator allocator, ArrowType type, Object low) + { + return new Range(Marker.above(allocator, type, low), Marker.upperUnbounded(allocator, type)); + } + + public static Range greaterThanOrEqual(BlockAllocator allocator, ArrowType type, Object low) + { + return new Range(Marker.exactly(allocator, type, low), Marker.upperUnbounded(allocator, type)); + } + + public static Range lessThan(BlockAllocator allocator, ArrowType type, Object high) + { + return new Range(Marker.lowerUnbounded(allocator, type), Marker.below(allocator, type, high)); + } + + public static Range lessThanOrEqual(BlockAllocator allocator, ArrowType type, Object high) + { + return new Range(Marker.lowerUnbounded(allocator, type), Marker.exactly(allocator, type, high)); + } + + public static Range equal(BlockAllocator allocator, ArrowType type, Object value) + { + return new Range(Marker.exactly(allocator, type, value), Marker.exactly(allocator, type, value)); + } + + public static Range range(BlockAllocator allocator, ArrowType type, Object low, boolean lowInclusive, Object high, boolean highInclusive) + { + Marker lowMarker = lowInclusive ? Marker.exactly(allocator, type, low) : Marker.above(allocator, type, low); + Marker highMarker = highInclusive ? 
Marker.exactly(allocator, type, high) : Marker.below(allocator, type, high); + return new Range(lowMarker, highMarker); + } + + public ArrowType getType() + { + return low.getType(); + } + + @JsonProperty + public Marker getLow() + { + return low; + } + + @JsonProperty + public Marker getHigh() + { + return high; + } + + @Transient + public boolean isSingleValue() + { + return low.getBound() == Marker.Bound.EXACTLY && low.equals(high); + } + + @Transient + public Object getSingleValue() + { + if (!isSingleValue()) { + throw new IllegalStateException("Range does not have just a single value"); + } + return low.getValue(); + } + + @Transient + public boolean isAll() + { + return low.isLowerUnbounded() && high.isUpperUnbounded(); + } + + public boolean includes(Marker marker) + { + requireNonNull(marker, "marker is null"); + checkTypeCompatibility(marker); + return low.compareTo(marker) <= 0 && high.compareTo(marker) >= 0; + } + + public boolean contains(Range other) + { + checkTypeCompatibility(other); + return this.getLow().compareTo(other.getLow()) <= 0 && + this.getHigh().compareTo(other.getHigh()) >= 0; + } + + public Range span(Range other) + { + checkTypeCompatibility(other); + Marker lowMarker = Marker.min(low, other.getLow()); + Marker highMarker = Marker.max(high, other.getHigh()); + + return new Range(lowMarker, highMarker); + } + + public boolean overlaps(Range other) + { + checkTypeCompatibility(other); + return this.getLow().compareTo(other.getHigh()) <= 0 && + other.getLow().compareTo(this.getHigh()) <= 0; + } + + public Range intersect(Range other) + { + checkTypeCompatibility(other); + if (!this.overlaps(other)) { + throw new IllegalArgumentException("Cannot intersect non-overlapping ranges"); + } + Marker lowMarker = Marker.max(low, other.getLow()); + Marker highMarker = Marker.min(high, other.getHigh()); + return new Range(lowMarker, highMarker); + } + + private void checkTypeCompatibility(Range range) + { + if (!getType().equals(range.getType())) { + throw new IllegalArgumentException(String.format("Mismatched Range types: %s vs %s", + getType(), range.getType())); + } + } + + private void checkTypeCompatibility(Marker marker) + { + if (!getType().equals(marker.getType())) { + throw new IllegalArgumentException(String.format("Marker of %s does not match Range of %s", + marker.getType(), getType())); + } + } + + @Override + public int hashCode() + { + return Objects.hash(low, high); + } + + @Override + public boolean equals(Object obj) + { + if (this == obj) { + return true; + } + if (obj == null || getClass() != obj.getClass()) { + return false; + } + final Range other = (Range) obj; + return Objects.equals(this.low, other.low) && + Objects.equals(this.high, other.high); + } + + @Override + public String toString() + { + return com.google.common.base.MoreObjects.toStringHelper(this) + .add("low", low) + .add("high", high) + .toString(); + } + + @Override + public void close() + throws Exception + { + low.close(); + high.close(); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Ranges.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Ranges.java new file mode 100644 index 0000000000..ac5d15acc2 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/Ranges.java @@ -0,0 +1,38 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * 
Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import java.util.List; + +public interface Ranges +{ + int getRangeCount(); + + /** + * @return Allowed non-overlapping predicate ranges sorted in increasing order + */ + List getOrderedRanges(); + + /** + * @return Single range encompassing all of allowed the ranges + */ + Range getSpan(); +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/SortedRangeSet.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/SortedRangeSet.java new file mode 100644 index 0000000000..9d0666e9e4 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/SortedRangeSet.java @@ -0,0 +1,505 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.arrow.vector.types.pojo.ArrowType; + +import java.beans.Transient; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collection; +import java.util.Collections; +import java.util.Comparator; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.NavigableMap; +import java.util.Objects; +import java.util.TreeMap; + +import static java.util.Objects.requireNonNull; + +public class SortedRangeSet + implements ValueSet +{ + private final boolean nullAllowed; + private final ArrowType type; + private final NavigableMap lowIndexedRanges; + + private SortedRangeSet(ArrowType type, NavigableMap lowIndexedRanges, boolean nullAllowed) + { + requireNonNull(type, "type is null"); + requireNonNull(lowIndexedRanges, "lowIndexedRanges is null"); + + this.type = type; + this.lowIndexedRanges = lowIndexedRanges; + this.nullAllowed = nullAllowed; + } + + static SortedRangeSet none(ArrowType type) + { + return copyOf(type, Collections.emptyList(), false); + } + + static SortedRangeSet all(BlockAllocator allocator, ArrowType type) + { + return copyOf(type, Collections.singletonList(Range.all(allocator, type)), true); + } + + static SortedRangeSet onlyNull(ArrowType type) + { + return copyOf(type, Collections.emptyList(), true); + } + + static SortedRangeSet notNull(BlockAllocator allocator, ArrowType type) + { + return copyOf(type, Collections.singletonList(Range.all(allocator, type)), false); + } + + static SortedRangeSet of(BlockAllocator allocator, ArrowType type, Object first, Object... rest) + { + return of(allocator, type, false, first, Arrays.asList(rest)); + } + + /** + * Provided discrete values that are unioned together to form the SortedRangeSet + */ + static SortedRangeSet of(BlockAllocator allocator, ArrowType type, boolean nullAllowed, Object first, Collection rest) + { + List ranges = new ArrayList<>(rest.size() + 1); + ranges.add(Range.equal(allocator, type, first)); + for (Object value : rest) { + ranges.add(Range.equal(allocator, type, value)); + } + return copyOf(type, ranges, nullAllowed); + } + + /** + * Provided Ranges are unioned together to form the SortedRangeSet + */ + public static SortedRangeSet of(Range first, Range... rest) + { + return of(false, first, Arrays.asList(rest)); + } + + /** + * Provided Ranges are unioned together to form the SortedRangeSet + */ + public static SortedRangeSet of(boolean nullAllowed, Range first, Range... 
rest) + { + return of(nullAllowed, first, Arrays.asList(rest)); + } + + /** + * Provided Ranges are unioned together to form the SortedRangeSet + */ + public static SortedRangeSet of(boolean nullAllowed, Range first, Collection rest) + { + List rangeList = new ArrayList<>(rest.size() + 1); + rangeList.add(first); + for (Range range : rest) { + rangeList.add(range); + } + return copyOf(first.getType(), rangeList, nullAllowed); + } + + /** + * Provided Ranges are unioned together to form the SortedRangeSet + */ + static SortedRangeSet copyOf(ArrowType type, Iterable ranges, boolean nullAllowed) + { + return new Builder(type, nullAllowed).addAll(ranges).build(); + } + + @JsonCreator + public static SortedRangeSet copyOf( + @JsonProperty("type") ArrowType type, + @JsonProperty("ranges") List ranges, + @JsonProperty("nullAllowed") boolean nullAllowed + ) + { + return copyOf(type, (Iterable) ranges, nullAllowed); + } + + @JsonProperty("nullAllowed") + @Override + public boolean isNullAllowed() + { + return nullAllowed; + } + + @JsonProperty + public ArrowType getType() + { + return type; + } + + @JsonProperty("ranges") + public List getOrderedRanges() + { + return new ArrayList<>(lowIndexedRanges.values()); + } + + @Transient + public int getRangeCount() + { + return lowIndexedRanges.size(); + } + + @Transient + @Override + public boolean isNone() + { + return lowIndexedRanges.isEmpty(); + } + + @Transient + @Override + public boolean isAll() + { + return lowIndexedRanges.size() == 1 && lowIndexedRanges.values().iterator().next().isAll(); + } + + @Transient + @Override + public boolean isSingleValue() + { + return (lowIndexedRanges.size() == 1 && lowIndexedRanges.values().iterator().next().isSingleValue() && !nullAllowed) || + lowIndexedRanges.isEmpty() && nullAllowed; + } + + @Transient + @Override + public Object getSingleValue() + { + if (!isSingleValue()) { + throw new IllegalStateException("SortedRangeSet does not have just a single value"); + } + + if (nullAllowed && lowIndexedRanges.isEmpty()) { + return null; + } + + return lowIndexedRanges.values().iterator().next().getSingleValue(); + } + + @Override + public boolean containsValue(Marker marker) + { + requireNonNull(marker, "marker is null"); + checkTypeCompatibility(marker); + + if (marker.isNullValue() && nullAllowed) { + return true; + } + else if (marker.isNullValue() && !nullAllowed) { + return false; + } + + if (marker.getBound() != Marker.Bound.EXACTLY) { + throw new RuntimeException("Expected Bound.EXACTLY but found " + marker.getBound()); + } + + Map.Entry floorEntry = lowIndexedRanges.floorEntry(marker); + return floorEntry != null && floorEntry.getValue().includes(marker); + } + + boolean includesMarker(Marker marker) + { + requireNonNull(marker, "marker is null"); + checkTypeCompatibility(marker); + + if (marker.isNullValue() && nullAllowed) { + return true; + } + + Map.Entry floorEntry = lowIndexedRanges.floorEntry(marker); + return floorEntry != null && floorEntry.getValue().includes(marker); + } + + @Transient + public Range getSpan() + { + if (lowIndexedRanges.isEmpty()) { + throw new IllegalStateException("Can not get span if no ranges exist"); + } + return lowIndexedRanges.firstEntry().getValue().span(lowIndexedRanges.lastEntry().getValue()); + } + + @Override + public Ranges getRanges() + { + return new Ranges() + { + @Override + public int getRangeCount() + { + return SortedRangeSet.this.getRangeCount(); + } + + @Override + public List getOrderedRanges() + { + return SortedRangeSet.this.getOrderedRanges(); + } + + 
@Override + public Range getSpan() + { + return SortedRangeSet.this.getSpan(); + } + }; + } + + @Override + public SortedRangeSet intersect(BlockAllocator allocator, ValueSet other) + { + SortedRangeSet otherRangeSet = checkCompatibility(other); + + boolean intersectNullAllowed = this.isNullAllowed() && other.isNullAllowed(); + Builder builder = new Builder(type, intersectNullAllowed); + + Iterator iterator1 = getOrderedRanges().iterator(); + Iterator iterator2 = otherRangeSet.getOrderedRanges().iterator(); + + if (iterator1.hasNext() && iterator2.hasNext()) { + Range range1 = iterator1.next(); + Range range2 = iterator2.next(); + + while (true) { + if (range1.overlaps(range2)) { + builder.add(range1.intersect(range2)); + } + + if (range1.getHigh().compareTo(range2.getHigh()) <= 0) { + if (!iterator1.hasNext()) { + break; + } + range1 = iterator1.next(); + } + else { + if (!iterator2.hasNext()) { + break; + } + range2 = iterator2.next(); + } + } + } + + return builder.build(); + } + + @Override + public SortedRangeSet union(BlockAllocator allocator, ValueSet other) + { + boolean unionNullAllowed = this.isNullAllowed() || other.isNullAllowed(); + SortedRangeSet otherRangeSet = checkCompatibility(other); + return new Builder(type, unionNullAllowed) + .addAll(this.getOrderedRanges()) + .addAll(otherRangeSet.getOrderedRanges()) + .build(); + } + + @Override + public SortedRangeSet union(BlockAllocator allocator, Collection valueSets) + { + boolean unionNullAllowed = this.isNullAllowed(); + for (ValueSet valueSet : valueSets) { + unionNullAllowed |= valueSet.isNullAllowed(); + } + + Builder builder = new Builder(type, unionNullAllowed); + builder.addAll(this.getOrderedRanges()); + for (ValueSet valueSet : valueSets) { + builder.addAll(checkCompatibility(valueSet).getOrderedRanges()); + } + return builder.build(); + } + + @Override + public SortedRangeSet complement(BlockAllocator allocator) + { + Builder builder = new Builder(type, !nullAllowed); + + if (lowIndexedRanges.isEmpty()) { + return builder.add(Range.all(allocator, type)).build(); + } + + Iterator rangeIterator = lowIndexedRanges.values().iterator(); + + Range firstRange = rangeIterator.next(); + if (!firstRange.getLow().isLowerUnbounded()) { + builder.add(new Range(Marker.lowerUnbounded(allocator, type), firstRange.getLow().lesserAdjacent())); + } + + Range previousRange = firstRange; + while (rangeIterator.hasNext()) { + Range currentRange = rangeIterator.next(); + + Marker lowMarker = previousRange.getHigh().greaterAdjacent(); + Marker highMarker = currentRange.getLow().lesserAdjacent(); + builder.add(new Range(lowMarker, highMarker)); + + previousRange = currentRange; + } + + Range lastRange = previousRange; + if (!lastRange.getHigh().isUpperUnbounded()) { + builder.add(new Range(lastRange.getHigh().greaterAdjacent(), Marker.upperUnbounded(allocator, type))); + } + + return builder.build(); + } + + private SortedRangeSet checkCompatibility(ValueSet other) + { + if (!getType().equals(other.getType())) { + throw new IllegalStateException(String.format("Mismatched types: %s vs %s", + getType(), other.getType())); + } + if (!(other instanceof SortedRangeSet)) { + throw new IllegalStateException(String.format("ValueSet is not a SortedRangeSet: %s", other.getClass())); + } + return (SortedRangeSet) other; + } + + private void checkTypeCompatibility(Marker marker) + { + if (!getType().equals(marker.getType())) { + throw new IllegalStateException(String.format("Marker of %s does not match SortedRangeSet of %s", + marker.getType(), 
getType())); + } + } + + @Override + public int hashCode() + { + return Objects.hash(lowIndexedRanges, nullAllowed); + } + + @Override + public boolean equals(Object obj) + { + if (this == obj) { + return true; + } + if (obj == null || getClass() != obj.getClass()) { + return false; + } + + final SortedRangeSet other = (SortedRangeSet) obj; + if (this.nullAllowed != other.isNullAllowed()) { + return false; + } + + return Objects.equals(this.lowIndexedRanges, other.lowIndexedRanges); + } + + @Override + public String toString() + { + return com.google.common.base.MoreObjects.toStringHelper(this) + .add("type", type) + .add("nullAllowed", nullAllowed) + .add("lowIndexedRanges", lowIndexedRanges) + .toString(); + } + + public static Builder newBuilder(ArrowType type, boolean nullAllowed) + { + return new Builder(type, nullAllowed); + } + + public static class Builder + { + private final ArrowType type; + private final boolean nullAllowed; + private final List ranges = new ArrayList<>(); + + Builder(ArrowType type, boolean nullAllowed) + { + requireNonNull(type, "type is null"); + this.type = type; + this.nullAllowed = nullAllowed; + } + + public Builder add(Range range) + { + if (!type.equals(range.getType())) { + throw new IllegalArgumentException(String.format("Range type %s does not match builder type %s", + range.getType(), type)); + } + + ranges.add(range); + return this; + } + + public Builder addAll(Iterable arg) + { + for (Range range : arg) { + add(range); + } + return this; + } + + public SortedRangeSet build() + { + Collections.sort(ranges, Comparator.comparing(Range::getLow)); + + NavigableMap result = new TreeMap<>(); + + Range current = null; + for (Range next : ranges) { + if (current == null) { + current = next; + continue; + } + + if (current.overlaps(next) || current.getHigh().isAdjacent(next.getLow())) { + current = current.span(next); + } + else { + result.put(current.getLow(), current); + current = next; + } + } + + if (current != null) { + result.put(current.getLow(), current); + } + + return new SortedRangeSet(type, result, nullAllowed); + } + } + + @Override + public void close() + throws Exception + { + for (Map.Entry next : lowIndexedRanges.entrySet()) { + next.getKey().close(); + next.getValue().close(); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/ValueSet.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/ValueSet.java new file mode 100644 index 0000000000..60ee9a72e7 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/predicate/ValueSet.java @@ -0,0 +1,99 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.fasterxml.jackson.annotation.JsonSubTypes; +import com.fasterxml.jackson.annotation.JsonTypeInfo; +import org.apache.arrow.vector.types.pojo.ArrowType; + +import java.beans.Transient; +import java.util.Collection; + +@JsonTypeInfo( + use = JsonTypeInfo.Id.NAME, + include = JsonTypeInfo.As.PROPERTY, + property = "@type") +@JsonSubTypes({ + @JsonSubTypes.Type(value = EquatableValueSet.class, name = "equatable"), + @JsonSubTypes.Type(value = SortedRangeSet.class, name = "sortable"), + @JsonSubTypes.Type(value = AllOrNoneValueSet.class, name = "allOrNone"), +}) +public interface ValueSet + extends AutoCloseable +{ + ArrowType getType(); + + @Transient + boolean isNone(); + + @Transient + boolean isAll(); + + @Transient + boolean isSingleValue(); + + @Transient + Object getSingleValue(); + + boolean isNullAllowed(); + + boolean containsValue(Marker value); + + /** + * @return range predicates for orderable Types + */ + @Transient + default Ranges getRanges() + { + throw new UnsupportedOperationException(); + } + + ValueSet intersect(BlockAllocator allocator, ValueSet other); + + ValueSet union(BlockAllocator allocator, ValueSet other); + + default ValueSet union(BlockAllocator allocator, Collection valueSets) + { + ValueSet current = this; + for (ValueSet valueSet : valueSets) { + current = current.union(allocator, valueSet); + } + return current; + } + + ValueSet complement(BlockAllocator allocator); + + default boolean overlaps(BlockAllocator allocator, ValueSet other) + { + return !this.intersect(allocator, other).isNone(); + } + + default ValueSet subtract(BlockAllocator allocator, ValueSet other) + { + return this.intersect(allocator, other.complement(allocator)); + } + + default boolean contains(BlockAllocator allocator, ValueSet other) + { + return this.union(allocator, other).equals(this); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/spill/S3SpillLocation.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/spill/S3SpillLocation.java new file mode 100644 index 0000000000..6f5f449820 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/spill/S3SpillLocation.java @@ -0,0 +1,176 @@ +package com.amazonaws.athena.connector.lambda.domain.spill; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import java.util.Objects; + +/** + * Defines a SpillLocation that is backed by S3. + */ +public class S3SpillLocation + implements SpillLocation +{ + private static final String SEPARATOR = "/"; + //The S3 bucket where we may have spilled data. 
+    private final String bucket;
+    //The S3 key where we may have spilled data.
+    private final String key;
+    //If true the Key is actually a key prefix for a location that may have multiple blocks.
+    private final boolean directory;
+
+    /**
+     * Constructs an S3 SpillLocation.
+     *
+     * @param bucket The S3 bucket that is the root of the spill location.
+     * @param key The S3 key that represents the spill location.
+     * @param directory Boolean that if True indicates the key is a prefix (aka directory) where multiple Blocks may
+     * be spilled.
+     */
+    @JsonCreator
+    public S3SpillLocation(@JsonProperty("bucket") String bucket,
+            @JsonProperty("key") String key,
+            @JsonProperty("directory") boolean directory)
+    {
+        this.bucket = bucket;
+        this.key = key;
+        this.directory = directory;
+    }
+
+    /**
+     * The S3 bucket that we may have spilled data to.
+     *
+     * @return String containing the S3 bucket name.
+     */
+    @JsonProperty
+    public String getBucket()
+    {
+        return bucket;
+    }
+
+    /**
+     * The S3 key that we may have spilled data to.
+     *
+     * @return String containing the S3 key.
+     */
+    @JsonProperty
+    public String getKey()
+    {
+        return key;
+    }
+
+    /**
+     * Indicates if the Key is actually a key prefix for a location that may have multiple blocks.
+     *
+     * @return True if the key is actually a prefix for a location that may have multiple blocks, False if the location
+     * points to a specific S3 object.
+     */
+    @JsonProperty
+    public boolean isDirectory()
+    {
+        return directory;
+    }
+
+    @Override
+    public String toString()
+    {
+        return "S3SpillLocation{" +
+                "bucket='" + bucket + '\'' +
+                ", key='" + key + '\'' +
+                ", directory=" + directory +
+                '}';
+    }
+
+    public static Builder newBuilder()
+    {
+        return new Builder();
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+        S3SpillLocation that = (S3SpillLocation) o;
+        return isDirectory() == that.isDirectory() &&
+                Objects.equals(getBucket(), that.getBucket()) &&
+                Objects.equals(getKey(), that.getKey());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hash(getBucket(), getKey(), isDirectory());
+    }
+
+    public static class Builder
+    {
+        private String bucket;
+        private String prefix;
+        private String queryId;
+        private String splitId;
+        private boolean isDirectory = true;
+
+        private Builder() {}
+
+        public Builder withBucket(String bucket)
+        {
+            this.bucket = bucket;
+            return this;
+        }
+
+        public Builder withPrefix(String prefix)
+        {
+            this.prefix = prefix;
+            return this;
+        }
+
+        public Builder withIsDirectory(boolean isDirectory)
+        {
+            this.isDirectory = isDirectory;
+            return this;
+        }
+
+        public Builder withQueryId(String queryId)
+        {
+            this.queryId = queryId;
+            return this;
+        }
+
+        public Builder withSplitId(String splitId)
+        {
+            this.splitId = splitId;
+            return this;
+        }
+
+        public S3SpillLocation build()
+        {
+            String key = prefix + SEPARATOR + queryId + SEPARATOR + splitId;
+            //honor withIsDirectory(...) rather than hard-coding true
+            return new S3SpillLocation(bucket, key, isDirectory);
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/spill/SpillLocation.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/spill/SpillLocation.java
new file mode 100644
index 0000000000..d4116c69f1
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/domain/spill/SpillLocation.java
@@ -0,0 +1,37 @@
+package com.amazonaws.athena.connector.lambda.domain.spill;
+
+/*-
+ * #%L
+ *
Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.fasterxml.jackson.annotation.JsonIgnoreProperties; +import com.fasterxml.jackson.annotation.JsonSubTypes; +import com.fasterxml.jackson.annotation.JsonTypeInfo; + +/** + * Used to tag different types of spill locations. + */ +@JsonIgnoreProperties(ignoreUnknown = true) +@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY) +@JsonSubTypes({ + @JsonSubTypes.Type(value = S3SpillLocation.class, name = "S3SpillLocation") +}) +public interface SpillLocation +{ +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ContinuationToken.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ContinuationToken.java new file mode 100644 index 0000000000..fe0711c45c --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ContinuationToken.java @@ -0,0 +1,104 @@ +package com.amazonaws.athena.connector.lambda.examples; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +/** + * All items in the "com.amazonaws.athena.connector.lambda.examples" that this class belongs to are part of an + * 'Example' connector. We do not recommend using any of the classes in this package directly. Instead you can/should + * copy and modify as needed. + *
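+ * As a quick orientation, the encoded token is just {@code partition + ":" + part}. A hypothetical round
+ * trip through the methods defined below:
+ * <pre>{@code
+ *   String token = ContinuationToken.encode(3, 7);             // "3:7"
+ *   ContinuationToken resumed = ContinuationToken.decode(token);
+ *   assert resumed.getPartition() == 3 && resumed.getPart() == 7;
+ *   ContinuationToken fresh = ContinuationToken.decode(null);  // no token yet -> partition 0, part 0
+ * }</pre>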

+ * This class is used to define the continuation token used by doGetSplits as well as logic for serializing and deserializing + * said token. + */ +public class ContinuationToken +{ + private static final String CONTINUATION_TOKEN_DIVIDER = ":"; + private final int partition; + private final int part; + + /** + * Basic constructor. + * + * @param partition The last partition we processed. + * @param part The next part to process. + */ + public ContinuationToken(int partition, int part) + { + this.partition = partition; + this.part = part; + } + + /** + * The partition in this token. + * + * @return int containing the last processed partition. + */ + public int getPartition() + { + return partition; + } + + /** + * The part in this token. + * + * @return int containing the next part to process. + */ + public int getPart() + { + return part; + } + + /** + * Decodes the provided String representation of a ContinuationToken into a ContinuationToken + * + * @param token An encoded ContinuationToken. + * @return The ContinuationToken represented by the token string. + */ + public static ContinuationToken decode(String token) + { + if (token != null) { + //if we have a continuation token, lets decode it. The format of this token is owned by this class + String[] tokenParts = token.split(CONTINUATION_TOKEN_DIVIDER); + + if (tokenParts.length != 2) { + throw new RuntimeException("Unable to decode continuation token " + token); + } + + int partition = Integer.valueOf(tokenParts[0]); + return new ContinuationToken(partition, Integer.valueOf(tokenParts[1])); + } + + //No continuation token present + return new ContinuationToken(0, 0); + } + + /** + * Encodes the provided partition and part into a string representation of ContinuationToken + * + * @param partition The last partition we processed. + * @param part The next part to process. + * @return The String representation of a ContinuationToken; + */ + public static String encode(int partition, int part) + { + return partition + CONTINUATION_TOKEN_DIVIDER + part; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleCompositeHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleCompositeHandler.java new file mode 100644 index 0000000000..2544fc95da --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleCompositeHandler.java @@ -0,0 +1,36 @@ +package com.amazonaws.athena.connector.lambda.examples; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose ExampleMetadataHandler and ExampleRecordHandler. 
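+ * <p>
+ * Practically, this means a deployment can point its Lambda handler at one class (illustratively,
+ * {@code com.amazonaws.athena.connector.lambda.examples.ExampleCompositeHandler}) instead of deploying
+ * separate functions for metadata and record requests.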
+ */ +public class ExampleCompositeHandler + extends CompositeHandler +{ + public ExampleCompositeHandler() + { + super(new ExampleMetadataHandler(), new ExampleRecordHandler()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleMetadataHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleMetadataHandler.java new file mode 100644 index 0000000000..7b4f5c6589 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleMetadataHandler.java @@ -0,0 +1,449 @@ +package com.amazonaws.athena.connector.lambda.examples; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.exceptions.FederationThrottleException; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.request.PingRequest; +import com.amazonaws.athena.connector.lambda.security.EncryptionKey; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.DateUnit; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; 
+import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; + +/** + * All items in the "com.amazonaws.athena.connector.lambda.examples" that this class belongs to are part of an + * 'Example' connector. We do not recommend using any of the classes in this package directly. Instead you can/should + * copy and modify as needed. + *

+ * This class defines an example MetadataHandler that supports a single schema and single table, and showcases most
+ * of the features offered by the Amazon Athena Query Federation SDK. Some notable characteristics include:
+ * 1. Highly partitioned table.
+ * 2. Paginated split generation.
+ * 3. S3 Spill support.
+ * 4. Spill encryption using either KMS KeyFactory or LocalKeyFactory.
+ * 5. A wide range of field types including complex Struct and List types.
+ *
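+ * <p>
+ * As a hypothetical deployment example, setting the Lambda environment variable {@code SIMULATE_THROTTLES=50}
+ * makes this handler throw a FederationThrottleException on roughly every 50th doGetSplits(...) call, which is a
+ * convenient way to observe Athena's congestion control and retry behavior against this connector.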

+ * + * @note All schema names, table names, and column names must be lower case at this time. Any entities that are uppercase or + * mixed case will not be accessible in queries and will be lower cased by Athena's engine to ensure consistency across + * sources. As such you may need to handle this when integrating with a source that supports mixed case. As an example, + * you can look at the CloudwatchTableResolver in the athena-cloudwatch module for one potential approach to this challenge. + *
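+ * <p>
+ * A minimal sketch of the case-folding concern above (the map contents are hypothetical): a connector fronting a
+ * mixed-case source can keep a mapping from the lower-cased names Athena sends to the source's true names,
+ * e.g. {@code caseMap.put("mytable", "MyTable")}, and resolve through it in doGetTable(...).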

+ * @see MetadataHandler
+ */
+public class ExampleMetadataHandler
+        extends MetadataHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(ExampleMetadataHandler.class);
+    //Used to aid in diagnostic logging
+    private static final String SOURCE_TYPE = "custom";
+    //The name of the Lambda environment variable that toggles generating simulated throttling events to trigger Athena's
+    //congestion control logic.
+    private static final String SIMULATE_THROTTLES = "SIMULATE_THROTTLES";
+    //The number of splits to generate for each Partition. Keep in mind this connector generates random data; a real
+    //source is unlikely to have such a setting.
+    protected static final int NUM_PARTS_PER_SPLIT = 10;
+    //This is used to illustrate how to use continuation tokens to handle partitions that generate a large number
+    //of splits. This helps avoid hitting the Lambda response size limit.
+    protected static final int MAX_SPLITS_PER_REQUEST = 300;
+    //Field name for storing partition location information.
+    protected static final String PARTITION_LOCATION = "location";
+    //Field name for storing an example property on our partitions and splits.
+    protected static final String SERDE = "serde";
+
+    //Stores how frequently to generate a simulated throttling event.
+    private final int simulateThrottle;
+    //Controls if spill encryption should be enabled or disabled.
+    private boolean encryptionEnabled = true;
+    //Counter that is used in conjunction with simulateThrottle to generate simulated throttling events.
+    private int count = 0;
+
+    /**
+     * Default constructor used by Lambda.
+     */
+    public ExampleMetadataHandler()
+    {
+        super(SOURCE_TYPE);
+        this.simulateThrottle = (System.getenv(SIMULATE_THROTTLES) == null) ? 0 : Integer.parseInt(System.getenv(SIMULATE_THROTTLES));
+    }
+
+    /**
+     * Full DI constructor used mostly for testing.
+     *
+     * @param keyFactory The EncryptionKeyFactory to use for spill encryption.
+     * @param awsSecretsManager The AWSSecretsManager client that can be used when attempting to resolve secrets.
+     * @param athena The Athena client that can be used to fetch query termination status to fast-fail this handler.
+     * @param spillBucket The S3 Bucket to use when spilling results.
+     * @param spillPrefix The S3 prefix to use when spilling results.
+     */
+    @VisibleForTesting
+    protected ExampleMetadataHandler(EncryptionKeyFactory keyFactory,
+            AWSSecretsManager awsSecretsManager,
+            AmazonAthena athena,
+            String spillBucket,
+            String spillPrefix)
+    {
+        super(keyFactory, awsSecretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix);
+        //Read the Lambda environment variable for controlling simulated throttles.
+        this.simulateThrottle = (System.getenv(SIMULATE_THROTTLES) == null) ? 0 : Integer.parseInt(System.getenv(SIMULATE_THROTTLES));
+    }
+
+    /**
+     * Used to toggle encryption during unit tests.
+     *
+     * @param enableEncryption True to enable spill encryption, False to disable it.
+     */
+    @VisibleForTesting
+    protected void setEncryption(boolean enableEncryption)
+    {
+        this.encryptionEnabled = enableEncryption;
+    }
+
+    /**
+     * Demonstrates how you can capture the identity of the caller that ran the Athena query which triggered the Lambda invocation.
+     *
+     * @param request The request whose caller identity we log.
+     */
+    private void logCaller(FederationRequest request)
+    {
+        FederatedIdentity identity = request.getIdentity();
+        logger.info("logCaller: account[" + identity.getAccount() + "] id[" + identity.getId() + "] principal[" + identity.getPrincipal() + "]");
+    }
+
+    /**
+     * Returns a static, single schema.
A connector for a real data source would likely query that source's metadata
+     * to create a real list of schemas.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @return The ListSchemasResponse which mostly contains the list of schemas (aka databases).
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator allocator, ListSchemasRequest request)
+    {
+        logCaller(request);
+        List<String> schemas = new ArrayList<>();
+        schemas.add(ExampleTable.schemaName);
+        return new ListSchemasResponse(request.getCatalogName(), schemas);
+    }
+
+    /**
+     * Returns a static list of TableNames. A connector for a real data source would likely query that source's metadata
+     * to create a real list of TableNames for the requested schema name.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog and database they are querying.
+     * @return A ListTablesResponse containing the list of available TableNames.
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator allocator, ListTablesRequest request)
+    {
+        logCaller(request);
+        List<TableName> tables = new ArrayList<>();
+        tables.add(new TableName(ExampleTable.schemaName, ExampleTable.tableName));
+
+        //The below filter for a null schema is not typical; we do this to generate a specific semantic error
+        //that is exercised in our unit test suite.
+        return new ListTablesResponse(request.getCatalogName(),
+                tables.stream()
+                        .filter(table -> request.getSchemaName() == null || request.getSchemaName().equals(table.getSchemaName()))
+                        .collect(Collectors.toList()));
+    }
+
+    /**
+     * Retrieves a static Table schema for the example table. A connector for a real data source would likely query that
+     * source's metadata to create a table definition.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog, database, and table they are querying.
+     * @return A GetTableResponse containing the definition of the table (e.g. table schema and partition columns)
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator allocator, GetTableRequest request)
+    {
+        logCaller(request);
+        if (!request.getTableName().getSchemaName().equals(ExampleTable.schemaName) ||
+                !request.getTableName().getTableName().equals(ExampleTable.tableName)) {
+            throw new IllegalArgumentException("Unknown table " + request.getTableName());
+        }
+
+        Set<String> partitionCols = new HashSet<>();
+        partitionCols.add("month");
+        partitionCols.add("year");
+        partitionCols.add("day");
+        return new GetTableResponse(request.getCatalogName(), request.getTableName(), ExampleTable.schema, partitionCols);
+    }
+
+    /**
+     * Here we inject the two additional columns we define for partition metadata. These columns are ignored by
+     * Athena but passed along to our code when Athena calls GetSplits(...). If you do not require any additional
+     * metadata on your partitions you may choose not to implement this function.
+     *
+     * @param partitionSchemaBuilder The SchemaBuilder you can use to add additional columns and metadata to the
+     * partitions response.
+     * @param request The GetTableLayoutRequest that triggered this call.
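+     * <p>
+     * The fields added below can then be read back per-partition when planning splits; a sketch using this
+     * file's own constants ({@code curPartition} is a hypothetical loop variable, as in doGetSplits below):
+     * <pre>{@code
+     *   FieldReader locationReader = partitions.getFieldReader(PARTITION_LOCATION);
+     *   locationReader.setPosition(curPartition);
+     *   String location = String.valueOf(locationReader.readText());
+     * }</pre>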
+     */
+    @Override
+    public void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request)
+    {
+        /**
+         * Add any additional fields we might need to our partition response schema.
+         * These additional fields are ignored by Athena but will be passed to GetSplits(...)
+         * when Athena calls our lambda function to plan the distributed read of our partitions.
+         */
+        partitionSchemaBuilder.addField(PARTITION_LOCATION, new ArrowType.Utf8())
+                .addField(SERDE, new ArrowType.Utf8());
+    }
+
+    /**
+     * Our example table is partitioned on year, month, and day, so we loop over a range of years, months, and days to generate
+     * our example partitions. A connector for a real data source would likely query that source's metadata
+     * to create a real list of partitions.
+     *
+     * @param writer Used to write rows (partitions) into the Apache Arrow response. The writes are automatically constrained.
+     * @param request Provides details of the catalog, database, and table being queried as well as any filter predicate.
+     * @param queryStatusChecker Used to check if the query is still running so that we can stop generating partitions early.
+     */
+    @Override
+    public void getPartitions(BlockWriter writer, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+    {
+        logCaller(request);
+
+        /**
+         * Now use the constraint that was in the request to do some partition pruning. Here we are just
+         * generating some fake values for the partitions but in a real implementation you'd use your metastore
+         * or knowledge of the actual table's physical layout to do this.
+         */
+        for (int year = 1990; year < 2020; year++) {
+            for (int month = 0; month < 12; month++) {
+                for (int day = 0; day < 30; day++) {
+                    final int dayVal = day;
+                    final int monthVal = month;
+                    final int yearVal = year;
+                    writer.writeRows((Block block, int rowNum) -> {
+                        //these are our partition columns and were defined by the call to doGetTable(...)
+                        boolean matched = true;
+                        matched &= block.setValue("day", rowNum, dayVal);
+                        matched &= block.setValue("month", rowNum, monthVal);
+                        matched &= block.setValue("year", rowNum, yearVal);
+
+                        //these are the additional fields we added by overriding enhancePartitionSchema(...)
+                        matched &= block.setValue(PARTITION_LOCATION, rowNum, "s3://" + request.getPartitionCols());
+                        matched &= block.setValue(SERDE, rowNum, "TextInputFormat");
+
+                        //if all fields passed then we wrote 1 row
+                        return matched ? 1 : 0;
+                    });
+                }
+            }
+        }
+    }
+
+    /**
+     * For each partition we generate a pre-determined number of splits based on the NUM_PARTS_PER_SPLIT setting. This
+     * method also demonstrates how to handle calls for batches of partitions and how to leverage this API's ability
+     * to paginate. A connector for a real data source would likely query that source's metadata to determine if/how
+     * to split up the read operations for a particular partition.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details of the catalog, database, table, and partition(s) being queried as well as
+     * any filter predicate.
+     * @return A GetSplitsResponse which contains a list of splits as well as an optional continuation token if we were not
+     * able to generate all splits for the partitions in this batch.
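+     * <p>
+     * Pagination sketch: when this method returns a non-null continuation token, Athena calls doGetSplits(...)
+     * again with that token on the request. For example, a returned token of {@code "12:3"} decodes (via
+     * ContinuationToken.decode) to partition 12, part 3, and split generation resumes from that point.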
+ */ + @Override + public GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request) + { + logCaller(request); + logger.info("doGetSplits: spill location " + makeSpillLocation(request)); + + /** + * It is important to try and throw any throttling events before writing data since Athena may not be able to + * continue the query, due to consistency errors, if you throttle after writing data. + */ + if (simulateThrottle > 0 && count++ % simulateThrottle == 0) { + logger.info("readWithConstraint: throwing throttle Exception!"); + throw new FederationThrottleException("Please slow down for this simulated throttling event"); + } + + ContinuationToken requestToken = ContinuationToken.decode(request.getContinuationToken()); + int partitionContd = requestToken.getPartition(); + int partContd = requestToken.getPart(); + + Set splits = new HashSet<>(); + Block partitions = request.getPartitions(); + for (int curPartition = partitionContd; curPartition < partitions.getRowCount(); curPartition++) { + //We use the makeEncryptionKey() method from our parent class to make an EncryptionKey + EncryptionKey encryptionKey = makeEncryptionKey(); + + //We prepare to read our custom metadata fields from the partition so that we can pass this info to the split(s) + FieldReader locationReader = partitions.getFieldReader(SplitProperties.LOCATION.getId()); + locationReader.setPosition(curPartition); + FieldReader storageClassReader = partitions.getFieldReader(SplitProperties.SERDE.getId()); + storageClassReader.setPosition(curPartition); + + //Do something to decide if this partition needs to be subdivided into multiple, possibly concurrent, + //table scan operations (aka splits) + for (int curPart = partContd; curPart < NUM_PARTS_PER_SPLIT; curPart++) { + if (splits.size() >= MAX_SPLITS_PER_REQUEST) { + //We exceeded the number of split we want to return in a single request, return and provide + //a continuation token. + return new GetSplitsResponse(request.getCatalogName(), + splits, + ContinuationToken.encode(curPartition, curPart)); + } + + //We use makeSpillLocation(...) from our parent class to get a unique SpillLocation for each split + Split.Builder splitBuilder = Split.newBuilder(makeSpillLocation(request), encryptionEnabled ? encryptionKey : null) + .add(SplitProperties.LOCATION.getId(), String.valueOf(locationReader.readText())) + .add(SplitProperties.SERDE.getId(), String.valueOf(storageClassReader.readText())) + .add(SplitProperties.SPLIT_PART.getId(), String.valueOf(curPart)); + + //Add the partition column values to the split's properties. + //We are doing this because our example record reader depends on it, your specific needs + //will likely vary. Our example only supports a limited number of partition column types. + for (String next : request.getPartitionCols()) { + FieldReader reader = partitions.getFieldReader(next); + reader.setPosition(curPartition); + + switch (reader.getMinorType()) { + case UINT2: + splitBuilder.add(next, Integer.valueOf(reader.readCharacter()).toString()); + break; + case UINT4: + case INT: + splitBuilder.add(next, String.valueOf(reader.readInteger())); + break; + case UINT8: + case BIGINT: + splitBuilder.add(next, String.valueOf(reader.readLong())); + break; + default: + throw new RuntimeException("Unsupported partition column type. " + reader.getMinorType()); + } + } + + splits.add(splitBuilder.build()); + } + + //part continuation only applies within a partition so we complete that partial partition and move on + //to the next one. 
+ partContd = 0; + } + + return new GetSplitsResponse(request.getCatalogName(), splits, null); + } + + /** + * We use the ping signal to simply log the fact that a ping request came in. + * + * @param request The PingRequest. + */ + public void onPing(PingRequest request) + { + logCaller(request); + } + + /** + * We use this as our static metastore for the example implementation + */ + protected static class ExampleTable + { + public static final String schemaName = "custom_source"; + public static final String tableName = "fake_table"; + public static final Schema schema; + + static { + schema = new SchemaBuilder().newBuilder() + .addField("col1", new ArrowType.Date(DateUnit.DAY)) + .addField("day", new ArrowType.Int(32, true)) + .addField("month", new ArrowType.Int(32, true)) + .addField("year", new ArrowType.Int(32, true)) + .addField("col3", new ArrowType.Bool()) + .addField("col4", new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)) + .addField("col5", new ArrowType.Utf8()) + .addField("datemilli", Types.MinorType.DATEMILLI.getType()) + .addField("int", Types.MinorType.INT.getType()) + .addField("tinyint", Types.MinorType.TINYINT.getType()) + .addField("smallint", Types.MinorType.SMALLINT.getType()) + .addField("bigint", Types.MinorType.BIGINT.getType()) + .addField("float4", Types.MinorType.FLOAT4.getType()) + .addField("float8", Types.MinorType.FLOAT8.getType()) + .addField("bit", Types.MinorType.BIT.getType()) + .addField("varchar", Types.MinorType.VARCHAR.getType()) + .addField("varbinary", Types.MinorType.VARBINARY.getType()) + .addField("decimal", new ArrowType.Decimal(10, 2)) + .addField("decimalLong", new ArrowType.Decimal(36, 2)) + //Example of a List of Structs + .addField( + FieldBuilder.newBuilder("list", new ArrowType.List()) + .addField( + FieldBuilder.newBuilder("innerStruct", Types.MinorType.STRUCT.getType()) + .addStringField("varchar") + .addBigIntField("bigint") + .build()) + .build()) + //Example of a List Of Lists + .addField( + FieldBuilder.newBuilder("outerlist", new ArrowType.List()) + .addListField("innerList", Types.MinorType.VARCHAR.getType()) + .build()) + .addMetadata("partitionCols", "day,month,year") + .addMetadata("randomProp1", "randomPropVal1") + .addMetadata("randomProp2", "randomPropVal2").build(); + } + + private ExampleTable() {} + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleRecordHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleRecordHandler.java new file mode 100644 index 0000000000..4e1b058725 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleRecordHandler.java @@ -0,0 +1,309 @@ +package com.amazonaws.athena.connector.lambda.examples; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.exceptions.FederationThrottleException;
+import com.amazonaws.athena.connector.lambda.handlers.RecordHandler;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.athena.connector.lambda.request.FederationRequest;
+import com.amazonaws.athena.connector.lambda.request.PingRequest;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.amazonaws.services.athena.AmazonAthena;
+import com.amazonaws.services.athena.AmazonAthenaClientBuilder;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.s3.AmazonS3ClientBuilder;
+import com.amazonaws.services.secretsmanager.AWSSecretsManager;
+import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder;
+import org.apache.arrow.util.VisibleForTesting;
+import org.apache.arrow.vector.FieldVector;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Map;
+import java.util.Set;
+
+/**
+ * All items in the "com.amazonaws.athena.connector.lambda.examples" package that this class belongs to are part of an
+ * 'Example' connector. We do not recommend using any of the classes in this package directly. Instead you can/should
+ * copy and modify as needed.
+ *
+ * More specifically, this class is responsible for providing Athena with actual row level data from our simulated
+ * source. Athena will call readWithConstraint(...) on this class for each 'Split' we generated in ExampleMetadataHandler.
+ *
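+ * As a rough sketch (the column name and value are hypothetical), the heart of such a handler is a
+ * readWithConstraint(...) that pushes rows through the BlockSpiller it is given:
+ * <pre>{@code
+ * protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest request, QueryStatusChecker checker)
+ * {
+ *     spiller.writeRows((Block block, int rowNum) -> {
+ *         boolean matched = block.setValue("col1", rowNum, 100);  //hypothetical column and value
+ *         return matched ? 1 : 0;  //tell the spiller how many rows were written
+ *     });
+ * }
+ * }</pre>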
+ *
+ * @see com.amazonaws.athena.connector.lambda.handlers.RecordHandler
+ */
+public class ExampleRecordHandler
+        extends RecordHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(ExampleRecordHandler.class);
+    //used in diagnostic logging
+    private static final String SOURCE_TYPE = "custom";
+    //The name of the environment variable to read for the number of rows to generate per Split instead of
+    //the default below.
+    private static final String NUM_ROWS_PER_SPLIT = "NUM_ROWS_PER_SPLIT";
+    //The name of the Lambda Environment variable that toggles generating simulated Throttling events to trigger Athena's
+    //Congestion control logic.
+    private static final String SIMULATE_THROTTLES = "SIMULATE_THROTTLES";
+    //The number of rows to generate per Split
+    private int numRowsPerSplit = 400_000;
+    //Stores how frequently to generate a simulated throttling event.
+    private final int simulateThrottle;
+    //Counter that is used in conjunction with simulateThrottle to generate simulated throttling events.
+    private int count = 0;
+
+    /**
+     * Default constructor used by Lambda.
+     */
+    public ExampleRecordHandler()
+    {
+        this(AmazonS3ClientBuilder.defaultClient(), AWSSecretsManagerClientBuilder.defaultClient(), AmazonAthenaClientBuilder.defaultClient());
+        if (System.getenv(NUM_ROWS_PER_SPLIT) != null) {
+            numRowsPerSplit = Integer.parseInt(System.getenv(NUM_ROWS_PER_SPLIT));
+        }
+    }
+
+    /**
+     * Full DI constructor used mostly for testing.
+     *
+     * @param amazonS3 The AmazonS3 client to use for spills.
+     * @param secretsManager The AWSSecretsManager client that can be used when attempting to resolve secrets.
+     * @param athena The Athena client that can be used to fetch query termination status to fast-fail this handler.
+     */
+    @VisibleForTesting
+    protected ExampleRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena)
+    {
+        super(amazonS3, secretsManager, athena, SOURCE_TYPE);
+        this.simulateThrottle = (System.getenv(SIMULATE_THROTTLES) == null) ? 0 : Integer.parseInt(System.getenv(SIMULATE_THROTTLES));
+    }
+
+    /**
+     * Used to set the number of rows per split. This method is mostly used for testing where setting the environment
+     * variable to override the default is not practical.
+     *
+     * @param numRows The number of rows to generate per split.
+     */
+    @VisibleForTesting
+    protected void setNumRows(int numRows)
+    {
+        this.numRowsPerSplit = numRows;
+    }
+
+    /**
+     * Demonstrates how you can capture the identity of the caller that ran the Athena query which triggered the Lambda invocation.
+     *
+     * @param request The request from which we extract and log the caller's identity.
+     */
+    private void logCaller(FederationRequest request)
+    {
+        FederatedIdentity identity = request.getIdentity();
+        logger.info("logCaller: account[" + identity.getAccount() + "] id[" + identity.getId() + "] principal[" + identity.getPrincipal() + "]");
+    }
+
+    /**
+     * We use the ping signal to simply log the fact that a ping request came in.
+     *
+     * @param request The PingRequest.
+     */
+    protected void onPing(PingRequest request)
+    {
+        logCaller(request);
+    }
+
+    /**
+     * Here we generate our simulated row data. A real connector would instead connect to the actual source and read
+     * the data corresponding to the requested split.
+     *
+     * @param spiller A BlockSpiller that should be used to write the row data associated with this Split.
+     * The BlockSpiller automatically handles applying constraints, chunking the response, encrypting, and spilling to S3.
+     * @param request The ReadRecordsRequest containing the split and other details about what to read.
+     * @param queryStatusChecker A QueryStatusChecker that you can use to stop doing work for a query that has already terminated.
+     */
+    @Override
+    protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest request, QueryStatusChecker queryStatusChecker)
+    {
+        /**
+         * It is important to try and throw any throttling events before writing data since Athena may not be able to
+         * continue the query, due to consistency errors, if you throttle after writing data.
+         */
+        if (simulateThrottle > 0 && count++ % simulateThrottle == 0) {
+            logger.info("readWithConstraint: throwing throttle Exception!");
+            throw new FederationThrottleException("Please slow down for this simulated throttling event");
+        }
+
+        logCaller(request);
+        for (int i = 0; i < numRowsPerSplit; i++) {
+            if (!queryStatusChecker.isQueryRunning()) {
+                return;
+            }
+            final int seed = i;
+            spiller.writeRows((Block block, int rowNum) -> {
+                //This is just filling the row with random data and then partition values that match the split
+                //in a real implementation you would read your real data.
+                boolean rowMatched = makeRandomRow(block, rowNum, seed);
+                addPartitionColumns(request.getSplit(), block, rowNum);
+                return rowMatched ? 1 : 0;
+            });
+        }
+    }
+
+    /**
+     * Helper function that we use to ensure the partition column values are not randomly generated and instead
+     * correspond to the partition that the Split belongs to. This is important because if they do not match
+     * then the rows will likely get filtered out of the result. This method is only applicable to our random
+     * row data as a real connector would not have to worry about a mismatch of these values because they would
+     * of course match their storage.
+     *
+     * @param split The Split that we are generating partition column values for.
+     * @param block The Block we need to write the partition column values into.
+     * @param blockRow The row we need to write the partition column values into.
+     */
+    private void addPartitionColumns(Split split, Block block, int blockRow)
+    {
+        for (String nextPartition : ExampleMetadataHandler.ExampleTable.schema.getCustomMetadata().get("partitionCols").split(",")) {
+            FieldVector vector = block.getFieldVector(nextPartition);
+            if (vector != null) {
+                switch (vector.getMinorType()) {
+                    case INT:
+                    case UINT2:
+                    case BIGINT:
+                        block.setValue(nextPartition, blockRow, Integer.valueOf(split.getProperty(nextPartition)));
+                        break;
+                    default:
+                        throw new RuntimeException(vector.getMinorType() + " is not supported");
+                }
+            }
+        }
+    }
+
+    /**
+     * This should be replaced with something that actually reads useful data.
+     */
+    private boolean makeRandomRow(Block block, int blockRow, int seed)
+    {
+        Set<String> partitionCols = new HashSet<>();
+        String partitionColsMetadata = block.getSchema().getCustomMetadata().get("partitionCols");
+        if (partitionColsMetadata != null) {
+            partitionCols.addAll(Arrays.asList(partitionColsMetadata.split(",")));
+        }
+
+        boolean matches = true;
+        for (Field next : block.getSchema().getFields()) {
+            String fieldName = next.getName();
+            if (!partitionCols.contains(fieldName)) {
+                if (!matches) {
+                    return false;
+                }
+                boolean negative = seed % 2 == 1;
+                Types.MinorType fieldType = Types.getMinorTypeForArrowType(next.getType());
+                switch (fieldType) {
+                    case INT:
+                        int iVal = seed * (negative ? -1 : 1);
+                        matches &= block.setValue(fieldName, blockRow, iVal);
+                        break;
+                    case DATEMILLI:
+                        matches &= block.setValue(fieldName, blockRow, 100_000L);
+                        break;
+                    case DATEDAY:
+                        matches &= block.setValue(fieldName, blockRow, 100_000);
+                        break;
+                    case TINYINT:
+                    case SMALLINT:
+                        int stVal = (seed % 4) * (negative ? -1 : 1);
+                        matches &= block.setValue(fieldName, blockRow, stVal);
+                        break;
+                    case UINT1:
+                    case UINT2:
+                    case UINT4:
+                    case UINT8:
+                        int uiVal = seed % 4;
+                        matches &= block.setValue(fieldName, blockRow, uiVal);
+                        break;
+                    case FLOAT4:
+                        float fVal = seed * 1.1f * (negative ? -1 : 1);
+                        matches &= block.setValue(fieldName, blockRow, fVal);
+                        break;
+                    case FLOAT8:
+                    case DECIMAL:
+                        double d8Val = seed * 1.1D * (negative ? -1 : 1);
+                        matches &= block.setValue(fieldName, blockRow, d8Val);
+                        break;
+                    case BIT:
+                        boolean bVal = seed % 2 == 0;
+                        matches &= block.setValue(fieldName, blockRow, bVal);
+                        break;
+                    case BIGINT:
+                        long lVal = seed * 1L * (negative ? -1 : 1);
+                        matches &= block.setValue(fieldName, blockRow, lVal);
+                        break;
+                    case VARCHAR:
+                        String vVal = "VarChar" + seed;
+                        matches &= block.setValue(fieldName, blockRow, vVal);
+                        break;
+                    case VARBINARY:
+                        byte[] binaryVal = ("VarChar" + seed).getBytes();
+                        matches &= block.setValue(fieldName, blockRow, binaryVal);
+                        break;
+                    case LIST:
+                        //This is setup for the specific kinds of lists we have in our example schema,
+                        //it is not universal. List<String> and List<Struct> is what
+                        //this block supports.
+                        Field child = block.getFieldVector(fieldName).getField().getChildren().get(0);
+                        List<Object> value = new ArrayList<>();
+                        Types.MinorType childType = Types.getMinorTypeForArrowType(child.getType());
+                        switch (childType) {
+                            case LIST:
+                                List<String> list = new ArrayList<>();
+                                list.add(String.valueOf(1000));
+                                list.add(String.valueOf(1001));
+                                list.add(String.valueOf(1002));
+                                value.add(list);
+                                break;
+                            case STRUCT:
+                                Map<String, Object> struct = new HashMap<>();
+                                struct.put("varchar", "chars");
+                                struct.put("bigint", 100L);
+                                value.add(struct);
+                                break;
+                            default:
+                                throw new RuntimeException(childType + " is not supported");
+                        }
+                        matches &= block.setComplexValue(fieldName, blockRow, FieldResolver.DEFAULT, value);
+                        break;
+                    default:
+                        throw new RuntimeException(fieldType + " is not supported");
+                }
+            }
+        }
+        return matches;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleUserDefinedFunctionHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleUserDefinedFunctionHandler.java
new file mode 100644
index 0000000000..a6b2184b4e
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/ExampleUserDefinedFunctionHandler.java
@@ -0,0 +1,118 @@
+package com.amazonaws.athena.connector.lambda.examples;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.udf.UserDefinedFunctionHandler;
+import com.google.common.collect.ImmutableMap;
+
+import java.math.BigDecimal;
+import java.math.RoundingMode;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+public class ExampleUserDefinedFunctionHandler
+        extends UserDefinedFunctionHandler
+{
+    public Boolean example_udf(Boolean value)
+    {
+        return !value;
+    }
+
+    public Byte example_udf(Byte value)
+    {
+        return (byte) (value + 1);
+    }
+
+    public Short example_udf(Short value)
+    {
+        return (short) (value + 1);
+    }
+
+    public Integer example_udf(Integer value)
+    {
+        return value + 1;
+    }
+
+    public Long example_udf(Long value)
+    {
+        return value + 1;
+    }
+
+    public Float example_udf(Float value)
+    {
+        return value + 1;
+    }
+
+    public Double example_udf(Double value)
+    {
+        return value + 1;
+    }
+
+    public BigDecimal example_udf(BigDecimal value)
+    {
+        //setScale(...) returns a new BigDecimal rather than mutating in place, so we must keep its result.
+        BigDecimal one = BigDecimal.ONE.setScale(value.scale(), RoundingMode.HALF_UP);
+        return value.add(one);
+    }
+
+    public String example_udf(String value)
+    {
+        return value + "_dada";
+    }
+
+    public LocalDateTime example_udf(LocalDateTime value)
+    {
+        return value.minusDays(1);
+    }
+
+    public LocalDate example_udf(LocalDate value)
+    {
+        return value.minusDays(1);
+    }
+
+    public List<Integer> example_udf(List<Integer> value)
+    {
+        System.out.println("Array input: " + value);
+        List<Integer> result = value.stream().map(o -> o + 1).collect(Collectors.toList());
+        System.out.println("Array output: " + result);
+        return result;
+    }
+
+    public Map<String, Object> example_udf(Map<String, Object> value)
+    {
+        Long longVal = (Long) value.get("x");
+        Double doubleVal = (Double) value.get("y");
+
+        return ImmutableMap.of("x", longVal + 1, "y", doubleVal + 1.0);
+    }
+
+    public byte[] example_udf(byte[] value)
+    {
+        byte[] output = new byte[value.length];
+        for (int i = 0; i < value.length; ++i) {
+            output[i] = (byte) (value[i] + 1);
+        }
+        return output;
+    }
+}
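+
+//A rough sketch (function, column, and Lambda names are hypothetical, and the exact syntax may vary by engine
+//version) of how one of the UDFs above could be invoked from an Athena query once this handler is deployed:
+//
+//  USING EXTERNAL FUNCTION example_udf(value INT) RETURNS INT LAMBDA 'my_udf_lambda'
+//  SELECT example_udf(col1) FROM my_table;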
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/SplitProperties.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/SplitProperties.java
new file mode 100644
index 0000000000..a9bd10bf3f
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/examples/SplitProperties.java
@@ -0,0 +1,40 @@
+package com.amazonaws.athena.connector.lambda.examples;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+public enum SplitProperties
+{
+    LOCATION(ExampleMetadataHandler.PARTITION_LOCATION),
+    SERDE(ExampleMetadataHandler.SERDE),
+    SPLIT_PART("SPLIT_PART");
+
+    private final String id;
+
+    SplitProperties(String id)
+    {
+        this.id = id;
+    }
+
+    public String getId()
+    {
+        return id;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/exceptions/FederationThrottleException.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/exceptions/FederationThrottleException.java
new file mode 100644
index 0000000000..5a920096e4
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/exceptions/FederationThrottleException.java
@@ -0,0 +1,54 @@
+package com.amazonaws.athena.connector.lambda.exceptions;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+/**
+ * Throw this exception if your source is unable to keep up with the rate or concurrency
+ * of requests. Athena constantly monitors the performance of federated sources and
+ * employs a congestion control mechanism to reduce pressure on sources that may be
+ * overwhelmed or unable to keep up. Throwing this exception gives Athena important
+ * back pressure information. Alternatively, you can reduce the concurrency of the
+ * affected Lambda function in the Lambda console, which will cause Lambda to generate
+ * Throttle exceptions for Athena.
+ *
+ * @note If you throw this exception after writing any data Athena may fail the operation
+ * to ensure consistency of your results. This is because Athena eagerly processes
+ * your results but is unsure if two identical calls to your source will produce
+ * the exact same result set (including ordering).
+ */
+public class FederationThrottleException
+        extends RuntimeException
+{
+    public FederationThrottleException()
+    {
+        super();
+    }
+
+    public FederationThrottleException(String message)
+    {
+        super(message);
+    }
+
+    public FederationThrottleException(String message, Throwable cause)
+    {
+        super(message, cause);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/AthenaExceptionFilter.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/AthenaExceptionFilter.java
new file mode 100644
index 0000000000..9a9d7e2c46
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/AthenaExceptionFilter.java
@@ -0,0 +1,37 @@
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connector.lambda.handlers;
+
+import com.amazonaws.athena.connector.lambda.ThrottlingInvoker;
+import com.amazonaws.services.athena.model.TooManyRequestsException;
+
+public class AthenaExceptionFilter
+        implements ThrottlingInvoker.ExceptionFilter
+{
+    public static final ThrottlingInvoker.ExceptionFilter ATHENA_EXCEPTION_FILTER = new AthenaExceptionFilter();
+
+    private AthenaExceptionFilter() {}
+
+    @Override
+    public boolean isMatch(Exception ex)
+    {
+        return ex instanceof TooManyRequestsException;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/CompositeHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/CompositeHandler.java
new file mode 100644
index 0000000000..7638feb3ff
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/CompositeHandler.java
@@ -0,0 +1,133 @@
+package com.amazonaws.athena.connector.lambda.handlers;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockAllocator;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest;
+import com.amazonaws.athena.connector.lambda.records.RecordRequest;
+import com.amazonaws.athena.connector.lambda.request.FederationRequest;
+import com.amazonaws.athena.connector.lambda.request.FederationResponse;
+import com.amazonaws.athena.connector.lambda.request.PingRequest;
+import com.amazonaws.athena.connector.lambda.request.PingResponse;
+import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory;
+import com.amazonaws.services.lambda.runtime.Context;
+import com.amazonaws.services.lambda.runtime.RequestStreamHandler;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.io.OutputStream;
+
+/**
+ * This class allows you to have a single Lambda function be responsible for both metadata and data operations by
+ * composing a MetadataHandler with a RecordHandler and muxing requests to the appropriate class. You might choose
+ * to use this CompositeHandler to run a single Lambda function for the following reasons:
+ * 1. It can be simpler to deploy and manage a single Lambda function vs. multiple functions.
+ * 2. You don't need to independently control the cost or performance of metadata vs. data operations.
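+ * <p>
+ * As a rough sketch (the class names are hypothetical), wiring your own handlers together looks like:
+ * <pre>{@code
+ * public class MyCompositeHandler extends CompositeHandler
+ * {
+ *     public MyCompositeHandler()
+ *     {
+ *         super(new MyMetadataHandler(), new MyRecordHandler());
+ *     }
+ * }
+ * }</pre>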
+ *
+ * @see RequestStreamHandler
+ */
+public class CompositeHandler
+        implements RequestStreamHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(CompositeHandler.class);
+    //The MetadataHandler to delegate metadata operations to.
+    private final MetadataHandler metadataHandler;
+    //The RecordHandler to delegate data operations to.
+    private final RecordHandler recordHandler;
+
+    /**
+     * Basic constructor that composes a MetadataHandler with a RecordHandler.
+     *
+     * @param metadataHandler The MetadataHandler to delegate metadata operations to.
+     * @param recordHandler The RecordHandler to delegate data operations to.
+     */
+    public CompositeHandler(MetadataHandler metadataHandler, RecordHandler recordHandler)
+    {
+        this.metadataHandler = metadataHandler;
+        this.recordHandler = recordHandler;
+    }
+
+    /**
+     * Required by Lambda's RequestStreamHandler interface. In our case we use this method to handle some
+     * basic resource lifecycle tasks for the request, namely the BlockAllocator and the request object itself.
+     */
+    public final void handleRequest(InputStream inputStream, OutputStream outputStream, final Context context)
+            throws IOException
+    {
+        try (BlockAllocatorImpl allocator = new BlockAllocatorImpl()) {
+            ObjectMapper objectMapper = ObjectMapperFactory.create(allocator);
+            try (FederationRequest rawReq = objectMapper.readValue(inputStream, FederationRequest.class)) {
+                handleRequest(allocator, rawReq, outputStream, objectMapper);
+            }
+        }
+        catch (Exception ex) {
+            logger.warn("handleRequest: Completed with an exception.", ex);
+            throw (ex instanceof RuntimeException) ? (RuntimeException) ex : new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Handles routing the request to the appropriate Handler, either MetadataHandler or RecordHandler.
+     *
+     * @param allocator The BlockAllocator to use for Apache Arrow Resources.
+     * @param rawReq The request object itself.
+     * @param outputStream The OutputStream to which all responses should be written.
+     * @param objectMapper The ObjectMapper that can be used for serializing responses.
+     * @throws Exception
+     * @note that PingRequests are routed to the MetadataHandler even though both MetadataHandler and RecordHandler
+     * implement PingRequest handling.
+     */
+    public final void handleRequest(BlockAllocator allocator, FederationRequest rawReq, OutputStream outputStream, ObjectMapper objectMapper)
+            throws Exception
+    {
+        if (rawReq instanceof PingRequest) {
+            try (PingResponse response = metadataHandler.doPing((PingRequest) rawReq)) {
+                assertNotNull(response);
+                objectMapper.writeValue(outputStream, response);
+            }
+            return;
+        }
+
+        if (rawReq instanceof MetadataRequest) {
+            metadataHandler.doHandleRequest(allocator, objectMapper, (MetadataRequest) rawReq, outputStream);
+        }
+        else if (rawReq instanceof RecordRequest) {
+            recordHandler.doHandleRequest(allocator, objectMapper, (RecordRequest) rawReq, outputStream);
+        }
+        else {
+            throw new IllegalArgumentException("Unknown request class " + rawReq.getClass());
+        }
+    }
+
+    /**
+     * Helper used to assert that the response generated by the handler is not null.
+ */ + private void assertNotNull(FederationResponse response) + { + if (response == null) { + throw new RuntimeException("Response was null"); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/FederationCapabilities.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/FederationCapabilities.java new file mode 100644 index 0000000000..43508b60c1 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/FederationCapabilities.java @@ -0,0 +1,34 @@ +package com.amazonaws.athena.connector.lambda.handlers; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +/** + * Used to convey the capabilities of this SDK instance when negotiating functionality with + * Athena. You can think of this like a version number that is specific to the feature set + * and protocol used by the SDK. Purely client side changes in the SDK would not be expected + * to change the capabilities. + */ +public class FederationCapabilities +{ + private FederationCapabilities() {} + + protected static final int CAPABILITIES = 23; +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/GlueMetadataHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/GlueMetadataHandler.java new file mode 100644 index 0000000000..d4264013a5 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/GlueMetadataHandler.java @@ -0,0 +1,330 @@ +package com.amazonaws.athena.connector.lambda.handlers; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest; +import com.amazonaws.athena.connector.lambda.metadata.glue.GlueFieldLexer; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.model.Column; +import com.amazonaws.services.glue.model.Database; +import com.amazonaws.services.glue.model.GetDatabasesRequest; +import com.amazonaws.services.glue.model.GetDatabasesResult; +import com.amazonaws.services.glue.model.GetTableResult; +import com.amazonaws.services.glue.model.GetTablesRequest; +import com.amazonaws.services.glue.model.GetTablesResult; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Field; + +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; + +/** + * This class allows you to leverage AWS Glue's DataCatalog to satisfy portions of the functionality required in a + * MetadataHandler. More precisely, this implementation uses AWS Glue's DataCatalog to implement: + * 1. doListSchemas(...) + * 2. doListTables(...) + * 3. doGetTable(...) + *
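+ * As a rough sketch (the class name and source type are hypothetical, and the Glue client builder name is assumed
+ * from the AWS SDK), an extender typically just wires in a Glue client and implements what Glue cannot provide:
+ * <pre>{@code
+ * public class MyGlueBackedHandler extends GlueMetadataHandler
+ * {
+ *     public MyGlueBackedHandler()
+ *     {
+ *         super(AWSGlueClientBuilder.defaultClient(), "my_source_type");
+ *     }
+ *     //... getPartitions(...) and doGetSplits(...) still need to be implemented ...
+ * }
+ * }</pre>
+ * <p>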
+ * When you extend this class you can optionally provide a DatabaseFilter and/or TableFilter to decide which Databases + * (aka schemas) and Tables are eligible for use with your connector. You can find examples of this in the + * athena-hbase and athena-docdb connector modules. A common reason for this is when you happen to have databases/tables + * in Glue which match the names of databases and tables in your source but that aren't actually relevant. In such cases + * you may choose to ignore those Glue tables. + *
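+ * For illustration (the table name prefix is hypothetical), a subclass might ignore all Glue tables outside a
+ * known prefix by delegating to the filtered overload that this class provides:
+ * <pre>{@code
+ * public ListTablesResponse doListTables(BlockAllocator allocator, ListTablesRequest request) throws Exception
+ * {
+ *     return doListTables(allocator, request, table -> table.getName().startsWith("my_prefix_"));
+ * }
+ * }</pre>
+ * <p>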
+ * At present this class does not retrieve partition information from AWS Glue's DataCatalog. There is an open task
+ * for how best to handle partitioning information in this class: https://github.com/awslabs/aws-athena-query-federation/issues/5
+ * It is unclear at this time how many sources will have meaningful partition info in Glue, but many sources (DocDB, HBase, Redis)
+ * benefited from having basic schema information in Glue. As a result we have deferred support for partition information
+ * to a later time.
+ *
+ * @note All schema names, table names, and column names must be lower case at this time. Any entities that are uppercase or
+ * mixed case will not be accessible in queries and will be lower cased by Athena's engine to ensure consistency across
+ * sources. As such you may need to handle this when integrating with a source that supports mixed case. As an example,
+ * you can look at the CloudwatchTableResolver in the athena-cloudwatch module for one potential approach to this challenge.
+ * @see MetadataHandler
+ */
+public abstract class GlueMetadataHandler
+        extends MetadataHandler
+{
+    //name of the environment variable that can be used to set which Glue catalog to use (e.g. setting this to
+    //a different aws account id allows you to use cross-account catalogs)
+    private static final String CATALOG_NAME_ENV_OVERRIDE = "glue_catalog";
+
+    private final AWSGlue awsGlue;
+
+    /**
+     * Basic constructor which is recommended when extending this class.
+     *
+     * @param awsGlue The glue client to use.
+     * @param sourceType The source type, used in diagnostic logging.
+     */
+    public GlueMetadataHandler(AWSGlue awsGlue, String sourceType)
+    {
+        super(sourceType);
+        this.awsGlue = awsGlue;
+    }
+
+    /**
+     * Full DI constructor used mostly for testing.
+     *
+     * @param awsGlue The glue client to use.
+     * @param encryptionKeyFactory The EncryptionKeyFactory to use for spill encryption.
+     * @param secretsManager The AWSSecretsManager client that can be used when attempting to resolve secrets.
+     * @param athena The Athena client that can be used to fetch query termination status to fast-fail this handler.
+     * @param sourceType The source type, used in diagnostic logging.
+     * @param spillBucket The S3 Bucket to use when spilling results.
+     * @param spillPrefix The S3 prefix to use when spilling results.
+     */
+    @VisibleForTesting
+    protected GlueMetadataHandler(AWSGlue awsGlue,
+            EncryptionKeyFactory encryptionKeyFactory,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            String sourceType,
+            String spillBucket,
+            String spillPrefix)
+    {
+        super(encryptionKeyFactory, secretsManager, athena, sourceType, spillBucket, spillPrefix);
+        this.awsGlue = awsGlue;
+    }
+
+    /**
+     * Provides access to the Glue client if the extender should need it.
+     *
+     * @return The AWSGlue client being used by this class.
+     */
+    protected AWSGlue getAwsGlue()
+    {
+        return awsGlue;
+    }
+
+    /**
+     * Provides access to the current AWS Glue DataCatalog being used by this class.
+     *
+     * @param request The request for which we'd like to resolve the catalog.
+     * @return The glue catalog to use for the request.
+     */
+    protected String getCatalog(MetadataRequest request)
+    {
+        String override = System.getenv(CATALOG_NAME_ENV_OVERRIDE);
+        if (override == null) {
+            return request.getIdentity().getAccount();
+        }
+        return override;
+    }
+
+    /**
+     * Returns an unfiltered list of schemas (aka databases) from AWS Glue DataCatalog.
+     *
+     * @param blockAllocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @return The ListSchemasResponse which mostly contains the list of schemas (aka databases).
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest request)
+            throws Exception
+    {
+        return doListSchemaNames(blockAllocator, request, null);
+    }
+
+    /**
+     * Returns a list of schemas (aka databases) from AWS Glue DataCatalog with optional filtering.
+     *
+     * @param blockAllocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @param filter The DatabaseFilter to apply to all schemas (aka databases) before adding them to the results list.
+     * @return The ListSchemasResponse which mostly contains the list of schemas (aka databases).
+     */
+    protected ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest request, DatabaseFilter filter)
+            throws Exception
+    {
+        GetDatabasesRequest getDatabasesRequest = new GetDatabasesRequest();
+        getDatabasesRequest.setCatalogId(getCatalog(request));
+
+        List<String> schemas = new ArrayList<>();
+        String nextToken = null;
+        do {
+            getDatabasesRequest.setNextToken(nextToken);
+            GetDatabasesResult result = awsGlue.getDatabases(getDatabasesRequest);
+
+            for (Database next : result.getDatabaseList()) {
+                if (filter == null || filter.filter(next)) {
+                    schemas.add(next.getName());
+                }
+            }
+
+            nextToken = result.getNextToken();
+        }
+        while (nextToken != null);
+
+        return new ListSchemasResponse(request.getCatalogName(), schemas);
+    }
+
+    /**
+     * Returns an unfiltered list of tables from AWS Glue DataCatalog for the requested schema (aka database).
+     *
+     * @param blockAllocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @return The ListTablesResponse which mostly contains the list of table names.
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest request)
+            throws Exception
+    {
+        return doListTables(blockAllocator, request, null);
+    }
+
+    /**
+     * Returns a list of tables from AWS Glue DataCatalog with optional filtering for the requested schema (aka database).
+     *
+     * @param blockAllocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog they are querying.
+     * @param filter The TableFilter to apply to all tables before adding them to the results list.
+     * @return The ListTablesResponse which mostly contains the list of table names.
+     */
+    protected ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest request, TableFilter filter)
+            throws Exception
+    {
+        GetTablesRequest getTablesRequest = new GetTablesRequest();
+        getTablesRequest.setCatalogId(getCatalog(request));
+        getTablesRequest.setDatabaseName(request.getSchemaName());
+
+        Set<TableName> tables = new HashSet<>();
+        String nextToken = null;
+        do {
+            getTablesRequest.setNextToken(nextToken);
+            GetTablesResult result = awsGlue.getTables(getTablesRequest);
+
+            for (Table next : result.getTableList()) {
+                if (filter == null || filter.filter(next)) {
+                    tables.add(new TableName(request.getSchemaName(), next.getName()));
+                }
+            }
+
+            nextToken = result.getNextToken();
+        }
+        while (nextToken != null);
+
+        return new ListTablesResponse(request.getCatalogName(), tables);
+    }
+
+    /**
+     * Attempts to retrieve a Table (columns and properties) from AWS Glue for the requested schema (aka database) and table
+     * name with no filtering.
+     *
+     * @param blockAllocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog, database, and table they are querying.
+     * @return A GetTableResponse mostly containing the columns, their types, and any table properties for the requested table.
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request)
+            throws Exception
+    {
+        return doGetTable(blockAllocator, request, null);
+    }
+
+    /**
+     * Attempts to retrieve a Table (columns and properties) from AWS Glue for the requested schema (aka database) and table
+     * name with optional filtering.
+     *
+     * @param blockAllocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details on who made the request and which Athena catalog, database, and table they are querying.
+     * @param filter The TableFilter to apply to any matching table before generating the result.
+     * @return A GetTableResponse mostly containing the columns, their types, and any table properties for the requested table.
+     * @note This method throws a RuntimeException if no table matching the requested criteria (and filter) is found.
+     */
+    protected GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request, TableFilter filter)
+            throws Exception
+    {
+        TableName tableName = request.getTableName();
+        com.amazonaws.services.glue.model.GetTableRequest getTableRequest = new com.amazonaws.services.glue.model.GetTableRequest();
+        getTableRequest.setCatalogId(getCatalog(request));
+        getTableRequest.setDatabaseName(tableName.getSchemaName());
+        getTableRequest.setName(tableName.getTableName());
+
+        GetTableResult result = awsGlue.getTable(getTableRequest);
+        Table table = result.getTable();
+
+        if (filter != null && !filter.filter(table)) {
+            throw new RuntimeException("No matching table found " + request.getTableName());
+        }
+
+        SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
+        table.getParameters().entrySet().forEach(next -> schemaBuilder.addMetadata(next.getKey(), next.getValue()));
+
+        Set<String> partitionCols = table.getPartitionKeys()
+                .stream().map(next -> next.getName()).collect(Collectors.toSet());
+
+        for (Column next : table.getStorageDescriptor().getColumns()) {
+            schemaBuilder.addField(convertField(next.getName(), next.getType()));
+            if (next.getComment() != null) {
+                schemaBuilder.addMetadata(next.getName(), next.getComment());
+            }
+        }
+
+        return new GetTableResponse(request.getCatalogName(),
+                request.getTableName(),
+                schemaBuilder.build(),
+                partitionCols);
+    }
+
+    protected Field convertField(String name, String glueType)
+    {
+        return GlueFieldLexer.lex(name, glueType);
+    }
+
+    public interface TableFilter
+    {
+        /**
+         * Used to filter table results.
+         *
+         * @param table The table to evaluate.
+         * @return True if the provided table should be in the result, False if not.
+         */
+        boolean filter(Table table);
+    }
+
+    public interface DatabaseFilter
+    {
+        /**
+         * Used to filter database results.
+         *
+         * @param database The database to evaluate.
+         * @return True if the provided database should be in the result, False if not.
+         */
+        boolean filter(Database database);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/MetadataHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/MetadataHandler.java
new file mode 100644
index 0000000000..e504d2a797
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/MetadataHandler.java
@@ -0,0 +1,457 @@
+package com.amazonaws.athena.connector.lambda.handlers;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.data.SimpleBlockWriter; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.request.FederationResponse; +import com.amazonaws.athena.connector.lambda.request.PingRequest; +import com.amazonaws.athena.connector.lambda.request.PingResponse; +import com.amazonaws.athena.connector.lambda.security.CachableSecretsManager; +import com.amazonaws.athena.connector.lambda.security.EncryptionKey; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.KmsKeyFactory; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.kms.AWSKMSClientBuilder; +import com.amazonaws.services.lambda.runtime.Context; +import com.amazonaws.services.lambda.runtime.RequestStreamHandler; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.io.InputStream; +import java.io.OutputStream; +import java.util.UUID; + +import static com.amazonaws.athena.connector.lambda.handlers.AthenaExceptionFilter.ATHENA_EXCEPTION_FILTER; +import static com.amazonaws.athena.connector.lambda.handlers.FederationCapabilities.CAPABILITIES; + +/** + * This class defines the functionality required by any valid 
source of federated metadata for Athena. It is recommended
+ * that all connectors extend this class for Metadata operations though it is possible for you to write your own
+ * from the ground up as long as you satisfy the wire protocol. For all cases we've encountered it has made more sense
+ * to start with this base class and use its implementation for most of the boilerplate related to Lambda and resource
+ * lifecycle so we could focus on the task of integrating with the source we were interested in.
+ *
+ * @note All schema names, table names, and column names must be lower case at this time. Any entities that are uppercase or
+ * mixed case will not be accessible in queries and will be lower cased by Athena's engine to ensure consistency across
+ * sources. As such you may need to handle this when integrating with a source that supports mixed case. As an example,
+ * you can look at the CloudwatchTableResolver in the athena-cloudwatch module for one potential approach to this challenge.
+ */
+public abstract class MetadataHandler
+        implements RequestStreamHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(MetadataHandler.class);
+    //name of the default column used when a default single-partition response is required for connectors that
+    //do not support robust partitioning. In such cases Athena requires at least 1 partition in order to indicate
+    //there is indeed data to be read vs. queries that were able to fully partition prune and thus decide there
+    //was no data to read.
+    private static final String PARTITION_ID_COL = "partitionId";
+    //The value that denotes encryption should be disabled, encryption is enabled by default.
+    private static final String DISABLE_ENCRYPTION = "true";
+    //The default S3 prefix to use when spilling to S3
+    private static final String DEFAULT_SPILL_PREFIX = "athena-federation-spill";
+    protected static final String SPILL_BUCKET_ENV = "spill_bucket";
+    protected static final String SPILL_PREFIX_ENV = "spill_prefix";
+    protected static final String KMS_KEY_ID_ENV = "kms_key_id";
+    protected static final String DISABLE_SPILL_ENCRYPTION = "disable_spill_encryption";
+
+    private final CachableSecretsManager secretsManager;
+    private final AmazonAthena athena;
+    private final ThrottlingInvoker athenaInvoker = ThrottlingInvoker.newDefaultBuilder(ATHENA_EXCEPTION_FILTER).build();
+    private final EncryptionKeyFactory encryptionKeyFactory;
+    private final String spillBucket;
+    private final String spillPrefix;
+    private final String sourceType;
+
+    /**
+     * @param sourceType Used to aid in logging diagnostic info when raising a support case.
+     */
+    public MetadataHandler(String sourceType)
+    {
+        this.sourceType = sourceType;
+        this.spillBucket = System.getenv(SPILL_BUCKET_ENV);
+        this.spillPrefix = System.getenv(SPILL_PREFIX_ENV) == null ?
+                DEFAULT_SPILL_PREFIX : System.getenv(SPILL_PREFIX_ENV);
+        if (System.getenv(DISABLE_SPILL_ENCRYPTION) == null ||
+                !DISABLE_ENCRYPTION.equalsIgnoreCase(System.getenv(DISABLE_SPILL_ENCRYPTION))) {
+            encryptionKeyFactory = (System.getenv(KMS_KEY_ID_ENV) != null) ?
+                    new KmsKeyFactory(AWSKMSClientBuilder.standard().build(), System.getenv(KMS_KEY_ID_ENV)) :
+                    new LocalKeyFactory();
+        }
+        else {
+            encryptionKeyFactory = null;
+        }
+
+        this.secretsManager = new CachableSecretsManager(AWSSecretsManagerClientBuilder.defaultClient());
+        this.athena = AmazonAthenaClientBuilder.defaultClient();
+    }
+
+    /**
+     * Full DI constructor, used mostly for testing.
+     *
+     * @param encryptionKeyFactory The EncryptionKeyFactory to use for spill encryption.
+     * @param secretsManager The AWSSecretsManager client that can be used when attempting to resolve secrets.
+     * @param athena The Athena client that can be used to fetch query termination status to fast-fail this handler.
+     * @param sourceType Used to aid in logging diagnostic info when raising a support case.
+     * @param spillBucket The S3 Bucket to use when spilling results.
+     * @param spillPrefix The S3 prefix to use when spilling results.
+     */
+    public MetadataHandler(EncryptionKeyFactory encryptionKeyFactory,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            String sourceType,
+            String spillBucket,
+            String spillPrefix)
+    {
+        this.encryptionKeyFactory = encryptionKeyFactory;
+        this.secretsManager = new CachableSecretsManager(secretsManager);
+        this.athena = athena;
+        this.sourceType = sourceType;
+        this.spillBucket = spillBucket;
+        this.spillPrefix = spillPrefix;
+    }
+
+    /**
+     * Resolves any secrets found in the supplied string, for example: MyString${WithSecret} would have ${WithSecret}
+     * replaced by the corresponding value of the secret in AWS Secrets Manager with that name. If no such secret is found
+     * the function throws.
+     *
+     * @param rawString The string in which you'd like to replace SecretsManager placeholders.
+     * (e.g. ThisIsA${Secret}Here - The ${Secret} would be replaced with the contents of a SecretsManager
+     * secret called Secret. If no such secret is found, the function throws. If no ${} are found in
+     * the input string, nothing is replaced and the original string is returned.)
+     */
+    protected String resolveSecrets(String rawString)
+    {
+        return secretsManager.resolveSecrets(rawString);
+    }
+
+    protected String getSecret(String secretName)
+    {
+        return secretsManager.getSecret(secretName);
+    }
+
+    protected EncryptionKey makeEncryptionKey()
+    {
+        return (encryptionKeyFactory != null) ? encryptionKeyFactory.create() : null;
+    }
+
+    protected SpillLocation makeSpillLocation(MetadataRequest request)
+    {
+        return S3SpillLocation.newBuilder()
+                .withBucket(spillBucket)
+                .withPrefix(spillPrefix)
+                .withQueryId(request.getQueryId())
+                .withSplitId(UUID.randomUUID().toString())
+                .build();
+    }
+
+    public final void handleRequest(InputStream inputStream, OutputStream outputStream, final Context context)
+            throws IOException
+    {
+        try (BlockAllocator allocator = new BlockAllocatorImpl()) {
+            ObjectMapper objectMapper = ObjectMapperFactory.create(allocator);
+            try (FederationRequest rawReq = objectMapper.readValue(inputStream, FederationRequest.class)) {
+                if (rawReq instanceof PingRequest) {
+                    try (PingResponse response = doPing((PingRequest) rawReq)) {
+                        assertNotNull(response);
+                        objectMapper.writeValue(outputStream, response);
+                    }
+                    return;
+                }
+
+                if (!(rawReq instanceof MetadataRequest)) {
+                    throw new RuntimeException("Expected a MetadataRequest but found " + rawReq.getClass());
+                }
+                doHandleRequest(allocator, objectMapper, (MetadataRequest) rawReq, outputStream);
+            }
+            catch (Exception ex) {
+                logger.warn("handleRequest: Completed with an exception.", ex);
+                throw (ex instanceof RuntimeException) ?
(RuntimeException) ex : new RuntimeException(ex); + } + } + } + + protected final void doHandleRequest(BlockAllocator allocator, + ObjectMapper objectMapper, + MetadataRequest req, + OutputStream outputStream) + throws Exception + { + logger.info("doHandleRequest: request[{}]", req); + MetadataRequestType type = req.getRequestType(); + switch (type) { + case LIST_SCHEMAS: + try (ListSchemasResponse response = doListSchemaNames(allocator, (ListSchemasRequest) req)) { + logger.info("doHandleRequest: response[{}]", response); + assertNotNull(response); + objectMapper.writeValue(outputStream, response); + } + return; + case LIST_TABLES: + try (ListTablesResponse response = doListTables(allocator, (ListTablesRequest) req)) { + logger.info("doHandleRequest: response[{}]", response); + assertNotNull(response); + objectMapper.writeValue(outputStream, response); + } + return; + case GET_TABLE: + try (GetTableResponse response = doGetTable(allocator, (GetTableRequest) req)) { + logger.info("doHandleRequest: response[{}]", response); + assertNotNull(response); + objectMapper.writeValue(outputStream, response); + } + return; + case GET_TABLE_LAYOUT: + try (GetTableLayoutResponse response = doGetTableLayout(allocator, (GetTableLayoutRequest) req)) { + logger.info("doHandleRequest: response[{}]", response); + assertNotNull(response); + objectMapper.writeValue(outputStream, response); + } + return; + case GET_SPLITS: + try (GetSplitsResponse response = doGetSplits(allocator, (GetSplitsRequest) req)) { + logger.info("doHandleRequest: response[{}]", response); + assertNotNull(response); + objectMapper.writeValue(outputStream, response); + } + return; + default: + throw new IllegalArgumentException("Unknown request type " + type); + } + } + + /** + * Used to get the list of schemas (aka databases) that this source contains. + * + * @param allocator Tool for creating and managing Apache Arrow Blocks. + * @param request Provides details on who made the request and which Athena catalog they are querying. + * @return A ListSchemasResponse which primarily contains a Set of schema names and a catalog name + * corresponding the Athena catalog that was queried. + */ + public abstract ListSchemasResponse doListSchemaNames(final BlockAllocator allocator, final ListSchemasRequest request) + throws Exception; + + /** + * Used to get the list of tables that this source contains. + * + * @param allocator Tool for creating and managing Apache Arrow Blocks. + * @param request Provides details on who made the request and which Athena catalog and database they are querying. + * @return A ListTablesResponse which primarily contains a List enumerating the tables in this + * catalog, database tuple. It also contains the catalog name corresponding the Athena catalog that was queried. + */ + public abstract ListTablesResponse doListTables(final BlockAllocator allocator, final ListTablesRequest request) + throws Exception; + + /** + * Used to get definition (field names, types, descriptions, etc...) of a Table. + * + * @param allocator Tool for creating and managing Apache Arrow Blocks. + * @param request Provides details on who made the request and which Athena catalog, database, and table they are querying. + * @return A GetTableResponse which primarily contains: + * 1. An Apache Arrow Schema object describing the table's columns, types, and descriptions. + * 2. A Set of partition column names (or empty if the table isn't partitioned). 
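+     * <p>
+     * For illustration (the column name is hypothetical, and Collections is java.util.Collections), a trivial
+     * implementation for an unpartitioned table might return:
+     * <pre>{@code
+     * Schema schema = SchemaBuilder.newBuilder()
+     *         .addField("col1", new ArrowType.Utf8())
+     *         .build();
+     * return new GetTableResponse(request.getCatalogName(), request.getTableName(), schema, Collections.emptySet());
+     * }</pre>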
+     */
+    public abstract GetTableResponse doGetTable(final BlockAllocator allocator, final GetTableRequest request)
+            throws Exception;
+
+    /**
+     * Used to get the partitions that must be read from the requested table in order to satisfy the requested predicate.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details of the catalog, database, and table being queried as well as any filter predicate.
+     * @return A GetTableLayoutResponse which primarily contains:
+     * 1. An Apache Arrow Block with 0 or more partitions to read. 0 partitions implies there are 0 rows to read.
+     * 2. Set of partition column names which should correspond to columns in your Apache Arrow Block.
+     * @note Partitions are partially opaque to Amazon Athena in that it only understands your partition columns and
+     * how to filter out partitions that do not meet the query's constraints; beyond that it simply calls doGetSplits(...)
+     * for each partition you return in order to determine which reads to perform and if those reads can be parallelized.
+     * Any additional columns you add to the partition data are ignored by Athena but passed on to calls on GetSplits,
+     * so the contents of this response are as much for you as they are for Athena.
+     */
+    public GetTableLayoutResponse doGetTableLayout(final BlockAllocator allocator, final GetTableLayoutRequest request)
+            throws Exception
+    {
+        SchemaBuilder constraintSchema = new SchemaBuilder().newBuilder();
+        SchemaBuilder partitionSchemaBuilder = new SchemaBuilder().newBuilder();
+
+        /**
+         * Add our partition columns to the response schema so the engine knows how to interpret the list of
+         * partitions we are going to return.
+         */
+        for (String nextPartCol : request.getPartitionCols()) {
+            Field partitionCol = request.getSchema().findField(nextPartCol);
+            partitionSchemaBuilder.addField(nextPartCol, partitionCol.getType());
+            constraintSchema.addField(nextPartCol, partitionCol.getType());
+        }
+
+        enhancePartitionSchema(partitionSchemaBuilder, request);
+        Schema partitionSchema = partitionSchemaBuilder.build();
+
+        if (partitionSchema.getFields().isEmpty() && partitionSchema.getCustomMetadata().isEmpty()) {
+            //Even though our table doesn't support complex layouts, partitioning or metadata, we need to convey that there is at least
+            //1 partition to read as part of the query or Athena will assume partition pruning found no candidate layouts to read.
+            Block partitions = BlockUtils.newBlock(allocator, PARTITION_ID_COL, Types.MinorType.INT.getType(), 1);
+            return new GetTableLayoutResponse(request.getCatalogName(), request.getTableName(), partitions);
+        }
+
+        /**
+         * Now use the constraint that was in the request to do some partition pruning. Here we are just
+         * generating some fake values for the partitions but in a real implementation you'd use your metastore
+         * or knowledge of the actual table's physical layout to do this.
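+         * For illustration (the column name is hypothetical): if the table is partitioned by a "year" column and
+         * the query's predicate is year = 2019, the constrained Block below will silently drop any partition rows
+         * whose "year" value is not 2019, which is how partition pruning happens here.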
+         */
+        try (ConstraintEvaluator constraintEvaluator = new ConstraintEvaluator(allocator,
+                constraintSchema.build(),
+                request.getConstraints());
+                QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, request.getQueryId())
+        ) {
+            Block partitions = allocator.createBlock(partitionSchemaBuilder.build());
+            partitions.constrain(constraintEvaluator);
+            SimpleBlockWriter blockWriter = new SimpleBlockWriter(partitions);
+            getPartitions(blockWriter, request, queryStatusChecker);
+            return new GetTableLayoutResponse(request.getCatalogName(), request.getTableName(), partitions);
+        }
+    }
+
+    /**
+     * This method can be used to add additional fields to the schema of our partition response. Athena
+     * expects each partition in the response to have a column corresponding to each of your partition columns.
+     * You can choose to add additional columns to that response which Athena will ignore but will pass
+     * on to you when it calls GetSplits(...) for each partition.
+     *
+     * @param partitionSchemaBuilder The SchemaBuilder you can use to add additional columns and metadata to the
+     * partitions response.
+     * @param request The GetTableLayoutRequest that triggered this call.
+     */
+    public void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request)
+    {
+        //You can add additional fields to the partition schema which are ignored by Athena
+        //but will be passed on to calls to GetSplits(...). This can be handy when you
+        //want to avoid extra round trips to your metastore. For example, when you generate
+        //the partition list you may have easy access to the storage details (e.g. S3 location)
+        //of the partition. Athena doesn't need the S3 location but when Athena calls you
+        //to generate the Splits for the partition, having the S3 location would save you
+        //extra work. For that reason you can add a field to the partition schema which
+        //contains the S3 location.
+    }
+
+    /**
+     * Used to get the partitions that must be read from the request table in order to satisfy the requested predicate.
+     *
+     * @param blockWriter Used to write rows (partitions) into the Apache Arrow response.
+     * @param request Provides details of the catalog, database, and table being queried as well as any filter predicate.
+     * @param queryStatusChecker A QueryStatusChecker that you can use to stop doing work for a query that has already terminated
+     * @note Partitions are partially opaque to Amazon Athena in that it only understands your partition columns and
+     * how to filter out partitions that do not meet the query's constraints. Any additional columns you add to the
+     * partition data are ignored by Athena but passed on to calls on GetSplits. Also note that the BlockWriter handles
+     * automatically constraining and filtering out values that don't satisfy the query's predicate. This is how
+     * we accomplish partition pruning. You can optionally retrieve a ConstraintEvaluator from the BlockWriter if you have
+     * your own need to apply filtering in Lambda. Otherwise you can get the actual predicate from the request object
+     * for pushing down into the source you are querying.
+     */
+    public abstract void getPartitions(final BlockWriter blockWriter,
+            final GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+            throws Exception;
+
+    /**
+     * Used to split up the reads required to scan the requested batch of partition(s).
+     *
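+     * <p>
+     * If producing every Split in one invocation would be too slow, you can return one batch at a time along
+     * with a continuation token and Athena will call you again. A sketch of that pattern (the token scheme,
+     * 'moreSplitsRemain', and 'encodeContinuationToken' are illustrative, not part of the SDK):
+     * <pre>{@code
+     * if (moreSplitsRemain) {
+     *     return new GetSplitsResponse(request.getCatalogName(), splits, encodeContinuationToken(nextRow));
+     * }
+     * return new GetSplitsResponse(request.getCatalogName(), splits);
+     * }</pre>
+     *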
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Provides details of the catalog, database, table, and partition(s) being queried as well as
+     * any filter predicate.
+     * @return A GetSplitsResponse which primarily contains:
+     * 1. A Set<Split> which represents the read operations Amazon Athena must perform by calling your read function.
+     * 2. (Optional) A continuation token which allows you to paginate the generation of splits for large queries.
+     * @note A Split is a mostly opaque object to Amazon Athena. Amazon Athena will use the optional SpillLocation and
+     * optional EncryptionKey for pipelined reads but all properties you set on the Split are passed to your read
+     * function to help you perform the read.
+     */
+    public abstract GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request)
+            throws Exception;
+
+    /**
+     * Used to warm up your function as well as to discover its capabilities (e.g. SDK capabilities)
+     *
+     * @param request The PingRequest.
+     * @return A PingResponse.
+     * @note We do not recommend modifying this function; instead you should implement onPing(...)
+     */
+    public PingResponse doPing(PingRequest request)
+    {
+        PingResponse response = new PingResponse(request.getCatalogName(), request.getQueryId(), sourceType, CAPABILITIES);
+        try {
+            onPing(request);
+        }
+        catch (Exception ex) {
+            logger.warn("doPing: encountered an exception while delegating onPing.", ex);
+        }
+        return response;
+    }
+
+    /**
+     * Provides you a signal that can be used to warm up your function.
+     *
+     * @param request The PingRequest.
+     */
+    public void onPing(PingRequest request)
+    {
+        //NoOp
+    }
+
+    /**
+     * Helper function that is used to ensure we always have a non-null response.
+     *
+     * @param response The response to assert is not null.
+     */
+    private void assertNotNull(FederationResponse response)
+    {
+        if (response == null) {
+            throw new RuntimeException("Response was null");
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/RecordHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/RecordHandler.java
new file mode 100644
index 0000000000..0a97becaa5
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/handlers/RecordHandler.java
@@ -0,0 +1,260 @@
+package com.amazonaws.athena.connector.lambda.handlers;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.ThrottlingInvoker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.SpillConfig; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordRequest; +import com.amazonaws.athena.connector.lambda.records.RecordRequestType; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.request.FederationResponse; +import com.amazonaws.athena.connector.lambda.request.PingRequest; +import com.amazonaws.athena.connector.lambda.request.PingResponse; +import com.amazonaws.athena.connector.lambda.security.CachableSecretsManager; +import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.lambda.runtime.Context; +import com.amazonaws.services.lambda.runtime.RequestStreamHandler; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.io.InputStream; +import java.io.OutputStream; + +import static com.amazonaws.athena.connector.lambda.handlers.AthenaExceptionFilter.ATHENA_EXCEPTION_FILTER; +import static com.amazonaws.athena.connector.lambda.handlers.FederationCapabilities.CAPABILITIES; + +public abstract class RecordHandler + implements RequestStreamHandler +{ + private static final Logger logger = LoggerFactory.getLogger(RecordHandler.class); + private static final String MAX_BLOCK_SIZE_BYTES = "MAX_BLOCK_SIZE_BYTES"; + private static final int NUM_SPILL_THREADS = 2; + private final AmazonS3 amazonS3; + private final String sourceType; + private final CachableSecretsManager secretsManager; + private final AmazonAthena athena; + private final ThrottlingInvoker athenaInvoker = ThrottlingInvoker.newDefaultBuilder(ATHENA_EXCEPTION_FILTER).build(); + + /** + * @param sourceType Used to aid in logging diagnostic info when raising a support case. + */ + public RecordHandler(String sourceType) + { + this.sourceType = sourceType; + this.amazonS3 = AmazonS3ClientBuilder.defaultClient(); + this.secretsManager = new CachableSecretsManager(AWSSecretsManagerClientBuilder.defaultClient()); + this.athena = AmazonAthenaClientBuilder.defaultClient(); + } + + /** + * @param sourceType Used to aid in logging diagnostic info when raising a support case. 
+     */
+    public RecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, String sourceType)
+    {
+        this.sourceType = sourceType;
+        this.amazonS3 = amazonS3;
+        this.secretsManager = new CachableSecretsManager(secretsManager);
+        this.athena = athena;
+    }
+
+    /**
+     * Resolves any secrets found in the supplied string, for example: MyString${WithSecret} would have ${WithSecret}
+     * replaced by the corresponding value of the secret in AWS Secrets Manager with that name. If no such secret is found
+     * the function throws.
+     *
+     * @param rawString The string in which you'd like to replace SecretsManager placeholders.
+     * (e.g. ThisIsA${Secret}Here - The ${Secret} would be replaced with the contents of a SecretsManager
+     * secret called Secret. If no such secret is found, the function throws. If no ${} are found in
+     * the input string, nothing is replaced and the original string is returned.)
+     */
+    protected String resolveSecrets(String rawString)
+    {
+        return secretsManager.resolveSecrets(rawString);
+    }
+
+    protected String getSecret(String secretName)
+    {
+        return secretsManager.getSecret(secretName);
+    }
+
+    public final void handleRequest(InputStream inputStream, OutputStream outputStream, final Context context)
+            throws IOException
+    {
+        try (BlockAllocator allocator = new BlockAllocatorImpl()) {
+            ObjectMapper objectMapper = ObjectMapperFactory.create(allocator);
+            try (FederationRequest rawReq = objectMapper.readValue(inputStream, FederationRequest.class)) {
+                if (rawReq instanceof PingRequest) {
+                    try (PingResponse response = doPing((PingRequest) rawReq)) {
+                        assertNotNull(response);
+                        objectMapper.writeValue(outputStream, response);
+                    }
+                    return;
+                }
+
+                if (!(rawReq instanceof RecordRequest)) {
+                    throw new RuntimeException("Expected a RecordRequest but found " + rawReq.getClass());
+                }
+
+                doHandleRequest(allocator, objectMapper, (RecordRequest) rawReq, outputStream);
+            }
+            catch (Exception ex) {
+                logger.warn("handleRequest: Completed with an exception.", ex);
+                throw (ex instanceof RuntimeException) ? (RuntimeException) ex : new RuntimeException(ex);
+            }
+        }
+    }
+
+    protected final void doHandleRequest(BlockAllocator allocator,
+            ObjectMapper objectMapper,
+            RecordRequest req,
+            OutputStream outputStream)
+            throws Exception
+    {
+        logger.info("doHandleRequest: request[{}]", req);
+        RecordRequestType type = req.getRequestType();
+        switch (type) {
+            case READ_RECORDS:
+                try (RecordResponse response = doReadRecords(allocator, (ReadRecordsRequest) req)) {
+                    logger.info("doHandleRequest: response[{}]", response);
+                    assertNotNull(response);
+                    objectMapper.writeValue(outputStream, response);
+                }
+                return;
+            default:
+                throw new IllegalArgumentException("Unknown request type " + type);
+        }
+    }
+
+    /**
+     * Used to read the row data associated with the provided Split.
+     *
+     * @param allocator Tool for creating and managing Apache Arrow Blocks.
+     * @param request Details of the read request, including:
+     * 1. The Split
+     * 2. The Catalog, Database, and Table the read request is for.
+     * 3. The filtering predicate (if any)
+     * 4. The columns required for projection.
+     * @return A RecordResponse which is either a ReadRecordsResponse or a RemoteReadRecordsResponse containing the row
+     * data for the requested Split.
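+     * Note, based on the implementation below: results that fit within a single Block are returned inline as a
+     * ReadRecordsResponse, while larger results that spilled to S3 are returned as a RemoteReadRecordsResponse
+     * listing the spill locations and the encryption key needed to read them.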
+     */
+    public RecordResponse doReadRecords(BlockAllocator allocator, ReadRecordsRequest request)
+            throws Exception
+    {
+        logger.info("doReadRecords: {}:{}", request.getSchema(), request.getSplit().getSpillLocation());
+        SpillConfig spillConfig = getSpillConfig(request);
+        try (ConstraintEvaluator evaluator = new ConstraintEvaluator(allocator,
+                request.getSchema(),
+                request.getConstraints());
+                S3BlockSpiller spiller = new S3BlockSpiller(amazonS3, spillConfig, allocator, request.getSchema(), evaluator);
+                QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, request.getQueryId())
+        ) {
+            readWithConstraint(spiller, request, queryStatusChecker);
+
+            if (!spiller.spilled()) {
+                return new ReadRecordsResponse(request.getCatalogName(), spiller.getBlock());
+            }
+            else {
+                return new RemoteReadRecordsResponse(request.getCatalogName(),
+                        request.getSchema(),
+                        spiller.getSpillLocations(),
+                        spillConfig.getEncryptionKey());
+            }
+        }
+    }
+
+    /**
+     * A more streamlined option for reading the row data associated with the provided Split. This method differs from
+     * doReadRecords(...) in that the SDK handles more of the request lifecycle, leaving you to focus more closely on
+     * the task of actually reading from your source.
+     *
+     * @param spiller A BlockSpiller that should be used to write the row data associated with this Split.
+     * The BlockSpiller automatically handles chunking the response, encrypting, and spilling to S3.
+     * @param recordsRequest Details of the read request, including:
+     * 1. The Split
+     * 2. The Catalog, Database, and Table the read request is for.
+     * 3. The filtering predicate (if any)
+     * 4. The columns required for projection.
+     * @param queryStatusChecker A QueryStatusChecker that you can use to stop doing work for a query that has already terminated
+     * @note Avoid writing >10 rows per call to BlockSpiller.writeRows(...) because this will limit the BlockSpiller's
+     * ability to control Block size. The resulting increase in Block size may cause failures and reduced performance.
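+     *
+     * <p>A minimal sketch of an implementation (the field name and the 'source' iterator are hypothetical):
+     * <pre>{@code
+     * while (queryStatusChecker.isQueryRunning() && source.hasNext()) {
+     *     MyRow next = source.next();
+     *     spiller.writeRows((Block block, int rowNum) ->
+     *             block.setValue("account_id", rowNum, next.getAccountId()) ? 1 : 0);
+     * }
+     * }</pre>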
+ */ + protected abstract void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker) + throws Exception; + + protected SpillConfig getSpillConfig(ReadRecordsRequest request) + { + long maxBlockSize = request.getMaxBlockSize(); + if (System.getenv(MAX_BLOCK_SIZE_BYTES) != null) { + maxBlockSize = Long.parseLong(System.getenv(MAX_BLOCK_SIZE_BYTES)); + } + + return SpillConfig.newBuilder() + .withSpillLocation(request.getSplit().getSpillLocation()) + .withMaxBlockBytes(maxBlockSize) + .withMaxInlineBlockBytes(request.getMaxInlineBlockSize()) + .withRequestId(request.getQueryId()) + .withEncryptionKey(request.getSplit().getEncryptionKey()) + .withNumSpillThreads(NUM_SPILL_THREADS) + .build(); + } + + private final PingResponse doPing(PingRequest request) + { + PingResponse response = new PingResponse(request.getCatalogName(), request.getQueryId(), sourceType, CAPABILITIES); + try { + onPing(request); + } + catch (Exception ex) { + logger.warn("doPing: encountered an exception while delegating onPing.", ex); + } + return response; + } + + protected void onPing(PingRequest request) + { + //NoOp + } + + private void assertNotNull(FederationResponse response) + { + if (response == null) { + throw new RuntimeException("Response was null"); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetSplitsRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetSplitsRequest.java new file mode 100644 index 0000000000..c6451830e3 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetSplitsRequest.java @@ -0,0 +1,174 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.CollectionsUtils;
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.MoreObjects;
+import com.google.common.base.Objects;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.beans.Transient;
+import java.util.Collections;
+import java.util.List;
+
+import static java.util.Objects.requireNonNull;
+
+public class GetSplitsRequest
+        extends MetadataRequest
+{
+    private final TableName tableName;
+    private final Block partitions;
+    private final List<String> partitionCols;
+    private final Constraints constraints;
+    private final String continuationToken;
+
+    @JsonCreator
+    public GetSplitsRequest(@JsonProperty("identity") FederatedIdentity identity,
+            @JsonProperty("queryId") String queryId,
+            @JsonProperty("catalogName") String catalogName,
+            @JsonProperty("tableName") TableName tableName,
+            @JsonProperty("partitions") Block partitions,
+            @JsonProperty("partitionCols") List<String> partitionCols,
+            @JsonProperty("constraints") Constraints constraints,
+            @JsonProperty("continuationToken") String continuationToken)
+    {
+        super(identity, MetadataRequestType.GET_SPLITS, queryId, catalogName);
+        requireNonNull(tableName, "tableName is null");
+        requireNonNull(partitions, "partitions is null");
+        requireNonNull(partitionCols, "partitionCols is null");
+        requireNonNull(constraints, "constraints is null");
+        this.tableName = tableName;
+        this.partitions = partitions;
+        this.partitionCols = Collections.unmodifiableList(partitionCols);
+        this.constraints = constraints;
+        this.continuationToken = continuationToken;
+    }
+
+    //Helpful when making a continuation call since it requires the original request but an updated token
+    public GetSplitsRequest(GetSplitsRequest clone, String continuationToken)
+    {
+        this(clone.getIdentity(), clone.getQueryId(), clone.getCatalogName(), clone.tableName, clone.partitions, clone.partitionCols, clone.constraints, continuationToken);
+    }
+
+    @JsonProperty
+    public String getContinuationToken()
+    {
+        return continuationToken;
+    }
+
+    @JsonProperty
+    public TableName getTableName()
+    {
+        return tableName;
+    }
+
+    @Transient
+    public Schema getSchema()
+    {
+        return partitions.getSchema();
+    }
+
+    @JsonProperty
+    public List<String> getPartitionCols()
+    {
+        return partitionCols;
+    }
+
+    @JsonProperty
+    public Block getPartitions()
+    {
+        return partitions;
+    }
+
+    @JsonProperty
+    public Constraints getConstraints()
+    {
+        return constraints;
+    }
+
+    @Transient
+    public boolean hasContinuationToken()
+    {
+        return continuationToken != null;
+    }
+
+    @Override
+    public String toString()
+    {
+        return MoreObjects.toStringHelper(this)
+                .add("queryId", getQueryId())
+                .add("tableName", tableName)
+                .add("partitionCols", partitionCols)
+                .add("requestType", getRequestType())
+                .add("catalogName", getCatalogName())
+                .add("partitions", partitions)
+                .add("constraints", constraints)
+                .add("continuationToken", continuationToken)
+                .toString();
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        partitions.close();
+        constraints.close();
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        GetSplitsRequest that = (GetSplitsRequest) o;
+
+        return Objects.equal(this.tableName, that.tableName) &&
+                Objects.equal(this.partitions, that.partitions) &&
+                CollectionsUtils.equals(this.partitionCols, that.partitionCols) &&
+                Objects.equal(this.continuationToken, that.continuationToken) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(tableName, partitions, partitionCols, continuationToken, getRequestType(), getCatalogName());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetSplitsResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetSplitsResponse.java
new file mode 100644
index 0000000000..025ec5cc53
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetSplitsResponse.java
@@ -0,0 +1,122 @@
+package com.amazonaws.athena.connector.lambda.metadata;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Objects;
+
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+
+import static java.util.Objects.requireNonNull;
+
+public class GetSplitsResponse
+        extends MetadataResponse
+{
+    private final Set<Split> splits;
+    private final String continuationToken;
+
+    @JsonCreator
+    public GetSplitsResponse(@JsonProperty("catalogName") String catalogName,
+            @JsonProperty("splits") Set<Split> splits,
+            @JsonProperty("continuationToken") String continuationToken)
+    {
+        super(MetadataRequestType.GET_SPLITS, catalogName);
+        requireNonNull(splits, "splits is null");
+        this.splits = Collections.unmodifiableSet(splits);
+        this.continuationToken = continuationToken;
+    }
+
+    public GetSplitsResponse(String catalogName,
+            Set<Split> splits)
+    {
+        super(MetadataRequestType.GET_SPLITS, catalogName);
+        requireNonNull(splits, "splits is null");
+        this.splits = Collections.unmodifiableSet(splits);
+        this.continuationToken = null;
+    }
+
+    public GetSplitsResponse(String catalogName,
+            Split split)
+    {
+        super(MetadataRequestType.GET_SPLITS, catalogName);
+        requireNonNull(split, "split is null");
+        Set<Split> splits = new HashSet<>();
+        splits.add(split);
+        this.splits = Collections.unmodifiableSet(splits);
+        this.continuationToken = null;
+    }
+
+    @JsonProperty
+    public Set<Split> getSplits()
+    {
+        return splits;
+    }
+
+    @JsonProperty
+    public String getContinuationToken()
+    {
+        return continuationToken;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        //NoOp
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        GetSplitsResponse that = (GetSplitsResponse) o;
+
+        return Objects.equal(this.splits, that.splits) &&
+                Objects.equal(this.continuationToken, that.continuationToken) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(splits, continuationToken, getRequestType(), getCatalogName());
+    }
+
+    @Override
+    public String toString()
+    {
+        return "GetSplitsResponse{" +
+                "splitSize=" + splits.size() +
+                ", continuationToken='" + continuationToken + '\'' +
+                '}';
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableLayoutRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableLayoutRequest.java
new file mode 100644
index 0000000000..e1c7f67574
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableLayoutRequest.java
@@ -0,0 +1,130 @@
+package com.amazonaws.athena.connector.lambda.metadata;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Objects;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.HashSet;
+import java.util.Set;
+
+import static java.util.Objects.requireNonNull;
+
+public class GetTableLayoutRequest
+        extends MetadataRequest
+{
+    private final TableName tableName;
+    private final Constraints constraints;
+    private final Schema schema;
+    private final Set<String> partitionCols;
+
+    @JsonCreator
+    public GetTableLayoutRequest(@JsonProperty("identity") FederatedIdentity identity,
+            @JsonProperty("queryId") String queryId,
+            @JsonProperty("catalogName") String catalogName,
+            @JsonProperty("tableName") TableName tableName,
+            @JsonProperty("constraints") Constraints constraints,
+            @JsonProperty("schema") Schema schema,
+            @JsonProperty("partitionCols") Set<String> partitionCols)
+    {
+        super(identity, MetadataRequestType.GET_TABLE_LAYOUT, queryId, catalogName);
+        requireNonNull(partitionCols, "partitionCols is null");
+        this.tableName = requireNonNull(tableName, "tableName is null");
+        this.constraints = requireNonNull(constraints, "constraints is null");
+        this.schema = requireNonNull(schema, "schema is null");
+        this.partitionCols = Collections.unmodifiableSet(new HashSet<>(partitionCols));
+    }
+
+    public TableName getTableName()
+    {
+        return tableName;
+    }
+
+    public Constraints getConstraints()
+    {
+        return constraints;
+    }
+
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    public Set<String> getPartitionCols()
+    {
+        return partitionCols;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        for (ValueSet next : constraints.getSummary().values()) {
+            next.close();
+        }
+        constraints.close();
+    }
+
+    @Override
+    public String toString()
+    {
+        return "GetTableLayoutRequest{" +
+                "queryId=" + getQueryId() +
+                ", tableName=" + tableName +
+                ", constraints=" + constraints +
+                ", schema=" + schema +
+                ", partitionCols=" + partitionCols +
+                '}';
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        GetTableLayoutRequest that = (GetTableLayoutRequest) o;
+
+        return Objects.equal(this.tableName, that.tableName) &&
+                Objects.equal(this.constraints, that.constraints) &&
+                Objects.equal(this.schema, that.schema) &&
+                Objects.equal(this.partitionCols, that.partitionCols) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(tableName, constraints, schema, partitionCols, getRequestType(), getCatalogName());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableLayoutResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableLayoutResponse.java
new file mode 100644
index 0000000000..6d57d92fed
--- /dev/null
+++
b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableLayoutResponse.java @@ -0,0 +1,102 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.google.common.base.MoreObjects; +import com.google.common.base.Objects; + +import static java.util.Objects.requireNonNull; + +public class GetTableLayoutResponse + extends MetadataResponse +{ + private final TableName tableName; + private final Block partitions; + + @JsonCreator + public GetTableLayoutResponse(@JsonProperty("catalogName") String catalogName, + @JsonProperty("tableName") TableName tableName, + @JsonProperty("partitions") Block partitions) + { + super(MetadataRequestType.GET_TABLE_LAYOUT, catalogName); + requireNonNull(tableName, "tableName is null"); + requireNonNull(partitions, "partitions is null"); + this.tableName = tableName; + this.partitions = partitions; + } + + @JsonProperty + public TableName getTableName() + { + return tableName; + } + + @JsonProperty + public Block getPartitions() + { + return partitions; + } + + @Override + public String toString() + { + return MoreObjects.toStringHelper(this) + .add("tableName", tableName) + .add("requestType", getRequestType()) + .add("catalogName", getCatalogName()) + .toString(); + } + + @Override + public void close() + throws Exception + { + partitions.close(); + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + GetTableLayoutResponse that = (GetTableLayoutResponse) o; + + return Objects.equal(this.tableName, that.tableName) && + Objects.equal(this.partitions, that.partitions) && + Objects.equal(this.getRequestType(), that.getRequestType()) && + Objects.equal(this.getCatalogName(), that.getCatalogName()); + } + + @Override + public int hashCode() + { + return Objects.hashCode(tableName, partitions, getRequestType(), getCatalogName()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableRequest.java new file mode 100644 index 0000000000..ceb2b0cd3e --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableRequest.java @@ -0,0 +1,90 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file 
except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.google.common.base.Objects; + +import static java.util.Objects.requireNonNull; + +public class GetTableRequest + extends MetadataRequest +{ + private final TableName tableName; + + @JsonCreator + public GetTableRequest(@JsonProperty("identity") FederatedIdentity identity, + @JsonProperty("queryId") String queryId, + @JsonProperty("catalogName") String catalogName, + @JsonProperty("tableName") TableName tableName) + { + super(identity, MetadataRequestType.GET_TABLE, queryId, catalogName); + requireNonNull(tableName, "tableName is null"); + this.tableName = tableName; + } + + public TableName getTableName() + { + return tableName; + } + + @Override + public void close() + throws Exception + { + //No Op + } + + @Override + public String toString() + { + return "GetTableRequest{" + + "queryId=" + getQueryId() + + ", tableName=" + tableName + + '}'; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + GetTableRequest that = (GetTableRequest) o; + + return Objects.equal(this.tableName, that.tableName) && + Objects.equal(this.getRequestType(), that.getRequestType()) && + Objects.equal(this.getCatalogName(), that.getCatalogName()); + } + + @Override + public int hashCode() + { + return Objects.hashCode(tableName, getRequestType(), getCatalogName()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableResponse.java new file mode 100644 index 0000000000..804c309597 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/GetTableResponse.java @@ -0,0 +1,120 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.MoreObjects;
+import com.google.common.base.Objects;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.util.Collections;
+import java.util.Set;
+
+import static java.util.Objects.requireNonNull;
+
+public class GetTableResponse
+        extends MetadataResponse
+{
+    private final TableName tableName;
+    private final Schema schema;
+    private final Set<String> partitionColumns;
+
+    @JsonCreator
+    public GetTableResponse(@JsonProperty("catalogName") String catalogName,
+            @JsonProperty("tableName") TableName tableName,
+            @JsonProperty("schema") Schema schema,
+            @JsonProperty("partitionColumns") Set<String> partitionColumns)
+    {
+        super(MetadataRequestType.GET_TABLE, catalogName);
+        requireNonNull(tableName, "tableName is null");
+        requireNonNull(schema, "schema is null");
+        requireNonNull(partitionColumns, "partitionColumns is null");
+        this.tableName = tableName;
+        this.schema = schema;
+        this.partitionColumns = partitionColumns;
+    }
+
+    public GetTableResponse(String catalogName, TableName tableName, Schema schema)
+    {
+        this(catalogName, tableName, schema, Collections.emptySet());
+    }
+
+    public TableName getTableName()
+    {
+        return tableName;
+    }
+
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    public Set<String> getPartitionColumns()
+    {
+        return Collections.unmodifiableSet(partitionColumns);
+    }
+
+    @Override
+    public String toString()
+    {
+        return MoreObjects.toStringHelper(this)
+                .add("tableName", tableName)
+                .add("schema", schema)
+                .add("partitionColumns", partitionColumns)
+                .add("requestType", getRequestType())
+                .add("catalogName", getCatalogName())
+                .toString();
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        //No Op
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        GetTableResponse that = (GetTableResponse) o;
+
+        return Objects.equal(this.tableName, that.tableName) &&
+                Objects.equal(this.schema, that.schema) &&
+                Objects.equal(this.partitionColumns, that.partitionColumns) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(tableName, schema, partitionColumns, getRequestType(), getCatalogName());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListSchemasRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListSchemasRequest.java
new file mode 100644
index 0000000000..afaeefb56c
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListSchemasRequest.java
@@ -0,0 +1,73 @@
+package com.amazonaws.athena.connector.lambda.metadata;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.google.common.base.Objects; + +public class ListSchemasRequest + extends MetadataRequest +{ + @JsonCreator + public ListSchemasRequest(@JsonProperty("identity") FederatedIdentity identity, + @JsonProperty("queryId") String queryId, + @JsonProperty("catalogName") String catalogName) + { + super(identity, MetadataRequestType.LIST_SCHEMAS, queryId, catalogName); + } + + @Override + public void close() + throws Exception + { + //No Op + } + + @Override + public String toString() + { + return "ListSchemasRequest{" + "queryId=" + getQueryId() + "}"; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + ListSchemasRequest that = (ListSchemasRequest) o; + + return Objects.equal(this.getRequestType(), that.getRequestType()) && + Objects.equal(this.getCatalogName(), that.getCatalogName()); + } + + @Override + public int hashCode() + { + return Objects.hashCode(getRequestType(), getCatalogName()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListSchemasResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListSchemasResponse.java new file mode 100644 index 0000000000..3eb0c7ab82 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListSchemasResponse.java @@ -0,0 +1,89 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.CollectionsUtils;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Objects;
+
+import java.util.Collection;
+import java.util.Collections;
+
+import static java.util.Objects.requireNonNull;
+
+public class ListSchemasResponse
+        extends MetadataResponse
+{
+    private final Collection<String> schemas;
+
+    @JsonCreator
+    public ListSchemasResponse(@JsonProperty("catalogName") String catalogName,
+            @JsonProperty("schemas") Collection<String> schemas)
+    {
+        super(MetadataRequestType.LIST_SCHEMAS, catalogName);
+        requireNonNull(schemas, "schemas is null");
+        this.schemas = Collections.unmodifiableCollection(schemas);
+    }
+
+    public Collection<String> getSchemas()
+    {
+        return schemas;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        //No Op
+    }
+
+    @Override
+    public String toString()
+    {
+        return "ListSchemasResponse{" +
+                "schemas=" + schemas +
+                '}';
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        ListSchemasResponse that = (ListSchemasResponse) o;
+
+        return CollectionsUtils.equals(this.schemas, that.schemas) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(schemas, getRequestType(), getCatalogName());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListTablesRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListTablesRequest.java
new file mode 100644
index 0000000000..85cc84c8e9
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListTablesRequest.java
@@ -0,0 +1,90 @@
+package com.amazonaws.athena.connector.lambda.metadata;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Objects;
+
+public class ListTablesRequest
+        extends MetadataRequest
+{
+    private final String schemaName;
+
+    /**
+     * @param catalogName The name of the catalog being requested.
+     * @param schemaName This may be null if no specific schema is requested.
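+     * A connector might, for example (illustrative only, DEFAULT_SCHEMA is a hypothetical constant),
+     * fall back to a default: {@code String schema = (schemaName != null) ? schemaName : DEFAULT_SCHEMA;}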
+ */ + @JsonCreator + public ListTablesRequest(@JsonProperty("identity") FederatedIdentity identity, + @JsonProperty("queryId") String queryId, + @JsonProperty("catalogName") String catalogName, + @JsonProperty("schemaName") String schemaName) + { + super(identity, MetadataRequestType.LIST_TABLES, queryId, catalogName); + this.schemaName = schemaName; + } + + public String getSchemaName() + { + return schemaName; + } + + @Override + public void close() + throws Exception + { + //No Op + } + + @Override + public String toString() + { + return "ListTablesRequest{" + + "queryId=" + getQueryId() + + ", schemaName='" + schemaName + '\'' + + '}'; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + ListTablesRequest that = (ListTablesRequest) o; + + return Objects.equal(this.schemaName, that.schemaName) && + Objects.equal(this.getRequestType(), that.getRequestType()) && + Objects.equal(this.getCatalogName(), that.getCatalogName()); + } + + @Override + public int hashCode() + { + return Objects.hashCode(schemaName, getRequestType(), getCatalogName()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListTablesResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListTablesResponse.java new file mode 100644 index 0000000000..ab72a67ef0 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/ListTablesResponse.java @@ -0,0 +1,90 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.CollectionsUtils;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.Objects;
+
+import java.util.Collection;
+import java.util.Collections;
+
+import static java.util.Objects.requireNonNull;
+
+public class ListTablesResponse
+        extends MetadataResponse
+{
+    private final Collection<TableName> tables;
+
+    @JsonCreator
+    public ListTablesResponse(@JsonProperty("catalogName") String catalogName,
+            @JsonProperty("tables") Collection<TableName> tables)
+    {
+        super(MetadataRequestType.LIST_TABLES, catalogName);
+        requireNonNull(tables, "tables is null");
+        this.tables = Collections.unmodifiableCollection(tables);
+    }
+
+    public Collection<TableName> getTables()
+    {
+        return tables;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        //No Op
+    }
+
+    @Override
+    public String toString()
+    {
+        return "ListTablesResponse{" +
+                "tables=" + tables +
+                '}';
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        ListTablesResponse that = (ListTablesResponse) o;
+
+        return CollectionsUtils.equals(this.tables, that.tables) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(tables, getRequestType(), getCatalogName());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataRequest.java
new file mode 100644
index 0000000000..5533f72b9f
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataRequest.java
@@ -0,0 +1,59 @@
+package com.amazonaws.athena.connector.lambda.metadata;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; + +import static java.util.Objects.requireNonNull; + +public abstract class MetadataRequest + extends FederationRequest +{ + private final MetadataRequestType requestType; + private final String queryId; + private final String catalogName; + + public MetadataRequest(FederatedIdentity identity, MetadataRequestType requestType, String queryId, String catalogName) + { + super(identity); + requireNonNull(requestType, "requestType is null"); + requireNonNull(catalogName, "catalogName is null"); + this.requestType = requestType; + this.catalogName = catalogName; + this.queryId = queryId; + } + + public MetadataRequestType getRequestType() + { + return requestType; + } + + public String getCatalogName() + { + return catalogName; + } + + public String getQueryId() + { + return queryId; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataRequestType.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataRequestType.java new file mode 100644 index 0000000000..ce76ebf7cb --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataRequestType.java @@ -0,0 +1,30 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +public enum MetadataRequestType +{ + LIST_TABLES, + LIST_SCHEMAS, + GET_TABLE, + GET_TABLE_LAYOUT, + GET_SPLITS; +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataResponse.java new file mode 100644 index 0000000000..ba768650c1 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataResponse.java @@ -0,0 +1,50 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.request.FederationResponse; + +import static java.util.Objects.requireNonNull; + +public abstract class MetadataResponse + extends FederationResponse +{ + private final MetadataRequestType requestType; + private final String catalogName; + + public MetadataResponse(MetadataRequestType requestType, String catalogName) + { + requireNonNull(requestType, "requestType is null"); + requireNonNull(catalogName, "catalogName is null"); + this.requestType = requestType; + this.catalogName = catalogName; + } + + public MetadataRequestType getRequestType() + { + return requestType; + } + + public String getCatalogName() + { + return catalogName; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataService.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataService.java new file mode 100644 index 0000000000..120a467c36 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/MetadataService.java @@ -0,0 +1,29 @@ +package com.amazonaws.athena.connector.lambda.metadata; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.services.lambda.invoke.LambdaFunction; + +public interface MetadataService +{ + @LambdaFunction(functionName = "metadata") + MetadataResponse getMetadata(final MetadataRequest request); +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/DefaultGlueType.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/DefaultGlueType.java new file mode 100644 index 0000000000..8c203614e3 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/DefaultGlueType.java @@ -0,0 +1,82 @@ +package com.amazonaws.athena.connector.lambda.metadata.glue; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+
+import java.util.HashMap;
+import java.util.Map;
+
+public enum DefaultGlueType
+{
+    INT("int", Types.MinorType.INT.getType()),
+    VARCHAR("string", Types.MinorType.VARCHAR.getType()),
+    BIGINT("bigint", Types.MinorType.BIGINT.getType()),
+    DOUBLE("double", Types.MinorType.FLOAT8.getType()),
+    FLOAT("float", Types.MinorType.FLOAT4.getType()),
+    SMALLINT("smallint", Types.MinorType.SMALLINT.getType()),
+    TINYINT("tinyint", Types.MinorType.TINYINT.getType()),
+    BIT("boolean", Types.MinorType.BIT.getType()),
+    VARBINARY("binary", Types.MinorType.VARBINARY.getType());
+
+    private static final Map<String, DefaultGlueType> TYPE_MAP = new HashMap<>();
+
+    static {
+        for (DefaultGlueType next : DefaultGlueType.values()) {
+            TYPE_MAP.put(next.id, next);
+        }
+    }
+
+    private final String id;
+    private final ArrowType arrowType;
+
+    DefaultGlueType(String id, ArrowType arrowType)
+    {
+        this.id = id;
+        this.arrowType = arrowType;
+    }
+
+    public static DefaultGlueType fromId(String id)
+    {
+        DefaultGlueType result = TYPE_MAP.get(id.toLowerCase());
+        if (result == null) {
+            throw new IllegalArgumentException("Unknown DefaultGlueType for id: " + id);
+        }
+
+        return result;
+    }
+
+    public static ArrowType toArrowType(String id)
+    {
+        DefaultGlueType result = TYPE_MAP.get(id.toLowerCase());
+        if (result == null) {
+            return null;
+        }
+
+        return result.getArrowType();
+    }
+
+    public ArrowType getArrowType()
+    {
+        return arrowType;
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueFieldLexer.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueFieldLexer.java
new file mode 100644
index 0000000000..6c11b02fc8
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueFieldLexer.java
@@ -0,0 +1,125 @@
+package com.amazonaws.athena.connector.lambda.metadata.glue;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
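+ *
+ * A usage sketch (illustrative; it assumes only the lexer below and Arrow's Field):
+ * GlueFieldLexer turns a Glue/Hive type signature, including nested types, into an
+ * Arrow Field:
+ *
+ *   Field field = GlueFieldLexer.lex("my_col", "struct<id:int,tags:array<string>>");
+ *   // field is a STRUCT with an INT child "id" and a LIST child "tags" of VARCHAR
+ *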
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public class GlueFieldLexer +{ + private static final Logger logger = LoggerFactory.getLogger(GlueFieldLexer.class); + + private static final String STRUCT = "struct"; + private static final String LIST = "array"; + + private static final BaseTypeMapper DEFAULT_TYPE_MAPPER = (String type) -> DefaultGlueType.toArrowType(type); + + private GlueFieldLexer() {} + + public interface BaseTypeMapper + { + //Return Null if the supplied value is not a base type + ArrowType getType(String type); + } + + public static Field lex(String name, String input) + { + if (DEFAULT_TYPE_MAPPER.getType(input) != null) { + return FieldBuilder.newBuilder(name, DEFAULT_TYPE_MAPPER.getType(input)).build(); + } + + GlueTypeParser parser = new GlueTypeParser(input); + return lexComplex(name, parser.next(), parser, DEFAULT_TYPE_MAPPER); + } + + public static Field lex(String name, String input, BaseTypeMapper mapper) + { + if (mapper.getType(input) != null) { + return FieldBuilder.newBuilder(name, mapper.getType(input)).build(); + } + + GlueTypeParser parser = new GlueTypeParser(input); + return lexComplex(name, parser.next(), parser, mapper); + } + + private static Field lexComplex(String name, GlueTypeParser.Token startToken, GlueTypeParser parser, BaseTypeMapper mapper) + { + FieldBuilder fieldBuilder; + + logger.debug("lexComplex: enter - {}", name); + if (startToken.getMarker() != GlueTypeParser.FIELD_START) { + throw new RuntimeException("Parse error, expected " + GlueTypeParser.FIELD_START + + " but found " + startToken.getMarker()); + } + + if (startToken.getValue().toLowerCase().equals(STRUCT)) { + fieldBuilder = FieldBuilder.newBuilder(name, Types.MinorType.STRUCT.getType()); + } + else if (startToken.getValue().toLowerCase().equals(LIST)) { + GlueTypeParser.Token arrayType = parser.next(); + return FieldBuilder.newBuilder(name, Types.MinorType.LIST.getType()) + .addField(FieldBuilder.newBuilder(name, mapper.getType(arrayType.getValue())).build()) + .build(); + } + else { + throw new RuntimeException("Unexpected start type " + startToken.getValue()); + } + + while (parser.hasNext() && parser.currentToken().getMarker() != GlueTypeParser.FIELD_END) { + Field child = lex(parser.next(), parser, mapper); + fieldBuilder.addField(child); + } + parser.next(); + + logger.debug("lexComplex: exit - {}", name); + return fieldBuilder.build(); + } + + private static Field lex(GlueTypeParser.Token startToken, GlueTypeParser parser, BaseTypeMapper mapper) + { + GlueTypeParser.Token nameToken = startToken; + logger.debug("lex: enter - {}", nameToken.getValue()); + if (!nameToken.getMarker().equals(GlueTypeParser.FIELD_DIV)) { + throw new RuntimeException("Expected Field DIV but found " + nameToken.getMarker() + + " while processing " + nameToken.getValue()); + } + + String name = nameToken.getValue(); + + GlueTypeParser.Token typeToken = parser.next(); + if (typeToken.getMarker().equals(GlueTypeParser.FIELD_START)) { + logger.debug("lex: exit - {}", nameToken.getValue()); + return lexComplex(name, typeToken, parser, mapper); + } + else if (typeToken.getMarker().equals(GlueTypeParser.FIELD_SEP) || + typeToken.getMarker().equals(GlueTypeParser.FIELD_END) + ) { + logger.debug("lex: exit - {}", nameToken.getValue()); + return 
FieldBuilder.newBuilder(name, mapper.getType(typeToken.getValue())).build();
+        }
+        throw new RuntimeException("Unexpected Token " + typeToken.getValue() + "[" + typeToken.getMarker() + "]" +
+                " @ " + typeToken.getPos() + " while processing " + name);
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueTypeParser.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueTypeParser.java
new file mode 100644
index 0000000000..97ea59dfdb
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueTypeParser.java
@@ -0,0 +1,153 @@
+package com.amazonaws.athena.connector.lambda.metadata.glue;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashSet;
+import java.util.Objects;
+import java.util.Set;
+
+public class GlueTypeParser
+{
+    private static final Logger logger = LoggerFactory.getLogger(GlueTypeParser.class);
+
+    protected static final Character FIELD_START = '<';
+    protected static final Character FIELD_END = '>';
+    protected static final Character FIELD_DIV = ':';
+    protected static final Character FIELD_SEP = ',';
+    private static final Set<Character> TOKENS = new HashSet<>();
+
+    static {
+        TOKENS.add(FIELD_START);
+        TOKENS.add(FIELD_END);
+        TOKENS.add(FIELD_DIV);
+        TOKENS.add(FIELD_SEP);
+    }
+
+    private final String input;
+    private int pos;
+    private Token current = null;
+
+    public GlueTypeParser(String input)
+    {
+        this.input = input;
+        this.pos = 0;
+    }
+
+    public boolean hasNext()
+    {
+        return pos < input.length();
+    }
+
+    public Token next()
+    {
+        StringBuilder sb = new StringBuilder();
+        int readPos = pos;
+        while (input.length() > readPos) {
+            Character last = input.charAt(readPos++);
+            if (last.equals(' ')) {
+                //NoOp
+            }
+            else if (!TOKENS.contains(last)) {
+                sb.append(last);
+            }
+            else {
+                pos = readPos;
+                current = new Token(sb.toString(), last, readPos);
+                logger.debug("next: {}", current);
+                return current;
+            }
+        }
+        pos = readPos;
+
+        current = new Token(sb.toString(), null, readPos);
+        logger.debug("next: {}", current);
+
+        return current;
+    }
+
+    public Token currentToken()
+    {
+        return current;
+    }
+
+    public static class Token
+    {
+        private final String value;
+        private final Character marker;
+        private final int pos;
+
+        public Token(String value, Character marker, int pos)
+        {
+            this.value = value;
+            this.marker = marker;
+            this.pos = pos;
+        }
+
+        public String getValue()
+        {
+            return value;
+        }
+
+        public Character getMarker()
+        {
+            return marker;
+        }
+
+        public int getPos()
+        {
+            return pos;
+        }
+
+        @Override
+        public boolean equals(Object o)
+        {
+            if (this == o) {
+                return true;
+            }
+            if (o == null || getClass() != o.getClass()) {
+                return false;
+            }
+            Token token = (Token) o;
+            return getPos() == token.getPos() &&
+                    getValue().equals(token.getValue()) &&
+                    Objects.equals(getMarker(), token.getMarker());
+        }
+
+        @Override
+        public int hashCode()
+        {
+            return Objects.hash(getValue(), getMarker(), getPos());
+        }
+
+        @Override
+        public String toString()
+        {
+            return "Token{" +
+                    "value='" + value + '\'' +
+                    ", marker=" + marker +
+                    ", pos=" + pos +
+                    '}';
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/ReadRecordsRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/ReadRecordsRequest.java
new file mode 100644
index 0000000000..3b32f801c3
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/ReadRecordsRequest.java
@@ -0,0 +1,155 @@
+package com.amazonaws.athena.connector.lambda.records;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.security.FederatedIdentity;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.MoreObjects;
+import com.google.common.base.Objects;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import static java.util.Objects.requireNonNull;
+
+public class ReadRecordsRequest
+        extends RecordRequest
+{
+    private final TableName tableName;
+    private final Schema schema;
+    private final Split split;
+    private final Constraints constraints;
+    private final long maxBlockSize;
+    private final long maxInlineBlockSize;
+
+    @JsonCreator
+    public ReadRecordsRequest(@JsonProperty("identity") FederatedIdentity identity,
+            @JsonProperty("catalogName") String catalogName,
+            @JsonProperty("queryId") String queryId,
+            @JsonProperty("tableName") TableName tableName,
+            @JsonProperty("schema") Schema schema,
+            @JsonProperty("split") Split split,
+            @JsonProperty("constraints") Constraints constraints,
+            @JsonProperty("maxBlockSize") long maxBlockSize,
+            @JsonProperty("maxInlineBlockSize") long maxInlineBlockSize)
+    {
+        super(identity, RecordRequestType.READ_RECORDS, catalogName, queryId);
+        requireNonNull(schema, "schema is null");
+        requireNonNull(tableName, "tableName is null");
+        requireNonNull(split, "split is null");
+        requireNonNull(constraints, "constraints is null");
+        this.schema = schema;
+        this.tableName = tableName;
+        this.split = split;
+        this.maxBlockSize = maxBlockSize;
+        this.maxInlineBlockSize = maxInlineBlockSize;
+        this.constraints = constraints;
+    }
+
+    @JsonProperty
+    public TableName getTableName()
+    {
+        return tableName;
+    }
+
+    @JsonProperty
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    @JsonProperty
+    public Split getSplit()
+    {
+        return split;
+    }
+
+    @JsonProperty
+    public long getMaxInlineBlockSize()
+    {
+        return maxInlineBlockSize;
} + + @JsonProperty + public long getMaxBlockSize() + { + return maxBlockSize; + } + + @JsonProperty + public Constraints getConstraints() + { + return constraints; + } + + @Override + public void close() + throws Exception + { + constraints.close(); + } + + @Override + public String toString() + { + return MoreObjects.toStringHelper(this) + .add("queryId", getQueryId()) + .add("tableName", tableName) + .add("schema", schema) + .add("split", split) + .add("requestType", getRequestType()) + .add("catalogName", getCatalogName()) + .add("maxBlockSize", maxBlockSize) + .add("maxInlineBlockSize", maxInlineBlockSize) + .add("constraints", constraints) + .toString(); + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + ReadRecordsRequest that = (ReadRecordsRequest) o; + + return Objects.equal(this.tableName, that.tableName) && + Objects.equal(this.schema, that.schema) && + Objects.equal(this.split, that.split) && + Objects.equal(this.constraints, that.constraints) && + Objects.equal(this.maxBlockSize, that.maxBlockSize) && + Objects.equal(this.maxInlineBlockSize, that.maxInlineBlockSize) && + Objects.equal(this.getRequestType(), that.getRequestType()) && + Objects.equal(this.getCatalogName(), that.getCatalogName()) && + Objects.equal(this.getQueryId(), that.getQueryId()); + } + + @Override + public int hashCode() + { + return Objects.hashCode(tableName, schema, split, constraints, maxBlockSize, maxInlineBlockSize, + getRequestType(), getCatalogName(), getQueryId()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/ReadRecordsResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/ReadRecordsResponse.java new file mode 100644 index 0000000000..059ef87baf --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/ReadRecordsResponse.java @@ -0,0 +1,105 @@ +package com.amazonaws.athena.connector.lambda.records; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
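+ *
+ * A short sketch (illustrative; the allocator and schema are hypothetical): a
+ * connector answers a ReadRecordsRequest inline by filling a Block with rows and
+ * wrapping it in this response:
+ *
+ *   Block block = allocator.createBlock(schema);
+ *   // ... write rows into block ...
+ *   ReadRecordsResponse response = new ReadRecordsResponse("my_catalog", block);
+ *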
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import com.google.common.base.MoreObjects; +import com.google.common.base.Objects; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.beans.Transient; + +import static java.util.Objects.requireNonNull; + +public class ReadRecordsResponse + extends RecordResponse +{ + private final Block records; + + @JsonCreator + public ReadRecordsResponse(@JsonProperty("catalogName") String catalogName, + @JsonProperty("records") Block records) + { + super(RecordRequestType.READ_RECORDS, catalogName); + requireNonNull(records, "records is null"); + this.records = records; + } + + @Transient + public Schema getSchema() + { + return records.getSchema(); + } + + @JsonProperty + public Block getRecords() + { + return records; + } + + @Transient + public int getRecordCount() + { + return records.getRowCount(); + } + + @Override + public void close() + throws Exception + { + records.close(); + } + + @Override + public String toString() + { + return MoreObjects.toStringHelper(this) + .add("records", records) + .add("requestType", getRequestType()) + .add("catalogName", getCatalogName()) + .toString(); + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + ReadRecordsResponse that = (ReadRecordsResponse) o; + + return Objects.equal(this.records, that.records) && + Objects.equal(this.getRequestType(), that.getRequestType()) && + Objects.equal(this.getCatalogName(), that.getCatalogName()); + } + + @Override + public int hashCode() + { + return Objects.hashCode(records, getRequestType(), getCatalogName()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordRequest.java new file mode 100644 index 0000000000..c3fb204b08 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordRequest.java @@ -0,0 +1,60 @@ +package com.amazonaws.athena.connector.lambda.records; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; + +import static java.util.Objects.requireNonNull; + +public abstract class RecordRequest + extends FederationRequest +{ + private final RecordRequestType requestType; + private final String catalogName; + private final String queryId; + + public RecordRequest(FederatedIdentity identity, RecordRequestType requestType, String catalogName, String queryId) + { + super(identity); + requireNonNull(requestType, "requestType is null"); + requireNonNull(catalogName, "catalogName is null"); + requireNonNull(queryId, "queryId is null"); + this.requestType = requestType; + this.catalogName = catalogName; + this.queryId = queryId; + } + + public RecordRequestType getRequestType() + { + return requestType; + } + + public String getCatalogName() + { + return catalogName; + } + + public String getQueryId() + { + return queryId; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordRequestType.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordRequestType.java new file mode 100644 index 0000000000..7c6e91fe6e --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordRequestType.java @@ -0,0 +1,26 @@ +package com.amazonaws.athena.connector.lambda.records; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +public enum RecordRequestType +{ + READ_RECORDS; +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordResponse.java new file mode 100644 index 0000000000..9ced8bd182 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordResponse.java @@ -0,0 +1,50 @@ +package com.amazonaws.athena.connector.lambda.records; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.request.FederationResponse; + +import static java.util.Objects.requireNonNull; + +public abstract class RecordResponse + extends FederationResponse +{ + private final RecordRequestType requestType; + private final String catalogName; + + public RecordResponse(RecordRequestType requestType, String catalogName) + { + requireNonNull(requestType, "requestType is null"); + requireNonNull(catalogName, "catalogName is null"); + this.requestType = requestType; + this.catalogName = catalogName; + } + + public RecordRequestType getRequestType() + { + return requestType; + } + + public String getCatalogName() + { + return catalogName; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordService.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordService.java new file mode 100644 index 0000000000..6ed540ae7e --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RecordService.java @@ -0,0 +1,29 @@ +package com.amazonaws.athena.connector.lambda.records; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.services.lambda.invoke.LambdaFunction; + +public interface RecordService +{ + @LambdaFunction(functionName = "record") + RecordResponse readRecords(final RecordRequest request); +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RemoteReadRecordsResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RemoteReadRecordsResponse.java new file mode 100644 index 0000000000..f57b252f83 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/records/RemoteReadRecordsResponse.java @@ -0,0 +1,125 @@ +package com.amazonaws.athena.connector.lambda.records; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
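+ *
+ * A short sketch (illustrative; spillLocations and encryptionKey are hypothetical):
+ * when results exceed the inline size limit, a connector spills encrypted blocks
+ * and returns only their locations plus the key needed to read them:
+ *
+ *   RemoteReadRecordsResponse response =
+ *           new RemoteReadRecordsResponse("my_catalog", schema, spillLocations, encryptionKey);
+ *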
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation;
+import com.amazonaws.athena.connector.lambda.security.EncryptionKey;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.google.common.base.MoreObjects;
+import com.google.common.base.Objects;
+import org.apache.arrow.vector.types.pojo.Schema;
+
+import java.beans.Transient;
+import java.util.Collections;
+import java.util.List;
+
+import static java.util.Objects.requireNonNull;
+
+public class RemoteReadRecordsResponse
+        extends RecordResponse
+{
+    private final Schema schema;
+    private final List<SpillLocation> remoteBlocks;
+    private final EncryptionKey encryptionKey;
+
+    @JsonCreator
+    public RemoteReadRecordsResponse(@JsonProperty("catalogName") String catalogName,
+            @JsonProperty("schema") Schema schema,
+            @JsonProperty("remoteBlocks") List<SpillLocation> remoteBlocks,
+            @JsonProperty("encryptionKey") EncryptionKey encryptionKey)
+    {
+        super(RecordRequestType.READ_RECORDS, catalogName);
+        requireNonNull(schema, "schema is null");
+        requireNonNull(remoteBlocks, "remoteBlocks is null");
+        this.schema = schema;
+        this.remoteBlocks = Collections.unmodifiableList(remoteBlocks);
+        this.encryptionKey = encryptionKey;
+    }
+
+    @JsonProperty
+    public Schema getSchema()
+    {
+        return schema;
+    }
+
+    @JsonProperty
+    public List<SpillLocation> getRemoteBlocks()
+    {
+        return remoteBlocks;
+    }
+
+    @Transient
+    public int getNumberBlocks()
+    {
+        return remoteBlocks.size();
+    }
+
+    @JsonProperty
+    public EncryptionKey getEncryptionKey()
+    {
+        return encryptionKey;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        //NoOp
+    }
+
+    @Override
+    public String toString()
+    {
+        return MoreObjects.toStringHelper(this)
+                .add("schema", schema)
+                .add("remoteBlocks", remoteBlocks)
+                .add("encryptionKey", "XXXXXX")
+                .add("requestType", getRequestType())
+                .add("catalogName", getCatalogName())
+                .toString();
+    }
+
+    @Override
+    public boolean equals(Object o)
+    {
+        if (this == o) {
+            return true;
+        }
+        if (o == null || getClass() != o.getClass()) {
+            return false;
+        }
+
+        RemoteReadRecordsResponse that = (RemoteReadRecordsResponse) o;
+
+        return Objects.equal(this.schema, that.schema) &&
+                Objects.equal(this.remoteBlocks, that.remoteBlocks) &&
+                Objects.equal(this.encryptionKey, that.encryptionKey) &&
+                Objects.equal(this.getRequestType(), that.getRequestType()) &&
+                Objects.equal(this.getCatalogName(), that.getCatalogName());
+    }
+
+    @Override
+    public int hashCode()
+    {
+        return Objects.hashCode(schema, remoteBlocks, encryptionKey, getRequestType(), getCatalogName());
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/FederationRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/FederationRequest.java
new file mode 100644
index 0000000000..4a4df3e1cc
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/FederationRequest.java
@@ -0,0 +1,66 @@
+package com.amazonaws.athena.connector.lambda.request;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.udf.UserDefinedFunctionRequest; +import com.fasterxml.jackson.annotation.JsonIgnoreProperties; +import com.fasterxml.jackson.annotation.JsonSubTypes; +import com.fasterxml.jackson.annotation.JsonTypeInfo; + +@JsonIgnoreProperties(ignoreUnknown = true) +@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY) +@JsonSubTypes({ + @JsonSubTypes.Type(value = ListSchemasRequest.class, name = "ListSchemasRequest"), + @JsonSubTypes.Type(value = ListTablesRequest.class, name = "ListTablesRequest"), + @JsonSubTypes.Type(value = GetTableRequest.class, name = "GetTableRequest"), + @JsonSubTypes.Type(value = GetTableLayoutRequest.class, name = "GetTableLayoutRequest"), + @JsonSubTypes.Type(value = GetSplitsRequest.class, name = "GetSplitsRequest"), + @JsonSubTypes.Type(value = ReadRecordsRequest.class, name = "ReadRecordsRequest"), + @JsonSubTypes.Type(value = UserDefinedFunctionRequest.class, name = "UserDefinedFunctionRequest"), + @JsonSubTypes.Type(value = PingRequest.class, name = "PingRequest") +}) +public abstract class FederationRequest + implements AutoCloseable +{ + private final FederatedIdentity identity; + + public FederationRequest() + { + identity = null; + } + + public FederationRequest(FederatedIdentity identity) + { + this.identity = identity; + } + + public FederatedIdentity getIdentity() + { + return identity; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/FederationResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/FederationResponse.java new file mode 100644 index 0000000000..f0cb86cc5b --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/FederationResponse.java @@ -0,0 +1,50 @@ +package com.amazonaws.athena.connector.lambda.request; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
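+ *
+ * A round-trip sketch (illustrative; it assumes Jackson's ObjectMapper and the
+ * PingResponse type in this package): the @JsonTypeInfo/@JsonSubTypes annotations
+ * on FederationRequest and FederationResponse let one entry point rebuild any
+ * concrete type from its type name:
+ *
+ *   ObjectMapper mapper = new ObjectMapper();
+ *   String json = mapper.writeValueAsString(new PingResponse("my_catalog", "query-1", "example", 1));
+ *   FederationResponse response = mapper.readValue(json, FederationResponse.class);
+ *   // response is a PingResponse again
+ *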
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.udf.UserDefinedFunctionResponse; +import com.fasterxml.jackson.annotation.JsonIgnoreProperties; +import com.fasterxml.jackson.annotation.JsonSubTypes; +import com.fasterxml.jackson.annotation.JsonTypeInfo; + +@JsonIgnoreProperties(ignoreUnknown = true) +@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY) +@JsonSubTypes({ + @JsonSubTypes.Type(value = ListSchemasResponse.class, name = "ListSchemasResponse"), + @JsonSubTypes.Type(value = ListTablesResponse.class, name = "ListTablesResponse"), + @JsonSubTypes.Type(value = GetTableResponse.class, name = "GetTableResponse"), + @JsonSubTypes.Type(value = GetTableLayoutResponse.class, name = "GetTableLayoutResponse"), + @JsonSubTypes.Type(value = GetSplitsResponse.class, name = "GetSplitsResponse"), + @JsonSubTypes.Type(value = ReadRecordsResponse.class, name = "ReadRecordsResponse"), + @JsonSubTypes.Type(value = RemoteReadRecordsResponse.class, name = "RemoteReadRecordsResponse"), + @JsonSubTypes.Type(value = UserDefinedFunctionResponse.class, name = "UserDefinedFunctionResponse"), + @JsonSubTypes.Type(value = PingResponse.class, name = "PingResponse") +}) +public abstract class FederationResponse implements AutoCloseable +{ +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/PingRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/PingRequest.java new file mode 100644 index 0000000000..18bcbc4972 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/PingRequest.java @@ -0,0 +1,74 @@ +package com.amazonaws.athena.connector.lambda.request; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
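+ *
+ * A short sketch (illustrative; identity is hypothetical): Athena health-checks a
+ * connector by sending a PingRequest, which the connector answers with a
+ * PingResponse describing its source type and capabilities:
+ *
+ *   PingRequest ping = new PingRequest(identity, "my_catalog", "query-1");
+ *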
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import static java.util.Objects.requireNonNull; + +public class PingRequest + extends FederationRequest +{ + private final String catalogName; + private final String queryId; + + @JsonCreator + public PingRequest(@JsonProperty("identity") FederatedIdentity identity, + @JsonProperty("catalogName") String catalogName, + @JsonProperty("queryId") String queryId) + { + super(identity); + requireNonNull(catalogName, "catalogName is null"); + requireNonNull(queryId, "queryId is null"); + this.catalogName = catalogName; + this.queryId = queryId; + } + + @JsonProperty("catalogName") + public String getCatalogName() + { + return catalogName; + } + + @JsonProperty("queryId") + public String getQueryId() + { + return queryId; + } + + @Override + public void close() + throws Exception + { + //no-op + } + + @Override + public String toString() + { + return "PingRequest{" + + "catalogName='" + catalogName + '\'' + + ", queryId='" + queryId + '\'' + + '}'; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/PingResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/PingResponse.java new file mode 100644 index 0000000000..d52d4901cb --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/request/PingResponse.java @@ -0,0 +1,91 @@ +package com.amazonaws.athena.connector.lambda.request; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+
+import static java.util.Objects.requireNonNull;
+
+public class PingResponse
+        extends FederationResponse
+{
+    private final int capabilities;
+    private final String catalogName;
+    private final String queryId;
+    private final String sourceType;
+
+    @JsonCreator
+    public PingResponse(@JsonProperty("catalogName") String catalogName,
+            @JsonProperty("queryId") String queryId,
+            @JsonProperty("sourceType") String sourceType,
+            @JsonProperty("capabilities") int capabilities)
+    {
+        requireNonNull(catalogName, "catalogName is null");
+        requireNonNull(queryId, "queryId is null");
+        this.catalogName = catalogName;
+        this.queryId = queryId;
+        this.sourceType = sourceType;
+        this.capabilities = capabilities;
+    }
+
+    @JsonProperty("catalogName")
+    public String getCatalogName()
+    {
+        return catalogName;
+    }
+
+    @JsonProperty("queryId")
+    public String getQueryId()
+    {
+        return queryId;
+    }
+
+    @JsonProperty("sourceType")
+    public String getSourceType()
+    {
+        return sourceType;
+    }
+
+    @JsonProperty("capabilities")
+    public int getCapabilities()
+    {
+        return capabilities;
+    }
+
+    @Override
+    public void close()
+            throws Exception
+    {
+        //no-op
+    }
+
+    @Override
+    public String toString()
+    {
+        return "PingResponse{" +
+                "catalogName='" + catalogName + '\'' +
+                ", queryId='" + queryId + '\'' +
+                ", sourceType='" + sourceType + '\'' +
+                ", capabilities='" + capabilities + '\'' +
+                '}';
+    }
+}
diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/AesGcmBlockCrypto.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/AesGcmBlockCrypto.java
new file mode 100644
index 0000000000..a0912dc5ae
--- /dev/null
+++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/AesGcmBlockCrypto.java
@@ -0,0 +1,131 @@
+package com.amazonaws.athena.connector.lambda.security;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
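+ *
+ * A round-trip sketch (illustrative; the allocator and block are hypothetical):
+ *
+ *   BlockCrypto crypto = new AesGcmBlockCrypto(allocator);
+ *   EncryptionKey key = new LocalKeyFactory().create();
+ *   byte[] ciphertext = crypto.encrypt(key, block);
+ *   Block decrypted = crypto.decrypt(key, ciphertext, block.getSchema());
+ *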
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.RecordBatchSerDe; +import org.apache.arrow.vector.types.pojo.Schema; +import org.bouncycastle.jce.provider.BouncyCastleProvider; + +import javax.crypto.BadPaddingException; +import javax.crypto.Cipher; +import javax.crypto.IllegalBlockSizeException; +import javax.crypto.NoSuchPaddingException; +import javax.crypto.spec.GCMParameterSpec; +import javax.crypto.spec.SecretKeySpec; + +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.security.InvalidAlgorithmParameterException; +import java.security.InvalidKeyException; +import java.security.NoSuchAlgorithmException; +import java.security.NoSuchProviderException; +import java.security.Security; + +public class AesGcmBlockCrypto + implements BlockCrypto +{ + protected static final int GCM_TAG_LENGTH_BITS = 16 * 8; + protected static final int NONCE_BYTES = 12; + protected static final int KEY_BYTES = 16; + protected static final String KEYSPEC = "AES"; + protected static final String ALGO = "AES/GCM/NoPadding"; + protected static final String ALGO_BC = "BC"; + + private final RecordBatchSerDe serDe; + private final BlockAllocator allocator; + + static { + Security.addProvider(new BouncyCastleProvider()); + } + + public AesGcmBlockCrypto(BlockAllocator allocator) + { + this.serDe = new RecordBatchSerDe(allocator); + this.allocator = allocator; + } + + public byte[] encrypt(EncryptionKey key, Block block) + { + try { + ByteArrayOutputStream out = new ByteArrayOutputStream(); + serDe.serialize(block.getRecordBatch(), out); + + Cipher cipher = makeCipher(Cipher.ENCRYPT_MODE, key); + return cipher.doFinal(out.toByteArray()); + } + catch (BadPaddingException | IllegalBlockSizeException | IOException ex) { + throw new RuntimeException(ex); + } + } + + public Block decrypt(EncryptionKey key, byte[] bytes, Schema schema) + { + try { + Cipher cipher = makeCipher(Cipher.DECRYPT_MODE, key); + byte[] clear = cipher.doFinal(bytes); + + Block resultBlock = allocator.createBlock(schema); + resultBlock.loadRecordBatch(serDe.deserialize(clear)); + + return resultBlock; + } + catch (BadPaddingException | IllegalBlockSizeException | IOException ex) { + throw new RuntimeException(ex); + } + } + + public byte[] decrypt(EncryptionKey key, byte[] bytes) + { + try { + Cipher cipher = makeCipher(Cipher.DECRYPT_MODE, key); + return cipher.doFinal(bytes); + } + catch (BadPaddingException | IllegalBlockSizeException ex) { + throw new RuntimeException(ex); + } + } + + private Cipher makeCipher(int mode, EncryptionKey key) + { + if (key.getNonce().length != NONCE_BYTES) { + throw new RuntimeException("Expected " + NONCE_BYTES + " nonce bytes but found " + key.getNonce().length); + } + + if (key.getKey().length != KEY_BYTES) { + throw new RuntimeException("Expected " + KEY_BYTES + " key bytes but found " + key.getKey().length); + } + + GCMParameterSpec spec = new GCMParameterSpec(GCM_TAG_LENGTH_BITS, key.getNonce()); + SecretKeySpec secretKeySpec = new SecretKeySpec(key.getKey(), KEYSPEC); + + try { + Cipher cipher = Cipher.getInstance(ALGO, ALGO_BC); + cipher.init(mode, secretKeySpec, spec); + return cipher; + } + catch (NoSuchAlgorithmException | InvalidKeyException | InvalidAlgorithmParameterException + | NoSuchProviderException | NoSuchPaddingException ex) { + throw new RuntimeException(ex); + } + } +} diff --git 
a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/BlockCrypto.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/BlockCrypto.java new file mode 100644 index 0000000000..73bd56d1a3 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/BlockCrypto.java @@ -0,0 +1,33 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import org.apache.arrow.vector.types.pojo.Schema; + +public interface BlockCrypto +{ + byte[] encrypt(EncryptionKey key, Block block); + + Block decrypt(EncryptionKey key, byte[] bytes, Schema schema); + + byte[] decrypt(EncryptionKey key, byte[] bytes); +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/CachableSecretsManager.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/CachableSecretsManager.java new file mode 100644 index 0000000000..2340e05f41 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/CachableSecretsManager.java @@ -0,0 +1,152 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
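+ *
+ * A usage sketch (illustrative; the secret name is hypothetical): placeholders of
+ * the form ${secret-name} are swapped for values fetched from AWS Secrets Manager
+ * and cached for up to a minute:
+ *
+ *   CachableSecretsManager secrets =
+ *           new CachableSecretsManager(AWSSecretsManagerClientBuilder.defaultClient());
+ *   String conn = secrets.resolveSecrets("mysql://host/db?password=${my-db-password}");
+ *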
+ * #L%
+ */
+
+import com.amazonaws.services.secretsmanager.AWSSecretsManager;
+import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest;
+import com.amazonaws.services.secretsmanager.model.GetSecretValueResult;
+import org.apache.arrow.util.VisibleForTesting;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.Iterator;
+import java.util.LinkedHashMap;
+import java.util.Map;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+public class CachableSecretsManager
+{
+    private static final Logger logger = LoggerFactory.getLogger(CachableSecretsManager.class);
+
+    private static final long MAX_CACHE_AGE_MS = 60_000;
+    protected static final int MAX_CACHE_SIZE = 10;
+
+    private static final String SECRET_PATTERN = "(\\$\\{[a-zA-Z0-9-_\\-]+\\})";
+    private static final String SECRET_NAME_PATTERN = "\\$\\{([a-zA-Z0-9-_\\-]+)\\}";
+    private static final Pattern PATTERN = Pattern.compile(SECRET_PATTERN);
+    private static final Pattern NAME_PATTERN = Pattern.compile(SECRET_NAME_PATTERN);
+
+    private final LinkedHashMap<String, CacheEntry> cache = new LinkedHashMap<>();
+    private final AWSSecretsManager secretsManager;
+
+    public CachableSecretsManager(AWSSecretsManager secretsManager)
+    {
+        this.secretsManager = secretsManager;
+    }
+
+    /**
+     * Resolves any secrets found in the supplied string. For example, MyString${WithSecret}
+     * would have ${WithSecret} replaced by the corresponding value of the secret in AWS
+     * Secrets Manager with that name. If no such secret is found, the function throws.
+     */
+    public String resolveSecrets(String rawString)
+    {
+        if (rawString == null) {
+            return rawString;
+        }
+
+        Matcher m = PATTERN.matcher(rawString);
+        String result = rawString;
+        while (m.find()) {
+            String nextSecret = m.group(1);
+            Matcher m1 = NAME_PATTERN.matcher(nextSecret);
+            m1.find();
+            result = result.replace(nextSecret, getSecret(m1.group(1)));
+        }
+        return result;
+    }
+
+    public String getSecret(String secretName)
+    {
+        CacheEntry cacheEntry = cache.get(secretName);
+
+        if (cacheEntry == null || cacheEntry.getAge() > MAX_CACHE_AGE_MS) {
+            logger.info("getSecret: Resolving secret[{}].", secretName);
+            GetSecretValueResult secretValueResult = secretsManager.getSecretValue(new GetSecretValueRequest()
+                    .withSecretId(secretName));
+            cacheEntry = new CacheEntry(secretName, secretValueResult.getSecretString());
+            evictCache(cache.size() >= MAX_CACHE_SIZE);
+            cache.put(secretName, cacheEntry);
+        }
+
+        return cacheEntry.getValue();
+    }
+
+    private void evictCache(boolean force)
+    {
+        Iterator<Map.Entry<String, CacheEntry>> itr = cache.entrySet().iterator();
+        int removed = 0;
+        while (itr.hasNext()) {
+            CacheEntry entry = itr.next().getValue();
+            if (entry.getAge() > MAX_CACHE_AGE_MS) {
+                itr.remove();
+                removed++;
+            }
+        }
+
+        if (removed == 0 && force) {
+            //Remove the oldest since we found no expired entries
+            itr = cache.entrySet().iterator();
+            if (itr.hasNext()) {
+                itr.next();
+                itr.remove();
+            }
+        }
+    }
+
+    @VisibleForTesting
+    protected void addCacheEntry(String name, String value, long createTime)
+    {
+        cache.put(name, new CacheEntry(name, value, createTime));
+    }
+
+    private class CacheEntry
+    {
+        private final String name;
+        private final String value;
+        private final long createTime;
+
+        public CacheEntry(String name, String value)
+        {
+            this.value = value;
+            this.name = name;
+            this.createTime = System.currentTimeMillis();
+        }
+
+        public CacheEntry(String name, String value, long createTime)
+        {
+            this.value = value;
+            this.name = name;
+            this.createTime = createTime;
+        }
+
+        public
String getValue() + { + return value; + } + + public long getAge() + { + return System.currentTimeMillis() - createTime; + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/EncryptionKey.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/EncryptionKey.java new file mode 100644 index 0000000000..428eb3860f --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/EncryptionKey.java @@ -0,0 +1,73 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import java.util.Arrays; + +public class EncryptionKey +{ + private final byte[] key; + private final byte[] nonce; + + @JsonCreator + public EncryptionKey(@JsonProperty("key") byte[] key, @JsonProperty("nonce") byte[] nonce) + { + this.key = key; + this.nonce = nonce; + } + + @JsonProperty + public byte[] getKey() + { + return key; + } + + @JsonProperty + public byte[] getNonce() + { + return nonce; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + + EncryptionKey that = (EncryptionKey) o; + + return Arrays.equals(this.key, that.key) && + Arrays.equals(this.nonce, that.nonce); + } + + @Override + public int hashCode() + { + return Arrays.hashCode(key) + 31 + Arrays.hashCode(nonce); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/EncryptionKeyFactory.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/EncryptionKeyFactory.java new file mode 100644 index 0000000000..3bc6a8548c --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/EncryptionKeyFactory.java @@ -0,0 +1,29 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +public interface EncryptionKeyFactory +{ + /** + * @return A key that satisfies the specification defined in BlockCrypto + */ + EncryptionKey create(); +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/FederatedIdentity.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/FederatedIdentity.java new file mode 100644 index 0000000000..68223c5b26 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/FederatedIdentity.java @@ -0,0 +1,59 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +public class FederatedIdentity +{ + public final String id; + public final String principal; + public final String account; + + @JsonCreator + public FederatedIdentity(@JsonProperty("id") String id, + @JsonProperty("principal") String principal, + @JsonProperty("account") String account) + { + this.id = id; + this.principal = principal; + this.account = account; + } + + @JsonProperty("id") + public String getId() + { + return id; + } + + @JsonProperty("principal") + public String getPrincipal() + { + return principal; + } + + @JsonProperty("account") + public String getAccount() + { + return account; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/KmsKeyFactory.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/KmsKeyFactory.java new file mode 100644 index 0000000000..94456361d8 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/KmsKeyFactory.java @@ -0,0 +1,56 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
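+ *
+ * A short sketch (illustrative; the KMS key id is hypothetical): the factory asks
+ * KMS for a fresh AES-128 data key plus a random nonce, sized to match
+ * AesGcmBlockCrypto:
+ *
+ *   EncryptionKeyFactory factory =
+ *           new KmsKeyFactory(AWSKMSClientBuilder.defaultClient(), "alias/my-master-key");
+ *   EncryptionKey key = factory.create();
+ *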
+ * #L% + */ + +import com.amazonaws.services.kms.AWSKMS; +import com.amazonaws.services.kms.model.DataKeySpec; +import com.amazonaws.services.kms.model.GenerateDataKeyRequest; +import com.amazonaws.services.kms.model.GenerateDataKeyResult; +import com.amazonaws.services.kms.model.GenerateRandomRequest; +import com.amazonaws.services.kms.model.GenerateRandomResult; + +public class KmsKeyFactory + implements EncryptionKeyFactory +{ + private final AWSKMS kmsClient; + private final String masterKeyId; + + public KmsKeyFactory(AWSKMS kmsClient, String masterKeyId) + { + this.kmsClient = kmsClient; + this.masterKeyId = masterKeyId; + } + + public EncryptionKey create() + { + GenerateDataKeyResult dataKeyResult = + kmsClient.generateDataKey( + new GenerateDataKeyRequest() + .withKeyId(masterKeyId) + .withKeySpec(DataKeySpec.AES_128)); + + GenerateRandomRequest randomRequest = new GenerateRandomRequest() + .withNumberOfBytes(AesGcmBlockCrypto.NONCE_BYTES); + GenerateRandomResult randomResult = kmsClient.generateRandom(randomRequest); + + return new EncryptionKey(dataKeyResult.getPlaintext().array(), randomResult.getPlaintext().array()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/LocalKeyFactory.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/LocalKeyFactory.java new file mode 100644 index 0000000000..5388e33936 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/LocalKeyFactory.java @@ -0,0 +1,47 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
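+ *
+ * A short sketch (illustrative): unlike KmsKeyFactory, this factory generates the
+ * AES key and GCM nonce locally from a strong SecureRandom, so it makes no AWS
+ * calls:
+ *
+ *   EncryptionKey key = new LocalKeyFactory().create();
+ *   // key.getKey() is 16 bytes (AES-128); key.getNonce() is 12 bytes
+ *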
+ * #L% + */ + +import javax.crypto.KeyGenerator; +import javax.crypto.SecretKey; + +import java.security.NoSuchAlgorithmException; +import java.security.SecureRandom; + +public class LocalKeyFactory + implements EncryptionKeyFactory +{ + public EncryptionKey create() + { + try { + SecureRandom random = SecureRandom.getInstanceStrong(); + KeyGenerator keyGen = KeyGenerator.getInstance(AesGcmBlockCrypto.KEYSPEC); + keyGen.init(AesGcmBlockCrypto.KEY_BYTES * 8, random); + SecretKey key = keyGen.generateKey(); + final byte[] nonce = new byte[AesGcmBlockCrypto.NONCE_BYTES]; + random.nextBytes(nonce); + return new EncryptionKey(key.getEncoded(), nonce); + } + catch (NoSuchAlgorithmException ex) { + throw new RuntimeException(ex); + } + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/NoOpBlockCrypto.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/NoOpBlockCrypto.java new file mode 100644 index 0000000000..15ff7de0c8 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/security/NoOpBlockCrypto.java @@ -0,0 +1,74 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.RecordBatchSerDe; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +public class NoOpBlockCrypto + implements BlockCrypto +{ + private final RecordBatchSerDe serDe; + private final BlockAllocator allocator; + + public NoOpBlockCrypto(BlockAllocator allocator) + { + this.serDe = new RecordBatchSerDe(allocator); + this.allocator = allocator; + } + + public byte[] encrypt(EncryptionKey key, Block block) + { + if (key != null) { + throw new RuntimeException("Real key provided to NoOpBlockCrypto, likely indicates you wanted real crypto."); + } + try { + ByteArrayOutputStream out = new ByteArrayOutputStream(); + serDe.serialize(block.getRecordBatch(), out); + return out.toByteArray(); + } + catch (IOException ex) { + throw new RuntimeException(ex); + } + } + + public Block decrypt(EncryptionKey key, byte[] bytes, Schema schema) + { + try { + Block resultBlock = allocator.createBlock(schema); + resultBlock.loadRecordBatch(serDe.deserialize(bytes)); + return resultBlock; + } + catch (IOException ex) { + throw new RuntimeException(ex); + } + } + + public byte[] decrypt(EncryptionKey key, byte[] bytes) + { + return bytes; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/BlockDeserializer.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/BlockDeserializer.java new file mode 100644 index 0000000000..fbe9551743 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/BlockDeserializer.java @@ -0,0 +1,110 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorRegistry; +import com.amazonaws.athena.connector.lambda.data.RecordBatchSerDe; +import com.amazonaws.athena.connector.lambda.data.SchemaSerDe; +import com.fasterxml.jackson.core.JsonParser; +import com.fasterxml.jackson.databind.DeserializationContext; +import com.fasterxml.jackson.databind.JsonNode; +import com.fasterxml.jackson.databind.deser.std.StdDeserializer; +import org.apache.arrow.vector.ipc.message.ArrowRecordBatch; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.io.ByteArrayInputStream; +import java.io.IOException; + +/** + * Uses either an explicit BlockAllocator or a BlockAllocatorRegistry to handle memory pooling associated with + * deserializing blocks. 
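+ * + * <p>A minimal usage sketch (hedged: {@code serializedBlockJson} is a hypothetical byte[] assumed to have been + * written by {@link BlockSerializer} via an ObjectMapper from {@link ObjectMapperFactory}):</p> + * <pre>{@code + * BlockAllocator allocator = new BlockAllocatorImpl(); + * ObjectMapper mapper = ObjectMapperFactory.create(allocator); + * Block block = mapper.readValue(serializedBlockJson, Block.class); + * }</pre>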
+ */ +public class BlockDeserializer + extends StdDeserializer<Block> +{ + private final BlockAllocatorRegistry allocatorRegistry; + private final BlockAllocator allocator; + private final SchemaSerDe schemaSerDe; + private final RecordBatchSerDe recordBatchSerDe; + + public BlockDeserializer(BlockAllocator allocator) + { + super(Block.class); + this.schemaSerDe = new SchemaSerDe(); + this.recordBatchSerDe = new RecordBatchSerDe(allocator); + this.allocator = allocator; + this.allocatorRegistry = null; + } + + public BlockDeserializer(BlockAllocatorRegistry allocatorRegistry) + { + super(Block.class); + this.schemaSerDe = new SchemaSerDe(); + this.allocator = null; + this.recordBatchSerDe = null; + this.allocatorRegistry = allocatorRegistry; + } + + @Override + public Block deserialize(JsonParser jsonParser, DeserializationContext deserializationContext) + throws IOException + { + JsonNode node = jsonParser.getCodec().readTree(jsonParser); + String allocatorId = node.get(BlockSerializer.ALLOCATOR_ID_FIELD_NAME).asText(); + byte[] schemaBytes = node.get(BlockSerializer.SCHEMA_FIELD_NAME).binaryValue(); + byte[] batchBytes = node.get(BlockSerializer.BATCH_FIELD_NAME).binaryValue(); + + Schema schema = schemaSerDe.deserialize(new ByteArrayInputStream(schemaBytes)); + Block block = getOrCreateAllocator(allocatorId).createBlock(schema); + + if (batchBytes.length > 0) { + ArrowRecordBatch batch = deserializeBatch(allocatorId, batchBytes); + block.loadRecordBatch(batch); + } + return block; + } + + private ArrowRecordBatch deserializeBatch(String allocatorId, byte[] batchBytes) + throws IOException + { + return getOrCreateBatchSerde(allocatorId).deserialize(batchBytes); + } + + private RecordBatchSerDe getOrCreateBatchSerde(String allocatorId) + { + if (recordBatchSerDe != null) { + return recordBatchSerDe; + } + + return new RecordBatchSerDe(getOrCreateAllocator(allocatorId)); + } + + private BlockAllocator getOrCreateAllocator(String allocatorId) + { + if (allocator != null) { + return allocator; + } + + return allocatorRegistry.getOrCreateAllocator(allocatorId); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/BlockSerializer.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/BlockSerializer.java new file mode 100644 index 0000000000..b10b18c512 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/BlockSerializer.java @@ -0,0 +1,70 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.RecordBatchSerDe; +import com.amazonaws.athena.connector.lambda.data.SchemaSerDe; +import com.fasterxml.jackson.core.JsonGenerator; +import com.fasterxml.jackson.databind.SerializerProvider; +import com.fasterxml.jackson.databind.ser.std.StdSerializer; + +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +/** + * Jackson serializer that writes a Block as its allocator id plus the Arrow schema and record batch, each + * serialized to a binary field. + */ +public class BlockSerializer + extends StdSerializer<Block> +{ + protected static final String ALLOCATOR_ID_FIELD_NAME = "aId"; + protected static final String SCHEMA_FIELD_NAME = "schema"; + protected static final String BATCH_FIELD_NAME = "records"; + private final SchemaSerDe schemaSerDe; + private final RecordBatchSerDe recordBatchSerDe; + + public BlockSerializer() + { + super(Block.class); + this.schemaSerDe = new SchemaSerDe(); + //the allocator is only needed on the deserialization path, so the serde can be constructed without one here + this.recordBatchSerDe = new RecordBatchSerDe(null); + } + + @Override + public void serialize(Block block, JsonGenerator jsonGenerator, SerializerProvider serializerProvider) + throws IOException + { + jsonGenerator.writeStartObject(); + + jsonGenerator.writeStringField(ALLOCATOR_ID_FIELD_NAME, block.getAllocatorId()); + + ByteArrayOutputStream schemaOut = new ByteArrayOutputStream(); + schemaSerDe.serialize(block.getSchema(), schemaOut); + jsonGenerator.writeBinaryField(SCHEMA_FIELD_NAME, schemaOut.toByteArray()); + schemaOut.close(); + + ByteArrayOutputStream batchOut = new ByteArrayOutputStream(); + if (block.getRowCount() > 0) { + recordBatchSerDe.serialize(block.getRecordBatch(), batchOut); + } + jsonGenerator.writeBinaryField(BATCH_FIELD_NAME, batchOut.toByteArray()); + + jsonGenerator.writeEndObject(); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/ObjectMapperFactory.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/ObjectMapperFactory.java new file mode 100644 index 0000000000..f6b717bef2 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/ObjectMapperFactory.java @@ -0,0 +1,51 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.fasterxml.jackson.databind.DeserializationFeature; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.fasterxml.jackson.databind.module.SimpleModule; +import org.apache.arrow.vector.types.pojo.Schema; + +public class ObjectMapperFactory +{ + private ObjectMapperFactory() + { + } + + public static ObjectMapper create(BlockAllocator allocator) + { + ObjectMapper objectMapper = new ObjectMapper(); + SimpleModule module = new SimpleModule(); + module.addSerializer(Schema.class, new SchemaSerializer()); + module.addDeserializer(Schema.class, new SchemaDeserializer()); + module.addDeserializer(Block.class, new BlockDeserializer(allocator)); + module.addSerializer(Block.class, new BlockSerializer()); + + //todo provide a block serializer instead of batch serializer but only serialize the batch not the schema. + objectMapper.registerModule(module) + .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES) + .enable(DeserializationFeature.ACCEPT_SINGLE_VALUE_AS_ARRAY); + return objectMapper; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/SchemaDeserializer.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/SchemaDeserializer.java new file mode 100644 index 0000000000..e45f6d3255 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/SchemaDeserializer.java @@ -0,0 +1,52 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.SchemaSerDe; +import com.fasterxml.jackson.core.JsonParser; +import com.fasterxml.jackson.core.JsonProcessingException; +import com.fasterxml.jackson.databind.DeserializationContext; +import com.fasterxml.jackson.databind.JsonNode; +import com.fasterxml.jackson.databind.deser.std.StdDeserializer; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.io.ByteArrayInputStream; +import java.io.IOException; + +/** + * Jackson deserializer that reads an Arrow Schema back out of the binary field written by SchemaSerializer. + */ +public class SchemaDeserializer + extends StdDeserializer<Schema> +{ + private final SchemaSerDe serDe = new SchemaSerDe(); + + public SchemaDeserializer() + { + super(Schema.class); + } + + @Override + public Schema deserialize(JsonParser jsonParser, DeserializationContext deserializationContext) + throws IOException, JsonProcessingException + { + JsonNode node = jsonParser.getCodec().readTree(jsonParser); + byte[] schemaBytes = node.get(SchemaSerializer.SCHEMA_FIELD_NAME).binaryValue(); + return serDe.deserialize(new ByteArrayInputStream(schemaBytes)); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/SchemaSerializer.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/SchemaSerializer.java new file mode 100644 index 0000000000..6ce1d43862 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/serde/SchemaSerializer.java @@ -0,0 +1,53 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.SchemaSerDe; +import com.fasterxml.jackson.core.JsonGenerator; +import com.fasterxml.jackson.databind.SerializerProvider; +import com.fasterxml.jackson.databind.ser.std.StdSerializer; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +/** + * Jackson serializer that writes an Arrow Schema into a single binary field. + */ +public class SchemaSerializer + extends StdSerializer<Schema> +{ + public static final String SCHEMA_FIELD_NAME = "schema"; + private final SchemaSerDe serDe = new SchemaSerDe(); + + public SchemaSerializer() + { + super(Schema.class); + } + + @Override + public void serialize(Schema schema, JsonGenerator jsonGenerator, SerializerProvider serializerProvider) + throws IOException + { + jsonGenerator.writeStartObject(); + ByteArrayOutputStream out = new ByteArrayOutputStream(); + serDe.serialize(schema, out); + jsonGenerator.writeBinaryField(SCHEMA_FIELD_NAME, out.toByteArray()); + jsonGenerator.writeEndObject(); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionHandler.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionHandler.java new file mode 100644 index 0000000000..a9ba0b1011 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionHandler.java @@ -0,0 +1,227 @@ +package com.amazonaws.athena.connector.lambda.udf; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.projectors.ArrowValueProjector; +import com.amazonaws.athena.connector.lambda.data.projectors.ProjectorUtils; +import com.amazonaws.athena.connector.lambda.data.writers.ArrowValueWriter; +import com.amazonaws.athena.connector.lambda.data.writers.WriterUtils; +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory; +import com.amazonaws.services.lambda.runtime.Context; +import com.amazonaws.services.lambda.runtime.RequestStreamHandler; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.Lists; +import org.apache.arrow.vector.FieldVector; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.InputStream; +import java.io.OutputStream; +import java.lang.reflect.InvocationTargetException; +import java.lang.reflect.Method; +import java.util.Arrays; +import java.util.List; +import java.util.stream.Collectors; + +import static com.google.common.base.Preconditions.checkState; + +/** + * Athena UDF users are expected to extend this class to create UDFs. + */ +public abstract class UserDefinedFunctionHandler implements RequestStreamHandler +{ + private static final Logger logger = LoggerFactory.getLogger(UserDefinedFunctionHandler.class); + + private static final int RETURN_COLUMN_COUNT = 1; + + @Override + public final void handleRequest(InputStream inputStream, OutputStream outputStream, Context context) + { + try (BlockAllocator allocator = new BlockAllocatorImpl()) { + ObjectMapper objectMapper = ObjectMapperFactory.create(allocator); + try (FederationRequest rawRequest = objectMapper.readValue(inputStream, FederationRequest.class)) { + if (!(rawRequest instanceof UserDefinedFunctionRequest)) { + throw new RuntimeException("Expected a UserDefinedFunctionRequest but found " + + rawRequest.getClass()); + } + + UserDefinedFunctionRequest udfRequest = (UserDefinedFunctionRequest) rawRequest; + try (UserDefinedFunctionResponse udfResponse = processFunction(allocator, udfRequest)) { + objectMapper.writeValue(outputStream, udfResponse); + } + } + catch (Exception ex) { + throw (ex instanceof RuntimeException) ? 
(RuntimeException) ex : new RuntimeException(ex); + } + } + } + + @VisibleForTesting + UserDefinedFunctionResponse processFunction(BlockAllocator allocator, UserDefinedFunctionRequest req) + { + UserDefinedFunctionType functionType = req.getFunctionType(); + switch (functionType) { + case SCALAR: + return processScalarFunction(allocator, req); + default: + throw new UnsupportedOperationException("Unsupported function type " + functionType); + } + } + + private UserDefinedFunctionResponse processScalarFunction(BlockAllocator allocator, UserDefinedFunctionRequest req) + { + Method udfMethod = extractScalarFunctionMethod(req); + Block inputRecords = req.getInputRecords(); + Schema outputSchema = req.getOutputSchema(); + + Block outputRecords = processRows(allocator, udfMethod, inputRecords, outputSchema); + return new UserDefinedFunctionResponse(outputRecords, udfMethod.getName()); + } + + /** + * Processes a group of rows: this method takes in a block of data (containing multiple rows), processes the rows, + * and returns multiple rows of the output column in a block. + * + * UDF methods are invoked row-by-row in a for loop. Arrow values are converted to Java Objects and then passed into + * the UDF Java method. This is not very efficient because we might potentially be doing a lot of data copying. + * Advanced users could choose to override this method and directly deal with Arrow data to achieve better + * performance. + * + * @param allocator arrow memory allocator + * @param udfMethod the extracted java method matching the User-Defined-Function defined in Athena. + * @param inputRecords input data in Arrow format + * @param outputSchema output data schema in Arrow format + * @return output data in Arrow format + */ + protected Block processRows(BlockAllocator allocator, Method udfMethod, Block inputRecords, Schema outputSchema) + { + int rowCount = inputRecords.getRowCount(); + + List<ArrowValueProjector> valueProjectors = Lists.newArrayList(); + + for (Field field : inputRecords.getFields()) { + FieldReader fieldReader = inputRecords.getFieldReader(field.getName()); + ArrowValueProjector arrowValueProjector = ProjectorUtils.createArrowValueProjector(fieldReader); + valueProjectors.add(arrowValueProjector); + } + + Block outputRecords = allocator.createBlock(outputSchema); + outputRecords.setRowCount(rowCount); + + try { + String outputFieldName = outputSchema.getFields().get(0).getName(); + FieldVector outputVector = outputRecords.getFieldVector(outputFieldName); + ArrowValueWriter outputProjector = WriterUtils.createArrowValueWriter(outputVector); + Object[] arguments = new Object[valueProjectors.size()]; + + for (int rowNum = 0; rowNum < rowCount; ++rowNum) { + for (int col = 0; col < valueProjectors.size(); ++col) { + arguments[col] = valueProjectors.get(col).project(rowNum); + } + + try { + Object result = udfMethod.invoke(this, arguments); + outputProjector.write(rowNum, result); + } + catch (IllegalAccessException | InvocationTargetException e) { + throw new RuntimeException(e); + } + catch (IllegalArgumentException e) { + String msg = String.format("%s.
Expected function types %s, got types %s", + e.getMessage(), + Arrays.stream(udfMethod.getParameterTypes()).map(clazz -> clazz.getName()).collect(Collectors.toList()), + Arrays.stream(arguments).map(arg -> arg.getClass().getName()).collect(Collectors.toList())); + throw new RuntimeException(msg, e); + } + } + } + catch (Throwable t) { + try { + outputRecords.close(); + } + catch (Exception e) { + logger.error("Error closing output block", e); + } + throw t; + } + + return outputRecords; + } + + /** + * Use reflection to find the Java method that matches the UDF function defined in Athena SQL. + * @param req UDF request + * @return Java method matching the UDF defined in the Athena query. + */ + private Method extractScalarFunctionMethod(UserDefinedFunctionRequest req) + { + String methodName = req.getMethodName(); + Class[] argumentTypes = extractJavaTypes(req.getInputRecords().getSchema()); + Class[] returnTypes = extractJavaTypes(req.getOutputSchema()); + checkState(returnTypes.length == RETURN_COLUMN_COUNT, + String.format("Expecting %d return columns, found %d in method signature.", + RETURN_COLUMN_COUNT, returnTypes.length)); + Class returnType = returnTypes[0]; + + Method udfMethod; + try { + udfMethod = this.getClass().getMethod(methodName, argumentTypes); + logger.info(String.format("Found UDF method %s with input types [%s] and output types [%s]", + methodName, Arrays.toString(argumentTypes), returnType.getName())); + } + catch (NoSuchMethodException e) { + String msg = "Failed to find UDF method. " + e.getMessage() + + " Please make sure the method name contains only lowercase characters and the method signature (name and" + + " argument types) in Lambda matches the function signature defined in SQL."; + throw new RuntimeException(msg, e); + } + + if (!returnType.equals(udfMethod.getReturnType())) { + throw new IllegalArgumentException("signature return type " + returnType + + " does not match udf implementation return type " + udfMethod.getReturnType()); + } + + return udfMethod; + } + + private Class[] extractJavaTypes(Schema schema) + { + Class[] types = new Class[schema.getFields().size()]; + + List<Field> fields = schema.getFields(); + for (int i = 0; i < fields.size(); ++i) { + Types.MinorType minorType = Types.getMinorTypeForArrowType(fields.get(i).getType()); + types[i] = BlockUtils.getJavaType(minorType); + } + + return types; + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionRequest.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionRequest.java new file mode 100644 index 0000000000..64951c0535 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionRequest.java @@ -0,0 +1,105 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connector.lambda.udf; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.arrow.vector.types.pojo.Schema; + +import java.util.Objects; + +import static java.util.Objects.requireNonNull; + +public class UserDefinedFunctionRequest extends FederationRequest +{ + private final Block inputRecords; + private final Schema outputSchema; + private final String methodName; + private final UserDefinedFunctionType functionType; + + @JsonCreator + public UserDefinedFunctionRequest(@JsonProperty("identity") FederatedIdentity identity, + @JsonProperty("inputRecords") Block inputRecords, + @JsonProperty("outputSchema") Schema outputSchema, + @JsonProperty("methodName") String methodName, + @JsonProperty("functionType") UserDefinedFunctionType functionType) + { + super(identity); + this.inputRecords = requireNonNull(inputRecords, "inputRecords is null"); + this.outputSchema = requireNonNull(outputSchema, "outputSchema is null"); + this.methodName = requireNonNull(methodName, "methodName is null"); + this.functionType = requireNonNull(functionType, "functionType is null"); + } + + @Override + public void close() throws Exception + { + inputRecords.close(); + } + + @JsonProperty("inputRecords") + public Block getInputRecords() + { + return inputRecords; + } + + @JsonProperty("outputSchema") + public Schema getOutputSchema() + { + return outputSchema; + } + + @JsonProperty("methodName") + public String getMethodName() + { + return methodName; + } + + @JsonProperty("functionType") + public UserDefinedFunctionType getFunctionType() + { + return functionType; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (!(o instanceof UserDefinedFunctionRequest)) { + return false; + } + UserDefinedFunctionRequest that = (UserDefinedFunctionRequest) o; + return getInputRecords().equals(that.getInputRecords()) && + getOutputSchema().equals(that.getOutputSchema()) && + getMethodName().equals(that.getMethodName()) && + getFunctionType() == that.getFunctionType(); + } + + @Override + public int hashCode() + { + return Objects.hash(getInputRecords(), getOutputSchema(), getMethodName(), getFunctionType()); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionResponse.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionResponse.java new file mode 100644 index 0000000000..466dad6033 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionResponse.java @@ -0,0 +1,59 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connector.lambda.udf; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.request.FederationResponse; +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; + +import static java.util.Objects.requireNonNull; + +public class UserDefinedFunctionResponse extends FederationResponse +{ + private final Block records; + private final String methodName; + + @JsonCreator + public UserDefinedFunctionResponse(@JsonProperty("records") Block records, + @JsonProperty("methodName") String methodName) + { + this.records = requireNonNull(records, "records is null"); + this.methodName = requireNonNull(methodName, "methodName is null"); + } + + @JsonProperty("records") + public Block getRecords() + { + return records; + } + + @JsonProperty("methodName") + public String getMethodName() + { + return methodName; + } + + @Override + public void close() throws Exception + { + records.close(); + } +} diff --git a/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionType.java b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionType.java new file mode 100644 index 0000000000..b0a108fac4 --- /dev/null +++ b/athena-federation-sdk/src/main/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionType.java @@ -0,0 +1,25 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connector.lambda.udf; + +public enum UserDefinedFunctionType +{ + SCALAR +} diff --git a/athena-federation-sdk/src/main/resources/log4j.properties b/athena-federation-sdk/src/main/resources/log4j.properties new file mode 100644 index 0000000000..15b502c445 --- /dev/null +++ b/athena-federation-sdk/src/main/resources/log4j.properties @@ -0,0 +1,26 @@ +### +# #%L +# Amazon Athena Query Federation SDK +# %% +# Copyright (C) 2019 Amazon Web Services +# %% +# Licensed under the Apache License, Version 2.0 (the "License"); +# you may not use this file except in compliance with the License. +# You may obtain a copy of the License at +# +# http://www.apache.org/licenses/LICENSE-2.0 +# +# Unless required by applicable law or agreed to in writing, software +# distributed under the License is distributed on an "AS IS" BASIS, +# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +# See the License for the specific language governing permissions and +# limitations under the License. +# #L% +### +log = . 
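+# Route everything logged at INFO and above through the LAMBDA appender configured below.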
+log4j.rootLogger = INFO, LAMBDA + +#Define the LAMBDA appender +log4j.appender.LAMBDA=com.amazonaws.services.lambda.runtime.log4j.LambdaAppender +log4j.appender.LAMBDA.layout=org.apache.log4j.PatternLayout +log4j.appender.LAMBDA.layout.conversionPattern=%d{yyyy-MM-dd HH:mm:ss} <%X{AWSRequestId}> %-5p %c{1}:%m%n diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/QueryStatusCheckerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/QueryStatusCheckerTest.java new file mode 100644 index 0000000000..599759b661 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/QueryStatusCheckerTest.java @@ -0,0 +1,119 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connector.lambda; + +import com.amazonaws.AmazonServiceException; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.model.GetQueryExecutionRequest; +import com.amazonaws.services.athena.model.GetQueryExecutionResult; +import com.amazonaws.services.athena.model.InvalidRequestException; +import com.amazonaws.services.athena.model.QueryExecution; +import com.amazonaws.services.athena.model.QueryExecutionStatus; +import com.google.common.collect.ImmutableList; +import org.junit.BeforeClass; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.runners.MockitoJUnitRunner; +import org.mockito.stubbing.OngoingStubbing; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.List; +import java.util.Random; + +import static com.amazonaws.athena.connector.lambda.handlers.AthenaExceptionFilter.ATHENA_EXCEPTION_FILTER; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class QueryStatusCheckerTest +{ + private final ThrottlingInvoker athenaInvoker = ThrottlingInvoker.newDefaultBuilder(ATHENA_EXCEPTION_FILTER).build(); + + @Mock + private AmazonAthena athena; + + @Test + public void testFastTermination() + throws InterruptedException + { + String queryId = "query0"; + GetQueryExecutionRequest request = new GetQueryExecutionRequest().withQueryExecutionId(queryId); + when(athena.getQueryExecution(request)).thenReturn(new GetQueryExecutionResult().withQueryExecution(new QueryExecution().withStatus(new QueryExecutionStatus().withState("FAILED")))); + QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, queryId); + assertTrue(queryStatusChecker.isQueryRunning()); + Thread.sleep(2000); + assertFalse(queryStatusChecker.isQueryRunning()); + verify(athena, times(1)).getQueryExecution(any()); + } + + @Test + public void testSlowTermination() + throws 
InterruptedException + { + String queryId = "query1"; + GetQueryExecutionRequest request = new GetQueryExecutionRequest().withQueryExecutionId(queryId); + GetQueryExecutionResult result1and2 = new GetQueryExecutionResult().withQueryExecution(new QueryExecution().withStatus(new QueryExecutionStatus().withState("RUNNING"))); + GetQueryExecutionResult result3 = new GetQueryExecutionResult().withQueryExecution(new QueryExecution().withStatus(new QueryExecutionStatus().withState("SUCCEEDED"))); + when(athena.getQueryExecution(request)).thenReturn(result1and2).thenReturn(result1and2).thenReturn(result3); + try (QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, queryId)) { + assertTrue(queryStatusChecker.isQueryRunning()); + Thread.sleep(2000); + assertTrue(queryStatusChecker.isQueryRunning()); + Thread.sleep(3000); + assertFalse(queryStatusChecker.isQueryRunning()); + verify(athena, times(3)).getQueryExecution(any()); + } + } + + @Test + public void testNotFound() + throws InterruptedException + { + String queryId = "query2"; + GetQueryExecutionRequest request = new GetQueryExecutionRequest().withQueryExecutionId(queryId); + when(athena.getQueryExecution(request)).thenThrow(new InvalidRequestException("")); + try (QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, queryId)) { + assertTrue(queryStatusChecker.isQueryRunning()); + Thread.sleep(2000); + assertTrue(queryStatusChecker.isQueryRunning()); + verify(athena, times(1)).getQueryExecution(any()); + } + } + + @Test + public void testOtherError() + throws InterruptedException + { + String queryId = "query3"; + GetQueryExecutionRequest request = new GetQueryExecutionRequest().withQueryExecutionId(queryId); + when(athena.getQueryExecution(request)).thenThrow(new AmazonServiceException("")); + try (QueryStatusChecker queryStatusChecker = new QueryStatusChecker(athena, athenaInvoker, queryId)) { + assertTrue(queryStatusChecker.isQueryRunning()); + Thread.sleep(3000); + assertTrue(queryStatusChecker.isQueryRunning()); + verify(athena, times(2)).getQueryExecution(any()); + } + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/ThrottlingInvokerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/ThrottlingInvokerTest.java new file mode 100644 index 0000000000..a15093c5d1 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/ThrottlingInvokerTest.java @@ -0,0 +1,122 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connector.lambda; + +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.exceptions.FederationThrottleException; +import org.junit.Test; + +import java.sql.Time; +import java.util.concurrent.TimeoutException; +import java.util.concurrent.atomic.AtomicLong; + +import static org.junit.Assert.*; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +public class ThrottlingInvokerTest +{ + + @Test + public void invokeNoThrottle() + throws TimeoutException + { + ThrottlingInvoker invoker = ThrottlingInvoker.newBuilder() + .withDecrease(0.5) + .withIncrease(10) + .withInitialDelayMs(10) + .withMaxDelayMs(2_000) + .withFilter((Exception ex) -> ex instanceof FederationThrottleException) + .build(); + + for (int i = 0; i < 100; i++) { + //Make a call and validate that the state didn't change + int result = invoker.invoke(() -> 1 + 1, 10_000); + assertEquals(2, result); + assertEquals(ThrottlingInvoker.State.FAST_START, invoker.getState()); + assertEquals(0, invoker.getDelay()); + } + } + + @Test + public void invokeWithThrottle() + throws TimeoutException + { + ThrottlingInvoker invoker = ThrottlingInvoker.newBuilder() + .withDecrease(0.8) + .withIncrease(1) + .withInitialDelayMs(10) + .withMaxDelayMs(200) + .withFilter((Exception ex) -> ex instanceof FederationThrottleException) + .build(); + + for (int i = 0; i < 5; i++) { + //Make a call and validate that the state didn't change + final AtomicLong count = new AtomicLong(0); + final int val = i; + long result = invoker.invoke(() -> { + if (count.incrementAndGet() < 4) { + throw new FederationThrottleException(); + } + return val; + } + , 10_000); + assertEquals(val, result); + assertEquals(4, count.get()); + assertEquals(ThrottlingInvoker.State.AVOIDANCE, invoker.getState()); + assertTrue(invoker.getDelay() > 0); + } + + assertEquals(199, invoker.getDelay()); + } + + @Test(expected = TimeoutException.class) + public void invokeWithThrottleTimeout() + throws TimeoutException + { + ThrottlingInvoker invoker = ThrottlingInvoker.newBuilder() + .withDecrease(0.5) + .withIncrease(10) + .withInitialDelayMs(10) + .withMaxDelayMs(500) + .withFilter((Exception ex) -> ex instanceof FederationThrottleException) + .build(); + + invoker.invoke(() -> {throw new FederationThrottleException();}, 2_000); + } + + @Test(expected = FederationThrottleException.class) + public void invokeWithThrottleNoSpill() + throws TimeoutException + { + BlockSpiller spiller = mock(BlockSpiller.class); + ThrottlingInvoker invoker = ThrottlingInvoker.newBuilder() + .withDecrease(0.5) + .withIncrease(10) + .withInitialDelayMs(10) + .withMaxDelayMs(500) + .withFilter((Exception ex) -> ex instanceof RuntimeException) + .withSpiller(spiller) + .build(); + + when(spiller.spilled()).thenReturn(false); + invoker.invoke(() -> {throw new RuntimeException();}, 2_000); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/BlockTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/BlockTest.java new file mode 100644 index 0000000000..a226f6d581 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/BlockTest.java @@ -0,0 +1,858 @@ +package com.amazonaws.athena.connector.lambda.data; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 
(the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.auth.AWSCredentialsProvider; +import com.amazonaws.auth.BasicAWSCredentials; +import com.amazonaws.auth.BasicSessionCredentials; +import com.amazonaws.internal.StaticCredentialsProvider; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import org.apache.arrow.vector.BigIntVector; +import org.apache.arrow.vector.BitVector; +import org.apache.arrow.vector.DateDayVector; +import org.apache.arrow.vector.DateMilliVector; +import org.apache.arrow.vector.DecimalVector; +import org.apache.arrow.vector.Float4Vector; +import org.apache.arrow.vector.Float8Vector; +import org.apache.arrow.vector.IntVector; +import org.apache.arrow.vector.SmallIntVector; +import org.apache.arrow.vector.TinyIntVector; +import org.apache.arrow.vector.UInt1Vector; +import org.apache.arrow.vector.UInt2Vector; +import org.apache.arrow.vector.UInt4Vector; +import org.apache.arrow.vector.UInt8Vector; +import org.apache.arrow.vector.ValueVector; +import org.apache.arrow.vector.VarBinaryVector; +import org.apache.arrow.vector.VarCharVector; +import org.apache.arrow.vector.complex.ListVector; +import org.apache.arrow.vector.complex.StructVector; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.complex.reader.Float8Reader; +import org.apache.arrow.vector.complex.reader.IntReader; +import org.apache.arrow.vector.complex.reader.VarCharReader; +import org.apache.arrow.vector.ipc.message.ArrowRecordBatch; +import org.apache.arrow.vector.types.DateUnit; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.arrow.vector.util.Text; +import org.apache.commons.codec.Charsets; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import javax.activation.UnsupportedDataTypeException; +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.math.BigDecimal; +import java.math.RoundingMode; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +import static org.junit.Assert.assertEquals; +import static 
org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; + +//TODO: Consider breaking up this test into 3 separate tests but the setup for the test would be error prone +// Also having them condensed like this gives a good example for how to use Apache Arrow. +public class BlockTest +{ + private static final Logger logger = LoggerFactory.getLogger(BlockTest.class); + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void constrainedBlockTest() + throws Exception + { + Schema schema = SchemaBuilder.newBuilder() + .addIntField("col1") + .addIntField("col2") + .build(); + + Block block = allocator.createBlock(schema); + + ValueSet col1Constraint = EquatableValueSet.newBuilder(allocator, Types.MinorType.INT.getType(), true, false) + .add(10).build(); + Constraints constraints = new Constraints(Collections.singletonMap("col1", col1Constraint)); + try (ConstraintEvaluator constraintEvaluator = new ConstraintEvaluator(allocator, schema, constraints)) { + block.constrain(constraintEvaluator); + assertTrue(block.setValue("col1", 0, 10)); + assertTrue(block.offerValue("col1", 0, 10)); + assertFalse(block.setValue("col1", 0, 11)); + assertFalse(block.offerValue("col1", 0, 11)); + assertTrue(block.offerValue("unknown_col", 0, 10)); + } + } + + //TODO: Break this into multiple smaller tests, probably primitive types vs. complex vs. nested complex + //TODO: List of Lists + //TODO: List of Structs + @Test + public void EndToEndBlockTest() + throws Exception + { + BlockAllocatorImpl expectedAllocator = new BlockAllocatorImpl(); + + int expectedRows = 20; + + Schema origSchema = generateTestSchema(); + Block expectedBlock = generateTestBlock(expectedAllocator, origSchema, expectedRows); + + RecordBatchSerDe expectSerDe = new RecordBatchSerDe(expectedAllocator); + ByteArrayOutputStream blockOut = new ByteArrayOutputStream(); + ArrowRecordBatch expectedBatch = expectedBlock.getRecordBatch(); + expectSerDe.serialize(expectedBatch, blockOut); + expectedBatch.close(); + expectedBlock.close(); + + ByteArrayOutputStream schemaOut = new ByteArrayOutputStream(); + SchemaSerDe schemaSerDe = new SchemaSerDe(); + schemaSerDe.serialize(origSchema, schemaOut); + Schema actualSchema = schemaSerDe.deserialize(new ByteArrayInputStream(schemaOut.toByteArray())); + + BlockAllocatorImpl actualAllocator = new BlockAllocatorImpl(); + RecordBatchSerDe actualSerDe = new RecordBatchSerDe(actualAllocator); + ArrowRecordBatch batch = actualSerDe.deserialize(blockOut.toByteArray()); + + /** + * Generate and write the block + */ + Block actualBlock = actualAllocator.createBlock(actualSchema); + actualBlock.loadRecordBatch(batch); + batch.close(); + + for (int i = 0; i < actualBlock.getRowCount(); i++) { + logger.info("EndToEndBlockTest: util {}", BlockUtils.rowToString(actualBlock, i)); + } + + assertEquals("Row count mismatch", expectedRows, actualBlock.getRowCount()); + int actualFieldCount = 1; + for (Field next : actualBlock.getFields()) { + FieldReader vector = actualBlock.getFieldReader(next.getName()); + switch (vector.getMinorType()) { + case INT: + IntReader intVector = vector; + for (int i = 0; i < actualBlock.getRowCount(); i++) { + intVector.setPosition(i); + assertEquals(i * actualFieldCount * 3, intVector.readInteger().intValue()); + } + break; + case FLOAT8: + Float8Reader fVector = vector; + for (int i = 0; i < actualBlock.getRowCount(); i++) {
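+ //Arrow field readers are cursor based: setPosition(i) moves the reader onto row i before each read below.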
fVector.setPosition(i); + assertEquals(i * actualFieldCount * 1.1, fVector.readDouble().doubleValue(), .1); + } + break; + case VARCHAR: + VarCharReader vVector = (VarCharReader) vector; + for (int i = 0; i < actualBlock.getRowCount(); i++) { + vVector.setPosition(i); + assertEquals(String.valueOf(i * actualFieldCount), vVector.readText().toString()); + } + break; + case STRUCT: + for (int i = 0; i < actualBlock.getRowCount(); i++) { + FieldReader effectiveVector = vector; + if (vector.getField().getName().equals("structFieldNested28")) { + //If this is our struct with a nested struct, then grab the nested struct and check it + effectiveVector = vector.reader("nestedStruct"); + } + effectiveVector.setPosition(i); + assertEquals(" name: " + effectiveVector.getField().getName(), Long.valueOf(i), effectiveVector.reader("nestedBigInt").readLong()); + assertEquals(String.valueOf(1000 + i), effectiveVector.reader("nestedString").readText().toString()); + } + break; + case LIST: + int actual = 0; + Field child = vector.getField().getChildren().get(0); + for (int i = 0; i < actualBlock.getRowCount(); i++) { + vector.setPosition(i); + int entryValues = 0; + while (vector.next()) { + if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.BIGINT) { + assertEquals(Long.valueOf(i + entryValues++), vector.reader().readLong()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.VARCHAR) { + assertEquals(String.valueOf(1000 + i + entryValues++), vector.reader().readText().toString()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.SMALLINT) { + entryValues++; + assertTrue((new Integer(i + 1)).shortValue() == vector.reader().readShort()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.INT) { + entryValues++; + assertTrue(i == vector.reader().readInteger()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.TINYINT) { + entryValues++; + assertTrue((byte) i == vector.reader().readByte()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.FLOAT4) { + entryValues++; + assertTrue(i * 1.0F == vector.reader().readFloat()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.FLOAT8) { + entryValues++; + assertTrue(i * 1.0D == vector.reader().readDouble()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.DECIMAL) { + entryValues++; + assertTrue(i * 100L == vector.reader().readBigDecimal().unscaledValue().longValue()); + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.VARBINARY) { + entryValues++; + //no comparing + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.BIT) { + entryValues++; + assertTrue((i % 2 == 1) == vector.reader().readBoolean()); + } + } + if (entryValues > 0) {actual++;} + } + + assertEquals("failed for " + vector.getField().getName(), actualBlock.getRowCount(), actual); + break; + default: + //todo: add more types here. 
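+ //Types without explicit assertions fall through here; the unset-row checks below still exercise them.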
+ } + actualFieldCount++; + } + + /** + * Now check that we can unset a row properly + */ + BlockUtils.unsetRow(0, actualBlock); + + for (Field next : actualBlock.getFields()) { + FieldReader vector = actualBlock.getFieldReader(next.getName()); + switch (vector.getMinorType()) { + case DATEDAY: + case DATEMILLI: + case TINYINT: + case UINT1: + case SMALLINT: + case UINT2: + case UINT4: + case INT: + case UINT8: + case BIGINT: + case FLOAT4: + case FLOAT8: + case DECIMAL: + case VARBINARY: + case VARCHAR: + case BIT: + case STRUCT: + vector.setPosition(0); + assertFalse("Failed for " + vector.getMinorType() + " " + next.getName(), vector.isSet()); + break; + case LIST: + //not supported by unsetRow(...); this is a TODO to see if it's possible some other way + break; + default: + throw new UnsupportedDataTypeException(next.getType().getTypeID() + " is not supported"); + } + actualFieldCount++; + } + + logger.info("EndToEndBlockTest: block size {}", actualAllocator.getUsage()); + actualBlock.close(); + } + + public static Schema generateTestSchema() { + /** + * Generate and write the schema + */ + SchemaBuilder schemaBuilder = new SchemaBuilder(); + schemaBuilder.addMetadata("meta1", "meta-value-1"); + schemaBuilder.addMetadata("meta2", "meta-value-2"); + schemaBuilder.addField("intfield1", new ArrowType.Int(32, true)); + schemaBuilder.addField("doublefield2", new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)); + schemaBuilder.addField("varcharfield3", new ArrowType.Utf8()); + + schemaBuilder.addField("datemillifield4", new ArrowType.Date(DateUnit.MILLISECOND)); + schemaBuilder.addField("tinyintfield5", new ArrowType.Int(8, true)); + schemaBuilder.addField("uint1field6", new ArrowType.Int(8, false)); + schemaBuilder.addField("smallintfield7", new ArrowType.Int(16, true)); + schemaBuilder.addField("uint2field8", new ArrowType.Int(16, false)); + schemaBuilder.addField("datedayfield9", new ArrowType.Date(DateUnit.DAY)); + schemaBuilder.addField("uint4field10", new ArrowType.Int(32, false)); + schemaBuilder.addField("bigintfield11", new ArrowType.Int(64, true)); + schemaBuilder.addField("decimalfield12", new ArrowType.Decimal(10, 2)); + schemaBuilder.addField("floatfield13", new ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)); + schemaBuilder.addField("varbinaryfield14", new ArrowType.Binary()); + schemaBuilder.addField("bitfield15", new ArrowType.Bool()); + + schemaBuilder.addListField("varcharlist16", Types.MinorType.VARCHAR.getType()); + schemaBuilder.addListField("intlist17", Types.MinorType.INT.getType()); + schemaBuilder.addListField("bigintlist18", Types.MinorType.BIGINT.getType()); + schemaBuilder.addListField("tinyintlist19", Types.MinorType.TINYINT.getType()); + schemaBuilder.addListField("smallintlist20", Types.MinorType.SMALLINT.getType()); + schemaBuilder.addListField("float4list21", Types.MinorType.FLOAT4.getType()); + schemaBuilder.addListField("float8list22", Types.MinorType.FLOAT8.getType()); + schemaBuilder.addListField("shortdeclist23", new ArrowType.Decimal(10, 2)); + schemaBuilder.addListField("longdeclist24", new ArrowType.Decimal(21, 2)); + schemaBuilder.addListField("varbinarylist25", Types.MinorType.VARBINARY.getType()); + schemaBuilder.addListField("bitlist26", Types.MinorType.BIT.getType()); + + schemaBuilder.addStructField("structField27"); + schemaBuilder.addChildField("structField27", "nestedBigInt", Types.MinorType.BIGINT.getType()); + schemaBuilder.addChildField("structField27", "nestedString", Types.MinorType.VARCHAR.getType()); +
schemaBuilder.addChildField("structField27", "tinyintcol", Types.MinorType.TINYINT.getType()); + schemaBuilder.addChildField("structField27", "smallintcol", Types.MinorType.SMALLINT.getType()); + schemaBuilder.addChildField("structField27", "float4Col", Types.MinorType.FLOAT4.getType()); + schemaBuilder.addChildField("structField27", "float8Col", Types.MinorType.FLOAT8.getType()); + schemaBuilder.addChildField("structField27", "shortDecCol", new ArrowType.Decimal(10, 2)); + schemaBuilder.addChildField("structField27", "longDecCol", new ArrowType.Decimal(21, 2)); + schemaBuilder.addChildField("structField27", "binaryCol", Types.MinorType.VARBINARY.getType()); + schemaBuilder.addChildField("structField27", "bitCol", Types.MinorType.BIT.getType()); + schemaBuilder.addStructField("structFieldNested28"); + schemaBuilder.addChildField("structFieldNested28", "bitCol", Types.MinorType.BIT.getType()); + schemaBuilder.addChildField("structFieldNested28", + FieldBuilder.newBuilder("nestedStruct", new ArrowType.Struct()) + .addField("nestedString", Types.MinorType.VARCHAR.getType(), null) + .addField("nestedBigInt", Types.MinorType.BIGINT.getType(), null) + .addListField("nestedList", Types.MinorType.VARCHAR.getType()) + .addListField("nestedListDec", new ArrowType.Decimal(10, 2)) + .build()); + return schemaBuilder.build(); + } + + + public static Block generateTestBlock(BlockAllocatorImpl expectedAllocator, Schema origSchema, int expectedRows) throws UnsupportedDataTypeException { + /** + * Generate and write the block + */ + Block expectedBlock = expectedAllocator.createBlock(origSchema); + int fieldCount = 1; + for (Field next : origSchema.getFields()) { + ValueVector vector = expectedBlock.getFieldVector(next.getName()); + switch (vector.getMinorType()) { + case DATEDAY: + DateDayVector dateDayVector = (DateDayVector) vector; + for (int i = 0; i < expectedRows; i++) { + dateDayVector.setSafe(i, i * fieldCount); + } + break; + case UINT4: + UInt4Vector uInt4Vector = (UInt4Vector) vector; + for (int i = 0; i < expectedRows; i++) { + uInt4Vector.setSafe(i, i * fieldCount * 2); + } + break; + case INT: + IntVector intVector = (IntVector) vector; + for (int i = 0; i < expectedRows; i++) { + intVector.setSafe(i, i * fieldCount * 3); + } + break; + case FLOAT8: + Float8Vector fVector = (Float8Vector) vector; + for (int i = 0; i < expectedRows; i++) { + fVector.setSafe(i, i * fieldCount * 1.1); + } + break; + case VARCHAR: + VarCharVector vVector = (VarCharVector) vector; + for (int i = 0; i < expectedRows; i++) { + vVector.setSafe(i, String.valueOf(i * fieldCount).getBytes(Charsets.UTF_8)); + } + break; + case DATEMILLI: + DateMilliVector dateMilliVector = (DateMilliVector) vector; + for (int i = 0; i < expectedRows; i++) { + dateMilliVector.setSafe(i, i * fieldCount * 4); + } + break; + case TINYINT: + TinyIntVector tinyIntVector = (TinyIntVector) vector; + for (int i = 0; i < expectedRows; i++) { + tinyIntVector.setSafe(i, i * fieldCount * 5); + } + break; + case UINT1: + UInt1Vector uInt1Vector = (UInt1Vector) vector; + for (int i = 0; i < expectedRows; i++) { + uInt1Vector.setSafe(i, i * fieldCount * 6); + } + break; + case SMALLINT: + SmallIntVector smallIntVector = (SmallIntVector) vector; + for (int i = 0; i < expectedRows; i++) { + smallIntVector.setSafe(i, i * fieldCount * 7); + } + break; + case UINT2: + UInt2Vector uInt2Vector = (UInt2Vector) vector; + for (int i = 0; i < expectedRows; i++) { + uInt2Vector.setSafe(i, i * fieldCount * 8); + } + break; + case UINT8: + UInt8Vector uInt8Vector = 
(UInt8Vector) vector; + for (int i = 0; i < expectedRows; i++) { + uInt8Vector.setSafe(i, i * fieldCount * 9); + } + break; + case BIGINT: + BigIntVector bigIntVector = (BigIntVector) vector; + for (int i = 0; i < expectedRows; i++) { + bigIntVector.setSafe(i, i * fieldCount * 10); + } + break; + case DECIMAL: + DecimalVector decimalVector = (DecimalVector) vector; + for (int i = 0; i < expectedRows; i++) { + BigDecimal bigDecimal = new BigDecimal((double) (i * fieldCount) * 1.01); + bigDecimal = bigDecimal.setScale(2, RoundingMode.HALF_UP); + decimalVector.setSafe(i, bigDecimal); + } + break; + case FLOAT4: + Float4Vector float4Vector = (Float4Vector) vector; + for (int i = 0; i < expectedRows; i++) { + float4Vector.setSafe(i, i * fieldCount * 9); + } + break; + case VARBINARY: + VarBinaryVector varBinaryVector = (VarBinaryVector) vector; + for (int i = 0; i < expectedRows; i++) { + byte[] data = String.valueOf(i * fieldCount).getBytes(Charsets.UTF_8); + varBinaryVector.setSafe(i, data); + } + break; + case BIT: + BitVector bitVector = (BitVector) vector; + for (int i = 0; i < expectedRows; i++) { + bitVector.setSafe(i, i % 2); + } + break; + case STRUCT: + StructVector sVector = (StructVector) vector; + for (int i = 0; i < expectedRows; i++) { + final int seed = i; + BlockUtils.setComplexValue(sVector, i, (Field field, Object value) -> { + if (field.getName().equals("nestedBigInt")) { + return (long) seed; + } + if (field.getName().equals("nestedString")) { + return String.valueOf(1000 + seed); + } + if (field.getName().equals("tinyintcol")) { + return (byte) seed; + } + + if (field.getName().equals("smallintcol")) { + return (short) seed; + } + + if (field.getName().equals("nestedList")) { + List<String> values = new ArrayList<>(); + values.add("val1"); + values.add("val2"); + return values; + } + + if (field.getName().equals("nestedListDec")) { + List<Double> values = new ArrayList<>(); + values.add(2.0D); + values.add(2.2D); + return values; + } + + if (field.getName().equals("float4Col")) { + return seed * 1.0F; + } + if (field.getName().equals("float8Col")) { + return seed * 2.0D; + } + if (field.getName().equals("shortDecCol")) { + return seed * 3.0D; + } + if (field.getName().equals("longDecCol")) { + return seed * 4.0D; + } + if (field.getName().equals("binaryCol")) { + return String.valueOf(seed).getBytes(Charsets.UTF_8); + } + if (field.getName().equals("bitCol")) { + return seed % 2 == 1; + } + if (field.getName().equals("nestedStruct")) { + //doesn't matter since we are generating the values for the struct + //it just needs to be non-null + return new Object(); + } + + throw new RuntimeException("Unexpected field " + field.getName()); + }, new Object()); + } + break; + case LIST: + Field child = vector.getField().getChildren().get(0); + + if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.BIGINT) { + for (int i = 0; i < expectedRows; i++) { + List<Long> values = new ArrayList<>(); + values.add(Long.valueOf(i)); + values.add(i + 1L); + values.add(i + 2L); + BlockUtils.setComplexValue((ListVector) vector, i, + FieldResolver.DEFAULT, + values); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.VARCHAR) { + for (int i = 0; i < expectedRows; i++) { + List<String> values = new ArrayList<>(); + values.add(String.valueOf(1000 + i)); + values.add(String.valueOf(1000 + i + 1)); + values.add(String.valueOf(1000 + i + 2)); + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + values); + } + } + else if
(Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.SMALLINT) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList((short) (i + 1))); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.INT) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList(i)); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.TINYINT) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList((byte) i)); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.FLOAT4) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList((i * 1.0F))); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.FLOAT8) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList((i * 1.0D))); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.DECIMAL) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList((i * 1.0D))); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.VARBINARY) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList(String.valueOf(i).getBytes(Charsets.UTF_8))); + } + } + else if (Types.getMinorTypeForArrowType(child.getType()) == Types.MinorType.BIT) { + for (int i = 0; i < expectedRows; i++) { + BlockUtils.setComplexValue((ListVector) vector, + i, + FieldResolver.DEFAULT, + Collections.singletonList(i % 2 == 1)); + } + } + break; + default: + throw new UnsupportedDataTypeException(vector.getMinorType() + " is not supported"); + } + fieldCount++; + } + expectedBlock.setRowCount(expectedRows); + + return expectedBlock; + } + + @Test + public void ListOfListsTest() + throws Exception + { + BlockAllocatorImpl expectedAllocator = new BlockAllocatorImpl(); + + /** + * Generate and write the schema + */ + SchemaBuilder schemaBuilder = new SchemaBuilder(); + schemaBuilder.addField( + FieldBuilder.newBuilder("outerlist", new ArrowType.List()) + .addListField("innerList", Types.MinorType.VARCHAR.getType()) + .build()); + Schema origSchema = schemaBuilder.build(); + + /** + * Generate and write the block + */ + Block expectedBlock = expectedAllocator.createBlock(origSchema); + + int expectedRows = 20; + for (Field next : origSchema.getFields()) { + ValueVector vector = expectedBlock.getFieldVector(next.getName()); + switch (vector.getMinorType()) { + case LIST: + Field child = vector.getField().getChildren().get(0); + for (int i = 0; i < expectedRows; i++) { + //For each row + List<List<String>> value = new ArrayList<>(); + switch (Types.getMinorTypeForArrowType(child.getType())) { + case LIST: + List<String> values = new ArrayList<>(); + values.add(String.valueOf(1000)); + values.add(String.valueOf(1001)); + values.add(String.valueOf(1002)); + value.add(values); + break; + default: + throw new UnsupportedDataTypeException(vector.getMinorType() + " is not supported"); + } +
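+ //write the generated list-of-lists value into this row of the outer ListVector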
BlockUtils.setComplexValue((ListVector) vector, i, FieldResolver.DEFAULT, value); + } + break; + default: + throw new UnsupportedDataTypeException(vector.getMinorType() + " is not supported"); + } + } + expectedBlock.setRowCount(expectedRows); + + RecordBatchSerDe expectSerDe = new RecordBatchSerDe(expectedAllocator); + ByteArrayOutputStream blockOut = new ByteArrayOutputStream(); + ArrowRecordBatch expectedBatch = expectedBlock.getRecordBatch(); + expectSerDe.serialize(expectedBatch, blockOut); + expectedBatch.close(); + expectedBlock.close(); + + ByteArrayOutputStream schemaOut = new ByteArrayOutputStream(); + SchemaSerDe schemaSerDe = new SchemaSerDe(); + schemaSerDe.serialize(origSchema, schemaOut); + Schema actualSchema = schemaSerDe.deserialize(new ByteArrayInputStream(schemaOut.toByteArray())); + + BlockAllocatorImpl actualAllocator = new BlockAllocatorImpl(); + RecordBatchSerDe actualSerDe = new RecordBatchSerDe(actualAllocator); + ArrowRecordBatch batch = actualSerDe.deserialize(blockOut.toByteArray()); + + /** + * Read the block back from the deserialized batch + */ + Block actualBlock = actualAllocator.createBlock(actualSchema); + actualBlock.loadRecordBatch(batch); + batch.close(); + + for (int i = 0; i < actualBlock.getRowCount(); i++) { + logger.info("ListOfList: util {}", BlockUtils.rowToString(actualBlock, i)); + } + + assertEquals("Row count mismatch", expectedRows, actualBlock.getRowCount()); + int actualFieldCount = 1; + for (Field next : actualBlock.getFields()) { + FieldReader vector = actualBlock.getFieldReader(next.getName()); + switch (vector.getMinorType()) { + case LIST: + int actual = 0; + for (int i = 0; i < actualBlock.getRowCount(); i++) { + vector.setPosition(i); + int entryValues = 0; + while (vector.next()) { + FieldReader innerReader = vector.reader(); + int j = 0; + while (innerReader.next()) { + entryValues++; + assertEquals(String.valueOf(1000 + j++), innerReader.reader().readText().toString()); + } + } + if (entryValues > 0) {actual++;} + } + + assertEquals("failed for " + vector.getField().getName(), actualBlock.getRowCount(), actual); + break; + default: + throw new UnsupportedDataTypeException(next.getType().getTypeID() + " is not supported"); + } + actualFieldCount++; + } + + actualBlock.close(); + } + + @Test + public void ListOfStructsTest() + throws Exception + { + BlockAllocatorImpl expectedAllocator = new BlockAllocatorImpl(); + + /** + * Generate and write the schema + */ + SchemaBuilder schemaBuilder = new SchemaBuilder(); + schemaBuilder.addField( + FieldBuilder.newBuilder("outerlist", new ArrowType.List()) + .addField( + FieldBuilder.newBuilder("innerStruct", Types.MinorType.STRUCT.getType()) + .addStringField("varchar") + .addBigIntField("bigint") + .build()) + .build()); + Schema origSchema = schemaBuilder.build(); + + /** + * Generate and write the block + */ + Block expectedBlock = expectedAllocator.createBlock(origSchema); + + int expectedRows = 20; + for (Field next : origSchema.getFields()) { + ValueVector vector = expectedBlock.getFieldVector(next.getName()); + switch (vector.getMinorType()) { + case LIST: + Field child = vector.getField().getChildren().get(0); + for (int i = 0; i < expectedRows; i++) { + //For each row + List<Map<String, Object>> value = new ArrayList<>(); + switch (Types.getMinorTypeForArrowType(child.getType())) { + case STRUCT: + Map<String, Object> values = new HashMap<>(); + values.put("varchar", "chars"); + values.put("bigint", 100L); + value.add(values); + break; + default: + throw new UnsupportedDataTypeException(vector.getMinorType() + " is not supported");
+ } + BlockUtils.setComplexValue((ListVector) vector, i, FieldResolver.DEFAULT, value); + } + break; + default: + throw new UnsupportedDataTypeException(vector.getMinorType() + " is not supported"); + } + } + expectedBlock.setRowCount(expectedRows); + + RecordBatchSerDe expectSerDe = new RecordBatchSerDe(expectedAllocator); + ByteArrayOutputStream blockOut = new ByteArrayOutputStream(); + ArrowRecordBatch expectedBatch = expectedBlock.getRecordBatch(); + expectSerDe.serialize(expectedBatch, blockOut); + expectedBatch.close(); + expectedBlock.close(); + + ByteArrayOutputStream schemaOut = new ByteArrayOutputStream(); + SchemaSerDe schemaSerDe = new SchemaSerDe(); + schemaSerDe.serialize(origSchema, schemaOut); + Schema actualSchema = schemaSerDe.deserialize(new ByteArrayInputStream(schemaOut.toByteArray())); + + BlockAllocatorImpl actualAllocator = new BlockAllocatorImpl(); + RecordBatchSerDe actualSerDe = new RecordBatchSerDe(actualAllocator); + ArrowRecordBatch batch = actualSerDe.deserialize(blockOut.toByteArray()); + + /** + * Read the block back from the deserialized batch + */ + Block actualBlock = actualAllocator.createBlock(actualSchema); + actualBlock.loadRecordBatch(batch); + batch.close(); + + for (int i = 0; i < actualBlock.getRowCount(); i++) { + logger.info("ListOfStructs: util {}", BlockUtils.rowToString(actualBlock, i)); + } + + assertEquals("Row count mismatch", expectedRows, actualBlock.getRowCount()); + int actualFieldCount = 1; + for (Field next : actualBlock.getFields()) { + FieldReader vector = actualBlock.getFieldReader(next.getName()); + switch (vector.getMinorType()) { + case LIST: + int actual = 0; + for (int i = 0; i < actualBlock.getRowCount(); i++) { + vector.setPosition(i); + int entryValues = 0; + while (vector.next()) { + entryValues++; + assertEquals("chars", vector.reader().reader("varchar").readText().toString()); + assertEquals(Long.valueOf(100), vector.reader().reader("bigint").readLong()); + } + if (entryValues > 0) {actual++;} + } + + assertEquals("failed for " + vector.getField().getName(), actualBlock.getRowCount(), actual); + break; + default: + throw new UnsupportedDataTypeException(next.getType().getTypeID() + " is not supported"); + } + actualFieldCount++; + } + + actualBlock.close(); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/BlockUtilsTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/BlockUtilsTest.java new file mode 100644 index 0000000000..ad3da41974 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/BlockUtilsTest.java @@ -0,0 +1,104 @@ +package com.amazonaws.athena.connector.lambda.data; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ + +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.math.BigDecimal; +import java.time.LocalDate; +import java.time.LocalDateTime; +import java.util.List; +import java.util.Map; + +import static com.amazonaws.athena.connector.lambda.data.BlockUtils.UTC_ZONE_ID; +import static org.junit.Assert.*; + +public class BlockUtilsTest +{ + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void after() + { + allocator.close(); + } + + @Test + public void copyRows() + { + Schema schema = SchemaBuilder.newBuilder() + .addField("col1", new ArrowType.Int(32, true)) + .addField("col2", new ArrowType.Decimal(38, 9)) + .build(); + + //Make a block with 3 rows + Block src = allocator.createBlock(schema); + BlockUtils.setValue(src.getFieldVector("col1"), 0, 10); + BlockUtils.setValue(src.getFieldVector("col2"), 0, new BigDecimal(20)); + BlockUtils.setValue(src.getFieldVector("col1"), 1, 11); + BlockUtils.setValue(src.getFieldVector("col2"), 1, new BigDecimal(21)); + BlockUtils.setValue(src.getFieldVector("col1"), 2, 12); + BlockUtils.setValue(src.getFieldVector("col2"), 2, new BigDecimal(22)); + src.setRowCount(3); + + //Make the destination block + Block dst = allocator.createBlock(schema); + + assertEquals(3, BlockUtils.copyRows(src, dst, 0, 2)); + + assertEquals(src, dst); + } + + @Test + public void isNullRow() + { + Schema schema = SchemaBuilder.newBuilder() + .addField("col1", new ArrowType.Int(32, true)) + .addField("col2", new ArrowType.Int(32, true)) + .build(); + + //Make a block with 2 rows and no null rows + Block block = allocator.createBlock(schema); + BlockUtils.setValue(block.getFieldVector("col1"), 0, 10); + BlockUtils.setValue(block.getFieldVector("col2"), 0, 20); + BlockUtils.setValue(block.getFieldVector("col1"), 1, 11); + BlockUtils.setValue(block.getFieldVector("col2"), 1, 21); + block.setRowCount(2); + + assertFalse(BlockUtils.isNullRow(block, 1)); + + //now set a row to null + BlockUtils.unsetRow(1, block); + assertTrue(BlockUtils.isNullRow(block, 1)); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpillerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpillerTest.java new file mode 100644 index 0000000000..69f9dd7b51 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/S3BlockSpillerTest.java @@ -0,0 +1,207 @@ +package com.amazonaws.athena.connector.lambda.data; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.google.common.io.ByteStreams; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.mockito.stubbing.Answer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.InputStream; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.anyObject; +import static org.mockito.Matchers.anyString; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.reset; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.verifyNoMoreInteractions; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class S3BlockSpillerTest +{ + private static final Logger logger = LoggerFactory.getLogger(S3BlockSpillerTest.class); + + private String bucket = "MyBucket"; + private String prefix = "blocks/spill"; + private String requestId = "requestId"; + private String splitId = "splitId"; + + @Mock + private AmazonS3 mockS3; + + private S3BlockSpiller blockWriter; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + private Block expected; + private BlockAllocatorImpl allocator; + private SpillConfig spillConfig; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + + Schema schema = SchemaBuilder.newBuilder() + .addField("col1", new ArrowType.Int(32, true)) + .addField("col2", new ArrowType.Utf8()) + .build(); + + spillConfig = SpillConfig.newBuilder().withEncryptionKey(keyFactory.create()) + .withRequestId(requestId) + .withSpillLocation(S3SpillLocation.newBuilder() + .withBucket(bucket) + .withPrefix(prefix) + .withQueryId(requestId) + .withSplitId(splitId) + .withIsDirectory(true) + .build()) + .build(); + + blockWriter = new S3BlockSpiller(mockS3, spillConfig, allocator, schema, ConstraintEvaluator.emptyEvaluator()); + + expected = allocator.createBlock(schema); + BlockUtils.setValue(expected.getFieldVector("col1"), 0, 100); + BlockUtils.setValue(expected.getFieldVector("col2"), 0, "VarChar"); + BlockUtils.setValue(expected.getFieldVector("col1"), 1, 101); + BlockUtils.setValue(expected.getFieldVector("col2"), 1, "VarChar1"); + expected.setRowCount(2); + } + + @After + public void tearDown() + throws Exception + { + expected.close(); + allocator.close(); + blockWriter.close(); + } + + @Test + public void spillTest() + throws IOException + { + logger.info("spillTest: enter"); + + logger.info("spillTest: starting write test"); + + final ByteHolder byteHolder = new ByteHolder(); +
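+ //capture the bytes written to S3 so the read-side getObject mock below can replay exactly what was spilled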
when(mockS3.putObject(eq(bucket), anyString(), anyObject(), anyObject())) + .thenAnswer(new Answer() + { + @Override + public Object answer(InvocationOnMock invocationOnMock) + throws Throwable + { + InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2]; + byteHolder.setBytes(ByteStreams.toByteArray(inputStream)); + return mock(PutObjectResult.class); + } + }); + + SpillLocation blockLocation = blockWriter.write(expected); + + if (blockLocation instanceof S3SpillLocation) { + assertEquals(bucket, ((S3SpillLocation) blockLocation).getBucket()); + assertEquals(prefix + "/" + requestId + "/" + splitId + ".0", ((S3SpillLocation) blockLocation).getKey()); + } + + SpillLocation blockLocation2 = blockWriter.write(expected); + + if (blockLocation2 instanceof S3SpillLocation) { + assertEquals(bucket, ((S3SpillLocation) blockLocation2).getBucket()); + assertEquals(prefix + "/" + requestId + "/" + splitId + ".1", ((S3SpillLocation) blockLocation2).getKey()); + } + + verify(mockS3, times(1)) + .putObject(eq(bucket), eq(prefix + "/" + requestId + "/" + splitId + ".0"), anyObject(), anyObject()); + verify(mockS3, times(1)) + .putObject(eq(bucket), eq(prefix + "/" + requestId + "/" + splitId + ".1"), anyObject(), anyObject()); + + verifyNoMoreInteractions(mockS3); + reset(mockS3); + + logger.info("spillTest: Starting read test."); + + when(mockS3.getObject(eq(bucket), eq(prefix + "/" + requestId + "/" + splitId + ".1"))) + .thenAnswer(new Answer() + { + @Override + public Object answer(InvocationOnMock invocationOnMock) + throws Throwable + { + S3Object mockObject = mock(S3Object.class); + when(mockObject.getObjectContent()).thenReturn(new S3ObjectInputStream(new ByteArrayInputStream(byteHolder.getBytes()), null)); + return mockObject; + } + }); + + Block block = blockWriter.read((S3SpillLocation) blockLocation2, spillConfig.getEncryptionKey(), expected.getSchema()); + + assertEquals(expected, block); + + verify(mockS3, times(1)) + .getObject(eq(bucket), eq(prefix + "/" + requestId + "/" + splitId + ".1")); + + verifyNoMoreInteractions(mockS3); + + logger.info("spillTest: exit"); + } + + private class ByteHolder + { + private byte[] bytes; + + public void setBytes(byte[] bytes) + { + this.bytes = bytes; + } + + public byte[] getBytes() + { + return bytes; + } + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/UnitTestBlockUtils.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/UnitTestBlockUtils.java new file mode 100644 index 0000000000..2707344116 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/data/UnitTestBlockUtils.java @@ -0,0 +1,147 @@ +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connector.lambda.data; + +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.util.Text; + +import java.time.Instant; +import java.time.LocalDate; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Objects; + +import static com.amazonaws.athena.connector.lambda.data.BlockUtils.UTC_ZONE_ID; + +public class UnitTestBlockUtils +{ + + /** + * Used to get values (Integer, Double, String, etc) from the Arrow values in the fieldReader + * @param fieldReader the field reader containing the arrow value + * @param pos the row position to read + * @return the value object in java type + */ + public static Object getValue(FieldReader fieldReader, int pos) + { + fieldReader.setPosition(pos); + + Types.MinorType minorType = fieldReader.getMinorType(); + switch (minorType) { + case DATEMILLI: + if (Objects.isNull(fieldReader.readLocalDateTime())) { + return null; + } + long millis = fieldReader.readLocalDateTime().toDateTime(org.joda.time.DateTimeZone.UTC).getMillis(); + return Instant.ofEpochMilli(millis).atZone(UTC_ZONE_ID).toLocalDateTime(); + case TINYINT: + case UINT1: + return fieldReader.readByte(); + case UINT2: + return fieldReader.readCharacter(); + case SMALLINT: + return fieldReader.readShort(); + case DATEDAY: + Integer intVal = fieldReader.readInteger(); + if (Objects.isNull(intVal)) { + return null; + } + return LocalDate.ofEpochDay(intVal); + case INT: + case UINT4: + return fieldReader.readInteger(); + case UINT8: + case BIGINT: + return fieldReader.readLong(); + case DECIMAL: + return fieldReader.readBigDecimal(); + case FLOAT4: + return fieldReader.readFloat(); + case FLOAT8: + return fieldReader.readDouble(); + case VARCHAR: + Text text = fieldReader.readText(); + if (Objects.isNull(text)) { + return null; + } + return text.toString(); + case VARBINARY: + return fieldReader.readByteArray(); + case BIT: + return fieldReader.readBoolean(); + case LIST: + return readList(fieldReader); + case STRUCT: + return readStruct(fieldReader); + default: + throw new IllegalArgumentException("Unsupported type " + minorType); + } + } + + /** + * Recursively read the values of a complex list + * @param listReader the list reader, positioned at the row to read + * @return the values in the list as java types, or null if the list is not set + */ + private static List<Object> readList(FieldReader listReader) + { + if (!listReader.isSet()) { + return null; + } + + List<Object> list = new ArrayList<>(); + + while (listReader.next()) { + FieldReader subReader = listReader.reader(); + if (!subReader.isSet()) { + list.add(null); + continue; + } + list.add(getValue(subReader, subReader.getPosition())); + } + return list; + } + + /** + * Recursively reads the value of a complex struct object.
+ * @param structReader the struct reader, positioned at the row to read + * @return a map of child field name to java value, or null if the struct is not set + */ + private static Map<String, Object> readStruct(FieldReader structReader) + { + if (!structReader.isSet()) { + return null; + } + + List<Field> fields = structReader.getField().getChildren(); + + Map<String, Object> nameToValues = new HashMap<>(); + for (Field child : fields) { + FieldReader subReader = structReader.reader(child.getName()); + nameToValues.put(child.getName(), getValue(subReader, subReader.getPosition())); + } + + return nameToValues; + } + +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/AllOrNoneValueSetTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/AllOrNoneValueSetTest.java new file mode 100644 index 0000000000..e5897769f6 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/AllOrNoneValueSetTest.java @@ -0,0 +1,177 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import org.apache.arrow.vector.types.Types; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.util.Collections; + +import static org.junit.Assert.*; + +public class AllOrNoneValueSetTest +{ + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void testAll() + throws Exception + { + AllOrNoneValueSet valueSet = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + assertEquals(valueSet.getType(), Types.MinorType.INT.getType()); + assertFalse(valueSet.isNone()); + assertTrue(valueSet.isAll()); + assertFalse(valueSet.isSingleValue()); + assertTrue(valueSet.containsValue(Marker.exactly(allocator, Types.MinorType.INT.getType(), 0))); + + try { + valueSet.getSingleValue(); + fail(); + } + catch (Exception ignored) { + } + } + + @Test + public void testNullability() + throws Exception + { + ValueSet notNull = AllOrNoneValueSet.notNull(Types.MinorType.INT.getType()); + assertTrue(notNull.containsValue(Marker.exactly(allocator, Types.MinorType.INT.getType(), 100))); + assertFalse(notNull.containsValue(Marker.nullMarker(allocator, Types.MinorType.INT.getType()))); + assertTrue(notNull.containsValue(Marker.exactly(allocator, Types.MinorType.INT.getType(), 101))); + + ValueSet onlyNull = AllOrNoneValueSet.onlyNull(Types.MinorType.INT.getType()); + assertFalse(onlyNull.containsValue(Marker.exactly(allocator, Types.MinorType.INT.getType(), 100))); + assertTrue(onlyNull.containsValue(Marker.nullMarker(allocator, Types.MinorType.INT.getType()))); + assertFalse(onlyNull.containsValue(Marker.exactly(allocator, Types.MinorType.INT.getType(), 101))); + } + + @Test + public void testNone() + throws
Exception + { + AllOrNoneValueSet valueSet = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + assertEquals(valueSet.getType(), Types.MinorType.INT.getType()); + assertTrue(valueSet.isNone()); + assertFalse(valueSet.isAll()); + assertFalse(valueSet.isSingleValue()); + assertFalse(valueSet.containsValue(Marker.exactly(allocator, Types.MinorType.INT.getType(), 0))); + + try { + valueSet.getSingleValue(); + fail(); + } + catch (Exception ignored) { + } + } + + @Test + public void testIntersect() + throws Exception + { + AllOrNoneValueSet all = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + AllOrNoneValueSet none = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + + assertEquals(all.intersect(allocator, all), all); + assertEquals(all.intersect(allocator, none), none); + assertEquals(none.intersect(allocator, all), none); + assertEquals(none.intersect(allocator, none), none); + } + + @Test + public void testUnion() + throws Exception + { + AllOrNoneValueSet all = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + AllOrNoneValueSet none = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + + assertEquals(all.union(allocator, all), all); + assertEquals(all.union(allocator, none), all); + assertEquals(none.union(allocator, all), all); + assertEquals(none.union(allocator, none), none); + } + + @Test + public void testComplement() + throws Exception + { + AllOrNoneValueSet all = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + AllOrNoneValueSet none = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + + assertEquals(all.complement(allocator), none); + assertEquals(none.complement(allocator), all); + } + + @Test + public void testOverlaps() + throws Exception + { + AllOrNoneValueSet all = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + AllOrNoneValueSet none = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + + assertTrue(all.overlaps(allocator, all)); + assertFalse(all.overlaps(allocator, none)); + assertFalse(none.overlaps(allocator, all)); + assertFalse(none.overlaps(allocator, none)); + } + + @Test + public void testSubtract() + throws Exception + { + AllOrNoneValueSet all = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + AllOrNoneValueSet none = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + + assertEquals(all.subtract(allocator, all), none); + assertEquals(all.subtract(allocator, none), all); + assertEquals(none.subtract(allocator, all), none); + assertEquals(none.subtract(allocator, none), none); + } + + @Test + public void testContains() + throws Exception + { + AllOrNoneValueSet all = AllOrNoneValueSet.all(Types.MinorType.INT.getType()); + AllOrNoneValueSet none = AllOrNoneValueSet.none(Types.MinorType.INT.getType()); + + assertTrue(all.contains(allocator, all)); + assertTrue(all.contains(allocator, none)); + assertFalse(none.contains(allocator, all)); + assertTrue(none.contains(allocator, none)); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/EquatableValueSetTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/EquatableValueSetTest.java new file mode 100644 index 0000000000..fd3fa81c26 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/EquatableValueSetTest.java @@ -0,0 +1,328 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web 
Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.util.Collections; + +import static org.junit.Assert.*; + +public class EquatableValueSetTest +{ + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + private ArrowType INT = Types.MinorType.INT.getType(); + + @Test + public void testEmptySet() + throws Exception + { + EquatableValueSet equatables = EquatableValueSet.none(allocator, INT); + assertEquals(equatables.getType(), INT); + assertTrue(equatables.isNone()); + assertFalse(equatables.isAll()); + assertFalse(equatables.isSingleValue()); + assertTrue(equatables.isWhiteList()); + assertEquals(equatables.getValues().getRowCount(), 0); + assertEquals(equatables.complement(allocator), EquatableValueSet.all(allocator, INT)); + assertFalse(equatables.containsValue(0)); + assertFalse(equatables.containsValue(1)); + } + + @Test + public void testEntireSet() + throws Exception + { + EquatableValueSet equatables = EquatableValueSet.all(allocator, INT); + assertEquals(equatables.getType(), INT); + assertFalse(equatables.isNone()); + assertTrue(equatables.isAll()); + assertFalse(equatables.isSingleValue()); + assertFalse(equatables.isWhiteList()); + assertEquals(equatables.getValues().getRowCount(), 0); + assertEquals(equatables.complement(allocator), EquatableValueSet.none(allocator, INT)); + assertTrue(equatables.containsValue(0)); + assertTrue(equatables.containsValue(1)); + } + + @Test + public void testSingleValue() + throws Exception + { + EquatableValueSet equatables = EquatableValueSet.of(allocator, INT, 10); + + EquatableValueSet complement = (EquatableValueSet) EquatableValueSet.all(allocator, INT).subtract(allocator, equatables); + + // Whitelist + assertEquals(equatables.getType(), INT); + assertFalse(equatables.isNone()); + assertFalse(equatables.isAll()); + assertTrue(equatables.isSingleValue()); + assertTrue(equatables.isWhiteList()); + assertEquals(equatables.getSingleValue(), 10); + assertEquals(equatables.complement(allocator), complement); + assertFalse(equatables.containsValue(0)); + assertFalse(equatables.containsValue(1)); + assertTrue(equatables.containsValue(10)); + + // Blacklist + assertEquals(complement.getType(), INT); + assertFalse(complement.isNone()); + assertFalse(complement.isAll()); + assertFalse(complement.isSingleValue()); + assertFalse(complement.isWhiteList()); + assertEquals(complement.toString(), complement.getValue(0), 10); + assertEquals(complement.complement(allocator), equatables); + assertTrue(complement.containsValue(0)); + assertTrue(complement.containsValue(1)); + assertFalse(complement.containsValue(10)); + } + + @Test + public 
void testMultipleValues() + throws Exception + { + EquatableValueSet equatables = EquatableValueSet.of(allocator, INT, 1, 2, 3, 1); + + EquatableValueSet complement = (EquatableValueSet) EquatableValueSet.all(allocator, INT).subtract(allocator, equatables); + + // Whitelist + assertEquals(equatables.getType(), INT); + assertFalse(equatables.isNone()); + assertFalse(equatables.isAll()); + assertFalse(equatables.isSingleValue()); + assertTrue(equatables.isWhiteList()); + assertEquals(equatables.complement(allocator), complement); + assertFalse(equatables.containsValue(0)); + assertTrue(equatables.containsValue(1)); + assertTrue(equatables.containsValue(2)); + assertTrue(equatables.containsValue(3)); + assertFalse(equatables.containsValue(4)); + + // Blacklist + assertEquals(complement.getType(), INT); + assertFalse(complement.isNone()); + assertFalse(complement.isAll()); + assertFalse(complement.isSingleValue()); + assertFalse(complement.isWhiteList()); + assertEquals(complement.complement(allocator), equatables); + assertTrue(complement.containsValue(0)); + assertFalse(complement.containsValue(1)); + assertFalse(complement.containsValue(2)); + assertFalse(complement.containsValue(3)); + assertTrue(complement.containsValue(4)); + } + + @Test + public void testGetSingleValue() + throws Exception + { + assertEquals(EquatableValueSet.of(allocator, INT, 0).getSingleValue(), 0); + try { + EquatableValueSet.all(allocator, INT).getSingleValue(); + fail(); + } + catch (IllegalStateException ignored) { + } + } + + @Test + public void testNullability() + throws Exception + { + ValueSet actual = EquatableValueSet.of(allocator, INT, true, Collections.singletonList(100)); + assertTrue(actual.containsValue(Marker.exactly(allocator, INT, 100))); + assertTrue(actual.containsValue(Marker.nullMarker(allocator, INT))); + assertFalse(actual.containsValue(Marker.exactly(allocator, INT, 101))); + } + + @Test + public void testOverlaps() + throws Exception + { + assertTrue(EquatableValueSet.all(allocator, INT).overlaps(allocator, EquatableValueSet.all(allocator, INT))); + assertFalse(EquatableValueSet.all(allocator, INT).overlaps(allocator, EquatableValueSet.none(allocator, INT))); + assertTrue(EquatableValueSet.all(allocator, INT).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertTrue(EquatableValueSet.all(allocator, INT).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertTrue(EquatableValueSet.all(allocator, INT).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + + assertFalse(EquatableValueSet.none(allocator, INT).overlaps(allocator, EquatableValueSet.all(allocator, INT))); + assertFalse(EquatableValueSet.none(allocator, INT).overlaps(allocator, EquatableValueSet.none(allocator, INT))); + assertFalse(EquatableValueSet.none(allocator, INT).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertFalse(EquatableValueSet.none(allocator, INT).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertFalse(EquatableValueSet.none(allocator, INT).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + + assertTrue(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.all(allocator, INT))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.none(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0))); + 
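+ //two value sets overlap when the values they admit intersect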
assertFalse(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.of(allocator, INT, 1))); + assertTrue(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0).complement(allocator))); + assertTrue(EquatableValueSet.of(allocator, INT, 0).overlaps(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator))); + + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).overlaps(allocator, EquatableValueSet.all(allocator, INT))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).overlaps(allocator, EquatableValueSet.none(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).overlaps(allocator, EquatableValueSet.of(allocator, INT, -1))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).overlaps(allocator, EquatableValueSet.of(allocator, INT, -1).complement(allocator))); + + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).overlaps(allocator, EquatableValueSet.all(allocator, INT))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).overlaps(allocator, EquatableValueSet.none(allocator, INT))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).overlaps(allocator, EquatableValueSet.of(allocator, INT, -1))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).overlaps(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).overlaps(allocator, EquatableValueSet.of(allocator, INT, -1).complement(allocator))); + } + + @Test + public void testContains() + throws Exception + { + assertTrue(EquatableValueSet.all(allocator, INT).contains(allocator, EquatableValueSet.all(allocator, INT))); + assertTrue(EquatableValueSet.all(allocator, INT).contains(allocator, EquatableValueSet.none(allocator, INT))); + assertTrue(EquatableValueSet.all(allocator, INT).contains(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertTrue(EquatableValueSet.all(allocator, INT).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertTrue(EquatableValueSet.all(allocator, INT).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + + assertFalse(EquatableValueSet.none(allocator, INT).contains(allocator, EquatableValueSet.all(allocator, INT))); + assertTrue(EquatableValueSet.none(allocator, INT).contains(allocator, EquatableValueSet.none(allocator, INT))); + assertFalse(EquatableValueSet.none(allocator, INT).contains(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertFalse(EquatableValueSet.none(allocator, INT).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertFalse(EquatableValueSet.none(allocator, INT).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + + 
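+ //a finite whitelist contains only sets whose admitted values are a subset of its own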
assertFalse(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.all(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.none(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.of(allocator, INT, 0).complement(allocator))); + assertFalse(EquatableValueSet.of(allocator, INT, 0).contains(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator))); + + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.all(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.none(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 2))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.of(allocator, INT, 0).complement(allocator))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).contains(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator))); + + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).contains(allocator, EquatableValueSet.all(allocator, INT))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).contains(allocator, EquatableValueSet.none(allocator, INT))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).contains(allocator, EquatableValueSet.of(allocator, INT, 0))); + assertTrue(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).contains(allocator, EquatableValueSet.of(allocator, INT, -1))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).contains(allocator, EquatableValueSet.of(allocator, INT, 0, 1))); + assertFalse(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).contains(allocator, EquatableValueSet.of(allocator, INT, -1).complement(allocator))); + } + + @Test + public void testIntersect() + throws Exception + { + assertEquals(EquatableValueSet.none(allocator, INT).intersect(allocator, EquatableValueSet.none(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.all(allocator, INT).intersect(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.all(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).intersect(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).intersect(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.all(allocator, 
INT).intersect(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).intersect(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).intersect(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).intersect(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).intersect(allocator, EquatableValueSet.of(allocator, INT, 1)), EquatableValueSet.of(allocator, INT, 1)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).intersect(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).intersect(allocator, EquatableValueSet.of(allocator, INT, 0, 2)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).intersect(allocator, EquatableValueSet.of(allocator, INT, 0, 2)), EquatableValueSet.of(allocator, INT, 2)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).intersect(allocator, EquatableValueSet.of(allocator, INT, 0, 2).complement(allocator)), EquatableValueSet.of(allocator, INT, 0, 1, 2).complement(allocator)); + } + + @Test + public void testUnion() + throws Exception + { + assertEquals(EquatableValueSet.none(allocator, INT).union(allocator, EquatableValueSet.none(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.all(allocator, INT).union(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.all(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).union(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.all(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).union(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.all(allocator, INT).union(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.all(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).union(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).union(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0, 1)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).union(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.all(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).union(allocator, EquatableValueSet.of(allocator, INT, 1)), EquatableValueSet.of(allocator, INT, 0).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).union(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator)), EquatableValueSet.of(allocator, INT, 1).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).union(allocator, EquatableValueSet.of(allocator, INT, 0, 2)), EquatableValueSet.of(allocator, INT, 0, 1, 2)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).union(allocator, 
EquatableValueSet.of(allocator, INT, 0, 2)), EquatableValueSet.of(allocator, INT, 1).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator).union(allocator, EquatableValueSet.of(allocator, INT, 0, 2).complement(allocator)), EquatableValueSet.of(allocator, INT, 0).complement(allocator)); + } + + @Test + public void testSubtract() + throws Exception + { + assertEquals(EquatableValueSet.all(allocator, INT).subtract(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.all(allocator, INT).subtract(allocator, EquatableValueSet.none(allocator, INT)), EquatableValueSet.all(allocator, INT)); + assertEquals(EquatableValueSet.all(allocator, INT).subtract(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0).complement(allocator)); + assertEquals(EquatableValueSet.all(allocator, INT).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1)), EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)); + assertEquals(EquatableValueSet.all(allocator, INT).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)), EquatableValueSet.of(allocator, INT, 0, 1)); + + assertEquals(EquatableValueSet.none(allocator, INT).subtract(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).subtract(allocator, EquatableValueSet.none(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).subtract(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.none(allocator, INT).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)), EquatableValueSet.none(allocator, INT)); + + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.all(allocator, INT)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.none(allocator, INT)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.of(allocator, INT, 0).complement(allocator)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.of(allocator, INT, 1)), EquatableValueSet.of(allocator, INT, 0)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)), EquatableValueSet.of(allocator, INT, 0)); + + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.all(allocator, INT)), 
EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.none(allocator, INT)), EquatableValueSet.of(allocator, INT, 0).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.of(allocator, INT, 0)), EquatableValueSet.of(allocator, INT, 0).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.of(allocator, INT, 0).complement(allocator)), EquatableValueSet.none(allocator, INT)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.of(allocator, INT, 1)), EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.of(allocator, INT, 1).complement(allocator)), EquatableValueSet.of(allocator, INT, 1)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1)), EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)); + assertEquals(EquatableValueSet.of(allocator, INT, 0).complement(allocator).subtract(allocator, EquatableValueSet.of(allocator, INT, 0, 1).complement(allocator)), EquatableValueSet.of(allocator, INT, 1)); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/MarkerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/MarkerTest.java new file mode 100644 index 0000000000..d89b30434a --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/MarkerTest.java @@ -0,0 +1,176 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import com.google.common.collect.Ordering; +import org.apache.arrow.vector.types.Types; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.util.Map; + +import static org.junit.Assert.*; + +public class MarkerTest +{ + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void testTypes() + throws Exception + { + assertEquals(Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()).getType(), Types.MinorType.INT.getType()); + assertEquals(Marker.below(allocator, Types.MinorType.INT.getType(), 1).getType(), Types.MinorType.INT.getType()); + assertEquals(Marker.exactly(allocator, Types.MinorType.INT.getType(), 1).getType(), Types.MinorType.INT.getType()); + assertEquals(Marker.above(allocator, Types.MinorType.INT.getType(), 1).getType(), Types.MinorType.INT.getType()); + assertEquals(Marker.upperUnbounded(allocator, Types.MinorType.INT.getType()).getType(), Types.MinorType.INT.getType()); + } + + @Test + public void testUnbounded() + throws Exception + { + assertTrue(Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()).isLowerUnbounded()); + assertFalse(Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()).isUpperUnbounded()); + assertTrue(Marker.upperUnbounded(allocator, Types.MinorType.INT.getType()).isUpperUnbounded()); + assertFalse(Marker.upperUnbounded(allocator, Types.MinorType.INT.getType()).isLowerUnbounded()); + + assertFalse(Marker.below(allocator, Types.MinorType.INT.getType(), 1).isLowerUnbounded()); + assertFalse(Marker.below(allocator, Types.MinorType.INT.getType(), 1).isUpperUnbounded()); + assertFalse(Marker.exactly(allocator, Types.MinorType.INT.getType(), 1).isLowerUnbounded()); + assertFalse(Marker.exactly(allocator, Types.MinorType.INT.getType(), 1).isUpperUnbounded()); + assertFalse(Marker.above(allocator, Types.MinorType.INT.getType(), 1).isLowerUnbounded()); + assertFalse(Marker.above(allocator, Types.MinorType.INT.getType(), 1).isUpperUnbounded()); + } + + @Test + public void testComparisons() + throws Exception + { + ImmutableList<Marker> markers = ImmutableList.of( + Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()), + Marker.above(allocator, Types.MinorType.INT.getType(), 0), + Marker.below(allocator, Types.MinorType.INT.getType(), 1), + Marker.exactly(allocator, Types.MinorType.INT.getType(), 1), + Marker.above(allocator, Types.MinorType.INT.getType(), 1), + Marker.below(allocator, Types.MinorType.INT.getType(), 2), + Marker.upperUnbounded(allocator, Types.MinorType.INT.getType())); + + assertTrue(Ordering.natural().isStrictlyOrdered(markers)); + + // Compare every marker with every other marker + // Since the markers are strictly ordered, the value of the comparisons should be equivalent to the comparisons + // of their indexes.
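+ // For example, lowerUnbounded sits at index 0 and above(0) at index 1, so lowerUnbounded.compareTo(above(0)) is expected to equal Integer.compare(0, 1), i.e. -1.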
+ for (int i = 0; i < markers.size(); i++) { + for (int j = 0; j < markers.size(); j++) { + assertTrue(markers.get(i).compareTo(markers.get(j)) == Integer.compare(i, j)); + } + } + } + + @Test + public void testAdjacency() + throws Exception + { + ImmutableMap<Marker, Integer> markers = ImmutableMap.<Marker, Integer>builder() + .put(Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()), -1000) + .put(Marker.above(allocator, Types.MinorType.INT.getType(), 0), -100) + .put(Marker.below(allocator, Types.MinorType.INT.getType(), 1), -1) + .put(Marker.exactly(allocator, Types.MinorType.INT.getType(), 1), 0) + .put(Marker.above(allocator, Types.MinorType.INT.getType(), 1), 1) + .put(Marker.below(allocator, Types.MinorType.INT.getType(), 2), 100) + .put(Marker.upperUnbounded(allocator, Types.MinorType.INT.getType()), 1000) + .build(); + + // Compare every marker with every other marker + // Map values of distance 1 indicate expected adjacency + for (Map.Entry<Marker, Integer> entry1 : markers.entrySet()) { + for (Map.Entry<Marker, Integer> entry2 : markers.entrySet()) { + boolean adjacent = entry1.getKey().isAdjacent(entry2.getKey()); + boolean distanceIsOne = Math.abs(entry1.getValue() - entry2.getValue()) == 1; + assertEquals(adjacent, distanceIsOne); + } + } + + assertEquals(Marker.below(allocator, Types.MinorType.INT.getType(), 1).greaterAdjacent(), Marker.exactly(allocator, Types.MinorType.INT.getType(), 1)); + assertEquals(Marker.exactly(allocator, Types.MinorType.INT.getType(), 1).greaterAdjacent(), Marker.above(allocator, Types.MinorType.INT.getType(), 1)); + assertEquals(Marker.above(allocator, Types.MinorType.INT.getType(), 1).lesserAdjacent(), Marker.exactly(allocator, Types.MinorType.INT.getType(), 1)); + assertEquals(Marker.exactly(allocator, Types.MinorType.INT.getType(), 1).lesserAdjacent(), Marker.below(allocator, Types.MinorType.INT.getType(), 1)); + + try { + Marker.below(allocator, Types.MinorType.INT.getType(), 1).lesserAdjacent(); + fail(); + } + catch (IllegalStateException e) { + } + + try { + Marker.above(allocator, Types.MinorType.INT.getType(), 1).greaterAdjacent(); + fail(); + } + catch (IllegalStateException e) { + } + + try { + Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()).lesserAdjacent(); + fail(); + } + catch (IllegalStateException e) { + } + + try { + Marker.lowerUnbounded(allocator, Types.MinorType.INT.getType()).greaterAdjacent(); + fail(); + } + catch (IllegalStateException e) { + } + + try { + Marker.upperUnbounded(allocator, Types.MinorType.INT.getType()).lesserAdjacent(); + fail(); + } + catch (IllegalStateException e) { + } + + try { + Marker.upperUnbounded(allocator, Types.MinorType.INT.getType()).greaterAdjacent(); + fail(); + } + catch (IllegalStateException e) { + } + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/RangeTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/RangeTest.java new file mode 100644 index 0000000000..b525ecb667 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/RangeTest.java @@ -0,0 +1,297 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import static org.apache.arrow.vector.types.Types.MinorType.BIGINT; +import static org.apache.arrow.vector.types.Types.MinorType.BIT; +import static org.apache.arrow.vector.types.Types.MinorType.FLOAT8; +import static org.apache.arrow.vector.types.Types.MinorType.VARCHAR; +import static org.junit.Assert.*; + +public class RangeTest +{ + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @SuppressWarnings({"unchecked", "rawtypes"}) + @Test(expected = IllegalArgumentException.class) + public void testMismatchedTypes() + throws Exception + { + // NEVER DO THIS + new Range(Marker.exactly(allocator, BIGINT.getType(), 1L), Marker.exactly(allocator, VARCHAR.getType(), "a")); + } + + @Test(expected = IllegalArgumentException.class) + public void testInvertedBounds() + throws Exception + { + new Range(Marker.exactly(allocator, BIGINT.getType(), 1L), Marker.exactly(allocator, BIGINT.getType(), 0L)); + } + + @Test(expected = IllegalArgumentException.class) + public void testLowerUnboundedOnly() + throws Exception + { + new Range(Marker.lowerUnbounded(allocator, BIGINT.getType()), Marker.lowerUnbounded(allocator, BIGINT.getType())); + } + + @Test(expected = IllegalArgumentException.class) + public void testUpperUnboundedOnly() + throws Exception + { + new Range(Marker.upperUnbounded(allocator, BIGINT.getType()), Marker.upperUnbounded(allocator, BIGINT.getType())); + } + + @Test + public void testSingleValue() + throws Exception + { + assertTrue(Range.range(allocator, BIGINT.getType(), 1L, true, 1L, true).isSingleValue()); + assertFalse(Range.range(allocator, BIGINT.getType(), 1L, true, 2L, true).isSingleValue()); + assertTrue(Range.range(allocator, FLOAT8.getType(), 1.1, true, 1.1, true).isSingleValue()); + assertTrue(Range.range(allocator, VARCHAR.getType(), "a", true, "a", true).isSingleValue()); + assertTrue(Range.range(allocator, BIT.getType(), true, true, true, true).isSingleValue()); + assertFalse(Range.range(allocator, BIT.getType(), false, true, true, true).isSingleValue()); + } + + @Test + public void testAllRange() + throws Exception + { + Range range = Range.all(allocator, BIGINT.getType()); + assertEquals(range.getLow(), Marker.lowerUnbounded(allocator, BIGINT.getType())); + assertEquals(range.getHigh(), Marker.upperUnbounded(allocator, BIGINT.getType())); + assertFalse(range.isSingleValue()); + assertTrue(range.isAll()); + assertEquals(range.getType(), BIGINT.getType()); + assertTrue(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertTrue(range.includes(Marker.below(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.above(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + 
public void testGreaterThanRange() + throws Exception + { + Range range = Range.greaterThan(allocator, BIGINT.getType(), 1L); + assertEquals(range.getLow(), Marker.above(allocator, BIGINT.getType(), 1L)); + assertEquals(range.getHigh(), Marker.upperUnbounded(allocator, BIGINT.getType())); + assertFalse(range.isSingleValue()); + assertFalse(range.isAll()); + assertEquals(range.getType(), BIGINT.getType()); + assertFalse(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 2L))); + assertTrue(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testGreaterThanOrEqualRange() + throws Exception + { + Range range = Range.greaterThanOrEqual(allocator, BIGINT.getType(), 1L); + assertEquals(range.getLow(), Marker.exactly(allocator, BIGINT.getType(), 1L)); + assertEquals(range.getHigh(), Marker.upperUnbounded(allocator, BIGINT.getType())); + assertFalse(range.isSingleValue()); + assertFalse(range.isAll()); + assertEquals(range.getType(), BIGINT.getType()); + assertFalse(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 2L))); + assertTrue(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testLessThanRange() + throws Exception + { + Range range = Range.lessThan(allocator, BIGINT.getType(), 1L); + assertEquals(range.getLow(), Marker.lowerUnbounded(allocator, BIGINT.getType())); + assertEquals(range.getHigh(), Marker.below(allocator, BIGINT.getType(), 1L)); + assertFalse(range.isSingleValue()); + assertFalse(range.isAll()); + assertEquals(range.getType(), BIGINT.getType()); + assertTrue(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertFalse(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testLessThanOrEqualRange() + throws Exception + { + Range range = Range.lessThanOrEqual(allocator, BIGINT.getType(), 1L); + assertEquals(range.getLow(), Marker.lowerUnbounded(allocator, BIGINT.getType())); + assertEquals(range.getHigh(), Marker.exactly(allocator, BIGINT.getType(), 1L)); + assertFalse(range.isSingleValue()); + assertFalse(range.isAll()); + assertEquals(range.getType(), BIGINT.getType()); + assertTrue(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 2L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertFalse(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testEqualRange() + throws Exception + { + Range range = Range.equal(allocator, BIGINT.getType(), 1L); + assertEquals(range.getLow(), Marker.exactly(allocator, BIGINT.getType(), 1L)); + assertEquals(range.getHigh(), Marker.exactly(allocator, BIGINT.getType(), 1L)); + assertTrue(range.isSingleValue()); + assertFalse(range.isAll()); + assertEquals(range.getType(), 
BIGINT.getType()); + assertFalse(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 2L))); + assertFalse(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testRange() + throws Exception + { + Range range = Range.range(allocator, BIGINT.getType(), 0L, false, 2L, true); + assertEquals(range.getLow(), Marker.above(allocator, BIGINT.getType(), 0L)); + assertEquals(range.getHigh(), Marker.exactly(allocator, BIGINT.getType(), 2L)); + assertFalse(range.isSingleValue()); + assertFalse(range.isAll()); + assertEquals(range.getType(), BIGINT.getType()); + assertFalse(range.includes(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(range.includes(Marker.exactly(allocator, BIGINT.getType(), 2L))); + assertFalse(range.includes(Marker.exactly(allocator, BIGINT.getType(), 3L))); + assertFalse(range.includes(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testGetSingleValue() + throws Exception + { + assertEquals(Range.equal(allocator, BIGINT.getType(), 0L).getSingleValue(), 0L); + try { + Range.lessThan(allocator, BIGINT.getType(), 0L).getSingleValue(); + fail(); + } + catch (IllegalStateException e) { + } + } + + @Test + public void testContains() + throws Exception + { + assertTrue(Range.all(allocator, BIGINT.getType()).contains(Range.all(allocator, BIGINT.getType()))); + assertTrue(Range.all(allocator, BIGINT.getType()).contains(Range.equal(allocator, BIGINT.getType(), 0L))); + assertTrue(Range.all(allocator, BIGINT.getType()).contains(Range.greaterThan(allocator, BIGINT.getType(), 0L))); + assertTrue(Range.equal(allocator, BIGINT.getType(), 0L).contains(Range.equal(allocator, BIGINT.getType(), 0L))); + assertFalse(Range.equal(allocator, BIGINT.getType(), 0L).contains(Range.greaterThan(allocator, BIGINT.getType(), 0L))); + assertFalse(Range.equal(allocator, BIGINT.getType(), 0L).contains(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L))); + assertFalse(Range.equal(allocator, BIGINT.getType(), 0L).contains(Range.all(allocator, BIGINT.getType()))); + assertTrue(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L).contains(Range.greaterThan(allocator, BIGINT.getType(), 0L))); + assertTrue(Range.greaterThan(allocator, BIGINT.getType(), 0L).contains(Range.greaterThan(allocator, BIGINT.getType(), 1L))); + assertFalse(Range.greaterThan(allocator, BIGINT.getType(), 0L).contains(Range.lessThan(allocator, BIGINT.getType(), 0L))); + assertTrue(Range.range(allocator, BIGINT.getType(), 0L, true, 2L, true).contains(Range.range(allocator, BIGINT.getType(), 1L, true, 2L, true))); + assertFalse(Range.range(allocator, BIGINT.getType(), 0L, true, 2L, true).contains(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false))); + } + + @Test + public void testSpan() + throws Exception + { + assertEquals(Range.greaterThan(allocator, BIGINT.getType(), 1L).span(Range.lessThanOrEqual(allocator, BIGINT.getType(), 2L)), Range.all(allocator, BIGINT.getType())); + assertEquals(Range.greaterThan(allocator, BIGINT.getType(), 2L).span(Range.lessThanOrEqual(allocator, BIGINT.getType(), 0L)), Range.all(allocator, 
BIGINT.getType())); + assertEquals(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).span(Range.equal(allocator, BIGINT.getType(), 2L)), Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false)); + assertEquals(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).span(Range.range(allocator, BIGINT.getType(), 2L, false, 10L, false)), Range.range(allocator, BIGINT.getType(), 1L, true, 10L, false)); + assertEquals(Range.greaterThan(allocator, BIGINT.getType(), 1L).span(Range.equal(allocator, BIGINT.getType(), 0L)), Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)); + assertEquals(Range.greaterThan(allocator, BIGINT.getType(), 1L).span(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 10L)), Range.greaterThan(allocator, BIGINT.getType(), 1L)); + assertEquals(Range.lessThan(allocator, BIGINT.getType(), 1L).span(Range.lessThanOrEqual(allocator, BIGINT.getType(), 1L)), Range.lessThanOrEqual(allocator, BIGINT.getType(), 1L)); + assertEquals(Range.all(allocator, BIGINT.getType()).span(Range.lessThanOrEqual(allocator, BIGINT.getType(), 1L)), Range.all(allocator, BIGINT.getType())); + } + + @Test + public void testOverlaps() + throws Exception + { + assertTrue(Range.greaterThan(allocator, BIGINT.getType(), 1L).overlaps(Range.lessThanOrEqual(allocator, BIGINT.getType(), 2L))); + assertFalse(Range.greaterThan(allocator, BIGINT.getType(), 2L).overlaps(Range.lessThan(allocator, BIGINT.getType(), 2L))); + assertTrue(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).overlaps(Range.equal(allocator, BIGINT.getType(), 2L))); + assertTrue(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).overlaps(Range.range(allocator, BIGINT.getType(), 2L, false, 10L, false))); + assertFalse(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).overlaps(Range.range(allocator, BIGINT.getType(), 3L, true, 10L, false))); + assertTrue(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, true).overlaps(Range.range(allocator, BIGINT.getType(), 3L, true, 10L, false))); + assertTrue(Range.all(allocator, BIGINT.getType()).overlaps(Range.equal(allocator, BIGINT.getType(), Long.MAX_VALUE))); + } + + @Test + public void testIntersect() + throws Exception + { + assertEquals(Range.greaterThan(allocator, BIGINT.getType(), 1L).intersect(Range.lessThanOrEqual(allocator, BIGINT.getType(), 2L)), Range.range(allocator, BIGINT.getType(), 1L, false, 2L, true)); + assertEquals(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).intersect(Range.equal(allocator, BIGINT.getType(), 2L)), Range.equal(allocator, BIGINT.getType(), 2L)); + assertEquals(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, false).intersect(Range.range(allocator, BIGINT.getType(), 2L, false, 10L, false)), Range.range(allocator, BIGINT.getType(), 2L, false, 3L, false)); + assertEquals(Range.range(allocator, BIGINT.getType(), 1L, true, 3L, true).intersect(Range.range(allocator, BIGINT.getType(), 3L, true, 10L, false)), Range.equal(allocator, BIGINT.getType(), 3L)); + assertEquals(Range.all(allocator, BIGINT.getType()).intersect(Range.equal(allocator, BIGINT.getType(), Long.MAX_VALUE)), Range.equal(allocator, BIGINT.getType(), Long.MAX_VALUE)); + } + + @Test + public void testExceptionalIntersect() + throws Exception + { + try { + Range.greaterThan(allocator, BIGINT.getType(), 2L).intersect(Range.lessThan(allocator, BIGINT.getType(), 2L)); + fail(); + } + catch (IllegalArgumentException e) { + } + + try { + Range.range(allocator, BIGINT.getType(), 1L, true, 3L, 
false).intersect(Range.range(allocator, BIGINT.getType(), 3L, true, 10L, false)); + fail(); + } + catch (IllegalArgumentException e) { + } + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/SortedRangeSetTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/SortedRangeSetTest.java new file mode 100644 index 0000000000..a02862f80b --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/domain/predicate/SortedRangeSetTest.java @@ -0,0 +1,457 @@ +package com.amazonaws.athena.connector.lambda.domain.predicate; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Iterables; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.util.Collections; +import java.util.stream.Collectors; + +import static org.apache.arrow.vector.types.Types.MinorType.BIGINT; +import static org.junit.Assert.*; + +public class SortedRangeSetTest +{ + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void testEmptySet() + throws Exception + { + SortedRangeSet rangeSet = SortedRangeSet.none(BIGINT.getType()); + assertEquals(rangeSet.getType(), BIGINT.getType()); + assertTrue(rangeSet.isNone()); + assertFalse(rangeSet.isAll()); + assertFalse(rangeSet.isSingleValue()); + assertTrue(Iterables.isEmpty(rangeSet.getOrderedRanges())); + assertEquals(rangeSet.getRangeCount(), 0); + assertEquals(rangeSet.complement(allocator), SortedRangeSet.all(allocator, BIGINT.getType())); + assertFalse(rangeSet.includesMarker(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertFalse(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertFalse(rangeSet.includesMarker(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testEntireSet() + throws Exception + { + SortedRangeSet rangeSet = SortedRangeSet.all(allocator, BIGINT.getType()); + assertEquals(rangeSet.getType(), BIGINT.getType()); + assertFalse(rangeSet.isNone()); + assertTrue(rangeSet.isAll()); + assertFalse(rangeSet.isSingleValue()); + assertEquals(rangeSet.getRangeCount(), 1); + assertEquals(rangeSet.complement(allocator), SortedRangeSet.none(BIGINT.getType())); + assertTrue(rangeSet.includesMarker(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertTrue(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertTrue(rangeSet.includesMarker(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testNullability() + throws Exception + { + SortedRangeSet 
rangeSet = SortedRangeSet.of(allocator, BIGINT.getType(), true, 10L, Collections.singletonList(10L)); + assertTrue(rangeSet.containsValue(Marker.nullMarker(allocator, BIGINT.getType()))); + assertFalse(rangeSet.containsValue(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertTrue(rangeSet.containsValue(Marker.exactly(allocator, BIGINT.getType(), 10L))); + } + + @Test + public void testSingleValue() + throws Exception + { + SortedRangeSet rangeSet = SortedRangeSet.of(allocator, BIGINT.getType(), 10L); + + SortedRangeSet complement = SortedRangeSet.of(true, + Range.greaterThan(allocator, BIGINT.getType(), 10L), + Range.lessThan(allocator, BIGINT.getType(), 10L)); + + assertEquals(rangeSet.getType(), BIGINT.getType()); + assertFalse(rangeSet.isNone()); + assertFalse(rangeSet.isAll()); + assertTrue(rangeSet.isSingleValue()); + assertTrue(Iterables.elementsEqual(rangeSet.getOrderedRanges(), ImmutableList.of(Range.equal(allocator, BIGINT.getType(), 10L)))); + assertEquals(rangeSet.getRangeCount(), 1); + assertEquals(rangeSet.complement(allocator), complement); + assertFalse(rangeSet.includesMarker(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertTrue(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 10L))); + assertFalse(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 9L))); + assertFalse(rangeSet.includesMarker(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testBoundedSet() + throws Exception + { + SortedRangeSet rangeSet = SortedRangeSet.of( + Range.equal(allocator, BIGINT.getType(), 10L), + Range.equal(allocator, BIGINT.getType(), 0L), + Range.range(allocator, BIGINT.getType(), 9L, true, 11L, false), + Range.equal(allocator, BIGINT.getType(), 0L), + Range.range(allocator, BIGINT.getType(), 2L, true, 4L, true), + Range.range(allocator, BIGINT.getType(), 4L, false, 5L, true)); + + ImmutableList<Range> normalizedResult = ImmutableList.of( + Range.equal(allocator, BIGINT.getType(), 0L), + Range.range(allocator, BIGINT.getType(), 2L, true, 5L, true), + Range.range(allocator, BIGINT.getType(), 9L, true, 11L, false)); + + SortedRangeSet complement = SortedRangeSet.of(true, + Range.lessThan(allocator, BIGINT.getType(), 0L), + Range.range(allocator, BIGINT.getType(), 0L, false, 2L, false), + Range.range(allocator, BIGINT.getType(), 5L, false, 9L, false), + Range.greaterThanOrEqual(allocator, BIGINT.getType(), 11L)); + + assertEquals(rangeSet.getType(), BIGINT.getType()); + assertFalse(rangeSet.isNone()); + assertFalse(rangeSet.isAll()); + assertFalse(rangeSet.isSingleValue()); + assertTrue(Iterables.elementsEqual(rangeSet.getOrderedRanges(), normalizedResult)); + assertEquals(rangeSet, SortedRangeSet.copyOf(BIGINT.getType(), normalizedResult, false)); + assertEquals(rangeSet.getRangeCount(), 3); + assertEquals(rangeSet.complement(allocator), complement); + assertFalse(rangeSet.includesMarker(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertTrue(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertFalse(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 1L))); + assertFalse(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 7L))); + assertTrue(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 9L))); + assertFalse(rangeSet.includesMarker(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testUnboundedSet() + throws Exception + { + SortedRangeSet rangeSet = SortedRangeSet.of(
Range.greaterThan(allocator, BIGINT.getType(), 10L), + Range.lessThanOrEqual(allocator, BIGINT.getType(), 0L), + Range.range(allocator, BIGINT.getType(), 2L, true, 4L, false), + Range.range(allocator, BIGINT.getType(), 4L, true, 6L, false), + Range.range(allocator, BIGINT.getType(), 1L, false, 2L, false), + Range.range(allocator, BIGINT.getType(), 9L, false, 11L, false)); + + ImmutableList<Range> normalizedResult = ImmutableList.of( + Range.lessThanOrEqual(allocator, BIGINT.getType(), 0L), + Range.range(allocator, BIGINT.getType(), 1L, false, 6L, false), + Range.greaterThan(allocator, BIGINT.getType(), 9L)); + + SortedRangeSet complement = SortedRangeSet.of(true, + Range.range(allocator, BIGINT.getType(), 0L, false, 1L, true), + Range.range(allocator, BIGINT.getType(), 6L, true, 9L, true)); + + assertEquals(rangeSet.getType(), BIGINT.getType()); + assertFalse(rangeSet.isNone()); + assertFalse(rangeSet.isAll()); + assertFalse(rangeSet.isSingleValue()); + assertTrue(Iterables.elementsEqual(rangeSet.getOrderedRanges(), normalizedResult)); + assertEquals(rangeSet, SortedRangeSet.copyOf(BIGINT.getType(), normalizedResult, false)); + assertEquals(rangeSet.getRangeCount(), 3); + assertEquals(rangeSet.complement(allocator), complement); + assertTrue(rangeSet.includesMarker(Marker.lowerUnbounded(allocator, BIGINT.getType()))); + assertTrue(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 0L))); + assertTrue(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 4L))); + assertFalse(rangeSet.includesMarker(Marker.exactly(allocator, BIGINT.getType(), 7L))); + assertTrue(rangeSet.includesMarker(Marker.upperUnbounded(allocator, BIGINT.getType()))); + } + + @Test + public void testGetSingleValue() + throws Exception + { + assertEquals(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).getSingleValue(), 0L); + try { + SortedRangeSet.all(allocator, BIGINT.getType()).getSingleValue(); + fail(); + } + catch (IllegalStateException e) { + } + } + + @Test + public void testSpan() + throws Exception + { + try { + SortedRangeSet.none(BIGINT.getType()).getSpan(); + fail(); + } + catch (IllegalStateException e) { + } + + assertEquals(SortedRangeSet.all(allocator, BIGINT.getType()).getSpan(), Range.all(allocator, BIGINT.getType())); + assertEquals(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).getSpan(), Range.equal(allocator, BIGINT.getType(), 0L)); + assertEquals(SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).getSpan(), Range.range(allocator, BIGINT.getType(), 0L, true, 1L, true)); + assertEquals(SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.greaterThan(allocator, BIGINT.getType(), 1L)).getSpan(), Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)); + assertEquals(SortedRangeSet.of(Range.lessThan(allocator, BIGINT.getType(), 0L), Range.greaterThan(allocator, BIGINT.getType(), 1L)).getSpan(), Range.all(allocator, BIGINT.getType())); + } + + @Test + public void testOverlaps() + throws Exception + { + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).overlaps(allocator, SortedRangeSet.all(allocator, BIGINT.getType()))); + assertFalse(SortedRangeSet.all(allocator, BIGINT.getType()).overlaps(allocator, SortedRangeSet.none(BIGINT.getType()))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).overlaps(allocator,
SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L), Range.lessThan(allocator, BIGINT.getType(), 0L)))); + + assertFalse(SortedRangeSet.none(BIGINT.getType()).overlaps(allocator, SortedRangeSet.all(allocator, BIGINT.getType()))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).overlaps(allocator, SortedRangeSet.none(BIGINT.getType()))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L), Range.lessThan(allocator, BIGINT.getType(), 0L)))); + + assertTrue(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).overlaps(allocator, SortedRangeSet.all(allocator, BIGINT.getType()))); + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).overlaps(allocator, SortedRangeSet.none(BIGINT.getType()))); + assertTrue(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).overlaps(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L))); + assertTrue(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).overlaps(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)))); + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L), Range.lessThan(allocator, BIGINT.getType(), 0L)))); + + assertTrue(SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).overlaps(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 1L)))); + assertFalse(SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).overlaps(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L)))); + assertTrue(SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertTrue(SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).overlaps(allocator, SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.of(Range.lessThan(allocator, BIGINT.getType(), 0L)).overlaps(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + } + + @Test + public void testContains() + throws Exception + { + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).contains(allocator, SortedRangeSet.all(allocator, BIGINT.getType()))); + assertTrue(SortedRangeSet.all(allocator, 
BIGINT.getType()).contains(allocator, SortedRangeSet.none(BIGINT.getType()))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).contains(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).contains(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertTrue(SortedRangeSet.all(allocator, BIGINT.getType()).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L), Range.lessThan(allocator, BIGINT.getType(), 0L)))); + + assertFalse(SortedRangeSet.none(BIGINT.getType()).contains(allocator, SortedRangeSet.all(allocator, BIGINT.getType()))); + assertTrue(SortedRangeSet.none(BIGINT.getType()).contains(allocator, SortedRangeSet.none(BIGINT.getType()))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).contains(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).contains(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.none(BIGINT.getType()).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L), Range.lessThan(allocator, BIGINT.getType(), 0L)))); + + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).contains(allocator, SortedRangeSet.all(allocator, BIGINT.getType()))); + assertTrue(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).contains(allocator, SortedRangeSet.none(BIGINT.getType()))); + assertTrue(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).contains(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L))); + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).contains(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)))); + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.of(allocator, BIGINT.getType(), 0L).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L), Range.lessThan(allocator, BIGINT.getType(), 0L)))); + + assertTrue(SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).contains(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 1L)))); + assertFalse(SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).contains(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 1L), Range.equal(allocator, BIGINT.getType(), 2L)))); + assertTrue(SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)).contains(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).contains(allocator, SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)))); + assertFalse(SortedRangeSet.of(Range.lessThan(allocator, BIGINT.getType(), 0L)).contains(allocator, 
SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)))); + } + + @Test + public void testIntersect() + throws Exception + { + assertEquals( + SortedRangeSet.none(BIGINT.getType()).intersect(allocator, + SortedRangeSet.none(BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).intersect(allocator, + SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.all(allocator, BIGINT.getType())); + + assertEquals( + SortedRangeSet.none(BIGINT.getType()).intersect(allocator, + SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + + assertEquals( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 1L), Range.equal(allocator, BIGINT.getType(), 2L), Range.equal(allocator, BIGINT.getType(), 3L)).intersect(allocator, + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L), Range.equal(allocator, BIGINT.getType(), 4L))), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L))); + + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).intersect(allocator, + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L), Range.equal(allocator, BIGINT.getType(), 4L))), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L), Range.equal(allocator, BIGINT.getType(), 4L))); + + assertEquals( + SortedRangeSet.of(Range.range(allocator, BIGINT.getType(), 0L, true, 4L, false)).intersect(allocator, + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L), Range.greaterThan(allocator, BIGINT.getType(), 3L))), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L), Range.range(allocator, BIGINT.getType(), 3L, false, 4L, false))); + + assertEquals( + SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)).intersect(allocator, + SortedRangeSet.of(Range.lessThanOrEqual(allocator, BIGINT.getType(), 0L))), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L))); + + assertEquals( + SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), -1L)).intersect(allocator, + SortedRangeSet.of(Range.lessThanOrEqual(allocator, BIGINT.getType(), 1L))), + SortedRangeSet.of(Range.range(allocator, BIGINT.getType(), -1L, true, 1L, true))); + } + + @Test + public void testUnion() + throws Exception + { + assertUnion(SortedRangeSet.none(BIGINT.getType()), SortedRangeSet.none(BIGINT.getType()), SortedRangeSet.none(BIGINT.getType())); + assertUnion(SortedRangeSet.all(allocator, BIGINT.getType()), SortedRangeSet.all(allocator, BIGINT.getType()), SortedRangeSet.all(allocator, BIGINT.getType())); + assertUnion(SortedRangeSet.none(BIGINT.getType()), SortedRangeSet.all(allocator, BIGINT.getType()), SortedRangeSet.all(allocator, BIGINT.getType())); + + assertUnion( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 1L), Range.equal(allocator, BIGINT.getType(), 2L)), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 2L), Range.equal(allocator, BIGINT.getType(), 3L)), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 1L), Range.equal(allocator, BIGINT.getType(), 2L), Range.equal(allocator, BIGINT.getType(), 3L))); + + assertUnion(SortedRangeSet.all(allocator, BIGINT.getType()), SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L)), SortedRangeSet.all(allocator, BIGINT.getType())); + + assertUnion( + SortedRangeSet.of(Range.range(allocator, BIGINT.getType(), 0L, true, 4L, false)), + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 3L)), + 
SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L))); + + assertUnion( + SortedRangeSet.of(Range.greaterThanOrEqual(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(Range.lessThanOrEqual(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(Range.all(allocator, BIGINT.getType()))); + + assertUnion( + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(true, Range.lessThan(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).complement(allocator)); + } + + @Test + public void testSubtract() + throws Exception + { + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).subtract(allocator, SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).subtract(allocator, SortedRangeSet.none(BIGINT.getType())), + SortedRangeSet.all(allocator, BIGINT.getType())); + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).subtract(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).complement(allocator)); + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).subtract(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L))), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).complement(allocator)); + assertEquals( + SortedRangeSet.all(allocator, BIGINT.getType()).subtract(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))), + SortedRangeSet.of(true, Range.lessThanOrEqual(allocator, BIGINT.getType(), 0L))); + + assertEquals( + SortedRangeSet.none(BIGINT.getType()).subtract(allocator, SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.none(BIGINT.getType()).subtract(allocator, SortedRangeSet.none(BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.none(BIGINT.getType()).subtract(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.none(BIGINT.getType()).subtract(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L))), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.none(BIGINT.getType()).subtract(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))), + SortedRangeSet.none(BIGINT.getType())); + + assertEquals( + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).subtract(allocator, SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).subtract(allocator, SortedRangeSet.none(BIGINT.getType())), + SortedRangeSet.of(allocator, BIGINT.getType(), 0L)); + assertEquals( + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).subtract(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).subtract(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L))), + SortedRangeSet.none(BIGINT.getType())); + + assertEquals( + SortedRangeSet.of(allocator, BIGINT.getType(), 0L).subtract(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))), + SortedRangeSet.of(allocator, BIGINT.getType(), 0L)); + + assertEquals( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).subtract(allocator, SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).subtract(allocator, SortedRangeSet.none(BIGINT.getType())), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L))); + assertEquals( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).subtract(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(allocator, BIGINT.getType(), 1L)); + assertEquals( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).subtract(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L))), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L)).subtract(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))), + SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L))); + + assertEquals( + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).subtract(allocator, SortedRangeSet.all(allocator, BIGINT.getType())), + SortedRangeSet.none(BIGINT.getType())); + assertEquals( + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).subtract(allocator, SortedRangeSet.none(BIGINT.getType())), + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))); + assertEquals( + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).subtract(allocator, SortedRangeSet.of(allocator, BIGINT.getType(), 0L)), + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))); + assertEquals( + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).subtract(allocator, SortedRangeSet.of(Range.equal(allocator, BIGINT.getType(), 0L), Range.equal(allocator, BIGINT.getType(), 1L))), + SortedRangeSet.of(Range.range(allocator, BIGINT.getType(), 0L, false, 1L, false), Range.greaterThan(allocator, BIGINT.getType(), 1L))); + assertEquals( + SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L)).subtract(allocator, SortedRangeSet.of(Range.greaterThan(allocator, BIGINT.getType(), 0L))), + SortedRangeSet.none(BIGINT.getType())); + } + + private void assertUnion(SortedRangeSet first, SortedRangeSet second, SortedRangeSet expected) + { + assertEquals(first.union(allocator, second), expected); + assertEquals(first.union(allocator, ImmutableList.of(first, second)), expected); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/examples/ExampleMetadataHandlerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/examples/ExampleMetadataHandlerTest.java new file mode 100644 index 0000000000..fb8e5122f5 --- /dev/null +++
b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/examples/ExampleMetadataHandlerTest.java @@ -0,0 +1,326 @@ +package com.amazonaws.athena.connector.lambda.examples; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataService; +import com.amazonaws.athena.connector.lambda.security.IdentityUtil; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.athena.connector.lambda.serde.ObjectMapperUtil; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.lambda.invoke.LambdaFunctionException; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import static 
com.amazonaws.athena.connector.lambda.examples.ExampleMetadataHandler.MAX_SPLITS_PER_REQUEST; +import static org.junit.Assert.*; +import static org.mockito.Mockito.mock; + +public class ExampleMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(ExampleMetadataHandlerTest.class); + + private BlockAllocatorImpl allocator; + private ExampleMetadataHandler metadataHandler; + + @Before + public void setUp() + { + logger.info("setUpBefore - enter"); + allocator = new BlockAllocatorImpl(); + metadataHandler = new ExampleMetadataHandler(new LocalKeyFactory(), + mock(AWSSecretsManager.class), + mock(AmazonAthena.class), + "spill-bucket", + "spill-prefix"); + logger.info("setUpBefore - exit"); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void doListSchemas() + { + logger.info("doListSchemas - enter"); + ListSchemasRequest req = new ListSchemasRequest(IdentityUtil.fakeIdentity(), "queryId", "default"); + ObjectMapperUtil.assertSerialization(req, req.getClass()); + ListSchemasResponse res = metadataHandler.doListSchemaNames(allocator, req); + ObjectMapperUtil.assertSerialization(res, res.getClass()); + logger.info("doListSchemas - {}", res.getSchemas()); + assertFalse(res.getSchemas().isEmpty()); + logger.info("doListSchemas - exit"); + } + + @Test + public void doListTables() + { + logger.info("doListTables - enter"); + ListTablesRequest req = new ListTablesRequest(IdentityUtil.fakeIdentity(), "queryId", "default", null); + ObjectMapperUtil.assertSerialization(req, req.getClass()); + ListTablesResponse res = metadataHandler.doListTables(allocator, req); + ObjectMapperUtil.assertSerialization(res, res.getClass()); + logger.info("doListTables - {}", res.getTables()); + assertFalse(res.getTables().isEmpty()); + logger.info("doListTables - exit"); + } + + @Test + public void doGetTable() + { + logger.info("doGetTable - enter"); + GetTableRequest req = new GetTableRequest(IdentityUtil.fakeIdentity(), "queryId", "default", + new TableName("custom_source", "fake_table")); + ObjectMapperUtil.assertSerialization(req, req.getClass()); + GetTableResponse res = metadataHandler.doGetTable(allocator, req); + ObjectMapperUtil.assertSerialization(res, res.getClass()); + assertTrue(res.getSchema().getFields().size() > 0); + assertTrue(res.getSchema().getCustomMetadata().size() > 0); + logger.info("doGetTable - {}", res); + logger.info("doGetTable - exit"); + } + + @Test(expected = LambdaFunctionException.class) + public void doGetTableFail() + { + try { + logger.info("doGetTableFail - enter"); + GetTableRequest req = new GetTableRequest(IdentityUtil.fakeIdentity(), "queryId", "default", + new TableName("lambda", "fake")); + metadataHandler.doGetTable(allocator, req); + } + catch (Exception ex) { + logger.info("doGetTableFail: ", ex); + throw new LambdaFunctionException(ex.getMessage(), false, "repackaged"); + } + } + + /** + * 200,000,000 partitions pruned down to 38,000 and transmitted in 25 seconds + * + * @throws Exception + */ + @Test + public void doGetTableLayout() + throws Exception + { + logger.info("doGetTableLayout - enter"); + + Schema tableSchema = SchemaBuilder.newBuilder() + .addIntField("day") + .addIntField("month") + .addIntField("year") + .build(); + + Set<String> partitionCols = new HashSet<>(); + partitionCols.add("day"); + partitionCols.add("month"); + partitionCols.add("year"); + + Map<String, ValueSet> constraintsMap = new HashMap<>(); + + constraintsMap.put("day",
+        constraintsMap.put("day", SortedRangeSet.copyOf(Types.MinorType.INT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 20)), false));
+
+        constraintsMap.put("month", SortedRangeSet.copyOf(Types.MinorType.INT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 2)), false));
+
+        constraintsMap.put("year", SortedRangeSet.copyOf(Types.MinorType.INT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 1900)), false));
+
+        GetTableLayoutRequest req = null;
+        GetTableLayoutResponse res = null;
+        try {
+            req = new GetTableLayoutRequest(IdentityUtil.fakeIdentity(), "queryId", "default",
+                    new TableName("schema1", "table1"),
+                    new Constraints(constraintsMap),
+                    tableSchema,
+                    partitionCols);
+            ObjectMapperUtil.assertSerialization(req, req.getClass());
+
+            res = metadataHandler.doGetTableLayout(allocator, req);
+            ObjectMapperUtil.assertSerialization(res, res.getClass());
+
+            logger.info("doGetTableLayout - {}", res);
+            Block partitions = res.getPartitions();
+            for (int row = 0; row < partitions.getRowCount() && row < 10; row++) {
+                logger.info("doGetTableLayout:{} {}", row, BlockUtils.rowToString(partitions, row));
+            }
+            assertTrue(partitions.getRowCount() > 0);
+            logger.info("doGetTableLayout: partitions[{}]", partitions.getRowCount());
+        }
+        finally {
+            try {
+                req.close();
+                res.close();
+            }
+            catch (Exception ex) {
+                logger.error("doGetTableLayout: ", ex);
+            }
+        }
+
+        logger.info("doGetTableLayout - exit");
+    }
+
+    /**
+     * The goal of this test is to cover the happy case for getting splits and, specifically, to exercise the
+     * continuation token logic.
+     */
+    @Test
+    public void doGetSplits()
+    {
+        logger.info("doGetSplits: enter");
+
+        String yearCol = "year";
+        String monthCol = "month";
+        String dayCol = "day";
+
+        //This is the schema that ExampleMetadataHandler has laid out for a 'Partition' so we need to populate this
+        //minimal set of info here.
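+        //Shape of the paging protocol exercised below (a reader's sketch inferred from this
+        //test, not normative SDK documentation): call doGetSplits, read the continuation
+        //token off the response, and re-issue the same request with that token until it
+        //comes back null; each page should carry at most MAX_SPLITS_PER_REQUEST splits.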
+        Schema schema = SchemaBuilder.newBuilder()
+                .addField(yearCol, new ArrowType.Int(16, false))
+                .addField(monthCol, new ArrowType.Int(16, false))
+                .addField(dayCol, new ArrowType.Int(16, false))
+                .addField(ExampleMetadataHandler.PARTITION_LOCATION, new ArrowType.Utf8())
+                .addField(ExampleMetadataHandler.SERDE, new ArrowType.Utf8())
+                .build();
+
+        List<String> partitionCols = new ArrayList<>();
+        partitionCols.add(yearCol);
+        partitionCols.add(monthCol);
+        partitionCols.add(dayCol);
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+
+        constraintsMap.put(dayCol, SortedRangeSet.copyOf(Types.MinorType.INT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 20)), false));
+
+        Block partitions = allocator.createBlock(schema);
+
+        int num_partitions = 100;
+        for (int i = 0; i < num_partitions; i++) {
+            BlockUtils.setValue(partitions.getFieldVector(yearCol), i, 2016 + i);
+            BlockUtils.setValue(partitions.getFieldVector(monthCol), i, (i % 12) + 1);
+            BlockUtils.setValue(partitions.getFieldVector(dayCol), i, (i % 28) + 1);
+            BlockUtils.setValue(partitions.getFieldVector(ExampleMetadataHandler.PARTITION_LOCATION), i, String.valueOf(i));
+            BlockUtils.setValue(partitions.getFieldVector(ExampleMetadataHandler.SERDE), i, "TextInputType");
+        }
+        partitions.setRowCount(num_partitions);
+
+        String continuationToken = null;
+        GetSplitsRequest originalReq = new GetSplitsRequest(IdentityUtil.fakeIdentity(), "queryId", "catalog_name",
+                new TableName("schema", "table_name"),
+                partitions,
+                partitionCols,
+                new Constraints(constraintsMap),
+                continuationToken);
+        int numContinuations = 0;
+        do {
+            GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken);
+            ObjectMapperUtil.assertSerialization(req, req.getClass());
+
+            logger.info("doGetSplits: req[{}]", req);
+            metadataHandler.setEncryption(numContinuations % 2 == 0);
+            logger.info("doGetSplits: Toggle encryption " + (numContinuations % 2 == 0));
+
+            MetadataResponse rawResponse = metadataHandler.doGetSplits(allocator, req);
+            ObjectMapperUtil.assertSerialization(rawResponse, rawResponse.getClass());
+            assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType());
+
+            GetSplitsResponse response = (GetSplitsResponse) rawResponse;
+            continuationToken = response.getContinuationToken();
+
+            logger.info("doGetSplits: continuationToken[{}] - numSplits[{}] - maxSplits[{}]",
+                    new Object[] {continuationToken, response.getSplits().size(), MAX_SPLITS_PER_REQUEST});
+
+            for (Split nextSplit : response.getSplits()) {
+                if (numContinuations % 2 == 0) {
+                    assertNotNull(nextSplit.getEncryptionKey());
+                }
+                else {
+                    assertNull(nextSplit.getEncryptionKey());
+                }
+                assertNotNull(nextSplit.getProperty(SplitProperties.LOCATION.getId()));
+                assertNotNull(nextSplit.getProperty(SplitProperties.SERDE.getId()));
+                assertNotNull(nextSplit.getProperty(SplitProperties.SPLIT_PART.getId()));
+            }
+
+            assertTrue("Continuation criteria violated", (response.getSplits().size() == MAX_SPLITS_PER_REQUEST &&
+                    response.getContinuationToken() != null) || response.getSplits().size() < MAX_SPLITS_PER_REQUEST);
+
+            if (continuationToken != null) {
+                numContinuations++;
+            }
+        }
+        while (continuationToken != null);
+
+        assertTrue(numContinuations > 0);
+
+        logger.info("doGetSplits: exit");
+    }
+}
diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/examples/ExampleRecordHandlerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/examples/ExampleRecordHandlerTest.java
new file
mode 100644 index 0000000000..f71a31c32e --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/examples/ExampleRecordHandlerTest.java @@ -0,0 +1,340 @@ +package com.amazonaws.athena.connector.lambda.examples; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.AllOrNoneValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordRequest; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.records.RecordService; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKey; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.IdentityUtil; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.serde.ObjectMapperUtil; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import com.google.common.io.ByteStreams; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import 
org.junit.Test; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.stubbing.Answer; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.anyObject; +import static org.mockito.Matchers.anyString; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +public class ExampleRecordHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(ExampleMetadataHandlerTest.class); + + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + private RecordService recordService; + private List mockS3Storage = new ArrayList<>(); + private AmazonS3 amazonS3; + private AWSSecretsManager awsSecretsManager; + private AmazonAthena athena; + private S3BlockSpillReader spillReader; + private BlockAllocatorImpl allocator; + private Schema schemaForRead; + + @Before + public void setUp() + { + logger.info("setUpBefore - enter"); + + schemaForRead = SchemaBuilder.newBuilder() + .addField("col1", new ArrowType.Int(32, true)) + .addField("col2", new ArrowType.Utf8()) + .addField("col3", new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)) + .addField("int", Types.MinorType.INT.getType()) + .addField("tinyint", Types.MinorType.TINYINT.getType()) + .addField("smallint", Types.MinorType.SMALLINT.getType()) + .addField("bigint", Types.MinorType.BIGINT.getType()) + .addField("uint1", Types.MinorType.UINT1.getType()) + .addField("uint2", Types.MinorType.UINT2.getType()) + .addField("uint4", Types.MinorType.UINT4.getType()) + .addField("uint8", Types.MinorType.UINT8.getType()) + .addField("float4", Types.MinorType.FLOAT4.getType()) + .addField("float8", Types.MinorType.FLOAT8.getType()) + .addField("bit", Types.MinorType.BIT.getType()) + .addField("varchar", Types.MinorType.VARCHAR.getType()) + .addField("varbinary", Types.MinorType.VARBINARY.getType()) + .addField("datemilli", Types.MinorType.DATEMILLI.getType()) + .addField("dateday", Types.MinorType.DATEDAY.getType()) + .addField("decimal", new ArrowType.Decimal(10, 2)) + .addField("decimalLong", new ArrowType.Decimal(36, 2)) //Example of a List of Structs + .addField( + FieldBuilder.newBuilder("list", new ArrowType.List()) + .addField( + FieldBuilder.newBuilder("innerStruct", Types.MinorType.STRUCT.getType()) + .addStringField("varchar") + .addBigIntField("bigint") + .build()) + .build()) + //Example of a List Of Lists + .addField( + FieldBuilder.newBuilder("outerlist", new ArrowType.List()) + .addListField("innerList", Types.MinorType.VARCHAR.getType()) + .build()) + .addMetadata("partitionCols", "col1") + .build(); + + allocator = new BlockAllocatorImpl(); + + amazonS3 = mock(AmazonS3.class); + awsSecretsManager = mock(AWSSecretsManager.class); + athena = mock(AmazonAthena.class); + + when(amazonS3.putObject(anyObject(), anyObject(), anyObject(), anyObject())) + .thenAnswer(new Answer() + { + @Override + public Object answer(InvocationOnMock invocationOnMock) + throws Throwable + { + InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2]; + ByteHolder byteHolder = new ByteHolder(); + byteHolder.setBytes(ByteStreams.toByteArray(inputStream)); + mockS3Storage.add(byteHolder); + return mock(PutObjectResult.class); + } + }); + + when(amazonS3.getObject(anyString(), anyString())) + .thenAnswer(new 
Answer() + { + @Override + public Object answer(InvocationOnMock invocationOnMock) + throws Throwable + { + S3Object mockObject = mock(S3Object.class); + ByteHolder byteHolder = mockS3Storage.get(0); + mockS3Storage.remove(0); + when(mockObject.getObjectContent()).thenReturn( + new S3ObjectInputStream( + new ByteArrayInputStream(byteHolder.getBytes()), null)); + return mockObject; + } + }); + + recordService = new LocalHandler(allocator, amazonS3, awsSecretsManager, athena); + spillReader = new S3BlockSpillReader(amazonS3, allocator); + + logger.info("setUpBefore - exit"); + } + + @After + public void after() + { + allocator.close(); + } + + @Test + public void doReadRecordsNoSpill() + { + logger.info("doReadRecordsNoSpill: enter"); + for (int i = 0; i < 2; i++) { + EncryptionKey encryptionKey = (i % 2 == 0) ? keyFactory.create() : null; + logger.info("doReadRecordsNoSpill: Using encryptionKey[" + encryptionKey + "]"); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("col3", SortedRangeSet.copyOf(Types.MinorType.FLOAT8.getType(), + ImmutableList.of(Range.equal(allocator, Types.MinorType.FLOAT8.getType(), 22.0D)), false)); + + ReadRecordsRequest request = new ReadRecordsRequest(IdentityUtil.fakeIdentity(), + "catalog", + "queryId-" + System.currentTimeMillis(), + new TableName("schema", "table"), + schemaForRead, + Split.newBuilder(makeSpillLocation(), encryptionKey).add("col1", "10").build(), + new Constraints(constraintsMap), + 100_000_000_000L, //100GB don't expect this to spill + 100_000_000_000L + ); + ObjectMapperUtil.assertSerialization(request, request.getClass()); + + RecordResponse rawResponse = recordService.readRecords(request); + ObjectMapperUtil.assertSerialization(rawResponse, rawResponse.getClass()); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("doReadRecordsNoSpill: rows[{}]", response.getRecordCount()); + + assertTrue(response.getRecords().getRowCount() == 1); + logger.info("doReadRecordsNoSpill: {}", BlockUtils.rowToString(response.getRecords(), 0)); + } + logger.info("doReadRecordsNoSpill: exit"); + } + + @Test + public void doReadRecordsSpill() + throws Exception + { + logger.info("doReadRecordsSpill: enter"); + for (int i = 0; i < 2; i++) { + EncryptionKey encryptionKey = (i % 2 == 0) ? 
keyFactory.create() : null; + logger.info("doReadRecordsSpill: Using encryptionKey[" + encryptionKey + "]"); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("col3", SortedRangeSet.copyOf(Types.MinorType.FLOAT8.getType(), + ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.FLOAT8.getType(), -10000D)), false)); + constraintsMap.put("unknown", EquatableValueSet.newBuilder(allocator, Types.MinorType.FLOAT8.getType(), false, true).add(1.1D).build()); + constraintsMap.put("unknown2", new AllOrNoneValueSet(Types.MinorType.FLOAT8.getType(), false, true)); + + ReadRecordsRequest request = new ReadRecordsRequest(IdentityUtil.fakeIdentity(), + "catalog", + "queryId-" + System.currentTimeMillis(), + new TableName("schema", "table"), + schemaForRead, + Split.newBuilder(makeSpillLocation(), encryptionKey).add("col1", "1").build(), + new Constraints(constraintsMap), + 1_600_000L, //~1.5MB so we should see some spill + 1000L + ); + ObjectMapperUtil.assertSerialization(request, request.getClass()); + + RecordResponse rawResponse = recordService.readRecords(request); + ObjectMapperUtil.assertSerialization(rawResponse, rawResponse.getClass()); + + assertTrue(rawResponse instanceof RemoteReadRecordsResponse); + + try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) { + logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size()); + + assertTrue(response.getNumberBlocks() > 1); + + int blockNum = 0; + for (SpillLocation next : response.getRemoteBlocks()) { + S3SpillLocation spillLocation = (S3SpillLocation) next; + try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) { + + logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount()); + // assertTrue(++blockNum < response.getRemoteBlocks().size() && block.getRowCount() > 10_000); + + logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0)); + assertNotNull(BlockUtils.rowToString(block, 0)); + } + } + } + } + logger.info("doReadRecordsSpill: exit"); + } + + private static class LocalHandler + implements RecordService + { + private ExampleRecordHandler handler; + private final BlockAllocatorImpl allocator; + + public LocalHandler(BlockAllocatorImpl allocator, AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena) + { + handler = new ExampleRecordHandler(amazonS3, secretsManager, athena); + handler.setNumRows(20_000);//lower number for faster unit tests vs integ tests + this.allocator = allocator; + } + + @Override + public RecordResponse readRecords(RecordRequest request) + { + + try { + switch (request.getRequestType()) { + case READ_RECORDS: + ReadRecordsRequest req = (ReadRecordsRequest) request; + RecordResponse response = handler.doReadRecords(allocator, req); + return response; + default: + throw new RuntimeException("Unknown request type " + request.getRequestType()); + } + } + catch (Exception ex) { + throw new RuntimeException(ex); + } + } + } + + private SpillLocation makeSpillLocation() + { + return S3SpillLocation.newBuilder() + .withBucket("athena-virtuoso-test") + .withPrefix("lambda-spill") + .withQueryId(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + } + + private class ByteHolder + { + private byte[] bytes; + + public void setBytes(byte[] bytes) + { + this.bytes = bytes; + } + + public byte[] getBytes() + { + return bytes; + } + } +} diff --git 
a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/handlers/CompositeHandlerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/handlers/CompositeHandlerTest.java new file mode 100644 index 0000000000..001bbf1e9c --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/handlers/CompositeHandlerTest.java @@ -0,0 +1,229 @@ +package com.amazonaws.athena.connector.lambda.handlers; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.examples.ExampleMetadataHandlerTest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.request.PingRequest; +import com.amazonaws.athena.connector.lambda.request.PingResponse; +import com.amazonaws.athena.connector.lambda.security.IdentityUtil; +import com.amazonaws.athena.connector.lambda.serde.ObjectMapperFactory; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayOutputStream; +import java.util.Collections; +import java.util.HashMap; +import java.util.UUID; + +import static org.mockito.Matchers.any; +import 
static org.mockito.Mockito.mock; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +public class CompositeHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(ExampleMetadataHandlerTest.class); + + private MetadataHandler mockMetadataHandler; + private RecordHandler mockRecordHandler; + private CompositeHandler compositeHandler; + private BlockAllocatorImpl allocator; + private ObjectMapper objectMapper; + private Schema schemaForRead; + + @Before + public void setUp() + throws Exception + { + allocator = new BlockAllocatorImpl(); + objectMapper = ObjectMapperFactory.create(allocator); + mockMetadataHandler = mock(MetadataHandler.class); + mockRecordHandler = mock(RecordHandler.class); + + schemaForRead = SchemaBuilder.newBuilder() + .addField("col1", new ArrowType.Int(32, true)) + .build(); + + when(mockMetadataHandler.doGetTableLayout(any(BlockAllocatorImpl.class), any(GetTableLayoutRequest.class))) + .thenReturn(new GetTableLayoutResponse("catalog", + new TableName("schema", "table"), + BlockUtils.newBlock(allocator, "col1", Types.MinorType.BIGINT.getType(), 1L))); + + when(mockMetadataHandler.doListTables(any(BlockAllocatorImpl.class), any(ListTablesRequest.class))) + .thenReturn(new ListTablesResponse("catalog", + Collections.singletonList(new TableName("schema", "table")))); + + when(mockMetadataHandler.doGetTable(any(BlockAllocatorImpl.class), any(GetTableRequest.class))) + .thenReturn(new GetTableResponse("catalog", + new TableName("schema", "table"), + SchemaBuilder.newBuilder().addStringField("col1").build())); + + when(mockMetadataHandler.doListSchemaNames(any(BlockAllocatorImpl.class), any(ListSchemasRequest.class))) + .thenReturn(new ListSchemasResponse("catalog", Collections.singleton("schema1"))); + + when(mockMetadataHandler.doGetSplits(any(BlockAllocatorImpl.class), any(GetSplitsRequest.class))) + .thenReturn(new GetSplitsResponse("catalog", Split.newBuilder(null, null).build())); + + when(mockMetadataHandler.doPing(any(PingRequest.class))) + .thenReturn(new PingResponse("catalog", "queryId", "type", 23)); + + when(mockRecordHandler.doReadRecords(any(BlockAllocatorImpl.class), any(ReadRecordsRequest.class))) + .thenReturn(new ReadRecordsResponse("catalog", + BlockUtils.newEmptyBlock(allocator, "col", new ArrowType.Int(32, true)))); + + compositeHandler = new CompositeHandler(mockMetadataHandler, mockRecordHandler); + } + + @After + public void after() + { + allocator.close(); + } + + @Test + public void doReadRecords() + throws Exception + { + logger.info("doReadRecords - enter"); + ReadRecordsRequest req = new ReadRecordsRequest(IdentityUtil.fakeIdentity(), + "catalog", + "queryId-" + System.currentTimeMillis(), + new TableName("schema", "table"), + schemaForRead, + Split.newBuilder(S3SpillLocation.newBuilder() + .withBucket("athena-virtuoso-test") + .withPrefix("lambda-spill") + .withQueryId(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(), null).build(), + new Constraints(new HashMap<>()), + 100_000_000_000L, //100GB don't expect this to spill + 100_000_000_000L + ); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockRecordHandler, times(1)) + .doReadRecords(any(BlockAllocator.class), any(ReadRecordsRequest.class)); + logger.info("readRecords - exit"); + } + + @Test + public void doListSchemaNames() + throws Exception + { + logger.info("doListSchemas - 
enter"); + ListSchemasRequest req = mock(ListSchemasRequest.class); + when(req.getRequestType()).thenReturn(MetadataRequestType.LIST_SCHEMAS); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockMetadataHandler, times(1)).doListSchemaNames(any(BlockAllocatorImpl.class), any(ListSchemasRequest.class)); + logger.info("doListSchemas - exit"); + } + + @Test + public void doListTables() + throws Exception + { + logger.info("doListTables - enter"); + ListTablesRequest req = mock(ListTablesRequest.class); + when(req.getRequestType()).thenReturn(MetadataRequestType.LIST_TABLES); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockMetadataHandler, times(1)).doListTables(any(BlockAllocatorImpl.class), any(ListTablesRequest.class)); + logger.info("doListTables - exit"); + } + + @Test + public void doGetTable() + throws Exception + { + logger.info("doGetTable - enter"); + GetTableRequest req = mock(GetTableRequest.class); + when(req.getRequestType()).thenReturn(MetadataRequestType.GET_TABLE); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockMetadataHandler, times(1)).doGetTable(any(BlockAllocatorImpl.class), any(GetTableRequest.class)); + logger.info("doGetTable - exit"); + } + + @Test + public void doGetTableLayout() + throws Exception + { + logger.info("doGetTableLayout - enter"); + GetTableLayoutRequest req = mock(GetTableLayoutRequest.class); + when(req.getRequestType()).thenReturn(MetadataRequestType.GET_TABLE_LAYOUT); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockMetadataHandler, times(1)).doGetTableLayout(any(BlockAllocatorImpl.class), any(GetTableLayoutRequest.class)); + logger.info("doGetTableLayout - exit"); + } + + @Test + public void doGetSplits() + throws Exception + { + logger.info("doGetSplits - enter"); + GetSplitsRequest req = mock(GetSplitsRequest.class); + when(req.getRequestType()).thenReturn(MetadataRequestType.GET_SPLITS); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockMetadataHandler, times(1)).doGetSplits(any(BlockAllocatorImpl.class), any(GetSplitsRequest.class)); + logger.info("doGetSplits - exit"); + } + + @Test + public void doPing() + throws Exception + { + logger.info("doPing - enter"); + PingRequest req = mock(PingRequest.class); + when(req.getCatalogName()).thenReturn("catalog"); + when(req.getQueryId()).thenReturn("queryId"); + compositeHandler.handleRequest(allocator, req, new ByteArrayOutputStream(), objectMapper); + verify(mockMetadataHandler, times(1)).doPing(any(PingRequest.class)); + logger.info("doPing - exit"); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/handlers/GlueMetadataHandlerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/handlers/GlueMetadataHandlerTest.java new file mode 100644 index 0000000000..7f0d0e7d47 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/handlers/GlueMetadataHandlerTest.java @@ -0,0 +1,285 @@ +package com.amazonaws.athena.connector.lambda.handlers; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.IdentityUtil; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.model.Column; +import com.amazonaws.services.glue.model.Database; +import com.amazonaws.services.glue.model.GetDatabasesRequest; +import com.amazonaws.services.glue.model.GetDatabasesResult; +import com.amazonaws.services.glue.model.GetTableResult; +import com.amazonaws.services.glue.model.GetTablesRequest; +import com.amazonaws.services.glue.model.GetTablesResult; +import com.amazonaws.services.glue.model.StorageDescriptor; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.stream.Collectors; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class GlueMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(MockitoJUnitRunner.class); + + private String accountId = IdentityUtil.fakeIdentity().getAccount(); + private String queryId = "queryId"; + private String catalog = 
"default"; + private String schema = "database1"; + private String table = "table1"; + + private GlueMetadataHandler handler; + + private BlockAllocatorImpl allocator; + + @Mock + private AWSGlue mockGlue; + + @Before + public void setUp() + throws Exception + { + handler = new GlueMetadataHandler(mockGlue, + new LocalKeyFactory(), + mock(AWSSecretsManager.class), + mock(AmazonAthena.class), + "glue-test", + "spill-bucket", + "spill-prefix") + { + @Override + public GetTableLayoutResponse doGetTableLayout(BlockAllocator blockAllocator, GetTableLayoutRequest request) + { + throw new UnsupportedOperationException(); + } + + @Override + public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception + { + throw new UnsupportedOperationException(); + } + + @Override + public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest request) + { + throw new UnsupportedOperationException(); + } + + @Override + protected Field convertField(String name, String type) + { + if ("int".equals(type)) { + return FieldBuilder.newBuilder(name, Types.MinorType.INT.getType()).build(); + } + else if ("bigint".equals(type)) { + return FieldBuilder.newBuilder(name, Types.MinorType.BIGINT.getType()).build(); + } + else if ("string".equals(type)) { + return FieldBuilder.newBuilder(name, Types.MinorType.VARCHAR.getType()).build(); + } + throw new IllegalArgumentException("Unsupported type " + type); + } + }; + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doListSchemaNames() + throws Exception + { + logger.info("doListSchemaNames: enter"); + + List databases = new ArrayList<>(); + databases.add(new Database().withName("db1")); + databases.add(new Database().withName("db2")); + + when(mockGlue.getDatabases(any(GetDatabasesRequest.class))) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + GetDatabasesRequest request = (GetDatabasesRequest) invocationOnMock.getArguments()[0]; + assertEquals(accountId, request.getCatalogId()); + GetDatabasesResult mockResult = mock(GetDatabasesResult.class); + if (request.getNextToken() == null) { + when(mockResult.getDatabaseList()).thenReturn(databases); + when(mockResult.getNextToken()).thenReturn("next"); + } + else { + //only return real info on 1st call + when(mockResult.getDatabaseList()).thenReturn(new ArrayList<>()); + when(mockResult.getNextToken()).thenReturn(null); + } + return mockResult; + }); + + ListSchemasRequest req = new ListSchemasRequest(IdentityUtil.fakeIdentity(), queryId, catalog); + ListSchemasResponse res = handler.doListSchemaNames(allocator, req); + + logger.info("doListSchemas - {}", res.getSchemas()); + + assertEquals(databases.stream().map(next -> next.getName()).collect(Collectors.toList()), + new ArrayList<>(res.getSchemas())); + + verify(mockGlue, times(2)).getDatabases(any(GetDatabasesRequest.class)); + logger.info("doListSchemaNames: exit"); + } + + @Test + public void doListTables() + throws Exception + { + logger.info("doListTables - enter"); + + List
<Table>
tables = new ArrayList<>(); + tables.add(new Table().withName("table1")); + tables.add(new Table().withName("table2")); + + when(mockGlue.getTables(any(GetTablesRequest.class))) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + GetTablesRequest request = (GetTablesRequest) invocationOnMock.getArguments()[0]; + assertEquals(accountId, request.getCatalogId()); + assertEquals(schema, request.getDatabaseName()); + GetTablesResult mockResult = mock(GetTablesResult.class); + if (request.getNextToken() == null) { + when(mockResult.getTableList()).thenReturn(tables); + when(mockResult.getNextToken()).thenReturn("next"); + } + else { + //only return real info on 1st call + when(mockResult.getTableList()).thenReturn(new ArrayList<>()); + when(mockResult.getNextToken()).thenReturn(null); + } + return mockResult; + }); + + ListTablesRequest req = new ListTablesRequest(IdentityUtil.fakeIdentity(), queryId, catalog, schema); + ListTablesResponse res = handler.doListTables(allocator, req); + logger.info("doListTables - {}", res.getTables()); + + Set tableNames = tables.stream().map(next -> next.getName()).collect(Collectors.toSet()); + for (TableName next : res.getTables()) { + assertEquals(schema, next.getSchemaName()); + assertTrue(tableNames.contains(next.getTableName())); + } + assertEquals(tableNames.size(), res.getTables().size()); + + logger.info("doListTables - exit"); + } + + @Test + public void doGetTable() + throws Exception + { + logger.info("doGetTable - enter"); + + Map expectedParams = new HashMap<>(); + expectedParams.put("param1", "val1"); + expectedParams.put("param2", "val2"); + + List columns = new ArrayList<>(); + columns.add(new Column().withName("col1").withType("int").withComment("comment")); + columns.add(new Column().withName("col2").withType("bigint").withComment("comment")); + columns.add(new Column().withName("col3").withType("string").withComment("comment")); + + Table mockTable = mock(Table.class); + StorageDescriptor mockSd = mock(StorageDescriptor.class); + + when(mockTable.getName()).thenReturn(table); + when(mockTable.getStorageDescriptor()).thenReturn(mockSd); + when(mockTable.getParameters()).thenReturn(expectedParams); + when(mockSd.getColumns()).thenReturn(columns); + + when(mockGlue.getTable(any(com.amazonaws.services.glue.model.GetTableRequest.class))) + .thenAnswer((InvocationOnMock invocationOnMock) -> + { + com.amazonaws.services.glue.model.GetTableRequest request = + (com.amazonaws.services.glue.model.GetTableRequest) invocationOnMock.getArguments()[0]; + + assertEquals(accountId, request.getCatalogId()); + assertEquals(schema, request.getDatabaseName()); + assertEquals(table, request.getName()); + + GetTableResult mockResult = mock(GetTableResult.class); + when(mockResult.getTable()).thenReturn(mockTable); + return mockResult; + }); + + GetTableRequest req = new GetTableRequest(IdentityUtil.fakeIdentity(), queryId, catalog, new TableName(schema, table)); + GetTableResponse res = handler.doGetTable(allocator, req); + + logger.info("doGetTable - {}", res); + + assertTrue(res.getSchema().getFields().size() > 0); + assertTrue(res.getSchema().getCustomMetadata().size() > 0); + + logger.info("doGetTable - exit"); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueFieldLexerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueFieldLexerTest.java new file mode 100644 index 0000000000..7ac78b65c5 --- /dev/null +++ 
b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueFieldLexerTest.java
@@ -0,0 +1,108 @@
+package com.amazonaws.athena.connector.lambda.metadata.glue;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.List;
+
+import static org.junit.Assert.*;
+
+public class GlueFieldLexerTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(GlueFieldLexerTest.class);
+
+    private static final String INPUT1 = "STRUCT < street_address: STRUCT < street_number: INT, street_name: STRING, street_type: STRING >, country: STRING, postal_code: ARRAY<STRING>>";
+
+    private static final String INPUT2 = "ARRAY<STRING>";
+
+    private static final String INPUT3 = "INT";
+
+    @Test
+    public void basicLexTest()
+    {
+        logger.info("basicLexTest: enter");
+
+        Field field = GlueFieldLexer.lex("testField", INPUT2);
+        assertEquals("testField", field.getName());
+        assertEquals(Types.MinorType.LIST, Types.getMinorTypeForArrowType(field.getType()));
+        assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(field.getChildren().get(0).getType()));
+
+        logger.info("basicLexTest: exit");
+    }
+
+    @Test
+    public void baseLexTest()
+    {
+        logger.info("baseLexTest: enter");
+
+        Field field = GlueFieldLexer.lex("testField", INPUT3);
+        assertEquals("testField", field.getName());
+        assertEquals(Types.MinorType.INT, Types.getMinorTypeForArrowType(field.getType()));
+        assertEquals(0, field.getChildren().size());
+
+        logger.info("baseLexTest: exit");
+    }
+
+    @Test
+    public void lexTest()
+    {
+        logger.info("lexTest: enter");
+
+        Field field = GlueFieldLexer.lex("testField", INPUT1);
+
+        logger.info("lexTest: {}", field);
+        assertEquals("testField", field.getName());
+        assertEquals(Types.MinorType.STRUCT, Types.getMinorTypeForArrowType(field.getType()));
+        assertEquals(3, field.getChildren().size());
+
+        List<Field> level1 = field.getChildren();
+        assertEquals("street_address", level1.get(0).getName());
+        assertEquals(Types.MinorType.STRUCT, Types.getMinorTypeForArrowType(level1.get(0).getType()));
+        assertEquals(3, level1.get(0).getChildren().size());
+
+        List<Field> level2 = level1.get(0).getChildren();
+        assertEquals("street_number", level2.get(0).getName());
+        assertEquals(Types.MinorType.INT, Types.getMinorTypeForArrowType(level2.get(0).getType()));
+        assertEquals(0, level2.get(0).getChildren().size());
+        assertEquals("street_name", level2.get(1).getName());
+        assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(level2.get(1).getType()));
+        assertEquals(0, level2.get(1).getChildren().size());
+        assertEquals("street_type", level2.get(2).getName());
+        assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(level2.get(2).getType()));
+        assertEquals(0,
level2.get(2).getChildren().size()); + + assertEquals("country", level1.get(1).getName()); + assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(level1.get(1).getType())); + assertEquals(0, level1.get(1).getChildren().size()); + + assertEquals("postal_code", level1.get(2).getName()); + assertEquals(Types.MinorType.LIST, Types.getMinorTypeForArrowType(level1.get(2).getType())); + assertEquals(1, level1.get(2).getChildren().size()); + assertEquals(Types.MinorType.VARCHAR, Types.getMinorTypeForArrowType(level1.get(2).getChildren().get(0).getType())); + + logger.info("lexTest: exit"); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueTypeParserTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueTypeParserTest.java new file mode 100644 index 0000000000..39e2ceb972 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/metadata/glue/GlueTypeParserTest.java @@ -0,0 +1,73 @@ +package com.amazonaws.athena.connector.lambda.metadata.glue; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.List;
+
+import static org.junit.Assert.*;
+
+public class GlueTypeParserTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(GlueTypeParserTest.class);
+
+    private static final String INPUT = "STRUCT < street_address: STRUCT < street_number: INT, street_name: STRING, street_type: STRING >, country: STRING, postal_code: ARRAY<STRING>>";
+
+    private static final List<GlueTypeParser.Token> expectedTokens = new ArrayList<>();
+
+    static {
+        expectedTokens.add(new GlueTypeParser.Token("STRUCT", GlueTypeParser.FIELD_START, 8));
+        expectedTokens.add(new GlueTypeParser.Token("street_address", GlueTypeParser.FIELD_DIV, 25));
+        expectedTokens.add(new GlueTypeParser.Token("STRUCT", GlueTypeParser.FIELD_START, 34));
+        expectedTokens.add(new GlueTypeParser.Token("street_number", GlueTypeParser.FIELD_DIV, 52));
+        expectedTokens.add(new GlueTypeParser.Token("INT", GlueTypeParser.FIELD_SEP, 57));
+        expectedTokens.add(new GlueTypeParser.Token("street_name", GlueTypeParser.FIELD_DIV, 73));
+        expectedTokens.add(new GlueTypeParser.Token("STRING", GlueTypeParser.FIELD_SEP, 81));
+        expectedTokens.add(new GlueTypeParser.Token("street_type", GlueTypeParser.FIELD_DIV, 97));
+        expectedTokens.add(new GlueTypeParser.Token("STRING", GlueTypeParser.FIELD_END, 107));
+        expectedTokens.add(new GlueTypeParser.Token("", GlueTypeParser.FIELD_SEP, 108));
+        expectedTokens.add(new GlueTypeParser.Token("country", GlueTypeParser.FIELD_DIV, 118));
+        expectedTokens.add(new GlueTypeParser.Token("STRING", GlueTypeParser.FIELD_SEP, 126));
+        expectedTokens.add(new GlueTypeParser.Token("postal_code", GlueTypeParser.FIELD_DIV, 140));
+        expectedTokens.add(new GlueTypeParser.Token("ARRAY", GlueTypeParser.FIELD_START, 147));
+        expectedTokens.add(new GlueTypeParser.Token("STRING", GlueTypeParser.FIELD_END, 154));
+        expectedTokens.add(new GlueTypeParser.Token("", GlueTypeParser.FIELD_END, 155));
+    }
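+    //For reference when reading the expected tokens above (inferred from the input string
+    //rather than from GlueTypeParser's docs, so treat as a reader's note): FIELD_START
+    //corresponds to '<', FIELD_DIV to ':', FIELD_SEP to ',', and FIELD_END to '>'.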
+
+    private GlueTypeParser parser = new GlueTypeParser(INPUT);
+
+    @Test
+    public void parseTest()
+    {
+        logger.info("parseTest: enter");
+        int pos = 0;
+        while (parser.hasNext()) {
+            GlueTypeParser.Token next = parser.next();
+            logger.info("parseTest: {} => {}", next.getValue(), next.getMarker());
+            assertEquals(expectedTokens.get(pos++), next);
+        }
+        logger.info("parseTest: exit");
+    }
+}
diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/BlockCryptoTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/BlockCryptoTest.java
new file mode 100644 index 0000000000..9f078be045
--- /dev/null
+++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/BlockCryptoTest.java
@@ -0,0 +1,74 @@
+package com.amazonaws.athena.connector.lambda.security;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+import static org.junit.Assert.*;
+
+public class BlockCryptoTest
+{
+    private final EncryptionKeyFactory keyFactory = new LocalKeyFactory();
+    private BlockAllocatorImpl allocator;
+
+    @Before
+    public void setup()
+    {
+        allocator = new BlockAllocatorImpl();
+    }
+
+    @After
+    public void tearDown()
+    {
+        allocator.close();
+    }
+
+    @Test
+    public void test()
+    {
+        Schema schema = SchemaBuilder.newBuilder()
+                .addField("col1", new ArrowType.Int(32, true))
+                .addField("col2", new ArrowType.Utf8())
+                .build();
+
+        Block expected = allocator.createBlock(schema);
+        BlockUtils.setValue(expected.getFieldVector("col1"), 0, 100);
+        BlockUtils.setValue(expected.getFieldVector("col2"), 0, "VarChar");
+        BlockUtils.setValue(expected.getFieldVector("col1"), 1, 101);
+        BlockUtils.setValue(expected.getFieldVector("col2"), 1, "VarChar1");
+        expected.setRowCount(2);
+
+        AesGcmBlockCrypto crypto = new AesGcmBlockCrypto(new BlockAllocatorImpl());
+        EncryptionKey key = keyFactory.create();
+
+        byte[] cypher = crypto.encrypt(key, expected);
+        Block actual = crypto.decrypt(key, cypher, schema);
+        assertEquals(expected, actual);
+    }
+}
diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/CacheableSecretsManagerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/CacheableSecretsManagerTest.java
new file mode 100644 index 0000000000..d1b719ccf1
--- /dev/null
+++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/CacheableSecretsManagerTest.java
@@ -0,0 +1,134 @@
+package com.amazonaws.athena.connector.lambda.security;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ + +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.mockito.invocation.InvocationOnMock; + +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.reset; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.verifyNoMoreInteractions; +import static org.mockito.Mockito.when; + +public class CacheableSecretsManagerTest +{ + private AWSSecretsManager mockSecretsManager; + + private CachableSecretsManager cachableSecretsManager; + + @Before + public void setup() + { + mockSecretsManager = mock(AWSSecretsManager.class); + cachableSecretsManager = new CachableSecretsManager(mockSecretsManager); + } + + @After + public void after() + { + reset(mockSecretsManager); + } + + @Test + public void expirationTest() + { + cachableSecretsManager.addCacheEntry("test", "value", System.currentTimeMillis()); + assertEquals("value", cachableSecretsManager.getSecret("test")); + verifyNoMoreInteractions(mockSecretsManager); + reset(mockSecretsManager); + + when(mockSecretsManager.getSecretValue(any(GetSecretValueRequest.class))) + .thenAnswer((InvocationOnMock invocation) -> { + GetSecretValueRequest request = invocation.getArgumentAt(0, GetSecretValueRequest.class); + if (request.getSecretId().equalsIgnoreCase("test")) { + return new GetSecretValueResult().withSecretString("value2"); + } + throw new RuntimeException(); + }); + + cachableSecretsManager.addCacheEntry("test", "value", 0); + assertEquals("value2", cachableSecretsManager.getSecret("test")); + } + + @Test + public void evictionTest() + { + for (int i = 0; i < CachableSecretsManager.MAX_CACHE_SIZE; i++) { + cachableSecretsManager.addCacheEntry("test" + i, "value" + i, System.currentTimeMillis()); + } + when(mockSecretsManager.getSecretValue(any(GetSecretValueRequest.class))) + .thenAnswer((InvocationOnMock invocation) -> { + GetSecretValueRequest request = invocation.getArgumentAt(0, GetSecretValueRequest.class); + return new GetSecretValueResult().withSecretString(request.getSecretId() + "_value"); + }); + + assertEquals("test_value", cachableSecretsManager.getSecret("test")); + assertEquals("test0_value", cachableSecretsManager.getSecret("test0")); + + verify(mockSecretsManager, times(2)).getSecretValue(any(GetSecretValueRequest.class)); + } + + @Test + public void resolveSecrets() + { + when(mockSecretsManager.getSecretValue(any(GetSecretValueRequest.class))) + .thenAnswer((InvocationOnMock invocation) -> { + GetSecretValueRequest request = invocation.getArgumentAt(0, GetSecretValueRequest.class); + String result = request.getSecretId(); + if (result.equalsIgnoreCase("unknown")) { + throw new RuntimeException("Unknown secret!"); + } + return new GetSecretValueResult().withSecretString(result); + }); + + String oneSecret = "${OneSecret}"; + String oneExpected = "OneSecret"; + assertEquals(oneExpected, cachableSecretsManager.resolveSecrets(oneSecret)); + + String twoSecrets = "ThisIsMyStringWith${TwoSecret}SuperSecret${Secrets}"; + String twoExpected = "ThisIsMyStringWithTwoSecretSuperSecretSecrets"; + assertEquals(twoExpected, cachableSecretsManager.resolveSecrets(twoSecrets)); + + String noSecrets = 
"ThisIsMyStringWithTwoSecretSuperSecretSecrets"; + String noSecretsExpected = "ThisIsMyStringWithTwoSecretSuperSecretSecrets"; + assertEquals(noSecretsExpected, cachableSecretsManager.resolveSecrets(noSecrets)); + + String commonErrors = "ThisIsM}yStringWi${thTwoSecretS{uperSecretSecrets"; + String commonErrorsExpected = "ThisIsM}yStringWi${thTwoSecretS{uperSecretSecrets"; + assertEquals(commonErrorsExpected, cachableSecretsManager.resolveSecrets(commonErrors)); + + String unknownSecret = "This${Unknown}"; + try { + cachableSecretsManager.resolveSecrets(unknownSecret); + fail("Should not see this!"); + } + catch (RuntimeException ex) {} + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/IdentityUtil.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/IdentityUtil.java new file mode 100644 index 0000000000..baab50d562 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/security/IdentityUtil.java @@ -0,0 +1,31 @@ +package com.amazonaws.athena.connector.lambda.security; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +public class IdentityUtil +{ + private IdentityUtil() {} + + public static FederatedIdentity fakeIdentity() + { + return new FederatedIdentity("access_key_id", "principle", "account"); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/BlockSerializationTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/BlockSerializationTest.java new file mode 100644 index 0000000000..1af6037464 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/BlockSerializationTest.java @@ -0,0 +1,83 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.Block;
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.data.BlockUtils;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.arrow.vector.types.Types;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayInputStream;
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+
+import static org.junit.Assert.assertEquals;
+
+public class BlockSerializationTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(BlockSerializationTest.class);
+
+    //We use two allocators to test moving data across allocators; some operations only fail when done
+    //across allocators because of how Arrow does zero-copy buffer reuse.
+    private BlockAllocatorImpl allocator;
+    private BlockAllocatorImpl otherAllocator;
+    private ObjectMapper objectMapper;
+
+    @Before
+    public void setup()
+    {
+        otherAllocator = new BlockAllocatorImpl();
+        allocator = new BlockAllocatorImpl();
+        objectMapper = ObjectMapperFactory.create(allocator);
+    }
+
+    @After
+    public void tearDown()
+    {
+        otherAllocator.close();
+        allocator.close();
+    }
+
+    @Test
+    public void serializationTest()
+            throws IOException
+    {
+        logger.info("serializationTest - enter");
+
+        Block expected = BlockUtils.newBlock(otherAllocator, "col1", Types.MinorType.INT.getType(), 21);
+
+        ObjectMapper serializer = ObjectMapperFactory.create(new BlockAllocatorImpl());
+        ByteArrayOutputStream out = new ByteArrayOutputStream();
+        serializer.writeValue(out, expected);
+
+        Block actual = serializer.readValue(new ByteArrayInputStream(out.toByteArray()), Block.class);
+
+        assertEquals(expected, actual);
+
+        logger.info("serializationTest - exit");
+    }
+}
diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/ConstraintSerializationTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/ConstraintSerializationTest.java
new file mode 100644
index 0000000000..a3d1bd096e
--- /dev/null
+++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/ConstraintSerializationTest.java
@@ -0,0 +1,96 @@
+package com.amazonaws.athena.connector.lambda.serde;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+
+import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl;
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Range;
+import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet;
+import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet;
+import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest;
+import com.amazonaws.athena.connector.lambda.security.IdentityUtil;
+import com.google.common.collect.ImmutableList;
+import org.apache.arrow.vector.types.Types;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.HashMap;
+import java.util.HashSet;
+import java.util.Map;
+
+public class ConstraintSerializationTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(ConstraintSerializationTest.class);
+
+    private BlockAllocatorImpl allocator;
+
+    @Before
+    public void setup()
+    {
+        allocator = new BlockAllocatorImpl();
+    }
+
+    @After
+    public void tearDown()
+    {
+        allocator.close();
+    }
+
+    @Test
+    public void serializationTest()
+            throws Exception
+    {
+        logger.info("serializationTest - enter");
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("col2", SortedRangeSet.copyOf(Types.MinorType.BIGINT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.BIGINT.getType(), 950L)), false));
+
+        constraintsMap.put("col3", SortedRangeSet.copyOf(Types.MinorType.BIT.getType(),
+                ImmutableList.of(Range.equal(allocator, Types.MinorType.BIT.getType(), false)), false));
+
+        constraintsMap.put("col4", SortedRangeSet.copyOf(Types.MinorType.FLOAT8.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.FLOAT8.getType(), 950.0D)), false));
+
+        constraintsMap.put("col5", SortedRangeSet.copyOf(Types.MinorType.VARCHAR.getType(),
+                ImmutableList.of(Range.equal(allocator, Types.MinorType.VARCHAR.getType(), "8"),
+                        Range.equal(allocator, Types.MinorType.VARCHAR.getType(), "9")), false));
+
+        try (
+                GetTableLayoutRequest req = new GetTableLayoutRequest(IdentityUtil.fakeIdentity(),
+                        "queryId",
+                        "default",
+                        new TableName("schema1", "table1"),
+                        new Constraints(constraintsMap),
+                        SchemaBuilder.newBuilder().build(),
+                        new HashSet<>())
+        ) {
+            ObjectMapperUtil.assertSerialization(req, req.getClass());
+        }
+
+        logger.info("serializationTest - exit");
+    }
+}
diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/MarkerSerializationTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/MarkerSerializationTest.java
new file mode 100644
index 0000000000..300623e6d9
--- /dev/null
+++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/MarkerSerializationTest.java
@@ -0,0 +1,107 @@
+package com.amazonaws.athena.connector.lambda.serde;
+
+/*-
+ * #%L
+ * Amazon Athena Query Federation SDK
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.domain.predicate.Marker; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.arrow.vector.types.Types; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +import static org.junit.Assert.*; + +public class MarkerSerializationTest +{ + private static final Logger logger = LoggerFactory.getLogger(MarkerSerializationTest.class); + private BlockAllocatorImpl allocator; + + @Before + public void setup() + { + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + { + allocator.close(); + } + + @Test + public void serializationTest() + throws IOException + { + logger.info("serializationTest - enter"); + + ObjectMapper serializer = ObjectMapperFactory.create(new BlockAllocatorImpl()); + + int expectedValue = 1024; + Marker expectedMarker = Marker.exactly(allocator, Types.MinorType.INT.getType(), expectedValue); + + ByteArrayOutputStream out = new ByteArrayOutputStream(); + serializer.writeValue(out, expectedMarker); + + ObjectMapper deserializer = ObjectMapperFactory.create(allocator); + + Marker actualMarker = deserializer.readValue(new ByteArrayInputStream(out.toByteArray()), Marker.class); + + assertEquals(expectedMarker.getSchema().getCustomMetadata(), actualMarker.getSchema().getCustomMetadata()); + assertEquals(expectedMarker.getSchema().getFields(), actualMarker.getSchema().getFields()); + assertEquals(expectedMarker.getBound(), actualMarker.getBound()); + assertEquals(expectedMarker.getValue(), actualMarker.getValue()); + assertEquals(expectedValue, actualMarker.getValue()); + assertEquals(false, actualMarker.isNullValue()); + + logger.info("serializationTest - exit"); + } + + @Test + public void nullableSerializationTest() + throws IOException + { + logger.info("nullableSerializationTest - enter"); + + ObjectMapper serializer = ObjectMapperFactory.create(new BlockAllocatorImpl()); + Marker expectedMarker = Marker.nullMarker(allocator, Types.MinorType.INT.getType()); + + ByteArrayOutputStream out = new ByteArrayOutputStream(); + serializer.writeValue(out, expectedMarker); + + ObjectMapper deserializer = ObjectMapperFactory.create(allocator); + + Marker actualMarker = deserializer.readValue(new ByteArrayInputStream(out.toByteArray()), Marker.class); + + assertEquals(expectedMarker.getSchema().getCustomMetadata(), actualMarker.getSchema().getCustomMetadata()); + assertEquals(expectedMarker.getSchema().getFields(), actualMarker.getSchema().getFields()); + assertEquals(expectedMarker.getBound(), actualMarker.getBound()); + assertEquals(true, actualMarker.isNullValue()); + + logger.info("nullableSerializationTest - exit"); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/ObjectMapperUtil.java 
b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/ObjectMapperUtil.java new file mode 100644 index 0000000000..d55abc1395 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/ObjectMapperUtil.java @@ -0,0 +1,50 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.fasterxml.jackson.databind.ObjectMapper; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +import static org.junit.Assert.assertEquals; + +public class ObjectMapperUtil +{ + private ObjectMapperUtil() {} + + public static void assertSerialization(Object object, Class clazz) + { + Object actual = null; + try (BlockAllocatorImpl allocator = new BlockAllocatorImpl()) { + ObjectMapper mapper = ObjectMapperFactory.create(allocator); + ByteArrayOutputStream out = new ByteArrayOutputStream(); + mapper.writeValue(out, object); + actual = mapper.readValue(new ByteArrayInputStream(out.toByteArray()), clazz); + assertEquals(object, actual); + } + catch (IOException | AssertionError ex) { + throw new RuntimeException(ex); + } + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/SchemaSerializationTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/SchemaSerializationTest.java new file mode 100644 index 0000000000..d13532ac10 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/SchemaSerializationTest.java @@ -0,0 +1,78 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaSerDe; +import com.fasterxml.jackson.databind.ObjectMapper; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.io.IOException; + +import static org.junit.Assert.*; + +public class SchemaSerializationTest +{ + private static final Logger logger = LoggerFactory.getLogger(SchemaSerializationTest.class); + + private final ObjectMapper objectMapper = ObjectMapperFactory.create(new BlockAllocatorImpl()); + + @Test + public void serializationTest() + throws IOException + { + logger.info("serializationTest - enter"); + SchemaBuilder schemaBuilder = new SchemaBuilder(); + schemaBuilder.addMetadata("meta1", "meta-value-1"); + schemaBuilder.addMetadata("meta2", "meta-value-2"); + schemaBuilder.addField("intfield1", new ArrowType.Int(32, true)); + schemaBuilder.addField("doublefield2", new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)); + schemaBuilder.addField("varcharfield3", new ArrowType.Utf8()); + Schema expectedSchema = schemaBuilder.build(); + + SchemaSerDe serDe = new SchemaSerDe(); + ByteArrayOutputStream schemaOut = new ByteArrayOutputStream(); + serDe.serialize(expectedSchema, schemaOut); + + TestPojo expected = new TestPojo(expectedSchema); + + ByteArrayOutputStream out = new ByteArrayOutputStream(); + objectMapper.writeValue(out, expected); + TestPojo actual = objectMapper.readValue(new ByteArrayInputStream(out.toByteArray()), TestPojo.class); + + Schema actualSchema = actual.getSchema(); + logger.info("serializationTest - fields[{}]", actualSchema.getFields()); + logger.info("serializationTest - meta[{}]", actualSchema.getCustomMetadata()); + + assertEquals(expectedSchema.getFields(), actualSchema.getFields()); + assertEquals(expectedSchema.getCustomMetadata(), actualSchema.getCustomMetadata()); + + logger.info("serializationTest - exit"); + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/TestPojo.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/TestPojo.java new file mode 100644 index 0000000000..5aaf0e9db2 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/serde/TestPojo.java @@ -0,0 +1,41 @@ +package com.amazonaws.athena.connector.lambda.serde; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ + +import com.fasterxml.jackson.annotation.JsonCreator; +import com.fasterxml.jackson.annotation.JsonProperty; +import org.apache.arrow.vector.types.pojo.Schema; + +public class TestPojo +{ + private final Schema schema; + + @JsonCreator + public TestPojo(@JsonProperty("schema") Schema schema) + { + this.schema = schema; + } + + public Schema getSchema() + { + return schema; + } +} diff --git a/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionHandlerTest.java b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionHandlerTest.java new file mode 100644 index 0000000000..fd44c8ae15 --- /dev/null +++ b/athena-federation-sdk/src/test/java/com/amazonaws/athena/connector/lambda/udf/UserDefinedFunctionHandlerTest.java @@ -0,0 +1,357 @@ +package com.amazonaws.athena.connector.lambda.udf; + +/*- + * #%L + * Amazon Athena Query Federation SDK + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.FieldResolver; +import com.amazonaws.athena.connector.lambda.data.UnitTestBlockUtils; +import com.amazonaws.athena.connector.lambda.request.FederationRequest; +import com.amazonaws.athena.connector.lambda.request.PingRequest; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.vector.FieldVector; +import org.apache.arrow.vector.Float4Vector; +import org.apache.arrow.vector.Float8Vector; +import org.apache.arrow.vector.IntVector; +import org.apache.arrow.vector.VarCharVector; +import org.apache.arrow.vector.complex.ListVector; +import org.apache.arrow.vector.complex.StructVector; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.FloatingPointPrecision; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.FieldType; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.arrow.vector.util.Text; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.io.ByteArrayInputStream; +import java.io.ByteArrayOutputStream; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.stream.Collectors; + +import static com.amazonaws.athena.connector.lambda.udf.UserDefinedFunctionType.SCALAR; +import static junit.framework.TestCase.assertTrue; +import static org.junit.Assert.*; + +public class UserDefinedFunctionHandlerTest +{ + private static final String 
COLUMN_PREFIX = "col_";
+
+    private TestUserDefinedFunctionHandler handler;
+
+    private BlockAllocatorImpl allocator;
+
+    @Before
+    public void setUp()
+    {
+        handler = new TestUserDefinedFunctionHandler();
+        allocator = new BlockAllocatorImpl();
+    }
+
+    @After
+    public void tearDown()
+    {
+        allocator.close();
+    }
+
+    @Test
+    public void testInvocationWithBasicType()
+    {
+        int rowCount = 20;
+        UserDefinedFunctionRequest udfRequest = createUDFRequest(rowCount, Integer.class, "testScalarUDF", true, Integer.class, Integer.class);
+
+        UserDefinedFunctionResponse udfResponse = handler.processFunction(allocator, udfRequest);
+        Block responseBlock = udfResponse.getRecords();
+
+        assertEquals(1, responseBlock.getFieldReaders().size());
+        assertEquals(rowCount, responseBlock.getRowCount());
+
+        FieldReader fieldReader = responseBlock.getFieldReaders().get(0);
+
+        for (int pos = 0; pos < rowCount; ++pos) {
+            fieldReader.setPosition(pos);
+            int val = (int) UnitTestBlockUtils.getValue(fieldReader, pos);
+            int expected = handler.testScalarUDF(pos + 100, pos + 100);
+            assertEquals(expected, val);
+        }
+    }
+
+    @Test
+    public void testInvocationWithListType()
+    {
+        int rowCount = 20;
+        UserDefinedFunctionRequest udfRequest = createUDFRequest(rowCount, List.class, "testListType", true, List.class);
+
+        UserDefinedFunctionResponse udfResponse = handler.processFunction(allocator, udfRequest);
+        Block responseBlock = udfResponse.getRecords();
+
+        assertEquals(1, responseBlock.getFieldReaders().size());
+        assertEquals(rowCount, responseBlock.getRowCount());
+
+        FieldReader fieldReader = responseBlock.getFieldReaders().get(0);
+
+        for (int pos = 0; pos < rowCount; ++pos) {
+            fieldReader.setPosition(pos);
+            List<Integer> result = (List<Integer>) UnitTestBlockUtils.getValue(fieldReader, pos);
+            List<Integer> expected = handler.testListType(ImmutableList.of(pos + 100, pos + 200, pos + 300));
+            assertArrayEquals(expected.toArray(), result.toArray());
+        }
+    }
+
+    @Test
+    public void testInvocationWithStructType()
+    {
+        int rowCount = 20;
+        UserDefinedFunctionRequest udfRequest = createUDFRequest(rowCount, Map.class, "testRowType", true, Map.class);
+
+        UserDefinedFunctionResponse udfResponse = handler.processFunction(allocator, udfRequest);
+        Block responseBlock = udfResponse.getRecords();
+
+        assertEquals(1, responseBlock.getFieldReaders().size());
+        assertEquals(rowCount, responseBlock.getRowCount());
+
+        FieldReader fieldReader = responseBlock.getFieldReaders().get(0);
+
+        for (int pos = 0; pos < rowCount; ++pos) {
+            fieldReader.setPosition(pos);
+            Map<String, Object> actual = (Map<String, Object>) UnitTestBlockUtils.getValue(fieldReader, pos);
+
+            Map<String, Object> input = ImmutableMap.of("intVal", pos + 100, "doubleVal", pos + 200.2);
+            Map<String, Object> expected = handler.testRowType(input);
+
+            for (Map.Entry<String, Object> entry : expected.entrySet()) {
+                String key = entry.getKey();
+                assertTrue(actual.containsKey(key));
+                assertEquals(expected.get(key), actual.get(key));
+            }
+        }
+    }
+
+    @Test
+    public void testInvocationWithNullValue()
+    {
+        int rowCount = 20;
+        UserDefinedFunctionRequest udfRequest = createUDFRequest(rowCount, Boolean.class, "testScalarUDFWithNullCheck", false, Integer.class);
+
+        UserDefinedFunctionResponse udfResponse = handler.processFunction(allocator, udfRequest);
+        Block responseBlock = udfResponse.getRecords();
+
+        assertEquals(1, responseBlock.getFieldReaders().size());
+        assertEquals(rowCount, responseBlock.getRowCount());
+
+        FieldReader fieldReader = responseBlock.getFieldReaders().get(0);
+
+        for (int pos = 0; pos < rowCount; ++pos) {
+            fieldReader.setPosition(pos);
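+            //createUDFRequest was called with nonNullData=false, so the input column contains only nulls;
+            //the UDF maps a null input to true, and the output must still be set (non-null) for every row.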
+            assertTrue(fieldReader.isSet());
+            Boolean expected = handler.testScalarUDFWithNullCheck(null);
+            Boolean actual = fieldReader.readBoolean();
+            assertEquals(expected, actual);
+        }
+    }
+
+    @Test
+    public void testRequestTypeValidation() throws Exception
+    {
+        FederationRequest federationRequest = new PingRequest(null, "dummy_catalog", "dummy_qid");
+
+        ObjectMapper objectMapper = new ObjectMapper();
+
+        ByteArrayOutputStream byteArrayOutputStream = new ByteArrayOutputStream();
+        objectMapper.writeValue(byteArrayOutputStream, federationRequest);
+        byte[] inputData = byteArrayOutputStream.toByteArray();
+        ByteArrayInputStream byteArrayInputStream = new ByteArrayInputStream(inputData);
+        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
+
+        try {
+            handler.handleRequest(byteArrayInputStream, outputStream, null);
+            fail();
+        }
+        catch (Exception e) {
+            assertTrue(e.getMessage().contains("Expected a UserDefinedFunctionRequest but found"));
+        }
+    }
+
+    private UserDefinedFunctionRequest createUDFRequest(int rowCount, Class<?> returnType, String methodName, boolean nonNullData, Class<?>... argumentTypes)
+    {
+        Schema inputSchema = buildSchema(argumentTypes);
+        Schema outputSchema = buildSchema(returnType);
+
+        Block block = allocator.createBlock(inputSchema);
+        block.setRowCount(rowCount);
+        if (nonNullData) {
+            writeData(block, rowCount);
+        }
+
+        return new UserDefinedFunctionRequest(null, block, outputSchema, methodName, SCALAR);
+    }
+
+    private void writeData(Block block, int numOfRows)
+    {
+        for (FieldVector fieldVector : block.getFieldVectors()) {
+            fieldVector.setInitialCapacity(numOfRows);
+            fieldVector.allocateNew();
+            fieldVector.setValueCount(numOfRows);
+
+            for (int idx = 0; idx < numOfRows; ++idx) {
+                writeColumn(fieldVector, idx);
+            }
+        }
+    }
+
+    private void writeColumn(FieldVector fieldVector, int idx)
+    {
+        if (fieldVector instanceof IntVector) {
+            IntVector intVector = (IntVector) fieldVector;
+            intVector.setSafe(idx, idx + 100);
+            return;
+        }
+
+        if (fieldVector instanceof Float4Vector) {
+            Float4Vector float4Vector = (Float4Vector) fieldVector;
+            float4Vector.setSafe(idx, idx + 100.1f);
+            return;
+        }
+
+        if (fieldVector instanceof Float8Vector) {
+            Float8Vector float8Vector = (Float8Vector) fieldVector;
+            float8Vector.setSafe(idx, idx + 100.2);
+            return;
+        }
+
+        if (fieldVector instanceof VarCharVector) {
+            VarCharVector varCharVector = (VarCharVector) fieldVector;
+            varCharVector.setSafe(idx, new Text(idx + "-my-varchar"));
+            return;
+        }
+
+        if (fieldVector instanceof ListVector) {
+            BlockUtils.setComplexValue(fieldVector,
+                    idx,
+                    FieldResolver.DEFAULT,
+                    ImmutableList.of(idx + 100, idx + 200, idx + 300));
+            return;
+        }
+
+        if (fieldVector instanceof StructVector) {
+            Map<String, Object> input = ImmutableMap.of("intVal", idx + 100, "doubleVal", idx + 200.2);
+            BlockUtils.setComplexValue(fieldVector,
+                    idx,
+                    FieldResolver.DEFAULT,
+                    input);
+            return;
+        }
+
+        throw new IllegalArgumentException("Unsupported fieldVector " + fieldVector.getClass().getCanonicalName());
+    }
+
+    private Schema buildSchema(Class<?>... types)
+    {
+        ImmutableList.Builder<Field> fieldsBuilder = ImmutableList.builder();
+        for (int i = 0; i < types.length; ++i) {
+            String columnName = COLUMN_PREFIX + i;
+            Field field = getArrowField(types[i], columnName);
+            fieldsBuilder.add(field);
+        }
+        return new Schema(fieldsBuilder.build(), null);
+    }
+
+    private Field getArrowField(Class<?> type, String columnName)
+    {
+        if (type == Integer.class) {
+            return new Field(columnName, FieldType.nullable(new ArrowType.Int(32, true)), null);
+        }
+
+        if (type == Float.class) {
+            return new Field(columnName, FieldType.nullable(new ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)), null);
+        }
+
+        if (type == Double.class) {
+            return new Field(columnName, FieldType.nullable(new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)), null);
+        }
+
+        if (type == String.class) {
+            return new Field(columnName, FieldType.nullable(new ArrowType.Utf8()), null);
+        }
+
+        if (type == Boolean.class) {
+            return new Field(columnName, FieldType.nullable(new ArrowType.Bool()), null);
+        }
+
+        if (type == List.class) {
+            Field childField = new Field(columnName, FieldType.nullable(new ArrowType.Int(32, true)), null);
+            return new Field(columnName, FieldType.nullable(Types.MinorType.LIST.getType()),
+                    Collections.singletonList(childField));
+        }
+
+        if (type == Map.class) {
+            FieldBuilder fieldBuilder = FieldBuilder.newBuilder(columnName, Types.MinorType.STRUCT.getType());
+
+            Field childField1 = new Field("intVal", FieldType.nullable(new ArrowType.Int(32, true)), null);
+            Field childField2 = new Field("doubleVal", FieldType.nullable(new ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)), null);
+
+            fieldBuilder.addField(childField1);
+            fieldBuilder.addField(childField2);
+
+            return fieldBuilder.build();
+        }
+
+        throw new IllegalArgumentException("Unsupported type " + type);
+    }
+
+    private static class TestUserDefinedFunctionHandler extends UserDefinedFunctionHandler
+    {
+        public Integer testScalarUDF(Integer col1, Integer col2)
+        {
+            return col1 + col2;
+        }
+
+        public Boolean testScalarUDFWithNullCheck(Integer col1)
+        {
+            if (col1 == null) {
+                return true;
+            }
+            return false;
+        }
+
+        public List<Integer> testListType(List<Integer> input)
+        {
+            return input.stream().map(val -> val + 1).collect(Collectors.toList());
+        }
+
+        public Map<String, Object> testRowType(Map<String, Object> input)
+        {
+            Integer intVal = (Integer) input.get("intVal");
+            Double doubleVal = (Double) input.get("doubleVal");
+
+            return ImmutableMap.of("intVal", intVal + 1, "doubleVal", doubleVal + 1.0);
+        }
+    }
+}
diff --git a/athena-federation-sdk/src/test/resources/log4j.properties b/athena-federation-sdk/src/test/resources/log4j.properties
new file mode 100644
index 0000000000..15b502c445
--- /dev/null
+++ b/athena-federation-sdk/src/test/resources/log4j.properties
@@ -0,0 +1,26 @@
+###
+# #%L
+# Amazon Athena Query Federation SDK
+# %%
+# Copyright (C) 2019 Amazon Web Services
+# %%
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+# #L%
+###
+log = .
+log4j.rootLogger = INFO, LAMBDA + +#Define the LAMBDA appender +log4j.appender.LAMBDA=com.amazonaws.services.lambda.runtime.log4j.LambdaAppender +log4j.appender.LAMBDA.layout=org.apache.log4j.PatternLayout +log4j.appender.LAMBDA.layout.conversionPattern=%d{yyyy-MM-dd HH:mm:ss} <%X{AWSRequestId}> %-5p %c{1}:%m%n diff --git a/athena-hbase/LICENSE.txt b/athena-hbase/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-hbase/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." 
+ + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. 
+ + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. \ No newline at end of file diff --git a/athena-hbase/README.md b/athena-hbase/README.md new file mode 100644 index 0000000000..060cb4fa7f --- /dev/null +++ b/athena-hbase/README.md @@ -0,0 +1,87 @@ +# Amazon Athena HBase Connector + +This connector enables Amazon Athena to communicate with your HBase instance(s), making your HBase data accessible via SQL. 
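+
+For example, once the connector is deployed, you can query your HBase data with standard SQL. The catalog, namespace, and table names below are placeholders for your own:
+
+```sql
+ select * from "my_hbase_catalog".my_namespace.my_table
+ ```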
+
+Unlike traditional relational data stores, HBase tables do not have a set schema. Each entry can have different fields and data types. While we are investigating the best way to support schema-on-read use cases for this connector, it presently supports two mechanisms for generating traditional table schema information. The default mechanism is for the connector to scan a small number of rows in your table in order to form a union of all fields and coerce fields with non-overlapping data types. This basic schema inference works well for tables that have mostly uniform entries. For more diverse tables, the connector supports retrieving metadata from the Glue Data Catalog. If the connector sees a Glue database and table which match your HBase namespace and table names, it will use the corresponding Glue table for schema. We recommend creating your Glue table such that it is a superset of all fields you may want to access from your HBase table.
+
+## Usage
+
+### Parameters
+
+The Athena HBase Connector supports several configuration options via Lambda environment variables. More detail on the available parameters can be found below.
+
+1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket)
+2. **spill_prefix** - (Optional) Defaults to a sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/hours.
+3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. Setting a KMS Key ID allows your Lambda function to use KMS for key generation for a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331)
+4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM, either with a randomly generated key or using KMS to generate keys. Setting this to True disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False)
+5. **disable_glue** - (Optional) If present, with any value, the connector will no longer attempt to retrieve supplemental metadata from Glue.
+6. **glue_catalog** - (Optional) Can be used to target a cross-account Glue catalog. By default the connector will attempt to get metadata from its own Glue account.
+7. **default_hbase** - If present, this HBase connection string (e.g. master_hostname:zookeeper_port:hbase_port) is used when there is not a catalog-specific environment variable (as explained below).
+
+You can also provide one or more properties which define the HBase connection details for the HBase instance(s) you'd like this connector to use. You can do this by setting a Lambda environment variable that corresponds to the catalog name you'd like to use in Athena. For example, if you'd like to query two different HBase instances from Athena, as in the below queries:
+
+```sql
+ select * from "hbase_instance_1".database.table
+ select * from "hbase_instance_2".database.table
+ ```
+
+To support these two SQL statements we'd need to add two environment variables to our Lambda function:
+
+1. **hbase_instance_1** - The value should be the HBase connection details in the format: master_hostname:zookeeper_port:hbase_port
+2. **hbase_instance_2** - The value should be the HBase connection details in the format: master_hostname:zookeeper_port:hbase_port
+
+You can also optionally use SecretsManager for part or all of the value for the preceding connection details. For example, if you set a Lambda environment variable for **hbase_instance_1** to be "${hbase_host_1}:${hbase_zookeeper_port_1}:${hbase_master_port_1}" the Athena Federation SDK will automatically attempt to retrieve a secret from AWS SecretsManager named "hbase_host_1" and inject that value in place of "${hbase_host_1}". It will do the same for the other secrets: hbase_zookeeper_port_1, hbase_master_port_1. In short, anything between ${...} is attempted as a secret lookup in SecretsManager. If no such secret exists, the text isn't replaced.
+
+
+### Setting Up Databases & Tables
+
+To enable a Glue Table for use with HBase, you simply need to have a Glue database and table that matches any HBase Namespace and Table that you'd like to supply supplemental metadata for (instead of relying on the HBase Connector's ability to infer schema). The connector's built-in schema inference only supports values serialized in HBase as Strings (e.g. String.valueOf(int)). You can enable a Glue table to be used for supplemental metadata by setting the below table properties from the Glue Console when editing the Table in question. The only other thing you need to do is ensure you use the appropriate data types and, optionally, HBase column family naming conventions.
+
+1. **hbase-metadata-flag** - Flag indicating that the table can be used for supplemental metadata by the Athena HBase Connector. The value is unimportant as long as this key is present in the properties of the table.
+2. **hbase-native-storage-flag** - This flag toggles the two modes of value serialization supported by the connector. By default (when this field is not present) the connector assumes all values are stored in HBase as strings. As such it will attempt to parse INT, BIGINT, DOUBLE, etc. from HBase as Strings. If this field is set (the value of the table property doesn't matter, only its presence) on the table in Glue, the connector will switch to 'native' storage mode and attempt to read INT, BIGINT, BIT, and DOUBLE as bytes by using ByteBuffer.wrap(value).getInt(), ByteBuffer.wrap(value).getLong(), ByteBuffer.wrap(value).get(), and ByteBuffer.wrap(value).getDouble(). A sketch contrasting these two modes appears in the Data Types section below.
+
+When it comes to setting your columns, you have two choices for how you model HBase column families. The Athena HBase connector supports fully qualified (aka flattened) naming like "family:column" as well as using STRUCTs to model your column families. In the STRUCT model the name of the STRUCT field should match the column family and then any children of that STRUCT should match the names of the columns in that family. Since predicate push down and columnar reads are not yet fully supported for complex types like STRUCTs, we recommend against the STRUCT approach unless your use case specifically requires it. The below image shows how we've configured a table in Glue using a combination of these approaches.
+
+ ![Glue Example Image](https://github.com/awslabs/aws-athena-query-federation/blob/master/docs/img/hbase_glue_example.png?raw=true)
+
+### Data Types
+
+All HBase values are retrieved as raw bytes.
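+
+The hedged sketch below illustrates the difference between the two storage modes for a BIGINT column. It is for illustration only, assuming a UTF-8 encoding in the default (string) mode; the class and method names are hypothetical and are not part of the connector's API.
+
+```java
+import java.nio.ByteBuffer;
+import java.nio.charset.StandardCharsets;
+
+//Hypothetical helper showing the two decode paths described above.
+public final class ValueDecoderSketch
+{
+    private ValueDecoderSketch() {}
+
+    //Default (string) storage mode: the cell bytes hold the value's String form.
+    public static long decodeBigIntFromString(byte[] value)
+    {
+        return Long.parseLong(new String(value, StandardCharsets.UTF_8));
+    }
+
+    //'native' storage mode: the cell bytes hold the fixed-width binary value.
+    public static long decodeBigIntNative(byte[] value)
+    {
+        return ByteBuffer.wrap(value).getLong();
+    }
+}
+```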
+In either mode, values are then converted to one of the below Apache Arrow data types used by the Athena Query Federation SDK, based on how you've defined your table(s) in Glue's DataCatalog. If you are not using Glue to supplement your metadata and are instead depending on the connector's schema inference capabilities, only a subset of the below data types will be used, namely: BIGINT, FLOAT8, VARCHAR.
+
+|Glue DataType|Apache Arrow Type|
+|-------------|-----------------|
+|int|INT|
+|bigint|BIGINT|
+|double|FLOAT8|
+|float|FLOAT4|
+|boolean|BIT|
+|binary|VARBINARY|
+|string|VARCHAR|
+
+
+### Required Permissions
+
+Review the "Policies" section of the athena-hbase.yaml file for full details on the IAM Policies required by this connector. A brief summary is below.
+
+1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3.
+2. SecretsManager Read Access - If you choose to store HBase endpoint details in SecretsManager you will need to grant the connector access to those secrets.
+3. Glue Data Catalog - Since HBase does not have a meta-data store, the connector requires Read-Only access to Glue's DataCatalog for obtaining HBase key to table/column mappings.
+4. VPC Access - In order to connect to your VPC for the purposes of communicating with your HBase instance(s), the connector needs the ability to attach/detach an interface to the VPC.
+5. CloudWatch Logs - This is a somewhat implicit permission when deploying a Lambda function but it needs access to CloudWatch Logs for storing logs.
+6. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated.
+
+### Deploying The Connector
+
+To use the Amazon Athena HBase Connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the below steps, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-hbase dir, run `mvn clean install`.
+3. From the athena-hbase dir, run `../tools/publish.sh S3_BUCKET_NAME athena-hbase` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This gives users with permission to do so the ability to deploy instances of the connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo)
+
+## Performance
+
+The Athena HBase Connector will attempt to parallelize queries against your HBase instance by reading each region server in parallel. Predicates are evaluated within the Lambda function and, where possible, pushed down into HBase using filters.
+
+## License
+
+This project is licensed under the Apache-2.0 License.
\ No newline at end of file
diff --git a/athena-hbase/athena-hbase.yaml b/athena-hbase/athena-hbase.yaml
new file mode 100644
index 0000000000..6cce9c041a
--- /dev/null
+++ b/athena-hbase/athena-hbase.yaml
@@ -0,0 +1,97 @@
+Transform: 'AWS::Serverless-2016-10-31'
+Metadata:
+  'AWS::ServerlessRepo::Application':
+    Name: AthenaHBaseConnector
+    Description: 'This connector enables Amazon Athena to communicate with your HBase instance(s), making your HBase data accessible via SQL.'
+    Author: 'Amazon Athena'
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels:
+      - athena-federation
+    HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+Parameters:
+  AthenaCatalogName:
+    Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.'
+    Type: String
+  SpillBucket:
+    Description: 'The bucket where this function can spill data.'
+    Type: String
+    Default: athena-federation-spill
+  SpillPrefix:
+    Description: 'The bucket prefix where this function can spill large responses.'
+    Type: String
+    Default: athena-spill
+  LambdaTimeout:
+    Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)'
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: 'Lambda memory in MB (min 128 - 3008 max).'
+    Default: 3008
+    Type: Number
+  DisableSpillEncryption:
+    Description: "WARNING: If set to 'true' encryption for spilled data is disabled."
+    Default: 'false'
+    Type: String
+  SecurityGroupIds:
+    Description: 'One or more SecurityGroup IDs corresponding to the SecurityGroup that should be applied to the Lambda function. (e.g. sg1,sg2,sg3)'
+    Type: 'List<AWS::EC2::SecurityGroup::Id>'
+  SubnetIds:
+    Description: 'One or more Subnet IDs corresponding to the Subnet that the Lambda function can use to access your data source. (e.g. subnet1,subnet2)'
+    Type: 'List<AWS::EC2::Subnet::Id>'
+  SecretNameOrPrefix:
+    Description: 'The name or prefix of a set of names within Secrets Manager that this function should have access to. (e.g. hbase-*).'
+    Type: String
+  HBaseConnectionString:
+    Description: 'The HBase connection details to use by default in the format: master_hostname:zookeeper_port:hbase_port and optionally using SecretsManager (e.g. ${secret_name}).'
+    Type: String
+Resources:
+  ConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      Environment:
+        Variables:
+          disable_spill_encryption: !Ref DisableSpillEncryption
+          spill_bucket: !Ref SpillBucket
+          spill_prefix: !Ref SpillPrefix
+          default_hbase: !Ref HBaseConnectionString
+      FunctionName: !Ref AthenaCatalogName
+      Handler: "com.amazonaws.athena.connectors.hbase.HbaseCompositeHandler"
+      CodeUri: "./target/athena-hbase-1.0.jar"
+      Description: "Enables Amazon Athena to communicate with HBase, making your HBase data accessible via SQL"
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
+      Policies:
+        - Statement:
+            - Action:
+                - secretsmanager:GetSecretValue
+              Effect: Allow
+              Resource: !Sub 'arn:aws:secretsmanager:*:*:secret:${SecretNameOrPrefix}'
+          Version: '2012-10-17'
+        - Statement:
+            - Action:
+                - glue:GetTableVersions
+                - glue:GetPartitions
+                - glue:GetTables
+                - glue:GetTableVersion
+                - glue:GetDatabases
+                - glue:GetTable
+                - glue:GetPartition
+                - glue:GetDatabase
+                - athena:GetQueryExecution
+              Effect: Allow
+              Resource: '*'
+          Version: '2012-10-17'
+        #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy
+        #with one that is more restrictive and can only 'put' but not read,delete, or overwrite files.
+        - S3CrudPolicy:
+            BucketName: !Ref SpillBucket
+        #VPCAccessPolicy allows our connector to run in a VPC so that it can access your data source.
+        - VPCAccessPolicy: {}
+      VpcConfig:
+        SecurityGroupIds: !Ref SecurityGroupIds
+        SubnetIds: !Ref SubnetIds
\ No newline at end of file
diff --git a/athena-hbase/pom.xml b/athena-hbase/pom.xml
new file mode 100644
index 0000000000..55624efb26
--- /dev/null
+++ b/athena-hbase/pom.xml
@@ -0,0 +1,76 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-hbase</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-glue</artifactId>
+            <version>1.11.490</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.hbase</groupId>
+            <artifactId>hbase-client</artifactId>
+            <version>1.4.10</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.httpcomponents</groupId>
+            <artifactId>httpclient</artifactId>
+            <version>4.5.6</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                        <configuration>
+                            <filters>
+                                <filter>
+                                    <artifact>*:*</artifact>
+                                    <excludes>
+                                        <exclude>org/apache/hadoop/yarn/webapp/**</exclude>
+                                        <exclude>org/apache/hadoop/mapred/**</exclude>
+                                        <exclude>org/apache/hadoop/mapreduce/**</exclude>
+                                        <exclude>org/apache/hadoop/yarn/**</exclude>
+                                        <exclude>org/apache/curator/**</exclude>
+                                        <exclude>org/apache/directory/**</exclude>
+                                        <exclude>META-INF/*.SF</exclude>
+                                        <exclude>META-INF/*.DSA</exclude>
+                                        <exclude>META-INF/*.RSA</exclude>
+                                    </excludes>
+                                </filter>
+                            </filters>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseCompositeHandler.java b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseCompositeHandler.java
new file mode 100644
index 0000000000..9dc9eb7f4c
--- /dev/null
+++ b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseCompositeHandler.java
@@ -0,0 +1,35 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.hbase;
+
+import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler;
+
+/**
+ * Boilerplate composite handler that allows us to use a single Lambda function for both
+ * Metadata and Data. In this case we just compose HbaseMetadataHandler and HbaseRecordHandler.
+ */
+public class HbaseCompositeHandler
+        extends CompositeHandler
+{
+    public HbaseCompositeHandler()
+    {
+        super(new HbaseMetadataHandler(), new HbaseRecordHandler());
+    }
+}
diff --git a/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactory.java b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactory.java
new file mode 100644
index 0000000000..ebe2eaa1eb
--- /dev/null
+++ b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactory.java
@@ -0,0 +1,156 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.hbase;
+
+import org.apache.arrow.util.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.HBaseConfiguration;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.ConnectionFactory;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Creates and caches HBase Connection instances, using the connection string as the cache key.
+ *
+ * @Note Connection String format is expected to be host:master_port:zookeeper_port
+ */
+public class HbaseConnectionFactory
+{
+    private static final Logger logger = LoggerFactory.getLogger(HbaseConnectionFactory.class);
+
+    private final Map<String, Connection> clientCache = new HashMap<>();
+
+    private final Map<String, String> defaultClientConfig = new HashMap<>();
+
+    public HbaseConnectionFactory()
+    {
+        setClientConfig("hbase.rpc.timeout", "2000");
+        setClientConfig("hbase.client.retries.number", "3");
+        setClientConfig("hbase.client.pause", "500");
+        setClientConfig("zookeeper.recovery.retry", "2");
+    }
+
+    /**
+     * Used to set HBase client config options that should be applied to all future connections.
+     *
+     * @param name The name of the property (e.g. hbase.rpc.timeout).
+     * @param value The value of the property to set on the HBase client config object before construction.
+     */
+    public synchronized void setClientConfig(String name, String value)
+    {
+        defaultClientConfig.put(name, value);
+    }
+
+    /**
+     * Provides access to the current HBase client config options used during connection construction.
+     *
+     * @return Map where the key is the config name and the value is the config value.
+     * @note This can be helpful when logging diagnostic info.
+     */
+    public synchronized Map<String, String> getClientConfigs()
+    {
+        return Collections.unmodifiableMap(defaultClientConfig);
+    }
+
+    /**
+     * Gets or creates an HBase connection for the given connection string.
+     *
+     * @param conStr HBase connection details, in the expected format host:master_port:zookeeper_port
+     * @return An HBase connection if the connection succeeded, else the function will throw.
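+     * @note Example (a sketch; the host and ports are hypothetical): getOrCreateConn("myhost:16000:2181")
+     * yields a connection to an HBase master at myhost:16000 using a ZooKeeper ensemble at myhost:2181, and
+     * repeated calls with the same string reuse the cached connection if it still passes connectionTest().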
+     */
+    public synchronized Connection getOrCreateConn(String conStr)
+    {
+        logger.info("getOrCreateConn: enter");
+        Connection conn = clientCache.get(conStr);
+
+        if (conn == null || !connectionTest(conn)) {
+            String[] endpointParts = conStr.split(":");
+            if (endpointParts.length == 3) {
+                conn = createConnection(endpointParts[0], endpointParts[1], endpointParts[2]);
+                clientCache.put(conStr, conn);
+            }
+            else {
+                throw new IllegalArgumentException("HBase endpoint format error; expected host:master_port:zookeeper_port.");
+            }
+        }
+
+        logger.info("getOrCreateConn: exit");
+        return conn;
+    }
+
+    private Connection createConnection(String host, String masterPort, String zookeeperPort)
+    {
+        try {
+            logger.info("createConnection: enter");
+            Configuration config = HBaseConfiguration.create();
+            config.set("hbase.zookeeper.quorum", host);
+            config.set("hbase.zookeeper.property.clientPort", zookeeperPort);
+            config.set("hbase.master", host + ":" + masterPort);
+            for (Map.Entry<String, String> nextConfig : defaultClientConfig.entrySet()) {
+                logger.info("createConnection: applying client config {}:{}", nextConfig.getKey(), nextConfig.getValue());
+                config.set(nextConfig.getKey(), nextConfig.getValue());
+            }
+            Connection conn = ConnectionFactory.createConnection(config);
+            logger.info("createConnection: hbase.zookeeper.quorum:" + config.get("hbase.zookeeper.quorum"));
+            logger.info("createConnection: hbase.zookeeper.property.clientPort:" + config.get("hbase.zookeeper.property.clientPort"));
+            logger.info("createConnection: hbase.master:" + config.get("hbase.master"));
+            return conn;
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Runs a 'quick' test on the connection and then returns it if it passes.
+     */
+    private boolean connectionTest(Connection conn)
+    {
+        try {
+            logger.info("connectionTest: Testing connection started.");
+            conn.getAdmin().listTableNames();
+            logger.info("connectionTest: Testing connection completed - success.");
+            return true;
+        }
+        catch (RuntimeException | IOException ex) {
+            logger.warn("connectionTest: Exception while testing existing connection.", ex);
+        }
+        logger.info("connectionTest: Testing connection completed - fail.");
+        return false;
+    }
+
+    /**
+     * Injects a connection into the client cache.
+     *
+     * @param conStr The connection string (aka the cache key).
+     * @param conn The connection to inject into the client cache, most often a Mock used in testing.
+     */
+    @VisibleForTesting
+    protected synchronized void addConnection(String conStr, Connection conn)
+    {
+        clientCache.put(conStr, conn);
+    }
+}
diff --git a/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseFieldResolver.java b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseFieldResolver.java
new file mode 100644
index 0000000000..627f5b2aa6
--- /dev/null
+++ b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseFieldResolver.java
@@ -0,0 +1,74 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.hbase;
+
+import com.amazonaws.athena.connector.lambda.data.FieldResolver;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.hadoop.hbase.client.Result;
+
+/**
+ * Used to resolve and convert complex types from HBase to Apache Arrow's type system
+ * when using BlockUtils.setComplexValue(...).
+ */
+public class HbaseFieldResolver
+        implements FieldResolver
+{
+    private final byte[] family;
+    private final boolean isNative;
+
+    /**
+     * @param isNative True if the values are stored as native byte arrays in HBase.
+     * @param family The HBase column family that this field resolver is for.
+     */
+    public HbaseFieldResolver(boolean isNative, byte[] family)
+    {
+        this.isNative = isNative;
+        this.family = family;
+    }
+
+    /**
+     * Static construction helper.
+     *
+     * @param isNative True if the values are stored as native byte arrays in HBase.
+     * @param family The HBase column family that this field resolver is for.
+     */
+    public static HbaseFieldResolver resolver(boolean isNative, String family)
+    {
+        return new HbaseFieldResolver(isNative, family.getBytes());
+    }
+
+    /**
+     * @param field The Apache Arrow field we'd like to extract from the val.
+     * @param val The value from which we'd like to extract the provided field.
+     * @return Object containing the value for the requested field.
+     * @see FieldResolver in the Athena Query Federation SDK
+     */
+    @Override
+    public Object getFieldValue(Field field, Object val)
+    {
+        if (!(val instanceof Result)) {
+            String clazz = (val != null) ? val.getClass().getName() : "null";
+            throw new IllegalArgumentException("Expected value of type Result but found " + clazz);
+        }
+
+        byte[] rawFieldValue = ((Result) val).getValue(family, field.getName().getBytes());
+        return HbaseSchemaUtils.coerceType(isNative, field.getType(), rawFieldValue);
+    }
+}
diff --git a/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseMetadataHandler.java b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseMetadataHandler.java
new file mode 100644
index 0000000000..54219c6c43
--- /dev/null
+++ b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseMetadataHandler.java
@@ -0,0 +1,291 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.hbase; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequest; +import com.amazonaws.athena.connector.lambda.metadata.glue.GlueFieldLexer; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.AWSGlueClientBuilder; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.hadoop.hbase.HRegionInfo; +import org.apache.hadoop.hbase.NamespaceDescriptor; +import org.apache.hadoop.hbase.TableName; +import org.apache.hadoop.hbase.client.Admin; +import org.apache.hadoop.hbase.client.Connection; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +/** + * Handles metadata requests for the Athena HBase Connector. + *

+ * For more detail, please see the module's README.md. Some notable characteristics of this class include:
+ *

+ * 1. Uses a Glue table property (hbase-metadata-flag) to indicate that the table (whose name matches the HBase table
+ * name) can indeed be used to supplement metadata from HBase itself.
+ * 2. Uses a Glue table property (hbase-native-storage-flag) to indicate that the table is stored in HBase
+ * using native byte storage (e.g. int as 4 BYTES instead of int serialized as a String).
+ * 3. Attempts to resolve sensitive fields such as HBase connection strings via SecretsManager so that you can
+ * substitute variables with values from Secrets Manager by doing something like hostname:port:password=${my_secret}
+ */
+public class HbaseMetadataHandler
+        extends GlueMetadataHandler
+{
+    //FLAG used to indicate the given table is stored using HBase native formatting, not as strings.
+    protected static final String HBASE_NATIVE_STORAGE_FLAG = "hbase-native-storage-flag";
+    //Field name used to store the connection string as a property on Split objects.
+    protected static final String HBASE_CONN_STR = "connStr";
+    //Field name used to store the HBase scan start key as a property on Split objects.
+    protected static final String START_KEY_FIELD = "start_key";
+    //Field name used to store the HBase scan end key as a property on Split objects.
+    protected static final String END_KEY_FIELD = "end_key";
+    //Field name used to store the HBase region id as a property on Split objects.
+    protected static final String REGION_ID_FIELD = "region_id";
+    //Field name used to store the HBase region name as a property on Split objects.
+    protected static final String REGION_NAME_FIELD = "region_name";
+    private static final Logger logger = LoggerFactory.getLogger(HbaseMetadataHandler.class);
+    //The Env variable name used to store the default HBase connection string if no catalog specific
+    //env variable is set.
+    private static final String DEFAULT_HBASE = "default_hbase";
+    //The Glue table property that indicates that a table matching the name of an HBase table
+    //is indeed enabled for use by this connector.
+    private static final String HBASE_METADATA_FLAG = "hbase-metadata-flag";
+    //Used to filter out Glue tables which lack the HBase metadata flag.
+    private static final TableFilter TABLE_FILTER = (Table table) -> table.getParameters().containsKey(HBASE_METADATA_FLAG);
+    //The Env variable name used to indicate that we want to disable the use of the Glue DataCatalog for supplemental
+    //metadata and instead rely solely on the connector's schema inference capabilities.
+    private static final String GLUE_ENV_VAR = "disable_glue";
+    //Used to denote the 'type' of this connector for diagnostic purposes.
+    private static final String SOURCE_TYPE = "hbase";
+    //The number of rows to scan when attempting to infer schema from an HBase table.
+    private static final int NUM_ROWS_TO_SCAN = 10;
+    private final AWSGlue awsGlue;
+    private final HbaseConnectionFactory connectionFactory;
+
+    public HbaseMetadataHandler()
+    {
+        super((System.getenv(GLUE_ENV_VAR) == null) ?
+                AWSGlueClientBuilder.standard().build() : null, SOURCE_TYPE);
+        this.awsGlue = getAwsGlue();
+        this.connectionFactory = new HbaseConnectionFactory();
+    }
+
+    @VisibleForTesting
+    protected HbaseMetadataHandler(AWSGlue awsGlue,
+            EncryptionKeyFactory keyFactory,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            HbaseConnectionFactory connectionFactory,
+            String spillBucket,
+            String spillPrefix)
+    {
+        super(awsGlue, keyFactory, secretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix);
+        this.awsGlue = awsGlue;
+        this.connectionFactory = connectionFactory;
+    }
+
+    private Connection getOrCreateConn(MetadataRequest request)
+    {
+        String endpoint = resolveSecrets(getConnStr(request));
+        return connectionFactory.getOrCreateConn(endpoint);
+    }
+
+    /**
+     * Retrieves the HBase connection details from an env variable matching the catalog name; if no such
+     * env variable exists, we fall back to the default env variable defined by DEFAULT_HBASE.
+     */
+    private String getConnStr(MetadataRequest request)
+    {
+        String conStr = System.getenv(request.getCatalogName());
+        if (conStr == null) {
+            logger.info("getConnStr: No environment variable found for catalog {}, using default {}",
+                    request.getCatalogName(), DEFAULT_HBASE);
+            conStr = System.getenv(DEFAULT_HBASE);
+        }
+        return conStr;
+    }
+
+    /**
+     * Lists namespaces in your HBase instance, treating each as a 'schema' (aka database).
+     *
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest request)
+            throws IOException
+    {
+        Connection conn = getOrCreateConn(request);
+        Admin admin = conn.getAdmin();
+        List<String> schemas = new ArrayList<>();
+        NamespaceDescriptor[] namespaces = admin.listNamespaceDescriptors();
+        for (int i = 0; i < namespaces.length; i++) {
+            NamespaceDescriptor namespace = namespaces[i];
+            schemas.add(namespace.getName());
+        }
+        return new ListSchemasResponse(request.getCatalogName(), schemas);
+    }
+
+    /**
+     * Lists tables in the requested schema in your HBase instance, treating the requested schema as an HBase
+     * namespace.
+     *
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest request)
+            throws IOException
+    {
+        Connection conn = getOrCreateConn(request);
+        Admin admin = conn.getAdmin();
+        List<com.amazonaws.athena.connector.lambda.domain.TableName> tableNames = new ArrayList<>();
+
+        TableName[] tables = admin.listTableNamesByNamespace(request.getSchemaName());
+        for (int i = 0; i < tables.length; i++) {
+            TableName tableName = tables[i];
+            tableNames.add(new com.amazonaws.athena.connector.lambda.domain.TableName(request.getSchemaName(),
+                    tableName.getNameAsString().replace(request.getSchemaName() + ":", "")));
+        }
+        return new ListTablesResponse(request.getCatalogName(), tableNames);
+    }
+
+    /**
+     * If Glue is enabled as a source of supplemental metadata, we look up the requested Schema/Table in Glue and
+     * filter out any results that don't have the HBASE_METADATA_FLAG set. If no matching results were found in Glue,
+     * then we resort to inferring the schema of the HBase table using HbaseSchemaUtils.inferSchema(...). If there
+     * is no such table in HBase, the operation will fail.
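+     * For example (a sketch of the behavior): a Glue table whose parameters include hbase-metadata-flag is used
+     * as-is, while a table that exists only in HBase falls back to scanning up to NUM_ROWS_TO_SCAN (10) rows via
+     * HbaseSchemaUtils.inferSchema(...) to guess column types.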
+     *
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request)
+            throws Exception
+    {
+        logger.info("doGetTable: enter - {}", request.getTableName());
+        Schema origSchema = null;
+        try {
+            if (awsGlue != null) {
+                origSchema = super.doGetTable(blockAllocator, request, TABLE_FILTER).getSchema();
+            }
+        }
+        catch (RuntimeException ex) {
+            logger.warn("doGetTable: Unable to retrieve table[{}:{}] from AWSGlue.",
+                    request.getTableName().getSchemaName(),
+                    request.getTableName().getTableName(),
+                    ex);
+        }
+
+        if (origSchema == null) {
+            Connection conn = getOrCreateConn(request);
+            origSchema = HbaseSchemaUtils.inferSchema(conn, request.getTableName(), NUM_ROWS_TO_SCAN);
+        }
+
+        SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
+        origSchema.getFields().forEach((Field field) ->
+                schemaBuilder.addField(field.getName(), field.getType(), field.getChildren())
+        );
+
+        origSchema.getCustomMetadata().entrySet().forEach((Map.Entry<String, String> meta) ->
+                schemaBuilder.addMetadata(meta.getKey(), meta.getValue()));
+
+        schemaBuilder.addField(HbaseSchemaUtils.ROW_COLUMN_NAME, Types.MinorType.VARCHAR.getType());
+
+        Schema schema = schemaBuilder.build();
+        logger.info("doGetTable: return {}", schema);
+        return new GetTableResponse(request.getCatalogName(), request.getTableName(), schema);
+    }
+
+    /**
+     * Our table doesn't support complex layouts or partitioning, so we leave this as a NoOp; the SDK will notice that we
+     * do not have any partition columns, nor have we set any custom fields using enhancePartitionSchema(...), and as a
+     * result the SDK will generate a single placeholder partition for us. This is because we need to convey that there is
+     * at least 1 partition to read as part of the query, or Athena will assume partition pruning found no candidate
+     * layouts to read.
+     *
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+    {
+        //NoOp
+    }
+
+    /**
+     * If the table is spread across multiple region servers, we parallelize the scan by making each region a split.
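+     * For example (illustrative): a table with four regions produces four Splits, each carrying start_key,
+     * end_key, region_id, and region_name so that the record handler can scan only that region's key range.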
+     *
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest request)
+            throws IOException
+    {
+        Set<Split> splits = new HashSet<>();
+        Connection conn = getOrCreateConn(request);
+        Admin admin = conn.getAdmin();
+
+        //We can read each region in parallel
+        for (HRegionInfo info : admin.getTableRegions(HbaseSchemaUtils.getQualifiedTable(request.getTableName()))) {
+            Split.Builder splitBuilder = Split.newBuilder(makeSpillLocation(request), makeEncryptionKey())
+                    .add(HBASE_CONN_STR, getConnStr(request))
+                    .add(START_KEY_FIELD, new String(info.getStartKey()))
+                    .add(END_KEY_FIELD, new String(info.getEndKey()))
+                    .add(REGION_ID_FIELD, String.valueOf(info.getRegionId()))
+                    .add(REGION_NAME_FIELD, info.getRegionNameAsString());
+
+            splits.add(splitBuilder.build());
+        }
+
+        return new GetSplitsResponse(request.getCatalogName(), splits, null);
+    }
+
+    /**
+     * @see GlueMetadataHandler
+     */
+    @Override
+    protected Field convertField(String name, String glueType)
+    {
+        return GlueFieldLexer.lex(name, glueType);
+    }
+}
diff --git a/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseRecordHandler.java b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseRecordHandler.java
new file mode 100644
index 0000000000..1047118ebd
--- /dev/null
+++ b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseRecordHandler.java
@@ -0,0 +1,248 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.hbase; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.hadoop.hbase.client.Connection; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.client.ResultScanner; +import org.apache.hadoop.hbase.client.Scan; +import org.apache.hadoop.hbase.client.Table; +import org.apache.hadoop.hbase.filter.CompareFilter; +import org.apache.hadoop.hbase.filter.Filter; +import org.apache.hadoop.hbase.filter.SingleColumnValueFilter; +import org.apache.hadoop.hbase.util.Bytes; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.Map; + +import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.END_KEY_FIELD; +import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.HBASE_CONN_STR; +import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.HBASE_NATIVE_STORAGE_FLAG; +import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.START_KEY_FIELD; +import static java.nio.charset.StandardCharsets.UTF_8; + +/** + * Handles data read record requests for the Athena HBase Connector. + *

+ * For more detail, please see the module's README.md. Some notable characteristics of this class include:
+ *

+ * 1. Supporting String and native 'byte[]' storage.
+ * 2. Attempts to resolve sensitive configuration fields such as the HBase connection string via SecretsManager so that
+ * you can substitute variables with values from Secrets Manager by doing something like hostname:port:password=${my_secret}
+ */
+public class HbaseRecordHandler
+        extends RecordHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(HbaseRecordHandler.class);
+
+    //Used to denote the 'type' of this connector for diagnostic purposes.
+    private static final String SOURCE_TYPE = "hbase";
+
+    private final AmazonS3 amazonS3;
+    private final HbaseConnectionFactory connectionFactory;
+
+    public HbaseRecordHandler()
+    {
+        this(AmazonS3ClientBuilder.defaultClient(),
+                AWSSecretsManagerClientBuilder.defaultClient(),
+                AmazonAthenaClientBuilder.defaultClient(),
+                new HbaseConnectionFactory());
+    }
+
+    @VisibleForTesting
+    protected HbaseRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena, HbaseConnectionFactory connectionFactory)
+    {
+        super(amazonS3, secretsManager, athena, SOURCE_TYPE);
+        this.amazonS3 = amazonS3;
+        this.connectionFactory = connectionFactory;
+    }
+
+    private Connection getOrCreateConn(String conStr)
+    {
+        String endpoint = resolveSecrets(conStr);
+        return connectionFactory.getOrCreateConn(endpoint);
+    }
+
+    /**
+     * Scans HBase using the scan settings set on the requested Split by HbaseMetadataHandler.
+     *
+     * @see RecordHandler
+     */
+    @Override
+    protected void readWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest request, QueryStatusChecker queryStatusChecker)
+            throws IOException
+    {
+        Schema projection = request.getSchema();
+        Split split = request.getSplit();
+        String conStr = split.getProperty(HBASE_CONN_STR);
+        boolean isNative = projection.getCustomMetadata().get(HBASE_NATIVE_STORAGE_FLAG) != null;
+
+        //setup the scan so that we only read the key range associated with the region represented by our Split.
+        Scan scan = new Scan(split.getProperty(START_KEY_FIELD).getBytes(), split.getProperty(END_KEY_FIELD).getBytes());
+
+        //attempt to push down a partial predicate using HBase Filters
+        scan.setFilter(pushdownPredicate(isNative, request.getConstraints()));
+
+        //setup the projection so we only pull columns/families that we need
+        for (Field next : request.getSchema().getFields()) {
+            addToProjection(scan, next);
+        }
+
+        Connection conn = getOrCreateConn(conStr);
+        Table table = conn.getTable(HbaseSchemaUtils.getQualifiedTable(request.getTableName()));
+
+        try (ResultScanner scanner = table.getScanner(scan)) {
+            for (Result row : scanner) {
+                if (!queryStatusChecker.isQueryRunning()) {
+                    return;
+                }
+                blockSpiller.writeRows((Block block, int rowNum) -> {
+                    boolean match = true;
+                    for (Field field : projection.getFields()) {
+                        if (match) {
+                            match &= writeField(block, field, isNative, row, rowNum);
+                        }
+                    }
+                    return match ? 1 : 0;
+                });
+            }
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Used to filter and write field values from the HBase scan to the response block.
+     *
+     * @param block The Block we should write to.
+     * @param field The Apache Arrow Field we need to write.
+     * @param isNative Boolean indicating if the HBase value is stored as a String (false) or as native byte[] (true).
+     * @param row The HBase row from which we should extract a value for the given field.
+     * @param rowNum The row number to write into on the vector.
+     * @return True if the value passed the ConstraintEvaluator's test.
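+     * @note Example (illustrative, using the sample_data.hbase layout): for the field "summary:amount" this reads
+     * row.getValue("summary".getBytes(), "amount".getBytes()) and coerces the resulting bytes to the field's Arrow
+     * type via HbaseSchemaUtils.coerceType(...).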
+     */
+    private boolean writeField(Block block, Field field, boolean isNative, Result row, int rowNum)
+    {
+        String fieldName = field.getName();
+        ArrowType type = field.getType();
+        Types.MinorType minorType = Types.getMinorTypeForArrowType(type);
+        try {
+            //Is this field the special 'row' field that can be used to group column families that may
+            //have been spread across different region servers if they are needed in the same query.
+            if (HbaseSchemaUtils.ROW_COLUMN_NAME.equals(fieldName)) {
+                String value = Bytes.toString(row.getRow());
+                return block.offerValue(fieldName, rowNum, value);
+            }
+
+            switch (minorType) {
+                case STRUCT:
+                    //Column is actually a Column Family stored as a STRUCT.
+                    return block.offerComplexValue(fieldName,
+                            rowNum,
+                            HbaseFieldResolver.resolver(isNative, fieldName),
+                            row);
+                default:
+                    //We expect the column name format to be family:column
+                    String[] columnParts = HbaseSchemaUtils.extractColumnParts(fieldName);
+                    byte[] rawValue = row.getValue(columnParts[0].getBytes(), columnParts[1].getBytes());
+                    Object value = HbaseSchemaUtils.coerceType(isNative, type, rawValue);
+                    return block.offerValue(fieldName, rowNum, value);
+            }
+        }
+        catch (RuntimeException ex) {
+            throw new RuntimeException("Exception while processing field " + fieldName + " type " + minorType, ex);
+        }
+    }
+
+    /**
+     * Adds the specified Apache Arrow field to the Scan to satisfy the requested projection.
+     *
+     * @param scan The scan object that will be used to read data from HBase.
+     * @param field The field to be added to the scan.
+     */
+    private void addToProjection(Scan scan, Field field)
+    {
+        //ignore the special 'row' column since we get that by default.
+        if (HbaseSchemaUtils.ROW_COLUMN_NAME.equalsIgnoreCase(field.getName())) {
+            return;
+        }
+
+        Types.MinorType columnType = Types.getMinorTypeForArrowType(field.getType());
+        switch (columnType) {
+            case STRUCT:
+                for (Field child : field.getChildren()) {
+                    scan.addColumn(field.getName().getBytes(UTF_8), child.getName().getBytes(UTF_8));
+                }
+                return;
+            default:
+                String[] nameParts = HbaseSchemaUtils.extractColumnParts(field.getName());
+                if (nameParts.length != 2) {
+                    throw new RuntimeException("Column name " + field.getName() + " does not meet the family:column HBase convention.");
+                }
+                scan.addColumn(nameParts[0].getBytes(UTF_8), nameParts[1].getBytes(UTF_8));
+        }
+    }
+
+    /**
+     * Attempts to push down a basic Filter predicate into HBase.
+     *
+     * @param isNative True if the values are stored in HBase as native byte[] vs being serialized as Strings.
+     * @param constraints The constraints that we can attempt to push into HBase as part of the scan.
+     * @return A filter if we found a predicate we can push down, null otherwise.
+     * @note Currently this method only supports constraints that can be represented by HBase's SingleColumnValueFilter
+     * and a CompareOp of EQUAL. In the future we can add > and < for certain field types.
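+     * @note Example (a sketch; the predicate is hypothetical): for a constraint equivalent to
+     * "summary:status" = 'FUNDED', this returns new SingleColumnValueFilter("summary".getBytes(),
+     * "status".getBytes(), CompareFilter.CompareOp.EQUAL, HbaseSchemaUtils.toBytes(isNative, "FUNDED")).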
+     */
+    private Filter pushdownPredicate(boolean isNative, Constraints constraints)
+    {
+        for (Map.Entry<String, ValueSet> next : constraints.getSummary().entrySet()) {
+            if (next.getValue().isSingleValue() && !next.getValue().isNullAllowed()) {
+                String[] colParts = HbaseSchemaUtils.extractColumnParts(next.getKey());
+                return new SingleColumnValueFilter(colParts[0].getBytes(),
+                        colParts[1].getBytes(),
+                        CompareFilter.CompareOp.EQUAL,
+                        HbaseSchemaUtils.toBytes(isNative, next.getValue().getSingleValue()));
+            }
+        }
+
+        return null;
+    }
+}
diff --git a/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseSchemaUtils.java b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseSchemaUtils.java
new file mode 100644
index 0000000000..96926cc2a8
--- /dev/null
+++ b/athena-hbase/src/main/java/com/amazonaws/athena/connectors/hbase/HbaseSchemaUtils.java
@@ -0,0 +1,277 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.hbase;
+
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.apache.arrow.vector.util.Text;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.apache.hadoop.hbase.filter.PageFilter;
+import org.apache.hadoop.hbase.util.Bytes;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.nio.ByteBuffer;
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Collection of helpful utilities that handle HBase schema inference, type, and naming conversion.
+ */
+public class HbaseSchemaUtils
+{
+    //Field name for the special 'row' column, which represents the HBase key used to store a given row.
+    protected static final String ROW_COLUMN_NAME = "row";
+    //The HBase namespace qualifier character, which commonly separates namespaces and column families from tables and columns.
+    protected static final String NAMESPACE_QUALIFIER = ":";
+    private static final Logger logger = LoggerFactory.getLogger(HbaseSchemaUtils.class);
+
+    private HbaseSchemaUtils() {}
+
+    /**
+     * This method will produce an Apache Arrow Schema for the given TableName and HBase connection
+     * by scanning up to the requested number of rows and using basic schema inference to determine
+     * data types.
+     *
+     * @param client The HBase connection to use for the scan operation.
+     * @param tableName The HBase TableName for which to produce an Apache Arrow Schema.
+     * @param numToScan The number of records to scan as part of producing the Schema.
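+     * @note Example usage (a sketch; the connection and the sample_data.hbase table are assumed to exist):
+     *       Schema schema = inferSchema(conn, new TableName("hbase_payments", "transactions"), 10);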
+     * @return An Apache Arrow Schema representing the schema of the HBase table.
+     * @note The resulting schema is a union of the schema of every row that is scanned. Any time two rows
+     * have a field with the same name but different inferred types, the code will default the type of
+     * that field in the resulting schema to a VARCHAR. This approach is not perfect and can struggle
+     * to produce a usable schema if the table has a significant mix of entities.
+     */
+    public static Schema inferSchema(Connection client, TableName tableName, int numToScan)
+    {
+        Map<String, Map<String, ArrowType>> schemaInference = new HashMap<>();
+        Scan scan = new Scan().setMaxResultSize(numToScan).setFilter(new PageFilter(numToScan));
+        try (Table table = client.getTable(org.apache.hadoop.hbase.TableName.valueOf(getQualifiedTableName(tableName)));
+                ResultScanner scanner = table.getScanner(scan)) {
+            for (Result result : scanner) {
+                for (KeyValue keyValue : result.list()) {
+                    String family = new String(keyValue.getFamily());
+                    String column = new String(keyValue.getQualifier());
+
+                    Map<String, ArrowType> schemaForFamily = schemaInference.get(family);
+                    if (schemaForFamily == null) {
+                        schemaForFamily = new HashMap<>();
+                        schemaInference.put(family, schemaForFamily);
+                    }
+
+                    //Get the previously inferred type for this column if we've seen it on a past row
+                    ArrowType prevInferredType = schemaForFamily.get(column);
+
+                    //Infer the type of the column from the value on the current row.
+                    Types.MinorType inferredType = inferType(keyValue.getValue());
+
+                    //Check if the previous and currently inferred types match
+                    if (prevInferredType != null && Types.getMinorTypeForArrowType(prevInferredType) != inferredType) {
+                        logger.info("inferSchema: Type change detected for field, using VARCHAR - family: {} col: {} previousType: {} newType: {}",
+                                family, column, prevInferredType, inferredType);
+                        schemaForFamily.put(column, Types.MinorType.VARCHAR.getType());
+                    }
+                    else {
+                        schemaForFamily.put(column, inferredType.getType());
+                    }
+
+                    logger.info("inferSchema: family: {} col: {} inferredType: {}", family, column, inferredType);
+                }
+            }
+
+            //Use the union of all rows to produce our resulting Apache Arrow Schema.
+            SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
+            for (Map.Entry<String, Map<String, ArrowType>> nextFamily : schemaInference.entrySet()) {
+                String family = nextFamily.getKey();
+                for (Map.Entry<String, ArrowType> nextCol : nextFamily.getValue().entrySet()) {
+                    schemaBuilder.addField(family + NAMESPACE_QUALIFIER + nextCol.getKey(), nextCol.getValue());
+                }
+            }
+
+            return schemaBuilder.build();
+        }
+        catch (IOException ex) {
+            throw new RuntimeException(ex);
+        }
+    }
+
+    /**
+     * Helper which goes from an Athena Federation SDK TableName to an HBase table name string.
+     *
+     * @param tableName An Athena Federation SDK TableName.
+     * @return The corresponding HBase table name string.
+     */
+    public static String getQualifiedTableName(TableName tableName)
+    {
+        return tableName.getSchemaName() + NAMESPACE_QUALIFIER + tableName.getTableName();
+    }
+
+    /**
+     * Helper which goes from an Athena Federation SDK TableName to an HBase TableName.
+     *
+     * @param tableName An Athena Federation SDK TableName.
+     * @return The corresponding HBase TableName.
+     */
+    public static org.apache.hadoop.hbase.TableName getQualifiedTable(TableName tableName)
+    {
+        return org.apache.hadoop.hbase.TableName.valueOf(tableName.getSchemaName() + NAMESPACE_QUALIFIER + tableName.getTableName());
+    }
+
+    /**
+     * Given a value from HBase, attempt to infer its type.
+     *
+     * @param value An HBase value.
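+     * @note Examples (illustrative): inferType("123".getBytes()) yields BIGINT, inferType("1.5".getBytes())
+     * yields FLOAT8, and inferType("FUNDED".getBytes()) falls back to VARCHAR.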
+     * @return The Apache Arrow Minor Type most closely associated with the provided value.
+     * @note This method of inference is very naive and only works if the values are stored in HBase
+     * as Strings. It uses VARCHAR as its fallback if it cannot parse the value into one of the other
+     * supported inferred types. It is expected that customers of this connector
+     * may want to customize this logic or rely on an explicit Schema in Glue.
+     */
+    public static Types.MinorType inferType(byte[] value)
+    {
+        String strVal = Bytes.toString(value);
+        try {
+            Long.valueOf(strVal);
+            return Types.MinorType.BIGINT;
+        }
+        catch (RuntimeException ex) {
+        }
+
+        try {
+            Double.valueOf(strVal);
+            return Types.MinorType.FLOAT8;
+        }
+        catch (RuntimeException ex) {
+        }
+
+        return Types.MinorType.VARCHAR;
+    }
+
+    /**
+     * Helper that can coerce the given HBase value to the requested Apache Arrow type.
+     *
+     * @param isNative If True, the HBase value is stored using native bytes. If False, the value is serialized as a String.
+     * @param type The Apache Arrow Type that the value should be coerced to before returning.
+     * @param value The HBase value to coerce.
+     * @return The coerced value, now aligned with the provided Apache Arrow type.
+     */
+    public static Object coerceType(boolean isNative, ArrowType type, byte[] value)
+    {
+        if (value == null) {
+            return null;
+        }
+
+        Types.MinorType minorType = Types.getMinorTypeForArrowType(type);
+        switch (minorType) {
+            case VARCHAR:
+                return Bytes.toString(value);
+            case INT:
+                return isNative ? ByteBuffer.wrap(value).getInt() : Integer.parseInt(Bytes.toString(value));
+            case BIGINT:
+                return isNative ? ByteBuffer.wrap(value).getLong() : Long.parseLong(Bytes.toString(value));
+            case FLOAT4:
+                return isNative ? ByteBuffer.wrap(value).getFloat() : Float.parseFloat(Bytes.toString(value));
+            case FLOAT8:
+                return isNative ? ByteBuffer.wrap(value).getDouble() : Double.parseDouble(Bytes.toString(value));
+            case BIT:
+                if (isNative) {
+                    return (value[0] != 0);
+                }
+                else {
+                    return Boolean.parseBoolean(Bytes.toString(value));
+                }
+            case VARBINARY:
+                return value;
+            default:
+                throw new IllegalArgumentException(type + " with minorType[" + minorType + "] is not supported.");
+        }
+    }
+
+    /**
+     * Helper which can go from a Glue/Apache Arrow column name to its HBase family + column.
+     *
+     * @param glueColumnName The input column name in the format "family:column".
+     * @return A String[] where index 0 is the column family and index 1 is the column qualifier.
+     */
+    public static String[] extractColumnParts(String glueColumnName)
+    {
+        return glueColumnName.split(NAMESPACE_QUALIFIER);
+    }
+
+    /**
+     * Used to convert from Apache Arrow typed values to HBase values.
+     *
+     * @param isNative If True, the HBase value should be stored using native bytes.
+     * If False, the value should be serialized as a String before storing it.
+     * @param value The value to convert.
+     * @return The HBase byte representation of the value.
+     * @note This is commonly used when attempting to push constraints into HBase, which requires converting a small
+     * number of values from Apache Arrow's type system to HBase-compatible representations for comparisons.
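+     * @note Examples (illustrative): toBytes(true, 42) returns the 4-byte big-endian encoding of 42, while
+     * toBytes(false, 42) returns "42".getBytes().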
+ */ + public static byte[] toBytes(boolean isNative, Object value) + { + if (value == null || value instanceof byte[]) { + return (byte[]) value; + } + + if (value instanceof String) { + return ((String) value).getBytes(); + } + + if (value instanceof Text) { + return ((Text) value).toString().getBytes(); + } + + if (!isNative) { + return String.valueOf(value).getBytes(); + } + + if (value instanceof Integer) { + return ByteBuffer.allocate(4).putInt((int) value).array(); + } + + if (value instanceof Long) { + return ByteBuffer.allocate(8).putLong((long) value).array(); + } + + if (value instanceof Float) { + return ByteBuffer.allocate(4).putFloat((float) value).array(); + } + + if (value instanceof Double) { + return ByteBuffer.allocate(8).putDouble((double) value).array(); + } + + if (value instanceof Boolean) { + return ByteBuffer.allocate(1).put((byte) ((boolean) value ? 1 : 0)).array(); + } + + throw new RuntimeException("Unsupported object type for " + value + " " + value.getClass().getName()); + } +} diff --git a/athena-hbase/src/main/resources/sample_data.hbase b/athena-hbase/src/main/resources/sample_data.hbase new file mode 100644 index 0000000000..45ecdfb360 --- /dev/null +++ b/athena-hbase/src/main/resources/sample_data.hbase @@ -0,0 +1,106 @@ + +create 'hbase_payments:transactions', 'summary','details' + +put 'hbase_payments:transactions','tx00001','summary:amount',1810.21 +put 'hbase_payments:transactions','tx00001','summary:cc_id','4119' +put 'hbase_payments:transactions','tx00001','summary:auth','XDF6J' +put 'hbase_payments:transactions','tx00001','summary:status','FUNDED' +put 'hbase_payments:transactions','tx00001','summary:order_id','0001235' +put 'hbase_payments:transactions','tx00001','summary:customer_id','11123' +put 'hbase_payments:transactions','tx00001','details:fee',18.02 +put 'hbase_payments:transactions','tx00001','details:bank','AMEX' +put 'hbase_payments:transactions','tx00001','details:network','AMEX' +put 'hbase_payments:transactions','tx00001','details:days_payable',30 +put 'hbase_payments:transactions','tx00001','details:latency',450 +put 'hbase_payments:transactions','tx00001','details:fraud_score',1 + + +put 'hbase_payments:transactions','tx00002','summary:amount',110.11 +put 'hbase_payments:transactions','tx00002','summary:cc_id','4119' +put 'hbase_payments:transactions','tx00002','summary:auth','OKLH8' +put 'hbase_payments:transactions','tx00002','summary:status','FUNDED' +put 'hbase_payments:transactions','tx00002','summary:order_id','0001234' +put 'hbase_payments:transactions','tx00002','summary:customer_id','11123' +put 'hbase_payments:transactions','tx00002','details:fee',1.10 +put 'hbase_payments:transactions','tx00002','details:bank','AMEX' +put 'hbase_payments:transactions','tx00002','details:network','AMEX' +put 'hbase_payments:transactions','tx00002','details:days_payable',30 +put 'hbase_payments:transactions','tx00002','details:latency',350 +put 'hbase_payments:transactions','tx00002','details:fraud_score',2 + + +put 'hbase_payments:transactions','tx00003','summary:amount',33.12 +put 'hbase_payments:transactions','tx00003','summary:cc_id','1189' +put 'hbase_payments:transactions','tx00003','summary:auth','OKLH8' +put 'hbase_payments:transactions','tx00003','summary:status','PENDING' +put 'hbase_payments:transactions','tx00003','summary:order_id','0002234' +put 'hbase_payments:transactions','tx00003','summary:customer_id','9820' +put 'hbase_payments:transactions','tx00003','details:fee',0.33 +put 
'hbase_payments:transactions','tx00003','details:bank','CHASE' +put 'hbase_payments:transactions','tx00003','details:network','VISA' +put 'hbase_payments:transactions','tx00003','details:days_payable',90 +put 'hbase_payments:transactions','tx00003','details:latency',800 +put 'hbase_payments:transactions','tx00003','details:fraud_score',7 + +put 'hbase_payments:transactions','tx00004','summary:amount',323.82 +put 'hbase_payments:transactions','tx00004','summary:cc_id','8827' +put 'hbase_payments:transactions','tx00004','summary:auth','8UJKZS' +put 'hbase_payments:transactions','tx00004','summary:status','FUNDED' +put 'hbase_payments:transactions','tx00004','summary:order_id','0001238' +put 'hbase_payments:transactions','tx00004','summary:customer_id','453' +put 'hbase_payments:transactions','tx00004','details:fee',3.23 +put 'hbase_payments:transactions','tx00004','details:bank','BoA' +put 'hbase_payments:transactions','tx00004','details:network','MASTERCARD' +put 'hbase_payments:transactions','tx00004','details:days_payable',45 +put 'hbase_payments:transactions','tx00004','details:latency',600 +put 'hbase_payments:transactions','tx00004','details:fraud_score',3 + +put 'hbase_payments:transactions','tx00005','summary:amount',8.57 +put 'hbase_payments:transactions','tx00005','summary:cc_id','9001' +put 'hbase_payments:transactions','tx00005','summary:auth','PLQA2' +put 'hbase_payments:transactions','tx00005','summary:status','PENDING' +put 'hbase_payments:transactions','tx00005','summary:order_id','0001237' +put 'hbase_payments:transactions','tx00005','summary:customer_id','92053' +put 'hbase_payments:transactions','tx00005','details:fee',0.08 +put 'hbase_payments:transactions','tx00005','details:bank','CHASE' +put 'hbase_payments:transactions','tx00005','details:network','VISA' +put 'hbase_payments:transactions','tx00005','details:days_payable',90 +put 'hbase_payments:transactions','tx00005','details:latency',250 +put 'hbase_payments:transactions','tx00005','details:fraud_score',2 + +put 'hbase_payments:transactions','tx00006','summary:amount',18.10 +put 'hbase_payments:transactions','tx00006','summary:cc_id','5612' +put 'hbase_payments:transactions','tx00006','summary:auth','BVF32' +put 'hbase_payments:transactions','tx00006','summary:status','PENDING' +put 'hbase_payments:transactions','tx00006','summary:order_id','0001236' +put 'hbase_payments:transactions','tx00006','summary:customer_id','12151' +put 'hbase_payments:transactions','tx00006','details:fee',0.01 +put 'hbase_payments:transactions','tx00006','details:bank','CHASE' +put 'hbase_payments:transactions','tx00006','details:network','VISA' +put 'hbase_payments:transactions','tx00006','details:days_payable',90 +put 'hbase_payments:transactions','tx00006','details:latency',112 +put 'hbase_payments:transactions','tx00006','details:fraud_score',1 + +scan 'hbase_payments:transactions' + +create 'hbase_payments:payment_providers', 'provider','network' + +put 'hbase_payments:payment_providers','VISA','provider:name','VISA' +put 'hbase_payments:payment_providers','VISA','provider:fee',0.01 +put 'hbase_payments:payment_providers','VISA','network:type','atm' +put 'hbase_payments:payment_providers','VISA','network:endpoint','visa.clearing.com' +put 'hbase_payments:payment_providers','VISA','network:endpoint_port',8080 + +put 'hbase_payments:payment_providers','MASTERCARD','provider:name','MASTERCARD' +put 'hbase_payments:payment_providers','MASTERCARD','provider:fee',0.02 +put 
'hbase_payments:payment_providers','MASTERCARD','network:type','json-rpc' +put 'hbase_payments:payment_providers','MASTERCARD','network:endpoint','ms.ms-clearing.com' +put 'hbase_payments:payment_providers','MASTERCARD','network:endpoint_port',443 + +put 'hbase_payments:payment_providers','AMEX','provider:name','AMEX' +put 'hbase_payments:payment_providers','AMEX','provider:fee',0.03 +put 'hbase_payments:payment_providers','AMEX','network:type','xml-rpc' +put 'hbase_payments:payment_providers','AMEX','network:endpoint','amex.amex-clearing.com' +put 'hbase_payments:payment_providers','AMEX','network:endpoint_port',443 + +scan 'hbase_payments:payment_providers' \ No newline at end of file diff --git a/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactoryTest.java b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactoryTest.java new file mode 100644 index 0000000000..2c5622b658 --- /dev/null +++ b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseConnectionFactoryTest.java @@ -0,0 +1,62 @@ +/*- + * #%L + * athena-hbase + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.hbase; + +import org.apache.hadoop.hbase.client.Admin; +import org.apache.hadoop.hbase.client.Connection; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; + +import java.io.IOException; + +import static org.junit.Assert.*; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +public class HbaseConnectionFactoryTest +{ + private HbaseConnectionFactory connectionFactory; + + @Before + public void setUp() + throws Exception + { + connectionFactory = new HbaseConnectionFactory(); + } + + @Test + public void clientCacheHitTest() + throws IOException + { + Connection mockConn = mock(Connection.class); + Admin mockAdmin = mock(Admin.class); + when(mockConn.getAdmin()).thenReturn(mockAdmin); + + connectionFactory.addConnection("conStr", mockConn); + Connection conn = connectionFactory.getOrCreateConn("conStr"); + + assertEquals(mockConn, conn); + verify(mockConn, times(1)).getAdmin(); + verify(mockAdmin, times(1)).listTableNames(); + } +} diff --git a/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseFieldResolverTest.java b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseFieldResolverTest.java new file mode 100644 index 0000000000..62360e609a --- /dev/null +++ b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseFieldResolverTest.java @@ -0,0 +1,49 @@ +/*- + * #%L + * athena-hbase + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.hbase; + +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.hadoop.hbase.client.Result; +import org.junit.Test; + +import static org.junit.Assert.assertEquals; +import static org.mockito.Matchers.any; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +public class HbaseFieldResolverTest +{ + + @Test + public void getFieldValue() + { + String expectedValue = "myValue"; + String family = "family"; + Field field = FieldBuilder.newBuilder("field1", Types.MinorType.VARCHAR.getType()).build(); + Result mockResult = mock(Result.class); + HbaseFieldResolver resolver = HbaseFieldResolver.resolver(false, family); + + when(mockResult.getValue(any(byte[].class), any(byte[].class))).thenReturn(expectedValue.getBytes()); + Object result = resolver.getFieldValue(field, mockResult); + assertEquals(expectedValue, result); + } +} diff --git a/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseMetadataHandlerTest.java b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseMetadataHandlerTest.java new file mode 100644 index 0000000000..290e57a3bb --- /dev/null +++ b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseMetadataHandlerTest.java @@ -0,0 +1,301 @@ +/*- + * #%L + * athena-hbase + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.hbase; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.hadoop.hbase.HRegionInfo; +import org.apache.hadoop.hbase.NamespaceDescriptor; +import org.apache.hadoop.hbase.client.Admin; +import org.apache.hadoop.hbase.client.Connection; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.client.ResultScanner; +import org.apache.hadoop.hbase.client.Scan; +import org.apache.hadoop.hbase.client.Table; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.ArrayList; +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +import static com.amazonaws.athena.connectors.hbase.TestUtils.makeResult; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyInt; +import static org.mockito.Matchers.anyString; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class HbaseMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(HbaseMetadataHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private String catalog = "default"; + private HbaseMetadataHandler handler; + private BlockAllocator allocator; + + @Mock + private Connection 
mockClient;
+
+    @Mock
+    private Admin mockAdmin;
+
+    @Mock
+    private Table mockTable;
+
+    @Mock
+    private HbaseConnectionFactory mockConnFactory;
+
+    @Mock
+    private AWSGlue awsGlue;
+
+    @Mock
+    private AWSSecretsManager secretsManager;
+
+    @Mock
+    private AmazonAthena athena;
+
+    @Before
+    public void setUp()
+            throws Exception
+    {
+        handler = new HbaseMetadataHandler(awsGlue,
+                new LocalKeyFactory(),
+                secretsManager,
+                athena,
+                mockConnFactory,
+                "spillBucket",
+                "spillPrefix");
+
+        when(mockConnFactory.getOrCreateConn(anyString())).thenReturn(mockClient);
+        when(mockClient.getAdmin()).thenReturn(mockAdmin);
+        when(mockClient.getTable(any())).thenReturn(mockTable);
+
+        allocator = new BlockAllocatorImpl();
+    }
+
+    @After
+    public void tearDown()
+            throws Exception
+    {
+        allocator.close();
+    }
+
+    @Test
+    public void doListSchemaNames()
+            throws IOException
+    {
+        logger.info("doListSchemaNames: enter");
+
+        NamespaceDescriptor[] schemaNames = {NamespaceDescriptor.create("schema1").build(),
+                NamespaceDescriptor.create("schema2").build(),
+                NamespaceDescriptor.create("schema3").build()};
+
+        when(mockAdmin.listNamespaceDescriptors()).thenReturn(schemaNames);
+
+        ListSchemasRequest req = new ListSchemasRequest(identity, "queryId", "default");
+        ListSchemasResponse res = handler.doListSchemaNames(allocator, req);
+
+        logger.info("doListSchemas - {}", res.getSchemas());
+        Set<String> expectedSchemaName = new HashSet<>();
+        expectedSchemaName.add("schema1");
+        expectedSchemaName.add("schema2");
+        expectedSchemaName.add("schema3");
+        assertEquals(expectedSchemaName, new HashSet<>(res.getSchemas()));
+
+        logger.info("doListSchemaNames: exit");
+    }
+
+    @Test
+    public void doListTables()
+            throws IOException
+    {
+        logger.info("doListTables - enter");
+
+        String schema = "schema1";
+
+        org.apache.hadoop.hbase.TableName[] tables = {
+                org.apache.hadoop.hbase.TableName.valueOf("schema1", "table1"),
+                org.apache.hadoop.hbase.TableName.valueOf("schema1", "table2"),
+                org.apache.hadoop.hbase.TableName.valueOf("schema1", "table3")
+        };
+
+        Set<String> tableNames = new HashSet<>();
+        tableNames.add("table1");
+        tableNames.add("table2");
+        tableNames.add("table3");
+
+        when(mockAdmin.listTableNamesByNamespace(eq(schema))).thenReturn(tables);
+        ListTablesRequest req = new ListTablesRequest(identity, "queryId", "default", schema);
+        ListTablesResponse res = handler.doListTables(allocator, req);
+        logger.info("doListTables - {}", res.getTables());
+
+        for (TableName next : res.getTables()) {
+            assertEquals(schema, next.getSchemaName());
+            assertTrue(tableNames.contains(next.getTableName()));
+        }
+        assertEquals(tableNames.size(), res.getTables().size());
+
+        logger.info("doListTables - exit");
+    }
+
+    /**
+     * TODO: Add more types.
+     */
+    @Test
+    public void doGetTable()
+            throws Exception
+    {
+        logger.info("doGetTable - enter");
+
+        String schema = "schema1";
+        String table = "table1";
+        List<Result> results = TestUtils.makeResults();
+
+        ResultScanner mockScanner = mock(ResultScanner.class);
+        when(mockTable.getScanner(any(Scan.class))).thenReturn(mockScanner);
+        when(mockScanner.iterator()).thenReturn(results.iterator());
+
+        GetTableRequest req = new GetTableRequest(identity, "queryId", catalog, new TableName(schema, table));
+        GetTableResponse res = handler.doGetTable(allocator, req);
+        logger.info("doGetTable - {}", res);
+
+        Schema expectedSchema = TestUtils.makeSchema()
+                .addField(HbaseSchemaUtils.ROW_COLUMN_NAME, Types.MinorType.VARCHAR.getType())
+                .build();
+
+        assertEquals(expectedSchema.getFields().size(), res.getSchema().getFields().size());
+        logger.info("doGetTable - exit");
+    }
+
+    @Test
+    public void doGetTableLayout()
+            throws Exception
+    {
+        logger.info("doGetTableLayout - enter");
+
+        GetTableLayoutRequest req = new GetTableLayoutRequest(identity,
+                "queryId",
+                "default",
+                new TableName("schema1", "table1"),
+                new Constraints(new HashMap<>()),
+                SchemaBuilder.newBuilder().build(),
+                Collections.EMPTY_SET);
+
+        GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req);
+
+        logger.info("doGetTableLayout - {}", res);
+        Block partitions = res.getPartitions();
+        for (int row = 0; row < partitions.getRowCount() && row < 10; row++) {
+            logger.info("doGetTableLayout:{} {}", row, BlockUtils.rowToString(partitions, row));
+        }
+
+        assertTrue(partitions.getRowCount() > 0);
+
+        logger.info("doGetTableLayout: partitions[{}]", partitions.getRowCount());
+    }
+
+    @Test
+    public void doGetSplits()
+            throws IOException
+    {
+        logger.info("doGetSplits: enter");
+
+        List<HRegionInfo> regionServers = new ArrayList<>();
+        regionServers.add(TestUtils.makeRegion(1, "schema1", "table1"));
+        regionServers.add(TestUtils.makeRegion(2, "schema1", "table1"));
+        regionServers.add(TestUtils.makeRegion(3, "schema1", "table1"));
+        regionServers.add(TestUtils.makeRegion(4, "schema1", "table1"));
+
+        when(mockAdmin.getTableRegions(any())).thenReturn(regionServers);
+        List<String> partitionCols = new ArrayList<>();
+
+        Block partitions = BlockUtils.newBlock(allocator, "partitionId", Types.MinorType.INT.getType(), 0);
+
+        String continuationToken = null;
+        GetSplitsRequest originalReq = new GetSplitsRequest(identity,
+                "queryId",
+                "catalog_name",
+                new TableName("schema", "table_name"),
+                partitions,
+                partitionCols,
+                new Constraints(new HashMap<>()),
+                null);
+
+        GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken);
+
+        logger.info("doGetSplits: req[{}]", req);
+
+        MetadataResponse rawResponse = handler.doGetSplits(allocator, req);
+        assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType());
+
+        GetSplitsResponse response = (GetSplitsResponse) rawResponse;
+        continuationToken = response.getContinuationToken();
+
+        logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]",
+                new Object[] {continuationToken, response.getSplits().size()});
+
+        assertTrue("Continuation criteria violated", response.getSplits().size() == 4);
+        assertTrue("Continuation criteria violated", response.getContinuationToken() == null);
+
+        logger.info("doGetSplits: exit");
+    }
+}
diff --git a/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseRecordHandlerTest.java b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseRecordHandlerTest.java
new file mode 100644
index 0000000000..a5132ec1bc
---
/dev/null +++ b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseRecordHandlerTest.java @@ -0,0 +1,304 @@ +/*- + * #%L + * athena-hbase + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.hbase; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import com.google.common.io.ByteStreams; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.hadoop.hbase.client.Connection; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.client.ResultScanner; +import org.apache.hadoop.hbase.client.Scan; +import org.apache.hadoop.hbase.client.Table; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.IOException; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; + +import 
static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.END_KEY_FIELD;
+import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.HBASE_CONN_STR;
+import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.REGION_ID_FIELD;
+import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.REGION_NAME_FIELD;
+import static com.amazonaws.athena.connectors.hbase.HbaseMetadataHandler.START_KEY_FIELD;
+import static org.junit.Assert.*;
+import static org.mockito.Matchers.any;
+import static org.mockito.Matchers.anyObject;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Matchers.eq;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public class HbaseRecordHandlerTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(HbaseRecordHandlerTest.class);
+
+    private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account");
+    private String catalog = "default";
+    private HbaseRecordHandler handler;
+    private BlockAllocator allocator;
+    private List<ByteHolder> mockS3Storage = new ArrayList<>();
+    private AmazonS3 amazonS3;
+    private S3BlockSpillReader spillReader;
+    private Schema schemaForRead;
+    private EncryptionKeyFactory keyFactory = new LocalKeyFactory();
+
+    @Mock
+    private Connection mockClient;
+
+    @Mock
+    private Table mockTable;
+
+    @Mock
+    private HbaseConnectionFactory mockConnFactory;
+
+    @Mock
+    private AWSSecretsManager mockSecretsManager;
+
+    @Mock
+    private AmazonAthena mockAthena;
+
+    @Before
+    public void setUp()
+            throws IOException
+    {
+        logger.info("setUpBefore - enter");
+
+        when(mockConnFactory.getOrCreateConn(anyString())).thenReturn(mockClient);
+        when(mockClient.getTable(any())).thenReturn(mockTable);
+
+        allocator = new BlockAllocatorImpl();
+
+        amazonS3 = mock(AmazonS3.class);
+
+        when(amazonS3.putObject(anyObject(), anyObject(), anyObject(), anyObject()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> {
+                    InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2];
+                    ByteHolder byteHolder = new ByteHolder();
+                    byteHolder.setBytes(ByteStreams.toByteArray(inputStream));
+                    mockS3Storage.add(byteHolder);
+                    return mock(PutObjectResult.class);
+                });
+
+        when(amazonS3.getObject(anyString(), anyString()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> {
+                    S3Object mockObject = mock(S3Object.class);
+                    ByteHolder byteHolder = mockS3Storage.get(0);
+                    mockS3Storage.remove(0);
+                    when(mockObject.getObjectContent()).thenReturn(
+                            new S3ObjectInputStream(
+                                    new ByteArrayInputStream(byteHolder.getBytes()), null));
+                    return mockObject;
+                });
+
+        schemaForRead = TestUtils.makeSchema().addStringField(HbaseSchemaUtils.ROW_COLUMN_NAME).build();
+
+        handler = new HbaseRecordHandler(amazonS3, mockSecretsManager, mockAthena, mockConnFactory);
+        spillReader = new S3BlockSpillReader(amazonS3, allocator);
+
+        logger.info("setUpBefore - exit");
+    }
+
+    @After
+    public void after()
+    {
+        allocator.close();
+    }
+
+    @Test
+    public void doReadRecordsNoSpill()
+            throws Exception
+    {
+        logger.info("doReadRecordsNoSpill: enter");
+
+        String schema = "schema1";
+        String table = "table1";
+
+        List<Result> results = TestUtils.makeResults(100);
+        ResultScanner mockScanner = mock(ResultScanner.class);
+        when(mockTable.getScanner(any(Scan.class))).thenReturn(mockScanner);
+        when(mockScanner.iterator()).thenReturn(results.iterator());
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("family1:col3", SortedRangeSet.copyOf(Types.MinorType.BIGINT.getType(),
+                ImmutableList.of(Range.equal(allocator, Types.MinorType.BIGINT.getType(), 1L)), false));
+
+        S3SpillLocation splitLoc = S3SpillLocation.newBuilder()
+                .withBucket(UUID.randomUUID().toString())
+                .withSplitId(UUID.randomUUID().toString())
+                .withQueryId(UUID.randomUUID().toString())
+                .withIsDirectory(true)
+                .build();
+
+        Split.Builder splitBuilder = Split.newBuilder(splitLoc, keyFactory.create())
+                .add(HBASE_CONN_STR, "fake_con_str")
+                .add(START_KEY_FIELD, "fake_start_key")
+                .add(END_KEY_FIELD, "fake_end_key")
+                .add(REGION_ID_FIELD, "fake_region_id")
+                .add(REGION_NAME_FIELD, "fake_region_name");
+
+        ReadRecordsRequest request = new ReadRecordsRequest(identity,
+                catalog,
+                "queryId-" + System.currentTimeMillis(),
+                new TableName(schema, table),
+                schemaForRead,
+                splitBuilder.build(),
+                new Constraints(constraintsMap),
+                100_000_000_000L, //100GB don't expect this to spill
+                100_000_000_000L
+        );
+
+        RecordResponse rawResponse = handler.doReadRecords(allocator, request);
+
+        assertTrue(rawResponse instanceof ReadRecordsResponse);
+
+        ReadRecordsResponse response = (ReadRecordsResponse) rawResponse;
+        logger.info("doReadRecordsNoSpill: rows[{}]", response.getRecordCount());
+
+        assertTrue(response.getRecords().getRowCount() == 1);
+        logger.info("doReadRecordsNoSpill: {}", BlockUtils.rowToString(response.getRecords(), 0));
+
+        logger.info("doReadRecordsNoSpill: exit");
+    }
+
+    @Test
+    public void doReadRecordsSpill()
+            throws Exception
+    {
+        logger.info("doReadRecordsSpill: enter");
+
+        String schema = "schema1";
+        String table = "table1";
+
+        List<Result> results = TestUtils.makeResults(10_000);
+        ResultScanner mockScanner = mock(ResultScanner.class);
+        when(mockTable.getScanner(any(Scan.class))).thenReturn(mockScanner);
+        when(mockScanner.iterator()).thenReturn(results.iterator());
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("family1:col3", SortedRangeSet.copyOf(Types.MinorType.BIGINT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.BIGINT.getType(), 0L)), true));
+
+        S3SpillLocation splitLoc = S3SpillLocation.newBuilder()
+                .withBucket(UUID.randomUUID().toString())
+                .withSplitId(UUID.randomUUID().toString())
+                .withQueryId(UUID.randomUUID().toString())
+                .withIsDirectory(true)
+                .build();
+
+        Split.Builder splitBuilder = Split.newBuilder(splitLoc, keyFactory.create())
+                .add(HBASE_CONN_STR, "fake_con_str")
+                .add(START_KEY_FIELD, "fake_start_key")
+                .add(END_KEY_FIELD, "fake_end_key")
+                .add(REGION_ID_FIELD, "fake_region_id")
+                .add(REGION_NAME_FIELD, "fake_region_name");
+
+        ReadRecordsRequest request = new ReadRecordsRequest(identity,
+                catalog,
+                "queryId-" + System.currentTimeMillis(),
+                new TableName(schema, table),
+                schemaForRead,
+                splitBuilder.build(),
+                new Constraints(constraintsMap),
+                1_500_000L, //~1.5MB so we should see some spill
+                0L
+        );
+        RecordResponse rawResponse = handler.doReadRecords(allocator, request);
+
+        assertTrue(rawResponse instanceof RemoteReadRecordsResponse);
+
+        try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) {
+            logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size());
+
+            assertTrue(response.getNumberBlocks() > 1);
+
+            int blockNum = 0;
+            for (SpillLocation next : response.getRemoteBlocks()) {
+                S3SpillLocation spillLocation = (S3SpillLocation) next;
+                try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) {
+                    logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount());
+                    // assertTrue(++blockNum < response.getRemoteBlocks().size() && block.getRowCount() > 10_000);
+
+                    logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0));
+                    assertNotNull(BlockUtils.rowToString(block, 0));
+                }
+            }
+        }
+
+        logger.info("doReadRecordsSpill: exit");
+    }
+
+    private class ByteHolder
+    {
+        private byte[] bytes;
+
+        public void setBytes(byte[] bytes)
+        {
+            this.bytes = bytes;
+        }
+
+        public byte[] getBytes()
+        {
+            return bytes;
+        }
+    }
+}
diff --git a/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseSchemaUtilsTest.java b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseSchemaUtilsTest.java
new file mode 100644
index 0000000000..b1d8ede80f
--- /dev/null
+++ b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/HbaseSchemaUtilsTest.java
@@ -0,0 +1,175 @@
+/*-
+ * #%L
+ * athena-hbase
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.hbase;
+
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Connection;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.ResultScanner;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.client.Table;
+import org.junit.Test;
+import org.mockito.invocation.InvocationOnMock;
+
+import java.io.IOException;
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.concurrent.atomic.AtomicLong;
+
+import static com.amazonaws.athena.connectors.hbase.HbaseSchemaUtils.coerceType;
+import static com.amazonaws.athena.connectors.hbase.HbaseSchemaUtils.toBytes;
+import static com.amazonaws.athena.connectors.hbase.TestUtils.makeResult;
+import static org.junit.Assert.assertArrayEquals;
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertNotNull;
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.times;
+import static org.mockito.Mockito.verify;
+import static org.mockito.Mockito.when;
+
+public class HbaseSchemaUtilsTest
+{
+    @Test
+    public void inferSchema()
+            throws IOException
+    {
+        int numToScan = 4;
+        TableName tableName = new TableName("schema", "table");
+        List<Result> results = TestUtils.makeResults();
+
+        Connection mockConnection = mock(Connection.class);
+        Table mockTable = mock(Table.class);
+        ResultScanner mockScanner = mock(ResultScanner.class);
+
+        when(mockConnection.getTable(any(org.apache.hadoop.hbase.TableName.class))).thenAnswer((InvocationOnMock invocation) -> {
+            org.apache.hadoop.hbase.TableName table = invocation.getArgumentAt(0, org.apache.hadoop.hbase.TableName.class);
+            assertEquals(tableName.getSchemaName() + ":" + tableName.getTableName(), table.getNameAsString());
+            return mockTable;
+        });
+
+        when(mockTable.getScanner(any(Scan.class))).thenAnswer((InvocationOnMock invocation) -> {
+            Scan scan = invocation.getArgumentAt(0, Scan.class);
+            assertEquals(numToScan, scan.getMaxResultSize());
+            return mockScanner;
+        });
+        when(mockScanner.iterator()).thenReturn(results.iterator());
+
+        Schema schema = HbaseSchemaUtils.inferSchema(mockConnection, tableName, numToScan);
+
+        Map<String, Types.MinorType> actualFields = new HashMap<>();
+        schema.getFields().stream().forEach(next -> actualFields.put(next.getName(), Types.getMinorTypeForArrowType(next.getType())));
+
+        Map<String, Types.MinorType> expectedFields = new HashMap<>();
+        TestUtils.makeSchema().build().getFields().stream()
+                .forEach(next -> expectedFields.put(next.getName(), Types.getMinorTypeForArrowType(next.getType())));
+
+        for (Map.Entry<String, Types.MinorType> nextExpected : expectedFields.entrySet()) {
+            assertNotNull(actualFields.get(nextExpected.getKey()));
+            assertEquals(nextExpected.getKey(), nextExpected.getValue(), actualFields.get(nextExpected.getKey()));
+        }
+        assertEquals(expectedFields.size(), actualFields.size());
+
+        verify(mockConnection, times(1)).getTable(any());
+        verify(mockTable, times(1)).getScanner(any(Scan.class));
+        verify(mockScanner, times(1)).iterator();
+    }
+
+    @Test
+    public void getQualifiedTableName()
+    {
+        String table = "table";
+        String schema = "schema";
+        String expected = "schema:table";
+        String actual = HbaseSchemaUtils.getQualifiedTableName(new TableName(schema, table));
+        assertEquals(expected, actual);
+    }
+
+    @Test
+    public void getQualifiedTable()
+    {
+        String table = "table";
+        String schema = "schema";
+        org.apache.hadoop.hbase.TableName expected = org.apache.hadoop.hbase.TableName.valueOf(schema + ":" + table);
+        org.apache.hadoop.hbase.TableName actual = HbaseSchemaUtils.getQualifiedTable(new TableName(schema, table));
+        assertEquals(expected, actual);
+    }
+
+    @Test
+    public void inferType()
+    {
+        assertEquals(Types.MinorType.BIGINT, HbaseSchemaUtils.inferType("1".getBytes()));
+        assertEquals(Types.MinorType.BIGINT, HbaseSchemaUtils.inferType("1000".getBytes()));
+        assertEquals(Types.MinorType.BIGINT, HbaseSchemaUtils.inferType("-1".getBytes()));
+        assertEquals(Types.MinorType.FLOAT8, HbaseSchemaUtils.inferType("1.0".getBytes()));
+        assertEquals(Types.MinorType.FLOAT8, HbaseSchemaUtils.inferType(".01".getBytes()));
+        assertEquals(Types.MinorType.FLOAT8, HbaseSchemaUtils.inferType("-.01".getBytes()));
+        assertEquals(Types.MinorType.VARCHAR, HbaseSchemaUtils.inferType("BDFKD".getBytes()));
+        assertEquals(Types.MinorType.VARCHAR, HbaseSchemaUtils.inferType("".getBytes()));
+    }
+
+    @Test
+    public void coerceTypeTest()
+    {
+        boolean isNative = false;
+        assertEquals("asf", coerceType(isNative, Types.MinorType.VARCHAR.getType(), "asf".getBytes()));
+        assertEquals("2.0", coerceType(isNative, Types.MinorType.VARCHAR.getType(), "2.0".getBytes()));
+        assertEquals(1, coerceType(isNative, Types.MinorType.INT.getType(), "1".getBytes()));
+        assertEquals(-1, coerceType(isNative, Types.MinorType.INT.getType(), "-1".getBytes()));
+        assertEquals(1L, coerceType(isNative, Types.MinorType.BIGINT.getType(), "1".getBytes()));
+        assertEquals(-1L, coerceType(isNative, Types.MinorType.BIGINT.getType(), "-1".getBytes()));
+        assertEquals(1.1F, coerceType(isNative, Types.MinorType.FLOAT4.getType(), "1.1".getBytes()));
+        assertEquals(-1.1F, coerceType(isNative, Types.MinorType.FLOAT4.getType(),
"-1.1".getBytes())); + assertEquals(1.1D, coerceType(isNative, Types.MinorType.FLOAT8.getType(), "1.1".getBytes())); + assertEquals(-1.1D, coerceType(isNative, Types.MinorType.FLOAT8.getType(), "-1.1".getBytes())); + assertArrayEquals("-1.1".getBytes(), (byte[]) coerceType(isNative, Types.MinorType.VARBINARY.getType(), "-1.1".getBytes())); + } + + @Test + public void coerceTypeNativeTest() + { + boolean isNative = true; + assertEquals("asf", coerceType(isNative, Types.MinorType.VARCHAR.getType(), "asf".getBytes())); + assertEquals("2.0", coerceType(isNative, Types.MinorType.VARCHAR.getType(), "2.0".getBytes())); + assertEquals(1, coerceType(isNative, Types.MinorType.INT.getType(), toBytes(isNative, 1))); + assertEquals(-1, coerceType(isNative, Types.MinorType.INT.getType(), toBytes(isNative, -1))); + assertEquals(1L, coerceType(isNative, Types.MinorType.BIGINT.getType(), toBytes(isNative, 1L))); + assertEquals(-1L, coerceType(isNative, Types.MinorType.BIGINT.getType(), toBytes(isNative, -1L))); + assertEquals(1.1F, coerceType(isNative, Types.MinorType.FLOAT4.getType(), toBytes(isNative, 1.1F))); + assertEquals(-1.1F, coerceType(isNative, Types.MinorType.FLOAT4.getType(), toBytes(isNative, -1.1F))); + assertEquals(1.1D, coerceType(isNative, Types.MinorType.FLOAT8.getType(), toBytes(isNative, 1.1D))); + assertEquals(-1.1D, coerceType(isNative, Types.MinorType.FLOAT8.getType(), toBytes(isNative, -1.1D))); + assertArrayEquals("-1.1".getBytes(), (byte[]) coerceType(isNative, Types.MinorType.VARBINARY.getType(), "-1.1".getBytes())); + } + + @Test + public void extractColumnParts() + { + String[] parts = HbaseSchemaUtils.extractColumnParts("family:column"); + assertEquals("family", parts[0]); + assertEquals("column", parts[1]); + } +} diff --git a/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/TestUtils.java b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/TestUtils.java new file mode 100644 index 0000000000..98ff8147c1 --- /dev/null +++ b/athena-hbase/src/test/java/com/amazonaws/athena/connectors/hbase/TestUtils.java @@ -0,0 +1,162 @@ +/*- + * #%L + * athena-hbase + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.athena.connectors.hbase;
+
+import com.amazonaws.athena.connector.lambda.data.SchemaBuilder;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.apache.hadoop.hbase.HRegionInfo;
+import org.apache.hadoop.hbase.KeyValue;
+import org.apache.hadoop.hbase.client.Result;
+import org.mockito.invocation.InvocationOnMock;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.util.ArrayList;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+import static org.mockito.Matchers.any;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+public class TestUtils
+{
+    private static final Logger logger = LoggerFactory.getLogger(TestUtils.class);
+
+    private TestUtils() {}
+
+    public static SchemaBuilder makeSchema()
+    {
+        return SchemaBuilder.newBuilder()
+                .addStringField("family1:col1")
+                .addStringField("family1:col2")
+                .addBigIntField("family1:col3")
+                .addStringField("family1:col10")
+                .addFloat8Field("family1:col20")
+                .addBigIntField("family1:col30")
+                .addStringField("family2:col1")
+                .addFloat8Field("family2:col2")
+                .addBigIntField("family2:col3")
+                .addStringField("family2:col10")
+                .addFloat8Field("family2:col20")
+                .addBigIntField("family2:col30")
+                .addBigIntField("family3:col1");
+    }
+
+    public static List<Result> makeResults()
+    {
+        List<Result> results = new ArrayList<>();
+        //Initial row with two column families and a mix of columns and types
+        results.add(makeResult("family1", "col1", "varchar",
+                "family1", "col2", "1.0",
+                "family1", "col3", "1",
+                "family2", "col1", "varchar",
+                "family2", "col2", "1.0",
+                "family2", "col3", "1"
+        ));
+
+        //Add a row which has only new columns to ensure we get a union
+        results.add(makeResult("family1", "col10", "varchar",
+                "family1", "col20", "1.0",
+                "family1", "col30", "1",
+                "family2", "col10", "varchar",
+                "family2", "col20", "1.0",
+                "family2", "col30", "1",
+                "family3", "col1", "1"
+        ));
+
+        //Add a 2nd occurrence of col2 in family1 but with a conflicting type; it should result in a final type of varchar
+        results.add(makeResult("family1", "col2", "1"));
+
+        return results;
+    }
+
+    public static List<Result> makeResults(int numResults)
+    {
+        List<Result> results = new ArrayList<>();
+
+        for (int i = 0; i < numResults; i++) {
+            //Initial row with two column families and a mix of columns and types
+            results.add(makeResult("family1", "col1", "varchar" + i,
+                    "family1", "col2", String.valueOf(i * 1.1),
+                    "family1", "col3", String.valueOf(i),
+                    "family2", "col1", "varchar" + i,
+                    "family2", "col2", String.valueOf(i * 1.1),
+                    "family2", "col3", String.valueOf(i)
+            ));
+
+            //Add a row which has only new columns to ensure we get a union
+            results.add(makeResult("family1", "col10", "varchar" + i,
+                    "family1", "col20", String.valueOf(i * 1.1),
+                    "family1", "col30", String.valueOf(i),
+                    "family2", "col10", "varchar" + i,
+                    "family2", "col20", String.valueOf(i * 1.1),
+                    "family2", "col30", String.valueOf(i),
+                    "family3", "col1", String.valueOf(i)
+            ));
+
+            //Add a 2nd occurrence of col2 in family1 but with a conflicting type; it should result in a final type of varchar
+            results.add(makeResult("family1", "col2", String.valueOf(i)));
+        }
+
+        return results;
+    }
+
+    public static Result makeResult(String... values)
+    {
+        if (values.length % 3 != 0) {
+            throw new RuntimeException("Method requires values in multiples of 3 -> family, qualifier, value");
+        }
+
+        List<KeyValue> result = new ArrayList<>();
+        Map<String, String> valueMap = new HashMap<>();
+        for (int i = 0; i < values.length; i += 3) {
+            KeyValue mockKeyValue = mock(KeyValue.class);
+            when(mockKeyValue.getFamily()).thenReturn(values[i].getBytes());
+            when(mockKeyValue.getQualifier()).thenReturn(values[i + 1].getBytes());
+            when(mockKeyValue.getValue()).thenReturn(values[i + 2].getBytes());
+            valueMap.put(values[i] + ":" + values[i + 1], values[i + 2]);
+            result.add(mockKeyValue);
+        }
+
+        Result mockResult = mock(Result.class);
+        when(mockResult.list()).thenReturn(result);
+        when(mockResult.getValue(any(byte[].class), any(byte[].class)))
+                .thenAnswer((InvocationOnMock invocation) -> {
+                    String family = new String(invocation.getArgumentAt(0, byte[].class));
+                    String column = new String(invocation.getArgumentAt(1, byte[].class));
+                    String key = family + ":" + column;
+                    if (!valueMap.containsKey(key)) {
+                        return null;
+                    }
+                    else {
+                        return valueMap.get(key).getBytes();
+                    }
+                });
+        return mockResult;
+    }
+
+    public static HRegionInfo makeRegion(int id, String schema, String table)
+    {
+        return new HRegionInfo(id, org.apache.hadoop.hbase.TableName.valueOf(schema, table), 1);
+    }
+}
diff --git a/athena-jdbc/LICENSE.txt b/athena-jdbc/LICENSE.txt
new file mode 100644
index 0000000000..418de4c108
--- /dev/null
+++ b/athena-jdbc/LICENSE.txt
@@ -0,0 +1,174 @@
+Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+
+      "Object" form shall mean any form resulting from mechanical
+      transformation or translation of a Source form, including but
+      not limited to compiled object code, generated documentation,
+      and conversions to other media types.
+
+      "Work" shall mean the work of authorship, whether in Source or
+      Object form, made available under the License, as indicated by a
+      copyright notice that is included in or attached to the work
+      (an example is provided in the Appendix below).
+
+      "Derivative Works" shall mean any work, whether in Source or Object
+      form, that is based on (or derived from) the Work and for which the
+      editorial revisions, annotations, elaborations, or other modifications
+      represent, as a whole, an original work of authorship.
For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
\ No newline at end of file
diff --git a/athena-jdbc/README.md b/athena-jdbc/README.md
new file mode 100644
index 0000000000..a960eb3d1f
--- /dev/null
+++ b/athena-jdbc/README.md
@@ -0,0 +1,176 @@
+# Amazon Athena Lambda Jdbc Connector
+
+This connector enables Amazon Athena to access your SQL database or RDS instance(s) using a JDBC driver.
+
+The following databases are supported:
+
+1. MySql
+2. PostGreSql
+3. Redshift
+
+See `com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory.DatabaseEngine` for the latest list of supported database types.
+
+# Terms
+
+* **Database Instance:** Any instance of a database deployed on premises, on EC2, or using RDS.
+* **Database type:** One of mysql, postgres, or redshift.
+* **Handler:** A Lambda handler accessing your database instance(s). Could be a metadata handler or a record handler.
+* **Metadata Handler:** A Lambda handler that retrieves metadata from your database instance(s).
+* **Record Handler:** A Lambda handler that retrieves data records from your database instance(s).
+* **Property/Parameter:** A database property used by handlers to extract database information for connection. These are set as Lambda environment variables.
+* **Connection String:** Used to establish a connection to a database instance.
+* **Catalog:** Athena Catalog. This is not a Glue Catalog. Must be used to prefix the `connection_string` property.
+
+# Usage
+
+## Parameters
+
+The JDBC Connector supports several configuration parameters set as Lambda environment variables. Each parameter should be prefixed with a database instance name (any unique string), except the spill S3 bucket and prefix.
+
+## Connection String:
+
+A connection string is used to connect to a database instance. We support the following format:
+
+`${db_type}://${jdbc_connection_string}`
+
+```
+db_type                    One of the following: mysql, postgres, redshift.
+jdbc_connection_string     Connection string for a database type. For example, a MySql connection string: jdbc:mysql://host1:33060/database
+```
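+For illustration only, complete values for each supported engine might look like the following sketch (the hosts, ports, and credentials are placeholders, not values from this repository):
+
+```
+mysql://jdbc:mysql://mysql1.host:3306/default?user=sample&password=sample
+postgres://jdbc:postgresql://postgres1.host:5432/default?user=sample&password=sample
+redshift://jdbc:redshift://redshift1.host:5439/default?user=sample&password=sample
+```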
+## Multiplexing handler parameters
+
+The multiplexer provides a way to connect to multiple database instances of any type using a single Lambda function. Requests are routed based on catalog name. Use the following classes in your Lambda function to enable the multiplexer.
+
+|DB Type|MetadataHandler|RecordHandler|
+|--- |--- |--- |
+|All supported database types|com.amazonaws.connectors.athena.jdbc.MultiplexingJdbcMetadataHandler|com.amazonaws.connectors.athena.jdbc.MultiplexingJdbcRecordHandler|
+
+**Parameters:**
+
+```
+${catalog}_connection_string     Database instance connection string. One of two types specified above. Required.
+default                          Default connection string. Required. This will be used when a catalog is not recognized.
+```
+
+Example properties for a Mux Lambda function that supports three database instances, mysql1, mysql2 and postgres1:
+
+|Property|Value|
+|---|---|
+|mysql_catalog1_connection_string|mysql://jdbc:mysql://mysql1.host:3306/default?${Test/RDS/PostGres1}|
+|mysql_catalog2_connection_string|mysql://jdbc:mysql://mysql2.host:3333/default?user=sample2&password=sample2|
+|postgres_catalog3_connection_string|postgres://jdbc:postgresql://postgres1.host:5432/default?${Test/RDS/PostGres1}|
+
+The JDBC Connector supports substitution of any string enclosed like ${SecretName} with the username and password retrieved from AWS Secrets Manager. For example, `mysql://jdbc:mysql://mysql1.host:3306/default?...&${Test/RDS/PostGres1}&...` will be replaced with `mysql://jdbc:mysql://mysql1.host:3306/default?...&user=sample2&password=sample2&...`, and the secret name `Test/RDS/PostGres1` will be used to retrieve the secret's value.
+
+## Database specific handler parameters
+
+Database specific metadata and record handlers can also be used to connect to a database instance. These are currently capable of connecting to a single database instance.
+
+|DB Type|MetadataHandler|RecordHandler|
+|---|---|---|
+|MySql|com.amazonaws.connectors.athena.jdbc.mysql.MySqlMetadataHandler|com.amazonaws.connectors.athena.jdbc.mysql.MySqlRecordHandler|
+|PostGreSql|com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlMetadataHandler|com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlRecordHandler|
+|Redshift|com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlMetadataHandler|com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlRecordHandler|
+
+**Parameters:**
+
+```
+default         Default connection string. Required. This will be used when a catalog is not recognized.
+```
+
+These handlers support a single database instance. You must provide the `default` parameter; everything else is ignored.
+
+**Example property for a single MySql instance supported by a Lambda function:**
+
+|Property|Value|
+|---|---|
+|default|mysql://mysql1.host:3306/default?secret=Test/RDS/MySql1|
+
+## Common parameters
+
+### Spill parameters:
+
+All database instances accessed using the Lambda function spill to the same location.
+
+```
+spill_bucket     Bucket name for spill. Required.
+spill_prefix     Spill bucket key prefix. Required.
+```
+
+Query responses that are too large to return directly from Lambda are spilled (written) to this S3 location, where Athena reads them back.
+
+# Data types support
+
+|Jdbc|Arrow|
+| ---|---|
+|Boolean|Bit|
+|Integer|Tiny|
+|Short|Smallint|
+|Integer|Int|
+|Long|Bigint|
+|float|Float4|
+|Double|Float8|
+|Date|DateDay|
+|Timestamp|DateMilli|
+|String|Varchar|
+|Bytes|Varbinary|
+|BigDecimal|Decimal|
+
+See the respective database documentation for conversion between JDBC and database types.
+
+# Secrets
+
+We support two ways to input the database username and password:
+
+1. **AWS Secrets Manager:** The name of a secret in AWS Secrets Manager can be embedded in the JDBC connection string; the connector replaces it with the `username` and `password` values from the secret. Support is tightly integrated for AWS RDS database instances. When using AWS RDS, we highly recommend using AWS Secrets Manager, including credential rotation. If your database is not using AWS RDS, store credentials as JSON in the following format: `{"username": "${username}", "password": "${password}"}`.
+2. **Basic Auth:** Username and password can be specified in the JDBC connection string.
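+As an illustrative sketch (this command is not part of this repository; the secret name matches the earlier example and the credentials are placeholders), such a secret could be created with the AWS CLI:
+
+```
+aws secretsmanager create-secret \
+    --name "Test/RDS/PostGres1" \
+    --secret-string '{"username": "sample2", "password": "sample2"}'
+```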
+# Partitions and Splits
+### MySql
+A partition is represented by a single partition column of type varchar. We leverage partitions defined on a MySql table, and this column contains the partition names. For a table that does not have partitions defined, * is returned, which equates to a single partition. A partition is equivalent to a split.
+
+|Name|Type|Description|
+|---|---|---|
+|partition_name|Varchar|Named partition in MySql. E.g. p0|
+
+### PostGreSql & Redshift
+A partition is represented by two partition columns of type varchar. We leverage partitions defined as child tables on a PostGres table, and these columns contain the child schema and child table information. For a table that does not have partitions defined, * is returned, which equates to a single partition. A partition is equivalent to a split.
+
+|Name|Type|Description|
+|---|---|---|
+|partition_schema|Varchar|Child table schema name|
+|partition_name|Varchar|Child table name|
+
+In the case of Redshift, partition_schema and partition_name will always be "*".
+
+### Deploying The Connector
+
+To use the Amazon Athena JDBC Connector in your queries, navigate to the AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-jdbc dir, run `mvn clean install`.
+3. From the athena-jdbc dir, run `../tools/publish.sh S3_BUCKET_NAME athena-jdbc` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for the Serverless Application Repository to retrieve it. This gives users with the necessary permissions the ability to deploy instances of the connector via a 1-Click form. Then navigate to the [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo).
+
+# JDBC Driver Versions
+Currently supported versions of the JDBC drivers:
+
+|Database Type|Version|
+|---|---|
+|MySql|8.0.17|
+|PostGreSql|42.2.8|
+|Redshift|1.2.34.1058|
+
+# Limitations
+* Write DDL operations are not supported. Athena can currently only perform read operations, and it assumes that tables and relevant entities already exist in the database instance.
+* In a Mux setup, the spill bucket and prefix are shared across all database instances.
+* Any relevant Lambda limits apply. See the Lambda documentation.
diff --git a/athena-jdbc/athena-jdbc.yaml b/athena-jdbc/athena-jdbc.yaml
new file mode 100644
index 0000000000..1d8f484456
--- /dev/null
+++ b/athena-jdbc/athena-jdbc.yaml
@@ -0,0 +1,102 @@
+Transform: 'AWS::Serverless-2016-10-31'
+Metadata:
+  'AWS::ServerlessRepo::Application':
+    Name: AthenaJdbcConnector
+    Description: 'This connector enables Amazon Athena to communicate with your Database instance(s) using JDBC driver.'
+    Author: 'Amazon Athena'
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels:
+      - athena-federation
+    HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+Parameters:
+  LambdaFunctionName:
+    Description: 'The name you will give to the Function accessing your Database instance. This is also the name you can use to query your database from Athena using the registration-less catalog of "lambda:".'
+    Type: String
+  DefaultConnectionString:
+    Description: 'The default connection string is used when catalog is "lambda:${LambdaFunctionName}". Catalog specific Connection Strings can be added later. Format: ${DatabaseType}://${NativeJdbcConnectionString}.'
+    Type: String
+  SecretNamePrefix:
+    Description: 'Used to create resource-based authorization policy for "secretsmanager:GetSecretValue" action. E.g. All Athena JDBC Federation secret names can be prefixed with "AthenaJdbcFederation" and the authorization policy will allow "arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:AthenaJdbcFederation*". The parameter value in this case should be "AthenaJdbcFederation". If you do not have a prefix, you can manually update the IAM policy to allow any secret name.'
+    Type: String
+  SpillBucket:
+    Description: 'The bucket where this function can spill data.'
+    Type: String
+    Default: athena-federation-spill
+  SpillPrefix:
+    Description: 'The bucket prefix where this function can spill large responses.'
+    Type: String
+    Default: athena-spill/jdbc
+  LambdaTimeout:
+    Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)'
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: 'Lambda memory in MB (min 128 - 3008 max).'
+    Default: 3008
+    Type: Number
+  DisableSpillEncryption:
+    Description: 'If set to ''false'' data spilled to S3 is encrypted with AES GCM'
+    Default: 'false'
+    Type: String
+  SecurityGroupIds:
+    Description: 'One or more SecurityGroup IDs corresponding to the SecurityGroup that should be applied to the Lambda function. (e.g. sg1,sg2,sg3)'
+    Type: 'List<AWS::EC2::SecurityGroup::Id>'
+  SubnetIds:
+    Description: 'One or more Subnet IDs corresponding to the Subnet that the Lambda function can use to access your data source. (e.g. subnet1,subnet2)'
+    Type: 'List<AWS::EC2::Subnet::Id>'
+Resources:
+  JdbcConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      Environment:
+        Variables:
+          disable_spill_encryption: !Ref DisableSpillEncryption
+          spill_bucket: !Ref SpillBucket
+          spill_prefix: !Ref SpillPrefix
+          default: !Ref DefaultConnectionString
+      FunctionName: !Ref LambdaFunctionName
+      Handler: "com.amazonaws.connectors.athena.jdbc.MultiplexingJdbcCompositeHandler"
+      CodeUri: "./target/athena-jdbc-1.0.jar"
+      Description: "Enables Amazon Athena to communicate with Databases using JDBC"
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
+      Policies:
+        - Statement:
+            - Action:
+                - secretsmanager:GetSecretValue
+              Effect: Allow
+              Resource: !Sub 'arn:aws:secretsmanager:${AWS::Region}:${AWS::AccountId}:${SecretNamePrefix}*'
+          Version: '2012-10-17'
+        - Statement:
+            - Action:
+                - logs:CreateLogGroup
+              Effect: Allow
+              Resource: !Sub 'arn:aws:logs:${AWS::Region}:${AWS::AccountId}:*'
+          Version: '2012-10-17'
+        - Statement:
+            - Action:
+                - logs:CreateLogStream
+                - logs:PutLogEvents
+              Effect: Allow
+              Resource: !Sub 'arn:aws:logs:${AWS::Region}:${AWS::AccountId}:log-group:/aws/lambda/${LambdaFunctionName}:*'
+          Version: '2012-10-17'
+        - Statement:
+            - Action:
+                - athena:GetQueryExecution
+              Effect: Allow
+              Resource: '*'
+          Version: '2012-10-17'
+        #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy
+        #with one that is more restrictive and can only 'put' but not read, delete, or overwrite files.
+        - S3CrudPolicy:
+            BucketName: !Ref SpillBucket
+        #VPCAccessPolicy allows our connector to run in a VPC so that it can access your data source.
+        - VPCAccessPolicy: {}
+      VpcConfig:
+        SecurityGroupIds: !Ref SecurityGroupIds
+        SubnetIds: !Ref SubnetIds
\ No newline at end of file
diff --git a/athena-jdbc/pom.xml b/athena-jdbc/pom.xml
new file mode 100644
index 0000000000..1c06e9991c
--- /dev/null
+++ b/athena-jdbc/pom.xml
@@ -0,0 +1,92 @@
+<?xml version="1.0"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-jdbc</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.commons</groupId>
+            <artifactId>commons-text</artifactId>
+            <version>1.8</version>
+        </dependency>
+        <dependency>
+            <groupId>mysql</groupId>
+            <artifactId>mysql-connector-java</artifactId>
+            <version>8.0.17</version>
+        </dependency>
+        <dependency>
+            <groupId>org.antlr</groupId>
+            <artifactId>stringtemplate</artifactId>
+            <version>4.0.2</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-java-sdk-secretsmanager</artifactId>
+            <version>${aws-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>org.apache.arrow</groupId>
+            <artifactId>arrow-jdbc</artifactId>
+            <version>0.15.0</version>
+        </dependency>
+        <dependency>
+            <groupId>org.postgresql</groupId>
+            <artifactId>postgresql</artifactId>
+            <version>42.2.8</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazon.redshift</groupId>
+            <artifactId>redshift-jdbc42</artifactId>
+            <version>1.2.34.1058</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+
+    <repositories>
+        <repository>
+            <id>redshift</id>
+            <url>https://s3.amazonaws.com/redshift-maven-repository/release</url>
+        </repository>
+    </repositories>
+</project>
diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcCompositeHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcCompositeHandler.java
new file mode 100644
index 0000000000..752bd14122
--- /dev/null
+++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcCompositeHandler.java
@@ -0,0 +1,35 @@
+/*-
+ * #%L
+ * athena-jdbc
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose {@link MultiplexingJdbcMetadataHandler} and {@link MultiplexingJdbcRecordHandler}. + */ +public class MultiplexingJdbcCompositeHandler + extends CompositeHandler +{ + public MultiplexingJdbcCompositeHandler() + { + super(new MultiplexingJdbcMetadataHandler(), new MultiplexingJdbcRecordHandler()); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcMetadataHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcMetadataHandler.java new file mode 100644 index 0000000000..ae75e5f6e1 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcMetadataHandler.java @@ -0,0 +1,137 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JDBCUtil; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcMetadataHandler; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.annotations.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.Validate; + +import java.util.Map; + +/** + * Metadata handler multiplexer that supports multiple engines e.g. MySQL, Oracle, etc. in same Lambda. + */ +public class MultiplexingJdbcMetadataHandler + extends JdbcMetadataHandler +{ + private static final int MAX_CATALOGS_TO_MULTIPLEX = 100; + private final Map metadataHandlerMap; + + /** + * @param metadataHandlerMap catalog -> JdbcMetadataHandler + */ + @VisibleForTesting + MultiplexingJdbcMetadataHandler(final AWSSecretsManager secretsManager, final AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory, + final Map metadataHandlerMap, final DatabaseConnectionConfig databaseConnectionConfig) + { + super(databaseConnectionConfig, secretsManager, athena, jdbcConnectionFactory); + this.metadataHandlerMap = Validate.notEmpty(metadataHandlerMap, "metadataHandlerMap must not be empty"); + + if (this.metadataHandlerMap.size() > MAX_CATALOGS_TO_MULTIPLEX) { + throw new RuntimeException("Max 100 catalogs supported in multiplexer."); + } + } + + /** + * Initializes mux routing map. Creates a reverse index of Athena catalogs supported by a database instance. Max 100 catalogs supported currently. 
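+ * <p>
+ * Illustrative sketch (catalog names, hosts, and secret names below are assumed, not part
+ * of this change): given Lambda environment variables
+ * <pre>{@code
+ * default                 = "mysql://jdbc:mysql://host1:3306/db?${mysql-secret}"
+ * sales_connection_string = "postgres://jdbc:postgresql://host2:5432/sales"
+ * }</pre>
+ * a request for Athena catalog "sales" is routed to the PostGreSql delegate, the
+ * "default" entry (also registered under the "lambda:<function-name>" alias) is routed
+ * to the MySQL delegate, and any other catalog is rejected.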
+ */ + public MultiplexingJdbcMetadataHandler() + { + this.metadataHandlerMap = Validate.notEmpty(JDBCUtil.createJdbcMetadataHandlerMap(System.getenv()), "Could not find any delegatee."); + } + + private void validateMultiplexer(final String catalogName) + { + if (this.metadataHandlerMap.get(catalogName) == null) { + throw new RuntimeException("Catalog not supported in multiplexer " + catalogName); + } + } + + @Override + public Schema getPartitionSchema(final String catalogName) + { + validateMultiplexer(catalogName); + return this.metadataHandlerMap.get(catalogName).getPartitionSchema(catalogName); + } + + @Override + public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest listSchemasRequest) + { + validateMultiplexer(listSchemasRequest.getCatalogName()); + return this.metadataHandlerMap.get(listSchemasRequest.getCatalogName()).doListSchemaNames(blockAllocator, listSchemasRequest); + } + + @Override + public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest listTablesRequest) + { + validateMultiplexer(listTablesRequest.getCatalogName()); + return this.metadataHandlerMap.get(listTablesRequest.getCatalogName()).doListTables(blockAllocator, listTablesRequest); + } + + @Override + public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest getTableRequest) + { + validateMultiplexer(getTableRequest.getCatalogName()); + return this.metadataHandlerMap.get(getTableRequest.getCatalogName()).doGetTable(blockAllocator, getTableRequest); + } + + @Override + public void getPartitions(final BlockWriter blockWriter, final GetTableLayoutRequest getTableLayoutRequest, QueryStatusChecker queryStatusChecker) + throws Exception + { + validateMultiplexer(getTableLayoutRequest.getCatalogName()); + this.metadataHandlerMap.get(getTableLayoutRequest.getCatalogName()).getPartitions(blockWriter, getTableLayoutRequest, queryStatusChecker); + } + + @Override + public GetTableLayoutResponse doGetTableLayout(BlockAllocator blockAllocator, GetTableLayoutRequest getTableLayoutRequest) + throws Exception + { + validateMultiplexer(getTableLayoutRequest.getCatalogName()); + return this.metadataHandlerMap.get(getTableLayoutRequest.getCatalogName()).doGetTableLayout(blockAllocator, getTableLayoutRequest); + } + + @Override + public GetSplitsResponse doGetSplits( + final BlockAllocator blockAllocator, final GetSplitsRequest getSplitsRequest) + { + validateMultiplexer(getSplitsRequest.getCatalogName()); + return this.metadataHandlerMap.get(getSplitsRequest.getCatalogName()).doGetSplits(blockAllocator, getSplitsRequest); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcRecordHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcRecordHandler.java new file mode 100644 index 0000000000..fd75bf0598 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcRecordHandler.java @@ -0,0 +1,92 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JDBCUtil; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcRecordHandler; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.annotations.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.Validate; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.util.Map; + +/** + * Record handler multiplexer that supports multiple engines e.g. MySQL, Oracle, etc. in same Lambda. + */ +public class MultiplexingJdbcRecordHandler + extends JdbcRecordHandler +{ + private static final int MAX_CATALOGS_TO_MULTIPLEX = 100; + private final Map recordHandlerMap; + + public MultiplexingJdbcRecordHandler() + { + this.recordHandlerMap = Validate.notEmpty(JDBCUtil.createJdbcRecordHandlerMap(System.getenv()), "Could not find any delegatee."); + } + + @VisibleForTesting + MultiplexingJdbcRecordHandler(final AmazonS3 amazonS3, final AWSSecretsManager secretsManager, final AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory, + final DatabaseConnectionConfig databaseConnectionConfig, final Map recordHandlerMap) + { + super(amazonS3, secretsManager, athena, databaseConnectionConfig, jdbcConnectionFactory); + this.recordHandlerMap = Validate.notEmpty(recordHandlerMap, "recordHandlerMap must not be empty"); + + if (this.recordHandlerMap.size() > MAX_CATALOGS_TO_MULTIPLEX) { + throw new RuntimeException("Max 100 catalogs supported in multiplexer."); + } + } + + private void validateMultiplexer(final String catalogName) + { + if (this.recordHandlerMap.get(catalogName) == null) { + throw new RuntimeException("Catalog not supported in multiplexer " + catalogName); + } + } + + @Override + public void readWithConstraint( + final BlockSpiller blockSpiller, + final ReadRecordsRequest readRecordsRequest, QueryStatusChecker queryStatusChecker) + { + validateMultiplexer(readRecordsRequest.getCatalogName()); + this.recordHandlerMap.get(readRecordsRequest.getCatalogName()).readWithConstraint(blockSpiller, readRecordsRequest, queryStatusChecker); + } + + @Override + public PreparedStatement buildSplitSql(Connection jdbcConnection, String catalogName, TableName tableName, Schema schema, Constraints constraints, Split split) + throws SQLException + { + return 
this.recordHandlerMap.get(catalogName).buildSplitSql(jdbcConnection, catalogName, tableName, schema, constraints, split); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfig.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfig.java new file mode 100644 index 0000000000..c2d3c40cba --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfig.java @@ -0,0 +1,100 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import org.apache.commons.lang3.Validate; + +import java.util.Objects; + +public class DatabaseConnectionConfig +{ + private String catalog; + private final JdbcConnectionFactory.DatabaseEngine type; + private final String jdbcConnectionString; + private String secret; + + public DatabaseConnectionConfig(final String catalog, final JdbcConnectionFactory.DatabaseEngine type, final String jdbcConnectionString, final String secret) + { + this.catalog = Validate.notBlank(catalog, "catalog must not be blank"); + this.type = Validate.notNull(type, "type must not be blank"); + this.jdbcConnectionString = Validate.notBlank(jdbcConnectionString, "jdbcConnectionString must not be blank"); + this.secret = Validate.notBlank(secret, "secret must not be blank"); + } + + public DatabaseConnectionConfig(final String catalog, final JdbcConnectionFactory.DatabaseEngine type, final String jdbcConnectionString) + { + this.catalog = Validate.notBlank(catalog, "catalog must not be blank"); + this.type = Validate.notNull(type, "type must not be blank"); + this.jdbcConnectionString = Validate.notBlank(jdbcConnectionString, "jdbcConnectionString must not be blank"); + } + + public JdbcConnectionFactory.DatabaseEngine getType() + { + return type; + } + + public String getJdbcConnectionString() + { + return jdbcConnectionString; + } + + public String getCatalog() + { + return catalog; + } + + public String getSecret() + { + return secret; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + DatabaseConnectionConfig that = (DatabaseConnectionConfig) o; + return Objects.equals(getCatalog(), that.getCatalog()) && + getType() == that.getType() && + Objects.equals(getJdbcConnectionString(), that.getJdbcConnectionString()) && + Objects.equals(getSecret(), that.getSecret()); + } + + @Override + public int hashCode() + { + return Objects.hash(getCatalog(), getType(), getJdbcConnectionString(), getSecret()); + } + + @Override + public String toString() + { + return "DatabaseConnectionConfig{" + + "catalog='" + catalog + '\'' + + ", type=" + type + + ", jdbcConnectionString='" + jdbcConnectionString + '\'' + + ", secret='" + secret + '\'' + + '}'; + } +} diff --git 
a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfigBuilder.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfigBuilder.java new file mode 100644 index 0000000000..10d98120eb --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfigBuilder.java @@ -0,0 +1,145 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.Validate; + +import java.util.ArrayList; +import java.util.List; +import java.util.Map; +import java.util.Optional; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Builds database connection configuration per database instance. + */ +public class DatabaseConnectionConfigBuilder +{ + private static final String CONNECTION_STRING_PROPERTY_SUFFIX = "_connection_string"; + public static final String DEFAULT_CONNECTION_STRING_PROPERTY = "default"; + private static final int MUX_CATALOG_LIMIT = 100; + + private static final String CONNECTION_STRING_REGEX = "([a-zA-Z]+)://(.*)"; + private static final Pattern CONNECTION_STRING_PATTERN = Pattern.compile(CONNECTION_STRING_REGEX); + private static final String SECRET_PATTERN_STRING = "\\$\\{([a-zA-Z0-9/_+=.@-]+)}"; + public static final Pattern SECRET_PATTERN = Pattern.compile(SECRET_PATTERN_STRING); + + private Map properties; + + /** + * Utility to build database instance connection configurations from Environment variables. + * + * @return List of {@link DatabaseConnectionConfig} + */ + public static List buildFromSystemEnv() + { + return new DatabaseConnectionConfigBuilder() + .properties(System.getenv()) + .build(); + } + + /** + * Builder input all system properties. + * + * @param properties system properties + * @return {@link DatabaseConnectionConfigBuilder} + */ + public DatabaseConnectionConfigBuilder properties(final Map properties) + { + this.properties = properties; + return this; + } + + /** + * Builds Database instance configurations from input properties. 
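+ * <p>
+ * Minimal usage sketch (property values assumed for illustration):
+ * <pre>{@code
+ * Map<String, String> properties = ImmutableMap.of(
+ *         "default", "mysql://jdbc:mysql://host:3306/db?${rds-secret}",
+ *         "sales_connection_string", "postgres://jdbc:postgresql://host:5432/sales");
+ * List<DatabaseConnectionConfig> configs = new DatabaseConnectionConfigBuilder()
+ *         .properties(properties)
+ *         .build();
+ * // "default" parses to engine MYSQL with secret name "rds-secret";
+ * // "sales" parses to engine POSTGRES with no secret.
+ * }</pre>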
+ * + * @return List of {@link DatabaseConnectionConfig} + */ + public List build() + { + Validate.notEmpty(this.properties, "properties must not be empty"); + Validate.notBlank(this.properties.get(DEFAULT_CONNECTION_STRING_PROPERTY), "Default connection string must be present"); + + List databaseConnectionConfigs = new ArrayList<>(); + + int numberOfCatalogs = 0; + for (Map.Entry property : this.properties.entrySet()) { + final String key = property.getKey(); + final String value = property.getValue(); + + String catalogName; + if (DEFAULT_CONNECTION_STRING_PROPERTY.equals(key.toLowerCase())) { + catalogName = key.toLowerCase(); + } + else if (key.endsWith(CONNECTION_STRING_PROPERTY_SUFFIX)) { + catalogName = key.replace(CONNECTION_STRING_PROPERTY_SUFFIX, ""); + } + else { + // unknown property ignore + continue; + } + databaseConnectionConfigs.add(extractDatabaseConnectionConfig(catalogName, value)); + + numberOfCatalogs++; + if (numberOfCatalogs > MUX_CATALOG_LIMIT) { + throw new RuntimeException("Too many database instances in mux. Max supported is " + MUX_CATALOG_LIMIT); + } + } + + return databaseConnectionConfigs; + } + + private DatabaseConnectionConfig extractDatabaseConnectionConfig(final String catalogName, final String connectionString) + { + Matcher m = CONNECTION_STRING_PATTERN.matcher(connectionString); + final String dbType; + final String jdbcConnectionString; + if (m.find() && m.groupCount() == 2) { + dbType = m.group(1); + jdbcConnectionString = m.group(2); + } + else { + throw new RuntimeException("Invalid connection String for Catalog " + catalogName); + } + + Validate.notBlank(dbType, "Database type must not be blank."); + Validate.notBlank(jdbcConnectionString, "JDBC Connection string must not be blank."); + + JdbcConnectionFactory.DatabaseEngine databaseEngine = JdbcConnectionFactory.DatabaseEngine.valueOf(dbType.toUpperCase()); + + final Optional optionalSecretName = extractSecretName(jdbcConnectionString); + + return optionalSecretName.map(s -> new DatabaseConnectionConfig(catalogName, databaseEngine, jdbcConnectionString, s)) + .orElseGet(() -> new DatabaseConnectionConfig(catalogName, databaseEngine, jdbcConnectionString)); + } + + private Optional extractSecretName(final String jdbcConnectionString) + { + Matcher secretMatcher = SECRET_PATTERN.matcher(jdbcConnectionString); + String secretName = null; + if (secretMatcher.find() && secretMatcher.groupCount() == 1) { + secretName = secretMatcher.group(1); + } + + return StringUtils.isBlank(secretName) ? Optional.empty() : Optional.of(secretName); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionInfo.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionInfo.java new file mode 100644 index 0000000000..f64b8cf775 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionInfo.java @@ -0,0 +1,66 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 
+ * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import org.apache.commons.lang3.Validate; + +import java.util.Objects; + +public class DatabaseConnectionInfo +{ + private final String driverClassName; + private final int defaultPort; + + public DatabaseConnectionInfo(final String driverClassName, final int defaultPort) + { + this.driverClassName = Validate.notBlank(driverClassName, "driverClassName must not be blank"); + this.defaultPort = defaultPort; + } + + public String getDriverClassName() + { + return driverClassName; + } + + public int getDefaultPort() + { + return defaultPort; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + DatabaseConnectionInfo that = (DatabaseConnectionInfo) o; + return getDefaultPort() == that.getDefaultPort() && + Objects.equals(getDriverClassName(), that.getDriverClassName()); + } + + @Override + public int hashCode() + { + return Objects.hash(getDriverClassName(), getDefaultPort()); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/GenericJdbcConnectionFactory.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/GenericJdbcConnectionFactory.java new file mode 100644 index 0000000000..b85d856b2c --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/GenericJdbcConnectionFactory.java @@ -0,0 +1,118 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import com.google.common.collect.ImmutableMap; +import org.apache.commons.lang3.Validate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.DriverManager; +import java.sql.SQLException; +import java.util.Map; +import java.util.Properties; +import java.util.regex.Matcher; +import java.util.regex.Pattern; + +/** + * Provides a generic jdbc connection factory that can be used to connect to standard databases. Configures following + * defaults if not present: + *
+ * <ul>
+ * <li>Default ports will be used for the engine if not present.</li>
+ * </ul>
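+ * <p>
+ * Illustrative example (host and secret name assumed): a configured string such as
+ * {@code jdbc:mysql://host:3306/db?${my-rds-secret}} has its {@code ${my-rds-secret}}
+ * placeholder replaced at connect time with {@code user=...&password=...}, the values
+ * being resolved through the supplied {@link JdbcCredentialProvider}.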
+ */ +public class GenericJdbcConnectionFactory + implements JdbcConnectionFactory +{ + private static final Logger LOGGER = LoggerFactory.getLogger(GenericJdbcConnectionFactory.class); + + private static final String MYSQL_DRIVER_CLASS = "com.mysql.cj.jdbc.Driver"; + private static final int MYSQL_DEFAULT_PORT = 3306; + + private static final String POSTGRESQL_DRIVER_CLASS = "org.postgresql.Driver"; + private static final int POSTGRESQL_DEFAULT_PORT = 5432; + + private static final String REDSHIFT_DRIVER_CLASS = "com.amazon.redshift.jdbc.Driver"; + private static final int REDSHIFT_DEFAULT_PORT = 5439; + + private static final String ORACLE_DRIVER_CLASS = "oracle.jdbc.driver.OracleDriver"; + private static final int ORACLE_DEFAULT_PORT = 1521; + + private static final String SQL_SERVER_DRIVER_CLASS = "com.microsoft.sqlserver.jdbc.SQLServerDriver"; + private static final int SQL_SERVER_DEFAULT_PORT = 1433; + + private static final String SECRET_NAME_PATTERN_STRING = "(\\$\\{[a-zA-Z0-9/_+=.@-]+})"; + public static final Pattern SECRET_NAME_PATTERN = Pattern.compile(SECRET_NAME_PATTERN_STRING); + + private static final ImmutableMap CONNECTION_INFO = ImmutableMap.of( + DatabaseEngine.MYSQL, new DatabaseConnectionInfo(MYSQL_DRIVER_CLASS, MYSQL_DEFAULT_PORT), + DatabaseEngine.POSTGRES, new DatabaseConnectionInfo(POSTGRESQL_DRIVER_CLASS, POSTGRESQL_DEFAULT_PORT), + DatabaseEngine.REDSHIFT, new DatabaseConnectionInfo(REDSHIFT_DRIVER_CLASS, REDSHIFT_DEFAULT_PORT), + DatabaseEngine.ORACLE, new DatabaseConnectionInfo(ORACLE_DRIVER_CLASS, ORACLE_DEFAULT_PORT), + DatabaseEngine.SQLSERVER, new DatabaseConnectionInfo(SQL_SERVER_DRIVER_CLASS, SQL_SERVER_DEFAULT_PORT)); + + private final DatabaseConnectionConfig databaseConnectionConfig; + private final Properties jdbcProperties; + + public GenericJdbcConnectionFactory(final DatabaseConnectionConfig databaseConnectionConfig, final Map properties) + { + this.databaseConnectionConfig = Validate.notNull(databaseConnectionConfig, "databaseEngine must not be null"); + + this.jdbcProperties = new Properties(); + if (properties != null) { + this.jdbcProperties.putAll(properties); + } + } + + @Override + public Connection getConnection(final JdbcCredentialProvider jdbcCredentialProvider) + { + try { + DatabaseConnectionInfo databaseConnectionInfo = CONNECTION_INFO.get(this.databaseConnectionConfig.getType()); + + final String derivedJdbcString; + if (jdbcCredentialProvider != null) { + Matcher secretMatcher = SECRET_NAME_PATTERN.matcher(databaseConnectionConfig.getJdbcConnectionString()); + final String secretReplacement = String.format("user=%s&password=%s", jdbcCredentialProvider.getCredential().getUser(), + jdbcCredentialProvider.getCredential().getPassword()); + derivedJdbcString = secretMatcher.replaceAll(secretReplacement); + } + else { + derivedJdbcString = databaseConnectionConfig.getJdbcConnectionString(); + } + + // create connection string + LOGGER.info("Connection string {}", derivedJdbcString); + + // register driver + Class.forName(databaseConnectionInfo.getDriverClassName()).newInstance(); + + // create connection + return DriverManager.getConnection(derivedJdbcString, this.jdbcProperties); + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException); + } + catch (ClassNotFoundException | IllegalAccessException | InstantiationException ex) { + throw new RuntimeException(ex); + } + } +} diff --git 
a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcConnectionFactory.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcConnectionFactory.java new file mode 100644 index 0000000000..0e264346fc --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcConnectionFactory.java @@ -0,0 +1,60 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import java.sql.Connection; + +/** + * Factory abstracts creation of JDBC connection to database. + */ +public interface JdbcConnectionFactory +{ + /** + * Used to connect with standard databases. Default port for the engine will be used if value is negative. + * + * @param jdbcCredentialProvider jdbc user and password provider + * @return {@link Connection} + */ + Connection getConnection(JdbcCredentialProvider jdbcCredentialProvider); + + /** + * Databases supported to create JDBC connection. These would be connector names as well. + */ + enum DatabaseEngine + { + MYSQL("mysql"), + POSTGRES("postgres"), + ORACLE("oracle"), + SQLSERVER("sqlserver"), + REDSHIFT("redshift"); + + private final String dbName; + + DatabaseEngine(final String dbName) + { + this.dbName = dbName; + } + + public String getDbName() + { + return this.dbName; + } + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredential.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredential.java new file mode 100644 index 0000000000..b188a3370a --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredential.java @@ -0,0 +1,69 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import org.apache.commons.lang3.Validate; + +import java.util.Objects; + +/** + * Used to store Jdbc user and password information. 
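+ * <p>
+ * Minimal sketch (values assumed): {@code new StaticJdbcCredentialProvider(new JdbcCredential("etl_user", "etl_password"))}
+ * supplies a fixed credential, while {@link RdsSecretsCredentialProvider} derives one from a
+ * Secrets Manager JSON secret containing {@code username} and {@code password} keys.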
+ */ +public class JdbcCredential +{ + private final String user; + private final String password; + + public JdbcCredential(String user, String password) + { + this.user = Validate.notBlank(user, "User must not be blank"); + this.password = Validate.notBlank(password, "Password must not be blank"); + } + + public String getUser() + { + return user; + } + + public String getPassword() + { + return password; + } + + @Override + public boolean equals(Object o) + { + if (this == o) { + return true; + } + if (o == null || getClass() != o.getClass()) { + return false; + } + JdbcCredential that = (JdbcCredential) o; + return Objects.equals(getUser(), that.getUser()) && + Objects.equals(getPassword(), that.getPassword()); + } + + @Override + public int hashCode() + { + return Objects.hash(getUser(), getPassword()); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredentialProvider.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredentialProvider.java new file mode 100644 index 0000000000..161c0a4c68 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredentialProvider.java @@ -0,0 +1,33 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +/** + * JDBC username and password provider. + */ +public interface JdbcCredentialProvider +{ + /** + * Retrieves credential for database. + * + * @return {@link JdbcCredential} + */ + JdbcCredential getCredential(); +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/RdsSecretsCredentialProvider.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/RdsSecretsCredentialProvider.java new file mode 100644 index 0000000000..199fa62e3f --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/RdsSecretsCredentialProvider.java @@ -0,0 +1,56 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import com.fasterxml.jackson.databind.ObjectMapper; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.util.HashMap; +import java.util.Map; + +public class RdsSecretsCredentialProvider + implements JdbcCredentialProvider +{ + private static final Logger LOGGER = LoggerFactory.getLogger(RdsSecretsCredentialProvider.class); + private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper(); + + private final JdbcCredential jdbcCredential; + + public RdsSecretsCredentialProvider(final String secretString) + { + Map rdsSecrets; + try { + rdsSecrets = OBJECT_MAPPER.readValue(secretString, HashMap.class); + } + catch (IOException ioException) { + throw new RuntimeException("Could not deserialize RDS credentials into HashMap", ioException); + } + + this.jdbcCredential = new JdbcCredential(rdsSecrets.get("username"), rdsSecrets.get("password")); + } + + @Override + public JdbcCredential getCredential() + { + return this.jdbcCredential; + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/StaticJdbcCredentialProvider.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/StaticJdbcCredentialProvider.java new file mode 100644 index 0000000000..bf3a5326f8 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/connection/StaticJdbcCredentialProvider.java @@ -0,0 +1,44 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.Validate; + +public class StaticJdbcCredentialProvider + implements JdbcCredentialProvider +{ + private final JdbcCredential jdbcCredential; + + public StaticJdbcCredentialProvider(final JdbcCredential jdbcCredential) + { + this.jdbcCredential = Validate.notNull(jdbcCredential, "jdbcCredential must not be null."); + + if (StringUtils.isAnyBlank(jdbcCredential.getUser(), jdbcCredential.getPassword())) { + throw new RuntimeException("User or password must not be blank."); + } + } + + @Override + public JdbcCredential getCredential() + { + return this.jdbcCredential; + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JDBCUtil.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JDBCUtil.java new file mode 100644 index 0000000000..27e2f28299 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JDBCUtil.java @@ -0,0 +1,151 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfigBuilder; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.mysql.MySqlMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.mysql.MySqlRecordHandler; +import com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlRecordHandler; +import com.google.common.collect.ImmutableMap; +import org.apache.commons.lang3.Validate; + +import java.util.List; +import java.util.Map; + +public final class JDBCUtil +{ + private static final String DEFAULT_CATALOG_PREFIX = "lambda:"; + private static final String LAMBDA_FUNCTION_NAME_PROPERTY = "AWS_LAMBDA_FUNCTION_NAME"; + private JDBCUtil() {} + + public static DatabaseConnectionConfig getSingleDatabaseConfigFromEnv(final JdbcConnectionFactory.DatabaseEngine databaseEngine) + { + List databaseConnectionConfigs = DatabaseConnectionConfigBuilder.buildFromSystemEnv(); + + for (DatabaseConnectionConfig databaseConnectionConfig : databaseConnectionConfigs) { + if (DatabaseConnectionConfigBuilder.DEFAULT_CONNECTION_STRING_PROPERTY.equals(databaseConnectionConfig.getCatalog())) { + return databaseConnectionConfig; + } + } + + throw new RuntimeException("Must provide default connection string parameter " + DatabaseConnectionConfigBuilder.DEFAULT_CONNECTION_STRING_PROPERTY); + } + + /** + * Creates a map of Catalog to respective metadata handler to be used by Multiplexer. + * + * @param properties system properties. 
+ * @return Map of String -> {@link JdbcMetadataHandler} + */ + public static Map createJdbcMetadataHandlerMap(final Map properties) + { + ImmutableMap.Builder metadataHandlerMap = ImmutableMap.builder(); + + final String functionName = Validate.notBlank(properties.get(LAMBDA_FUNCTION_NAME_PROPERTY), "Lambda function name not present in environment."); + List databaseConnectionConfigs = new DatabaseConnectionConfigBuilder().properties(properties).build(); + + if (databaseConnectionConfigs.isEmpty()) { + throw new RuntimeException("At least one connection string required."); + } + + boolean defaultPresent = false; + + for (DatabaseConnectionConfig databaseConnectionConfig : databaseConnectionConfigs) { + JdbcMetadataHandler jdbcMetadataHandler = createJdbcMetadataHandler(databaseConnectionConfig); + metadataHandlerMap.put(databaseConnectionConfig.getCatalog(), jdbcMetadataHandler); + + if (DatabaseConnectionConfigBuilder.DEFAULT_CONNECTION_STRING_PROPERTY.equals(databaseConnectionConfig.getCatalog())) { + metadataHandlerMap.put(DEFAULT_CATALOG_PREFIX + functionName, jdbcMetadataHandler); + defaultPresent = true; + } + } + + if (!defaultPresent) { + throw new RuntimeException("Must provide connection parameters for default database instance " + DatabaseConnectionConfigBuilder.DEFAULT_CONNECTION_STRING_PROPERTY); + } + + return metadataHandlerMap.build(); + } + + private static JdbcMetadataHandler createJdbcMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig) + { + switch (databaseConnectionConfig.getType()) { + case MYSQL: + return new MySqlMetadataHandler(databaseConnectionConfig); + case REDSHIFT: + case POSTGRES: + return new PostGreSqlMetadataHandler(databaseConnectionConfig); + default: + throw new RuntimeException("Mux: Unhandled database engine " + databaseConnectionConfig.getType()); + } + } + + /** + * Creates a map of Catalog to respective record handler to be used by Multiplexer. + * + * @param properties system properties. 
+ * @return Map of String -> {@link JdbcRecordHandler} + */ + public static Map createJdbcRecordHandlerMap(final Map properties) + { + ImmutableMap.Builder recordHandlerMap = ImmutableMap.builder(); + + final String functionName = Validate.notBlank(properties.get(LAMBDA_FUNCTION_NAME_PROPERTY), "Lambda function name not present in environment."); + List databaseConnectionConfigs = new DatabaseConnectionConfigBuilder().properties(properties).build(); + + if (databaseConnectionConfigs.isEmpty()) { + throw new RuntimeException("At least one connection string required."); + } + + boolean defaultPresent = false; + + for (DatabaseConnectionConfig databaseConnectionConfig : databaseConnectionConfigs) { + JdbcRecordHandler jdbcRecordHandler = createJdbcRecordHandler(databaseConnectionConfig); + recordHandlerMap.put(databaseConnectionConfig.getCatalog(), jdbcRecordHandler); + + if (DatabaseConnectionConfigBuilder.DEFAULT_CONNECTION_STRING_PROPERTY.equals(databaseConnectionConfig.getCatalog())) { + recordHandlerMap.put(DEFAULT_CATALOG_PREFIX + functionName, jdbcRecordHandler); + defaultPresent = true; + } + } + + if (!defaultPresent) { + throw new RuntimeException("Must provide connection parameters for default database instance " + DatabaseConnectionConfigBuilder.DEFAULT_CONNECTION_STRING_PROPERTY); + } + + return recordHandlerMap.build(); + } + + private static JdbcRecordHandler createJdbcRecordHandler(final DatabaseConnectionConfig databaseConnectionConfig) + { + switch (databaseConnectionConfig.getType()) { + case MYSQL: + return new MySqlRecordHandler(databaseConnectionConfig); + case REDSHIFT: + case POSTGRES: + return new PostGreSqlRecordHandler(databaseConnectionConfig); + default: + throw new RuntimeException("Mux: Unhandled database engine " + databaseConnectionConfig.getType()); + } + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcArrowTypeConverter.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcArrowTypeConverter.java new file mode 100644 index 0000000000..84811c4e71 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcArrowTypeConverter.java @@ -0,0 +1,40 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import org.apache.arrow.adapter.jdbc.JdbcFieldInfo; +import org.apache.arrow.adapter.jdbc.JdbcToArrowUtils; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +public final class JdbcArrowTypeConverter +{ + private static final Logger LOGGER = LoggerFactory.getLogger(JdbcMetadataHandler.class); + + private JdbcArrowTypeConverter() {} + + public static ArrowType toArrowType(final int jdbcType, final int precision, final int scale) + { + return JdbcToArrowUtils.getArrowTypeForJdbcField( + new JdbcFieldInfo(jdbcType, precision, scale), + null); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcMetadataHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcMetadataHandler.java new file mode 100644 index 0000000000..64950343d4 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcMetadataHandler.java @@ -0,0 +1,260 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.connectors.athena.jdbc.connection.RdsSecretsCredentialProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.annotations.VisibleForTesting; +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableList; +import 
com.google.common.collect.ImmutableSet; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.Validate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.DatabaseMetaData; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; + +public abstract class JdbcMetadataHandler + extends MetadataHandler +{ + private static final Logger LOGGER = LoggerFactory.getLogger(JdbcMetadataHandler.class); + private final JdbcConnectionFactory jdbcConnectionFactory; + private final DatabaseConnectionConfig databaseConnectionConfig; + + /** + * Used only by Multiplexing handler. All calls will be delegated to respective database handler. + */ + protected JdbcMetadataHandler() + { + super(null); + this.jdbcConnectionFactory = null; + this.databaseConnectionConfig = null; + } + + protected JdbcMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig, final JdbcConnectionFactory jdbcConnectionFactory) + { + super(databaseConnectionConfig.getType().getDbName()); + this.jdbcConnectionFactory = Validate.notNull(jdbcConnectionFactory, "jdbcConnectionFactory must not be null"); + + this.databaseConnectionConfig = Validate.notNull(databaseConnectionConfig, "databaseConnectionConfig must not be null"); + } + + @VisibleForTesting + protected JdbcMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig, final AWSSecretsManager secretsManager, + final AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory) + { + super(null, secretsManager, athena, databaseConnectionConfig.getType().getDbName(), null, null); + this.jdbcConnectionFactory = Validate.notNull(jdbcConnectionFactory, "jdbcConnectionFactory must not be null"); + this.databaseConnectionConfig = Validate.notNull(databaseConnectionConfig, "databaseConnectionConfig must not be null"); + } + + protected JdbcConnectionFactory getJdbcConnectionFactory() + { + return jdbcConnectionFactory; + } + + protected JdbcCredentialProvider getCredentialProvider() + { + final String secretName = databaseConnectionConfig.getSecret(); + if (StringUtils.isNotBlank(secretName)) { + LOGGER.info("Using Secrets Manager."); + return new RdsSecretsCredentialProvider(getSecret(secretName)); + } + + return null; + } + + @Override + public ListSchemasResponse doListSchemaNames(final BlockAllocator blockAllocator, final ListSchemasRequest listSchemasRequest) + { + try (Connection connection = jdbcConnectionFactory.getConnection(getCredentialProvider())) { + LOGGER.info("{}: List schema names for Catalog {}", listSchemasRequest.getQueryId(), listSchemasRequest.getCatalogName()); + return new ListSchemasResponse(listSchemasRequest.getCatalogName(), listDatabaseNames(connection)); + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage()); + } + } + + private Set listDatabaseNames(final Connection jdbcConnection) + throws SQLException + { + try (ResultSet resultSet = jdbcConnection.getMetaData().getSchemas()) { + ImmutableSet.Builder schemaNames = ImmutableSet.builder(); + while (resultSet.next()) { + String schemaName = resultSet.getString("TABLE_SCHEM"); + // skip internal schemas + if (!schemaName.equals("information_schema")) { + 
schemaNames.add(schemaName); + } + } + return schemaNames.build(); + } + } + + @Override + public ListTablesResponse doListTables(final BlockAllocator blockAllocator, final ListTablesRequest listTablesRequest) + { + try (Connection connection = jdbcConnectionFactory.getConnection(getCredentialProvider())) { + LOGGER.info("{}: List table names for Catalog {}, Table {}", listTablesRequest.getQueryId(), listTablesRequest.getCatalogName(), listTablesRequest.getSchemaName()); + return new ListTablesResponse(listTablesRequest.getCatalogName(), listTables(connection, listTablesRequest.getSchemaName())); + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage()); + } + } + + private List listTables(final Connection jdbcConnection, final String databaseName) + throws SQLException + { + try (ResultSet resultSet = getTables(jdbcConnection, databaseName)) { + ImmutableList.Builder list = ImmutableList.builder(); + while (resultSet.next()) { + list.add(getSchemaTableName(resultSet)); + } + return list.build(); + } + } + + private ResultSet getTables(final Connection connection, final String schemaName) + throws SQLException + { + DatabaseMetaData metadata = connection.getMetaData(); + String escape = metadata.getSearchStringEscape(); + return metadata.getTables( + connection.getCatalog(), + escapeNamePattern(schemaName, escape), + null, + new String[] {"TABLE", "VIEW"}); + } + + private TableName getSchemaTableName(final ResultSet resultSet) + throws SQLException + { + return new TableName( + resultSet.getString("TABLE_SCHEM"), + resultSet.getString("TABLE_NAME")); + } + + protected String escapeNamePattern(final String name, final String escape) + { + if ((name == null) || (escape == null)) { + return name; + } + Preconditions.checkArgument(!escape.equals("_"), "Escape string must not be '_'"); + Preconditions.checkArgument(!escape.equals("%"), "Escape string must not be '%'"); + String escapedName = name.replace(escape, escape + escape); + escapedName = escapedName.replace("_", escape + "_"); + escapedName = escapedName.replace("%", escape + "%"); + return escapedName; + } + + @Override + public GetTableResponse doGetTable(final BlockAllocator blockAllocator, final GetTableRequest getTableRequest) + { + try (Connection connection = jdbcConnectionFactory.getConnection(getCredentialProvider())) { + Schema partitionSchema = getPartitionSchema(getTableRequest.getCatalogName()); + return new GetTableResponse(getTableRequest.getCatalogName(), getTableRequest.getTableName(), getSchema(connection, getTableRequest.getTableName(), partitionSchema), + partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet())); + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage()); + } + } + + private Schema getSchema(Connection jdbcConnection, TableName tableName, Schema partitionSchema) + throws SQLException + { + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder(); + + try (ResultSet resultSet = getColumns(jdbcConnection.getCatalog(), tableName, jdbcConnection.getMetaData())) { + boolean found = false; + while (resultSet.next()) { + found = true; + ArrowType columnType = JdbcArrowTypeConverter.toArrowType( + resultSet.getInt("DATA_TYPE"), + resultSet.getInt("COLUMN_SIZE"), + resultSet.getInt("DECIMAL_DIGITS")); + String columnName = resultSet.getString("COLUMN_NAME"); + schemaBuilder.addField(FieldBuilder.newBuilder(columnName, 
columnType).build()); + } + if (!found) { + throw new RuntimeException("Could not find table in " + tableName.getSchemaName()); + } + + // add partition columns + partitionSchema.getFields().forEach(schemaBuilder::addField); + + return schemaBuilder.build(); + } + } + + private ResultSet getColumns(final String catalogName, final TableName tableHandle, final DatabaseMetaData metadata) + throws SQLException + { + String escape = metadata.getSearchStringEscape(); + return metadata.getColumns( + catalogName, + escapeNamePattern(tableHandle.getSchemaName(), escape), + escapeNamePattern(tableHandle.getTableName(), escape), + null); + } + + public abstract Schema getPartitionSchema(final String catalogName); + + @Override + public abstract void getPartitions( + final BlockWriter blockWriter, + final GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker) + throws Exception; + + @Override + public abstract GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest getSplitsRequest); +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcRecordHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcRecordHandler.java new file mode 100644 index 0000000000..a0292fef2b --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcRecordHandler.java @@ -0,0 +1,183 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.connectors.athena.jdbc.connection.RdsSecretsCredentialProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.adapter.jdbc.JdbcFieldInfo; +import org.apache.arrow.adapter.jdbc.JdbcToArrowUtils; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.Validate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.ResultSetMetaData; +import java.sql.SQLException; +import java.util.Calendar; +import java.util.HashMap; +import java.util.Map; + +public abstract class JdbcRecordHandler + extends RecordHandler +{ + private static final Logger LOGGER = LoggerFactory.getLogger(JdbcRecordHandler.class); + // TODO support all data types + private static final ImmutableMap> VALUE_EXTRACTOR = new ImmutableMap.Builder>() + .put(Types.MinorType.BIT, ResultSet::getBoolean) + .put(Types.MinorType.TINYINT, ResultSet::getByte) + .put(Types.MinorType.SMALLINT, ResultSet::getShort) + .put(Types.MinorType.INT, ResultSet::getInt) + .put(Types.MinorType.BIGINT, ResultSet::getLong) + .put(Types.MinorType.FLOAT4, ResultSet::getFloat) + .put(Types.MinorType.FLOAT8, ResultSet::getDouble) + .put(Types.MinorType.DATEDAY, ResultSet::getDate) + .put(Types.MinorType.DATEMILLI, ResultSet::getTimestamp) + .put(Types.MinorType.VARCHAR, ResultSet::getString) + .put(Types.MinorType.VARBINARY, ResultSet::getBytes) + .put(Types.MinorType.DECIMAL, ResultSet::getBigDecimal) + .build(); + private final JdbcConnectionFactory jdbcConnectionFactory; + private final DatabaseConnectionConfig databaseConnectionConfig; + + /** + * Used only by Multiplexing handler. All invocations will be delegated to respective database handler. 
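+ * <p>
+ * In this mode {@code jdbcConnectionFactory} and {@code databaseConnectionConfig} stay
+ * {@code null}: the multiplexer overrides {@code readWithConstraint} and
+ * {@code buildSplitSql} to forward to a per-catalog delegate, so these fields are never
+ * dereferenced on this instance.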
+ */ + protected JdbcRecordHandler() + { + super(null); + this.jdbcConnectionFactory = null; + this.databaseConnectionConfig = null; + } + + protected JdbcRecordHandler(final AmazonS3 amazonS3, final AWSSecretsManager secretsManager, AmazonAthena athena, final DatabaseConnectionConfig databaseConnectionConfig, + final JdbcConnectionFactory jdbcConnectionFactory) + { + super(amazonS3, secretsManager, athena, databaseConnectionConfig.getType().getDbName()); + this.jdbcConnectionFactory = Validate.notNull(jdbcConnectionFactory, "jdbcConnectionFactory must not be null"); + this.databaseConnectionConfig = Validate.notNull(databaseConnectionConfig, "databaseConnectionConfig must not be null"); + } + + private JdbcCredentialProvider getCredentialProvider() + { + final String secretName = this.databaseConnectionConfig.getSecret(); + if (StringUtils.isNotBlank(secretName)) { + return new RdsSecretsCredentialProvider(getSecret(secretName)); + } + + return null; + } + + @Override + public void readWithConstraint(BlockSpiller blockSpiller, ReadRecordsRequest readRecordsRequest, QueryStatusChecker queryStatusChecker) + { + LOGGER.info("{}: Catalog: {}, table {}, splits {}", readRecordsRequest.getQueryId(), readRecordsRequest.getCatalogName(), readRecordsRequest.getTableName(), + readRecordsRequest.getSplit().getProperties()); + try (Connection connection = this.jdbcConnectionFactory.getConnection(getCredentialProvider()); + PreparedStatement preparedStatement = buildSplitSql(connection, readRecordsRequest.getCatalogName(), readRecordsRequest.getTableName(), readRecordsRequest.getSchema(), + readRecordsRequest.getConstraints(), readRecordsRequest.getSplit()); + ResultSet resultSet = preparedStatement.executeQuery()) { + ResultSetMetaData resultSetMetaData = resultSet.getMetaData(); + + Map<String, String> partitionValues = readRecordsRequest.getSplit().getProperties(); + + final Map<String, Types.MinorType> typeMap = new HashMap<>(); + int columnIndex = 1; + for (Field nextField : readRecordsRequest.getSchema().getFields()) { + if (partitionValues.containsKey(nextField.getName())) { + continue; //ignore partition columns + } + Types.MinorType minorTypeForArrowType = Types.getMinorTypeForArrowType(JdbcToArrowUtils.getArrowTypeForJdbcField(new JdbcFieldInfo(resultSetMetaData, columnIndex), + Calendar.getInstance())); + typeMap.put(nextField.getName(), minorTypeForArrowType); + columnIndex++; + } + + while (resultSet.next()) { + if (!queryStatusChecker.isQueryRunning()) { + return; + } + blockSpiller.writeRows((Block block, int rowNum) -> { + try { + boolean matched; + for (Field nextField : readRecordsRequest.getSchema().getFields()) { + Object value; + if (partitionValues.containsKey(nextField.getName())) { + value = partitionValues.get(nextField.getName()); + } + else { + value = getArrowValue(resultSet, nextField.getName(), typeMap.get(nextField.getName())); + } + matched = block.offerValue(nextField.getName(), rowNum, value); + if (!matched) { + return 0; + } + } + + return 1; + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage(), sqlException); + } + }); + } + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage(), sqlException); + } + } + + private Object getArrowValue(final ResultSet resultSet, final String columnName, final Types.MinorType minorType) + throws SQLException + { + return VALUE_EXTRACTOR.getOrDefault(minorType, (rs, col) -> { + throw new RuntimeException("Unhandled
column type " + minorType); + }) + .call(resultSet, columnName); + } + + public abstract PreparedStatement buildSplitSql(Connection jdbcConnection, String catalogName, TableName tableName, Schema schema, Constraints constraints, Split split) + throws SQLException; + + private interface ResultSetValueExtractor<T> + { + T call(ResultSet resultSet, String columnName) + throws SQLException; + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcSplitQueryBuilder.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcSplitQueryBuilder.java new file mode 100644 index 0000000000..d5e6485393 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcSplitQueryBuilder.java @@ -0,0 +1,289 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.google.common.base.Joiner; +import com.google.common.base.Preconditions; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.Iterables; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.Validate; +import org.joda.time.DateTimeZone; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.math.BigDecimal; +import java.sql.Connection; +import java.sql.Date; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.sql.Timestamp; +import java.util.ArrayList; +import java.util.Collections; +import java.util.List; +import java.util.Map; +import java.util.concurrent.TimeUnit; +import java.util.stream.Collectors; + +/** + * Builds the database query for a single table split, including the split's partition clause and any pushed-down predicates.
+ */ +public abstract class JdbcSplitQueryBuilder +{ + private static final Logger LOGGER = LoggerFactory.getLogger(JdbcSplitQueryBuilder.class); + + private static final int MILLIS_SHIFT = 12; + + private final String quoteCharacters; + + public JdbcSplitQueryBuilder(String quoteCharacters) + { + this.quoteCharacters = Validate.notBlank(quoteCharacters, "quoteCharacters must not be blank"); + } + + public PreparedStatement buildSql( + final Connection jdbcConnection, + final String catalog, + final String schema, + final String table, + final Schema tableSchema, + final Constraints constraints, + final Split split) + throws SQLException + { + StringBuilder sql = new StringBuilder(); + + String columnNames = tableSchema.getFields().stream() + .map(Field::getName) + .filter(c -> !split.getProperties().containsKey(c)) + .map(this::quote) + .collect(Collectors.joining(", ")); + + sql.append("SELECT "); + sql.append(columnNames); + if (columnNames.isEmpty()) { + sql.append("null"); + } + + sql.append(getFromClauseWithSplit(catalog, schema, table, split)); + + List<TypeAndValue> accumulator = new ArrayList<>(); + + List<String> clauses = toConjuncts(tableSchema.getFields(), constraints, accumulator, split.getProperties()); + if (!clauses.isEmpty()) { + sql.append(" WHERE ") + .append(Joiner.on(" AND ").join(clauses)); + } + + PreparedStatement statement = jdbcConnection.prepareStatement(sql.toString()); + + // TODO all types, converts Arrow values to JDBC. + for (int i = 0; i < accumulator.size(); i++) { + TypeAndValue typeAndValue = accumulator.get(i); + + Types.MinorType minorTypeForArrowType = Types.getMinorTypeForArrowType(typeAndValue.getType()); + + switch (minorTypeForArrowType) { + case BIGINT: + statement.setLong(i + 1, (long) typeAndValue.getValue()); + break; + case INT: + statement.setInt(i + 1, ((Number) typeAndValue.getValue()).intValue()); + break; + case SMALLINT: + statement.setShort(i + 1, ((Number) typeAndValue.getValue()).shortValue()); + break; + case TINYINT: + statement.setByte(i + 1, ((Number) typeAndValue.getValue()).byteValue()); + break; + case FLOAT8: + statement.setDouble(i + 1, (double) typeAndValue.getValue()); + break; + case FLOAT4: + statement.setFloat(i + 1, (float) typeAndValue.getValue()); + break; + case BIT: + statement.setBoolean(i + 1, (boolean) typeAndValue.getValue()); + break; + case DATEDAY: + long millis = TimeUnit.DAYS.toMillis((long) typeAndValue.getValue()); + statement.setDate(i + 1, new Date(DateTimeZone.UTC.getMillisKeepLocal(DateTimeZone.getDefault(), millis))); + break; + case DATEMILLI: + statement.setTimestamp(i + 1, new Timestamp((long) typeAndValue.getValue())); + break; + case VARCHAR: + statement.setString(i + 1, String.valueOf(typeAndValue.getValue())); + break; + case VARBINARY: + statement.setBytes(i + 1, (byte[]) typeAndValue.getValue()); + break; + case DECIMAL: + ArrowType.Decimal decimalType = (ArrowType.Decimal) typeAndValue.getType(); + statement.setBigDecimal(i + 1, BigDecimal.valueOf((long) typeAndValue.getValue(), decimalType.getScale())); + break; + default: + throw new UnsupportedOperationException(String.format("Can't handle type: %s, %s", typeAndValue.getType(), minorTypeForArrowType)); + } + } + + return statement; + } + + protected abstract String getFromClauseWithSplit(final String catalog, final String schema, final String table, final Split split); + + private List<String> toConjuncts(List<Field> columns, Constraints constraints, List<TypeAndValue> accumulator, Map<String, String> partitionSplit) + { + ImmutableList.Builder<String> builder = ImmutableList.builder(); + for (Field column :
columns) { + if (partitionSplit.containsKey(column.getName())) { + continue; // Ignore constraints on partition name as RDBMS does not contain these as columns. Presto will filter these values. + } + ArrowType type = column.getType(); + if (constraints.getSummary() != null && !constraints.getSummary().isEmpty()) { + ValueSet valueSet = constraints.getSummary().get(column.getName()); + if (valueSet != null) { + builder.add(toPredicate(column.getName(), valueSet, type, accumulator)); + } + } + } + return builder.build(); + } + + private String toPredicate(String columnName, ValueSet valueSet, ArrowType type, List<TypeAndValue> accumulator) + { + List<String> disjuncts = new ArrayList<>(); + List<Object> singleValues = new ArrayList<>(); + + // TODO Add isNone and isAll checks once we have data on nullability. + + if (valueSet instanceof SortedRangeSet) { + if (valueSet.isNone() && valueSet.isNullAllowed()) { + return String.format("(%s IS NULL)", columnName); + } + + if (valueSet.isNullAllowed()) { + disjuncts.add(String.format("(%s IS NULL)", columnName)); + } + + Range rangeSpan = ((SortedRangeSet) valueSet).getSpan(); + if (!valueSet.isNullAllowed() && rangeSpan.getLow().isLowerUnbounded() && rangeSpan.getHigh().isUpperUnbounded()) { + return String.format("(%s IS NOT NULL)", columnName); + } + + for (Range range : valueSet.getRanges().getOrderedRanges()) { + if (range.isSingleValue()) { + singleValues.add(range.getLow().getValue()); + } + else { + List<String> rangeConjuncts = new ArrayList<>(); + if (!range.getLow().isLowerUnbounded()) { + switch (range.getLow().getBound()) { + case ABOVE: + rangeConjuncts.add(toPredicate(columnName, ">", range.getLow().getValue(), type, accumulator)); + break; + case EXACTLY: + rangeConjuncts.add(toPredicate(columnName, ">=", range.getLow().getValue(), type, accumulator)); + break; + case BELOW: + throw new IllegalArgumentException("Low marker should never use BELOW bound"); + default: + throw new AssertionError("Unhandled bound: " + range.getLow().getBound()); + } + } + if (!range.getHigh().isUpperUnbounded()) { + switch (range.getHigh().getBound()) { + case ABOVE: + throw new IllegalArgumentException("High marker should never use ABOVE bound"); + case EXACTLY: + rangeConjuncts.add(toPredicate(columnName, "<=", range.getHigh().getValue(), type, accumulator)); + break; + case BELOW: + rangeConjuncts.add(toPredicate(columnName, "<", range.getHigh().getValue(), type, accumulator)); + break; + default: + throw new AssertionError("Unhandled bound: " + range.getHigh().getBound()); + } + } + // If rangeConjuncts is empty, then the range was ALL, which should already have been checked for + Preconditions.checkState(!rangeConjuncts.isEmpty()); + disjuncts.add("(" + Joiner.on(" AND ").join(rangeConjuncts) + ")"); + } + } + + // Add back all of the possible single values either as an equality or an IN predicate + if (singleValues.size() == 1) { + disjuncts.add(toPredicate(columnName, "=", Iterables.getOnlyElement(singleValues), type, accumulator)); + } + else if (singleValues.size() > 1) { + for (Object value : singleValues) { + accumulator.add(new TypeAndValue(type, value)); + } + String values = Joiner.on(",").join(Collections.nCopies(singleValues.size(), "?")); + disjuncts.add(quote(columnName) + " IN (" + values + ")"); + } + } + + return "(" + Joiner.on(" OR ").join(disjuncts) + ")"; + } + + private String toPredicate(String columnName, String operator, Object value, ArrowType type, + List<TypeAndValue> accumulator) + { + accumulator.add(new TypeAndValue(type, value)); + return quote(columnName) + " " +
operator + " ?"; + } + + protected String quote(String name) + { + name = name.replace(quoteCharacters, quoteCharacters + quoteCharacters); + return quoteCharacters + name + quoteCharacters; + } + + private static class TypeAndValue + { + private final ArrowType type; + private final Object value; + + TypeAndValue(ArrowType type, Object value) + { + this.type = Validate.notNull(type, "type is null"); + this.value = Validate.notNull(value, "value is null"); + } + + ArrowType getType() + { + return type; + } + + Object getValue() + { + return value; + } + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/PreparedStatementBuilder.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/PreparedStatementBuilder.java new file mode 100644 index 0000000000..e1ab8a8a1f --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/manager/PreparedStatementBuilder.java @@ -0,0 +1,79 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import org.apache.commons.lang3.Validate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.util.List; + +public class PreparedStatementBuilder +{ + private static final Logger LOGGER = LoggerFactory.getLogger(PreparedStatementBuilder.class); + + private String query; + private List<String> parameters; + private Connection connection; + + public PreparedStatementBuilder withParameters(final List<String> parameters) + { + this.parameters = Validate.notEmpty(parameters, "parameters must not be empty"); + return this; + } + + public PreparedStatementBuilder withQuery(final String query) + { + this.query = Validate.notBlank(query, "query must not be blank"); + return this; + } + + public PreparedStatementBuilder withConnection(final Connection connection) + { + this.connection = Validate.notNull(connection, "connection must not be null"); + return this; + } + + public PreparedStatement build() + { + Validate.notEmpty(parameters, "parameters must not be empty"); + Validate.notBlank(query, "query must not be blank"); + Validate.notNull(connection, "connection must not be null"); + + LOGGER.info("Running query {}", this.query); + LOGGER.info("Parameters {}", this.parameters); + + try { + PreparedStatement preparedStatement = connection.prepareStatement(this.query); + + for (int i = 1; i <= parameters.size(); i++) { + preparedStatement.setString(i, parameters.get(i - 1)); + } + + return preparedStatement; + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage(), sqlException); + } + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlCompositeHandler.java
b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlCompositeHandler.java new file mode 100644 index 0000000000..bfbdcb9f33 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.mysql; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose {@link MySqlMetadataHandler} and {@link MySqlRecordHandler}. + */ +public class MySqlCompositeHandler + extends CompositeHandler +{ + public MySqlCompositeHandler() + { + super(new MySqlMetadataHandler(), new MySqlRecordHandler()); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlMetadataHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlMetadataHandler.java new file mode 100644 index 0000000000..3220cad210 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlMetadataHandler.java @@ -0,0 +1,190 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.mysql; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.GenericJdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JDBCUtil; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.manager.PreparedStatementBuilder; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +/** + * Handles metadata for MySQL. User must have access to `schemata`, `tables`, `columns`, `partitions` tables in + * information_schema. + */ +public class MySqlMetadataHandler + extends JdbcMetadataHandler +{ + static final Map<String, String> JDBC_PROPERTIES = ImmutableMap.of("databaseTerm", "SCHEMA"); + static final String GET_PARTITIONS_QUERY = "SELECT DISTINCT partition_name FROM INFORMATION_SCHEMA.PARTITIONS WHERE TABLE_NAME = ? AND TABLE_SCHEMA = ? " + + "AND partition_name IS NOT NULL"; + static final String BLOCK_PARTITION_COLUMN_NAME = "partition_name"; + static final String ALL_PARTITIONS = "*"; + static final String PARTITION_COLUMN_NAME = "partition_name"; + private static final Logger LOGGER = LoggerFactory.getLogger(MySqlMetadataHandler.class); + private static final int MAX_SPLITS_PER_REQUEST = 1_000_000; + + public MySqlMetadataHandler() + { + this(JDBCUtil.getSingleDatabaseConfigFromEnv(JdbcConnectionFactory.DatabaseEngine.MYSQL)); + } + + /** + * Used by Mux.
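+ * The multiplexer supplies the already-resolved {@link DatabaseConnectionConfig} for this catalog.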
+ */ + public MySqlMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig) + { + super(databaseConnectionConfig, new GenericJdbcConnectionFactory(databaseConnectionConfig, JDBC_PROPERTIES)); + } + + @VisibleForTesting + protected MySqlMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig, final AWSSecretsManager secretsManager, + AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory) + { + super(databaseConnectionConfig, secretsManager, athena, jdbcConnectionFactory); + } + + @Override + public Schema getPartitionSchema(final String catalogName) + { + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder() + .addField(BLOCK_PARTITION_COLUMN_NAME, Types.MinorType.VARCHAR.getType()); + return schemaBuilder.build(); + } + + @Override + public void getPartitions(final BlockWriter blockWriter, final GetTableLayoutRequest getTableLayoutRequest, QueryStatusChecker queryStatusChecker) + { + LOGGER.info("{}: Schema {}, table {}", getTableLayoutRequest.getQueryId(), getTableLayoutRequest.getTableName().getSchemaName(), + getTableLayoutRequest.getTableName().getTableName()); + try (Connection connection = getJdbcConnectionFactory().getConnection(getCredentialProvider())) { + final String escape = connection.getMetaData().getSearchStringEscape(); + + List<String> parameters = Arrays.asList(getTableLayoutRequest.getTableName().getTableName(), getTableLayoutRequest.getTableName().getSchemaName()); + try (PreparedStatement preparedStatement = new PreparedStatementBuilder().withConnection(connection).withQuery(GET_PARTITIONS_QUERY).withParameters(parameters).build(); + ResultSet resultSet = preparedStatement.executeQuery()) { + // Return a single partition if no partitions defined + if (!resultSet.next()) { + blockWriter.writeRows((Block block, int rowNum) -> { + block.setValue(BLOCK_PARTITION_COLUMN_NAME, rowNum, ALL_PARTITIONS); + LOGGER.info("Adding partition {}", ALL_PARTITIONS); + //we wrote 1 row so we return 1 + return 1; + }); + } + else { + do { + final String partitionName = resultSet.getString(PARTITION_COLUMN_NAME); + + // 1. Returns all partitions of table, we are not supporting constraints push down to filter partitions. + // 2. This API is not paginated, we could use order by and limit clause with offsets here. + blockWriter.writeRows((Block block, int rowNum) -> { + block.setValue(BLOCK_PARTITION_COLUMN_NAME, rowNum, partitionName); + LOGGER.info("Adding partition {}", partitionName); + //we wrote 1 row so we return 1 + return 1; + }); + } + while (resultSet.next() && queryStatusChecker.isQueryRunning()); + } + } + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage(), sqlException); + } + } + + @Override + public GetSplitsResponse doGetSplits( + final BlockAllocator blockAllocator, final GetSplitsRequest getSplitsRequest) + { + LOGGER.info("{}: Catalog {}, table {}", getSplitsRequest.getQueryId(), getSplitsRequest.getTableName().getSchemaName(), getSplitsRequest.getTableName().getTableName()); + int partitionContd = decodeContinuationToken(getSplitsRequest); + Set<Split> splits = new HashSet<>(); + Block partitions = getSplitsRequest.getPartitions(); + + // TODO consider splitting further depending on #rows or data size. Could use Hash key for splitting if no partitions.
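+ // The continuation token is the index of the next partition row to process; see encodeContinuationToken/decodeContinuationToken below.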
+ for (int curPartition = partitionContd; curPartition < partitions.getRowCount(); curPartition++) { + FieldReader locationReader = partitions.getFieldReader(BLOCK_PARTITION_COLUMN_NAME); + locationReader.setPosition(curPartition); + + SpillLocation spillLocation = makeSpillLocation(getSplitsRequest); + + LOGGER.info("{}: Input partition is {}", getSplitsRequest.getQueryId(), locationReader.readText()); + + Split.Builder splitBuilder = Split.newBuilder(spillLocation, makeEncryptionKey()) + .add(BLOCK_PARTITION_COLUMN_NAME, String.valueOf(locationReader.readText())); + + splits.add(splitBuilder.build()); + + if (splits.size() >= MAX_SPLITS_PER_REQUEST) { + //We exceeded the number of splits we want to return in a single request, so return and provide a continuation token pointing at the next unprocessed partition. + return new GetSplitsResponse(getSplitsRequest.getCatalogName(), splits, encodeContinuationToken(curPartition + 1)); + } + } + + return new GetSplitsResponse(getSplitsRequest.getCatalogName(), splits, null); + } + + private int decodeContinuationToken(GetSplitsRequest request) + { + if (request.hasContinuationToken()) { + return Integer.valueOf(request.getContinuationToken()); + } + + //No continuation token present + return 0; + } + + private String encodeContinuationToken(int partition) + { + return String.valueOf(partition); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlQueryStringBuilder.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlQueryStringBuilder.java new file mode 100644 index 0000000000..8d34e9b348 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlQueryStringBuilder.java @@ -0,0 +1,55 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.mysql; + +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcSplitQueryBuilder; +import com.google.common.base.Strings; + +public class MySqlQueryStringBuilder + extends JdbcSplitQueryBuilder +{ + MySqlQueryStringBuilder(final String quoteCharacters) + { + super(quoteCharacters); + } + + @Override + protected String getFromClauseWithSplit(String catalog, String schema, String table, Split split) + { + StringBuilder tableName = new StringBuilder(); + if (!Strings.isNullOrEmpty(catalog)) { + tableName.append(quote(catalog)).append('.'); + } + if (!Strings.isNullOrEmpty(schema)) { + tableName.append(quote(schema)).append('.'); + } + tableName.append(quote(table)); + + String partitionName = split.getProperty(MySqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME); + + if (MySqlMetadataHandler.ALL_PARTITIONS.equals(partitionName)) { + // No partitions + return String.format(" FROM %s ", tableName); + } + + return String.format(" FROM %s PARTITION(%s) ", tableName, partitionName); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlRecordHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlRecordHandler.java new file mode 100644 index 0000000000..d552543d9e --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlRecordHandler.java @@ -0,0 +1,89 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.mysql; + +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.GenericJdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JDBCUtil; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcRecordHandler; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcSplitQueryBuilder; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.google.common.annotations.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.Validate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.SQLException; + +/** + * Data handler, user must have necessary permissions to read from necessary tables. + */ +public class MySqlRecordHandler + extends JdbcRecordHandler +{ + private static final Logger LOGGER = LoggerFactory.getLogger(MySqlRecordHandler.class); + + private static final String MYSQL_QUOTE_CHARACTER = "`"; + + private final JdbcSplitQueryBuilder jdbcSplitQueryBuilder; + + public MySqlRecordHandler() + { + this(JDBCUtil.getSingleDatabaseConfigFromEnv(JdbcConnectionFactory.DatabaseEngine.MYSQL)); + } + + public MySqlRecordHandler(final DatabaseConnectionConfig databaseConnectionConfig) + { + this(databaseConnectionConfig, AmazonS3ClientBuilder.defaultClient(), AWSSecretsManagerClientBuilder.defaultClient(), AmazonAthenaClientBuilder.defaultClient(), + new GenericJdbcConnectionFactory(databaseConnectionConfig, MySqlMetadataHandler.JDBC_PROPERTIES), new MySqlQueryStringBuilder(MYSQL_QUOTE_CHARACTER)); + } + + @VisibleForTesting + MySqlRecordHandler(final DatabaseConnectionConfig databaseConnectionConfig, final AmazonS3 amazonS3, final AWSSecretsManager secretsManager, + final AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory, final JdbcSplitQueryBuilder jdbcSplitQueryBuilder) + { + super(amazonS3, secretsManager, athena, databaseConnectionConfig, jdbcConnectionFactory); + this.jdbcSplitQueryBuilder = Validate.notNull(jdbcSplitQueryBuilder, "query builder must not be null"); + } + + @Override + public PreparedStatement buildSplitSql(Connection jdbcConnection, String catalogName, TableName tableName, Schema schema, Constraints constraints, Split split) + throws SQLException + { + PreparedStatement preparedStatement = jdbcSplitQueryBuilder.buildSql(jdbcConnection, null, tableName.getSchemaName(), tableName.getTableName(), schema, constraints, split); + + // Disable fetching all rows. 
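+ // MySQL Connector/J only streams rows (instead of buffering the entire result set in memory) when the fetch size is exactly Integer.MIN_VALUE.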
+ preparedStatement.setFetchSize(Integer.MIN_VALUE); + + return preparedStatement; + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlCompositeHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlCompositeHandler.java new file mode 100644 index 0000000000..157b33544e --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlCompositeHandler.java @@ -0,0 +1,35 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.postgresql; + +import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler; + +/** + * Boilerplate composite handler that allows us to use a single Lambda function for both + * Metadata and Data. In this case we just compose {@link PostGreSqlMetadataHandler} and {@link PostGreSqlRecordHandler}. + */ +public class PostGreSqlCompositeHandler + extends CompositeHandler +{ + public PostGreSqlCompositeHandler() + { + super(new PostGreSqlMetadataHandler(), new PostGreSqlRecordHandler()); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlMetadataHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlMetadataHandler.java new file mode 100644 index 0000000000..e87ef525a1 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlMetadataHandler.java @@ -0,0 +1,190 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.postgresql; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.GenericJdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JDBCUtil; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.manager.PreparedStatementBuilder; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.annotations.VisibleForTesting; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; + +public class PostGreSqlMetadataHandler + extends JdbcMetadataHandler +{ + static final Map<String, String> JDBC_PROPERTIES = ImmutableMap.of("databaseTerm", "SCHEMA"); + static final String GET_PARTITIONS_QUERY = "SELECT nmsp_child.nspname AS child_schema, child.relname AS child FROM pg_inherits JOIN pg_class parent " + + "ON pg_inherits.inhparent = parent.oid JOIN pg_class child ON pg_inherits.inhrelid = child.oid JOIN pg_namespace nmsp_parent " + + "ON nmsp_parent.oid = parent.relnamespace JOIN pg_namespace nmsp_child ON nmsp_child.oid = child.relnamespace WHERE nmsp_parent.nspname = ? 
" + + "AND parent.relname = ?"; + static final String BLOCK_PARTITION_COLUMN_NAME = "partition_name"; + static final String BLOCK_PARTITION_SCHEMA_COLUMN_NAME = "partition_schema_name"; + static final String ALL_PARTITIONS = "*"; + private static final Logger LOGGER = LoggerFactory.getLogger(PostGreSqlMetadataHandler.class); + private static final String PARTITION_SCHEMA_NAME = "child_schema"; + private static final String PARTITION_NAME = "child"; + private static final int MAX_SPLITS_PER_REQUEST = 1000_000; + + public PostGreSqlMetadataHandler() + { + this(JDBCUtil.getSingleDatabaseConfigFromEnv(JdbcConnectionFactory.DatabaseEngine.POSTGRES)); + } + + public PostGreSqlMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig) + { + super(databaseConnectionConfig, new GenericJdbcConnectionFactory(databaseConnectionConfig, JDBC_PROPERTIES)); + } + + @VisibleForTesting + protected PostGreSqlMetadataHandler(final DatabaseConnectionConfig databaseConnectionConfig, final AWSSecretsManager secretsManager, + final AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory) + { + super(databaseConnectionConfig, secretsManager, athena, jdbcConnectionFactory); + } + + @Override + public Schema getPartitionSchema(final String catalogName) + { + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder() + .addField(BLOCK_PARTITION_SCHEMA_COLUMN_NAME, Types.MinorType.VARCHAR.getType()) + .addField(BLOCK_PARTITION_COLUMN_NAME, Types.MinorType.VARCHAR.getType()); + return schemaBuilder.build(); + } + + @Override + public void getPartitions(final BlockWriter blockWriter, final GetTableLayoutRequest getTableLayoutRequest, QueryStatusChecker queryStatusChecker) + { + LOGGER.info("{}: Catalog {}, table {}", getTableLayoutRequest.getQueryId(), getTableLayoutRequest.getTableName().getSchemaName(), + getTableLayoutRequest.getTableName().getTableName()); + try (Connection connection = getJdbcConnectionFactory().getConnection(getCredentialProvider())) { + List parameters = Arrays.asList(getTableLayoutRequest.getTableName().getSchemaName(), + getTableLayoutRequest.getTableName().getTableName()); + try (PreparedStatement preparedStatement = new PreparedStatementBuilder().withConnection(connection).withQuery(GET_PARTITIONS_QUERY).withParameters(parameters).build(); + ResultSet resultSet = preparedStatement.executeQuery()) { + // Return a single partition if no partitions defined + if (!resultSet.next()) { + blockWriter.writeRows((Block block, int rowNum) -> { + block.setValue(BLOCK_PARTITION_SCHEMA_COLUMN_NAME, rowNum, ALL_PARTITIONS); + block.setValue(BLOCK_PARTITION_COLUMN_NAME, rowNum, ALL_PARTITIONS); + //we wrote 1 row so we return 1 + return 1; + }); + } + else { + do { + final String partitionSchemaName = resultSet.getString(PARTITION_SCHEMA_NAME); + final String partitionName = resultSet.getString(PARTITION_NAME); + + // 1. Returns all partitions of table, we are not supporting constraints push down to filter partitions. + // 2. This API is not paginated, we could use order by and limit clause with offsets here. 
+ blockWriter.writeRows((Block block, int rowNum) -> { + block.setValue(BLOCK_PARTITION_SCHEMA_COLUMN_NAME, rowNum, partitionSchemaName); + block.setValue(BLOCK_PARTITION_COLUMN_NAME, rowNum, partitionName); + //we wrote 1 row so we return 1 + return 1; + }); + } + while (resultSet.next() && queryStatusChecker.isQueryRunning()); + } + } + } + catch (SQLException sqlException) { + throw new RuntimeException(sqlException.getErrorCode() + ": " + sqlException.getMessage(), sqlException); + } + } + + @Override + public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest getSplitsRequest) + { + LOGGER.info("{}: Catalog {}, table {}", getSplitsRequest.getQueryId(), getSplitsRequest.getTableName().getSchemaName(), getSplitsRequest.getTableName().getTableName()); + int partitionContd = decodeContinuationToken(getSplitsRequest); + Set<Split> splits = new HashSet<>(); + Block partitions = getSplitsRequest.getPartitions(); + + // TODO consider splitting further depending on #rows or data size. Could use Hash key for splitting if no partitions. i/ATHENA-3979 + for (int curPartition = partitionContd; curPartition < partitions.getRowCount(); curPartition++) { + FieldReader partitionsSchemaFieldReader = partitions.getFieldReader(BLOCK_PARTITION_SCHEMA_COLUMN_NAME); + partitionsSchemaFieldReader.setPosition(curPartition); + FieldReader partitionsFieldReader = partitions.getFieldReader(BLOCK_PARTITION_COLUMN_NAME); + partitionsFieldReader.setPosition(curPartition); + + //Every split must have a unique location if we wish to spill to avoid failures + SpillLocation spillLocation = makeSpillLocation(getSplitsRequest); + + LOGGER.info("{}: Input partition is {}", getSplitsRequest.getQueryId(), String.valueOf(partitionsFieldReader.readText())); + Split.Builder splitBuilder = Split.newBuilder(spillLocation, makeEncryptionKey()) + .add(BLOCK_PARTITION_SCHEMA_COLUMN_NAME, String.valueOf(partitionsSchemaFieldReader.readText())) + .add(BLOCK_PARTITION_COLUMN_NAME, String.valueOf(partitionsFieldReader.readText())); + + splits.add(splitBuilder.build()); + + if (splits.size() >= MAX_SPLITS_PER_REQUEST) { + //We exceeded the number of splits we want to return in a single request, so return and provide a continuation token pointing at the next unprocessed partition. + return new GetSplitsResponse(getSplitsRequest.getCatalogName(), splits, encodeContinuationToken(curPartition + 1)); + } + } + + return new GetSplitsResponse(getSplitsRequest.getCatalogName(), splits, null); + } + + private int decodeContinuationToken(GetSplitsRequest request) + { + if (request.hasContinuationToken()) { + return Integer.valueOf(request.getContinuationToken()); + } + + //No continuation token present + return 0; + } + + private String encodeContinuationToken(int partition) + { + return String.valueOf(partition); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlQueryStringBuilder.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlQueryStringBuilder.java new file mode 100644 index 0000000000..57d124038b --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlQueryStringBuilder.java @@ -0,0 +1,56 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.postgresql; + +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcSplitQueryBuilder; +import com.google.common.base.Strings; + +public class PostGreSqlQueryStringBuilder + extends JdbcSplitQueryBuilder +{ + PostGreSqlQueryStringBuilder(final String quoteCharacters) + { + super(quoteCharacters); + } + + @Override + protected String getFromClauseWithSplit(String catalog, String schema, String table, Split split) + { + StringBuilder tableName = new StringBuilder(); + if (!Strings.isNullOrEmpty(catalog)) { + tableName.append(quote(catalog)).append('.'); + } + if (!Strings.isNullOrEmpty(schema)) { + tableName.append(quote(schema)).append('.'); + } + tableName.append(quote(table)); + + String partitionSchemaName = split.getProperty(PostGreSqlMetadataHandler.BLOCK_PARTITION_SCHEMA_COLUMN_NAME); + String partitionName = split.getProperty(PostGreSqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME); + + if (PostGreSqlMetadataHandler.ALL_PARTITIONS.equals(partitionName)) { + // No partitions + return String.format(" FROM %s ", tableName); + } + + return String.format(" FROM %s.%s ", quote(partitionSchemaName), quote(partitionName)); + } +} diff --git a/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlRecordHandler.java b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlRecordHandler.java new file mode 100644 index 0000000000..fc05e30764 --- /dev/null +++ b/athena-jdbc/src/main/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlRecordHandler.java @@ -0,0 +1,88 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.postgresql; + +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.GenericJdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JDBCUtil; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcRecordHandler; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcSplitQueryBuilder; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.google.common.annotations.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Schema; +import org.apache.commons.lang3.Validate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.SQLException; + +public class PostGreSqlRecordHandler + extends JdbcRecordHandler +{ + private static final Logger LOGGER = LoggerFactory.getLogger(PostGreSqlRecordHandler.class); + + private static final int FETCH_SIZE = 1000; + + private final JdbcSplitQueryBuilder jdbcSplitQueryBuilder; + + private static final String POSTGRES_QUOTE_CHARACTER = "\""; + + public PostGreSqlRecordHandler() + { + this(JDBCUtil.getSingleDatabaseConfigFromEnv(JdbcConnectionFactory.DatabaseEngine.POSTGRES)); + } + + public PostGreSqlRecordHandler(final DatabaseConnectionConfig databaseConnectionConfig) + { + this(databaseConnectionConfig, AmazonS3ClientBuilder.defaultClient(), AWSSecretsManagerClientBuilder.defaultClient(), AmazonAthenaClientBuilder.defaultClient(), + new GenericJdbcConnectionFactory(databaseConnectionConfig, PostGreSqlMetadataHandler.JDBC_PROPERTIES), new PostGreSqlQueryStringBuilder(POSTGRES_QUOTE_CHARACTER)); + } + + @VisibleForTesting + PostGreSqlRecordHandler(final DatabaseConnectionConfig databaseConnectionConfig, final AmazonS3 amazonS3, final AWSSecretsManager secretsManager, + final AmazonAthena athena, final JdbcConnectionFactory jdbcConnectionFactory, final JdbcSplitQueryBuilder jdbcSplitQueryBuilder) + { + super(amazonS3, secretsManager, athena, databaseConnectionConfig, jdbcConnectionFactory); + this.jdbcSplitQueryBuilder = Validate.notNull(jdbcSplitQueryBuilder, "query builder must not be null"); + } + + @Override + public PreparedStatement buildSplitSql(Connection jdbcConnection, String catalogName, TableName tableName, Schema schema, Constraints constraints, Split split) + throws SQLException + { + PreparedStatement preparedStatement = jdbcSplitQueryBuilder.buildSql(jdbcConnection, null, tableName.getSchemaName(), tableName.getTableName(), schema, constraints, split); + + // Disable fetching all rows. 
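+ // A positive fetch size lets the PostgreSQL driver fetch rows in batches through a cursor instead of materializing the whole result set; this typically requires autocommit to be disabled on the connection.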
+ preparedStatement.setFetchSize(FETCH_SIZE); + + return preparedStatement; + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcMetadataHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcMetadataHandlerTest.java new file mode 100644 index 0000000000..e5a4ca2d48 --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcMetadataHandlerTest.java @@ -0,0 +1,141 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.mysql.MySqlMetadataHandler; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; + +import java.util.Collections; +import java.util.Map; + +public class MultiplexingJdbcMetadataHandlerTest +{ + private Map<String, JdbcMetadataHandler> metadataHandlerMap; + private MySqlMetadataHandler mySqlMetadataHandler; + private JdbcMetadataHandler jdbcMetadataHandler; + private BlockAllocator allocator; + private AWSSecretsManager secretsManager; + private AmazonAthena athena; + private QueryStatusChecker queryStatusChecker; + private JdbcConnectionFactory jdbcConnectionFactory; + + @Before + public void setup() + { + this.allocator = new BlockAllocatorImpl(); + this.mySqlMetadataHandler = Mockito.mock(MySqlMetadataHandler.class); + this.metadataHandlerMap = Collections.singletonMap("mysql", this.mySqlMetadataHandler); + this.secretsManager = Mockito.mock(AWSSecretsManager.class); + this.athena = Mockito.mock(AmazonAthena.class); + this.queryStatusChecker = Mockito.mock(QueryStatusChecker.class); + this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); +
DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL, + "mysql://jdbc:mysql://hostname/${testSecret}", "testSecret"); + this.jdbcMetadataHandler = new MultiplexingJdbcMetadataHandler(this.secretsManager, this.athena, this.jdbcConnectionFactory, this.metadataHandlerMap, databaseConnectionConfig); + } + + @Test + public void doListSchemaNames() + { + ListSchemasRequest listSchemasRequest = Mockito.mock(ListSchemasRequest.class); + Mockito.when(listSchemasRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcMetadataHandler.doListSchemaNames(this.allocator, listSchemasRequest); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).doListSchemaNames(Mockito.eq(this.allocator), Mockito.eq(listSchemasRequest)); + } + + @Test + public void doListTables() + { + ListTablesRequest listTablesRequest = Mockito.mock(ListTablesRequest.class); + Mockito.when(listTablesRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcMetadataHandler.doListTables(this.allocator, listTablesRequest); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).doListTables(Mockito.eq(this.allocator), Mockito.eq(listTablesRequest)); + } + + @Test + public void doGetTable() + { + GetTableRequest getTableRequest = Mockito.mock(GetTableRequest.class); + Mockito.when(getTableRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcMetadataHandler.doGetTable(this.allocator, getTableRequest); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).doGetTable(Mockito.eq(this.allocator), Mockito.eq(getTableRequest)); + } + + @Test + public void doGetTableLayout() + throws Exception + { + GetTableLayoutRequest getTableLayoutRequest = Mockito.mock(GetTableLayoutRequest.class); + Mockito.when(getTableLayoutRequest.getTableName()).thenReturn(new TableName("testSchema", "testTable")); + Mockito.when(getTableLayoutRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcMetadataHandler.doGetTableLayout(this.allocator, getTableLayoutRequest); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).doGetTableLayout(Mockito.eq(this.allocator), Mockito.eq(getTableLayoutRequest)); + } + + @Test + public void getPartitionSchema() + { + this.jdbcMetadataHandler.getPartitionSchema("mysql"); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).getPartitionSchema(Mockito.eq("mysql")); + } + + @Test(expected = RuntimeException.class) + public void getPartitionSchemaForUnsupportedCatalog() + { + this.jdbcMetadataHandler.getPartitionSchema("unsupportedCatalog"); + } + + + @Test + public void getPartitions() + throws Exception + { + GetTableLayoutRequest getTableLayoutRequest = Mockito.mock(GetTableLayoutRequest.class); + Mockito.when(getTableLayoutRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcMetadataHandler.getPartitions(Mockito.mock(BlockWriter.class), getTableLayoutRequest, queryStatusChecker); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).getPartitions(Mockito.any(BlockWriter.class), Mockito.eq(getTableLayoutRequest), Mockito.eq(queryStatusChecker)); + } + + @Test + public void doGetSplits() + { + GetSplitsRequest getSplitsRequest = Mockito.mock(GetSplitsRequest.class); + Mockito.when(getSplitsRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcMetadataHandler.doGetSplits(this.allocator, getSplitsRequest); + Mockito.verify(this.mySqlMetadataHandler, Mockito.times(1)).doGetSplits(Mockito.eq(this.allocator), Mockito.eq(getSplitsRequest)); + } +} diff --git 
a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcRecordHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcRecordHandlerTest.java
new file mode 100644
index 0000000000..d1a9e3ae02
--- /dev/null
+++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/MultiplexingJdbcRecordHandlerTest.java
@@ -0,0 +1,104 @@
+/*-
+ * #%L
+ * athena-jdbc
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.connectors.athena.jdbc;
+
+import com.amazonaws.athena.connector.lambda.QueryStatusChecker;
+import com.amazonaws.athena.connector.lambda.data.BlockSpiller;
+import com.amazonaws.athena.connector.lambda.domain.Split;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints;
+import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest;
+import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig;
+import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory;
+import com.amazonaws.connectors.athena.jdbc.manager.JdbcRecordHandler;
+import com.amazonaws.connectors.athena.jdbc.mysql.MySqlRecordHandler;
+import com.amazonaws.services.athena.AmazonAthena;
+import com.amazonaws.services.s3.AmazonS3;
+import com.amazonaws.services.secretsmanager.AWSSecretsManager;
+import org.apache.arrow.vector.types.pojo.Schema;
+import org.junit.Before;
+import org.junit.Test;
+import org.mockito.Mockito;
+
+import java.sql.Connection;
+import java.sql.SQLException;
+import java.util.Collections;
+import java.util.Map;
+
+public class MultiplexingJdbcRecordHandlerTest
+{
+    private Map<String, JdbcRecordHandler> recordHandlerMap;
+    private MySqlRecordHandler mySqlRecordHandler;
+    private JdbcRecordHandler jdbcRecordHandler;
+    private AmazonS3 amazonS3;
+    private AWSSecretsManager secretsManager;
+    private AmazonAthena athena;
+    private QueryStatusChecker queryStatusChecker;
+    private JdbcConnectionFactory jdbcConnectionFactory;
+
+    @Before
+    public void setup()
+    {
+        this.mySqlRecordHandler = Mockito.mock(MySqlRecordHandler.class);
+        this.recordHandlerMap = Collections.singletonMap("mysql", this.mySqlRecordHandler);
+        this.amazonS3 = Mockito.mock(AmazonS3.class);
+        this.secretsManager = Mockito.mock(AWSSecretsManager.class);
+        this.athena = Mockito.mock(AmazonAthena.class);
+        this.queryStatusChecker = Mockito.mock(QueryStatusChecker.class);
+        this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class);
+        DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL,
+                "mysql://jdbc:mysql://hostname/${testSecret}", "testSecret");
+        this.jdbcRecordHandler = new MultiplexingJdbcRecordHandler(this.amazonS3, this.secretsManager, this.athena, this.jdbcConnectionFactory, databaseConnectionConfig, this.recordHandlerMap);
+    }
+
+    @Test
+    public void readWithConstraint()
+    {
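+        // A request whose catalog name is "mysql" must be forwarded to the mocked MySQL record handler.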
+ BlockSpiller blockSpiller = Mockito.mock(BlockSpiller.class); + ReadRecordsRequest readRecordsRequest = Mockito.mock(ReadRecordsRequest.class); + Mockito.when(readRecordsRequest.getCatalogName()).thenReturn("mysql"); + this.jdbcRecordHandler.readWithConstraint(blockSpiller, readRecordsRequest, queryStatusChecker); + Mockito.verify(this.mySqlRecordHandler, Mockito.times(1)).readWithConstraint(Mockito.eq(blockSpiller), Mockito.eq(readRecordsRequest), Mockito.eq(queryStatusChecker)); + } + + @Test(expected = RuntimeException.class) + public void readWithConstraintWithUnsupportedCatalog() + { + BlockSpiller blockSpiller = Mockito.mock(BlockSpiller.class); + ReadRecordsRequest readRecordsRequest = Mockito.mock(ReadRecordsRequest.class); + Mockito.when(readRecordsRequest.getCatalogName()).thenReturn("unsupportedCatalog"); + this.jdbcRecordHandler.readWithConstraint(blockSpiller, readRecordsRequest, queryStatusChecker); + } + + @Test + public void buildSplitSql() + throws SQLException + { + ReadRecordsRequest readRecordsRequest = Mockito.mock(ReadRecordsRequest.class); + Mockito.when(readRecordsRequest.getCatalogName()).thenReturn("mysql"); + Connection jdbcConnection = Mockito.mock(Connection.class); + TableName tableName = new TableName("testSchema", "tableName"); + Schema schema = Mockito.mock(Schema.class); + Constraints constraints = Mockito.mock(Constraints.class); + Split split = Mockito.mock(Split.class); + this.jdbcRecordHandler.buildSplitSql(jdbcConnection, "mysql", tableName, schema, constraints, split); + Mockito.verify(this.mySqlRecordHandler, Mockito.times(1)).buildSplitSql(Mockito.eq(jdbcConnection), Mockito.eq("mysql"), Mockito.eq(tableName), Mockito.eq(schema), Mockito.eq(constraints), Mockito.eq(split)); + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/TestBase.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/TestBase.java new file mode 100644 index 0000000000..9168e0a8b6 --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/TestBase.java @@ -0,0 +1,91 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.connectors.athena.jdbc;
+
+import org.mockito.Mockito;
+import org.mockito.stubbing.Answer;
+
+import java.sql.ResultSet;
+import java.sql.SQLException;
+import java.util.Arrays;
+import java.util.concurrent.atomic.AtomicInteger;
+
+public class TestBase
+{
+    protected ResultSet mockResultSet(String[] columnNames, int[] columnTypes, Object[][] rows, AtomicInteger rowNumber)
+            throws SQLException
+    {
+        ResultSet resultSet = Mockito.mock(ResultSet.class, Mockito.RETURNS_DEEP_STUBS);
+
+        Mockito.when(resultSet.next()).thenAnswer(
+                (Answer<Boolean>) invocation -> {
+                    if (rows.length <= 0 || rows[0].length <= 0) {
+                        return false;
+                    }
+
+                    return rowNumber.getAndIncrement() + 1 < rows.length;
+                });
+
+        Mockito.when(resultSet.getInt(Mockito.any())).thenAnswer((Answer<Integer>) invocation -> {
+            Object argument = invocation.getArguments()[0];
+
+            if (argument instanceof Integer) {
+                // JDBC column indices are one-based; convert to a zero-based array index.
+                int colIndex = (Integer) argument - 1;
+                return (Integer) rows[rowNumber.get()][colIndex];
+            }
+            else if (argument instanceof String) {
+                int colIndex = Arrays.asList(columnNames).indexOf(argument);
+                return (Integer) rows[rowNumber.get()][colIndex];
+            }
+            else {
+                throw new RuntimeException("Unexpected argument type " + argument.getClass());
+            }
+        });
+
+        Mockito.when(resultSet.getString(Mockito.any())).thenAnswer((Answer<String>) invocation -> {
+            Object argument = invocation.getArguments()[0];
+            if (argument instanceof Integer) {
+                int colIndex = (Integer) argument - 1;
+                return String.valueOf(rows[rowNumber.get()][colIndex]);
+            }
+            else if (argument instanceof String) {
+                int colIndex = Arrays.asList(columnNames).indexOf(argument);
+                return String.valueOf(rows[rowNumber.get()][colIndex]);
+            }
+            else {
+                throw new RuntimeException("Unexpected argument type " + argument.getClass());
+            }
+        });
+
+        if (columnTypes != null) {
+            Mockito.when(resultSet.getMetaData().getColumnCount()).thenReturn(columnNames.length);
+            Mockito.when(resultSet.getMetaData().getColumnDisplaySize(Mockito.anyInt())).thenReturn(10);
+            Mockito.when(resultSet.getMetaData().getColumnType(Mockito.anyInt())).thenAnswer((Answer<Integer>) invocation -> columnTypes[(Integer) invocation.getArguments()[0] - 1]);
+        }
+
+        return resultSet;
+    }
+
+    protected ResultSet mockResultSet(String[] columnNames, Object[][] rows, AtomicInteger rowNumber)
+            throws SQLException
+    {
+        return this.mockResultSet(columnNames, null, rows, rowNumber);
+    }
+}
diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfigBuilderTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfigBuilderTest.java
new file mode 100644
index 0000000000..747cf061a6
--- /dev/null
+++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/connection/DatabaseConnectionConfigBuilderTest.java
@@ -0,0 +1,72 @@
+/*-
+ * #%L
+ * athena-jdbc
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.connectors.athena.jdbc.connection;
+
+import com.google.common.collect.ImmutableMap;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.List;
+
+public class DatabaseConnectionConfigBuilderTest
+{
+    private static final String CONNECTION_STRING1 = "mysql://jdbc:mysql://hostname/${testSecret}";
+    private static final String CONNECTION_STRING2 = "postgres://jdbc:postgresql://hostname/user=testUser&password=testPassword";
+
+    @Test
+    public void build()
+    {
+        DatabaseConnectionConfig expectedDatabase1 = new DatabaseConnectionConfig("testCatalog1", JdbcConnectionFactory.DatabaseEngine.MYSQL,
+                "jdbc:mysql://hostname/${testSecret}", "testSecret");
+        DatabaseConnectionConfig expectedDatabase2 = new DatabaseConnectionConfig("testCatalog2", JdbcConnectionFactory.DatabaseEngine.POSTGRES,
+                "jdbc:postgresql://hostname/user=testUser&password=testPassword");
+        DatabaseConnectionConfig defaultConnection = new DatabaseConnectionConfig("default", JdbcConnectionFactory.DatabaseEngine.POSTGRES,
+                "jdbc:postgresql://hostname/user=testUser&password=testPassword");
+
+        List<DatabaseConnectionConfig> databaseConnectionConfigs = new DatabaseConnectionConfigBuilder()
+                .properties(ImmutableMap.of(
+                        "default", CONNECTION_STRING2,
+                        "testCatalog1_connection_string", CONNECTION_STRING1,
+                        "testCatalog2_connection_string", CONNECTION_STRING2))
+                .build();
+
+        Assert.assertEquals(Arrays.asList(defaultConnection, expectedDatabase1, expectedDatabase2), databaseConnectionConfigs);
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void buildInvalidConnectionString()
+    {
+        new DatabaseConnectionConfigBuilder().properties(Collections.singletonMap("default", "malformedUrl")).build();
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void buildWithNoDefault()
+    {
+        new DatabaseConnectionConfigBuilder().properties(Collections.singletonMap("testDb_connection_string", CONNECTION_STRING1)).build();
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void buildMalformedConnectionString()
+    {
+        new DatabaseConnectionConfigBuilder().properties(Collections.singletonMap("testDb_connection_string", null)).build();
+    }
+}
diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredentialProviderTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredentialProviderTest.java
new file mode 100644
index 0000000000..ba5edb1745
--- /dev/null
+++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/connection/JdbcCredentialProviderTest.java
@@ -0,0 +1,53 @@
+/*-
+ * #%L
+ * athena-jdbc
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.connection; + +import org.junit.Assert; +import org.junit.Test; + +import static org.junit.Assert.*; + +public class JdbcCredentialProviderTest +{ + + @Test + public void getStaticCredential() + { + JdbcCredential expectedCredential = new JdbcCredential("testUser", "testPassword"); + JdbcCredentialProvider jdbcCredentialProvider = new StaticJdbcCredentialProvider(expectedCredential); + + Assert.assertEquals(expectedCredential, jdbcCredentialProvider.getCredential()); + } + + @Test + public void getRdsSecretsCredential() + { + JdbcCredential expectedCredential = new JdbcCredential("testUser", "testPassword"); + JdbcCredentialProvider jdbcCredentialProvider = new RdsSecretsCredentialProvider("{\"username\": \"testUser\", \"password\": \"testPassword\"}"); + + Assert.assertEquals(expectedCredential, jdbcCredentialProvider.getCredential()); + } + + @Test(expected = RuntimeException.class) + public void getRdsSecretsCredentialIOException() + { + new RdsSecretsCredentialProvider(""); + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JDBCUtilTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JDBCUtilTest.java new file mode 100644 index 0000000000..478aab1f3f --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JDBCUtilTest.java @@ -0,0 +1,101 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L%
+ */
+package com.amazonaws.connectors.athena.jdbc.manager;
+
+import com.amazonaws.connectors.athena.jdbc.mysql.MySqlMetadataHandler;
+import com.amazonaws.connectors.athena.jdbc.mysql.MySqlRecordHandler;
+import com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlMetadataHandler;
+import com.amazonaws.connectors.athena.jdbc.postgresql.PostGreSqlRecordHandler;
+import com.google.common.collect.ImmutableMap;
+import org.junit.Assert;
+import org.junit.Test;
+
+import java.util.Collections;
+import java.util.Map;
+
+public class JDBCUtilTest
+{
+    private static final int PORT = 1111;
+    private static final String CONNECTION_STRING1 = "mysql://jdbc:mysql://hostname/${testSecret}";
+    private static final String CONNECTION_STRING2 = "postgres://jdbc:postgresql://hostname/user=testUser&password=testPassword";
+
+    @Test
+    public void createJdbcMetadataHandlerMap()
+    {
+        Map<String, JdbcMetadataHandler> catalogs = JDBCUtil.createJdbcMetadataHandlerMap(ImmutableMap.<String, String>builder()
+                .put("testCatalog1_connection_string", CONNECTION_STRING1)
+                .put("testCatalog2_connection_string", CONNECTION_STRING2)
+                .put("default", CONNECTION_STRING2)
+                .put("AWS_LAMBDA_FUNCTION_NAME", "functionName")
+                .build());
+
+        Assert.assertEquals(4, catalogs.size());
+        Assert.assertEquals(catalogs.get("testCatalog1").getClass(), MySqlMetadataHandler.class);
+        Assert.assertEquals(catalogs.get("testCatalog2").getClass(), PostGreSqlMetadataHandler.class);
+        Assert.assertEquals(catalogs.get("lambda:functionName").getClass(), PostGreSqlMetadataHandler.class);
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void createJdbcMetadataHandlerEmptyConnectionStrings()
+    {
+        JDBCUtil.createJdbcMetadataHandlerMap(Collections.emptyMap());
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void createJdbcMetadataHandlerNoDefault()
+    {
+        JDBCUtil.createJdbcMetadataHandlerMap(ImmutableMap.<String, String>builder()
+                .put("testCatalog1_connection_string", CONNECTION_STRING1)
+                .put("testCatalog2_connection_string", CONNECTION_STRING2)
+                .build());
+    }
+
+    @Test
+    public void createJdbcRecordHandlerMap()
+    {
+        Map<String, JdbcRecordHandler> catalogs = JDBCUtil.createJdbcRecordHandlerMap(ImmutableMap.<String, String>builder()
+                .put("testCatalog1_connection_string", CONNECTION_STRING1)
+                .put("testCatalog2_connection_string", CONNECTION_STRING2)
+                .put("default", CONNECTION_STRING2)
+                .put("AWS_LAMBDA_FUNCTION_NAME", "functionName")
+                .build());
+
+        Assert.assertEquals(catalogs.get("testCatalog1").getClass(), MySqlRecordHandler.class);
+        Assert.assertEquals(catalogs.get("testCatalog2").getClass(), PostGreSqlRecordHandler.class);
+        Assert.assertEquals(catalogs.get("lambda:functionName").getClass(), PostGreSqlRecordHandler.class);
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void createJdbcRecordHandlerMapEmptyConnectionStrings()
+    {
+        JDBCUtil.createJdbcRecordHandlerMap(Collections.emptyMap());
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void createJdbcRecordHandlerMapNoDefault()
+    {
+        JDBCUtil.createJdbcRecordHandlerMap(ImmutableMap.<String, String>builder()
+                .put("testCatalog1_connection_string", CONNECTION_STRING1)
+                .put("testCatalog2_connection_string", CONNECTION_STRING2)
+                .build());
+    }
+}
diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcArrowTypeConverterTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcArrowTypeConverterTest.java
new file mode 100644
index 0000000000..0dab186f1a
--- /dev/null
+++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcArrowTypeConverterTest.java
@@
-0,0 +1,55 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.ArrowType; +import org.junit.Assert; +import org.junit.Test; + +public class JdbcArrowTypeConverterTest +{ + @Test + public void toArrowType() + { + Assert.assertEquals(Types.MinorType.BIT.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.BIT, 0, 0)); + Assert.assertEquals(Types.MinorType.BIT.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.BOOLEAN, 0, 0)); + Assert.assertEquals(Types.MinorType.TINYINT.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.TINYINT, 0, 0)); + Assert.assertEquals(Types.MinorType.SMALLINT.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.SMALLINT, 0, 0)); + Assert.assertEquals(Types.MinorType.INT.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.INTEGER, 0, 0)); + Assert.assertEquals(Types.MinorType.BIGINT.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.BIGINT, 0, 0)); + Assert.assertEquals(Types.MinorType.FLOAT4.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.REAL, 0, 0)); + Assert.assertEquals(Types.MinorType.FLOAT4.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.FLOAT, 0, 0)); + Assert.assertEquals(Types.MinorType.FLOAT8.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.DOUBLE, 0, 0)); + Assert.assertEquals(new ArrowType.Decimal(5, 3), JdbcArrowTypeConverter.toArrowType(java.sql.Types.DECIMAL, 5, 3)); + Assert.assertEquals(Types.MinorType.VARCHAR.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.CHAR, 0, 0)); + Assert.assertEquals(Types.MinorType.VARCHAR.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.NCHAR, 0, 0)); + Assert.assertEquals(Types.MinorType.VARCHAR.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.VARCHAR, 0, 0)); + Assert.assertEquals(Types.MinorType.VARCHAR.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.NVARCHAR, 0, 0)); + Assert.assertEquals(Types.MinorType.VARCHAR.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.LONGVARCHAR, 0, 0)); + Assert.assertEquals(Types.MinorType.VARCHAR.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.LONGNVARCHAR, 0, 0)); + Assert.assertEquals(Types.MinorType.VARBINARY.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.BINARY, 0, 0)); + Assert.assertEquals(Types.MinorType.VARBINARY.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.VARBINARY, 0, 0)); + Assert.assertEquals(Types.MinorType.VARBINARY.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.LONGVARBINARY, 0, 0)); + Assert.assertEquals(Types.MinorType.DATEMILLI.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.DATE, 0, 0)); + Assert.assertEquals(Types.MinorType.TIMEMILLI.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.TIME, 0, 
0)); + Assert.assertEquals(Types.MinorType.TIMESTAMPMILLI.getType(), JdbcArrowTypeConverter.toArrowType(java.sql.Types.TIMESTAMP, 0, 0)); + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcMetadataHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcMetadataHandlerTest.java new file mode 100644 index 0000000000..e9467862ca --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcMetadataHandlerTest.java @@ -0,0 +1,227 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.connectors.athena.jdbc.TestBase; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; + +import java.sql.Connection; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Types; +import java.util.concurrent.atomic.AtomicInteger; + +public class JdbcMetadataHandlerTest + extends TestBase +{ + private static final Schema PARTITION_SCHEMA = SchemaBuilder.newBuilder().addField("testPartitionCol", org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build(); + + private 
JdbcMetadataHandler jdbcMetadataHandler; + private JdbcConnectionFactory jdbcConnectionFactory; + private FederatedIdentity federatedIdentity; + private Connection connection; + private BlockAllocator blockAllocator; + private AWSSecretsManager secretsManager; + private AmazonAthena athena; + + @Before + public void setup() + { + this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); + this.connection = Mockito.mock(Connection.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(this.jdbcConnectionFactory.getConnection(Mockito.any(JdbcCredentialProvider.class))).thenReturn(this.connection); + this.secretsManager = Mockito.mock(AWSSecretsManager.class); + this.athena = Mockito.mock(AmazonAthena.class); + Mockito.when(this.secretsManager.getSecretValue(Mockito.eq(new GetSecretValueRequest().withSecretId("testSecret")))).thenReturn(new GetSecretValueResult().withSecretString("{\"username\": \"testUser\", \"password\": \"testPassword\"}")); + DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL, + "mysql://jdbc:mysql://hostname/${testSecret}", "testSecret"); + this.jdbcMetadataHandler = new JdbcMetadataHandler(databaseConnectionConfig, this.secretsManager, this.athena, jdbcConnectionFactory) + { + @Override + public Schema getPartitionSchema(final String catalogName) + { + return PARTITION_SCHEMA; + } + + @Override + public void getPartitions(final BlockWriter blockWriter, final GetTableLayoutRequest getTableLayoutRequest, QueryStatusChecker queryStatusChecker) + { + } + + @Override + public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest getSplitsRequest) + { + return null; + } + }; + this.federatedIdentity = Mockito.mock(FederatedIdentity.class); + this.blockAllocator = Mockito.mock(BlockAllocator.class); + } + + @Test + public void getJdbcConnectionFactory() + { + Assert.assertEquals(this.jdbcConnectionFactory, this.jdbcMetadataHandler.getJdbcConnectionFactory()); + } + + @Test + public void doListSchemaNames() + throws SQLException + { + String[] schema = {"TABLE_SCHEM"}; + Object[][] values = {{"testDB"}, {"testdb2"}, {"information_schema"}}; + String[] expected = {"testDB", "testdb2"}; + AtomicInteger rowNumber = new AtomicInteger(-1); + ResultSet resultSet = mockResultSet(schema, values, rowNumber); + Mockito.when(connection.getMetaData().getSchemas()).thenReturn(resultSet); + ListSchemasResponse listSchemasResponse = this.jdbcMetadataHandler.doListSchemaNames(this.blockAllocator, new ListSchemasRequest(this.federatedIdentity, "testQueryId", "testCatalog")); + Assert.assertArrayEquals(expected, listSchemasResponse.getSchemas().toArray()); + } + + @Test + public void doListTables() + throws SQLException + { + String[] schema = {"TABLE_SCHEM", "TABLE_NAME"}; + Object[][] values = {{"testSchema", "testTable"}, {"testSchema", "testtable2"}}; + TableName[] expected = {new TableName("testSchema", "testTable"), new TableName("testSchema", "testtable2")}; + AtomicInteger rowNumber = new AtomicInteger(-1); + ResultSet resultSet = mockResultSet(schema, values, rowNumber); + + Mockito.when(connection.getMetaData().getTables("testCatalog", "testSchema", null, new String[] {"TABLE", "VIEW"})).thenReturn(resultSet); + Mockito.when(connection.getCatalog()).thenReturn("testCatalog"); + ListTablesResponse listTablesResponse = this.jdbcMetadataHandler.doListTables( + this.blockAllocator, new ListTablesRequest(this.federatedIdentity, "testQueryId", "testCatalog", 
"testSchema")); + Assert.assertArrayEquals(expected, listTablesResponse.getTables().toArray()); + } + + @Test + public void doListTablesEscaped() + throws SQLException + { + String[] schema = {"TABLE_SCHEM", "TABLE_NAME"}; + Object[][] values = {{"test_Schema", "testTable"}, {"test_Schema", "testtable2"}}; + TableName[] expected = {new TableName("test_Schema", "testTable"), new TableName("test_Schema", "testtable2")}; + AtomicInteger rowNumber = new AtomicInteger(-1); + ResultSet resultSet = mockResultSet(schema, values, rowNumber); + Mockito.when(connection.getMetaData().getTables("testCatalog", "test\\_Schema", null, new String[] {"TABLE", "VIEW"})).thenReturn(resultSet); + Mockito.when(connection.getCatalog()).thenReturn("testCatalog"); + Mockito.when(connection.getMetaData().getSearchStringEscape()).thenReturn("\\"); + ListTablesResponse listTablesResponse = this.jdbcMetadataHandler.doListTables( + this.blockAllocator, new ListTablesRequest(this.federatedIdentity, "testQueryId", "testCatalog", "test_Schema")); + Assert.assertArrayEquals(expected, listTablesResponse.getTables().toArray()); + } + + @Test(expected = IllegalArgumentException.class) + public void doListTablesEscapedException() + throws SQLException + { + Mockito.when(connection.getMetaData().getSearchStringEscape()).thenReturn("_"); + this.jdbcMetadataHandler.doListTables(this.blockAllocator, new ListTablesRequest(this.federatedIdentity, "testQueryId", "testCatalog", "test_Schema")); + } + + @Test + public void doGetTable() + throws SQLException + { + String[] schema = {"DATA_TYPE", "COLUMN_SIZE", "COLUMN_NAME", "DECIMAL_DIGITS", "NUM_PREC_RADIX"}; + Object[][] values = {{Types.INTEGER, 12, "testCol1", 0, 0}, {Types.VARCHAR, 25, "testCol2", 0, 0}}; + AtomicInteger rowNumber = new AtomicInteger(-1); + ResultSet resultSet = mockResultSet(schema, values, rowNumber); + + SchemaBuilder expectedSchemaBuilder = SchemaBuilder.newBuilder(); + expectedSchemaBuilder.addField(FieldBuilder.newBuilder("testCol1", org.apache.arrow.vector.types.Types.MinorType.INT.getType()).build()); + expectedSchemaBuilder.addField(FieldBuilder.newBuilder("testCol2", org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build()); + PARTITION_SCHEMA.getFields().forEach(expectedSchemaBuilder::addField); + Schema expected = expectedSchemaBuilder.build(); + + TableName inputTableName = new TableName("testSchema", "testTable"); + Mockito.when(connection.getMetaData().getColumns("testCatalog", inputTableName.getSchemaName(), inputTableName.getTableName(), null)).thenReturn(resultSet); + Mockito.when(connection.getCatalog()).thenReturn("testCatalog"); + + GetTableResponse getTableResponse = this.jdbcMetadataHandler.doGetTable( + this.blockAllocator, new GetTableRequest(this.federatedIdentity, "testQueryId", "testCatalog", inputTableName)); + + Assert.assertEquals(expected, getTableResponse.getSchema()); + Assert.assertEquals(inputTableName, getTableResponse.getTableName()); + Assert.assertEquals("testCatalog", getTableResponse.getCatalogName()); + } + + @Test(expected = RuntimeException.class) + public void doGetTableNoColumns() + { + TableName inputTableName = new TableName("testSchema", "testTable"); + + this.jdbcMetadataHandler.doGetTable(this.blockAllocator, new GetTableRequest(this.federatedIdentity, "testQueryId", "testCatalog", inputTableName)); + } + + @Test(expected = RuntimeException.class) + public void doGetTableSQLException() + throws SQLException + { + TableName inputTableName = new TableName("testSchema", "testTable"); + 
Mockito.when(this.connection.getMetaData().getColumns(Mockito.anyString(), Mockito.anyString(), Mockito.anyString(), Mockito.anyString())) + .thenThrow(new SQLException()); + this.jdbcMetadataHandler.doGetTable(this.blockAllocator, new GetTableRequest(this.federatedIdentity, "testQueryId", "testCatalog", inputTableName)); + } + + @Test(expected = RuntimeException.class) + public void doListSchemaNamesSQLException() + throws SQLException + { + Mockito.when(this.connection.getMetaData().getSchemas()).thenThrow(new SQLException()); + this.jdbcMetadataHandler.doListSchemaNames(this.blockAllocator, new ListSchemasRequest(this.federatedIdentity, "testQueryId", "testCatalog")); + } + + @Test(expected = RuntimeException.class) + public void doListTablesSQLException() + throws SQLException + { + Mockito.when(this.connection.getMetaData().getTables(Mockito.anyString(), Mockito.anyString(), Mockito.anyString(), Mockito.any())).thenThrow(new SQLException()); + this.jdbcMetadataHandler.doListTables(this.blockAllocator, new ListTablesRequest(this.federatedIdentity, "testQueryId", "testCatalog", "testSchema")); + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcRecordHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcRecordHandlerTest.java new file mode 100644 index 0000000000..aa84ab044d --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/manager/JdbcRecordHandlerTest.java @@ -0,0 +1,151 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.manager; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpiller; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.data.SpillConfig; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.ConstraintEvaluator; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.connectors.athena.jdbc.TestBase; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; +import org.mockito.stubbing.Answer; + +import java.io.ByteArrayInputStream; +import java.nio.charset.StandardCharsets; +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Types; +import java.util.concurrent.atomic.AtomicInteger; + +public class JdbcRecordHandlerTest + extends TestBase +{ + + private JdbcRecordHandler jdbcRecordHandler; + private Connection connection; + private JdbcConnectionFactory jdbcConnectionFactory; + private AmazonS3 amazonS3; + private AWSSecretsManager secretsManager; + private AmazonAthena athena; + private QueryStatusChecker queryStatusChecker; + private FederatedIdentity federatedIdentity; + private PreparedStatement preparedStatement; + + @Before + public void setup() + throws SQLException + { + this.connection = Mockito.mock(Connection.class, Mockito.RETURNS_DEEP_STUBS); + this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); + Mockito.when(this.jdbcConnectionFactory.getConnection(Mockito.any(JdbcCredentialProvider.class))).thenReturn(this.connection); + this.amazonS3 = Mockito.mock(AmazonS3.class); + this.secretsManager = Mockito.mock(AWSSecretsManager.class); + this.athena = Mockito.mock(AmazonAthena.class); + this.queryStatusChecker = Mockito.mock(QueryStatusChecker.class); + Mockito.when(this.secretsManager.getSecretValue(Mockito.eq(new GetSecretValueRequest().withSecretId("testSecret")))).thenReturn(new GetSecretValueResult().withSecretString("{\"username\": \"testUser\", \"password\": \"testPassword\"}")); + this.preparedStatement = 
Mockito.mock(PreparedStatement.class);
+        Mockito.when(this.connection.prepareStatement("someSql")).thenReturn(this.preparedStatement);
+        DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL,
+                "mysql://jdbc:mysql://hostname/${testSecret}", "testSecret");
+        this.jdbcRecordHandler = new JdbcRecordHandler(this.amazonS3, this.secretsManager, this.athena, databaseConnectionConfig, this.jdbcConnectionFactory)
+        {
+            @Override
+            public PreparedStatement buildSplitSql(Connection jdbcConnection, String catalogName, TableName tableName, Schema schema, Constraints constraints, Split split)
+                    throws SQLException
+            {
+                return jdbcConnection.prepareStatement("someSql");
+            }
+        };
+        this.federatedIdentity = Mockito.mock(FederatedIdentity.class);
+    }
+
+    @Test
+    public void readWithConstraint()
+            throws SQLException
+    {
+        ConstraintEvaluator constraintEvaluator = Mockito.mock(ConstraintEvaluator.class);
+        Mockito.when(constraintEvaluator.apply(Mockito.anyString(), Mockito.any())).thenReturn(true);
+
+        TableName inputTableName = new TableName("testSchema", "testTable");
+        SchemaBuilder expectedSchemaBuilder = SchemaBuilder.newBuilder();
+        expectedSchemaBuilder.addField(FieldBuilder.newBuilder("testCol1", org.apache.arrow.vector.types.Types.MinorType.INT.getType()).build());
+        expectedSchemaBuilder.addField(FieldBuilder.newBuilder("testCol2", org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build());
+        expectedSchemaBuilder.addField(FieldBuilder.newBuilder("testPartitionCol", org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build());
+        Schema fieldSchema = expectedSchemaBuilder.build();
+
+        BlockAllocator allocator = new BlockAllocatorImpl();
+        S3SpillLocation s3SpillLocation = S3SpillLocation.newBuilder().withIsDirectory(true).build();
+
+        Split.Builder splitBuilder = Split.newBuilder(s3SpillLocation, null)
+                .add("testPartitionCol", String.valueOf("testPartitionValue"));
+
+        Constraints constraints = Mockito.mock(Constraints.class, Mockito.RETURNS_DEEP_STUBS);
+
+        String[] schema = {"testCol1", "testCol2"};
+        int[] columnTypes = {Types.INTEGER, Types.VARCHAR};
+        Object[][] values = {{1, "testVal1"}, {2, "testVal2"}};
+        AtomicInteger rowNumber = new AtomicInteger(-1);
+        ResultSet resultSet = mockResultSet(schema, columnTypes, values, rowNumber);
+        Mockito.when(this.preparedStatement.executeQuery()).thenReturn(resultSet);
+
+        SpillConfig spillConfig = Mockito.mock(SpillConfig.class);
+        Mockito.when(spillConfig.getSpillLocation()).thenReturn(s3SpillLocation);
+        BlockSpiller s3Spiller = new S3BlockSpiller(this.amazonS3, spillConfig, allocator, fieldSchema, constraintEvaluator);
+        ReadRecordsRequest readRecordsRequest = new ReadRecordsRequest(this.federatedIdentity, "testCatalog", "testQueryId", inputTableName, fieldSchema, splitBuilder.build(), constraints, 1024, 1024);
+
+        Mockito.when(amazonS3.putObject(Mockito.anyString(), Mockito.anyString(), Mockito.any(), Mockito.any())).thenAnswer((Answer<PutObjectResult>) invocation -> {
+            // Inspect the spilled bytes to confirm the read produced the mocked rows.
+            ByteArrayInputStream byteArrayInputStream = (ByteArrayInputStream) invocation.getArguments()[2];
+            int n = byteArrayInputStream.available();
+            byte[] bytes = new byte[n];
+            byteArrayInputStream.read(bytes, 0, n);
+            String data = new String(bytes, StandardCharsets.UTF_8);
+            Assert.assertTrue(data.contains("testVal1") || data.contains("testVal2") || data.contains("testPartitionValue"));
+            return new PutObjectResult();
+        });
+
+        this.jdbcRecordHandler.readWithConstraint(s3Spiller,
readRecordsRequest, queryStatusChecker); + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlMetadataHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlMetadataHandlerTest.java new file mode 100644 index 0000000000..378e47d958 --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlMetadataHandlerTest.java @@ -0,0 +1,273 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.connectors.athena.jdbc.mysql; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.connectors.athena.jdbc.TestBase; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Types; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.stream.Collectors; + +public class MySqlMetadataHandlerTest + extends TestBase +{ + private DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL, + "mysql://jdbc:mysql://hostname/user=A&password=B"); + private 
MySqlMetadataHandler mySqlMetadataHandler;
+    private JdbcConnectionFactory jdbcConnectionFactory;
+    private Connection connection;
+    private FederatedIdentity federatedIdentity;
+    private AWSSecretsManager secretsManager;
+    private AmazonAthena athena;
+
+    @Before
+    public void setup()
+    {
+        this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class);
+        this.connection = Mockito.mock(Connection.class, Mockito.RETURNS_DEEP_STUBS);
+        Mockito.when(this.jdbcConnectionFactory.getConnection(Mockito.any(JdbcCredentialProvider.class))).thenReturn(this.connection);
+        this.secretsManager = Mockito.mock(AWSSecretsManager.class);
+        this.athena = Mockito.mock(AmazonAthena.class);
+        Mockito.when(this.secretsManager.getSecretValue(Mockito.eq(new GetSecretValueRequest().withSecretId("testSecret")))).thenReturn(new GetSecretValueResult().withSecretString("{\"username\": \"testUser\", \"password\": \"testPassword\"}"));
+        this.mySqlMetadataHandler = new MySqlMetadataHandler(databaseConnectionConfig, this.secretsManager, this.athena, this.jdbcConnectionFactory);
+        this.federatedIdentity = Mockito.mock(FederatedIdentity.class);
+    }
+
+    @Test
+    public void getPartitionSchema()
+    {
+        Assert.assertEquals(SchemaBuilder.newBuilder()
+                        .addField(MySqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build(),
+                this.mySqlMetadataHandler.getPartitionSchema("testCatalogName"));
+    }
+
+    @Test
+    public void doGetTableLayout()
+            throws Exception
+    {
+        BlockAllocator blockAllocator = new BlockAllocatorImpl();
+        Constraints constraints = Mockito.mock(Constraints.class);
+        TableName tableName = new TableName("testSchema", "testTable");
+        Schema partitionSchema = this.mySqlMetadataHandler.getPartitionSchema("testCatalogName");
+        Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet());
+        GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols);
+
+        PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class);
+        Mockito.when(this.connection.prepareStatement(MySqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement);
+
+        String[] columns = {"partition_name"};
+        int[] types = {Types.VARCHAR};
+        Object[][] values = {{"p0"}, {"p1"}};
+        ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1));
+        Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet);
+
+        Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null);
+
+        GetTableLayoutResponse getTableLayoutResponse = this.mySqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest);
+
+        Assert.assertEquals(values.length, getTableLayoutResponse.getPartitions().getRowCount());
+
+        List<String> expectedValues = new ArrayList<>();
+        for (int i = 0; i < getTableLayoutResponse.getPartitions().getRowCount(); i++) {
+            expectedValues.add(BlockUtils.rowToString(getTableLayoutResponse.getPartitions(), i));
+        }
+        Assert.assertEquals(expectedValues, Arrays.asList("[partition_name : p0]", "[partition_name : p1]"));
+
+        SchemaBuilder expectedSchemaBuilder = SchemaBuilder.newBuilder();
+        expectedSchemaBuilder.addField(FieldBuilder.newBuilder(MySqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build());
+        Schema expectedSchema = expectedSchemaBuilder.build();
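+        // The partitions block should expose exactly the partition_name column declared by the partition schema.
+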
Assert.assertEquals(expectedSchema, getTableLayoutResponse.getPartitions().getSchema());
+        Assert.assertEquals(tableName, getTableLayoutResponse.getTableName());
+
+        Mockito.verify(preparedStatement, Mockito.times(1)).setString(1, tableName.getTableName());
+        Mockito.verify(preparedStatement, Mockito.times(1)).setString(2, tableName.getSchemaName());
+    }
+
+    @Test
+    public void doGetTableLayoutWithNoPartitions()
+            throws Exception
+    {
+        BlockAllocator blockAllocator = new BlockAllocatorImpl();
+        Constraints constraints = Mockito.mock(Constraints.class);
+        TableName tableName = new TableName("testSchema", "testTable");
+        Schema partitionSchema = this.mySqlMetadataHandler.getPartitionSchema("testCatalogName");
+        Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet());
+        GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols);
+
+        PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class);
+        Mockito.when(this.connection.prepareStatement(MySqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement);
+
+        String[] columns = {"partition_name"};
+        int[] types = {Types.VARCHAR};
+        Object[][] values = {{}};
+        ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1));
+        Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet);
+
+        Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null);
+
+        GetTableLayoutResponse getTableLayoutResponse = this.mySqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest);
+
+        Assert.assertEquals(values.length, getTableLayoutResponse.getPartitions().getRowCount());
+
+        List<String> expectedValues = new ArrayList<>();
+        for (int i = 0; i < getTableLayoutResponse.getPartitions().getRowCount(); i++) {
+            expectedValues.add(BlockUtils.rowToString(getTableLayoutResponse.getPartitions(), i));
+        }
+        Assert.assertEquals(expectedValues, Collections.singletonList("[partition_name : *]"));
+
+        SchemaBuilder expectedSchemaBuilder = SchemaBuilder.newBuilder();
+        expectedSchemaBuilder.addField(FieldBuilder.newBuilder(MySqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build());
+        Schema expectedSchema = expectedSchemaBuilder.build();
+        Assert.assertEquals(expectedSchema, getTableLayoutResponse.getPartitions().getSchema());
+        Assert.assertEquals(tableName, getTableLayoutResponse.getTableName());
+
+        Mockito.verify(preparedStatement, Mockito.times(1)).setString(1, tableName.getTableName());
+        Mockito.verify(preparedStatement, Mockito.times(1)).setString(2, tableName.getSchemaName());
+    }
+
+    @Test(expected = RuntimeException.class)
+    public void doGetTableLayoutWithSQLException()
+            throws Exception
+    {
+        Constraints constraints = Mockito.mock(Constraints.class);
+        TableName tableName = new TableName("testSchema", "testTable");
+        Schema partitionSchema = this.mySqlMetadataHandler.getPartitionSchema("testCatalogName");
+        Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet());
+        GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols);
+
+        Connection connection = Mockito.mock(Connection.class, Mockito.RETURNS_DEEP_STUBS);
+        JdbcConnectionFactory jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class);
+        Mockito.when(jdbcConnectionFactory.getConnection(Mockito.any(JdbcCredentialProvider.class))).thenReturn(connection);
+        Mockito.when(connection.getMetaData().getSearchStringEscape()).thenThrow(new SQLException());
+        MySqlMetadataHandler mySqlMetadataHandler = new MySqlMetadataHandler(databaseConnectionConfig, this.secretsManager, this.athena, jdbcConnectionFactory);
+
+        mySqlMetadataHandler.doGetTableLayout(Mockito.mock(BlockAllocator.class), getTableLayoutRequest);
+    }
+
+    @Test
+    public void doGetSplits()
+            throws Exception
+    {
+        BlockAllocator blockAllocator = new BlockAllocatorImpl();
+        Constraints constraints = Mockito.mock(Constraints.class);
+        TableName tableName = new TableName("testSchema", "testTable");
+
+        PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class);
+        Mockito.when(this.connection.prepareStatement(MySqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement);
+
+        String[] columns = {MySqlMetadataHandler.PARTITION_COLUMN_NAME};
+        int[] types = {Types.VARCHAR};
+        Object[][] values = {{"p0"}, {"p1"}};
+        ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1));
+        Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet);
+
+        Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null);
+
+        Schema partitionSchema = this.mySqlMetadataHandler.getPartitionSchema("testCatalogName");
+        Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet());
+        GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols);
+
+        GetTableLayoutResponse getTableLayoutResponse = this.mySqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest);
+
+        BlockAllocator splitBlockAllocator = new BlockAllocatorImpl();
+        GetSplitsRequest getSplitsRequest = new GetSplitsRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, getTableLayoutResponse.getPartitions(), new ArrayList<>(partitionCols), constraints, null);
+        GetSplitsResponse getSplitsResponse = this.mySqlMetadataHandler.doGetSplits(splitBlockAllocator, getSplitsRequest);
+
+        Set<Map<String, String>> expectedSplits = new HashSet<>();
+        expectedSplits.add(Collections.singletonMap(MySqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, "p0"));
+        expectedSplits.add(Collections.singletonMap(MySqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, "p1"));
+        Assert.assertEquals(expectedSplits.size(), getSplitsResponse.getSplits().size());
+        Set<Map<String, String>> actualSplits = getSplitsResponse.getSplits().stream().map(Split::getProperties).collect(Collectors.toSet());
+        Assert.assertEquals(expectedSplits, actualSplits);
+    }
+
+    @Test
+    public void doGetSplitsContinuation()
+            throws Exception
+    {
+        BlockAllocator blockAllocator = new BlockAllocatorImpl();
+        Constraints constraints = Mockito.mock(Constraints.class);
+        TableName tableName = new TableName("testSchema", "testTable");
+        Schema partitionSchema = this.mySqlMetadataHandler.getPartitionSchema("testCatalogName");
+        Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet());
+        GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols);
+
+        PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class);
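+        // Stub the partition lookup again; the continuation token "1" should skip the first partition and yield only p1.
+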
Mockito.when(this.connection.prepareStatement(MySqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement);
+
+        String[] columns = {"partition_name"};
+        int[] types = {Types.VARCHAR};
+        Object[][] values = {{"p0"}, {"p1"}};
+        ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1));
+        Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet);
+
+        Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null);
+
+        GetTableLayoutResponse getTableLayoutResponse = this.mySqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest);
+
+        BlockAllocator splitBlockAllocator = new BlockAllocatorImpl();
+        GetSplitsRequest getSplitsRequest = new GetSplitsRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, getTableLayoutResponse.getPartitions(), new ArrayList<>(partitionCols), constraints, "1");
+        GetSplitsResponse getSplitsResponse = this.mySqlMetadataHandler.doGetSplits(splitBlockAllocator, getSplitsRequest);
+
+        Set<Map<String, String>> expectedSplits = new HashSet<>();
+        expectedSplits.add(Collections.singletonMap("partition_name", "p1"));
+        Assert.assertEquals(expectedSplits.size(), getSplitsResponse.getSplits().size());
+        Set<Map<String, String>> actualSplits = getSplitsResponse.getSplits().stream().map(Split::getProperties).collect(Collectors.toSet());
+        Assert.assertEquals(expectedSplits, actualSplits);
+    }
+}
diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlRecordHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlRecordHandlerTest.java
new file mode 100644
index 0000000000..882bdf772d
--- /dev/null
+++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/mysql/MySqlRecordHandlerTest.java
@@ -0,0 +1,170 @@
+/*-
+ * #%L
+ * athena-jdbc
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.mysql; + +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Marker; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcSplitQueryBuilder; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.util.Collections; + +public class MySqlRecordHandlerTest +{ + private MySqlRecordHandler mySqlRecordHandler; + private Connection connection; + private JdbcConnectionFactory jdbcConnectionFactory; + private JdbcSplitQueryBuilder jdbcSplitQueryBuilder; + private AmazonS3 amazonS3; + private AWSSecretsManager secretsManager; + private AmazonAthena athena; + + @Before + public void setup() + { + this.amazonS3 = Mockito.mock(AmazonS3.class); + this.secretsManager = Mockito.mock(AWSSecretsManager.class); + this.athena = Mockito.mock(AmazonAthena.class); + this.connection = Mockito.mock(Connection.class); + this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); + Mockito.when(this.jdbcConnectionFactory.getConnection(Mockito.mock(JdbcCredentialProvider.class))).thenReturn(this.connection); + jdbcSplitQueryBuilder = new MySqlQueryStringBuilder("`"); + final DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL, + "mysql://jdbc:mysql://hostname/user=A&password=B"); + + this.mySqlRecordHandler = new MySqlRecordHandler(databaseConnectionConfig, amazonS3, secretsManager, athena, jdbcConnectionFactory, jdbcSplitQueryBuilder); + } + + @Test + public void buildSplitSql() + throws SQLException + { + TableName tableName = new TableName("testSchema", "testTable"); + + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder(); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol1", Types.MinorType.INT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol2", Types.MinorType.VARCHAR.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol3", Types.MinorType.BIGINT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol4", Types.MinorType.FLOAT4.getType()).build()); + 
schemaBuilder.addField(FieldBuilder.newBuilder("testCol5", Types.MinorType.SMALLINT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol6", Types.MinorType.TINYINT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol7", Types.MinorType.FLOAT8.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol8", Types.MinorType.BIT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("partition_name", Types.MinorType.VARCHAR.getType()).build()); + Schema schema = schemaBuilder.build(); + + Split split = Mockito.mock(Split.class); + Mockito.when(split.getProperties()).thenReturn(Collections.singletonMap("partition_name", "p0")); + Mockito.when(split.getProperty(Mockito.eq("partition_name"))).thenReturn("p0"); + + Range range1a = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range1a.isSingleValue()).thenReturn(true); + Mockito.when(range1a.getLow().getValue()).thenReturn(1); + Range range1b = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range1b.isSingleValue()).thenReturn(true); + Mockito.when(range1b.getLow().getValue()).thenReturn(2); + ValueSet valueSet1 = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(valueSet1.getRanges().getOrderedRanges()).thenReturn(ImmutableList.of(range1a, range1b)); + + ValueSet valueSet2 = getRangeSet(Marker.Bound.EXACTLY, "1", Marker.Bound.BELOW, "10"); + ValueSet valueSet3 = getRangeSet(Marker.Bound.ABOVE, 2L, Marker.Bound.EXACTLY, 20L); + ValueSet valueSet4 = getSingleValueSet(1.1F); + ValueSet valueSet5 = getSingleValueSet(1); + ValueSet valueSet6 = getSingleValueSet(0); + ValueSet valueSet7 = getSingleValueSet(1.2d); + ValueSet valueSet8 = getSingleValueSet(true); + + Constraints constraints = Mockito.mock(Constraints.class); + Mockito.when(constraints.getSummary()).thenReturn(new ImmutableMap.Builder() + .put("testCol1", valueSet1) + .put("testCol2", valueSet2) + .put("testCol3", valueSet3) + .put("testCol4", valueSet4) + .put("testCol5", valueSet5) + .put("testCol6", valueSet6) + .put("testCol7", valueSet7) + .put("testCol8", valueSet8) + .build()); + + String expectedSql = "SELECT `testCol1`, `testCol2`, `testCol3`, `testCol4`, `testCol5`, `testCol6`, `testCol7`, `testCol8` FROM `testSchema`.`testTable` PARTITION(p0) WHERE (`testCol1` IN (?,?)) AND ((`testCol2` >= ? AND `testCol2` < ?)) AND ((`testCol3` > ? AND `testCol3` <= ?)) AND (`testCol4` = ?) AND (`testCol5` = ?) AND (`testCol6` = ?) AND (`testCol7` = ?) 
AND (`testCol8` = ?)"; + PreparedStatement expectedPreparedStatement = Mockito.mock(PreparedStatement.class); + Mockito.when(this.connection.prepareStatement(Mockito.eq(expectedSql))).thenReturn(expectedPreparedStatement); + + PreparedStatement preparedStatement = this.mySqlRecordHandler.buildSplitSql(this.connection, "testCatalogName", tableName, schema, constraints, split); + + Assert.assertEquals(expectedPreparedStatement, preparedStatement); + Mockito.verify(preparedStatement, Mockito.times(1)).setInt(1, 1); + Mockito.verify(preparedStatement, Mockito.times(1)).setInt(2, 2); + Mockito.verify(preparedStatement, Mockito.times(1)).setString(3, "1"); + Mockito.verify(preparedStatement, Mockito.times(1)).setString(4, "10"); + Mockito.verify(preparedStatement, Mockito.times(1)).setLong(5, 2L); + Mockito.verify(preparedStatement, Mockito.times(1)).setLong(6, 20L); + Mockito.verify(preparedStatement, Mockito.times(1)).setFloat(7, 1.1F); + Mockito.verify(preparedStatement, Mockito.times(1)).setShort(8, (short) 1); + Mockito.verify(preparedStatement, Mockito.times(1)).setByte(9, (byte) 0); + Mockito.verify(preparedStatement, Mockito.times(1)).setDouble(10, 1.2d); + Mockito.verify(preparedStatement, Mockito.times(1)).setBoolean(11, true); + } + + private ValueSet getSingleValueSet(Object value) { + Range range = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range.isSingleValue()).thenReturn(true); + Mockito.when(range.getLow().getValue()).thenReturn(value); + ValueSet valueSet = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(valueSet.getRanges().getOrderedRanges()).thenReturn(Collections.singletonList(range)); + return valueSet; + } + + private ValueSet getRangeSet(Marker.Bound lowerBound, Object lowerValue, Marker.Bound upperBound, Object upperValue) { + Range range = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range.isSingleValue()).thenReturn(false); + Mockito.when(range.getLow().getBound()).thenReturn(lowerBound); + Mockito.when(range.getLow().getValue()).thenReturn(lowerValue); + Mockito.when(range.getHigh().getBound()).thenReturn(upperBound); + Mockito.when(range.getHigh().getValue()).thenReturn(upperValue); + ValueSet valueSet = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(valueSet.getRanges().getOrderedRanges()).thenReturn(Collections.singletonList(range)); + return valueSet; + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlMetadataHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlMetadataHandlerTest.java new file mode 100644 index 0000000000..df917ed5af --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlMetadataHandlerTest.java @@ -0,0 +1,278 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.postgresql; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.connectors.athena.jdbc.TestBase; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcMetadataHandler; +import com.amazonaws.connectors.athena.jdbc.mysql.MySqlMetadataHandler; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.ResultSet; +import java.sql.SQLException; +import java.sql.Types; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Properties; +import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.stream.Collectors; + +public class PostGreSqlMetadataHandlerTest + extends TestBase +{ + private DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL, + "mysql://jdbc:mysql://hostname/user=A&password=B"); + private PostGreSqlMetadataHandler postGreSqlMetadataHandler; + private JdbcConnectionFactory jdbcConnectionFactory; + private Connection connection; + private FederatedIdentity federatedIdentity; + private AWSSecretsManager secretsManager; + private AmazonAthena athena; + + @Before + public void setup() + { + this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); + this.connection = Mockito.mock(Connection.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(this.jdbcConnectionFactory.getConnection(Mockito.any(JdbcCredentialProvider.class))).thenReturn(this.connection); + this.secretsManager = Mockito.mock(AWSSecretsManager.class); + Mockito.when(this.secretsManager.getSecretValue(Mockito.eq(new GetSecretValueRequest().withSecretId("testSecret")))).thenReturn(new GetSecretValueResult().withSecretString("{\"username\": \"testUser\", \"password\": 
\"testPassword\"}")); + this.postGreSqlMetadataHandler = new PostGreSqlMetadataHandler(databaseConnectionConfig, this.secretsManager, this.athena, this.jdbcConnectionFactory); + this.federatedIdentity = Mockito.mock(FederatedIdentity.class); + } + + @Test + public void getPartitionSchema() + { + Assert.assertEquals(SchemaBuilder.newBuilder() + .addField(PostGreSqlMetadataHandler.BLOCK_PARTITION_SCHEMA_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()) + .addField(PostGreSqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build(), + this.postGreSqlMetadataHandler.getPartitionSchema("testCatalogName")); + } + + @Test + public void doGetTableLayout() + throws Exception + { + BlockAllocator blockAllocator = new BlockAllocatorImpl(); + Constraints constraints = Mockito.mock(Constraints.class); + TableName tableName = new TableName("testSchema", "testTable"); + Schema partitionSchema = this.postGreSqlMetadataHandler.getPartitionSchema("testCatalogName"); + Set partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet()); + GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols); + + PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class); + Mockito.when(this.connection.prepareStatement(PostGreSqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement); + + String[] columns = {"child_schema", "child"}; + int[] types = {Types.VARCHAR, Types.VARCHAR}; + Object[][] values = {{"s0", "p0"}, {"s1", "p1"}}; + ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1)); + Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet); + + Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null); + + GetTableLayoutResponse getTableLayoutResponse = this.postGreSqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest); + + Assert.assertEquals(values.length, getTableLayoutResponse.getPartitions().getRowCount()); + + List expectedValues = new ArrayList<>(); + for (int i = 0; i < getTableLayoutResponse.getPartitions().getRowCount(); i++) { + expectedValues.add(BlockUtils.rowToString(getTableLayoutResponse.getPartitions(), i)); + } + Assert.assertEquals(expectedValues, Arrays.asList("[partition_schema_name : s0], [partition_name : p0]", "[partition_schema_name : s1], [partition_name : p1]")); + + SchemaBuilder expectedSchemaBuilder = SchemaBuilder.newBuilder(); + expectedSchemaBuilder.addField(FieldBuilder.newBuilder(PostGreSqlMetadataHandler.BLOCK_PARTITION_SCHEMA_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build()); + expectedSchemaBuilder.addField(FieldBuilder.newBuilder(PostGreSqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build()); + Schema expectedSchema = expectedSchemaBuilder.build(); + Assert.assertEquals(expectedSchema, getTableLayoutResponse.getPartitions().getSchema()); + Assert.assertEquals(tableName, getTableLayoutResponse.getTableName()); + + Mockito.verify(preparedStatement, Mockito.times(1)).setString(1, tableName.getSchemaName()); + Mockito.verify(preparedStatement, Mockito.times(1)).setString(2, tableName.getTableName()); + } + + @Test + public void doGetTableLayoutWithNoPartitions() + throws Exception + { + BlockAllocator blockAllocator 
= new BlockAllocatorImpl(); + Constraints constraints = Mockito.mock(Constraints.class); + TableName tableName = new TableName("testSchema", "testTable"); + Schema partitionSchema = this.postGreSqlMetadataHandler.getPartitionSchema("testCatalogName"); + Set partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet()); + GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols); + + PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class); + Mockito.when(this.connection.prepareStatement(PostGreSqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement); + + String[] columns = {"child_schema", "child"}; + int[] types = {Types.VARCHAR, Types.VARCHAR}; + Object[][] values = {{}}; + ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1)); + Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet); + + Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null); + + GetTableLayoutResponse getTableLayoutResponse = this.postGreSqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest); + + Assert.assertEquals(1, getTableLayoutResponse.getPartitions().getRowCount()); + + List expectedValues = new ArrayList<>(); + for (int i = 0; i < getTableLayoutResponse.getPartitions().getRowCount(); i++) { + expectedValues.add(BlockUtils.rowToString(getTableLayoutResponse.getPartitions(), i)); + } + Assert.assertEquals(expectedValues, Collections.singletonList("[partition_schema_name : *], [partition_name : *]")); + + SchemaBuilder expectedSchemaBuilder = SchemaBuilder.newBuilder(); + expectedSchemaBuilder.addField(FieldBuilder.newBuilder(PostGreSqlMetadataHandler.BLOCK_PARTITION_SCHEMA_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build()); + expectedSchemaBuilder.addField(FieldBuilder.newBuilder(PostGreSqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME, org.apache.arrow.vector.types.Types.MinorType.VARCHAR.getType()).build()); + Schema expectedSchema = expectedSchemaBuilder.build(); + Assert.assertEquals(expectedSchema, getTableLayoutResponse.getPartitions().getSchema()); + Assert.assertEquals(tableName, getTableLayoutResponse.getTableName()); + + Mockito.verify(preparedStatement, Mockito.times(1)).setString(1, tableName.getSchemaName()); + Mockito.verify(preparedStatement, Mockito.times(1)).setString(2, tableName.getTableName()); + } + + @Test(expected = RuntimeException.class) + public void doGetTableLayoutWithSQLException() + throws Exception + { + Constraints constraints = Mockito.mock(Constraints.class); + TableName tableName = new TableName("testSchema", "testTable"); + Schema partitionSchema = this.postGreSqlMetadataHandler.getPartitionSchema("testCatalogName"); + Set partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet()); + GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols); + + Connection connection = Mockito.mock(Connection.class, Mockito.RETURNS_DEEP_STUBS); + JdbcConnectionFactory jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); + Mockito.when(jdbcConnectionFactory.getConnection(Mockito.any(JdbcCredentialProvider.class))).thenReturn(connection); + 
Mockito.when(connection.getMetaData().getSearchStringEscape()).thenThrow(new SQLException()); + PostGreSqlMetadataHandler postGreSqlMetadataHandler = new PostGreSqlMetadataHandler(databaseConnectionConfig, this.secretsManager, this.athena, jdbcConnectionFactory); + + postGreSqlMetadataHandler.doGetTableLayout(Mockito.mock(BlockAllocator.class), getTableLayoutRequest); + } + + @Test + public void doGetSplits() + throws Exception + { + BlockAllocator blockAllocator = new BlockAllocatorImpl(); + Constraints constraints = Mockito.mock(Constraints.class); + TableName tableName = new TableName("testSchema", "testTable"); + Schema partitionSchema = this.postGreSqlMetadataHandler.getPartitionSchema("testCatalogName"); + Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet()); + GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols); + + PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class); + Mockito.when(this.connection.prepareStatement(PostGreSqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement); + + String[] columns = {"child_schema", "child"}; + int[] types = {Types.VARCHAR, Types.VARCHAR}; + Object[][] values = {{"s0", "p0"}, {"s1", "p1"}}; + ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1)); + Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet); + + Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null); + + GetTableLayoutResponse getTableLayoutResponse = this.postGreSqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest); + + BlockAllocator splitBlockAllocator = new BlockAllocatorImpl(); + GetSplitsRequest getSplitsRequest = new GetSplitsRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, getTableLayoutResponse.getPartitions(), new ArrayList<>(partitionCols), constraints, null); + GetSplitsResponse getSplitsResponse = this.postGreSqlMetadataHandler.doGetSplits(splitBlockAllocator, getSplitsRequest); + + Set<Map<String, String>> expectedSplits = new HashSet<>(); + expectedSplits.add(ImmutableMap.of("partition_schema_name", "s0", "partition_name", "p0")); + expectedSplits.add(ImmutableMap.of("partition_schema_name", "s1", "partition_name", "p1")); + Assert.assertEquals(expectedSplits.size(), getSplitsResponse.getSplits().size()); + Set<Map<String, String>> actualSplits = getSplitsResponse.getSplits().stream().map(Split::getProperties).collect(Collectors.toSet()); + Assert.assertEquals(expectedSplits, actualSplits); + } + + @Test + public void doGetSplitsContinuation() + throws Exception + { + BlockAllocator blockAllocator = new BlockAllocatorImpl(); + Constraints constraints = Mockito.mock(Constraints.class); + TableName tableName = new TableName("testSchema", "testTable"); + Schema partitionSchema = this.postGreSqlMetadataHandler.getPartitionSchema("testCatalogName"); + Set<String> partitionCols = partitionSchema.getFields().stream().map(Field::getName).collect(Collectors.toSet()); + GetTableLayoutRequest getTableLayoutRequest = new GetTableLayoutRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, constraints, partitionSchema, partitionCols); + + PreparedStatement preparedStatement = Mockito.mock(PreparedStatement.class); + Mockito.when(this.connection.prepareStatement(PostGreSqlMetadataHandler.GET_PARTITIONS_QUERY)).thenReturn(preparedStatement); + + String[] columns =
{"child_schema", "child"}; + int[] types = {Types.VARCHAR, Types.VARCHAR}; + Object[][] values = {{"s0", "p0"}, {"s1", "p1"}}; + ResultSet resultSet = mockResultSet(columns, types, values, new AtomicInteger(-1)); + final String expectedQuery = String.format(PostGreSqlMetadataHandler.GET_PARTITIONS_QUERY, tableName.getTableName(), tableName.getSchemaName()); + Mockito.when(preparedStatement.executeQuery()).thenReturn(resultSet); + + Mockito.when(this.connection.getMetaData().getSearchStringEscape()).thenReturn(null); + + GetTableLayoutResponse getTableLayoutResponse = this.postGreSqlMetadataHandler.doGetTableLayout(blockAllocator, getTableLayoutRequest); + + BlockAllocator splitBlockAllocator = new BlockAllocatorImpl(); + GetSplitsRequest getSplitsRequest = new GetSplitsRequest(this.federatedIdentity, "testQueryId", "testCatalogName", tableName, getTableLayoutResponse.getPartitions(), new ArrayList<>(partitionCols), constraints, "1"); + GetSplitsResponse getSplitsResponse = this.postGreSqlMetadataHandler.doGetSplits(splitBlockAllocator, getSplitsRequest); + + Set<Map<String, String>> expectedSplits = new HashSet<>(); + expectedSplits.add(ImmutableMap.of("partition_schema_name", "s1", "partition_name", "p1")); + Assert.assertEquals(expectedSplits.size(), getSplitsResponse.getSplits().size()); + Set<Map<String, String>> actualSplits = getSplitsResponse.getSplits().stream().map(Split::getProperties).collect(Collectors.toSet()); + Assert.assertEquals(expectedSplits, actualSplits); + } +} diff --git a/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlRecordHandlerTest.java b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlRecordHandlerTest.java new file mode 100644 index 0000000000..7de4d667fe --- /dev/null +++ b/athena-jdbc/src/test/java/com/amazonaws/connectors/athena/jdbc/postgresql/PostGreSqlRecordHandlerTest.java @@ -0,0 +1,172 @@ +/*- + * #%L + * athena-jdbc + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License.
+ * #L% + */ +package com.amazonaws.connectors.athena.jdbc.postgresql; + +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Marker; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.connectors.athena.jdbc.TestBase; +import com.amazonaws.connectors.athena.jdbc.connection.DatabaseConnectionConfig; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcConnectionFactory; +import com.amazonaws.connectors.athena.jdbc.connection.JdbcCredentialProvider; +import com.amazonaws.connectors.athena.jdbc.manager.JdbcSplitQueryBuilder; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableList; +import com.google.common.collect.ImmutableMap; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.Assert; +import org.junit.Before; +import org.junit.Test; +import org.mockito.Mockito; + +import java.sql.Connection; +import java.sql.PreparedStatement; +import java.sql.SQLException; +import java.util.Collections; + +public class PostGreSqlRecordHandlerTest extends TestBase +{ + private PostGreSqlRecordHandler postGreSqlRecordHandler; + private Connection connection; + private JdbcConnectionFactory jdbcConnectionFactory; + private JdbcSplitQueryBuilder jdbcSplitQueryBuilder; + private AmazonS3 amazonS3; + private AWSSecretsManager secretsManager; + private AmazonAthena athena; + + @Before + public void setup() + { + this.amazonS3 = Mockito.mock(AmazonS3.class); + this.secretsManager = Mockito.mock(AWSSecretsManager.class); + this.athena = Mockito.mock(AmazonAthena.class); + this.connection = Mockito.mock(Connection.class); + this.jdbcConnectionFactory = Mockito.mock(JdbcConnectionFactory.class); + Mockito.when(this.jdbcConnectionFactory.getConnection(Mockito.mock(JdbcCredentialProvider.class))).thenReturn(this.connection); + jdbcSplitQueryBuilder = new PostGreSqlQueryStringBuilder("\""); + final DatabaseConnectionConfig databaseConnectionConfig = new DatabaseConnectionConfig("testCatalog", JdbcConnectionFactory.DatabaseEngine.MYSQL, + "mysql://jdbc:mysql://hostname/user=A&password=B"); + + this.postGreSqlRecordHandler = new PostGreSqlRecordHandler(databaseConnectionConfig, amazonS3, secretsManager, athena, jdbcConnectionFactory, jdbcSplitQueryBuilder); + } + + @Test + public void buildSplitSql() + throws SQLException + { + TableName tableName = new TableName("testSchema", "testTable"); + + SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder(); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol1", Types.MinorType.INT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol2", Types.MinorType.VARCHAR.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol3", Types.MinorType.BIGINT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol4", Types.MinorType.FLOAT4.getType()).build()); + 
schemaBuilder.addField(FieldBuilder.newBuilder("testCol5", Types.MinorType.SMALLINT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol6", Types.MinorType.TINYINT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol7", Types.MinorType.FLOAT8.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("testCol8", Types.MinorType.BIT.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("partition_schema_name", Types.MinorType.VARCHAR.getType()).build()); + schemaBuilder.addField(FieldBuilder.newBuilder("partition_name", Types.MinorType.VARCHAR.getType()).build()); + Schema schema = schemaBuilder.build(); + + Split split = Mockito.mock(Split.class); + Mockito.when(split.getProperties()).thenReturn(ImmutableMap.of("partition_schema_name", "s0", "partition_name", "p0")); + Mockito.when(split.getProperty(Mockito.eq(PostGreSqlMetadataHandler.BLOCK_PARTITION_SCHEMA_COLUMN_NAME))).thenReturn("s0"); + Mockito.when(split.getProperty(Mockito.eq(PostGreSqlMetadataHandler.BLOCK_PARTITION_COLUMN_NAME))).thenReturn("p0"); + + Range range1a = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range1a.isSingleValue()).thenReturn(true); + Mockito.when(range1a.getLow().getValue()).thenReturn(1); + Range range1b = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range1b.isSingleValue()).thenReturn(true); + Mockito.when(range1b.getLow().getValue()).thenReturn(2); + ValueSet valueSet1 = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(valueSet1.getRanges().getOrderedRanges()).thenReturn(ImmutableList.of(range1a, range1b)); + + ValueSet valueSet2 = getRangeSet(Marker.Bound.EXACTLY, "1", Marker.Bound.BELOW, "10"); + ValueSet valueSet3 = getRangeSet(Marker.Bound.ABOVE, 2L, Marker.Bound.EXACTLY, 20L); + ValueSet valueSet4 = getSingleValueSet(1.1F); + ValueSet valueSet5 = getSingleValueSet(1); + ValueSet valueSet6 = getSingleValueSet(0); + ValueSet valueSet7 = getSingleValueSet(1.2d); + ValueSet valueSet8 = getSingleValueSet(true); + + Constraints constraints = Mockito.mock(Constraints.class); + Mockito.when(constraints.getSummary()).thenReturn(new ImmutableMap.Builder() + .put("testCol1", valueSet1) + .put("testCol2", valueSet2) + .put("testCol3", valueSet3) + .put("testCol4", valueSet4) + .put("testCol5", valueSet5) + .put("testCol6", valueSet6) + .put("testCol7", valueSet7) + .put("testCol8", valueSet8) + .build()); + + String expectedSql = "SELECT \"testCol1\", \"testCol2\", \"testCol3\", \"testCol4\", \"testCol5\", \"testCol6\", \"testCol7\", \"testCol8\" FROM \"s0\".\"p0\" WHERE (\"testCol1\" IN (?,?)) AND ((\"testCol2\" >= ? AND \"testCol2\" < ?)) AND ((\"testCol3\" > ? AND \"testCol3\" <= ?)) AND (\"testCol4\" = ?) AND (\"testCol5\" = ?) AND (\"testCol6\" = ?) AND (\"testCol7\" = ?) 
AND (\"testCol8\" = ?)"; + PreparedStatement expectedPreparedStatement = Mockito.mock(PreparedStatement.class); + Mockito.when(this.connection.prepareStatement(Mockito.eq(expectedSql))).thenReturn(expectedPreparedStatement); + + PreparedStatement preparedStatement = this.postGreSqlRecordHandler.buildSplitSql(this.connection, "testCatalogName", tableName, schema, constraints, split); + + Assert.assertEquals(expectedPreparedStatement, preparedStatement); + Mockito.verify(preparedStatement, Mockito.times(1)).setInt(1, 1); + Mockito.verify(preparedStatement, Mockito.times(1)).setInt(2, 2); + Mockito.verify(preparedStatement, Mockito.times(1)).setString(3, "1"); + Mockito.verify(preparedStatement, Mockito.times(1)).setString(4, "10"); + Mockito.verify(preparedStatement, Mockito.times(1)).setLong(5, 2L); + Mockito.verify(preparedStatement, Mockito.times(1)).setLong(6, 20L); + Mockito.verify(preparedStatement, Mockito.times(1)).setFloat(7, 1.1F); + Mockito.verify(preparedStatement, Mockito.times(1)).setShort(8, (short) 1); + Mockito.verify(preparedStatement, Mockito.times(1)).setByte(9, (byte) 0); + Mockito.verify(preparedStatement, Mockito.times(1)).setDouble(10, 1.2d); + Mockito.verify(preparedStatement, Mockito.times(1)).setBoolean(11, true); + } + + private ValueSet getSingleValueSet(Object value) { + Range range = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range.isSingleValue()).thenReturn(true); + Mockito.when(range.getLow().getValue()).thenReturn(value); + ValueSet valueSet = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(valueSet.getRanges().getOrderedRanges()).thenReturn(Collections.singletonList(range)); + return valueSet; + } + + private ValueSet getRangeSet(Marker.Bound lowerBound, Object lowerValue, Marker.Bound upperBound, Object upperValue) { + Range range = Mockito.mock(Range.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(range.isSingleValue()).thenReturn(false); + Mockito.when(range.getLow().getBound()).thenReturn(lowerBound); + Mockito.when(range.getLow().getValue()).thenReturn(lowerValue); + Mockito.when(range.getHigh().getBound()).thenReturn(upperBound); + Mockito.when(range.getHigh().getValue()).thenReturn(upperValue); + ValueSet valueSet = Mockito.mock(SortedRangeSet.class, Mockito.RETURNS_DEEP_STUBS); + Mockito.when(valueSet.getRanges().getOrderedRanges()).thenReturn(Collections.singletonList(range)); + return valueSet; + } +} diff --git a/athena-redis/LICENSE.txt b/athena-redis/LICENSE.txt new file mode 100644 index 0000000000..418de4c108 --- /dev/null +++ b/athena-redis/LICENSE.txt @@ -0,0 +1,174 @@ +Apache License + Version 2.0, January 2004 + http://www.apache.org/licenses/ + + TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION + + 1. Definitions. + + "License" shall mean the terms and conditions for use, reproduction, + and distribution as defined by Sections 1 through 9 of this document. + + "Licensor" shall mean the copyright owner or entity authorized by + the copyright owner that is granting the License. + + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. 
+ + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. 
If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. 
Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. In no event and under no legal theory, + whether in tort (including negligence), contract, or otherwise, + unless required by applicable law (such as deliberate and grossly + negligent acts) or agreed to in writing, shall any Contributor be + liable to You for damages, including any direct, indirect, special, + incidental, or consequential damages of any character arising as a + result of this License or out of the use or inability to use the + Work (including but not limited to damages for loss of goodwill, + work stoppage, computer failure or malfunction, or any and all + other commercial damages or losses), even if such Contributor + has been advised of the possibility of such damages. + + 9. Accepting Warranty or Additional Liability. While redistributing + the Work or Derivative Works thereof, You may choose to offer, + and charge a fee for, acceptance of support, warranty, indemnity, + or other liability obligations and/or rights consistent with this + License. However, in accepting such obligations, You may act only + on Your own behalf and on Your sole responsibility, not on behalf + of any other Contributor, and only if You agree to indemnify, + defend, and hold each Contributor harmless for any liability + incurred by, or claims asserted against, such Contributor by reason + of your accepting any such warranty or additional liability. \ No newline at end of file diff --git a/athena-redis/README.md b/athena-redis/README.md new file mode 100644 index 0000000000..3580770376 --- /dev/null +++ b/athena-redis/README.md @@ -0,0 +1,70 @@ +# Amazon Athena Redis Connector + +This connector enables Amazon Athena to communicate with your Redis instance(s), making your Redis data accessible via SQL. + +Unlike traditional relational data stores, Redis does not have the concept of a table or a column. Instead, Redis offers key-value access patterns where the key is essentially a 'string' and the value is one of: string, z-set, hmap. The Athena Redis Connector allows you to configure virtual tables using the Glue Data Catalog for schema and special table properties to tell the Athena Redis Connector how to map your Redis key-values into a table. You can read more on this below in the 'Setting Up Tables Section'. + + +## Usage + +### Parameters + +The Athena Redis Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below. + +1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket) +2. **spill_prefix** - (Optional) Defaults to sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. 
You should configure an S3 lifecycle on this location to delete old spills after X days/Hours. +3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. Setting a KMS Key ID allows your Lambda function to use KMS for key generation for a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331) +4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM either with a randomly generated key or using KMS to generate keys. Setting this to True will disable spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False) +5. **glue_catalog** - (Optional) Can be used to target a cross-account Glue catalog. By default the connector will attempt to get metadata from its own Glue account. + +### Setting Up Databases & Tables + +To enable a Glue Table for use with Redis, you can set the following properties on the Table: redis-endpoint, redis-value-type, and one of redis-keys-zset or redis-key-prefix. Also note that any Glue database which may contain redis tables should have "redis-db-flag" somewhere in the URI property of the Database. You can set this from the Glue Console by editing the database. + +1. **redis-endpoint** - The hostname:port:password of the redis server that data for this table should come from. (e.g. athena-federation-demo.cache.amazonaws.com:6379) Alternatively, you can store the endpoint or part of the endpoint in SecretsManager by using ${secret_name} as the table property value. +2. **redis-keys-zset** - A comma separated list of keys whose value is a zset. Each of the values in the zset is then treated as a key that is part of this table. You must set either this or redis-key-prefix. (e.g. active-orders,pending-orders) +3. **redis-key-prefix** - A comma separated list of key prefixes to scan for values that should be part of this table. You must set either this or redis-keys-zset on the table. (e.g. accounts-*,acct-) +4. **redis-value-type** - (required) Defines how the value for the keys defined by either redis-key-prefix or redis-keys-zset will be mapped to your table. literal maps to a single column. zset also maps to a single column but each key can essentially store N rows. hash allows for each key to be a row with multiple columns. (e.g. hash or literal or zset)
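The sketch below (not part of this connector) illustrates how a table with these properties might be registered programmatically using the AWS SDK for Java, which the connector already uses via aws-java-sdk-glue. The database name `redisdb`, table name `customers`, and the endpoint, prefix, and column values are hypothetical placeholders.

```java
import com.amazonaws.services.glue.AWSGlue;
import com.amazonaws.services.glue.AWSGlueClientBuilder;
import com.amazonaws.services.glue.model.Column;
import com.amazonaws.services.glue.model.CreateTableRequest;
import com.amazonaws.services.glue.model.StorageDescriptor;
import com.amazonaws.services.glue.model.TableInput;

import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class CreateRedisTableExample
{
    public static void main(String[] args)
    {
        AWSGlue glue = AWSGlueClientBuilder.defaultClient();

        // Special table properties that tell the connector how to map Redis keys to rows.
        Map<String, String> params = new HashMap<>();
        params.put("redis-endpoint", "athena-federation-demo.cache.amazonaws.com:6379");
        params.put("redis-key-prefix", "customers-");
        params.put("redis-value-type", "hash");

        // Ordinary Glue columns define the table's schema; the type names come from
        // the Data Types table below.
        StorageDescriptor storageDescriptor = new StorageDescriptor()
                .withColumns(Arrays.asList(
                        new Column().withName("customer_id").withType("int"),
                        new Column().withName("name").withType("string")));

        // Remember: the Glue database itself must carry "redis-db-flag" in its URI property.
        glue.createTable(new CreateTableRequest()
                .withDatabaseName("redisdb")
                .withTableInput(new TableInput()
                        .withName("customers")
                        .withParameters(params)
                        .withStorageDescriptor(storageDescriptor)));
    }
}
```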
### Data Types + +All Redis values are retrieved as the basic String data type. From there they are converted to one of the below Apache Arrow data types used by the Athena Query Federation SDK based on how you've defined your table(s) in Glue's DataCatalog. + +|Glue DataType|Apache Arrow Type| +|-------------|-----------------| +|int|INT| +|string|VARCHAR| +|bigint|BIGINT| +|double|FLOAT8| +|float|FLOAT4| +|smallint|SMALLINT| +|tinyint|TINYINT| +|boolean|BIT| +|binary|VARBINARY| + +### Required Permissions + +Review the "Policies" section of the athena-redis.yaml file for full details on the IAM Policies required by this connector. A brief summary is below. + +1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3. +2. SecretsManager Read Access - If you choose to store redis-endpoint details in SecretsManager you will need to grant the connector access to those secrets. +3. Glue Data Catalog - Since Redis does not have a meta-data store, the connector requires Read-Only access to Glue's DataCatalog for obtaining Redis key to table/column mappings. +4. VPC Access - In order to connect to your VPC for the purposes of communicating with your Redis instance(s), the connector needs the ability to attach/detach an interface to the VPC. +5. CloudWatch Logs - This is a somewhat implicit permission when deploying a Lambda function but it needs access to cloudwatch logs for storing logs. +6. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated. + +### Deploying The Connector + +To use the Amazon Athena Redis Connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module: + +1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already. +2. From the athena-redis dir, run `mvn clean install`. +3. From the athena-redis dir, run `../tools/publish.sh S3_BUCKET_NAME athena-redis` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with permission to do so to deploy instances of the connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo).
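Once deployed, the connector's catalog can be queried through Athena like any other data source. The sketch below is a hypothetical example using the AWS SDK for Java: the function name `redis`, the database and table names, and the results bucket are placeholders, and the `lambda:` catalog prefix reflects how federated connectors were addressed while this feature was in preview.

```java
import com.amazonaws.services.athena.AmazonAthena;
import com.amazonaws.services.athena.AmazonAthenaClientBuilder;
import com.amazonaws.services.athena.model.ResultConfiguration;
import com.amazonaws.services.athena.model.StartQueryExecutionRequest;
import com.amazonaws.services.athena.model.StartQueryExecutionResult;

public class QueryRedisCatalogExample
{
    public static void main(String[] args)
    {
        AmazonAthena athena = AmazonAthenaClientBuilder.defaultClient();

        // "redis" is assumed to be the AthenaCatalogName chosen at deploy time.
        StartQueryExecutionResult result = athena.startQueryExecution(new StartQueryExecutionRequest()
                .withQueryString("SELECT * FROM \"lambda:redis\".redisdb.customers LIMIT 10")
                .withResultConfiguration(new ResultConfiguration()
                        .withOutputLocation("s3://my-athena-results/")));

        System.out.println("Started query: " + result.getQueryExecutionId());
    }
}
```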
## Performance + +The Athena Redis Connector will attempt to parallelize queries against your Redis instance depending on the type of table you've defined (zset keys vs. prefix keys). Predicate Pushdown is performed within the Lambda function. + +## License + +This project is licensed under the Apache-2.0 License. \ No newline at end of file diff --git a/athena-redis/athena-redis.yaml b/athena-redis/athena-redis.yaml new file mode 100644 index 0000000000..c2972bddff --- /dev/null +++ b/athena-redis/athena-redis.yaml @@ -0,0 +1,93 @@ +Transform: 'AWS::Serverless-2016-10-31' +Metadata: + 'AWS::ServerlessRepo::Application': + Name: AthenaRedisConnector + Description: 'This connector enables Amazon Athena to communicate with your Redis instance(s), making your Redis data accessible via SQL.' + Author: 'Amazon Athena' + SpdxLicenseId: Apache-2.0 + LicenseUrl: LICENSE.txt + ReadmeUrl: README.md + Labels: + - athena-federation + HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation' + SemanticVersion: 1.0.0 + SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation' +Parameters: + AthenaCatalogName: + Description: 'The name you will give to this catalog in Athena. It will also be used as the function name.' + Type: String + SpillBucket: + Description: 'The bucket where this function can spill data.' + Type: String + Default: athena-federation-spill + SpillPrefix: + Description: 'The bucket prefix where this function can spill large responses.' + Type: String + Default: athena-spill + LambdaTimeout: + Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)' + Default: 900 + Type: Number + LambdaMemory: + Description: 'Lambda memory in MB (min 128 - 3008 max).' + Default: 3008 + Type: Number + DisableSpillEncryption: + Description: "WARNING: If set to 'true' encryption for spilled data is disabled." + Default: 'false' + Type: String + SecurityGroupIds: + Description: 'One or more SecurityGroup IDs corresponding to the SecurityGroup that should be applied to the Lambda function. (e.g. sg1,sg2,sg3)' + Type: 'List<AWS::EC2::SecurityGroup::Id>' + SubnetIds: + Description: 'One or more Subnet IDs corresponding to the Subnet that the Lambda function can use to access your data source. (e.g. subnet1,subnet2)' + Type: 'List<AWS::EC2::Subnet::Id>' + SecretNameOrPrefix: + Description: 'The name or prefix of a set of names within Secrets Manager that this function should have access to. (e.g. redis-*).' + Type: String +Resources: + ConnectorConfig: + Type: 'AWS::Serverless::Function' + Properties: + Environment: + Variables: + disable_spill_encryption: !Ref DisableSpillEncryption + spill_bucket: !Ref SpillBucket + spill_prefix: !Ref SpillPrefix + FunctionName: !Ref AthenaCatalogName + Handler: "com.amazonaws.athena.connectors.redis.RedisCompositeHandler" + CodeUri: "./target/athena-redis-1.0.jar" + Description: "Enables Amazon Athena to communicate with Redis, making your Redis data accessible via SQL" + Runtime: java8 + Timeout: !Ref LambdaTimeout + MemorySize: !Ref LambdaMemory + Policies: + - Statement: + - Action: + - secretsmanager:GetSecretValue + Effect: Allow + Resource: !Sub 'arn:aws:secretsmanager:*:*:secret:${SecretNameOrPrefix}' + Version: '2012-10-17' + - Statement: + - Action: + - glue:GetTableVersions + - glue:GetPartitions + - glue:GetTables + - glue:GetTableVersion + - glue:GetDatabases + - glue:GetTable + - glue:GetPartition + - glue:GetDatabase + - athena:GetQueryExecution + Effect: Allow + Resource: '*' + Version: '2012-10-17' + #S3CrudPolicy allows our connector to spill large responses to S3. You can optionally replace this pre-made policy + #with one that is more restrictive and can only 'put' but not read,delete, or overwrite files. + - S3CrudPolicy: + BucketName: !Ref SpillBucket + #VPCAccessPolicy allows our connector to run in a VPC so that it can access your data source. + - VPCAccessPolicy: {} + VpcConfig: + SecurityGroupIds: !Ref SecurityGroupIds + SubnetIds: !Ref SubnetIds \ No newline at end of file diff --git a/athena-redis/pom.xml b/athena-redis/pom.xml new file mode 100644 index 0000000000..eec0b851d7 --- /dev/null +++ b/athena-redis/pom.xml @@ -0,0 +1,61 @@ +<?xml version="1.0" encoding="UTF-8"?> +<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"> + <parent> + <artifactId>aws-athena-query-federation</artifactId> + <groupId>com.amazonaws</groupId> + <version>1.0</version> + </parent> + <modelVersion>4.0.0</modelVersion> + + <artifactId>athena-redis</artifactId> + + <dependencies> + <dependency> + <groupId>com.amazonaws</groupId> + <artifactId>aws-athena-federation-sdk</artifactId> + <version>${aws-athena-federation-sdk.version}</version> + </dependency> + <dependency> + <groupId>redis.clients</groupId> + <artifactId>jedis</artifactId> + <version>3.0.0</version> + </dependency> + <dependency> + <groupId>com.amazonaws</groupId> + <artifactId>aws-java-sdk-glue</artifactId> + <version>1.11.490</version> + </dependency> + </dependencies> + + <build> + <plugins> + <plugin> + <groupId>org.apache.maven.plugins</groupId> + <artifactId>maven-shade-plugin</artifactId> + <version>3.2.1</version> + <configuration> + <createDependencyReducedPom>false</createDependencyReducedPom> + <filters> + <filter> + <artifact>*:*</artifact> + <excludes> + <exclude>META-INF/*.SF</exclude> + <exclude>META-INF/*.DSA</exclude> + <exclude>META-INF/*.RSA</exclude> + </excludes> + </filter> + </filters> + </configuration> + <executions> + <execution> + <phase>package</phase> + <goals> + <goal>shade</goal> + </goals> + </execution> + </executions> + </plugin> + </plugins> + </build> +</project> \ No newline at end of file diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/JedisPoolFactory.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/JedisPoolFactory.java new file mode 100644 index 0000000000..dddbac9dec --- /dev/null +++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/JedisPoolFactory.java @@ -0,0 +1,87 @@ +/*- + * #%L + * athena-redis + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.redis; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import redis.clients.jedis.Jedis; +import redis.clients.jedis.JedisPool; +import redis.clients.jedis.JedisPoolConfig; + +import java.util.HashMap; +import java.util.Map; + +/** + * Creates and Caches JedisPool Instances, using the connection string as the cache key. + * + * @Note Connection String format is expected to be host:port or host:port:password_token + */ +public class JedisPoolFactory +{ + private static final Logger logger = LoggerFactory.getLogger(JedisPoolFactory.class); + + //Realistically we wouldn't need more than 1 but using 4 to give the pool some wiggle room for + //connections that are dying / starting to avoid impacting getting a connection quickly. + private static final int MAX_CONS = 4; + private static final int CONNECTION_TIMEOUT_MS = 2_000; + + private final Map<String, JedisPool> clientCache = new HashMap<>(); + + /** + * Gets or Creates a Jedis instance for the given connection string. + * @param conStr Redis connection details, format is expected to be host:port or host:port:password_token + * @return A Jedis connection if the connection succeeded, else the function will throw. + */ + public synchronized Jedis getOrCreateConn(String conStr) + { + JedisPool pool = clientCache.get(conStr); + if (pool == null) { + String[] endpointParts = conStr.split(":"); + if (endpointParts.length == 2) { + pool = getOrCreateCon(endpointParts[0], Integer.valueOf(endpointParts[1])); + } + else if (endpointParts.length == 3) { + pool = getOrCreateCon(endpointParts[0], Integer.valueOf(endpointParts[1]), endpointParts[2]); + } + else { + throw new IllegalArgumentException("Redis endpoint format error."); + } + + clientCache.put(conStr, pool); + } + return pool.getResource(); + } + + private JedisPool getOrCreateCon(String host, int port) + { + logger.info("getOrCreateCon: Creating connection pool."); + JedisPoolConfig poolConfig = new JedisPoolConfig(); + poolConfig.setMaxTotal(MAX_CONS); + return new JedisPool(poolConfig, host, port, CONNECTION_TIMEOUT_MS); + } + + private JedisPool getOrCreateCon(String host, int port, String passwordToken) + { + logger.info("getOrCreateCon: Creating connection pool with password."); + JedisPoolConfig poolConfig = new JedisPoolConfig(); + poolConfig.setMaxTotal(MAX_CONS); + return new JedisPool(poolConfig, host, port, CONNECTION_TIMEOUT_MS, passwordToken); + } +}
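A minimal usage sketch for the factory above; the endpoint and key are hypothetical. Because the returned Jedis is checked out of a pool, closing it hands the connection back to the pool rather than tearing it down.

```java
import redis.clients.jedis.Jedis;

public class JedisPoolFactoryExample
{
    public static void main(String[] args)
    {
        JedisPoolFactory factory = new JedisPoolFactory();

        // Jedis implements Closeable, so try-with-resources returns the
        // connection to its pool when the block exits.
        try (Jedis jedis = factory.getOrCreateConn("athena-federation-demo.cache.amazonaws.com:6379")) {
            System.out.println(jedis.get("customers-1"));
        }
    }
}
```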
diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/KeyType.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/KeyType.java
new file mode 100644
index 0000000000..412ff86b66
--- /dev/null
+++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/KeyType.java
@@ -0,0 +1,74 @@
+/*-
+ * #%L
+ * athena-redis
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.redis;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Defines the supported key types that can be used to define the keys that comprise a Redis table in Glue.
+ */
+public enum KeyType
+{
+    /**
+     * Indicates that the KeyType is a prefix and so all Redis keys matching this prefix are in scope for the Table.
+     */
+    PREFIX("prefix"),
+
+    /**
+     * Indicates that the KeyType is a zset, so all keys matching the value will be zsets. As such we
+     * take all the values stored in those zsets and treat them as keys that are in scope for the Table.
+     *
+     * For example: my_key_list is a key which points to a zset that contains: key1, key2, key3. When we query
+     * this table, we look up my_key_list and, for each value (key1, key2, key3) in that zset, we look up
+     * the value stored at that key. So our table contains the values stored at key1, key2, key3.
+     */
+    ZSET("zset");
+
+    private static final Map<String, KeyType> TYPE_MAP = new HashMap<>();
+
+    static {
+        for (KeyType next : KeyType.values()) {
+            TYPE_MAP.put(next.id, next);
+        }
+    }
+
+    private String id;
+
+    KeyType(String id)
+    {
+        this.id = id;
+    }
+
+    public String getId()
+    {
+        return id;
+    }
+
+    public static KeyType fromId(String id)
+    {
+        KeyType result = TYPE_MAP.get(id);
+        if (result == null) {
+            throw new IllegalArgumentException("Unknown KeyType for id: " + id);
+        }
+
+        return result;
+    }
+}
diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisCompositeHandler.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisCompositeHandler.java
new file mode 100644
index 0000000000..b09fb81493
--- /dev/null
+++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisCompositeHandler.java
@@ -0,0 +1,35 @@
+/*-
+ * #%L
+ * athena-redis
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.redis;
+
+import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler;
+
+/**
+ * Boilerplate composite handler that allows us to use a single Lambda function for both
+ * Metadata and Data. In this case we just compose RedisMetadataHandler and RedisRecordHandler.
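To make the ZSET KeyType above concrete, here is a sketch of seeding Redis so that a zset serves as a table's key list; the key names and host are illustrative:

```java
import redis.clients.jedis.Jedis;

public class ZsetKeyTypeExample
{
    public static void main(String[] args)
    {
        try (Jedis client = new Jedis("localhost", 6379)) {
            //Values that will become the table's rows.
            client.set("key1", "row-value-1");
            client.set("key2", "row-value-2");
            client.set("key3", "row-value-3");

            //my_key_list is the zset named by the redis-keys-zset table property;
            //its members (key1..key3) are treated as the keys that make up the table.
            client.zadd("my_key_list", 0d, "key1");
            client.zadd("my_key_list", 1d, "key2");
            client.zadd("my_key_list", 2d, "key3");
        }
    }
}
```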
+ */ +public class RedisCompositeHandler + extends CompositeHandler +{ + public RedisCompositeHandler() + { + super(new RedisMetadataHandler(), new RedisRecordHandler()); + } +} diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisMetadataHandler.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisMetadataHandler.java new file mode 100644 index 0000000000..ae5f777f47 --- /dev/null +++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisMetadataHandler.java @@ -0,0 +1,367 @@ +/*- + * #%L + * athena-redis + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.redis; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.FieldBuilder; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.handlers.GlueMetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.glue.DefaultGlueType; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.glue.AWSGlueClientBuilder; +import com.amazonaws.services.glue.model.Database; +import com.amazonaws.services.glue.model.Table; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.complex.reader.VarCharReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.util.Text; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import redis.clients.jedis.Jedis; +import redis.clients.jedis.ScanParams; +import redis.clients.jedis.ScanResult; + +import java.util.Arrays; +import java.util.HashSet; 
+import java.util.Map; +import java.util.Set; + +import static redis.clients.jedis.ScanParams.SCAN_POINTER_START; + +/** + * Handles metadata requests for the Athena Redis Connector using Glue for schema. + *

+ * For more detail, please see the module's README.md. Some notable characteristics of this class include:
+ *

+ * 1. Uses Glue table properties (redis-endpoint, redis-value-type, redis-key-prefix, and redis-keys-zset) to
+ *    provide schema as well as connectivity details to Redis.
+ * 2. Attempts to resolve sensitive fields such as redis-endpoint via SecretsManager so that you can substitute
+ *    variables with values from SecretsManager by using something like hostname:port:password=${my_secret}.
+ */
+public class RedisMetadataHandler
+        extends GlueMetadataHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(RedisMetadataHandler.class);
+
+    private static final String SOURCE_TYPE = "redis";
+    private static final String END_CURSOR = "0";
+    //Controls the max splits to generate, relevant keys are spread across this many splits where possible.
+    private static final long REDIS_MAX_SPLITS = 10;
+    //The page size for Jedis scans.
+    private static final int SCAN_COUNT_SIZE = 100;
+    protected static final String KEY_COLUMN_NAME = "_key_";
+    protected static final String SPLIT_START_INDEX = "start-index";
+    protected static final String SPLIT_END_INDEX = "end-index";
+
+    //Defines the table property name used to set the Redis Key Type for the table. (e.g. prefix, zset)
+    protected static final String KEY_TYPE = "redis-key-type";
+    //Defines the table property name used to set the Redis value type for the table. (e.g. literal, zset, hash)
+    protected static final String VALUE_TYPE_TABLE_PROP = "redis-value-type";
+    //Defines the table property name used to configure one or more key prefixes to include in the
+    //table (e.g. key-prefix-1-*, key-prefix-2-*)
+    protected static final String KEY_PREFIX_TABLE_PROP = "redis-key-prefix";
+    //Defines the table property name used to configure one or more zset keys whose values should be used as keys
+    //to include in the table.
+    protected static final String ZSET_KEYS_TABLE_PROP = "redis-keys-zset";
+    protected static final String KEY_PREFIX_SEPERATOR = ",";
+    //Defines the table property name used to configure the Redis endpoint to query for the data in that table.
+    //Connection String format is expected to be host:port or host:port:password_token
+    protected static final String REDIS_ENDPOINT_PROP = "redis-endpoint";
+    //Defines the value that should be present in the Glue Database URI to enable the DB for Redis.
+    protected static final String REDIS_DB_FLAG = "redis-db-flag";
+
+    //Used to filter out Glue tables which lack a redis endpoint.
+    private static final TableFilter TABLE_FILTER = (Table table) -> table.getParameters().containsKey(REDIS_ENDPOINT_PROP);
+    //Used to filter out Glue databases which lack the REDIS_DB_FLAG in the URI.
+    private static final DatabaseFilter DB_FILTER = (Database database) -> (database.getLocationUri() != null && database.getLocationUri().contains(REDIS_DB_FLAG));
+
+    private final AWSGlue awsGlue;
+    private final JedisPoolFactory jedisPoolFactory;
+
+    public RedisMetadataHandler()
+    {
+        super(AWSGlueClientBuilder.standard().build(), SOURCE_TYPE);
+        this.awsGlue = getAwsGlue();
+        this.jedisPoolFactory = new JedisPoolFactory();
+    }
+
+    @VisibleForTesting
+    protected RedisMetadataHandler(AWSGlue awsGlue,
+            EncryptionKeyFactory keyFactory,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            JedisPoolFactory jedisPoolFactory,
+            String spillBucket,
+            String spillPrefix)
+    {
+        super(awsGlue, keyFactory, secretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix);
+        this.awsGlue = awsGlue;
+        this.jedisPoolFactory = jedisPoolFactory;
+    }
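A sketch of the endpoint-resolution flow that the getOrCreateClient helper below relies on; the host and secret names are illustrative, and resolveSecrets is inherited from the SDK's handler base class:

```java
//Illustrative fragment: how a ${secret} reference in the Glue redis-endpoint property is resolved.
String rawEndpoint = "my-redis-host:6379:${redis_creds}";  //value stored on the Glue table
String endpoint = resolveSecrets(rawEndpoint);             //e.g. "my-redis-host:6379:actual-token"
Jedis client = jedisPoolFactory.getOrCreateConn(endpoint); //pooled connection using the resolved token
```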
+    /**
+     * Used to obtain a Redis client connection for the provided endpoint.
+     *
+     * @param rawEndpoint The value from the REDIS_ENDPOINT_PROP on the table being queried.
+     * @return A Jedis client connection.
+     * @notes This method first attempts to resolve any secrets (noted by ${secret_name}) using SecretsManager.
+     */
+    private Jedis getOrCreateClient(String rawEndpoint)
+    {
+        String endpoint = resolveSecrets(rawEndpoint);
+        return jedisPoolFactory.getOrCreateConn(endpoint);
+    }
+
+    /**
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator blockAllocator, ListSchemasRequest request)
+            throws Exception
+    {
+        return doListSchemaNames(blockAllocator, request, DB_FILTER);
+    }
+
+    /**
+     * @see GlueMetadataHandler
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator blockAllocator, ListTablesRequest request)
+            throws Exception
+    {
+        return super.doListTables(blockAllocator, request, TABLE_FILTER);
+    }
+
+    /**
+     * Retrieves the schema for the requested Table from Glue, then enriches that result with Redis-specific
+     * metadata and columns.
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator blockAllocator, GetTableRequest request)
+            throws Exception
+    {
+        GetTableResponse response = super.doGetTable(blockAllocator, request);
+
+        SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
+        response.getSchema().getFields().forEach((Field field) ->
+                schemaBuilder.addField(field.getName(), field.getType(), field.getChildren()));
+
+        response.getSchema().getCustomMetadata().entrySet().forEach((Map.Entry<String, String> meta) ->
+                schemaBuilder.addMetadata(meta.getKey(), meta.getValue()));
+
+        schemaBuilder.addField(KEY_COLUMN_NAME, Types.MinorType.VARCHAR.getType());
+
+        return new GetTableResponse(response.getCatalogName(), response.getTableName(), schemaBuilder.build());
+    }
+
+    @Override
+    public void enhancePartitionSchema(SchemaBuilder partitionSchemaBuilder, GetTableLayoutRequest request)
+    {
+        partitionSchemaBuilder.addStringField(REDIS_ENDPOINT_PROP)
+                .addStringField(VALUE_TYPE_TABLE_PROP)
+                .addStringField(KEY_PREFIX_TABLE_PROP)
+                .addStringField(ZSET_KEYS_TABLE_PROP);
+    }
+
+    /**
+     * Even though our table doesn't support complex layouts or partitioning, we need to convey that there is at least
+     * 1 partition to read as part of the query or Athena will assume partition pruning found no candidate layouts to read.
+     * We also use this 1 partition to carry settings that we will need in order to generate splits.
+     */
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+            throws Exception
+    {
+        Map<String, String> properties = request.getSchema().getCustomMetadata();
+        blockWriter.writeRows((Block block, int rowNum) -> {
+            block.setValue(REDIS_ENDPOINT_PROP, rowNum, properties.get(REDIS_ENDPOINT_PROP));
+            block.setValue(VALUE_TYPE_TABLE_PROP, rowNum, properties.get(VALUE_TYPE_TABLE_PROP));
+            block.setValue(KEY_PREFIX_TABLE_PROP, rowNum, properties.get(KEY_PREFIX_TABLE_PROP));
+            block.setValue(ZSET_KEYS_TABLE_PROP, rowNum, properties.get(ZSET_KEYS_TABLE_PROP));
+            return 1;
+        });
+    }
+    /**
+     * If the table is comprised of multiple key prefixes, then we parallelize those by making each one a split.
+     *
+     * @note This function essentially takes each key-prefix and makes it a split. For zset keys, it breaks each zset
+     * into at most N splits, where N is configured via REDIS_MAX_SPLITS.
+     */
+    @Override
+    public GetSplitsResponse doGetSplits(BlockAllocator blockAllocator, GetSplitsRequest request)
+    {
+        if (request.getPartitions().getRowCount() != 1) {
+            throw new RuntimeException("Unexpected number of partitions encountered.");
+        }
+
+        Block partitions = request.getPartitions();
+        String redisEndpoint = getValue(partitions, 0, REDIS_ENDPOINT_PROP);
+        String redisValueType = getValue(partitions, 0, VALUE_TYPE_TABLE_PROP);
+
+        logger.info("doGetSplits: Preparing splits for {}", BlockUtils.rowToString(partitions, 0));
+
+        KeyType keyType = null;
+        Set<String> splitInputs = new HashSet<>();
+
+        String keyPrefix = getValue(partitions, 0, KEY_PREFIX_TABLE_PROP);
+        if (keyPrefix != null) {
+            //Add the prefixes to the list and set the key type.
+            splitInputs.addAll(Arrays.asList(keyPrefix.split(KEY_PREFIX_SEPERATOR)));
+            keyType = KeyType.PREFIX;
+        }
+        else {
+            String[] partitionPrefixes = getValue(partitions, 0, ZSET_KEYS_TABLE_PROP).split(KEY_PREFIX_SEPERATOR);
+
+            ScanResult<String> keyCursor = null;
+            //Add all the values in the ZSETs as keys to scan
+            for (String next : partitionPrefixes) {
+                do {
+                    keyCursor = loadKeys(redisEndpoint, next, keyCursor, splitInputs);
+                }
+                while (keyCursor != null && !END_CURSOR.equals(keyCursor.getCursor()));
+            }
+            keyType = KeyType.ZSET;
+        }
+
+        Set<Split> splits = new HashSet<>();
+        for (String next : splitInputs) {
+            splits.addAll(makeSplits(request, redisEndpoint, next, keyType, redisValueType));
+        }
+
+        return new GetSplitsResponse(request.getCatalogName(), splits, null);
+    }
+
+    /**
+     * For a given key prefix this method attempts to break up all the matching keys into N buckets (aka N splits).
+     *
+     * @param request The GetSplitsRequest being processed.
+     * @param endpoint The redis endpoint to query.
+     * @param keyPrefix The key prefix to scan.
+     * @param keyType The KeyType (prefix or zset).
+     * @param valueType The ValueType, used for mapping the values stored at each key to a result row when the split is processed.
+     * @return A Set of splits to optionally parallelize reading the values associated with the keyPrefix.
+     */
+    private Set<Split> makeSplits(GetSplitsRequest request, String endpoint, String keyPrefix, KeyType keyType, String valueType)
+    {
+        Set<Split> splits = new HashSet<>();
+        long numberOfKeys = 1;
+
+        if (keyType == KeyType.ZSET) {
+            try (Jedis client = getOrCreateClient(endpoint)) {
+                numberOfKeys = client.zcount(keyPrefix, "-inf", "+inf");
+                logger.info("makeSplits: ZCOUNT[{}] found [{}]", keyPrefix, numberOfKeys);
+            }
+        }
+
+        long stride = (numberOfKeys > REDIS_MAX_SPLITS) ? 1 + (numberOfKeys / REDIS_MAX_SPLITS) : numberOfKeys;
+
+        for (long startIndex = 0; startIndex < numberOfKeys; startIndex += stride) {
+            long endIndex = startIndex + stride - 1;
+            if (endIndex >= numberOfKeys) {
+                endIndex = -1;
+            }
+
+            //Every split must have a unique location if we wish to spill to avoid failures
+            SpillLocation spillLocation = makeSpillLocation(request);
+
+            Split split = Split.newBuilder(spillLocation, makeEncryptionKey())
+                    .add(KEY_PREFIX_TABLE_PROP, keyPrefix)
+                    .add(KEY_TYPE, keyType.getId())
+                    .add(VALUE_TYPE_TABLE_PROP, valueType)
+                    .add(REDIS_ENDPOINT_PROP, endpoint)
+                    .add(SPLIT_START_INDEX, String.valueOf(startIndex))
+                    .add(SPLIT_END_INDEX, String.valueOf(endIndex))
+                    .build();
+
+            splits.add(split);
+
+            logger.info("makeSplits: Split[{}]", split);
+        }
+
+        return splits;
+    }
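A worked example of the stride math in makeSplits above; the key count and split limit are illustrative:

```java
public class SplitStrideExample
{
    public static void main(String[] args)
    {
        long numberOfKeys = 200;   //e.g. what ZCOUNT returned for the zset
        long maxSplits = 10;       //REDIS_MAX_SPLITS

        long stride = (numberOfKeys > maxSplits) ? 1 + (numberOfKeys / maxSplits) : numberOfKeys;

        for (long start = 0; start < numberOfKeys; start += stride) {
            long end = start + stride - 1;
            if (end >= numberOfKeys) {
                end = -1;   //signals "read to the end of the zset"
            }
            System.out.println("split [" + start + ", " + end + "]");
        }
        //With 200 keys and 10 max splits, stride is 21 and 10 splits are produced,
        //the last one being [189, -1].
    }
}
```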
+    /**
+     * For the given zset prefix, find all values and treat each of those values as a key to scan before returning
+     * the scan continuation token.
+     *
+     * @param connStr The Jedis connection string for the table.
+     * @param prefix The zset key prefix to scan.
+     * @param redisCursor The previous Redis cursor (aka continuation token).
+     * @param keys The collection of keys we collected so far. Any new keys we find are added to this.
+     * @return The Redis cursor to use when continuing the scan.
+     */
+    private ScanResult<String> loadKeys(String connStr, String prefix, ScanResult<String> redisCursor, Set<String> keys)
+    {
+        try (Jedis client = getOrCreateClient(connStr)) {
+            String cursor = (redisCursor == null) ? SCAN_POINTER_START : redisCursor.getCursor();
+            ScanParams scanParam = new ScanParams();
+            scanParam.count(SCAN_COUNT_SIZE);
+            scanParam.match(prefix);
+
+            ScanResult<String> newCursor = client.scan(cursor, scanParam);
+            keys.addAll(newCursor.getResult());
+            return newCursor;
+        }
+    }
+
+    /**
+     * Overrides the default Glue Type to Apache Arrow Type mapping so that we can fail fast on tables which define
+     * types that are not supported by this connector.
+     */
+    @Override
+    protected Field convertField(String name, String type)
+    {
+        return FieldBuilder.newBuilder(name, DefaultGlueType.fromId(type).getArrowType()).build();
+    }
+
+    private String getValue(Block block, int row, String fieldName)
+    {
+        VarCharReader reader = block.getFieldReader(fieldName);
+        reader.setPosition(row);
+        if (reader.isSet()) {
+            Text result = reader.readText();
+            return (result == null) ? null : result.toString();
+        }
+
+        return null;
+    }
+}
diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisRecordHandler.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisRecordHandler.java
new file mode 100644
index 0000000000..480640066f
--- /dev/null
+++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/RedisRecordHandler.java
@@ -0,0 +1,259 @@
+/*-
+ * #%L
+ * athena-redis
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.redis; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Field; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import redis.clients.jedis.Jedis; +import redis.clients.jedis.ScanParams; +import redis.clients.jedis.ScanResult; +import redis.clients.jedis.Tuple; + +import java.util.Collections; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.Set; +import java.util.concurrent.atomic.AtomicLong; +import java.util.stream.Collectors; + +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_COLUMN_NAME; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_PREFIX_TABLE_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_TYPE; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.REDIS_ENDPOINT_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.SPLIT_END_INDEX; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.SPLIT_START_INDEX; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.VALUE_TYPE_TABLE_PROP; +import static redis.clients.jedis.ScanParams.SCAN_POINTER_START; + +/** + * Handles data read record requests for the Athena Redis Connector. + *

+ * For more detail, please see the module's README.md. Some notable characteristics of this class include:
+ *

+ * 1. Supports literal, zset, and hash value types.
+ * 2. Attempts to resolve sensitive configuration fields such as redis-endpoint via SecretsManager so that you can
+ *    substitute variables with values from SecretsManager by using something like hostname:port:password=${my_secret}.
+ */
+public class RedisRecordHandler
+        extends RecordHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(RedisRecordHandler.class);
+
+    private static final String SOURCE_TYPE = "redis";
+    private static final String END_CURSOR = "0";
+
+    //The page size for Jedis scans.
+    private static final int SCAN_COUNT_SIZE = 100;
+
+    private final JedisPoolFactory jedisPoolFactory;
+    private final AmazonS3 amazonS3;
+
+    public RedisRecordHandler()
+    {
+        this(AmazonS3ClientBuilder.standard().build(),
+                AWSSecretsManagerClientBuilder.defaultClient(),
+                AmazonAthenaClientBuilder.defaultClient(),
+                new JedisPoolFactory());
+    }
+
+    @VisibleForTesting
+    protected RedisRecordHandler(AmazonS3 amazonS3,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            JedisPoolFactory jedisPoolFactory)
+    {
+        super(amazonS3, secretsManager, athena, SOURCE_TYPE);
+        this.amazonS3 = amazonS3;
+        this.jedisPoolFactory = jedisPoolFactory;
+    }
+
+    /**
+     * Used to obtain a Redis client connection for the provided endpoint.
+     *
+     * @param rawEndpoint The value from the REDIS_ENDPOINT_PROP on the table being queried.
+     * @return A Jedis client connection.
+     * @notes This method first attempts to resolve any secrets (noted by ${secret_name}) using SecretsManager.
+     */
+    private Jedis getOrCreateClient(String rawEndpoint)
+    {
+        String endpoint = resolveSecrets(rawEndpoint);
+        return jedisPoolFactory.getOrCreateConn(endpoint);
+    }
+
+    /**
+     * @see RecordHandler
+     */
+    @Override
+    protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+    {
+        Split split = recordsRequest.getSplit();
+        ScanResult<String> keyCursor = null;
+
+        final AtomicLong rowsMatched = new AtomicLong(0);
+        int numRows = 0;
+        do {
+            Set<String> keys = new HashSet<>();
+            //Load all the keys associated with this split
+            keyCursor = loadKeys(split, keyCursor, keys);
+
+            //Scan the data associated with all the keys.
+            for (String nextKey : keys) {
+                if (!queryStatusChecker.isQueryRunning()) {
+                    return;
+                }
+                try (Jedis client = getOrCreateClient(split.getProperty(REDIS_ENDPOINT_PROP))) {
+                    ValueType valueType = ValueType.fromId(split.getProperty(VALUE_TYPE_TABLE_PROP));
+                    List<Field> fieldList = recordsRequest.getSchema().getFields().stream()
+                            .filter((Field next) -> !KEY_COLUMN_NAME.equals(next.getName())).collect(Collectors.toList());
+
+                    switch (valueType) {
+                        case LITERAL: //The key's value is a single row with a single column
+                            loadLiteralRow(client, nextKey, spiller, fieldList);
+                            break;
+                        case HASH:
+                            loadHashRow(client, nextKey, spiller, fieldList);
+                            break;
+                        case ZSET:
+                            loadZSetRows(client, nextKey, spiller, fieldList);
+                            break;
+                        default:
+                            throw new RuntimeException("Unsupported value type " + valueType);
+                    }
+                }
+            }
+        }
+        while (keyCursor != null && !END_CURSOR.equals(keyCursor.getCursor()));
+    }
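A sketch of the Redis-side layouts the three value types above expect; the host, keys, and column names are illustrative:

```java
import redis.clients.jedis.Jedis;

public class ValueTypeLayoutExample
{
    public static void main(String[] args)
    {
        try (Jedis client = new Jedis("localhost", 6379)) {
            //LITERAL: the whole value is one row with a single column.
            client.set("key-literal-1", "42");

            //HASH: one row per key; hash fields map to (lowercase) column names.
            client.hset("key-hash-1", "intcol", "1");
            client.hset("key-hash-1", "stringcol", "abc");

            //ZSET: one row per member of the sorted set.
            client.zadd("key-zset-1", 0d, "a");
            client.zadd("key-zset-1", 1d, "b");
        }
    }
}
```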
+    /**
+     * For the given key prefix, find all actual keys depending on the type of the key.
+     *
+     * @param split The split for this request, mostly used to get the redis endpoint and config details.
+     * @param redisCursor The previous Redis cursor (aka continuation token).
+     * @param keys The collection of keys we collected so far. Any new keys we find are added to this.
+     * @return The Redis cursor to use when continuing the scan.
+     */
+    private ScanResult<String> loadKeys(Split split, ScanResult<String> redisCursor, Set<String> keys)
+    {
+        try (Jedis client = getOrCreateClient(split.getProperty(REDIS_ENDPOINT_PROP))) {
+            KeyType keyType = KeyType.fromId(split.getProperty(KEY_TYPE));
+            String keyPrefix = split.getProperty(KEY_PREFIX_TABLE_PROP);
+            if (keyType == KeyType.ZSET) {
+                long start = Long.valueOf(split.getProperty(SPLIT_START_INDEX));
+                long end = Long.valueOf(split.getProperty(SPLIT_END_INDEX));
+                keys.addAll(client.zrange(keyPrefix, start, end));
+                return new ScanResult<>(END_CURSOR, Collections.emptyList());
+            }
+            else {
+                String cursor = (redisCursor == null) ? SCAN_POINTER_START : redisCursor.getCursor();
+                ScanParams scanParam = new ScanParams();
+                scanParam.count(SCAN_COUNT_SIZE);
+                scanParam.match(split.getProperty(KEY_PREFIX_TABLE_PROP));
+
+                ScanResult<String> newCursor = client.scan(cursor, scanParam);
+                keys.addAll(newCursor.getResult());
+                return newCursor;
+            }
+        }
+    }
+
+    private void loadLiteralRow(Jedis client, String keyString, BlockSpiller spiller, List<Field> fieldList)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            if (fieldList.size() != 1) {
+                throw new RuntimeException("Ambiguous field mapping, more than 1 field for literal value type.");
+            }
+
+            Field field = fieldList.get(0);
+            Object value = ValueConverter.convert(field, client.get(keyString));
+            boolean literalMatched = block.offerValue(KEY_COLUMN_NAME, row, keyString);
+            literalMatched &= block.offerValue(field.getName(), row, value);
+            return literalMatched ? 1 : 0;
+        });
+    }
+
+    private void loadHashRow(Jedis client, String keyString, BlockSpiller spiller, List<Field> fieldList)
+    {
+        spiller.writeRows((Block block, int row) -> {
+            boolean hashMatched = block.offerValue(KEY_COLUMN_NAME, row, keyString);
+
+            Map<String, String> rawValues = new HashMap<>();
+            //Glue only supports lowercase column names / also could do a better job only fetching the columns
+            //that are needed
+            client.hgetAll(keyString).forEach((key, entry) -> rawValues.put(key.toLowerCase(), entry));
+
+            for (Field hfield : fieldList) {
+                Object hvalue = ValueConverter.convert(hfield, rawValues.get(hfield.getName()));
+                if (hashMatched && !block.offerValue(hfield.getName(), row, hvalue)) {
+                    return 0;
+                }
+            }
+
+            return 1;
+        });
+    }
+
+    private void loadZSetRows(Jedis client, String keyString, BlockSpiller spiller, List<Field> fieldList)
+    {
+        if (fieldList.size() != 1) {
+            throw new RuntimeException("Ambiguous field mapping, more than 1 field for ZSET value type.");
+        }
+
+        Field zfield = fieldList.get(0);
+        String cursor = SCAN_POINTER_START;
+        do {
+            ScanResult<Tuple> result = client.zscan(keyString, cursor);
+            cursor = result.getCursor();
+            for (Tuple nextElement : result.getResult()) {
+                spiller.writeRows((Block block, int rowNum) -> {
+                    Object zvalue = ValueConverter.convert(zfield, nextElement.getElement());
+                    boolean zsetMatched = block.offerValue(KEY_COLUMN_NAME, rowNum, keyString);
+                    zsetMatched &= block.offerValue(zfield.getName(), rowNum, zvalue);
+                    return zsetMatched ? 1 : 0;
+                });
+            }
+        }
+        while (cursor != null && !END_CURSOR.equals(cursor));
+    }
+}
diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/ValueConverter.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/ValueConverter.java
new file mode 100644
index 0000000000..3225f1e0ef
--- /dev/null
+++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/ValueConverter.java
@@ -0,0 +1,77 @@
+/*-
+ * #%L
+ * athena-redis
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.redis;
+
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+
+import java.io.UnsupportedEncodingException;
+
+/**
+ * Used to convert from Redis' native value/type system to the Apache Arrow type that was configured
+ * for the particular field.
+ */
+public class ValueConverter
+{
+    private ValueConverter() {}
+
+    /**
+     * Allows for coercing types in the event that the schema has evolved or there were other data issues.
+     *
+     * @param field The Apache Arrow field that the value belongs to.
+     * @param origVal The original value from Redis (before any conversion or coercion).
+     * @return The coerced value.
+     */
+    public static Object convert(Field field, String origVal)
+    {
+        if (origVal == null) {
+            return origVal;
+        }
+
+        ArrowType arrowType = field.getType();
+        Types.MinorType minorType = Types.getMinorTypeForArrowType(arrowType);
+
+        switch (minorType) {
+            case VARCHAR:
+                return origVal;
+            case INT:
+            case SMALLINT:
+            case TINYINT:
+                return Integer.valueOf(origVal);
+            case BIGINT:
+                return Long.valueOf(origVal);
+            case FLOAT8:
+                return Double.valueOf(origVal);
+            case FLOAT4:
+                return Float.valueOf(origVal);
+            case BIT:
+                return Boolean.valueOf(origVal);
+            case VARBINARY:
+                try {
+                    return origVal.getBytes("UTF-8");
+                }
+                catch (UnsupportedEncodingException ex) {
+                    throw new RuntimeException(ex);
+                }
+            default:
+                throw new RuntimeException("Unsupported type conversion " + minorType + " field: " + field.getName());
+        }
+    }
+}
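A small sketch of ValueConverter in action; the field names are illustrative:

```java
import com.amazonaws.athena.connector.lambda.data.FieldBuilder;
import org.apache.arrow.vector.types.Types;
import org.apache.arrow.vector.types.pojo.Field;

public class ValueConverterExample
{
    public static void main(String[] args)
    {
        //An INT column: the raw Redis string "42" is coerced to an Integer.
        Field intField = FieldBuilder.newBuilder("intcol", Types.MinorType.INT.getType()).build();
        Object converted = ValueConverter.convert(intField, "42");   //-> Integer.valueOf(42)

        //A BIT column: the raw Redis string "true" is coerced to a Boolean.
        Field boolField = FieldBuilder.newBuilder("bitcol", Types.MinorType.BIT.getType()).build();
        Object asBool = ValueConverter.convert(boolField, "true");   //-> Boolean.TRUE

        System.out.println(converted + " " + asBool);
    }
}
```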
diff --git a/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/ValueType.java b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/ValueType.java
new file mode 100644
index 0000000000..636dd9b869
--- /dev/null
+++ b/athena-redis/src/main/java/com/amazonaws/athena/connectors/redis/ValueType.java
@@ -0,0 +1,74 @@
+/*-
+ * #%L
+ * athena-redis
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.redis;
+
+import java.util.HashMap;
+import java.util.Map;
+
+/**
+ * Defines the supported value types that can be used to define a Redis table in Glue and thus mapped to rows.
+ */
+public enum ValueType
+{
+    /**
+     * The value is a single, literal value which requires no interpretation before conversion.
+     */
+    LITERAL("literal"),
+    /**
+     * The value is actually a set of literal values, so we treat the value as a list of rows, converting
+     * each value independently.
+     */
+    ZSET("zset"),
+    /**
+     * The value is a single multi-column row; the entries in the hash map to columns in the table, but each
+     * hash is still 1 row.
+     */
+    HASH("hash");
+
+    private static final Map<String, ValueType> TYPE_MAP = new HashMap<>();
+
+    static {
+        for (ValueType next : ValueType.values()) {
+            TYPE_MAP.put(next.id, next);
+        }
+    }
+
+    private String id;
+
+    ValueType(String id)
+    {
+        this.id = id;
+    }
+
+    public String getId()
+    {
+        return id;
+    }
+
+    public static ValueType fromId(String id)
+    {
+        ValueType result = TYPE_MAP.get(id);
+        if (result == null) {
+            throw new IllegalArgumentException("Unknown ValueType for id: " + id);
+        }
+
+        return result;
+    }
+}
diff --git a/athena-redis/src/test/java/com/amazonaws/athena/connectors/redis/RedisMetadataHandlerTest.java b/athena-redis/src/test/java/com/amazonaws/athena/connectors/redis/RedisMetadataHandlerTest.java
new file mode 100644
index 0000000000..02b9c9e0b4
--- /dev/null
+++ b/athena-redis/src/test/java/com/amazonaws/athena/connectors/redis/RedisMetadataHandlerTest.java
@@ -0,0 +1,279 @@
+/*-
+ * #%L
+ * athena-redis
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
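A quick illustration of how the KeyType and ValueType lookups above behave; the ids are illustrative:

```java
//Sketch: resolving Glue property strings to enum values; unknown ids fail fast.
KeyType keyType = KeyType.fromId("prefix");       //-> KeyType.PREFIX
ValueType valueType = ValueType.fromId("hash");   //-> ValueType.HASH
//ValueType.fromId("list") would throw IllegalArgumentException("Unknown ValueType for id: list")
```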
+ * #L% + */ +package com.amazonaws.athena.connectors.redis; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.glue.AWSGlue; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import redis.clients.jedis.Jedis; +import redis.clients.jedis.ScanParams; +import redis.clients.jedis.ScanResult; + +import java.util.ArrayList; +import java.util.HashMap; +import java.util.HashSet; +import java.util.List; +import java.util.Map; +import java.util.UUID; + +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_PREFIX_TABLE_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.REDIS_ENDPOINT_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.VALUE_TYPE_TABLE_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.ZSET_KEYS_TABLE_PROP; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyString; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.times; +import static org.mockito.Mockito.verify; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public class RedisMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(RedisMetadataHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private String endpoint = "${endpoint}"; + private String decodedEndpoint = "endpoint:123"; + private RedisMetadataHandler handler; + private BlockAllocator allocator; + + @Mock + private Jedis mockClient; + + @Mock + private AWSGlue mockGlue; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Mock + private JedisPoolFactory mockFactory; + + @Before + public void setUp() + throws Exception + { + 
when(mockFactory.getOrCreateConn(eq(decodedEndpoint))).thenReturn(mockClient); + + handler = new RedisMetadataHandler(mockGlue, new LocalKeyFactory(), mockSecretsManager, mockAthena, mockFactory, "bucket", "prefix"); + allocator = new BlockAllocatorImpl(); + + when(mockSecretsManager.getSecretValue(any(GetSecretValueRequest.class))) + .thenAnswer((InvocationOnMock invocation) -> { + GetSecretValueRequest request = invocation.getArgumentAt(0, GetSecretValueRequest.class); + if ("endpoint".equalsIgnoreCase(request.getSecretId())) { + return new GetSecretValueResult().withSecretString(decodedEndpoint); + } + throw new RuntimeException("Unknown secret " + request.getSecretId()); + }); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doGetTableLayout() + throws Exception + { + logger.info("doGetTableLayout - enter"); + + Schema schema = SchemaBuilder.newBuilder().build(); + + GetTableLayoutRequest req = new GetTableLayoutRequest(identity, "queryId", "default", + new TableName("schema1", "table1"), + new Constraints(new HashMap<>()), + schema, + new HashSet<>()); + + GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req); + + logger.info("doGetTableLayout - {}", res); + Block partitions = res.getPartitions(); + for (int row = 0; row < partitions.getRowCount() && row < 10; row++) { + logger.info("doGetTableLayout:{} {}", row, BlockUtils.rowToString(partitions, row)); + } + + assertTrue(partitions.getRowCount() > 0); + assertEquals(4, partitions.getFields().size()); + + logger.info("doGetTableLayout: partitions[{}]", partitions.getRowCount()); + } + + @Test + public void doGetSplitsZset() + { + logger.info("doGetSplitsPrefix: enter"); + + //3 prefixes for this table + String prefixes = "prefix1-*,prefix2-*, prefix3-*"; + + //4 zsets per prefix + when(mockClient.scan(anyString(), any(ScanParams.class))).then((InvocationOnMock invocationOnMock) -> { + String cursor = (String) invocationOnMock.getArguments()[0]; + if (cursor == null || cursor.equals("0")) { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("1", result); + } + else { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("0", result); + } + }); + + //100 keys per zset + when(mockClient.zcount(anyString(), anyString(), anyString())).thenReturn(200L); + + List partitionCols = new ArrayList<>(); + + Schema schema = SchemaBuilder.newBuilder() + .addField("partitionId", Types.MinorType.INT.getType()) + .addStringField(REDIS_ENDPOINT_PROP) + .addStringField(VALUE_TYPE_TABLE_PROP) + .addStringField(KEY_PREFIX_TABLE_PROP) + .addStringField(ZSET_KEYS_TABLE_PROP) + .build(); + + Block partitions = allocator.createBlock(schema); + partitions.setValue(REDIS_ENDPOINT_PROP, 0, endpoint); + partitions.setValue(VALUE_TYPE_TABLE_PROP, 0, null); + partitions.setValue(KEY_PREFIX_TABLE_PROP, 0, null); + partitions.setValue(ZSET_KEYS_TABLE_PROP, 0, prefixes); + partitions.setRowCount(1); + + String continuationToken = null; + GetSplitsRequest originalReq = new GetSplitsRequest(identity, + "queryId", + "catalog_name", + new TableName("schema", "table_name"), + partitions, + partitionCols, + new Constraints(new HashMap<>()), + null); + + GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken); + + logger.info("doGetSplitsPrefix: req[{}]", req); + + MetadataResponse 
rawResponse = handler.doGetSplits(allocator, req); + assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType()); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + continuationToken = response.getContinuationToken(); + + logger.info("doGetSplitsPrefix: continuationToken[{}] - numSplits[{}]", + new Object[] {continuationToken, response.getSplits().size()}); + + assertEquals("Continuation criteria violated", 120, response.getSplits().size()); + assertTrue("Continuation criteria violated", response.getContinuationToken() == null); + + verify(mockClient, times(6)).scan(anyString(), any(ScanParams.class)); + logger.info("doGetSplitsPrefix: exit"); + } + + @Test + public void doGetSplitsPrefix() + { + logger.info("doGetSplitsPrefix: enter"); + + Schema schema = SchemaBuilder.newBuilder() + .addField("partitionId", Types.MinorType.INT.getType()) + .addStringField(REDIS_ENDPOINT_PROP) + .addStringField(VALUE_TYPE_TABLE_PROP) + .addStringField(KEY_PREFIX_TABLE_PROP) + .addStringField(ZSET_KEYS_TABLE_PROP) + .build(); + + Block partitions = allocator.createBlock(schema); + partitions.setValue(REDIS_ENDPOINT_PROP, 0, endpoint); + partitions.setValue(VALUE_TYPE_TABLE_PROP, 0, null); + partitions.setValue(KEY_PREFIX_TABLE_PROP, 0, "prefix1-*,prefix2-*, prefix3-*"); + partitions.setValue(ZSET_KEYS_TABLE_PROP, 0, null); + partitions.setRowCount(1); + + String continuationToken = null; + GetSplitsRequest originalReq = new GetSplitsRequest(identity, + "queryId", + "catalog_name", + new TableName("schema", "table_name"), + partitions, + new ArrayList<>(), + new Constraints(new HashMap<>()), + null); + + GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken); + + logger.info("doGetSplitsPrefix: req[{}]", req); + + MetadataResponse rawResponse = handler.doGetSplits(allocator, req); + assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType()); + + GetSplitsResponse response = (GetSplitsResponse) rawResponse; + continuationToken = response.getContinuationToken(); + + logger.info("doGetSplitsPrefix: continuationToken[{}] - numSplits[{}]", + new Object[] {continuationToken, response.getSplits().size()}); + + assertTrue("Continuation criteria violated", response.getSplits().size() == 3); + assertTrue("Continuation criteria violated", response.getContinuationToken() == null); + + logger.info("doGetSplitsPrefix: exit"); + } +} diff --git a/athena-redis/src/test/java/com/amazonaws/athena/connectors/redis/RedisRecordHandlerTest.java b/athena-redis/src/test/java/com/amazonaws/athena/connectors/redis/RedisRecordHandlerTest.java new file mode 100644 index 0000000000..4183ed6c11 --- /dev/null +++ b/athena-redis/src/test/java/com/amazonaws/athena/connectors/redis/RedisRecordHandlerTest.java @@ -0,0 +1,475 @@ +/*- + * #%L + * athena-redis + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.redis; + +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.model.GetSecretValueRequest; +import com.amazonaws.services.secretsmanager.model.GetSecretValueResult; +import com.google.common.collect.ImmutableList; +import com.google.common.io.ByteStreams; +import org.apache.arrow.vector.complex.reader.FieldReader; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; +import redis.clients.jedis.Jedis; +import redis.clients.jedis.ScanParams; +import redis.clients.jedis.ScanResult; +import redis.clients.jedis.Tuple; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import java.util.Map; +import java.util.UUID; +import java.util.concurrent.atomic.AtomicLong; + +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_COLUMN_NAME; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_PREFIX_TABLE_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.KEY_TYPE; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.REDIS_ENDPOINT_PROP; +import static com.amazonaws.athena.connectors.redis.RedisMetadataHandler.VALUE_TYPE_TABLE_PROP; +import static org.junit.Assert.*; +import static org.mockito.Matchers.any; +import static org.mockito.Matchers.anyObject; +import static org.mockito.Matchers.anyString; +import static org.mockito.Matchers.eq; +import static org.mockito.Mockito.mock; +import static org.mockito.Mockito.when; + +@RunWith(MockitoJUnitRunner.class) +public 
class RedisRecordHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(RedisRecordHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private String endpoint = "${endpoint}"; + private String decodedEndpoint = "endpoint:123"; + private RedisRecordHandler handler; + private BlockAllocator allocator; + private List mockS3Storage = new ArrayList<>(); + private AmazonS3 amazonS3; + private S3BlockSpillReader spillReader; + private EncryptionKeyFactory keyFactory = new LocalKeyFactory(); + + @Mock + private Jedis mockClient; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private JedisPoolFactory mockFactory; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + { + logger.info("setUpBefore - enter"); + when(mockFactory.getOrCreateConn(eq(decodedEndpoint))).thenReturn(mockClient); + + allocator = new BlockAllocatorImpl(); + + amazonS3 = mock(AmazonS3.class); + + when(amazonS3.putObject(anyObject(), anyObject(), anyObject(), anyObject())) + .thenAnswer((InvocationOnMock invocationOnMock) -> { + InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2]; + ByteHolder byteHolder = new ByteHolder(); + byteHolder.setBytes(ByteStreams.toByteArray(inputStream)); + mockS3Storage.add(byteHolder); + return mock(PutObjectResult.class); + }); + + when(amazonS3.getObject(anyString(), anyString())) + .thenAnswer((InvocationOnMock invocationOnMock) -> { + S3Object mockObject = mock(S3Object.class); + ByteHolder byteHolder = mockS3Storage.get(0); + mockS3Storage.remove(0); + when(mockObject.getObjectContent()).thenReturn( + new S3ObjectInputStream( + new ByteArrayInputStream(byteHolder.getBytes()), null)); + return mockObject; + }); + + when(mockSecretsManager.getSecretValue(any(GetSecretValueRequest.class))) + .thenAnswer((InvocationOnMock invocation) -> { + GetSecretValueRequest request = invocation.getArgumentAt(0, GetSecretValueRequest.class); + if ("endpoint".equalsIgnoreCase(request.getSecretId())) { + return new GetSecretValueResult().withSecretString(decodedEndpoint); + } + throw new RuntimeException("Unknown secret " + request.getSecretId()); + }); + + handler = new RedisRecordHandler(amazonS3, mockSecretsManager, mockAthena, mockFactory); + spillReader = new S3BlockSpillReader(amazonS3, allocator); + + logger.info("setUpBefore - exit"); + } + + @After + public void after() + { + allocator.close(); + } + + @Test + public void doReadRecordsLiteral() + throws Exception + { + logger.info("doReadRecordsLiteral: enter"); + + //4 keys per prefix + when(mockClient.scan(anyString(), any(ScanParams.class))).then((InvocationOnMock invocationOnMock) -> { + String cursor = (String) invocationOnMock.getArguments()[0]; + if (cursor == null || cursor.equals("0")) { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("1", result); + } + else { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("0", result); + } + }); + + AtomicLong value = new AtomicLong(0); + when(mockClient.get(anyString())) + .thenAnswer((InvocationOnMock invocationOnMock) -> String.valueOf(value.getAndIncrement())); + + String catalog = "catalog1"; + String schema = "schema1"; + String table = "table1"; + + S3SpillLocation splitLoc = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) 
+ .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + .build(); + + Split split = Split.newBuilder(splitLoc, keyFactory.create()) + .add(REDIS_ENDPOINT_PROP, endpoint) + .add(KEY_TYPE, KeyType.PREFIX.getId()) + .add(KEY_PREFIX_TABLE_PROP, "key-*") + .add(VALUE_TYPE_TABLE_PROP, ValueType.LITERAL.getId()) + .build(); + + Schema schemaForRead = SchemaBuilder.newBuilder() + .addField("_key_", Types.MinorType.VARCHAR.getType()) + .addField("intcol", Types.MinorType.INT.getType()) + .build(); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("intcol", SortedRangeSet.copyOf(Types.MinorType.INT.getType(), + ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 1)), false)); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + catalog, + "queryId-" + System.currentTimeMillis(), + new TableName(schema, table), + schemaForRead, + split, + new Constraints(constraintsMap), + 100_000_000_000L, //100GB don't expect this to spill + 100_000_000_000L + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("doReadRecordsLiteral: rows[{}]", response.getRecordCount()); + + logger.info("doReadRecordsLiteral: {}", BlockUtils.rowToString(response.getRecords(), 0)); + assertTrue(response.getRecords().getRowCount() == 2); + + FieldReader keyReader = response.getRecords().getFieldReader(KEY_COLUMN_NAME); + keyReader.setPosition(0); + assertNotNull(keyReader.readText().toString()); + + FieldReader intCol = response.getRecords().getFieldReader("intcol"); + intCol.setPosition(0); + assertNotNull(intCol.readInteger()); + + logger.info("doReadRecordsLiteral: exit"); + } + + @Test + public void doReadRecordsHash() + throws Exception + { + logger.info("doReadRecordsHash: enter"); + + //4 keys per prefix + when(mockClient.scan(anyString(), any(ScanParams.class))).then((InvocationOnMock invocationOnMock) -> { + String cursor = (String) invocationOnMock.getArguments()[0]; + if (cursor == null || cursor.equals("0")) { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("1", result); + } + else { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("0", result); + } + }); + + //4 columns per key + AtomicLong intColVal = new AtomicLong(0); + when(mockClient.hgetAll(anyString())).then((InvocationOnMock invocationOnMock) -> { + Map result = new HashMap<>(); + result.put("intcol", String.valueOf(intColVal.getAndIncrement())); + result.put("stringcol", UUID.randomUUID().toString()); + result.put("extracol", UUID.randomUUID().toString()); + return result; + }); + + AtomicLong value = new AtomicLong(0); + when(mockClient.get(anyString())) + .thenAnswer((InvocationOnMock invocationOnMock) -> String.valueOf(value.getAndIncrement())); + + String catalog = "catalog1"; + String schema = "schema1"; + String table = "table1"; + + S3SpillLocation splitLoc = S3SpillLocation.newBuilder() + .withBucket(UUID.randomUUID().toString()) + .withSplitId(UUID.randomUUID().toString()) + .withQueryId(UUID.randomUUID().toString()) + .withIsDirectory(true) + 
.build(); + + Split split = Split.newBuilder(splitLoc, keyFactory.create()) + .add(REDIS_ENDPOINT_PROP, endpoint) + .add(KEY_TYPE, KeyType.PREFIX.getId()) + .add(KEY_PREFIX_TABLE_PROP, "key-*") + .add(VALUE_TYPE_TABLE_PROP, ValueType.HASH.getId()) + .build(); + + Schema schemaForRead = SchemaBuilder.newBuilder() + .addField("_key_", Types.MinorType.VARCHAR.getType()) + .addField("intcol", Types.MinorType.INT.getType()) + .addField("stringcol", Types.MinorType.VARCHAR.getType()) + .build(); + + Map constraintsMap = new HashMap<>(); + constraintsMap.put("intcol", SortedRangeSet.copyOf(Types.MinorType.INT.getType(), + ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 1)), false)); + + ReadRecordsRequest request = new ReadRecordsRequest(identity, + catalog, + "queryId-" + System.currentTimeMillis(), + new TableName(schema, table), + schemaForRead, + split, + new Constraints(constraintsMap), + 100_000_000_000L, //100GB don't expect this to spill + 100_000_000_000L + ); + + RecordResponse rawResponse = handler.doReadRecords(allocator, request); + + assertTrue(rawResponse instanceof ReadRecordsResponse); + + ReadRecordsResponse response = (ReadRecordsResponse) rawResponse; + logger.info("doReadRecordsHash: rows[{}]", response.getRecordCount()); + + logger.info("doReadRecordsHash: {}", BlockUtils.rowToString(response.getRecords(), 0)); + assertTrue(response.getRecords().getRowCount() == 5); + assertTrue(response.getRecords().getFields().size() == schemaForRead.getFields().size()); + + FieldReader keyReader = response.getRecords().getFieldReader(KEY_COLUMN_NAME); + keyReader.setPosition(0); + assertNotNull(keyReader.readText()); + + FieldReader intCol = response.getRecords().getFieldReader("intcol"); + intCol.setPosition(0); + assertNotNull(intCol.readInteger()); + + FieldReader stringCol = response.getRecords().getFieldReader("stringcol"); + stringCol.setPosition(0); + assertNotNull(stringCol.readText()); + + logger.info("doReadRecordsHash: exit"); + } + + @Test + public void doReadRecordsZset() + throws Exception + { + logger.info("doReadRecordsZset: enter"); + + //4 keys per prefix + when(mockClient.scan(anyString(), any(ScanParams.class))).then((InvocationOnMock invocationOnMock) -> { + String cursor = (String) invocationOnMock.getArguments()[0]; + if (cursor == null || cursor.equals("0")) { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("1", result); + } + else { + List result = new ArrayList<>(); + result.add(UUID.randomUUID().toString()); + return new ScanResult<>("0", result); + } + }); + + //4 rows per key + when(mockClient.zscan(anyString(), anyString())).then((InvocationOnMock invocationOnMock) -> { + String cursor = (String) invocationOnMock.getArguments()[1]; + if (cursor == null || cursor.equals("0")) { + List result = new ArrayList<>(); + result.add(new Tuple("1", 0.0D)); + result.add(new Tuple("2", 0.0D)); + result.add(new Tuple("3", 0.0D)); + return new ScanResult<>("1", result); + } + else { + List result = new ArrayList<>(); + result.add(new Tuple("4", 0.0D)); + return new ScanResult<>("0", result); + } + }); + + AtomicLong value = new AtomicLong(0); + when(mockClient.get(anyString())) + .thenAnswer((InvocationOnMock invocationOnMock) -> String.valueOf(value.getAndIncrement())); + + String catalog = "catalog1"; + String schema = "schema1"; + String table = "table1"; + + S3SpillLocation splitLoc = 
+
+    @Test
+    public void doReadRecordsZset()
+            throws Exception
+    {
+        logger.info("doReadRecordsZset: enter");
+
+        //4 keys per prefix, returned across two scan pages (3, then 1)
+        when(mockClient.scan(anyString(), any(ScanParams.class))).then((InvocationOnMock invocationOnMock) -> {
+            String cursor = (String) invocationOnMock.getArguments()[0];
+            if (cursor == null || cursor.equals("0")) {
+                List<String> result = new ArrayList<>();
+                result.add(UUID.randomUUID().toString());
+                result.add(UUID.randomUUID().toString());
+                result.add(UUID.randomUUID().toString());
+                return new ScanResult<>("1", result);
+            }
+            else {
+                List<String> result = new ArrayList<>();
+                result.add(UUID.randomUUID().toString());
+                return new ScanResult<>("0", result);
+            }
+        });
+
+        //4 rows per key, returned across two zscan pages (3, then 1)
+        when(mockClient.zscan(anyString(), anyString())).then((InvocationOnMock invocationOnMock) -> {
+            String cursor = (String) invocationOnMock.getArguments()[1];
+            if (cursor == null || cursor.equals("0")) {
+                List<Tuple> result = new ArrayList<>();
+                result.add(new Tuple("1", 0.0D));
+                result.add(new Tuple("2", 0.0D));
+                result.add(new Tuple("3", 0.0D));
+                return new ScanResult<>("1", result);
+            }
+            else {
+                List<Tuple> result = new ArrayList<>();
+                result.add(new Tuple("4", 0.0D));
+                return new ScanResult<>("0", result);
+            }
+        });
+
+        AtomicLong value = new AtomicLong(0);
+        when(mockClient.get(anyString()))
+                .thenAnswer((InvocationOnMock invocationOnMock) -> String.valueOf(value.getAndIncrement()));
+
+        String catalog = "catalog1";
+        String schema = "schema1";
+        String table = "table1";
+
+        S3SpillLocation splitLoc = S3SpillLocation.newBuilder()
+                .withBucket(UUID.randomUUID().toString())
+                .withSplitId(UUID.randomUUID().toString())
+                .withQueryId(UUID.randomUUID().toString())
+                .withIsDirectory(true)
+                .build();
+
+        Split split = Split.newBuilder(splitLoc, keyFactory.create())
+                .add(REDIS_ENDPOINT_PROP, endpoint)
+                .add(KEY_TYPE, KeyType.PREFIX.getId())
+                .add(KEY_PREFIX_TABLE_PROP, "key-*")
+                .add(VALUE_TYPE_TABLE_PROP, ValueType.ZSET.getId())
+                .build();
+
+        Schema schemaForRead = SchemaBuilder.newBuilder()
+                .addField("_key_", Types.MinorType.VARCHAR.getType())
+                .addField("intcol", Types.MinorType.INT.getType())
+                .build();
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("intcol", SortedRangeSet.copyOf(Types.MinorType.INT.getType(),
+                ImmutableList.of(Range.greaterThan(allocator, Types.MinorType.INT.getType(), 1)), false));
+
+        ReadRecordsRequest request = new ReadRecordsRequest(identity,
+                catalog,
+                "queryId-" + System.currentTimeMillis(),
+                new TableName(schema, table),
+                schemaForRead,
+                split,
+                new Constraints(constraintsMap),
+                100_000_000_000L, //100GB don't expect this to spill
+                100_000_000_000L
+        );
+
+        RecordResponse rawResponse = handler.doReadRecords(allocator, request);
+
+        assertTrue(rawResponse instanceof ReadRecordsResponse);
+
+        ReadRecordsResponse response = (ReadRecordsResponse) rawResponse;
+        logger.info("doReadRecordsZset: rows[{}]", response.getRecordCount());
+
+        logger.info("doReadRecordsZset: {}", BlockUtils.rowToString(response.getRecords(), 0));
+        assertTrue(response.getRecords().getRowCount() == 12);
+
+        FieldReader keyReader = response.getRecords().getFieldReader(KEY_COLUMN_NAME);
+        keyReader.setPosition(0);
+        assertNotNull(keyReader.readText());
+
+        FieldReader intCol = response.getRecords().getFieldReader("intcol");
+        intCol.setPosition(0);
+        assertNotNull(intCol.readInteger());
+
+        logger.info("doReadRecordsZset: exit");
+    }
+
+    private class ByteHolder
+    {
+        private byte[] bytes;
+
+        public void setBytes(byte[] bytes)
+        {
+            this.bytes = bytes;
+        }
+
+        public byte[] getBytes()
+        {
+            return bytes;
+        }
+    }
+}
diff --git a/athena-tpcds/LICENSE.txt b/athena-tpcds/LICENSE.txt
new file mode 100644
index 0000000000..67db858821
--- /dev/null
+++ b/athena-tpcds/LICENSE.txt
@@ -0,0 +1,175 @@
+
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+
+      "Legal Entity" shall mean the union of the acting entity and all
+      other entities that control, are controlled by, or are under common
+      control with that entity. For the purposes of this definition,
+      "control" means (i) the power, direct or indirect, to cause the
+      direction or management of such entity, whether by contract or
+      otherwise, or (ii) ownership of fifty percent (50%) or more of the
+      outstanding shares, or (iii) beneficial ownership of such entity.
+
+      "You" (or "Your") shall mean an individual or Legal Entity
+      exercising permissions granted by this License.
+
+      "Source" form shall mean the preferred form for making modifications,
+      including but not limited to software source code, documentation
+      source, and configuration files.
+ + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. 
You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. This License does not grant permission to use the trade + names, trademarks, service marks, or product names of the Licensor, + except as required for reasonable and customary use in describing the + origin of the Work and reproducing the content of the NOTICE file. + + 7. Disclaimer of Warranty. Unless required by applicable law or + agreed to in writing, Licensor provides the Work (and each + Contributor provides its Contributions) on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or + implied, including, without limitation, any warranties or conditions + of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A + PARTICULAR PURPOSE. You are solely responsible for determining the + appropriateness of using or redistributing the Work and assume any + risks associated with Your exercise of permissions under this License. + + 8. Limitation of Liability. 
In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability.
diff --git a/athena-tpcds/README.md b/athena-tpcds/README.md
new file mode 100644
index 0000000000..7013a5e437
--- /dev/null
+++ b/athena-tpcds/README.md
@@ -0,0 +1,135 @@
+# Amazon Athena TPC-DS Connector
+
+This connector enables Amazon Athena to communicate with a source of randomly generated TPC-DS data for use in benchmarking and functional testing.
+
+## Usage
+
+### Parameters
+
+The Athena TPC-DS Connector exposes several configuration options via Lambda environment variables. More detail on the available parameters can be found below.
+
+1. **spill_bucket** - When the data returned by your Lambda function exceeds Lambda’s limits, this is the bucket that the data will be written to for Athena to read the excess from. (e.g. my_bucket)
+2. **spill_prefix** - (Optional) Defaults to a sub-folder in your bucket called 'athena-federation-spill'. Used in conjunction with spill_bucket, this is the path within the above bucket that large responses are spilled to. You should configure an S3 lifecycle on this location to delete old spills after X days/hours.
+3. **kms_key_id** - (Optional) By default any data that is spilled to S3 is encrypted using AES-GCM and a randomly generated key. Setting a KMS Key ID allows your Lambda function to use KMS for key generation for a stronger source of encryption keys. (e.g. a7e63k4b-8loc-40db-a2a1-4d0en2cd8331)
+4. **disable_spill_encryption** - (Optional) Defaults to False so that any data that is spilled to S3 is encrypted using AES-GCM either with a randomly generated key or using KMS to generate keys. Setting this to true disables spill encryption. You may wish to disable encryption for improved performance, especially if your spill location in S3 uses S3 Server Side Encryption. (e.g. True or False)
+
+### Databases & Tables
+
+The Athena TPC-DS Connector generates a TPC-DS compliant database at one of five ("tpcds1", "tpcds10", "tpcds100", "tpcds250", "tpcds1000") scale factors.
+
+For a complete list of tables and columns, please use `show tables` and `describe table` queries; a summary of the tables appears below.
You can find copies of TPC-DS queries that are compatible with this generated schema and data in the src/main/resources/queries directory of this module.
+
+1. call_center
+1. catalog_page
+1. catalog_returns
+1. catalog_sales
+1. customer
+1. customer_address
+1. customer_demographics
+1. date_dim
+1. dbgen_version
+1. household_demographics
+1. income_band
+1. inventory
+1. item
+1. promotion
+1. reason
+1. ship_mode
+1. store
+1. store_returns
+1. store_sales
+1. time_dim
+1. warehouse
+1. web_page
+1. web_returns
+1. web_sales
+1. web_site
+
+The query below is one example that is set up for use with a catalog called tpcds.
+
+```sql
+SELECT
+  cd_gender,
+  cd_marital_status,
+  cd_education_status,
+  count(*) cnt1,
+  cd_purchase_estimate,
+  count(*) cnt2,
+  cd_credit_rating,
+  count(*) cnt3,
+  cd_dep_count,
+  count(*) cnt4,
+  cd_dep_employed_count,
+  count(*) cnt5,
+  cd_dep_college_count,
+  count(*) cnt6
+FROM
+  "lambda:tpcds".tpcds1.customer c, "lambda:tpcds".tpcds1.customer_address ca, "lambda:tpcds".tpcds1.customer_demographics
+WHERE
+  c.c_current_addr_sk = ca.ca_address_sk AND
+  ca_county IN ('Rush County', 'Toole County', 'Jefferson County',
+                'Dona Ana County', 'La Porte County') AND
+  cd_demo_sk = c.c_current_cdemo_sk AND
+  exists(SELECT *
+         FROM "lambda:tpcds".tpcds1.store_sales, "lambda:tpcds".tpcds1.date_dim
+         WHERE c.c_customer_sk = ss_customer_sk AND
+           ss_sold_date_sk = d_date_sk AND
+           d_year = 2002 AND
+           d_moy BETWEEN 1 AND 1 + 3) AND
+  (exists(SELECT *
+          FROM "lambda:tpcds".tpcds1.web_sales, "lambda:tpcds".tpcds1.date_dim
+          WHERE c.c_customer_sk = ws_bill_customer_sk AND
+            ws_sold_date_sk = d_date_sk AND
+            d_year = 2002 AND
+            d_moy BETWEEN 1 AND 1 + 3) OR
+    exists(SELECT *
+           FROM "lambda:tpcds".tpcds1.catalog_sales, "lambda:tpcds".tpcds1.date_dim
+           WHERE c.c_customer_sk = cs_ship_customer_sk AND
+             cs_sold_date_sk = d_date_sk AND
+             d_year = 2002 AND
+             d_moy BETWEEN 1 AND 1 + 3))
+GROUP BY cd_gender,
+  cd_marital_status,
+  cd_education_status,
+  cd_purchase_estimate,
+  cd_credit_rating,
+  cd_dep_count,
+  cd_dep_employed_count,
+  cd_dep_college_count
+ORDER BY cd_gender,
+  cd_marital_status,
+  cd_education_status,
+  cd_purchase_estimate,
+  cd_credit_rating,
+  cd_dep_count,
+  cd_dep_employed_count,
+  cd_dep_college_count
+LIMIT 100
+```
+
+### Required Permissions
+
+Review the "Policies" section of the athena-tpcds.yaml file for full details on the IAM Policies required by this connector. A brief summary is below.
+
+1. S3 Write Access - In order to successfully handle large queries, the connector requires write access to a location in S3.
+1. Athena GetQueryExecution - The connector uses this access to fast-fail when the upstream Athena query has terminated.
+
+### Deploying The Connector
+
+To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-tpcds dir, run `mvn clean install`.
+3. From the athena-tpcds dir, run `../tools/publish.sh S3_BUCKET_NAME athena-tpcds` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET in the command is where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with permission to deploy instances of the connector via the 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo)
+4. Try running a query like the one below in Athena:
+```sql
+select * from "lambda:<function_name>".schema.table limit 100
+```
+
+## Performance
+
+The Athena TPC-DS Connector will attempt to parallelize queries based on the scale factor you have chosen. Predicate Pushdown is performed within the Lambda function.
+
+## License
+
+This project is licensed under the Apache-2.0 License.
\ No newline at end of file
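As a concrete illustration of the Performance note above: the connector's split count is a simple function of the scale factor, computed in TPCDSMetadataHandler.doGetSplits later in this diff. A minimal sketch of that arithmetic (the wrapper class name and the choice of scale factor 1000 are illustrative only):

```java
public class SplitMathSketch
{
    public static void main(String[] args)
    {
        // Mirrors the formula in TPCDSMetadataHandler.doGetSplits: one split per
        // 48 units of scale factor, rounded up. For the tpcds1000 schema this
        // yields ceil(1000 / 48) = 21 splits that can be read in parallel.
        int scaleFactor = 1000;
        int totalSplits = (int) Math.ceil((double) scaleFactor / 48D);
        System.out.println(totalSplits); // prints 21
    }
}
```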
+      - S3CrudPolicy:
+          BucketName: !Ref SpillBucket
\ No newline at end of file
diff --git a/athena-tpcds/pom.xml b/athena-tpcds/pom.xml
new file mode 100644
index 0000000000..129c923499
--- /dev/null
+++ b/athena-tpcds/pom.xml
@@ -0,0 +1,57 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-tpcds</artifactId>
+    <version>1.0</version>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.teradata.tpcds</groupId>
+            <artifactId>tpcds</artifactId>
+            <version>1.2</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSCompositeHandler.java b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSCompositeHandler.java
new file mode 100644
index 0000000000..6522776cfd
--- /dev/null
+++ b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSCompositeHandler.java
@@ -0,0 +1,31 @@
+/*-
+ * #%L
+ * athena-tpcds
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.tpcds;
+
+import com.amazonaws.athena.connector.lambda.handlers.CompositeHandler;
+
+public class TPCDSCompositeHandler
+        extends CompositeHandler
+{
+    public TPCDSCompositeHandler()
+    {
+        super(new TPCDSMetadataHandler(), new TPCDSRecordHandler());
+    }
+}
diff --git a/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSMetadataHandler.java b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSMetadataHandler.java
new file mode 100644
index 0000000000..ebd2d591cb
--- /dev/null
+++ b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSMetadataHandler.java
@@ -0,0 +1,199 @@
+/*-
+ * #%L
+ * athena-tpcds
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.tpcds; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockWriter; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.handlers.MetadataHandler; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.collect.ImmutableSet; +import com.teradata.tpcds.Table; +import com.teradata.tpcds.column.Column; +import org.apache.arrow.util.VisibleForTesting; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Collections; +import java.util.HashSet; +import java.util.List; +import java.util.Set; +import java.util.stream.Collectors; + +/** + * Handles metadata requests for the Athena TPC-DS Connector. + *
<p>
+ * For more detail, please see the module's README.md; some notable characteristics of this class include:
+ * <p>
+ * 1. Provides 5 schemas, each representing a different scale factor (1, 10, 100, 250, 1000)
+ * 2. Each schema has 25 TPC-DS tables
+ * 3. Each table is divided into a number of splits proportional to the scale factor (see doGetSplits)
+ */
+public class TPCDSMetadataHandler
+        extends MetadataHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(TPCDSMetadataHandler.class);
+
+    //The name of the field that contains the number of the split. This is used for parallelizing data generation.
+    protected static final String SPLIT_NUMBER_FIELD = "splitNum";
+    //The name of the field that contains the total number of splits that were generated.
+    //This is used for parallelizing data generation.
+    protected static final String SPLIT_TOTAL_NUMBER_FIELD = "totalNumSplits";
+    //This is the name of the field that contains the scale factor of the schema used in the request.
+    protected static final String SPLIT_SCALE_FACTOR_FIELD = "scaleFactor";
+    //The list of valid schemas, which also convey the scale factor
+    protected static final Set<String> SCHEMA_NAMES = ImmutableSet.of("tpcds1", "tpcds10", "tpcds100", "tpcds250", "tpcds1000");
+
+    /**
+     * Used to aid in debugging. Athena will use this name in conjunction with your catalog id
+     * to correlate relevant query errors.
+     */
+    private static final String SOURCE_TYPE = "tpcds";
+
+    public TPCDSMetadataHandler()
+    {
+        super(SOURCE_TYPE);
+    }
+
+    @VisibleForTesting
+    protected TPCDSMetadataHandler(EncryptionKeyFactory keyFactory,
+            AWSSecretsManager secretsManager,
+            AmazonAthena athena,
+            String spillBucket,
+            String spillPrefix)
+    {
+        super(keyFactory, secretsManager, athena, SOURCE_TYPE, spillBucket, spillPrefix);
+    }
+
+    /**
+     * Returns our static list of schemas, which correspond to the scale factor of the dataset we will generate.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public ListSchemasResponse doListSchemaNames(BlockAllocator allocator, ListSchemasRequest request)
+    {
+        logger.info("doListSchemaNames: enter - " + request);
+        return new ListSchemasResponse(request.getCatalogName(), SCHEMA_NAMES);
+    }
+
+    /**
+     * Used to get the list of static tables from Teradata's TPC-DS generator.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public ListTablesResponse doListTables(BlockAllocator allocator, ListTablesRequest request)
+    {
+        logger.info("doListTables: enter - " + request);
+
+        List<TableName> tables = Table.getBaseTables().stream()
+                .map(next -> new TableName(request.getSchemaName(), next.getName()))
+                .collect(Collectors.toList());
+
+        return new ListTablesResponse(request.getCatalogName(), tables);
+    }
+
+    /**
+     * Used to get the definition (field names, types, descriptions, etc...) of a Table using the static
+     * metadata provided by Teradata's TPC-DS generator.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public GetTableResponse doGetTable(BlockAllocator allocator, GetTableRequest request)
+    {
+        logger.info("doGetTable: enter - " + request);
+
+        Table table = TPCDSUtils.validateTable(request.getTableName());
+
+        SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
+        for (Column nextCol : table.getColumns()) {
+            schemaBuilder.addField(TPCDSUtils.convertColumn(nextCol));
+        }
+
+        return new GetTableResponse(request.getCatalogName(),
+                request.getTableName(),
+                schemaBuilder.build(),
+                Collections.EMPTY_SET);
+    }
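A note on the schema-name convention this handler relies on: the schema name itself encodes the scale factor, and TPCDSUtils.extractScaleFactor (defined later in this diff) parses it for doGetSplits below. A tiny, hypothetical usage sketch (the wrapper class is illustrative; TPCDSUtils is the real utility class from this module):

```java
public class ScaleFactorSketch
{
    public static void main(String[] args)
    {
        // Schema names double as scale-factor encodings: "tpcds250" -> 250.
        // A name that is not of the form "tpcds<N>" throws a RuntimeException.
        System.out.println(TPCDSUtils.extractScaleFactor("tpcds250")); // prints 250
    }
}
```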
+
+    /**
+     * We do not support partitioning at this time since Partition Pruning Performance is not part of the dimensions
+     * we test using TPC-DS. By making this a NoOp, the Athena Federation SDK will automatically generate a single
+     * placeholder partition to signal to Athena that there is indeed data that needs to be read and that it should
+     * call GetSplits.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public void getPartitions(BlockWriter blockWriter, GetTableLayoutRequest request, QueryStatusChecker queryStatusChecker)
+            throws Exception
+    {
+        //NoOp
+    }
+
+    /**
+     * Used to split up the reads required to scan the requested batch of partition(s). We are generating a fixed
+     * number of splits based on the scale factor.
+     *
+     * @see MetadataHandler
+     */
+    @Override
+    public GetSplitsResponse doGetSplits(BlockAllocator allocator, GetSplitsRequest request)
+    {
+        String catalogName = request.getCatalogName();
+        int scaleFactor = TPCDSUtils.extractScaleFactor(request.getTableName().getSchemaName());
+        int totalSplits = (int) Math.ceil(((double) scaleFactor / 48D)); //each split would be ~48MB
+
+        logger.info("doGetSplits: Generating {} splits for {} at scale factor {}",
+                totalSplits, request.getTableName(), scaleFactor);
+
+        int nextSplit = request.getContinuationToken() == null ? 0 : Integer.parseInt(request.getContinuationToken());
+        Set<Split> splits = new HashSet<>();
+        for (int i = nextSplit; i < totalSplits; i++) {
+            splits.add(Split.newBuilder(makeSpillLocation(request), makeEncryptionKey())
+                    .add(SPLIT_NUMBER_FIELD, String.valueOf(i))
+                    .add(SPLIT_TOTAL_NUMBER_FIELD, String.valueOf(totalSplits))
+                    .add(SPLIT_SCALE_FACTOR_FIELD, String.valueOf(scaleFactor))
+                    .build());
+            if (splits.size() >= 1000) {
+                return new GetSplitsResponse(catalogName, splits, String.valueOf(i + 1));
+            }
+        }
+
+        logger.info("doGetSplits: exit - " + splits.size());
+        return new GetSplitsResponse(catalogName, splits);
+    }
+}
diff --git a/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSRecordHandler.java b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSRecordHandler.java
new file mode 100644
index 0000000000..ffd0224e79
--- /dev/null
+++ b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSRecordHandler.java
@@ -0,0 +1,229 @@
+/*-
+ * #%L
+ * athena-tpcds
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L% + */ +package com.amazonaws.athena.connectors.tpcds; + +import com.amazonaws.athena.connector.lambda.QueryStatusChecker; +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockSpiller; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.handlers.RecordHandler; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.athena.AmazonAthenaClientBuilder; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.AmazonS3ClientBuilder; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.amazonaws.services.secretsmanager.AWSSecretsManagerClientBuilder; +import com.teradata.tpcds.Results; +import com.teradata.tpcds.Session; +import com.teradata.tpcds.Table; +import com.teradata.tpcds.column.Column; +import com.teradata.tpcds.column.ColumnType; +import org.apache.arrow.util.VisibleForTesting; +import org.apache.arrow.vector.types.pojo.Field; +import org.apache.arrow.vector.types.pojo.Schema; +import org.joda.time.LocalDate; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.math.BigDecimal; +import java.util.Date; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import java.util.Optional; + +import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_NUMBER_FIELD; +import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_SCALE_FACTOR_FIELD; +import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_TOTAL_NUMBER_FIELD; +import static com.teradata.tpcds.Results.constructResults; + +/** + * Handles data read record requests for the Athena TPC-DS Connector. + *
<p>
+ * For more detail, please see the module's README.md; some notable characteristics of this class include:
+ * <p>
+ * 1. Generates data for the requested table on the fly.
+ * 2. Applies constraints to the data as it is generated, emulating predicate-pushdown.
+ */
+public class TPCDSRecordHandler
+        extends RecordHandler
+{
+    private static final Logger logger = LoggerFactory.getLogger(TPCDSRecordHandler.class);
+
+    /**
+     * Used to aid in debugging. Athena will use this name in conjunction with your catalog id
+     * to correlate relevant query errors.
+     */
+    private static final String SOURCE_TYPE = "tpcds";
+
+    public TPCDSRecordHandler()
+    {
+        super(AmazonS3ClientBuilder.defaultClient(), AWSSecretsManagerClientBuilder.defaultClient(), AmazonAthenaClientBuilder.defaultClient(), SOURCE_TYPE);
+    }
+
+    @VisibleForTesting
+    protected TPCDSRecordHandler(AmazonS3 amazonS3, AWSSecretsManager secretsManager, AmazonAthena athena)
+    {
+        super(amazonS3, secretsManager, athena, SOURCE_TYPE);
+    }
+
+    /**
+     * Generates TPC-DS data for the given Table and scale factor as defined by the requested Split.
+     *
+     * @see RecordHandler
+     */
+    @Override
+    protected void readWithConstraint(BlockSpiller spiller, ReadRecordsRequest recordsRequest, QueryStatusChecker queryStatusChecker)
+            throws IOException
+    {
+        Split split = recordsRequest.getSplit();
+        int splitNumber = Integer.parseInt(split.getProperty(SPLIT_NUMBER_FIELD));
+        int totalNumSplits = Integer.parseInt(split.getProperty(SPLIT_TOTAL_NUMBER_FIELD));
+        int scaleFactor = Integer.parseInt(split.getProperty(SPLIT_SCALE_FACTOR_FIELD));
+        Table table = validateTable(recordsRequest.getTableName());
+
+        Session session = Session.getDefaultSession()
+                .withScale(scaleFactor)
+                .withParallelism(totalNumSplits)
+                .withChunkNumber(splitNumber + 1)
+                .withTable(table)
+                .withNoSexism(true);
+
+        Results results = constructResults(table, session);
+        Iterator<List<List<String>>> itr = results.iterator();
+
+        Map<Integer, CellWriter> writers = makeWriters(recordsRequest.getSchema(), table);
+        while (itr.hasNext() && queryStatusChecker.isQueryRunning()) {
+            List<String> row = itr.next().get(0);
+            spiller.writeRows((Block block, int numRow) -> {
+                boolean matched = true;
+                for (Map.Entry<Integer, CellWriter> nextWriter : writers.entrySet()) {
+                    matched &= nextWriter.getValue().write(block, numRow, row.get(nextWriter.getKey()));
+                }
+                return matched ? 1 : 0;
+            });
+        }
+    }
+
+    /**
+     * Requires that the requested Table be present in the TPCDS generated schema.
+     *
+     * @param tableName The fully qualified name of the requested table.
+     * @return The TPCDS table, if present, otherwise the method throws.
+     */
+    private Table validateTable(TableName tableName)
+    {
+        Optional<Table> table = Table.getBaseTables().stream()
+                .filter(next -> next.getName().equals(tableName.getTableName()))
+                .findFirst();
+
+        if (!table.isPresent()) {
+            throw new RuntimeException("Unknown table " + tableName);
+        }
+
+        return table.get();
+    }
+
+    /**
+     * Generates the CellWriters used to convert the TPC-DS generator's data to Apache Arrow.
+     *
+     * @param schemaForRead The schema to read/project.
+     * @param table The TPCDS Table we are reading from.
+     * @return Map where the Integer key is the Column position in the TPCDS data set and the CellWriter
+     * can be used to read, convert, and write the value at that position for any row into the correct position and type
+     * in our Apache Arrow response.
+     */
+    private Map<Integer, CellWriter> makeWriters(Schema schemaForRead, Table table)
+    {
+        Map<String, Column> columnPositions = new HashMap<>();
+        for (Column next : table.getColumns()) {
+            columnPositions.put(next.getName(), next);
+        }
+
+        //We use this approach to reduce the overhead of field lookups. This isn't as good as true columnar processing
+        //using Arrow but it gets us ~80% of the way there from a rows/second per cpu-cycle perspective.
+        Map<Integer, CellWriter> writers = new HashMap<>();
+        for (Field nextField : schemaForRead.getFields()) {
+            Column column = columnPositions.get(nextField.getName());
+            writers.put(column.getPosition(), makeWriter(nextField, column));
+        }
+        return writers;
+    }
+
+    /**
+     * Makes a CellWriter for the provided Apache Arrow Field and TPCDS Column.
+     *
+     * @param field The Apache Arrow Field.
+     * @param column The corresponding TPCDS Column.
+     * @return The CellWriter that can be used to convert and write values for the provided Field/Column pair.
+     */
+    private CellWriter makeWriter(Field field, Column column)
+    {
+        ColumnType type = column.getType();
+        switch (type.getBase()) {
+            case TIME:
+            case IDENTIFIER:
+                return (Block block, int rowNum, String rawValue) -> {
+                    Long value = (rawValue != null) ? Long.parseLong(rawValue) : null;
+                    return block.setValue(field.getName(), rowNum, value);
+                };
+            case INTEGER:
+                return (Block block, int rowNum, String rawValue) -> {
+                    Integer value = (rawValue != null) ? Integer.parseInt(rawValue) : null;
+                    return block.setValue(field.getName(), rowNum, value);
+                };
+            case DATE:
+                return (Block block, int rowNum, String rawValue) -> {
+                    Date value = (rawValue != null) ? LocalDate.parse(rawValue).toDate() : null;
+                    return block.setValue(field.getName(), rowNum, value);
+                };
+            case DECIMAL:
+                return (Block block, int rowNum, String rawValue) -> {
+                    BigDecimal value = (rawValue != null) ? new BigDecimal(rawValue) : null;
+                    return block.setValue(field.getName(), rowNum, value);
+                };
+            case CHAR:
+            case VARCHAR:
+                return (Block block, int rowNum, String rawValue) -> {
+                    return block.setValue(field.getName(), rowNum, rawValue);
+                };
+        }
+        throw new IllegalArgumentException("Unsupported TPC-DS type " + column.getName() + ":" + column.getType().getBase());
+    }
+
+    public interface CellWriter
+    {
+        /**
+         * Converts a value from TPCDS' string representation into the appropriate Apache Arrow type
+         * and writes it to the correct field in the provided Block and row. The implementation should
+         * also apply constraints as an optimization.
+         *
+         * @param block The Apache Arrow Block to write into.
+         * @param rowNum The row number in the Arrow Block to write into.
+         * @param value The value to convert and write into the Apache Arrow Block.
+         * @return True if the value passed all Constraints.
+         */
+        boolean write(Block block, int rowNum, String value);
+    }
+}
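For readers unfamiliar with the Teradata TPC-DS generator that readWithConstraint drives, the sketch below exercises the same Session/Results API outside of the Lambda handler. This is a hypothetical, standalone example (not part of this PR); the calls mirror those in readWithConstraint above, while the table, scale, and chunk choices are illustrative:

```java
import com.teradata.tpcds.Results;
import com.teradata.tpcds.Session;
import com.teradata.tpcds.Table;

import java.util.Iterator;
import java.util.List;

import static com.teradata.tpcds.Results.constructResults;

public class GeneratorSketch
{
    public static void main(String[] args)
    {
        // Generate chunk 1 of 21 for store_sales at scale factor 1, matching what
        // the record handler does for a Split with splitNum=0 and totalNumSplits=21.
        Session session = Session.getDefaultSession()
                .withScale(1)
                .withParallelism(21)
                .withChunkNumber(1)
                .withTable(Table.STORE_SALES)
                .withNoSexism(true);

        Results results = constructResults(Table.STORE_SALES, session);
        Iterator<List<List<String>>> itr = results.iterator();
        while (itr.hasNext()) {
            List<String> row = itr.next().get(0); // one row's column values, as strings
            System.out.println(row);
        }
    }
}
```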
diff --git a/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSUtils.java b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSUtils.java
new file mode 100644
index 0000000000..065d3b4a4c
--- /dev/null
+++ b/athena-tpcds/src/main/java/com/amazonaws/athena/connectors/tpcds/TPCDSUtils.java
@@ -0,0 +1,106 @@
+/*-
+ * #%L
+ * athena-tpcds
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.tpcds;
+
+import com.amazonaws.athena.connector.lambda.data.FieldBuilder;
+import com.amazonaws.athena.connector.lambda.domain.TableName;
+import com.teradata.tpcds.Table;
+import com.teradata.tpcds.column.Column;
+import com.teradata.tpcds.column.ColumnType;
+import org.apache.arrow.vector.types.Types;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+
+import java.util.Optional;
+
+/**
+ * Utility class that centralizes a few commonly used tools for working with
+ * the TPC-DS tables, columns, and schemas.
+ */
+public class TPCDSUtils
+{
+    private TPCDSUtils() {}
+
+    /**
+     * Converts from TPCDS columns to Apache Arrow fields.
+     *
+     * @param column The TPCDS column to convert.
+     * @return The Apache Arrow field that corresponds to the TPCDS column.
+     */
+    public static Field convertColumn(Column column)
+    {
+        ColumnType type = column.getType();
+        switch (type.getBase()) {
+            case TIME:
+            case IDENTIFIER:
+                return FieldBuilder.newBuilder(column.getName(), Types.MinorType.BIGINT.getType()).build();
+            case INTEGER:
+                return FieldBuilder.newBuilder(column.getName(), Types.MinorType.INT.getType()).build();
+            case DATE:
+                return FieldBuilder.newBuilder(column.getName(), Types.MinorType.DATEDAY.getType()).build();
+            case DECIMAL:
+                ArrowType arrowType = new ArrowType.Decimal(type.getPrecision().get(), type.getScale().get());
+                return FieldBuilder.newBuilder(column.getName(), arrowType).build();
+            case CHAR:
+            case VARCHAR:
+                return FieldBuilder.newBuilder(column.getName(), Types.MinorType.VARCHAR.getType()).build();
+        }
+        throw new IllegalArgumentException("Unsupported TPC-DS type " + column.getName() + ":" + column.getType().getBase());
+    }
+
+    /**
+     * Extracts the scale factor of the schema from its name.
+     *
+     * @param schemaName The schema name from which to extract a scale factor.
+     * @return The scale factor associated with the schema name. The method throws if the scale factor cannot be determined.
+     */
+    public static int extractScaleFactor(String schemaName)
+    {
+        if (!schemaName.startsWith("tpcds")) {
+            throw new RuntimeException("Unknown schema format " + schemaName + ", cannot extract scale factor.");
+        }
+
+        try {
+            return Integer.parseInt(schemaName.substring(5));
+        }
+        catch (RuntimeException ex) {
+            throw new RuntimeException("Unknown schema format " + schemaName + ", cannot extract scale factor.", ex);
+        }
+    }
+
+    /**
+     * Requires that the requested Table be present in the TPCDS generated schema.
+     *
+     * @param tableName The fully qualified name of the requested table.
+     * @return The TPCDS table, if present, otherwise the method throws.
+     */
+    public static Table validateTable(TableName tableName)
+    {
+        Optional<Table>
table = Table.getBaseTables().stream() + .filter(next -> next.getName().equals(tableName.getTableName())) + .findFirst(); + + if (!table.isPresent()) { + throw new RuntimeException("Unknown table " + tableName); + } + + return table.get(); + } +} diff --git a/athena-tpcds/src/main/resources/queries/q1.sql b/athena-tpcds/src/main/resources/queries/q1.sql new file mode 100644 index 0000000000..e10b68cdc2 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q1.sql @@ -0,0 +1,19 @@ +WITH customer_total_return AS +( SELECT + sr_customer_sk AS ctr_customer_sk, + sr_store_sk AS ctr_store_sk, + sum(sr_return_amt) AS ctr_total_return + FROM store_returns[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE sr_returned_date_sk = d_date_sk AND d_year = 2000 + GROUP BY sr_customer_sk, sr_store_sk) +SELECT c_customer_id +FROM customer_total_return ctr1, store[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] +WHERE ctr1.ctr_total_return > + (SELECT avg(ctr_total_return) * 1.2 + FROM customer_total_return ctr2 + WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk) + AND s_store_sk = ctr1.ctr_store_sk + AND s_state = 'TN' + AND ctr1.ctr_customer_sk = c_customer_sk +ORDER BY c_customer_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q10.sql b/athena-tpcds/src/main/resources/queries/q10.sql new file mode 100644 index 0000000000..0adab7ebd5 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q10.sql @@ -0,0 +1,57 @@ +SELECT + cd_gender, + cd_marital_status, + cd_education_status, + count(*) cnt1, + cd_purchase_estimate, + count(*) cnt2, + cd_credit_rating, + count(*) cnt3, + cd_dep_count, + count(*) cnt4, + cd_dep_employed_count, + count(*) cnt5, + cd_dep_college_count, + count(*) cnt6 +FROM + customer[ TABLE_SUFFIX ] c, customer_address[ TABLE_SUFFIX ] ca, customer_demographics[ TABLE_SUFFIX ] +WHERE + c.c_current_addr_sk = ca.ca_address_sk AND + ca_county IN ('Rush County', 'Toole County', 'Jefferson County', + 'Dona Ana County', 'La Porte County') AND + cd_demo_sk = c.c_current_cdemo_sk AND + exists(SELECT * + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c.c_customer_sk = ss_customer_sk AND + ss_sold_date_sk = d_date_sk AND + d_year = 2002 AND + d_moy BETWEEN 1 AND 1 + 3) AND + (exists(SELECT * + FROM web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c.c_customer_sk = ws_bill_customer_sk AND + ws_sold_date_sk = d_date_sk AND + d_year = 2002 AND + d_moy BETWEEN 1 AND 1 + 3) OR + exists(SELECT * + FROM catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c.c_customer_sk = cs_ship_customer_sk AND + cs_sold_date_sk = d_date_sk AND + d_year = 2002 AND + d_moy BETWEEN 1 AND 1 + 3)) +GROUP BY cd_gender, + cd_marital_status, + cd_education_status, + cd_purchase_estimate, + cd_credit_rating, + cd_dep_count, + cd_dep_employed_count, + cd_dep_college_count +ORDER BY cd_gender, + cd_marital_status, + cd_education_status, + cd_purchase_estimate, + cd_credit_rating, + cd_dep_count, + cd_dep_employed_count, + cd_dep_college_count +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q11.sql b/athena-tpcds/src/main/resources/queries/q11.sql new file mode 100644 index 0000000000..8ceac01f4f --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q11.sql @@ -0,0 +1,68 @@ +WITH year_total AS ( + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + c_preferred_cust_flag customer_preferred_cust_flag, + c_birth_country customer_birth_country, + c_login customer_login, + c_email_address 
customer_email_address, + d_year dyear, + sum(ss_ext_list_price - ss_ext_discount_amt) year_total, + 's' sale_type + FROM customer[ TABLE_SUFFIX ], store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c_customer_sk = ss_customer_sk + AND ss_sold_date_sk = d_date_sk + GROUP BY c_customer_id + , c_first_name + , c_last_name + , d_year + , c_preferred_cust_flag + , c_birth_country + , c_login + , c_email_address + , d_year + UNION ALL + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + c_preferred_cust_flag customer_preferred_cust_flag, + c_birth_country customer_birth_country, + c_login customer_login, + c_email_address customer_email_address, + d_year dyear, + sum(ws_ext_list_price - ws_ext_discount_amt) year_total, + 'w' sale_type + FROM customer[ TABLE_SUFFIX ], web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c_customer_sk = ws_bill_customer_sk + AND ws_sold_date_sk = d_date_sk + GROUP BY + c_customer_id, c_first_name, c_last_name, c_preferred_cust_flag, c_birth_country, + c_login, c_email_address, d_year) +SELECT t_s_secyear.customer_preferred_cust_flag +FROM year_total t_s_firstyear + , year_total t_s_secyear + , year_total t_w_firstyear + , year_total t_w_secyear +WHERE t_s_secyear.customer_id = t_s_firstyear.customer_id + AND t_s_firstyear.customer_id = t_w_secyear.customer_id + AND t_s_firstyear.customer_id = t_w_firstyear.customer_id + AND t_s_firstyear.sale_type = 's' + AND t_w_firstyear.sale_type = 'w' + AND t_s_secyear.sale_type = 's' + AND t_w_secyear.sale_type = 'w' + AND t_s_firstyear.dyear = 2001 + AND t_s_secyear.dyear = 2001 + 1 + AND t_w_firstyear.dyear = 2001 + AND t_w_secyear.dyear = 2001 + 1 + AND t_s_firstyear.year_total > 0 + AND t_w_firstyear.year_total > 0 + AND CASE WHEN t_w_firstyear.year_total > 0 + THEN t_w_secyear.year_total / t_w_firstyear.year_total + ELSE NULL END + > CASE WHEN t_s_firstyear.year_total > 0 + THEN t_s_secyear.year_total / t_s_firstyear.year_total + ELSE NULL END +ORDER BY t_s_secyear.customer_preferred_cust_flag +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q12.sql b/athena-tpcds/src/main/resources/queries/q12.sql new file mode 100644 index 0000000000..c1528a206d --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q12.sql @@ -0,0 +1,22 @@ +SELECT + i_item_desc, + i_category, + i_class, + i_current_price, + sum(ws_ext_sales_price) AS itemrevenue, + sum(ws_ext_sales_price) * 100 / sum(sum(ws_ext_sales_price)) + OVER + (PARTITION BY i_class) AS revenueratio +FROM + web_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] +WHERE + ws_item_sk = i_item_sk + AND i_category IN ('Sports', 'Books', 'Home') + AND ws_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('1999-02-22' AS DATE) + AND (cast('1999-02-22' AS DATE) + INTERVAL '30' day) +GROUP BY + i_item_id, i_item_desc, i_category, i_class, i_current_price +ORDER BY + i_category, i_class, i_item_id, i_item_desc, revenueratio +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q13.sql b/athena-tpcds/src/main/resources/queries/q13.sql new file mode 100644 index 0000000000..1d3fe9721c --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q13.sql @@ -0,0 +1,49 @@ +SELECT + avg(ss_quantity), + avg(ss_ext_sales_price), + avg(ss_ext_wholesale_cost), + sum(ss_ext_wholesale_cost) +FROM store_sales[ TABLE_SUFFIX ] + , store[ TABLE_SUFFIX ] + , customer_demographics[ TABLE_SUFFIX ] + , household_demographics[ TABLE_SUFFIX ] + , customer_address[ 
TABLE_SUFFIX ] + , date_dim[ TABLE_SUFFIX ] +WHERE s_store_sk = ss_store_sk + AND ss_sold_date_sk = d_date_sk AND d_year = 2001 + AND ((ss_hdemo_sk = hd_demo_sk + AND cd_demo_sk = ss_cdemo_sk + AND cd_marital_status = 'M' + AND cd_education_status = 'Advanced Degree' + AND ss_sales_price BETWEEN 100.00 AND 150.00 + AND hd_dep_count = 3 +) OR + (ss_hdemo_sk = hd_demo_sk + AND cd_demo_sk = ss_cdemo_sk + AND cd_marital_status = 'S' + AND cd_education_status = 'College' + AND ss_sales_price BETWEEN 50.00 AND 100.00 + AND hd_dep_count = 1 + ) OR + (ss_hdemo_sk = hd_demo_sk + AND cd_demo_sk = ss_cdemo_sk + AND cd_marital_status = 'W' + AND cd_education_status = '2 yr Degree' + AND ss_sales_price BETWEEN 150.00 AND 200.00 + AND hd_dep_count = 1 + )) + AND ((ss_addr_sk = ca_address_sk + AND ca_country = 'United States' + AND ca_state IN ('TX', 'OH', 'TX') + AND ss_net_profit BETWEEN 100 AND 200 +) OR + (ss_addr_sk = ca_address_sk + AND ca_country = 'United States' + AND ca_state IN ('OR', 'NM', 'KY') + AND ss_net_profit BETWEEN 150 AND 300 + ) OR + (ss_addr_sk = ca_address_sk + AND ca_country = 'United States' + AND ca_state IN ('VA', 'TX', 'MS') + AND ss_net_profit BETWEEN 50 AND 250 + )) diff --git a/athena-tpcds/src/main/resources/queries/q14a.sql b/athena-tpcds/src/main/resources/queries/q14a.sql new file mode 100644 index 0000000000..5293b6054e --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q14a.sql @@ -0,0 +1,120 @@ +WITH cross_items AS +(SELECT i_item_sk ss_item_sk + FROM item[ TABLE_SUFFIX ], + (SELECT + iss.i_brand_id brand_id, + iss.i_class_id class_id, + iss.i_category_id category_id + FROM store_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] iss, date_dim[ TABLE_SUFFIX ] d1 + WHERE ss_item_sk = iss.i_item_sk + AND ss_sold_date_sk = d1.d_date_sk + AND d1.d_year BETWEEN 1999 AND 1999 + 2 + INTERSECT + SELECT + ics.i_brand_id, + ics.i_class_id, + ics.i_category_id + FROM catalog_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] ics, date_dim[ TABLE_SUFFIX ] d2 + WHERE cs_item_sk = ics.i_item_sk + AND cs_sold_date_sk = d2.d_date_sk + AND d2.d_year BETWEEN 1999 AND 1999 + 2 + INTERSECT + SELECT + iws.i_brand_id, + iws.i_class_id, + iws.i_category_id + FROM web_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] iws, date_dim[ TABLE_SUFFIX ] d3 + WHERE ws_item_sk = iws.i_item_sk + AND ws_sold_date_sk = d3.d_date_sk + AND d3.d_year BETWEEN 1999 AND 1999 + 2) x + WHERE i_brand_id = brand_id + AND i_class_id = class_id + AND i_category_id = category_id +), + avg_sales AS + (SELECT avg(quantity * list_price) average_sales + FROM ( + SELECT + ss_quantity quantity, + ss_list_price list_price + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE ss_sold_date_sk = d_date_sk + AND d_year BETWEEN 1999 AND 2001 + UNION ALL + SELECT + cs_quantity quantity, + cs_list_price list_price + FROM catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE cs_sold_date_sk = d_date_sk + AND d_year BETWEEN 1999 AND 1999 + 2 + UNION ALL + SELECT + ws_quantity quantity, + ws_list_price list_price + FROM web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE ws_sold_date_sk = d_date_sk + AND d_year BETWEEN 1999 AND 1999 + 2) x) +SELECT + channel, + i_brand_id, + i_class_id, + i_category_id, + sum(sales), + sum(number_sales) +FROM ( + SELECT + 'store' channel, + i_brand_id, + i_class_id, + i_category_id, + sum(ss_quantity * ss_list_price) sales, + count(*) number_sales + FROM store_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE ss_item_sk IN (SELECT ss_item_sk + FROM 
cross_items) + AND ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND d_year = 1999 + 2 + AND d_moy = 11 + GROUP BY i_brand_id, i_class_id, i_category_id + HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales + FROM avg_sales) + UNION ALL + SELECT + 'catalog' channel, + i_brand_id, + i_class_id, + i_category_id, + sum(cs_quantity * cs_list_price) sales, + count(*) number_sales + FROM catalog_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE cs_item_sk IN (SELECT ss_item_sk + FROM cross_items) + AND cs_item_sk = i_item_sk + AND cs_sold_date_sk = d_date_sk + AND d_year = 1999 + 2 + AND d_moy = 11 + GROUP BY i_brand_id, i_class_id, i_category_id + HAVING sum(cs_quantity * cs_list_price) > (SELECT average_sales FROM avg_sales) + UNION ALL + SELECT + 'web' channel, + i_brand_id, + i_class_id, + i_category_id, + sum(ws_quantity * ws_list_price) sales, + count(*) number_sales + FROM web_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE ws_item_sk IN (SELECT ss_item_sk + FROM cross_items) + AND ws_item_sk = i_item_sk + AND ws_sold_date_sk = d_date_sk + AND d_year = 1999 + 2 + AND d_moy = 11 + GROUP BY i_brand_id, i_class_id, i_category_id + HAVING sum(ws_quantity * ws_list_price) > (SELECT average_sales + FROM avg_sales) + ) +GROUP BY ROLLUP (channel, i_brand_id, i_class_id, i_category_id) +ORDER BY channel, i_brand_id, i_class_id, i_category_id +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q14b.sql b/athena-tpcds/src/main/resources/queries/q14b.sql new file mode 100644 index 0000000000..929a8484bf --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q14b.sql @@ -0,0 +1,95 @@ +WITH cross_items AS +(SELECT i_item_sk ss_item_sk + FROM item, + (SELECT + iss.i_brand_id brand_id, + iss.i_class_id class_id, + iss.i_category_id category_id + FROM store_sales, item iss, date_dim d1 + WHERE ss_item_sk = iss.i_item_sk + AND ss_sold_date_sk = d1.d_date_sk + AND d1.d_year BETWEEN 1999 AND 1999 + 2 + INTERSECT + SELECT + ics.i_brand_id, + ics.i_class_id, + ics.i_category_id + FROM catalog_sales, item ics, date_dim d2 + WHERE cs_item_sk = ics.i_item_sk + AND cs_sold_date_sk = d2.d_date_sk + AND d2.d_year BETWEEN 1999 AND 1999 + 2 + INTERSECT + SELECT + iws.i_brand_id, + iws.i_class_id, + iws.i_category_id + FROM web_sales, item iws, date_dim d3 + WHERE ws_item_sk = iws.i_item_sk + AND ws_sold_date_sk = d3.d_date_sk + AND d3.d_year BETWEEN 1999 AND 1999 + 2) x + WHERE i_brand_id = brand_id + AND i_class_id = class_id + AND i_category_id = category_id +), + avg_sales AS + (SELECT avg(quantity * list_price) average_sales + FROM (SELECT + ss_quantity quantity, + ss_list_price list_price + FROM store_sales, date_dim + WHERE ss_sold_date_sk = d_date_sk AND d_year BETWEEN 1999 AND 1999 + 2 + UNION ALL + SELECT + cs_quantity quantity, + cs_list_price list_price + FROM catalog_sales, date_dim + WHERE cs_sold_date_sk = d_date_sk AND d_year BETWEEN 1999 AND 1999 + 2 + UNION ALL + SELECT + ws_quantity quantity, + ws_list_price list_price + FROM web_sales, date_dim + WHERE ws_sold_date_sk = d_date_sk AND d_year BETWEEN 1999 AND 1999 + 2) x) +SELECT * +FROM + (SELECT + 'store' channel, + i_brand_id, + i_class_id, + i_category_id, + sum(ss_quantity * ss_list_price) sales, + count(*) number_sales + FROM store_sales, item, date_dim + WHERE ss_item_sk IN (SELECT ss_item_sk + FROM cross_items) + AND ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND d_week_seq = (SELECT d_week_seq + FROM 
date_dim + WHERE d_year = 1999 + 1 AND d_moy = 12 AND d_dom = 11) + GROUP BY i_brand_id, i_class_id, i_category_id + HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales + FROM avg_sales)) this_year, + (SELECT + 'store' channel, + i_brand_id, + i_class_id, + i_category_id, + sum(ss_quantity * ss_list_price) sales, + count(*) number_sales + FROM store_sales, item, date_dim + WHERE ss_item_sk IN (SELECT ss_item_sk + FROM cross_items) + AND ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND d_week_seq = (SELECT d_week_seq + FROM date_dim + WHERE d_year = 1999 AND d_moy = 12 AND d_dom = 11) + GROUP BY i_brand_id, i_class_id, i_category_id + HAVING sum(ss_quantity * ss_list_price) > (SELECT average_sales + FROM avg_sales)) last_year +WHERE this_year.i_brand_id = last_year.i_brand_id + AND this_year.i_class_id = last_year.i_class_id + AND this_year.i_category_id = last_year.i_category_id +ORDER BY this_year.channel, this_year.i_brand_id, this_year.i_class_id, this_year.i_category_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q15.sql b/athena-tpcds/src/main/resources/queries/q15.sql new file mode 100644 index 0000000000..bdd125ab01 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q15.sql @@ -0,0 +1,15 @@ +SELECT + ca_zip, + sum(cs_sales_price) +FROM catalog_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] +WHERE cs_bill_customer_sk = c_customer_sk + AND c_current_addr_sk = ca_address_sk + AND (substr(ca_zip, 1, 5) IN ('85669', '86197', '88274', '83405', '86475', + '85392', '85460', '80348', '81792') + OR ca_state IN ('CA', 'WA', 'GA') + OR cs_sales_price > 500) + AND cs_sold_date_sk = d_date_sk + AND d_qoy = 2 AND d_year = 2001 +GROUP BY ca_zip +ORDER BY ca_zip +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q16.sql b/athena-tpcds/src/main/resources/queries/q16.sql new file mode 100644 index 0000000000..cf6636cb44 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q16.sql @@ -0,0 +1,23 @@ +SELECT + count(DISTINCT cs_order_number) AS "order count ", + sum(cs_ext_ship_cost) AS "total shipping cost ", + sum(cs_net_profit) AS "total net profit " +FROM + catalog_sales cs1, date_dim, customer_address, call_center +WHERE + CAST(d_date as DATE) BETWEEN CAST('2002-02-01' AS DATE) AND (CAST('2002-02-01' AS DATE) + INTERVAL '60' day) + AND cs1.cs_ship_date_sk = d_date_sk + AND cs1.cs_ship_addr_sk = ca_address_sk + AND ca_state = 'GA' + AND cs1.cs_call_center_sk = cc_call_center_sk + AND cc_county IN + ('Williamson County', 'Williamson County', 'Williamson County', 'Williamson County', 'Williamson County') + AND EXISTS(SELECT * + FROM catalog_sales cs2 + WHERE cs1.cs_order_number = cs2.cs_order_number + AND cs1.cs_warehouse_sk <> cs2.cs_warehouse_sk) + AND NOT EXISTS(SELECT * + FROM catalog_returns cr1 + WHERE cs1.cs_order_number = cr1.cr_order_number) +ORDER BY count(DISTINCT cs_order_number) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q17.sql b/athena-tpcds/src/main/resources/queries/q17.sql new file mode 100644 index 0000000000..cecf05fc46 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q17.sql @@ -0,0 +1,33 @@ +SELECT + i_item_id, + i_item_desc, + s_state, + count(ss_quantity) AS store_sales_quantitycount, + avg(ss_quantity) AS store_sales_quantityave, + stddev_samp(ss_quantity) AS store_sales_quantitystdev, + stddev_samp(ss_quantity) / avg(ss_quantity) AS store_sales_quantitycov, + count(sr_return_quantity) 
AS store_returns_quantitycount, + avg(sr_return_quantity) AS store_returns_quantityave, + stddev_samp(sr_return_quantity) AS store_returns_quantitystdev, + stddev_samp(sr_return_quantity) / avg(sr_return_quantity) AS store_returns_quantitycov, + count(cs_quantity) AS catalog_sales_quantitycount, + avg(cs_quantity) AS catalog_sales_quantityave, + stddev_samp(cs_quantity) AS catalog_sales_quantitystdev, + stddev_samp(cs_quantity) / avg(cs_quantity) AS catalog_sales_quantitycov +FROM store_sales[ TABLE_SUFFIX ], store_returns[ TABLE_SUFFIX ], catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] d1, date_dim[ TABLE_SUFFIX ] d2, date_dim[ TABLE_SUFFIX ] d3, store[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] +WHERE d1.d_quarter_name = '2001Q1' + AND d1.d_date_sk = ss_sold_date_sk + AND i_item_sk = ss_item_sk + AND s_store_sk = ss_store_sk + AND ss_customer_sk = sr_customer_sk + AND ss_item_sk = sr_item_sk + AND ss_ticket_number = sr_ticket_number + AND sr_returned_date_sk = d2.d_date_sk + AND d2.d_quarter_name IN ('2001Q1', '2001Q2', '2001Q3') + AND sr_customer_sk = cs_bill_customer_sk + AND sr_item_sk = cs_item_sk + AND cs_sold_date_sk = d3.d_date_sk + AND d3.d_quarter_name IN ('2001Q1', '2001Q2', '2001Q3') +GROUP BY i_item_id, i_item_desc, s_state +ORDER BY i_item_id, i_item_desc, s_state +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q18.sql b/athena-tpcds/src/main/resources/queries/q18.sql new file mode 100644 index 0000000000..91d5b489fa --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q18.sql @@ -0,0 +1,28 @@ +SELECT + i_item_id, + ca_country, + ca_state, + ca_county, + avg(cast(cs_quantity AS DECIMAL(12, 2))) agg1, + avg(cast(cs_list_price AS DECIMAL(12, 2))) agg2, + avg(cast(cs_coupon_amt AS DECIMAL(12, 2))) agg3, + avg(cast(cs_sales_price AS DECIMAL(12, 2))) agg4, + avg(cast(cs_net_profit AS DECIMAL(12, 2))) agg5, + avg(cast(c_birth_year AS DECIMAL(12, 2))) agg6, + avg(cast(cd1.cd_dep_count AS DECIMAL(12, 2))) agg7 +FROM catalog_sales[ TABLE_SUFFIX ], customer_demographics[ TABLE_SUFFIX ] cd1, + customer_demographics[ TABLE_SUFFIX ] cd2, customer[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] +WHERE cs_sold_date_sk = d_date_sk AND + cs_item_sk = i_item_sk AND + cs_bill_cdemo_sk = cd1.cd_demo_sk AND + cs_bill_customer_sk = c_customer_sk AND + cd1.cd_gender = 'F' AND + cd1.cd_education_status = 'Unknown' AND + c_current_cdemo_sk = cd2.cd_demo_sk AND + c_current_addr_sk = ca_address_sk AND + c_birth_month IN (1, 6, 8, 9, 12, 2) AND + d_year = 1998 AND + ca_state IN ('MS', 'IN', 'ND', 'OK', 'NM', 'VA', 'MS') +GROUP BY ROLLUP (i_item_id, ca_country, ca_state, ca_county) +ORDER BY ca_country, ca_state, ca_county, i_item_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q19.sql b/athena-tpcds/src/main/resources/queries/q19.sql new file mode 100644 index 0000000000..63579caf38 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q19.sql @@ -0,0 +1,19 @@ +SELECT + i_brand_id brand_id, + i_brand brand, + i_manufact_id, + i_manufact, + sum(ss_ext_sales_price) ext_price +FROM date_dim[ TABLE_SUFFIX ], store_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ] +WHERE d_date_sk = ss_sold_date_sk + AND ss_item_sk = i_item_sk + AND i_manager_id = 8 + AND d_moy = 11 + AND d_year = 1998 + AND ss_customer_sk = c_customer_sk + AND c_current_addr_sk = ca_address_sk + AND substr(ca_zip, 1, 5) <> substr(s_zip, 1, 5) +
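+  -- keep only purchases where the customer's 5-digit ZIP prefix differs from the store's, i.e. sales to customers shopping outside their home ZIP area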
AND ss_store_sk = s_store_sk +GROUP BY i_brand, i_brand_id, i_manufact_id, i_manufact +ORDER BY ext_price DESC, brand, brand_id, i_manufact_id, i_manufact +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q2.sql b/athena-tpcds/src/main/resources/queries/q2.sql new file mode 100644 index 0000000000..b722db6b58 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q2.sql @@ -0,0 +1,81 @@ +WITH wscs AS +( SELECT + sold_date_sk, + sales_price + FROM (SELECT + ws_sold_date_sk sold_date_sk, + ws_ext_sales_price sales_price + FROM web_sales[ TABLE_SUFFIX ]) x + UNION ALL + (SELECT + cs_sold_date_sk sold_date_sk, + cs_ext_sales_price sales_price + FROM catalog_sales[ TABLE_SUFFIX ])), + wswscs AS + ( SELECT + d_week_seq, + sum(CASE WHEN (d_day_name = 'Sunday') + THEN sales_price + ELSE NULL END) + sun_sales, + sum(CASE WHEN (d_day_name = 'Monday') + THEN sales_price + ELSE NULL END) + mon_sales, + sum(CASE WHEN (d_day_name = 'Tuesday') + THEN sales_price + ELSE NULL END) + tue_sales, + sum(CASE WHEN (d_day_name = 'Wednesday') + THEN sales_price + ELSE NULL END) + wed_sales, + sum(CASE WHEN (d_day_name = 'Thursday') + THEN sales_price + ELSE NULL END) + thu_sales, + sum(CASE WHEN (d_day_name = 'Friday') + THEN sales_price + ELSE NULL END) + fri_sales, + sum(CASE WHEN (d_day_name = 'Saturday') + THEN sales_price + ELSE NULL END) + sat_sales + FROM wscs, date_dim[ TABLE_SUFFIX ] + WHERE d_date_sk = sold_date_sk + GROUP BY d_week_seq) +SELECT + d_week_seq1, + round(sun_sales1 / sun_sales2, 2), + round(mon_sales1 / mon_sales2, 2), + round(tue_sales1 / tue_sales2, 2), + round(wed_sales1 / wed_sales2, 2), + round(thu_sales1 / thu_sales2, 2), + round(fri_sales1 / fri_sales2, 2), + round(sat_sales1 / sat_sales2, 2) +FROM + (SELECT + wswscs.d_week_seq d_week_seq1, + sun_sales sun_sales1, + mon_sales mon_sales1, + tue_sales tue_sales1, + wed_sales wed_sales1, + thu_sales thu_sales1, + fri_sales fri_sales1, + sat_sales sat_sales1 + FROM wswscs, date_dim[ TABLE_SUFFIX ] + WHERE date_dim[ TABLE_SUFFIX ].d_week_seq = wswscs.d_week_seq AND d_year = 2001) y, + (SELECT + wswscs.d_week_seq d_week_seq2, + sun_sales sun_sales2, + mon_sales mon_sales2, + tue_sales tue_sales2, + wed_sales wed_sales2, + thu_sales thu_sales2, + fri_sales fri_sales2, + sat_sales sat_sales2 + FROM wswscs, date_dim[ TABLE_SUFFIX ] + WHERE date_dim[ TABLE_SUFFIX ].d_week_seq = wswscs.d_week_seq AND d_year = 2001 + 1) z +WHERE d_week_seq1 = d_week_seq2 - 53 +ORDER BY d_week_seq1 diff --git a/athena-tpcds/src/main/resources/queries/q20.sql b/athena-tpcds/src/main/resources/queries/q20.sql new file mode 100644 index 0000000000..d47f50897a --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q20.sql @@ -0,0 +1,18 @@ +SELECT + i_item_desc, + i_category, + i_class, + i_current_price, + sum(cs_ext_sales_price) AS itemrevenue, + sum(cs_ext_sales_price) * 100 / sum(sum(cs_ext_sales_price)) + OVER + (PARTITION BY i_class) AS revenueratio +FROM catalog_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] +WHERE cs_item_sk = i_item_sk + AND i_category IN ('Sports', 'Books', 'Home') + AND cs_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('1999-02-22' AS DATE) +AND (cast('1999-02-22' AS DATE) + INTERVAL '30' day) +GROUP BY i_item_id, i_item_desc, i_category, i_class, i_current_price +ORDER BY i_category, i_class, i_item_id, i_item_desc, revenueratio +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q21.sql b/athena-tpcds/src/main/resources/queries/q21.sql new file mode 100644 index 
0000000000..7781a0a8db --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q21.sql @@ -0,0 +1,25 @@ +SELECT * +FROM ( + SELECT + w_warehouse_name, + i_item_id, + sum(CASE WHEN (cast(d_date AS DATE) < cast('2000-03-11' AS DATE)) + THEN inv_quantity_on_hand + ELSE 0 END) AS inv_before, + sum(CASE WHEN (cast(d_date AS DATE) >= cast('2000-03-11' AS DATE)) + THEN inv_quantity_on_hand + ELSE 0 END) AS inv_after + FROM inventory[ TABLE_SUFFIX ], warehouse[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE i_current_price BETWEEN 0.99 AND 1.49 + AND i_item_sk = inv_item_sk + AND inv_warehouse_sk = w_warehouse_sk + AND inv_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN (cast('2000-03-11' AS DATE) - INTERVAL '30' day) + AND (cast('2000-03-11' AS DATE) + INTERVAL '30' day) + GROUP BY w_warehouse_name, i_item_id) x +WHERE (CASE WHEN inv_before > 0 + THEN inv_after / inv_before + ELSE NULL + END) BETWEEN 2.0 / 3.0 AND 3.0 / 2.0 +ORDER BY w_warehouse_name, i_item_id +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q22.sql b/athena-tpcds/src/main/resources/queries/q22.sql new file mode 100644 index 0000000000..a02a38a396 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q22.sql @@ -0,0 +1,14 @@ +SELECT + i_product_name, + i_brand, + i_class, + i_category, + avg(inv_quantity_on_hand) qoh +FROM inventory[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], warehouse[ TABLE_SUFFIX ] +WHERE inv_date_sk = d_date_sk + AND inv_item_sk = i_item_sk + AND inv_warehouse_sk = w_warehouse_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 +GROUP BY ROLLUP (i_product_name, i_brand, i_class, i_category) +ORDER BY qoh, i_product_name, i_brand, i_class, i_category +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q23a.sql b/athena-tpcds/src/main/resources/queries/q23a.sql new file mode 100644 index 0000000000..32a3416512 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q23a.sql @@ -0,0 +1,53 @@ +WITH frequent_ss_items AS +(SELECT + substr(i_item_desc, 1, 30) itemdesc, + i_item_sk item_sk, + d_date solddate, + count(*) cnt + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] + WHERE ss_sold_date_sk = d_date_sk + AND ss_item_sk = i_item_sk + AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3) + GROUP BY substr(i_item_desc, 1, 30), i_item_sk, d_date + HAVING count(*) > 4), + max_store_sales AS + (SELECT max(csales) tpcds_cmax + FROM (SELECT + c_customer_sk, + sum(ss_quantity * ss_sales_price) csales + FROM store_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE ss_customer_sk = c_customer_sk + AND ss_sold_date_sk = d_date_sk + AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3) + GROUP BY c_customer_sk) x), + best_ss_customer AS + (SELECT + c_customer_sk, + sum(ss_quantity * ss_sales_price) ssales + FROM store_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] + WHERE ss_customer_sk = c_customer_sk + GROUP BY c_customer_sk + HAVING sum(ss_quantity * ss_sales_price) > (50 / 100.0) * + (SELECT * + FROM max_store_sales)) +SELECT sum(sales) +FROM ((SELECT cs_quantity * cs_list_price sales +FROM catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] +WHERE d_year = 2000 + AND d_moy = 2 + AND cs_sold_date_sk = d_date_sk + AND cs_item_sk IN (SELECT item_sk +FROM frequent_ss_items) + AND cs_bill_customer_sk IN (SELECT c_customer_sk +FROM best_ss_customer)) + UNION ALL + (SELECT ws_quantity * ws_list_price sales + FROM web_sales[ TABLE_SUFFIX ], 
date_dim[ TABLE_SUFFIX ] + WHERE d_year = 2000 + AND d_moy = 2 + AND ws_sold_date_sk = d_date_sk + AND ws_item_sk IN (SELECT item_sk + FROM frequent_ss_items) + AND ws_bill_customer_sk IN (SELECT c_customer_sk + FROM best_ss_customer))) y +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q23b.sql b/athena-tpcds/src/main/resources/queries/q23b.sql new file mode 100644 index 0000000000..04ff14e197 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q23b.sql @@ -0,0 +1,67 @@ +WITH frequent_ss_items AS +(SELECT + substr(i_item_desc, 1, 30) itemdesc, + i_item_sk item_sk, + d_date solddate, + count(*) cnt + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] + WHERE ss_sold_date_sk = d_date_sk + AND ss_item_sk = i_item_sk + AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3) + GROUP BY substr(i_item_desc, 1, 30), i_item_sk, d_date + HAVING count(*) > 4), + max_store_sales AS + (SELECT max(csales) tpcds_cmax + FROM (SELECT + c_customer_sk, + sum(ss_quantity * ss_sales_price) csales + FROM store_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE ss_customer_sk = c_customer_sk + AND ss_sold_date_sk = d_date_sk + AND d_year IN (2000, 2000 + 1, 2000 + 2, 2000 + 3) + GROUP BY c_customer_sk) x), + best_ss_customer AS + (SELECT + c_customer_sk, + sum(ss_quantity * ss_sales_price) ssales + FROM store_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] + WHERE ss_customer_sk = c_customer_sk + GROUP BY c_customer_sk + HAVING sum(ss_quantity * ss_sales_price) > (50 / 100.0) * + (SELECT * + FROM max_store_sales)) +SELECT + c_last_name, + c_first_name, + sales +FROM ((SELECT + c_last_name, + c_first_name, + sum(cs_quantity * cs_list_price) sales +FROM catalog_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] +WHERE d_year = 2000 + AND d_moy = 2 + AND cs_sold_date_sk = d_date_sk + AND cs_item_sk IN (SELECT item_sk +FROM frequent_ss_items) + AND cs_bill_customer_sk IN (SELECT c_customer_sk +FROM best_ss_customer) + AND cs_bill_customer_sk = c_customer_sk +GROUP BY c_last_name, c_first_name) + UNION ALL + (SELECT + c_last_name, + c_first_name, + sum(ws_quantity * ws_list_price) sales + FROM web_sales[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE d_year = 2000 + AND d_moy = 2 + AND ws_sold_date_sk = d_date_sk + AND ws_item_sk IN (SELECT item_sk + FROM frequent_ss_items) + AND ws_bill_customer_sk IN (SELECT c_customer_sk + FROM best_ss_customer) + AND ws_bill_customer_sk = c_customer_sk + GROUP BY c_last_name, c_first_name)) y +ORDER BY c_last_name, c_first_name, sales +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q24a.sql b/athena-tpcds/src/main/resources/queries/q24a.sql new file mode 100644 index 0000000000..a6efec1d3b --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q24a.sql @@ -0,0 +1,34 @@ +WITH ssales AS +(SELECT + c_last_name, + c_first_name, + s_store_name, + ca_state, + s_state, + i_color, + i_current_price, + i_manager_id, + i_units, + i_size, + sum(ss_net_paid) netpaid + FROM store_sales[ TABLE_SUFFIX ], store_returns[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ] + WHERE ss_ticket_number = sr_ticket_number + AND ss_item_sk = sr_item_sk + AND ss_customer_sk = c_customer_sk + AND ss_item_sk = i_item_sk + AND ss_store_sk = s_store_sk + AND c_birth_country = upper(ca_country) + AND s_zip = ca_zip + AND s_market_id = 8 + GROUP BY c_last_name, c_first_name, s_store_name, ca_state, s_state,
i_color, + i_current_price, i_manager_id, i_units, i_size) +SELECT + c_last_name, + c_first_name, + s_store_name, + sum(netpaid) paid +FROM ssales +WHERE i_color = 'pale' +GROUP BY c_last_name, c_first_name, s_store_name +HAVING sum(netpaid) > (SELECT 0.05 * avg(netpaid) +FROM ssales) \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q24b.sql b/athena-tpcds/src/main/resources/queries/q24b.sql new file mode 100644 index 0000000000..9d23f6857c --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q24b.sql @@ -0,0 +1,34 @@ +WITH ssales AS +(SELECT + c_last_name, + c_first_name, + s_store_name, + ca_state, + s_state, + i_color, + i_current_price, + i_manager_id, + i_units, + i_size, + sum(ss_net_paid) netpaid + FROM store_sales[ TABLE_SUFFIX ], store_returns[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ] + WHERE ss_ticket_number = sr_ticket_number + AND ss_item_sk = sr_item_sk + AND ss_customer_sk = c_customer_sk + AND ss_item_sk = i_item_sk + AND ss_store_sk = s_store_sk + AND c_birth_country = upper(ca_country) + AND s_zip = ca_zip + AND s_market_id = 8 + GROUP BY c_last_name, c_first_name, s_store_name, ca_state, s_state, + i_color, i_current_price, i_manager_id, i_units, i_size) +SELECT + c_last_name, + c_first_name, + s_store_name, + sum(netpaid) paid +FROM ssales +WHERE i_color = 'chiffon' +GROUP BY c_last_name, c_first_name, s_store_name +HAVING sum(netpaid) > (SELECT 0.05 * avg(netpaid) +FROM ssales) diff --git a/athena-tpcds/src/main/resources/queries/q25.sql b/athena-tpcds/src/main/resources/queries/q25.sql new file mode 100644 index 0000000000..4aa8fca199 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q25.sql @@ -0,0 +1,33 @@ +SELECT + i_item_id, + i_item_desc, + s_store_id, + s_store_name, + sum(ss_net_profit) AS store_sales_profit, + sum(sr_net_loss) AS store_returns_loss, + sum(cs_net_profit) AS catalog_sales_profit +FROM + store_sales[ TABLE_SUFFIX ], store_returns[ TABLE_SUFFIX ], catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] d1, + date_dim[ TABLE_SUFFIX ] d2, date_dim[ TABLE_SUFFIX ] d3, store[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] +WHERE + d1.d_moy = 4 + AND d1.d_year = 2001 + AND d1.d_date_sk = ss_sold_date_sk + AND i_item_sk = ss_item_sk + AND s_store_sk = ss_store_sk + AND ss_customer_sk = sr_customer_sk + AND ss_item_sk = sr_item_sk + AND ss_ticket_number = sr_ticket_number + AND sr_returned_date_sk = d2.d_date_sk + AND d2.d_moy BETWEEN 4 AND 10 + AND d2.d_year = 2001 + AND sr_customer_sk = cs_bill_customer_sk + AND sr_item_sk = cs_item_sk + AND cs_sold_date_sk = d3.d_date_sk + AND d3.d_moy BETWEEN 4 AND 10 + AND d3.d_year = 2001 +GROUP BY + i_item_id, i_item_desc, s_store_id, s_store_name +ORDER BY + i_item_id, i_item_desc, s_store_id, s_store_name +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q26.sql b/athena-tpcds/src/main/resources/queries/q26.sql new file mode 100644 index 0000000000..b66f74ddee --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q26.sql @@ -0,0 +1,19 @@ +SELECT + i_item_id, + avg(cs_quantity) agg1, + avg(cs_list_price) agg2, + avg(cs_coupon_amt) agg3, + avg(cs_sales_price) agg4 +FROM catalog_sales[ TABLE_SUFFIX ], customer_demographics[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], promotion[ TABLE_SUFFIX ] +WHERE cs_sold_date_sk = d_date_sk AND + cs_item_sk = i_item_sk AND + cs_bill_cdemo_sk = cd_demo_sk AND + cs_promo_sk = p_promo_sk AND + cd_gender = 'M' AND + 
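+  -- q26 demographic parameters for this run: male, single, college-educated buyers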
cd_marital_status = 'S' AND + cd_education_status = 'College' AND + (p_channel_email = 'N' OR p_channel_event = 'N') AND + d_year = 2000 +GROUP BY i_item_id +ORDER BY i_item_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q27.sql b/athena-tpcds/src/main/resources/queries/q27.sql new file mode 100644 index 0000000000..907f78124f --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q27.sql @@ -0,0 +1,21 @@ +SELECT + i_item_id, + s_state, + grouping(s_state) g_state, + avg(ss_quantity) agg1, + avg(ss_list_price) agg2, + avg(ss_coupon_amt) agg3, + avg(ss_sales_price) agg4 +FROM store_sales[ TABLE_SUFFIX ], customer_demographics[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] +WHERE ss_sold_date_sk = d_date_sk AND + ss_item_sk = i_item_sk AND + ss_store_sk = s_store_sk AND + ss_cdemo_sk = cd_demo_sk AND + cd_gender = 'M' AND + cd_marital_status = 'S' AND + cd_education_status = 'College' AND + d_year = 2002 AND + s_state IN ('TN', 'TN', 'TN', 'TN', 'TN', 'TN') +GROUP BY ROLLUP (i_item_id, s_state) +ORDER BY i_item_id, s_state +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q28.sql b/athena-tpcds/src/main/resources/queries/q28.sql new file mode 100644 index 0000000000..2b9b7a5360 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q28.sql @@ -0,0 +1,56 @@ +SELECT * +FROM (SELECT + avg(ss_list_price) B1_LP, + count(ss_list_price) B1_CNT, + count(DISTINCT ss_list_price) B1_CNTD +FROM store_sales[ TABLE_SUFFIX ] +WHERE ss_quantity BETWEEN 0 AND 5 + AND (ss_list_price BETWEEN 8 AND 8 + 10 + OR ss_coupon_amt BETWEEN 459 AND 459 + 1000 + OR ss_wholesale_cost BETWEEN 57 AND 57 + 20)) B1, + (SELECT + avg(ss_list_price) B2_LP, + count(ss_list_price) B2_CNT, + count(DISTINCT ss_list_price) B2_CNTD + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 6 AND 10 + AND (ss_list_price BETWEEN 90 AND 90 + 10 + OR ss_coupon_amt BETWEEN 2323 AND 2323 + 1000 + OR ss_wholesale_cost BETWEEN 31 AND 31 + 20)) B2, + (SELECT + avg(ss_list_price) B3_LP, + count(ss_list_price) B3_CNT, + count(DISTINCT ss_list_price) B3_CNTD + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 11 AND 15 + AND (ss_list_price BETWEEN 142 AND 142 + 10 + OR ss_coupon_amt BETWEEN 12214 AND 12214 + 1000 + OR ss_wholesale_cost BETWEEN 79 AND 79 + 20)) B3, + (SELECT + avg(ss_list_price) B4_LP, + count(ss_list_price) B4_CNT, + count(DISTINCT ss_list_price) B4_CNTD + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 16 AND 20 + AND (ss_list_price BETWEEN 135 AND 135 + 10 + OR ss_coupon_amt BETWEEN 6071 AND 6071 + 1000 + OR ss_wholesale_cost BETWEEN 38 AND 38 + 20)) B4, + (SELECT + avg(ss_list_price) B5_LP, + count(ss_list_price) B5_CNT, + count(DISTINCT ss_list_price) B5_CNTD + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 21 AND 25 + AND (ss_list_price BETWEEN 122 AND 122 + 10 + OR ss_coupon_amt BETWEEN 836 AND 836 + 1000 + OR ss_wholesale_cost BETWEEN 17 AND 17 + 20)) B5, + (SELECT + avg(ss_list_price) B6_LP, + count(ss_list_price) B6_CNT, + count(DISTINCT ss_list_price) B6_CNTD + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 26 AND 30 + AND (ss_list_price BETWEEN 154 AND 154 + 10 + OR ss_coupon_amt BETWEEN 7326 AND 7326 + 1000 + OR ss_wholesale_cost BETWEEN 7 AND 7 + 20)) B6 +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q29.sql b/athena-tpcds/src/main/resources/queries/q29.sql new file mode 100644 index 0000000000..f6f0c0dd4a 
--- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q29.sql @@ -0,0 +1,32 @@ +SELECT + i_item_id, + i_item_desc, + s_store_id, + s_store_name, + sum(ss_quantity) AS store_sales_quantity, + sum(sr_return_quantity) AS store_returns_quantity, + sum(cs_quantity) AS catalog_sales_quantity +FROM + store_sales[ TABLE_SUFFIX ], store_returns[ TABLE_SUFFIX ], catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] d1, date_dim[ TABLE_SUFFIX ] d2, + date_dim[ TABLE_SUFFIX ] d3, store[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] +WHERE + d1.d_moy = 9 + AND d1.d_year = 1999 + AND d1.d_date_sk = ss_sold_date_sk + AND i_item_sk = ss_item_sk + AND s_store_sk = ss_store_sk + AND ss_customer_sk = sr_customer_sk + AND ss_item_sk = sr_item_sk + AND ss_ticket_number = sr_ticket_number + AND sr_returned_date_sk = d2.d_date_sk + AND d2.d_moy BETWEEN 9 AND 9 + 3 + AND d2.d_year = 1999 + AND sr_customer_sk = cs_bill_customer_sk + AND sr_item_sk = cs_item_sk + AND cs_sold_date_sk = d3.d_date_sk + AND d3.d_year IN (1999, 1999 + 1, 1999 + 2) +GROUP BY + i_item_id, i_item_desc, s_store_id, s_store_name +ORDER BY + i_item_id, i_item_desc, s_store_id, s_store_name +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q3.sql b/athena-tpcds/src/main/resources/queries/q3.sql new file mode 100644 index 0000000000..4ec201ef16 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q3.sql @@ -0,0 +1,13 @@ +SELECT + dt.d_year, + it.i_brand_id brand_id, + it.i_brand brand, + SUM(ss_ext_sales_price) sum_agg +FROM date_dim[ TABLE_SUFFIX ] dt, store_sales[ TABLE_SUFFIX ] ss, item[ TABLE_SUFFIX ] it +WHERE dt.d_date_sk = ss.ss_sold_date_sk + AND ss.ss_item_sk = it.i_item_sk + AND it.i_manufact_id = 128 + AND dt.d_moy = 11 +GROUP BY dt.d_year, it.i_brand, it.i_brand_id +ORDER BY dt.d_year, sum_agg DESC, brand_id +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q30.sql b/athena-tpcds/src/main/resources/queries/q30.sql new file mode 100644 index 0000000000..986bef566d --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q30.sql @@ -0,0 +1,35 @@ +WITH customer_total_return AS +(SELECT + wr_returning_customer_sk AS ctr_customer_sk, + ca_state AS ctr_state, + sum(wr_return_amt) AS ctr_total_return + FROM web_returns, date_dim, customer_address + WHERE wr_returned_date_sk = d_date_sk + AND d_year = 2002 + AND wr_returning_addr_sk = ca_address_sk + GROUP BY wr_returning_customer_sk, ca_state) +SELECT + c_customer_id, + c_salutation, + c_first_name, + c_last_name, + c_preferred_cust_flag, + c_birth_day, + c_birth_month, + c_birth_year, + c_birth_country, + c_login, + c_email_address, + c_last_review_date, + ctr_total_return +FROM customer_total_return ctr1, customer_address, customer +WHERE ctr1.ctr_total_return > (SELECT avg(ctr_total_return) * 1.2 +FROM customer_total_return ctr2 +WHERE ctr1.ctr_state = ctr2.ctr_state) + AND ca_address_sk = c_current_addr_sk + AND ca_state = 'GA' + AND ctr1.ctr_customer_sk = c_customer_sk +ORDER BY c_customer_id, c_salutation, c_first_name, c_last_name, c_preferred_cust_flag + , c_birth_day, c_birth_month, c_birth_year, c_birth_country, c_login, c_email_address + , c_last_review_date, ctr_total_return +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q31.sql b/athena-tpcds/src/main/resources/queries/q31.sql new file mode 100644 index 0000000000..fec45d5ecc --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q31.sql @@ -0,0 +1,60 @@ +WITH ss AS +(SELECT + ca_county, + d_qoy, + d_year, + 
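+  -- store sales per county and quarter; the ws CTE below computes the same for web sales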
sum(ss_ext_sales_price) AS store_sales + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ] + WHERE ss_sold_date_sk = d_date_sk + AND ss_addr_sk = ca_address_sk + GROUP BY ca_county, d_qoy, d_year), + ws AS + (SELECT + ca_county, + d_qoy, + d_year, + sum(ws_ext_sales_price) AS web_sales + FROM web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ] + WHERE ws_sold_date_sk = d_date_sk + AND ws_bill_addr_sk = ca_address_sk + GROUP BY ca_county, d_qoy, d_year) +SELECT + ss1.ca_county, + ss1.d_year, + ws2.web_sales / ws1.web_sales web_q1_q2_increase, + ss2.store_sales / ss1.store_sales store_q1_q2_increase, + ws3.web_sales / ws2.web_sales web_q2_q3_increase, + ss3.store_sales / ss2.store_sales store_q2_q3_increase +FROM + ss ss1, ss ss2, ss ss3, ws ws1, ws ws2, ws ws3 +WHERE + ss1.d_qoy = 1 + AND ss1.d_year = 2000 + AND ss1.ca_county = ss2.ca_county + AND ss2.d_qoy = 2 + AND ss2.d_year = 2000 + AND ss2.ca_county = ss3.ca_county + AND ss3.d_qoy = 3 + AND ss3.d_year = 2000 + AND ss1.ca_county = ws1.ca_county + AND ws1.d_qoy = 1 + AND ws1.d_year = 2000 + AND ws1.ca_county = ws2.ca_county + AND ws2.d_qoy = 2 + AND ws2.d_year = 2000 + AND ws1.ca_county = ws3.ca_county + AND ws3.d_qoy = 3 + AND ws3.d_year = 2000 + AND CASE WHEN ws1.web_sales > 0 + THEN ws2.web_sales / ws1.web_sales + ELSE NULL END + > CASE WHEN ss1.store_sales > 0 + THEN ss2.store_sales / ss1.store_sales + ELSE NULL END + AND CASE WHEN ws2.web_sales > 0 + THEN ws3.web_sales / ws2.web_sales + ELSE NULL END + > CASE WHEN ss2.store_sales > 0 + THEN ss3.store_sales / ss2.store_sales + ELSE NULL END +ORDER BY ss1.ca_county diff --git a/athena-tpcds/src/main/resources/queries/q32.sql b/athena-tpcds/src/main/resources/queries/q32.sql new file mode 100644 index 0000000000..c6eb6d25d0 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q32.sql @@ -0,0 +1,15 @@ +SELECT 1 AS "excess discount amount " +FROM + catalog_sales, item, date_dim +WHERE + i_manufact_id = 977 + AND i_item_sk = cs_item_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-01-27' AS DATE) AND (cast('2000-01-27' AS DATE) + interval '90' day) + AND d_date_sk = cs_sold_date_sk + AND cs_ext_discount_amt > ( + SELECT 1.3 * avg(cs_ext_discount_amt) + FROM catalog_sales, date_dim + WHERE cs_item_sk = i_item_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-01-27' AS DATE) AND (cast('2000-01-27' AS DATE) + interval '90' day) + AND d_date_sk = cs_sold_date_sk) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q33.sql b/athena-tpcds/src/main/resources/queries/q33.sql new file mode 100644 index 0000000000..20e1cc9236 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q33.sql @@ -0,0 +1,65 @@ +WITH ss AS ( + SELECT + i_manufact_id, + sum(ss_ext_sales_price) total_sales + FROM + store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] + WHERE + i_manufact_id IN (SELECT i_manufact_id + FROM item[ TABLE_SUFFIX ] + WHERE i_category IN ('Electronics')) + AND ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND d_year = 1998 + AND d_moy = 5 + AND ss_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_manufact_id), cs AS +(SELECT + i_manufact_id, + sum(cs_ext_sales_price) total_sales + FROM catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] + WHERE + i_manufact_id IN ( + SELECT i_manufact_id + FROM item[ TABLE_SUFFIX ] + WHERE + i_category IN ('Electronics')) + 
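+  -- catalog-sales branch: same Electronics manufacturer filter as the ss CTE above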
AND cs_item_sk = i_item_sk + AND cs_sold_date_sk = d_date_sk + AND d_year = 1998 + AND d_moy = 5 + AND cs_bill_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_manufact_id), + ws AS ( + SELECT + i_manufact_id, + sum(ws_ext_sales_price) total_sales + FROM + web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer_address[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] + WHERE + i_manufact_id IN (SELECT i_manufact_id + FROM item[ TABLE_SUFFIX ] + WHERE i_category IN ('Electronics')) + AND ws_item_sk = i_item_sk + AND ws_sold_date_sk = d_date_sk + AND d_year = 1998 + AND d_moy = 5 + AND ws_bill_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_manufact_id) +SELECT + i_manufact_id, + sum(total_sales) total_sales +FROM (SELECT * + FROM ss + UNION ALL + SELECT * + FROM cs + UNION ALL + SELECT * + FROM ws) tmp1 +GROUP BY i_manufact_id +ORDER BY total_sales +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q34.sql b/athena-tpcds/src/main/resources/queries/q34.sql new file mode 100644 index 0000000000..c32c379f2f --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q34.sql @@ -0,0 +1,32 @@ +SELECT + c_last_name, + c_first_name, + c_salutation, + c_preferred_cust_flag, + ss_ticket_number, + cnt +FROM + (SELECT + ss_ticket_number, + ss_customer_sk, + count(*) cnt + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ], household_demographics[ TABLE_SUFFIX ] + WHERE store_sales[ TABLE_SUFFIX ].ss_sold_date_sk = date_dim[ TABLE_SUFFIX ].d_date_sk + AND store_sales[ TABLE_SUFFIX ].ss_store_sk = store[ TABLE_SUFFIX ].s_store_sk + AND store_sales[ TABLE_SUFFIX ].ss_hdemo_sk = household_demographics[ TABLE_SUFFIX ].hd_demo_sk + AND (date_dim[ TABLE_SUFFIX ].d_dom BETWEEN 1 AND 3 OR date_dim[ TABLE_SUFFIX ].d_dom BETWEEN 25 AND 28) + AND (household_demographics[ TABLE_SUFFIX ].hd_buy_potential = '>10000' OR + household_demographics[ TABLE_SUFFIX ].hd_buy_potential = 'unknown') + AND household_demographics[ TABLE_SUFFIX ].hd_vehicle_count > 0 + AND (CASE WHEN household_demographics[ TABLE_SUFFIX ].hd_vehicle_count > 0 + THEN household_demographics[ TABLE_SUFFIX ].hd_dep_count / household_demographics[ TABLE_SUFFIX ].hd_vehicle_count + ELSE NULL + END) > 1.2 + AND date_dim[ TABLE_SUFFIX ].d_year IN (1999, 1999 + 1, 1999 + 2) + AND store[ TABLE_SUFFIX ].s_county IN + ('Williamson County', 'Williamson County', 'Williamson County', 'Williamson County', + 'Williamson County', 'Williamson County', 'Williamson County', 'Williamson County') + GROUP BY ss_ticket_number, ss_customer_sk) dn, customer[ TABLE_SUFFIX ] +WHERE ss_customer_sk = c_customer_sk + AND cnt BETWEEN 15 AND 20 +ORDER BY c_last_name, c_first_name, c_salutation, c_preferred_cust_flag DESC diff --git a/athena-tpcds/src/main/resources/queries/q35.sql b/athena-tpcds/src/main/resources/queries/q35.sql new file mode 100644 index 0000000000..cfe4342d8b --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q35.sql @@ -0,0 +1,46 @@ +SELECT + ca_state, + cd_gender, + cd_marital_status, + count(*) cnt1, + min(cd_dep_count), + max(cd_dep_count), + avg(cd_dep_count), + cd_dep_employed_count, + count(*) cnt2, + min(cd_dep_employed_count), + max(cd_dep_employed_count), + avg(cd_dep_employed_count), + cd_dep_college_count, + count(*) cnt3, + min(cd_dep_college_count), + max(cd_dep_college_count), + avg(cd_dep_college_count) +FROM + customer c, customer_address ca, customer_demographics +WHERE + c.c_current_addr_sk = ca.ca_address_sk AND + cd_demo_sk = c.c_current_cdemo_sk AND + exists(SELECT * + FROM
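+  -- customers must have at least one store sale in the first three quarters of 2002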
store_sales, date_dim + WHERE c.c_customer_sk = ss_customer_sk AND + ss_sold_date_sk = d_date_sk AND + d_year = 2002 AND + d_qoy < 4) AND + (exists(SELECT * + FROM web_sales, date_dim + WHERE c.c_customer_sk = ws_bill_customer_sk AND + ws_sold_date_sk = d_date_sk AND + d_year = 2002 AND + d_qoy < 4) OR + exists(SELECT * + FROM catalog_sales, date_dim + WHERE c.c_customer_sk = cs_ship_customer_sk AND + cs_sold_date_sk = d_date_sk AND + d_year = 2002 AND + d_qoy < 4)) +GROUP BY ca_state, cd_gender, cd_marital_status, cd_dep_count, + cd_dep_employed_count, cd_dep_college_count +ORDER BY ca_state, cd_gender, cd_marital_status, cd_dep_count, + cd_dep_employed_count, cd_dep_college_count +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q37.sql b/athena-tpcds/src/main/resources/queries/q37.sql new file mode 100644 index 0000000000..b87f65467e --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q37.sql @@ -0,0 +1,15 @@ +SELECT + i_item_id, + i_item_desc, + i_current_price +FROM item, inventory, date_dim, catalog_sales +WHERE i_current_price BETWEEN 68 AND 68 + 30 + AND inv_item_sk = i_item_sk + AND d_date_sk = inv_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-02-01' AS DATE) AND (cast('2000-02-01' AS DATE) + INTERVAL '60' day) + AND i_manufact_id IN (677, 940, 694, 808) + AND inv_quantity_on_hand BETWEEN 100 AND 500 + AND cs_item_sk = i_item_sk +GROUP BY i_item_id, i_item_desc, i_current_price +ORDER BY i_item_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q38.sql b/athena-tpcds/src/main/resources/queries/q38.sql new file mode 100644 index 0000000000..ae1a40d538 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q38.sql @@ -0,0 +1,30 @@ +SELECT count(*) +FROM ( + SELECT DISTINCT + c_last_name, + c_first_name, + d_date + FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] + WHERE store_sales[ TABLE_SUFFIX ].ss_sold_date_sk = date_dim[ TABLE_SUFFIX ].d_date_sk + AND store_sales[ TABLE_SUFFIX ].ss_customer_sk = customer[ TABLE_SUFFIX ].c_customer_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + INTERSECT + SELECT DISTINCT + c_last_name, + c_first_name, + d_date + FROM catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] + WHERE catalog_sales[ TABLE_SUFFIX ].cs_sold_date_sk = date_dim[ TABLE_SUFFIX ].d_date_sk + AND catalog_sales[ TABLE_SUFFIX ].cs_bill_customer_sk = customer[ TABLE_SUFFIX ].c_customer_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + INTERSECT + SELECT DISTINCT + c_last_name, + c_first_name, + d_date + FROM web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] + WHERE web_sales[ TABLE_SUFFIX ].ws_sold_date_sk = date_dim[ TABLE_SUFFIX ].d_date_sk + AND web_sales[ TABLE_SUFFIX ].ws_bill_customer_sk = customer[ TABLE_SUFFIX ].c_customer_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + ) hot_cust +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q39a.sql b/athena-tpcds/src/main/resources/queries/q39a.sql new file mode 100644 index 0000000000..f6f6c7f035 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q39a.sql @@ -0,0 +1,47 @@ +WITH inv AS +(SELECT + w_warehouse_name, + w_warehouse_sk, + i_item_sk, + d_moy, + stdev, + mean, + CASE mean + WHEN 0 + THEN NULL + ELSE stdev / mean END cov + FROM (SELECT + w_warehouse_name, + w_warehouse_sk, + i_item_sk, + d_moy, + stddev_samp(inv_quantity_on_hand) stdev, + avg(inv_quantity_on_hand) mean + FROM inventory[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], 
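+  -- inner query: monthly mean and stddev of on-hand quantity per warehouse/item; the outer filter keeps items whose coefficient of variation (stdev / mean) exceeds 1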
warehouse[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE inv_item_sk = i_item_sk + AND inv_warehouse_sk = w_warehouse_sk + AND inv_date_sk = d_date_sk + AND d_year = 2001 + GROUP BY w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy) foo + WHERE CASE mean + WHEN 0 + THEN 0 + ELSE stdev / mean END > 1) +SELECT + inv1.w_warehouse_sk, + inv1.i_item_sk, + inv1.d_moy, + inv1.mean, + inv1.cov, + inv2.w_warehouse_sk, + inv2.i_item_sk, + inv2.d_moy, + inv2.mean, + inv2.cov +FROM inv inv1, inv inv2 +WHERE inv1.i_item_sk = inv2.i_item_sk + AND inv1.w_warehouse_sk = inv2.w_warehouse_sk + AND inv1.d_moy = 1 + AND inv2.d_moy = 1 + 1 +ORDER BY inv1.w_warehouse_sk, inv1.i_item_sk, inv1.d_moy, inv1.mean, inv1.cov + , inv2.d_moy, inv2.mean, inv2.cov \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q39b.sql b/athena-tpcds/src/main/resources/queries/q39b.sql new file mode 100644 index 0000000000..898e4dbe7c --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q39b.sql @@ -0,0 +1,49 @@ +WITH inv AS +(SELECT + w_warehouse_name, + w_warehouse_sk, + i_item_sk, + d_moy, + stdev, + mean, + CASE mean + WHEN 0 + THEN NULL + ELSE stdev / mean END cov + FROM (SELECT + w_warehouse_name, + w_warehouse_sk, + i_item_sk, + d_moy, + stddev_samp(inv_quantity_on_hand) stdev, + avg(inv_quantity_on_hand) mean + FROM inventory, item, warehouse, date_dim + WHERE inv_item_sk = i_item_sk + AND inv_warehouse_sk = w_warehouse_sk + AND inv_date_sk = d_date_sk + AND d_year = 2001 + GROUP BY w_warehouse_name, w_warehouse_sk, i_item_sk, d_moy) foo + WHERE CASE mean + WHEN 0 + THEN 0 + ELSE stdev / mean END > 1) +SELECT + inv1.w_warehouse_sk, + inv1.i_item_sk, + inv1.d_moy, + inv1.mean, + inv1.cov, + inv2.w_warehouse_sk, + inv2.i_item_sk, + inv2.d_moy, + inv2.mean, + inv2.cov +FROM inv inv1, inv inv2 +WHERE inv1.i_item_sk = inv2.i_item_sk + AND inv1.w_warehouse_sk = inv2.w_warehouse_sk + AND inv1.d_moy = 1 + AND inv2.d_moy = 1 + 1 + AND inv1.cov > 1.5 +ORDER BY inv1.w_warehouse_sk, inv1.i_item_sk, inv1.d_moy, inv1.mean, inv1.cov + , inv2.d_moy, inv2.mean, inv2.cov + diff --git a/athena-tpcds/src/main/resources/queries/q4.sql b/athena-tpcds/src/main/resources/queries/q4.sql new file mode 100644 index 0000000000..e6779a8bfb --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q4.sql @@ -0,0 +1,120 @@ +WITH year_total AS ( + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + c_preferred_cust_flag customer_preferred_cust_flag, + c_birth_country customer_birth_country, + c_login customer_login, + c_email_address customer_email_address, + d_year dyear, + sum(((ss_ext_list_price - ss_ext_wholesale_cost - ss_ext_discount_amt) + + ss_ext_sales_price) / 2) year_total, + 's' sale_type + FROM customer[ TABLE_SUFFIX ], store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c_customer_sk = ss_customer_sk AND ss_sold_date_sk = d_date_sk + GROUP BY c_customer_id, + c_first_name, + c_last_name, + c_preferred_cust_flag, + c_birth_country, + c_login, + c_email_address, + d_year + UNION ALL + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + c_preferred_cust_flag customer_preferred_cust_flag, + c_birth_country customer_birth_country, + c_login customer_login, + c_email_address customer_email_address, + d_year dyear, + sum((((cs_ext_list_price - cs_ext_wholesale_cost - cs_ext_discount_amt) + + cs_ext_sales_price) / 2)) year_total, + 'c' sale_type + FROM customer[ TABLE_SUFFIX ], 
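+  -- catalog branch of year_total: per-customer yearly total computed the same way, tagged sale_type 'c'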
catalog_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c_customer_sk = cs_bill_customer_sk AND cs_sold_date_sk = d_date_sk + GROUP BY c_customer_id, + c_first_name, + c_last_name, + c_preferred_cust_flag, + c_birth_country, + c_login, + c_email_address, + d_year + UNION ALL + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + c_preferred_cust_flag customer_preferred_cust_flag, + c_birth_country customer_birth_country, + c_login customer_login, + c_email_address customer_email_address, + d_year dyear, + sum((((ws_ext_list_price - ws_ext_wholesale_cost - ws_ext_discount_amt) + ws_ext_sales_price) / + 2)) year_total, + 'w' sale_type + FROM customer[ TABLE_SUFFIX ], web_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ] + WHERE c_customer_sk = ws_bill_customer_sk AND ws_sold_date_sk = d_date_sk + GROUP BY c_customer_id, + c_first_name, + c_last_name, + c_preferred_cust_flag, + c_birth_country, + c_login, + c_email_address, + d_year) +SELECT + t_s_secyear.customer_id, + t_s_secyear.customer_first_name, + t_s_secyear.customer_last_name, + t_s_secyear.customer_preferred_cust_flag, + t_s_secyear.customer_birth_country, + t_s_secyear.customer_login, + t_s_secyear.customer_email_address +FROM year_total t_s_firstyear, year_total t_s_secyear, year_total t_c_firstyear, + year_total t_c_secyear, year_total t_w_firstyear, year_total t_w_secyear +WHERE t_s_secyear.customer_id = t_s_firstyear.customer_id + AND t_s_firstyear.customer_id = t_c_secyear.customer_id + AND t_s_firstyear.customer_id = t_c_firstyear.customer_id + AND t_s_firstyear.customer_id = t_w_firstyear.customer_id + AND t_s_firstyear.customer_id = t_w_secyear.customer_id + AND t_s_firstyear.sale_type = 's' + AND t_c_firstyear.sale_type = 'c' + AND t_w_firstyear.sale_type = 'w' + AND t_s_secyear.sale_type = 's' + AND t_c_secyear.sale_type = 'c' + AND t_w_secyear.sale_type = 'w' + AND t_s_firstyear.dyear = 2001 + AND t_s_secyear.dyear = 2001 + 1 + AND t_c_firstyear.dyear = 2001 + AND t_c_secyear.dyear = 2001 + 1 + AND t_w_firstyear.dyear = 2001 + AND t_w_secyear.dyear = 2001 + 1 + AND t_s_firstyear.year_total > 0 + AND t_c_firstyear.year_total > 0 + AND t_w_firstyear.year_total > 0 + AND CASE WHEN t_c_firstyear.year_total > 0 + THEN t_c_secyear.year_total / t_c_firstyear.year_total + ELSE NULL END + > CASE WHEN t_s_firstyear.year_total > 0 + THEN t_s_secyear.year_total / t_s_firstyear.year_total + ELSE NULL END + AND CASE WHEN t_c_firstyear.year_total > 0 + THEN t_c_secyear.year_total / t_c_firstyear.year_total + ELSE NULL END + > CASE WHEN t_w_firstyear.year_total > 0 + THEN t_w_secyear.year_total / t_w_firstyear.year_total + ELSE NULL END +ORDER BY + t_s_secyear.customer_id, + t_s_secyear.customer_first_name, + t_s_secyear.customer_last_name, + t_s_secyear.customer_preferred_cust_flag, + t_s_secyear.customer_birth_country, + t_s_secyear.customer_login, + t_s_secyear.customer_email_address +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q40.sql b/athena-tpcds/src/main/resources/queries/q40.sql new file mode 100644 index 0000000000..f6f1f2fda8 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q40.sql @@ -0,0 +1,25 @@ +SELECT + w_state, + i_item_id, + sum(CASE WHEN (cast(d_date AS DATE) < cast('2000-03-11' AS DATE)) + THEN cs_sales_price - coalesce(cr_refunded_cash, 0) + ELSE 0 END) AS sales_before, + sum(CASE WHEN (cast(d_date AS DATE) >= cast('2000-03-11' AS DATE)) + THEN cs_sales_price - coalesce(cr_refunded_cash, 0) + ELSE 0 END) AS sales_after +FROM + 
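+  -- q40 measures catalog sales net of refunds in the 30 days before and after 2000-03-11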
catalog_sales + LEFT OUTER JOIN catalog_returns ON + (cs_order_number = cr_order_number + AND cs_item_sk = cr_item_sk) + , warehouse, item, date_dim +WHERE + i_current_price BETWEEN 0.99 AND 1.49 + AND i_item_sk = cs_item_sk + AND cs_warehouse_sk = w_warehouse_sk + AND cs_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN (cast('2000-03-11' AS DATE) - INTERVAL '30' day) + AND (cast('2000-03-11' AS DATE) + INTERVAL '30' day) +GROUP BY w_state, i_item_id +ORDER BY w_state, i_item_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q41.sql b/athena-tpcds/src/main/resources/queries/q41.sql new file mode 100644 index 0000000000..25e317e0e2 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q41.sql @@ -0,0 +1,49 @@ +SELECT DISTINCT (i_product_name) +FROM item i1 +WHERE i_manufact_id BETWEEN 738 AND 738 + 40 + AND (SELECT count(*) AS item_cnt +FROM item +WHERE (i_manufact = i1.i_manufact AND + ((i_category = 'Women' AND + (i_color = 'powder' OR i_color = 'khaki') AND + (i_units = 'Ounce' OR i_units = 'Oz') AND + (i_size = 'medium' OR i_size = 'extra large') + ) OR + (i_category = 'Women' AND + (i_color = 'brown' OR i_color = 'honeydew') AND + (i_units = 'Bunch' OR i_units = 'Ton') AND + (i_size = 'N/A' OR i_size = 'small') + ) OR + (i_category = 'Men' AND + (i_color = 'floral' OR i_color = 'deep') AND + (i_units = 'N/A' OR i_units = 'Dozen') AND + (i_size = 'petite' OR i_size = 'large') + ) OR + (i_category = 'Men' AND + (i_color = 'light' OR i_color = 'cornflower') AND + (i_units = 'Box' OR i_units = 'Pound') AND + (i_size = 'medium' OR i_size = 'extra large') + ))) OR + (i_manufact = i1.i_manufact AND + ((i_category = 'Women' AND + (i_color = 'midnight' OR i_color = 'snow') AND + (i_units = 'Pallet' OR i_units = 'Gross') AND + (i_size = 'medium' OR i_size = 'extra large') + ) OR + (i_category = 'Women' AND + (i_color = 'cyan' OR i_color = 'papaya') AND + (i_units = 'Cup' OR i_units = 'Dram') AND + (i_size = 'N/A' OR i_size = 'small') + ) OR + (i_category = 'Men' AND + (i_color = 'orange' OR i_color = 'frosted') AND + (i_units = 'Each' OR i_units = 'Tbl') AND + (i_size = 'petite' OR i_size = 'large') + ) OR + (i_category = 'Men' AND + (i_color = 'forest' OR i_color = 'ghost') AND + (i_units = 'Lb' OR i_units = 'Bundle') AND + (i_size = 'medium' OR i_size = 'extra large') + )))) > 0 +ORDER BY i_product_name +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q42.sql b/athena-tpcds/src/main/resources/queries/q42.sql new file mode 100644 index 0000000000..66a819597d --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q42.sql @@ -0,0 +1,18 @@ +SELECT + dt.d_year, + item[ TABLE_SUFFIX ].i_category_id, + item[ TABLE_SUFFIX ].i_category, + sum(ss_ext_sales_price) +FROM date_dim[ TABLE_SUFFIX ] dt, store_sales[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ] +WHERE dt.d_date_sk = store_sales[ TABLE_SUFFIX ].ss_sold_date_sk + AND store_sales[ TABLE_SUFFIX ].ss_item_sk = item[ TABLE_SUFFIX ].i_item_sk + AND item[ TABLE_SUFFIX ].i_manager_id = 1 + AND dt.d_moy = 11 + AND dt.d_year = 2000 +GROUP BY dt.d_year + , item[ TABLE_SUFFIX ].i_category_id + , item[ TABLE_SUFFIX ].i_category +ORDER BY sum(ss_ext_sales_price) DESC, dt.d_year + , item[ TABLE_SUFFIX ].i_category_id + , item[ TABLE_SUFFIX ].i_category +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q43.sql b/athena-tpcds/src/main/resources/queries/q43.sql new file mode 100644 index 0000000000..907c71fa54 --- /dev/null +++ 
b/athena-tpcds/src/main/resources/queries/q43.sql @@ -0,0 +1,33 @@ +SELECT + s_store_name, + s_store_id, + sum(CASE WHEN (d_day_name = 'Sunday') + THEN ss_sales_price + ELSE NULL END) sun_sales, + sum(CASE WHEN (d_day_name = 'Monday') + THEN ss_sales_price + ELSE NULL END) mon_sales, + sum(CASE WHEN (d_day_name = 'Tuesday') + THEN ss_sales_price + ELSE NULL END) tue_sales, + sum(CASE WHEN (d_day_name = 'Wednesday') + THEN ss_sales_price + ELSE NULL END) wed_sales, + sum(CASE WHEN (d_day_name = 'Thursday') + THEN ss_sales_price + ELSE NULL END) thu_sales, + sum(CASE WHEN (d_day_name = 'Friday') + THEN ss_sales_price + ELSE NULL END) fri_sales, + sum(CASE WHEN (d_day_name = 'Saturday') + THEN ss_sales_price + ELSE NULL END) sat_sales +FROM date_dim[ TABLE_SUFFIX ], store_sales[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ] +WHERE d_date_sk = ss_sold_date_sk AND + s_store_sk = ss_store_sk AND + s_gmt_offset = -5 AND + d_year = 2000 +GROUP BY s_store_name, s_store_id +ORDER BY s_store_name, s_store_id, sun_sales, mon_sales, tue_sales, wed_sales, + thu_sales, fri_sales, sat_sales +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q44.sql b/athena-tpcds/src/main/resources/queries/q44.sql new file mode 100644 index 0000000000..eaadc64af1 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q44.sql @@ -0,0 +1,46 @@ +SELECT + ascending.rnk, + i1.i_product_name best_performing, + i2.i_product_name worst_performing +FROM (SELECT * +FROM (SELECT + item_sk, + rank() + OVER ( + ORDER BY rank_col ASC) rnk +FROM (SELECT + ss_item_sk item_sk, + avg(ss_net_profit) rank_col +FROM store_sales[ TABLE_SUFFIX ] ss1 +WHERE ss_store_sk = 4 +GROUP BY ss_item_sk +HAVING avg(ss_net_profit) > 0.9 * (SELECT avg(ss_net_profit) rank_col +FROM store_sales[ TABLE_SUFFIX ] +WHERE ss_store_sk = 4 + AND ss_addr_sk IS NULL +GROUP BY ss_store_sk)) V1) V11 +WHERE rnk < 11) ascending, + (SELECT * + FROM (SELECT + item_sk, + rank() + OVER ( + ORDER BY rank_col DESC) rnk + FROM (SELECT + ss_item_sk item_sk, + avg(ss_net_profit) rank_col + FROM store_sales[ TABLE_SUFFIX ] ss1 + WHERE ss_store_sk = 4 + GROUP BY ss_item_sk + HAVING avg(ss_net_profit) > 0.9 * (SELECT avg(ss_net_profit) rank_col + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_store_sk = 4 + AND ss_addr_sk IS NULL + GROUP BY ss_store_sk)) V2) V21 + WHERE rnk < 11) descending, + item[ TABLE_SUFFIX ] i1, item[ TABLE_SUFFIX ] i2 +WHERE ascending.rnk = descending.rnk + AND i1.i_item_sk = ascending.item_sk + AND i2.i_item_sk = descending.item_sk +ORDER BY ascending.rnk +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q45.sql b/athena-tpcds/src/main/resources/queries/q45.sql new file mode 100644 index 0000000000..907438f196 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q45.sql @@ -0,0 +1,21 @@ +SELECT + ca_zip, + ca_city, + sum(ws_sales_price) +FROM web_sales, customer, customer_address, date_dim, item +WHERE ws_bill_customer_sk = c_customer_sk + AND c_current_addr_sk = ca_address_sk + AND ws_item_sk = i_item_sk + AND (substr(ca_zip, 1, 5) IN + ('85669', '86197', '88274', '83405', '86475', '85392', '85460', '80348', '81792') + OR + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_item_sk IN (2, 3, 5, 7, 11, 13, 17, 19, 23, 29) + ) +) + AND ws_sold_date_sk = d_date_sk + AND d_qoy = 2 AND d_year = 2001 +GROUP BY ca_zip, ca_city +ORDER BY ca_zip, ca_city +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q46.sql b/athena-tpcds/src/main/resources/queries/q46.sql new file mode 100644 index 0000000000..0911677dff --- /dev/null +++
b/athena-tpcds/src/main/resources/queries/q46.sql @@ -0,0 +1,32 @@ +SELECT + c_last_name, + c_first_name, + ca_city, + bought_city, + ss_ticket_number, + amt, + profit +FROM + (SELECT + ss_ticket_number, + ss_customer_sk, + ca_city bought_city, + sum(ss_coupon_amt) amt, + sum(ss_net_profit) profit + FROM store_sales, date_dim, store, household_demographics, customer_address + WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk + AND store_sales.ss_store_sk = store.s_store_sk + AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk + AND store_sales.ss_addr_sk = customer_address.ca_address_sk + AND (household_demographics.hd_dep_count = 4 OR + household_demographics.hd_vehicle_count = 3) + AND date_dim.d_dow IN (6, 0) + AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2) + AND store.s_city IN ('Fairview', 'Midway', 'Fairview', 'Fairview', 'Fairview') + GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn, customer, + customer_address current_addr +WHERE ss_customer_sk = c_customer_sk + AND customer.c_current_addr_sk = current_addr.ca_address_sk + AND current_addr.ca_city <> bought_city +ORDER BY c_last_name, c_first_name, ca_city, bought_city, ss_ticket_number +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q47.sql b/athena-tpcds/src/main/resources/queries/q47.sql new file mode 100644 index 0000000000..cfc37a4cec --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q47.sql @@ -0,0 +1,63 @@ +WITH v1 AS ( + SELECT + i_category, + i_brand, + s_store_name, + s_company_name, + d_year, + d_moy, + sum(ss_sales_price) sum_sales, + avg(sum(ss_sales_price)) + OVER + (PARTITION BY i_category, i_brand, + s_store_name, s_company_name, d_year) + avg_monthly_sales, + rank() + OVER + (PARTITION BY i_category, i_brand, + s_store_name, s_company_name + ORDER BY d_year, d_moy) rn + FROM item, store_sales, date_dim, store + WHERE ss_item_sk = i_item_sk AND + ss_sold_date_sk = d_date_sk AND + ss_store_sk = s_store_sk AND + ( + d_year = 1999 OR + (d_year = 1999 - 1 AND d_moy = 12) OR + (d_year = 1999 + 1 AND d_moy = 1) + ) + GROUP BY i_category, i_brand, + s_store_name, s_company_name, + d_year, d_moy), + v2 AS ( + SELECT + v1.i_category, + v1.i_brand, + v1.s_store_name, + v1.s_company_name, + v1.d_year, + v1.d_moy, + v1.avg_monthly_sales, + v1.sum_sales, + v1_lag.sum_sales psum, + v1_lead.sum_sales nsum + FROM v1, v1 v1_lag, v1 v1_lead + WHERE v1.i_category = v1_lag.i_category AND + v1.i_category = v1_lead.i_category AND + v1.i_brand = v1_lag.i_brand AND + v1.i_brand = v1_lead.i_brand AND + v1.s_store_name = v1_lag.s_store_name AND + v1.s_store_name = v1_lead.s_store_name AND + v1.s_company_name = v1_lag.s_company_name AND + v1.s_company_name = v1_lead.s_company_name AND + v1.rn = v1_lag.rn + 1 AND + v1.rn = v1_lead.rn - 1) +SELECT * +FROM v2 +WHERE d_year = 1999 AND + avg_monthly_sales > 0 AND + CASE WHEN avg_monthly_sales > 0 + THEN abs(sum_sales - avg_monthly_sales) / avg_monthly_sales + ELSE NULL END > 0.1 +ORDER BY sum_sales - avg_monthly_sales, 3 +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q48.sql b/athena-tpcds/src/main/resources/queries/q48.sql new file mode 100644 index 0000000000..fdb9f38e29 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q48.sql @@ -0,0 +1,63 @@ +SELECT sum(ss_quantity) +FROM store_sales, store, customer_demographics, customer_address, date_dim +WHERE s_store_sk = ss_store_sk + AND ss_sold_date_sk = d_date_sk AND d_year = 2001 + AND + ( + ( + cd_demo_sk = ss_cdemo_sk + AND + cd_marital_status = 'M' + AND + 
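+  -- q48: each demographic band (marital status / education) is paired with its own sales-price range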
cd_education_status = '4 yr Degree' + AND + ss_sales_price BETWEEN 100.00 AND 150.00 + ) + OR + ( + cd_demo_sk = ss_cdemo_sk + AND + cd_marital_status = 'D' + AND + cd_education_status = '2 yr Degree' + AND + ss_sales_price BETWEEN 50.00 AND 100.00 + ) + OR + ( + cd_demo_sk = ss_cdemo_sk + AND + cd_marital_status = 'S' + AND + cd_education_status = 'College' + AND + ss_sales_price BETWEEN 150.00 AND 200.00 + ) + ) + AND + ( + ( + ss_addr_sk = ca_address_sk + AND + ca_country = 'United States' + AND + ca_state IN ('CO', 'OH', 'TX') + AND ss_net_profit BETWEEN 0 AND 2000 + ) + OR + (ss_addr_sk = ca_address_sk + AND + ca_country = 'United States' + AND + ca_state IN ('OR', 'MN', 'KY') + AND ss_net_profit BETWEEN 150 AND 3000 + ) + OR + (ss_addr_sk = ca_address_sk + AND + ca_country = 'United States' + AND + ca_state IN ('VA', 'CA', 'MS') + AND ss_net_profit BETWEEN 50 AND 25000 + ) + ) diff --git a/athena-tpcds/src/main/resources/queries/q49.sql b/athena-tpcds/src/main/resources/queries/q49.sql new file mode 100644 index 0000000000..9568d8b92d --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q49.sql @@ -0,0 +1,126 @@ +SELECT + 'web' AS channel, + web.item, + web.return_ratio, + web.return_rank, + web.currency_rank +FROM ( + SELECT + item, + return_ratio, + currency_ratio, + rank() + OVER ( + ORDER BY return_ratio) AS return_rank, + rank() + OVER ( + ORDER BY currency_ratio) AS currency_rank + FROM + (SELECT + ws.ws_item_sk AS item, + (cast(sum(coalesce(wr.wr_return_quantity, 0)) AS DECIMAL(15, 4)) / + cast(sum(coalesce(ws.ws_quantity, 0)) AS DECIMAL(15, 4))) AS return_ratio, + (cast(sum(coalesce(wr.wr_return_amt, 0)) AS DECIMAL(15, 4)) / + cast(sum(coalesce(ws.ws_net_paid, 0)) AS DECIMAL(15, 4))) AS currency_ratio + FROM + web_sales ws LEFT OUTER JOIN web_returns wr + ON (ws.ws_order_number = wr.wr_order_number AND + ws.ws_item_sk = wr.wr_item_sk) + , date_dim + WHERE + wr.wr_return_amt > 10000 + AND ws.ws_net_profit > 1 + AND ws.ws_net_paid > 0 + AND ws.ws_quantity > 0 + AND ws_sold_date_sk = d_date_sk + AND d_year = 2001 + AND d_moy = 12 + GROUP BY ws.ws_item_sk + ) in_web + ) web +WHERE (web.return_rank <= 10 OR web.currency_rank <= 10) +UNION +SELECT + 'catalog' AS channel, + catalog.item, + catalog.return_ratio, + catalog.return_rank, + catalog.currency_rank +FROM ( + SELECT + item, + return_ratio, + currency_ratio, + rank() + OVER ( + ORDER BY return_ratio) AS return_rank, + rank() + OVER ( + ORDER BY currency_ratio) AS currency_rank + FROM + (SELECT + cs.cs_item_sk AS item, + (cast(sum(coalesce(cr.cr_return_quantity, 0)) AS DECIMAL(15, 4)) / + cast(sum(coalesce(cs.cs_quantity, 0)) AS DECIMAL(15, 4))) AS return_ratio, + (cast(sum(coalesce(cr.cr_return_amount, 0)) AS DECIMAL(15, 4)) / + cast(sum(coalesce(cs.cs_net_paid, 0)) AS DECIMAL(15, 4))) AS currency_ratio + FROM + catalog_sales cs LEFT OUTER JOIN catalog_returns cr + ON (cs.cs_order_number = cr.cr_order_number AND + cs.cs_item_sk = cr.cr_item_sk) + , date_dim + WHERE + cr.cr_return_amount > 10000 + AND cs.cs_net_profit > 1 + AND cs.cs_net_paid > 0 + AND cs.cs_quantity > 0 + AND cs_sold_date_sk = d_date_sk + AND d_year = 2001 + AND d_moy = 12 + GROUP BY cs.cs_item_sk + ) in_cat + ) catalog +WHERE (catalog.return_rank <= 10 OR catalog.currency_rank <= 10) +UNION +SELECT + 'store' AS channel, + store.item, + store.return_ratio, + store.return_rank, + store.currency_rank +FROM ( + SELECT + item, + return_ratio, + currency_ratio, + rank() + OVER ( + ORDER BY return_ratio) AS return_rank, + rank() + OVER ( + ORDER BY 
currency_ratio) AS currency_rank + FROM + (SELECT + sts.ss_item_sk AS item, + (cast(sum(coalesce(sr.sr_return_quantity, 0)) AS DECIMAL(15, 4)) / + cast(sum(coalesce(sts.ss_quantity, 0)) AS DECIMAL(15, 4))) AS return_ratio, + (cast(sum(coalesce(sr.sr_return_amt, 0)) AS DECIMAL(15, 4)) / + cast(sum(coalesce(sts.ss_net_paid, 0)) AS DECIMAL(15, 4))) AS currency_ratio + FROM + store_sales sts LEFT OUTER JOIN store_returns sr + ON (sts.ss_ticket_number = sr.sr_ticket_number AND sts.ss_item_sk = sr.sr_item_sk) + , date_dim + WHERE + sr.sr_return_amt > 10000 + AND sts.ss_net_profit > 1 + AND sts.ss_net_paid > 0 + AND sts.ss_quantity > 0 + AND ss_sold_date_sk = d_date_sk + AND d_year = 2001 + AND d_moy = 12 + GROUP BY sts.ss_item_sk + ) in_store + ) store +WHERE (store.return_rank <= 10 OR store.currency_rank <= 10) +ORDER BY 1, 4, 5 +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q5.sql b/athena-tpcds/src/main/resources/queries/q5.sql new file mode 100644 index 0000000000..f5f0d20308 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q5.sql @@ -0,0 +1,131 @@ +WITH ssr AS +( SELECT + s_store_id, + sum(sales_price) AS sales, + sum(profit) AS profit, + sum(return_amt) AS RETURNS, + sum(net_loss) AS profit_loss + FROM + (SELECT + ss_store_sk AS store_sk, + ss_sold_date_sk AS date_sk, + ss_ext_sales_price AS sales_price, + ss_net_profit AS profit, + cast(0 AS DECIMAL(7, 2)) AS return_amt, + cast(0 AS DECIMAL(7, 2)) AS net_loss + FROM store_sales + UNION ALL + SELECT + sr_store_sk AS store_sk, + sr_returned_date_sk AS date_sk, + cast(0 AS DECIMAL(7, 2)) AS sales_price, + cast(0 AS DECIMAL(7, 2)) AS profit, + sr_return_amt AS return_amt, + sr_net_loss AS net_loss + FROM store_returns) + salesreturns, date_dim, store + WHERE date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-23' AS DATE) + AND ((cast('2000-08-23' AS DATE) + INTERVAL '14' day)) + AND store_sk = s_store_sk + GROUP BY s_store_id), + csr AS + ( SELECT + cp_catalog_page_id, + sum(sales_price) AS sales, + sum(profit) AS profit, + sum(return_amt) AS RETURNS, + sum(net_loss) AS profit_loss + FROM + (SELECT + cs_catalog_page_sk AS page_sk, + cs_sold_date_sk AS date_sk, + cs_ext_sales_price AS sales_price, + cs_net_profit AS profit, + cast(0 AS DECIMAL(7, 2)) AS return_amt, + cast(0 AS DECIMAL(7, 2)) AS net_loss + FROM catalog_sales + UNION ALL + SELECT + cr_catalog_page_sk AS page_sk, + cr_returned_date_sk AS date_sk, + cast(0 AS DECIMAL(7, 2)) AS sales_price, + cast(0 AS DECIMAL(7, 2)) AS profit, + cr_return_amount AS return_amt, + cr_net_loss AS net_loss + FROM catalog_returns + ) salesreturns, date_dim, catalog_page + WHERE date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-23' AS DATE) + AND ((cast('2000-08-23' AS DATE) + INTERVAL '14' day)) + AND page_sk = cp_catalog_page_sk + GROUP BY cp_catalog_page_id) + , + wsr AS + ( SELECT + web_site_id, + sum(sales_price) AS sales, + sum(profit) AS profit, + sum(return_amt) AS RETURNS, + sum(net_loss) AS profit_loss + FROM + (SELECT + ws_web_site_sk AS wsr_web_site_sk, + ws_sold_date_sk AS date_sk, + ws_ext_sales_price AS sales_price, + ws_net_profit AS profit, + cast(0 AS DECIMAL(7, 2)) AS return_amt, + cast(0 AS DECIMAL(7, 2)) AS net_loss + FROM web_sales + UNION ALL + SELECT + ws_web_site_sk AS wsr_web_site_sk, + wr_returned_date_sk AS date_sk, + cast(0 AS DECIMAL(7, 2)) AS sales_price, + cast(0 AS DECIMAL(7, 2)) AS profit, + wr_return_amt AS return_amt, + wr_net_loss AS net_loss + FROM web_returns + LEFT OUTER JOIN web_sales ON + 
(wr_item_sk = ws_item_sk + AND wr_order_number = ws_order_number) + ) salesreturns, date_dim, web_site + WHERE date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-23' AS DATE) + AND ((cast('2000-08-23' AS DATE) + INTERVAL '14' day)) + AND wsr_web_site_sk = web_site_sk + GROUP BY web_site_id) +SELECT + channel, + id, + sum(sales) AS sales, + sum(returns) AS returns, + sum(profit) AS profit +FROM + (SELECT + 'store channel' AS channel, + concat('store', s_store_id) AS id, + sales, + returns, + (profit - profit_loss) AS profit + FROM ssr + UNION ALL + SELECT + 'catalog channel' AS channel, + concat('catalog_page', cp_catalog_page_id) AS id, + sales, + returns, + (profit - profit_loss) AS profit + FROM csr + UNION ALL + SELECT + 'web channel' AS channel, + concat('web_site', web_site_id) AS id, + sales, + returns, + (profit - profit_loss) AS profit + FROM wsr + ) x +GROUP BY ROLLUP (channel, id) +ORDER BY channel, id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q50.sql b/athena-tpcds/src/main/resources/queries/q50.sql new file mode 100644 index 0000000000..776d1d2865 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q50.sql @@ -0,0 +1,47 @@ +SELECT + s_store_name, + s_company_id, + s_street_number, + s_street_name, + s_street_type, + s_suite_number, + s_city, + s_county, + s_state, + s_zip, + sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk <= 30) + THEN 1 + ELSE 0 END) AS "30 days ", + sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 30) AND + (sr_returned_date_sk - ss_sold_date_sk <= 60) + THEN 1 + ELSE 0 END) AS "31 - 60 days ", + sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 60) AND + (sr_returned_date_sk - ss_sold_date_sk <= 90) + THEN 1 + ELSE 0 END) AS "61 - 90 days ", + sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 90) AND + (sr_returned_date_sk - ss_sold_date_sk <= 120) + THEN 1 + ELSE 0 END) AS "91 - 120 days ", + sum(CASE WHEN (sr_returned_date_sk - ss_sold_date_sk > 120) + THEN 1 + ELSE 0 END) AS ">120 days " +FROM + store_sales, store_returns, store, date_dim d1, date_dim d2 +WHERE + d2.d_year = 2001 + AND d2.d_moy = 8 + AND ss_ticket_number = sr_ticket_number + AND ss_item_sk = sr_item_sk + AND ss_sold_date_sk = d1.d_date_sk + AND sr_returned_date_sk = d2.d_date_sk + AND ss_customer_sk = sr_customer_sk + AND ss_store_sk = s_store_sk +GROUP BY + s_store_name, s_company_id, s_street_number, s_street_name, s_street_type, + s_suite_number, s_city, s_county, s_state, s_zip +ORDER BY + s_store_name, s_company_id, s_street_number, s_street_name, s_street_type, + s_suite_number, s_city, s_county, s_state, s_zip +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q51.sql b/athena-tpcds/src/main/resources/queries/q51.sql new file mode 100644 index 0000000000..62b003eb67 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q51.sql @@ -0,0 +1,55 @@ +WITH web_v1 AS ( + SELECT + ws_item_sk item_sk, + d_date, + sum(sum(ws_sales_price)) + OVER (PARTITION BY ws_item_sk + ORDER BY d_date + ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) cume_sales + FROM web_sales, date_dim + WHERE ws_sold_date_sk = d_date_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + AND ws_item_sk IS NOT NULL + GROUP BY ws_item_sk, d_date), + store_v1 AS ( + SELECT + ss_item_sk item_sk, + d_date, + sum(sum(ss_sales_price)) + OVER (PARTITION BY ss_item_sk + ORDER BY d_date + ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) cume_sales + FROM store_sales, date_dim + WHERE ss_sold_date_sk = d_date_sk + AND d_month_seq BETWEEN 1200 AND 
1200 + 11 + AND ss_item_sk IS NOT NULL + GROUP BY ss_item_sk, d_date) +SELECT * +FROM (SELECT + item_sk, + d_date, + web_sales, + store_sales, + max(web_sales) + OVER (PARTITION BY item_sk + ORDER BY d_date + ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) web_cumulative, + max(store_sales) + OVER (PARTITION BY item_sk + ORDER BY d_date + ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) store_cumulative +FROM (SELECT + CASE WHEN web.item_sk IS NOT NULL + THEN web.item_sk + ELSE store.item_sk END item_sk, + CASE WHEN web.d_date IS NOT NULL + THEN web.d_date + ELSE store.d_date END d_date, + web.cume_sales web_sales, + store.cume_sales store_sales +FROM web_v1 web FULL OUTER JOIN store_v1 store ON (web.item_sk = store.item_sk + AND web.d_date = store.d_date) + ) x) y +WHERE web_cumulative > store_cumulative +ORDER BY item_sk, d_date +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q52.sql b/athena-tpcds/src/main/resources/queries/q52.sql new file mode 100644 index 0000000000..467d1ae050 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q52.sql @@ -0,0 +1,14 @@ +SELECT + dt.d_year, + item.i_brand_id brand_id, + item.i_brand brand, + sum(ss_ext_sales_price) ext_price +FROM date_dim dt, store_sales, item +WHERE dt.d_date_sk = store_sales.ss_sold_date_sk + AND store_sales.ss_item_sk = item.i_item_sk + AND item.i_manager_id = 1 + AND dt.d_moy = 11 + AND dt.d_year = 2000 +GROUP BY dt.d_year, item.i_brand, item.i_brand_id +ORDER BY dt.d_year, ext_price DESC, brand_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q53.sql b/athena-tpcds/src/main/resources/queries/q53.sql new file mode 100644 index 0000000000..b42c68dcf8 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q53.sql @@ -0,0 +1,30 @@ +SELECT * +FROM + (SELECT + i_manufact_id, + sum(ss_sales_price) sum_sales, + avg(sum(ss_sales_price)) + OVER (PARTITION BY i_manufact_id) avg_quarterly_sales + FROM item, store_sales, date_dim, store + WHERE ss_item_sk = i_item_sk AND + ss_sold_date_sk = d_date_sk AND + ss_store_sk = s_store_sk AND + d_month_seq IN (1200, 1200 + 1, 1200 + 2, 1200 + 3, 1200 + 4, 1200 + 5, 1200 + 6, + 1200 + 7, 1200 + 8, 1200 + 9, 1200 + 10, 1200 + 11) AND + ((i_category IN ('Books', 'Children', 'Electronics') AND + i_class IN ('personal', 'portable', 'reference', 'self-help') AND + i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7', + 'exportiunivamalg #9', 'scholaramalgamalg #9')) + OR + (i_category IN ('Women', 'Music', 'Men') AND + i_class IN ('accessories', 'classical', 'fragrances', 'pants') AND + i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1', + 'importoamalg #1'))) + GROUP BY i_manufact_id, d_qoy) tmp1 +WHERE CASE WHEN avg_quarterly_sales > 0 + THEN abs(sum_sales - avg_quarterly_sales) / avg_quarterly_sales + ELSE NULL END > 0.1 +ORDER BY avg_quarterly_sales, + sum_sales, + i_manufact_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q54.sql b/athena-tpcds/src/main/resources/queries/q54.sql new file mode 100644 index 0000000000..2508677bb7 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q54.sql @@ -0,0 +1,61 @@ +WITH my_customers AS ( + SELECT DISTINCT + c_customer_sk, + c_current_addr_sk + FROM + (SELECT + cs_sold_date_sk sold_date_sk, + cs_bill_customer_sk customer_sk, + cs_item_sk item_sk + FROM catalog_sales + UNION ALL + SELECT + ws_sold_date_sk sold_date_sk, + ws_bill_customer_sk customer_sk, + ws_item_sk item_sk + FROM web_sales + ) cs_or_ws_sales, + item, + date_dim, + customer + WHERE 
sold_date_sk = d_date_sk + AND item_sk = i_item_sk + AND i_category = 'Women' + AND i_class = 'maternity' + AND c_customer_sk = cs_or_ws_sales.customer_sk + AND d_moy = 12 + AND d_year = 1998 +) + , my_revenue AS ( + SELECT + c_customer_sk, + sum(ss_ext_sales_price) AS revenue + FROM my_customers, + store_sales, + customer_address, + store, + date_dim + WHERE c_current_addr_sk = ca_address_sk + AND ca_county = s_county + AND ca_state = s_state + AND ss_sold_date_sk = d_date_sk + AND c_customer_sk = ss_customer_sk + AND d_month_seq BETWEEN (SELECT DISTINCT d_month_seq + 1 + FROM date_dim + WHERE d_year = 1998 AND d_moy = 12) + AND (SELECT DISTINCT d_month_seq + 3 + FROM date_dim + WHERE d_year = 1998 AND d_moy = 12) + GROUP BY c_customer_sk +) + , segments AS +(SELECT cast((revenue / 50) AS INTEGER) AS segment + FROM my_revenue) +SELECT + segment, + count(*) AS num_customers, + segment * 50 AS segment_base +FROM segments +GROUP BY segment +ORDER BY segment, num_customers +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q55.sql b/athena-tpcds/src/main/resources/queries/q55.sql new file mode 100644 index 0000000000..bc5d888c9a --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q55.sql @@ -0,0 +1,13 @@ +SELECT + i_brand_id brand_id, + i_brand brand, + sum(ss_ext_sales_price) ext_price +FROM date_dim, store_sales, item +WHERE d_date_sk = ss_sold_date_sk + AND ss_item_sk = i_item_sk + AND i_manager_id = 28 + AND d_moy = 11 + AND d_year = 1999 +GROUP BY i_brand, i_brand_id +ORDER BY ext_price DESC, brand_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q56.sql b/athena-tpcds/src/main/resources/queries/q56.sql new file mode 100644 index 0000000000..2fa1738dcf --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q56.sql @@ -0,0 +1,65 @@ +WITH ss AS ( + SELECT + i_item_id, + sum(ss_ext_sales_price) total_sales + FROM + store_sales, date_dim, customer_address, item + WHERE + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_color IN ('slate', 'blanched', 'burnished')) + AND ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND d_year = 2001 + AND d_moy = 2 + AND ss_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_item_id), + cs AS ( + SELECT + i_item_id, + sum(cs_ext_sales_price) total_sales + FROM + catalog_sales, date_dim, customer_address, item + WHERE + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_color IN ('slate', 'blanched', 'burnished')) + AND cs_item_sk = i_item_sk + AND cs_sold_date_sk = d_date_sk + AND d_year = 2001 + AND d_moy = 2 + AND cs_bill_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_item_id), + ws AS ( + SELECT + i_item_id, + sum(ws_ext_sales_price) total_sales + FROM + web_sales, date_dim, customer_address, item + WHERE + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_color IN ('slate', 'blanched', 'burnished')) + AND ws_item_sk = i_item_sk + AND ws_sold_date_sk = d_date_sk + AND d_year = 2001 + AND d_moy = 2 + AND ws_bill_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_item_id) +SELECT + i_item_id, + sum(total_sales) total_sales +FROM (SELECT * + FROM ss + UNION ALL + SELECT * + FROM cs + UNION ALL + SELECT * + FROM ws) tmp1 +GROUP BY i_item_id +ORDER BY total_sales +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q57.sql b/athena-tpcds/src/main/resources/queries/q57.sql new file mode 100644 index 0000000000..cf70d4b905 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q57.sql @@ -0,0 +1,56 @@ +WITH v1 AS ( + SELECT + i_category, + 
i_brand, + cc_name, + d_year, + d_moy, + sum(cs_sales_price) sum_sales, + avg(sum(cs_sales_price)) + OVER + (PARTITION BY i_category, i_brand, cc_name, d_year) + avg_monthly_sales, + rank() + OVER + (PARTITION BY i_category, i_brand, cc_name + ORDER BY d_year, d_moy) rn + FROM item, catalog_sales, date_dim, call_center + WHERE cs_item_sk = i_item_sk AND + cs_sold_date_sk = d_date_sk AND + cc_call_center_sk = cs_call_center_sk AND + ( + d_year = 1999 OR + (d_year = 1999 - 1 AND d_moy = 12) OR + (d_year = 1999 + 1 AND d_moy = 1) + ) + GROUP BY i_category, i_brand, + cc_name, d_year, d_moy), + v2 AS ( + SELECT + v1.i_category, + v1.i_brand, + v1.cc_name, + v1.d_year, + v1.d_moy, + v1.avg_monthly_sales, + v1.sum_sales, + v1_lag.sum_sales psum, + v1_lead.sum_sales nsum + FROM v1, v1 v1_lag, v1 v1_lead + WHERE v1.i_category = v1_lag.i_category AND + v1.i_category = v1_lead.i_category AND + v1.i_brand = v1_lag.i_brand AND + v1.i_brand = v1_lead.i_brand AND + v1.cc_name = v1_lag.cc_name AND + v1.cc_name = v1_lead.cc_name AND + v1.rn = v1_lag.rn + 1 AND + v1.rn = v1_lead.rn - 1) +SELECT * +FROM v2 +WHERE d_year = 1999 AND + avg_monthly_sales > 0 AND + CASE WHEN avg_monthly_sales > 0 + THEN abs(sum_sales - avg_monthly_sales) / avg_monthly_sales + ELSE NULL END > 0.1 +ORDER BY sum_sales - avg_monthly_sales, 3 +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q59.sql b/athena-tpcds/src/main/resources/queries/q59.sql new file mode 100644 index 0000000000..3cef202768 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q59.sql @@ -0,0 +1,75 @@ +WITH wss AS +(SELECT + d_week_seq, + ss_store_sk, + sum(CASE WHEN (d_day_name = 'Sunday') + THEN ss_sales_price + ELSE NULL END) sun_sales, + sum(CASE WHEN (d_day_name = 'Monday') + THEN ss_sales_price + ELSE NULL END) mon_sales, + sum(CASE WHEN (d_day_name = 'Tuesday') + THEN ss_sales_price + ELSE NULL END) tue_sales, + sum(CASE WHEN (d_day_name = 'Wednesday') + THEN ss_sales_price + ELSE NULL END) wed_sales, + sum(CASE WHEN (d_day_name = 'Thursday') + THEN ss_sales_price + ELSE NULL END) thu_sales, + sum(CASE WHEN (d_day_name = 'Friday') + THEN ss_sales_price + ELSE NULL END) fri_sales, + sum(CASE WHEN (d_day_name = 'Saturday') + THEN ss_sales_price + ELSE NULL END) sat_sales + FROM store_sales, date_dim + WHERE d_date_sk = ss_sold_date_sk + GROUP BY d_week_seq, ss_store_sk +) +SELECT + s_store_name1, + s_store_id1, + d_week_seq1, + sun_sales1 / sun_sales2, + mon_sales1 / mon_sales2, + tue_sales1 / tue_sales2, + wed_sales1 / wed_sales2, + thu_sales1 / thu_sales2, + fri_sales1 / fri_sales2, + sat_sales1 / sat_sales2 +FROM + (SELECT + s_store_name s_store_name1, + wss.d_week_seq d_week_seq1, + s_store_id s_store_id1, + sun_sales sun_sales1, + mon_sales mon_sales1, + tue_sales tue_sales1, + wed_sales wed_sales1, + thu_sales thu_sales1, + fri_sales fri_sales1, + sat_sales sat_sales1 + FROM wss, store, date_dim d + WHERE d.d_week_seq = wss.d_week_seq AND + ss_store_sk = s_store_sk AND + d_month_seq BETWEEN 1212 AND 1212 + 11) y, + (SELECT + s_store_name s_store_name2, + wss.d_week_seq d_week_seq2, + s_store_id s_store_id2, + sun_sales sun_sales2, + mon_sales mon_sales2, + tue_sales tue_sales2, + wed_sales wed_sales2, + thu_sales thu_sales2, + fri_sales fri_sales2, + sat_sales sat_sales2 + FROM wss, store, date_dim d + WHERE d.d_week_seq = wss.d_week_seq AND + ss_store_sk = s_store_sk AND + d_month_seq BETWEEN 1212 + 12 AND 1212 + 23) x +WHERE s_store_id1 = s_store_id2 + AND d_week_seq1 = d_week_seq2 - 52 +ORDER BY s_store_name1, s_store_id1, 
d_week_seq1 +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q6.sql b/athena-tpcds/src/main/resources/queries/q6.sql new file mode 100644 index 0000000000..a2ffec762e --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q6.sql @@ -0,0 +1,21 @@ +SELECT + a.ca_state state, + count(*) cnt +FROM + customer_address[ TABLE_SUFFIX ] a, customer[ TABLE_SUFFIX ] c, store_sales[ TABLE_SUFFIX ] s, date_dim[ TABLE_SUFFIX ] d, item[ TABLE_SUFFIX ] i +WHERE a.ca_address_sk = c.c_current_addr_sk + AND c.c_customer_sk = s.ss_customer_sk + AND s.ss_sold_date_sk = d.d_date_sk + AND s.ss_item_sk = i.i_item_sk + AND d.d_month_seq = + (SELECT DISTINCT (d_month_seq) + FROM date_dim[ TABLE_SUFFIX ] + WHERE d_year = 2000 AND d_moy = 1) + AND i.i_current_price > 1.2 * + (SELECT avg(j.i_current_price) + FROM item[ TABLE_SUFFIX ] j + WHERE j.i_category = i.i_category) +GROUP BY a.ca_state +HAVING count(*) >= 10 +ORDER BY cnt +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q60.sql b/athena-tpcds/src/main/resources/queries/q60.sql new file mode 100644 index 0000000000..41b963f44b --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q60.sql @@ -0,0 +1,62 @@ +WITH ss AS ( + SELECT + i_item_id, + sum(ss_ext_sales_price) total_sales + FROM store_sales, date_dim, customer_address, item + WHERE + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_category IN ('Music')) + AND ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND d_year = 1998 + AND d_moy = 9 + AND ss_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_item_id), + cs AS ( + SELECT + i_item_id, + sum(cs_ext_sales_price) total_sales + FROM catalog_sales, date_dim, customer_address, item + WHERE + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_category IN ('Music')) + AND cs_item_sk = i_item_sk + AND cs_sold_date_sk = d_date_sk + AND d_year = 1998 + AND d_moy = 9 + AND cs_bill_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_item_id), + ws AS ( + SELECT + i_item_id, + sum(ws_ext_sales_price) total_sales + FROM web_sales, date_dim, customer_address, item + WHERE + i_item_id IN (SELECT i_item_id + FROM item + WHERE i_category IN ('Music')) + AND ws_item_sk = i_item_sk + AND ws_sold_date_sk = d_date_sk + AND d_year = 1998 + AND d_moy = 9 + AND ws_bill_addr_sk = ca_address_sk + AND ca_gmt_offset = -5 + GROUP BY i_item_id) +SELECT + i_item_id, + sum(total_sales) total_sales +FROM (SELECT * + FROM ss + UNION ALL + SELECT * + FROM cs + UNION ALL + SELECT * + FROM ws) tmp1 +GROUP BY i_item_id +ORDER BY i_item_id, total_sales +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q61.sql b/athena-tpcds/src/main/resources/queries/q61.sql new file mode 100644 index 0000000000..b0a872b4b8 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q61.sql @@ -0,0 +1,33 @@ +SELECT + promotions, + total, + cast(promotions AS DECIMAL(15, 4)) / cast(total AS DECIMAL(15, 4)) * 100 +FROM + (SELECT sum(ss_ext_sales_price) promotions + FROM store_sales, store, promotion, date_dim, customer, customer_address, item + WHERE ss_sold_date_sk = d_date_sk + AND ss_store_sk = s_store_sk + AND ss_promo_sk = p_promo_sk + AND ss_customer_sk = c_customer_sk + AND ca_address_sk = c_current_addr_sk + AND ss_item_sk = i_item_sk + AND ca_gmt_offset = -5 + AND i_category = 'Jewelry' + AND (p_channel_dmail = 'Y' OR p_channel_email = 'Y' OR p_channel_tv = 'Y') + AND s_gmt_offset = -5 + AND d_year = 1998 + AND d_moy = 11) promotional_sales, + (SELECT sum(ss_ext_sales_price) total + FROM store_sales, store, 
date_dim, customer, customer_address, item + WHERE ss_sold_date_sk = d_date_sk + AND ss_store_sk = s_store_sk + AND ss_customer_sk = c_customer_sk + AND ca_address_sk = c_current_addr_sk + AND ss_item_sk = i_item_sk + AND ca_gmt_offset = -5 + AND i_category = 'Jewelry' + AND s_gmt_offset = -5 + AND d_year = 1998 + AND d_moy = 11) all_sales +ORDER BY promotions, total +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q62.sql b/athena-tpcds/src/main/resources/queries/q62.sql new file mode 100644 index 0000000000..c8779b59cd --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q62.sql @@ -0,0 +1,35 @@ +SELECT + substr(w_warehouse_name, 1, 20), + sm_type, + web_name, + sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk <= 30) + THEN 1 + ELSE 0 END) AS "30 days ", + sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 30) AND + (ws_ship_date_sk - ws_sold_date_sk <= 60) + THEN 1 + ELSE 0 END) AS "31 - 60 days ", + sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 60) AND + (ws_ship_date_sk - ws_sold_date_sk <= 90) + THEN 1 + ELSE 0 END) AS "61 - 90 days ", + sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 90) AND + (ws_ship_date_sk - ws_sold_date_sk <= 120) + THEN 1 + ELSE 0 END) AS "91 - 120 days ", + sum(CASE WHEN (ws_ship_date_sk - ws_sold_date_sk > 120) + THEN 1 + ELSE 0 END) AS ">120 days " +FROM + web_sales, warehouse, ship_mode, web_site, date_dim +WHERE + d_month_seq BETWEEN 1200 AND 1200 + 11 + AND ws_ship_date_sk = d_date_sk + AND ws_warehouse_sk = w_warehouse_sk + AND ws_ship_mode_sk = sm_ship_mode_sk + AND ws_web_site_sk = web_site_sk +GROUP BY + substr(w_warehouse_name, 1, 20), sm_type, web_name +ORDER BY + substr(w_warehouse_name, 1, 20), sm_type, web_name +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q63.sql b/athena-tpcds/src/main/resources/queries/q63.sql new file mode 100644 index 0000000000..ef6867e0a9 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q63.sql @@ -0,0 +1,31 @@ +SELECT * +FROM (SELECT + i_manager_id, + sum(ss_sales_price) sum_sales, + avg(sum(ss_sales_price)) + OVER (PARTITION BY i_manager_id) avg_monthly_sales +FROM item + , store_sales + , date_dim + , store +WHERE ss_item_sk = i_item_sk + AND ss_sold_date_sk = d_date_sk + AND ss_store_sk = s_store_sk + AND d_month_seq IN (1200, 1200 + 1, 1200 + 2, 1200 + 3, 1200 + 4, 1200 + 5, 1200 + 6, 1200 + 7, + 1200 + 8, 1200 + 9, 1200 + 10, 1200 + 11) + AND ((i_category IN ('Books', 'Children', 'Electronics') + AND i_class IN ('personal', 'portable', 'reference', 'self-help') + AND i_brand IN ('scholaramalgamalg #14', 'scholaramalgamalg #7', + 'exportiunivamalg #9', 'scholaramalgamalg #9')) + OR (i_category IN ('Women', 'Music', 'Men') + AND i_class IN ('accessories', 'classical', 'fragrances', 'pants') + AND i_brand IN ('amalgimporto #1', 'edu packscholar #1', 'exportiimporto #1', + 'importoamalg #1'))) +GROUP BY i_manager_id, d_moy) tmp1 +WHERE CASE WHEN avg_monthly_sales > 0 + THEN abs(sum_sales - avg_monthly_sales) / avg_monthly_sales + ELSE NULL END > 0.1 +ORDER BY i_manager_id + , avg_monthly_sales + , sum_sales +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q64.sql b/athena-tpcds/src/main/resources/queries/q64.sql new file mode 100644 index 0000000000..8ec1d31b61 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q64.sql @@ -0,0 +1,92 @@ +WITH cs_ui AS +(SELECT + cs_item_sk, + sum(cs_ext_list_price) AS sale, + sum(cr_refunded_cash + cr_reversed_charge + cr_store_credit) AS refund + FROM catalog_sales + , catalog_returns + WHERE cs_item_sk = 
cr_item_sk + AND cs_order_number = cr_order_number + GROUP BY cs_item_sk + HAVING sum(cs_ext_list_price) > 2 * sum(cr_refunded_cash + cr_reversed_charge + cr_store_credit)), + cross_sales AS + (SELECT + i_product_name product_name, + i_item_sk item_sk, + s_store_name store_name, + s_zip store_zip, + ad1.ca_street_number b_street_number, + ad1.ca_street_name b_street_name, + ad1.ca_city b_city, + ad1.ca_zip b_zip, + ad2.ca_street_number c_street_number, + ad2.ca_street_name c_street_name, + ad2.ca_city c_city, + ad2.ca_zip c_zip, + d1.d_year AS syear, + d2.d_year AS fsyear, + d3.d_year s2year, + count(*) cnt, + sum(ss_wholesale_cost) s1, + sum(ss_list_price) s2, + sum(ss_coupon_amt) s3 + FROM store_sales, store_returns, cs_ui, date_dim d1, date_dim d2, date_dim d3, + store, customer, customer_demographics cd1, customer_demographics cd2, + promotion, household_demographics hd1, household_demographics hd2, + customer_address ad1, customer_address ad2, income_band ib1, income_band ib2, item + WHERE ss_store_sk = s_store_sk AND + ss_sold_date_sk = d1.d_date_sk AND + ss_customer_sk = c_customer_sk AND + ss_cdemo_sk = cd1.cd_demo_sk AND + ss_hdemo_sk = hd1.hd_demo_sk AND + ss_addr_sk = ad1.ca_address_sk AND + ss_item_sk = i_item_sk AND + ss_item_sk = sr_item_sk AND + ss_ticket_number = sr_ticket_number AND + ss_item_sk = cs_ui.cs_item_sk AND + c_current_cdemo_sk = cd2.cd_demo_sk AND + c_current_hdemo_sk = hd2.hd_demo_sk AND + c_current_addr_sk = ad2.ca_address_sk AND + c_first_sales_date_sk = d2.d_date_sk AND + c_first_shipto_date_sk = d3.d_date_sk AND + ss_promo_sk = p_promo_sk AND + hd1.hd_income_band_sk = ib1.ib_income_band_sk AND + hd2.hd_income_band_sk = ib2.ib_income_band_sk AND + cd1.cd_marital_status <> cd2.cd_marital_status AND + i_color IN ('purple', 'burlywood', 'indian', 'spring', 'floral', 'medium') AND + i_current_price BETWEEN 64 AND 64 + 10 AND + i_current_price BETWEEN 64 + 1 AND 64 + 15 + GROUP BY i_product_name, i_item_sk, s_store_name, s_zip, ad1.ca_street_number, + ad1.ca_street_name, ad1.ca_city, ad1.ca_zip, ad2.ca_street_number, + ad2.ca_street_name, ad2.ca_city, ad2.ca_zip, d1.d_year, d2.d_year, d3.d_year + ) +SELECT + cs1.product_name, + cs1.store_name, + cs1.store_zip, + cs1.b_street_number, + cs1.b_street_name, + cs1.b_city, + cs1.b_zip, + cs1.c_street_number, + cs1.c_street_name, + cs1.c_city, + cs1.c_zip, + cs1.syear, + cs1.cnt, + cs1.s1, + cs1.s2, + cs1.s3, + cs2.s1, + cs2.s2, + cs2.s3, + cs2.syear, + cs2.cnt +FROM cross_sales cs1, cross_sales cs2 +WHERE cs1.item_sk = cs2.item_sk AND + cs1.syear = 1999 AND + cs2.syear = 1999 + 1 AND + cs2.cnt <= cs1.cnt AND + cs1.store_name = cs2.store_name AND + cs1.store_zip = cs2.store_zip +ORDER BY cs1.product_name, cs1.store_name, cs2.cnt diff --git a/athena-tpcds/src/main/resources/queries/q65.sql b/athena-tpcds/src/main/resources/queries/q65.sql new file mode 100644 index 0000000000..aad04be1bc --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q65.sql @@ -0,0 +1,33 @@ +SELECT + s_store_name, + i_item_desc, + sc.revenue, + i_current_price, + i_wholesale_cost, + i_brand +FROM store, item, + (SELECT + ss_store_sk, + avg(revenue) AS ave + FROM + (SELECT + ss_store_sk, + ss_item_sk, + sum(ss_sales_price) AS revenue + FROM store_sales, date_dim + WHERE ss_sold_date_sk = d_date_sk AND d_month_seq BETWEEN 1176 AND 1176 + 11 + GROUP BY ss_store_sk, ss_item_sk) sa + GROUP BY ss_store_sk) sb, + (SELECT + ss_store_sk, + ss_item_sk, + sum(ss_sales_price) AS revenue + FROM store_sales, date_dim + WHERE ss_sold_date_sk = d_date_sk 
AND d_month_seq BETWEEN 1176 AND 1176 + 11 + GROUP BY ss_store_sk, ss_item_sk) sc +WHERE sb.ss_store_sk = sc.ss_store_sk AND + sc.revenue <= 0.1 * sb.ave AND + s_store_sk = sc.ss_store_sk AND + i_item_sk = sc.ss_item_sk +ORDER BY s_store_name, i_item_desc +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q66.sql b/athena-tpcds/src/main/resources/queries/q66.sql new file mode 100644 index 0000000000..f826b41643 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q66.sql @@ -0,0 +1,240 @@ +SELECT + w_warehouse_name, + w_warehouse_sq_ft, + w_city, + w_county, + w_state, + w_country, + ship_carriers, + year, + sum(jan_sales) AS jan_sales, + sum(feb_sales) AS feb_sales, + sum(mar_sales) AS mar_sales, + sum(apr_sales) AS apr_sales, + sum(may_sales) AS may_sales, + sum(jun_sales) AS jun_sales, + sum(jul_sales) AS jul_sales, + sum(aug_sales) AS aug_sales, + sum(sep_sales) AS sep_sales, + sum(oct_sales) AS oct_sales, + sum(nov_sales) AS nov_sales, + sum(dec_sales) AS dec_sales, + sum(jan_sales / w_warehouse_sq_ft) AS jan_sales_per_sq_foot, + sum(feb_sales / w_warehouse_sq_ft) AS feb_sales_per_sq_foot, + sum(mar_sales / w_warehouse_sq_ft) AS mar_sales_per_sq_foot, + sum(apr_sales / w_warehouse_sq_ft) AS apr_sales_per_sq_foot, + sum(may_sales / w_warehouse_sq_ft) AS may_sales_per_sq_foot, + sum(jun_sales / w_warehouse_sq_ft) AS jun_sales_per_sq_foot, + sum(jul_sales / w_warehouse_sq_ft) AS jul_sales_per_sq_foot, + sum(aug_sales / w_warehouse_sq_ft) AS aug_sales_per_sq_foot, + sum(sep_sales / w_warehouse_sq_ft) AS sep_sales_per_sq_foot, + sum(oct_sales / w_warehouse_sq_ft) AS oct_sales_per_sq_foot, + sum(nov_sales / w_warehouse_sq_ft) AS nov_sales_per_sq_foot, + sum(dec_sales / w_warehouse_sq_ft) AS dec_sales_per_sq_foot, + sum(jan_net) AS jan_net, + sum(feb_net) AS feb_net, + sum(mar_net) AS mar_net, + sum(apr_net) AS apr_net, + sum(may_net) AS may_net, + sum(jun_net) AS jun_net, + sum(jul_net) AS jul_net, + sum(aug_net) AS aug_net, + sum(sep_net) AS sep_net, + sum(oct_net) AS oct_net, + sum(nov_net) AS nov_net, + sum(dec_net) AS dec_net +FROM ( + (SELECT + w_warehouse_name, + w_warehouse_sq_ft, + w_city, + w_county, + w_state, + w_country, + concat('DHL', ',', 'BARIAN') AS ship_carriers, + d_year AS year, + sum(CASE WHEN d_moy = 1 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS jan_sales, + sum(CASE WHEN d_moy = 2 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS feb_sales, + sum(CASE WHEN d_moy = 3 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS mar_sales, + sum(CASE WHEN d_moy = 4 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS apr_sales, + sum(CASE WHEN d_moy = 5 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS may_sales, + sum(CASE WHEN d_moy = 6 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS jun_sales, + sum(CASE WHEN d_moy = 7 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS jul_sales, + sum(CASE WHEN d_moy = 8 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS aug_sales, + sum(CASE WHEN d_moy = 9 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS sep_sales, + sum(CASE WHEN d_moy = 10 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS oct_sales, + sum(CASE WHEN d_moy = 11 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS nov_sales, + sum(CASE WHEN d_moy = 12 + THEN ws_ext_sales_price * ws_quantity + ELSE 0 END) AS dec_sales, + sum(CASE WHEN d_moy = 1 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS jan_net, + sum(CASE WHEN d_moy = 2 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS 
feb_net, + sum(CASE WHEN d_moy = 3 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS mar_net, + sum(CASE WHEN d_moy = 4 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS apr_net, + sum(CASE WHEN d_moy = 5 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS may_net, + sum(CASE WHEN d_moy = 6 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS jun_net, + sum(CASE WHEN d_moy = 7 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS jul_net, + sum(CASE WHEN d_moy = 8 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS aug_net, + sum(CASE WHEN d_moy = 9 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS sep_net, + sum(CASE WHEN d_moy = 10 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS oct_net, + sum(CASE WHEN d_moy = 11 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS nov_net, + sum(CASE WHEN d_moy = 12 + THEN ws_net_paid * ws_quantity + ELSE 0 END) AS dec_net + FROM + web_sales, warehouse, date_dim, time_dim, ship_mode + WHERE + ws_warehouse_sk = w_warehouse_sk + AND ws_sold_date_sk = d_date_sk + AND ws_sold_time_sk = t_time_sk + AND ws_ship_mode_sk = sm_ship_mode_sk + AND d_year = 2001 + AND t_time BETWEEN 30838 AND 30838 + 28800 + AND sm_carrier IN ('DHL', 'BARIAN') + GROUP BY + w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year) + UNION ALL + (SELECT + w_warehouse_name, + w_warehouse_sq_ft, + w_city, + w_county, + w_state, + w_country, + concat('DHL', ',', 'BARIAN') AS ship_carriers, + d_year AS year, + sum(CASE WHEN d_moy = 1 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS jan_sales, + sum(CASE WHEN d_moy = 2 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS feb_sales, + sum(CASE WHEN d_moy = 3 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS mar_sales, + sum(CASE WHEN d_moy = 4 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS apr_sales, + sum(CASE WHEN d_moy = 5 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS may_sales, + sum(CASE WHEN d_moy = 6 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS jun_sales, + sum(CASE WHEN d_moy = 7 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS jul_sales, + sum(CASE WHEN d_moy = 8 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS aug_sales, + sum(CASE WHEN d_moy = 9 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS sep_sales, + sum(CASE WHEN d_moy = 10 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS oct_sales, + sum(CASE WHEN d_moy = 11 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS nov_sales, + sum(CASE WHEN d_moy = 12 + THEN cs_sales_price * cs_quantity + ELSE 0 END) AS dec_sales, + sum(CASE WHEN d_moy = 1 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS jan_net, + sum(CASE WHEN d_moy = 2 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS feb_net, + sum(CASE WHEN d_moy = 3 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS mar_net, + sum(CASE WHEN d_moy = 4 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS apr_net, + sum(CASE WHEN d_moy = 5 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS may_net, + sum(CASE WHEN d_moy = 6 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS jun_net, + sum(CASE WHEN d_moy = 7 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS jul_net, + sum(CASE WHEN d_moy = 8 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS aug_net, + sum(CASE WHEN d_moy = 9 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS sep_net, + sum(CASE WHEN d_moy = 10 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS oct_net, + sum(CASE WHEN d_moy = 11 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS 
nov_net, + sum(CASE WHEN d_moy = 12 + THEN cs_net_paid_inc_tax * cs_quantity + ELSE 0 END) AS dec_net + FROM + catalog_sales, warehouse, date_dim, time_dim, ship_mode + WHERE + cs_warehouse_sk = w_warehouse_sk + AND cs_sold_date_sk = d_date_sk + AND cs_sold_time_sk = t_time_sk + AND cs_ship_mode_sk = sm_ship_mode_sk + AND d_year = 2001 + AND t_time BETWEEN 30838 AND 30838 + 28800 + AND sm_carrier IN ('DHL', 'BARIAN') + GROUP BY + w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, d_year + ) + ) x +GROUP BY + w_warehouse_name, w_warehouse_sq_ft, w_city, w_county, w_state, w_country, + ship_carriers, year +ORDER BY w_warehouse_name +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q67.sql b/athena-tpcds/src/main/resources/queries/q67.sql new file mode 100644 index 0000000000..f66e2252bd --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q67.sql @@ -0,0 +1,38 @@ +SELECT * +FROM + (SELECT + i_category, + i_class, + i_brand, + i_product_name, + d_year, + d_qoy, + d_moy, + s_store_id, + sumsales, + rank() + OVER (PARTITION BY i_category + ORDER BY sumsales DESC) rk + FROM + (SELECT + i_category, + i_class, + i_brand, + i_product_name, + d_year, + d_qoy, + d_moy, + s_store_id, + sum(coalesce(ss_sales_price * ss_quantity, 0)) sumsales + FROM store_sales, date_dim, store, item + WHERE ss_sold_date_sk = d_date_sk + AND ss_item_sk = i_item_sk + AND ss_store_sk = s_store_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + GROUP BY ROLLUP (i_category, i_class, i_brand, i_product_name, d_year, d_qoy, + d_moy, s_store_id)) dw1) dw2 +WHERE rk <= 100 +ORDER BY + i_category, i_class, i_brand, i_product_name, d_year, + d_qoy, d_moy, s_store_id, sumsales, rk +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q68.sql b/athena-tpcds/src/main/resources/queries/q68.sql new file mode 100644 index 0000000000..adb8a7189d --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q68.sql @@ -0,0 +1,34 @@ +SELECT + c_last_name, + c_first_name, + ca_city, + bought_city, + ss_ticket_number, + extended_price, + extended_tax, + list_price +FROM (SELECT + ss_ticket_number, + ss_customer_sk, + ca_city bought_city, + sum(ss_ext_sales_price) extended_price, + sum(ss_ext_list_price) list_price, + sum(ss_ext_tax) extended_tax +FROM store_sales, date_dim, store, household_demographics, customer_address +WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk + AND store_sales.ss_store_sk = store.s_store_sk + AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk + AND store_sales.ss_addr_sk = customer_address.ca_address_sk + AND date_dim.d_dom BETWEEN 1 AND 2 + AND (household_demographics.hd_dep_count = 4 OR + household_demographics.hd_vehicle_count = 3) + AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2) + AND store.s_city IN ('Midway', 'Fairview') +GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, ca_city) dn, + customer, + customer_address current_addr +WHERE ss_customer_sk = c_customer_sk + AND customer.c_current_addr_sk = current_addr.ca_address_sk + AND current_addr.ca_city <> bought_city +ORDER BY c_last_name, ss_ticket_number +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q69.sql b/athena-tpcds/src/main/resources/queries/q69.sql new file mode 100644 index 0000000000..1f0ee64f56 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q69.sql @@ -0,0 +1,38 @@ +SELECT + cd_gender, + cd_marital_status, + cd_education_status, + count(*) cnt1, + cd_purchase_estimate, + count(*) cnt2, + cd_credit_rating, + count(*) cnt3 +FROM + customer 
c, customer_address ca, customer_demographics +WHERE + c.c_current_addr_sk = ca.ca_address_sk AND + ca_state IN ('KY', 'GA', 'NM') AND + cd_demo_sk = c.c_current_cdemo_sk AND + exists(SELECT * + FROM store_sales, date_dim + WHERE c.c_customer_sk = ss_customer_sk AND + ss_sold_date_sk = d_date_sk AND + d_year = 2001 AND + d_moy BETWEEN 4 AND 4 + 2) AND + (NOT exists(SELECT * + FROM web_sales, date_dim + WHERE c.c_customer_sk = ws_bill_customer_sk AND + ws_sold_date_sk = d_date_sk AND + d_year = 2001 AND + d_moy BETWEEN 4 AND 4 + 2) AND + NOT exists(SELECT * + FROM catalog_sales, date_dim + WHERE c.c_customer_sk = cs_ship_customer_sk AND + cs_sold_date_sk = d_date_sk AND + d_year = 2001 AND + d_moy BETWEEN 4 AND 4 + 2)) +GROUP BY cd_gender, cd_marital_status, cd_education_status, + cd_purchase_estimate, cd_credit_rating +ORDER BY cd_gender, cd_marital_status, cd_education_status, + cd_purchase_estimate, cd_credit_rating +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q7.sql b/athena-tpcds/src/main/resources/queries/q7.sql new file mode 100644 index 0000000000..39144f13c8 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q7.sql @@ -0,0 +1,19 @@ +SELECT + i_item_id, + avg(ss_quantity) agg1, + avg(ss_list_price) agg2, + avg(ss_coupon_amt) agg3, + avg(ss_sales_price) agg4 +FROM store_sales[ TABLE_SUFFIX ], customer_demographics[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], item[ TABLE_SUFFIX ], promotion[ TABLE_SUFFIX ] +WHERE ss_sold_date_sk = d_date_sk AND + ss_item_sk = i_item_sk AND + ss_cdemo_sk = cd_demo_sk AND + ss_promo_sk = p_promo_sk AND + cd_gender = 'M' AND + cd_marital_status = 'S' AND + cd_education_status = 'College' AND + (p_channel_email = 'N' OR p_channel_event = 'N') AND + d_year = 2000 +GROUP BY i_item_id +ORDER BY i_item_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q71.sql b/athena-tpcds/src/main/resources/queries/q71.sql new file mode 100644 index 0000000000..8d724b9244 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q71.sql @@ -0,0 +1,44 @@ +SELECT + i_brand_id brand_id, + i_brand brand, + t_hour, + t_minute, + sum(ext_price) ext_price +FROM item, + (SELECT + ws_ext_sales_price AS ext_price, + ws_sold_date_sk AS sold_date_sk, + ws_item_sk AS sold_item_sk, + ws_sold_time_sk AS time_sk + FROM web_sales, date_dim + WHERE d_date_sk = ws_sold_date_sk + AND d_moy = 11 + AND d_year = 1999 + UNION ALL + SELECT + cs_ext_sales_price AS ext_price, + cs_sold_date_sk AS sold_date_sk, + cs_item_sk AS sold_item_sk, + cs_sold_time_sk AS time_sk + FROM catalog_sales, date_dim + WHERE d_date_sk = cs_sold_date_sk + AND d_moy = 11 + AND d_year = 1999 + UNION ALL + SELECT + ss_ext_sales_price AS ext_price, + ss_sold_date_sk AS sold_date_sk, + ss_item_sk AS sold_item_sk, + ss_sold_time_sk AS time_sk + FROM store_sales, date_dim + WHERE d_date_sk = ss_sold_date_sk + AND d_moy = 11 + AND d_year = 1999 + ) AS tmp, time_dim +WHERE + sold_item_sk = i_item_sk + AND i_manager_id = 1 + AND time_sk = t_time_sk + AND (t_meal_time = 'breakfast' OR t_meal_time = 'dinner') +GROUP BY i_brand, i_brand_id, t_hour, t_minute +ORDER BY ext_price DESC, brand_id diff --git a/athena-tpcds/src/main/resources/queries/q73.sql b/athena-tpcds/src/main/resources/queries/q73.sql new file mode 100644 index 0000000000..881be2e902 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q73.sql @@ -0,0 +1,30 @@ +SELECT + c_last_name, + c_first_name, + c_salutation, + c_preferred_cust_flag, + ss_ticket_number, + cnt +FROM + (SELECT + ss_ticket_number, + 
ss_customer_sk, + count(*) cnt + FROM store_sales, date_dim, store, household_demographics + WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk + AND store_sales.ss_store_sk = store.s_store_sk + AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk + AND date_dim.d_dom BETWEEN 1 AND 2 + AND (household_demographics.hd_buy_potential = '>10000' OR + household_demographics.hd_buy_potential = 'unknown') + AND household_demographics.hd_vehicle_count > 0 + AND CASE WHEN household_demographics.hd_vehicle_count > 0 + THEN + household_demographics.hd_dep_count / household_demographics.hd_vehicle_count + ELSE NULL END > 1 + AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2) + AND store.s_county IN ('Williamson County', 'Franklin Parish', 'Bronx County', 'Orange County') + GROUP BY ss_ticket_number, ss_customer_sk) dj, customer +WHERE ss_customer_sk = c_customer_sk + AND cnt BETWEEN 1 AND 5 +ORDER BY cnt DESC diff --git a/athena-tpcds/src/main/resources/queries/q74.sql b/athena-tpcds/src/main/resources/queries/q74.sql new file mode 100644 index 0000000000..154b26d680 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q74.sql @@ -0,0 +1,58 @@ +WITH year_total AS ( + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + d_year AS year, + sum(ss_net_paid) year_total, + 's' sale_type + FROM + customer, store_sales, date_dim + WHERE c_customer_sk = ss_customer_sk + AND ss_sold_date_sk = d_date_sk + AND d_year IN (2001, 2001 + 1) + GROUP BY + c_customer_id, c_first_name, c_last_name, d_year + UNION ALL + SELECT + c_customer_id customer_id, + c_first_name customer_first_name, + c_last_name customer_last_name, + d_year AS year, + sum(ws_net_paid) year_total, + 'w' sale_type + FROM + customer, web_sales, date_dim + WHERE c_customer_sk = ws_bill_customer_sk + AND ws_sold_date_sk = d_date_sk + AND d_year IN (2001, 2001 + 1) + GROUP BY + c_customer_id, c_first_name, c_last_name, d_year) +SELECT + t_s_secyear.customer_id, + t_s_secyear.customer_first_name, + t_s_secyear.customer_last_name +FROM + year_total t_s_firstyear, year_total t_s_secyear, + year_total t_w_firstyear, year_total t_w_secyear +WHERE t_s_secyear.customer_id = t_s_firstyear.customer_id + AND t_s_firstyear.customer_id = t_w_secyear.customer_id + AND t_s_firstyear.customer_id = t_w_firstyear.customer_id + AND t_s_firstyear.sale_type = 's' + AND t_w_firstyear.sale_type = 'w' + AND t_s_secyear.sale_type = 's' + AND t_w_secyear.sale_type = 'w' + AND t_s_firstyear.year = 2001 + AND t_s_secyear.year = 2001 + 1 + AND t_w_firstyear.year = 2001 + AND t_w_secyear.year = 2001 + 1 + AND t_s_firstyear.year_total > 0 + AND t_w_firstyear.year_total > 0 + AND CASE WHEN t_w_firstyear.year_total > 0 + THEN t_w_secyear.year_total / t_w_firstyear.year_total + ELSE NULL END + > CASE WHEN t_s_firstyear.year_total > 0 + THEN t_s_secyear.year_total / t_s_firstyear.year_total + ELSE NULL END +ORDER BY 1, 1, 1 +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q75.sql b/athena-tpcds/src/main/resources/queries/q75.sql new file mode 100644 index 0000000000..2a143232b5 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q75.sql @@ -0,0 +1,76 @@ +WITH all_sales AS ( + SELECT + d_year, + i_brand_id, + i_class_id, + i_category_id, + i_manufact_id, + SUM(sales_cnt) AS sales_cnt, + SUM(sales_amt) AS sales_amt + FROM ( + SELECT + d_year, + i_brand_id, + i_class_id, + i_category_id, + i_manufact_id, + cs_quantity - COALESCE(cr_return_quantity, 0) AS sales_cnt, + cs_ext_sales_price - 
COALESCE(cr_return_amount, 0.0) AS sales_amt + FROM catalog_sales + JOIN item ON i_item_sk = cs_item_sk + JOIN date_dim ON d_date_sk = cs_sold_date_sk + LEFT JOIN catalog_returns ON (cs_order_number = cr_order_number + AND cs_item_sk = cr_item_sk) + WHERE i_category = 'Books' + UNION + SELECT + d_year, + i_brand_id, + i_class_id, + i_category_id, + i_manufact_id, + ss_quantity - COALESCE(sr_return_quantity, 0) AS sales_cnt, + ss_ext_sales_price - COALESCE(sr_return_amt, 0.0) AS sales_amt + FROM store_sales + JOIN item ON i_item_sk = ss_item_sk + JOIN date_dim ON d_date_sk = ss_sold_date_sk + LEFT JOIN store_returns ON (ss_ticket_number = sr_ticket_number + AND ss_item_sk = sr_item_sk) + WHERE i_category = 'Books' + UNION + SELECT + d_year, + i_brand_id, + i_class_id, + i_category_id, + i_manufact_id, + ws_quantity - COALESCE(wr_return_quantity, 0) AS sales_cnt, + ws_ext_sales_price - COALESCE(wr_return_amt, 0.0) AS sales_amt + FROM web_sales + JOIN item ON i_item_sk = ws_item_sk + JOIN date_dim ON d_date_sk = ws_sold_date_sk + LEFT JOIN web_returns ON (ws_order_number = wr_order_number + AND ws_item_sk = wr_item_sk) + WHERE i_category = 'Books') sales_detail + GROUP BY d_year, i_brand_id, i_class_id, i_category_id, i_manufact_id) +SELECT + prev_yr.d_year AS prev_year, + curr_yr.d_year AS year, + curr_yr.i_brand_id, + curr_yr.i_class_id, + curr_yr.i_category_id, + curr_yr.i_manufact_id, + prev_yr.sales_cnt AS prev_yr_cnt, + curr_yr.sales_cnt AS curr_yr_cnt, + curr_yr.sales_cnt - prev_yr.sales_cnt AS sales_cnt_diff, + curr_yr.sales_amt - prev_yr.sales_amt AS sales_amt_diff +FROM all_sales curr_yr, all_sales prev_yr +WHERE curr_yr.i_brand_id = prev_yr.i_brand_id + AND curr_yr.i_class_id = prev_yr.i_class_id + AND curr_yr.i_category_id = prev_yr.i_category_id + AND curr_yr.i_manufact_id = prev_yr.i_manufact_id + AND curr_yr.d_year = 2002 + AND prev_yr.d_year = 2002 - 1 + AND CAST(curr_yr.sales_cnt AS DECIMAL(17, 2)) / CAST(prev_yr.sales_cnt AS DECIMAL(17, 2)) < 0.9 +ORDER BY sales_cnt_diff +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q76.sql b/athena-tpcds/src/main/resources/queries/q76.sql new file mode 100644 index 0000000000..815fa922be --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q76.sql @@ -0,0 +1,47 @@ +SELECT + channel, + col_name, + d_year, + d_qoy, + i_category, + COUNT(*) sales_cnt, + SUM(ext_sales_price) sales_amt +FROM ( + SELECT + 'store' AS channel, + ss_store_sk col_name, + d_year, + d_qoy, + i_category, + ss_ext_sales_price ext_sales_price + FROM store_sales, item, date_dim + WHERE ss_store_sk IS NULL + AND ss_sold_date_sk = d_date_sk + AND ss_item_sk = i_item_sk + UNION ALL + SELECT + 'web' AS channel, + ws_ship_customer_sk col_name, + d_year, + d_qoy, + i_category, + ws_ext_sales_price ext_sales_price + FROM web_sales, item, date_dim + WHERE ws_ship_customer_sk IS NULL + AND ws_sold_date_sk = d_date_sk + AND ws_item_sk = i_item_sk + UNION ALL + SELECT + 'catalog' AS channel, + cs_ship_addr_sk col_name, + d_year, + d_qoy, + i_category, + cs_ext_sales_price ext_sales_price + FROM catalog_sales, item, date_dim + WHERE cs_ship_addr_sk IS NULL + AND cs_sold_date_sk = d_date_sk + AND cs_item_sk = i_item_sk) foo +GROUP BY channel, col_name, d_year, d_qoy, i_category +ORDER BY channel, col_name, d_year, d_qoy, i_category +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q77.sql b/athena-tpcds/src/main/resources/queries/q77.sql new file mode 100644 index 0000000000..5ce4e91b8c --- /dev/null +++ 
b/athena-tpcds/src/main/resources/queries/q77.sql @@ -0,0 +1,100 @@ +WITH ss AS +(SELECT + s_store_sk, + sum(ss_ext_sales_price) AS sales, + sum(ss_net_profit) AS profit + FROM store_sales, date_dim, store + WHERE ss_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-03' AS DATE) AND + (cast('2000-08-03' AS DATE) + INTERVAL '30' day) + AND ss_store_sk = s_store_sk + GROUP BY s_store_sk), + sr AS + (SELECT + s_store_sk, + sum(sr_return_amt) AS returns, + sum(sr_net_loss) AS profit_loss + FROM store_returns, date_dim, store + WHERE sr_returned_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-03' AS DATE) AND + (cast('2000-08-03' AS DATE) + INTERVAL '30' day) + AND sr_store_sk = s_store_sk + GROUP BY s_store_sk), + cs AS + (SELECT + cs_call_center_sk, + sum(cs_ext_sales_price) AS sales, + sum(cs_net_profit) AS profit + FROM catalog_sales, date_dim + WHERE cs_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-03' AS DATE) AND + (cast('2000-08-03' AS DATE) + INTERVAL '30' day) + GROUP BY cs_call_center_sk), + cr AS + (SELECT + sum(cr_return_amount) AS returns, + sum(cr_net_loss) AS profit_loss + FROM catalog_returns, date_dim + WHERE cr_returned_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-03' AS DATE) AND + (cast('2000-08-03' AS DATE) + INTERVAL '30' day)), + ws AS + (SELECT + wp_web_page_sk, + sum(ws_ext_sales_price) AS sales, + sum(ws_net_profit) AS profit + FROM web_sales, date_dim, web_page + WHERE ws_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-03' AS DATE) AND + (cast('2000-08-03' AS DATE) + INTERVAL '30' day) + AND ws_web_page_sk = wp_web_page_sk + GROUP BY wp_web_page_sk), + wr AS + (SELECT + wp_web_page_sk, + sum(wr_return_amt) AS returns, + sum(wr_net_loss) AS profit_loss + FROM web_returns, date_dim, web_page + WHERE wr_returned_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-03' AS DATE) AND + (cast('2000-08-03' AS DATE) + INTERVAL '30' day) + AND wr_web_page_sk = wp_web_page_sk + GROUP BY wp_web_page_sk) +SELECT + channel, + id, + sum(sales) AS sales, + sum(returns) AS returns, + sum(profit) AS profit +FROM + (SELECT + 'store channel' AS channel, + ss.s_store_sk AS id, + sales, + coalesce(returns, 0) AS returns, + (profit - coalesce(profit_loss, 0)) AS profit + FROM ss + LEFT JOIN sr + ON ss.s_store_sk = sr.s_store_sk + UNION ALL + SELECT + 'catalog channel' AS channel, + cs_call_center_sk AS id, + sales, + returns, + (profit - profit_loss) AS profit + FROM cs, cr + UNION ALL + SELECT + 'web channel' AS channel, + ws.wp_web_page_sk AS id, + sales, + coalesce(returns, 0) returns, + (profit - coalesce(profit_loss, 0)) AS profit + FROM ws + LEFT JOIN wr + ON ws.wp_web_page_sk = wr.wp_web_page_sk + ) x +GROUP BY ROLLUP (channel, id) +ORDER BY channel, id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q78.sql b/athena-tpcds/src/main/resources/queries/q78.sql new file mode 100644 index 0000000000..07b0940e26 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q78.sql @@ -0,0 +1,64 @@ +WITH ws AS +(SELECT + d_year AS ws_sold_year, + ws_item_sk, + ws_bill_customer_sk ws_customer_sk, + sum(ws_quantity) ws_qty, + sum(ws_wholesale_cost) ws_wc, + sum(ws_sales_price) ws_sp + FROM web_sales + LEFT JOIN web_returns ON wr_order_number = ws_order_number AND ws_item_sk = wr_item_sk + JOIN date_dim ON ws_sold_date_sk = d_date_sk + WHERE wr_order_number IS NULL + GROUP BY d_year, ws_item_sk, ws_bill_customer_sk +), + cs AS + (SELECT + 
d_year AS cs_sold_year, + cs_item_sk, + cs_bill_customer_sk cs_customer_sk, + sum(cs_quantity) cs_qty, + sum(cs_wholesale_cost) cs_wc, + sum(cs_sales_price) cs_sp + FROM catalog_sales + LEFT JOIN catalog_returns ON cr_order_number = cs_order_number AND cs_item_sk = cr_item_sk + JOIN date_dim ON cs_sold_date_sk = d_date_sk + WHERE cr_order_number IS NULL + GROUP BY d_year, cs_item_sk, cs_bill_customer_sk + ), + ss AS + (SELECT + d_year AS ss_sold_year, + ss_item_sk, + ss_customer_sk, + sum(ss_quantity) ss_qty, + sum(ss_wholesale_cost) ss_wc, + sum(ss_sales_price) ss_sp + FROM store_sales + LEFT JOIN store_returns ON sr_ticket_number = ss_ticket_number AND ss_item_sk = sr_item_sk + JOIN date_dim ON ss_sold_date_sk = d_date_sk + WHERE sr_ticket_number IS NULL + GROUP BY d_year, ss_item_sk, ss_customer_sk + ) +SELECT + round(ss_qty / (coalesce(ws_qty + cs_qty, 1)), 2) ratio, + ss_qty store_qty, + ss_wc store_wholesale_cost, + ss_sp store_sales_price, + coalesce(ws_qty, 0) + coalesce(cs_qty, 0) other_chan_qty, + coalesce(ws_wc, 0) + coalesce(cs_wc, 0) other_chan_wholesale_cost, + coalesce(ws_sp, 0) + coalesce(cs_sp, 0) other_chan_sales_price +FROM ss + LEFT JOIN ws + ON (ws_sold_year = ss_sold_year AND ws_item_sk = ss_item_sk AND ws_customer_sk = ss_customer_sk) + LEFT JOIN cs + ON (cs_sold_year = ss_sold_year AND cs_item_sk = ss_item_sk AND cs_customer_sk = ss_customer_sk) +WHERE coalesce(ws_qty, 0) > 0 AND coalesce(cs_qty, 0) > 0 AND ss_sold_year = 2000 +ORDER BY + ratio, + ss_qty DESC, ss_wc DESC, ss_sp DESC, + other_chan_qty, + other_chan_wholesale_cost, + other_chan_sales_price, + round(ss_qty / (coalesce(ws_qty + cs_qty, 1)), 2) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q79.sql b/athena-tpcds/src/main/resources/queries/q79.sql new file mode 100644 index 0000000000..08f86dc203 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q79.sql @@ -0,0 +1,27 @@ +SELECT + c_last_name, + c_first_name, + substr(s_city, 1, 30), + ss_ticket_number, + amt, + profit +FROM + (SELECT + ss_ticket_number, + ss_customer_sk, + store.s_city, + sum(ss_coupon_amt) amt, + sum(ss_net_profit) profit + FROM store_sales, date_dim, store, household_demographics + WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk + AND store_sales.ss_store_sk = store.s_store_sk + AND store_sales.ss_hdemo_sk = household_demographics.hd_demo_sk + AND (household_demographics.hd_dep_count = 6 OR + household_demographics.hd_vehicle_count > 2) + AND date_dim.d_dow = 1 + AND date_dim.d_year IN (1999, 1999 + 1, 1999 + 2) + AND store.s_number_employees BETWEEN 200 AND 295 + GROUP BY ss_ticket_number, ss_customer_sk, ss_addr_sk, store.s_city) ms, customer +WHERE ss_customer_sk = c_customer_sk +ORDER BY c_last_name, c_first_name, substr(s_city, 1, 30), profit +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q8.sql b/athena-tpcds/src/main/resources/queries/q8.sql new file mode 100644 index 0000000000..9b31324225 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q8.sql @@ -0,0 +1,87 @@ +SELECT + s_store_name, + sum(ss_net_profit) +FROM store_sales[ TABLE_SUFFIX ], date_dim[ TABLE_SUFFIX ], store[ TABLE_SUFFIX ], + (SELECT ca_zip + FROM ( + (SELECT substr(ca_zip, 1, 5) ca_zip + FROM customer_address[ TABLE_SUFFIX ] + WHERE substr(ca_zip, 1, 5) IN ( + '24128','76232','65084','87816','83926','77556','20548', + '26231','43848','15126','91137','61265','98294','25782', + '17920','18426','98235','40081','84093','28577','55565', + '17183','54601','67897','22752','86284','18376','38607', + 
'45200','21756','29741','96765','23932','89360','29839', + '25989','28898','91068','72550','10390','18845','47770', + '82636','41367','76638','86198','81312','37126','39192', + '88424','72175','81426','53672','10445','42666','66864', + '66708','41248','48583','82276','18842','78890','49448', + '14089','38122','34425','79077','19849','43285','39861', + '66162','77610','13695','99543','83444','83041','12305', + '57665','68341','25003','57834','62878','49130','81096', + '18840','27700','23470','50412','21195','16021','76107', + '71954','68309','18119','98359','64544','10336','86379', + '27068','39736','98569','28915','24206','56529','57647', + '54917','42961','91110','63981','14922','36420','23006', + '67467','32754','30903','20260','31671','51798','72325', + '85816','68621','13955','36446','41766','68806','16725', + '15146','22744','35850','88086','51649','18270','52867', + '39972','96976','63792','11376','94898','13595','10516', + '90225','58943','39371','94945','28587','96576','57855', + '28488','26105','83933','25858','34322','44438','73171', + '30122','34102','22685','71256','78451','54364','13354', + '45375','40558','56458','28286','45266','47305','69399', + '83921','26233','11101','15371','69913','35942','15882', + '25631','24610','44165','99076','33786','70738','26653', + '14328','72305','62496','22152','10144','64147','48425', + '14663','21076','18799','30450','63089','81019','68893', + '24996','51200','51211','45692','92712','70466','79994', + '22437','25280','38935','71791','73134','56571','14060', + '19505','72425','56575','74351','68786','51650','20004', + '18383','76614','11634','18906','15765','41368','73241', + '76698','78567','97189','28545','76231','75691','22246', + '51061','90578','56691','68014','51103','94167','57047', + '14867','73520','15734','63435','25733','35474','24676', + '94627','53535','17879','15559','53268','59166','11928', + '59402','33282','45721','43933','68101','33515','36634', + '71286','19736','58058','55253','67473','41918','19515', + '36495','19430','22351','77191','91393','49156','50298', + '87501','18652','53179','18767','63193','23968','65164', + '68880','21286','72823','58470','67301','13394','31016', + '70372','67030','40604','24317','45748','39127','26065', + '77721','31029','31880','60576','24671','45549','13376', + '50016','33123','19769','22927','97789','46081','72151', + '15723','46136','51949','68100','96888','64528','14171', + '79777','28709','11489','25103','32213','78668','22245', + '15798','27156','37930','62971','21337','51622','67853', + '10567','38415','15455','58263','42029','60279','37125', + '56240','88190','50308','26859','64457','89091','82136', + '62377','36233','63837','58078','17043','30010','60099', + '28810','98025','29178','87343','73273','30469','64034', + '39516','86057','21309','90257','67875','40162','11356', + '73650','61810','72013','30431','22461','19512','13375', + '55307','30625','83849','68908','26689','96451','38193', + '46820','88885','84935','69035','83144','47537','56616', + '94983','48033','69952','25486','61547','27385','61860', + '58048','56910','16807','17871','35258','31387','35458', + '35576')) + INTERSECT + (SELECT ca_zip + FROM + (SELECT + substr(ca_zip, 1, 5) ca_zip, + count(*) cnt + FROM customer_address[ TABLE_SUFFIX ], customer[ TABLE_SUFFIX ] + WHERE ca_address_sk = c_current_addr_sk AND + c_preferred_cust_flag = 'Y' + GROUP BY ca_zip + HAVING count(*) > 10) A1) + ) A2 + ) V1 +WHERE ss_store_sk = s_store_sk + AND ss_sold_date_sk = d_date_sk + AND d_qoy = 2 AND d_year = 1998 + AND (substr(s_zip, 1, 
2) = substr(V1.ca_zip, 1, 2)) +GROUP BY s_store_name +ORDER BY s_store_name +LIMIT 100 \ No newline at end of file diff --git a/athena-tpcds/src/main/resources/queries/q80.sql b/athena-tpcds/src/main/resources/queries/q80.sql new file mode 100644 index 0000000000..cd4b80ada4 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q80.sql @@ -0,0 +1,94 @@ +WITH ssr AS +(SELECT + s_store_id AS store_id, + sum(ss_ext_sales_price) AS sales, + sum(coalesce(sr_return_amt, 0)) AS returns, + sum(ss_net_profit - coalesce(sr_net_loss, 0)) AS profit + FROM store_sales + LEFT OUTER JOIN store_returns ON + (ss_item_sk = sr_item_sk AND + ss_ticket_number = sr_ticket_number) + , + date_dim, store, item, promotion + WHERE ss_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-23' AS DATE) + AND (cast('2000-08-23' AS DATE) + INTERVAL '30' day) + AND ss_store_sk = s_store_sk + AND ss_item_sk = i_item_sk + AND i_current_price > 50 + AND ss_promo_sk = p_promo_sk + AND p_channel_tv = 'N' + GROUP BY s_store_id), + csr AS + (SELECT + cp_catalog_page_id AS catalog_page_id, + sum(cs_ext_sales_price) AS sales, + sum(coalesce(cr_return_amount, 0)) AS returns, + sum(cs_net_profit - coalesce(cr_net_loss, 0)) AS profit + FROM catalog_sales + LEFT OUTER JOIN catalog_returns ON + (cs_item_sk = cr_item_sk AND + cs_order_number = cr_order_number) + , + date_dim, catalog_page, item, promotion + WHERE cs_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-23' AS DATE) + AND (cast('2000-08-23' AS DATE) + INTERVAL '30' day) + AND cs_catalog_page_sk = cp_catalog_page_sk + AND cs_item_sk = i_item_sk + AND i_current_price > 50 + AND cs_promo_sk = p_promo_sk + AND p_channel_tv = 'N' + GROUP BY cp_catalog_page_id), + wsr AS + (SELECT + web_site_id, + sum(ws_ext_sales_price) AS sales, + sum(coalesce(wr_return_amt, 0)) AS returns, + sum(ws_net_profit - coalesce(wr_net_loss, 0)) AS profit + FROM web_sales + LEFT OUTER JOIN web_returns ON + (ws_item_sk = wr_item_sk AND ws_order_number = wr_order_number) + , + date_dim, web_site, item, promotion + WHERE ws_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-08-23' AS DATE) + AND (cast('2000-08-23' AS DATE) + INTERVAL '30' day) + AND ws_web_site_sk = web_site_sk + AND ws_item_sk = i_item_sk + AND i_current_price > 50 + AND ws_promo_sk = p_promo_sk + AND p_channel_tv = 'N' + GROUP BY web_site_id) +SELECT + channel, + id, + sum(sales) AS sales, + sum(returns) AS returns, + sum(profit) AS profit +FROM (SELECT + 'store channel' AS channel, + concat('store', store_id) AS id, + sales, + returns, + profit + FROM ssr + UNION ALL + SELECT + 'catalog channel' AS channel, + concat('catalog_page', catalog_page_id) AS id, + sales, + returns, + profit + FROM csr + UNION ALL + SELECT + 'web channel' AS channel, + concat('web_site', web_site_id) AS id, + sales, + returns, + profit + FROM wsr) x +GROUP BY ROLLUP (channel, id) +ORDER BY channel, id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q81.sql b/athena-tpcds/src/main/resources/queries/q81.sql new file mode 100644 index 0000000000..18f0ffa7e8 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q81.sql @@ -0,0 +1,38 @@ +WITH customer_total_return AS +(SELECT + cr_returning_customer_sk AS ctr_customer_sk, + ca_state AS ctr_state, + sum(cr_return_amt_inc_tax) AS ctr_total_return + FROM catalog_returns, date_dim, customer_address + WHERE cr_returned_date_sk = d_date_sk + AND d_year = 2000 + AND cr_returning_addr_sk = ca_address_sk + GROUP BY 
cr_returning_customer_sk, ca_state ) +SELECT + c_customer_id, + c_salutation, + c_first_name, + c_last_name, + ca_street_number, + ca_street_name, + ca_street_type, + ca_suite_number, + ca_city, + ca_county, + ca_state, + ca_zip, + ca_country, + ca_gmt_offset, + ca_location_type, + ctr_total_return +FROM customer_total_return ctr1, customer_address, customer +WHERE ctr1.ctr_total_return > (SELECT avg(ctr_total_return) * 1.2 +FROM customer_total_return ctr2 +WHERE ctr1.ctr_state = ctr2.ctr_state) + AND ca_address_sk = c_current_addr_sk + AND ca_state = 'GA' + AND ctr1.ctr_customer_sk = c_customer_sk +ORDER BY c_customer_id, c_salutation, c_first_name, c_last_name, ca_street_number, ca_street_name + , ca_street_type, ca_suite_number, ca_city, ca_county, ca_state, ca_zip, ca_country, ca_gmt_offset + , ca_location_type, ctr_total_return +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q82.sql b/athena-tpcds/src/main/resources/queries/q82.sql new file mode 100644 index 0000000000..185eb6911b --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q82.sql @@ -0,0 +1,15 @@ +SELECT + i_item_id, + i_item_desc, + i_current_price +FROM item, inventory, date_dim, store_sales +WHERE i_current_price BETWEEN 62 AND 62 + 30 + AND inv_item_sk = i_item_sk + AND d_date_sk = inv_date_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-05-25' AS DATE) AND (cast('2000-05-25' AS DATE) + INTERVAL '60' day) + AND i_manufact_id IN (129, 270, 821, 423) + AND inv_quantity_on_hand BETWEEN 100 AND 500 + AND ss_item_sk = i_item_sk +GROUP BY i_item_id, i_item_desc, i_current_price +ORDER BY i_item_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q83.sql b/athena-tpcds/src/main/resources/queries/q83.sql new file mode 100644 index 0000000000..988d26006e --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q83.sql @@ -0,0 +1,56 @@ +WITH sr_items AS +(SELECT + i_item_id item_id, + sum(sr_return_quantity) sr_item_qty + FROM store_returns, item, date_dim + WHERE sr_item_sk = i_item_sk + AND CAST(d_date as DATE) IN (SELECT d_date + FROM date_dim + WHERE d_week_seq IN + (SELECT d_week_seq + FROM date_dim + WHERE CAST(d_date as DATE) IN (CAST('2000-06-30' AS DATE), CAST('2000-09-27' AS DATE), CAST('2000-11-17' AS DATE)))) + AND sr_returned_date_sk = d_date_sk + GROUP BY i_item_id), + cr_items AS + (SELECT + i_item_id item_id, + sum(cr_return_quantity) cr_item_qty + FROM catalog_returns, item, date_dim + WHERE cr_item_sk = i_item_sk + AND CAST(d_date as DATE) IN (SELECT d_date + FROM date_dim + WHERE d_week_seq IN + (SELECT d_week_seq + FROM date_dim + WHERE CAST(d_date as DATE) IN (CAST('2000-06-30' AS DATE), CAST('2000-09-27' AS DATE), CAST('2000-11-17' AS DATE)))) + AND cr_returned_date_sk = d_date_sk + GROUP BY i_item_id), + wr_items AS + (SELECT + i_item_id item_id, + sum(wr_return_quantity) wr_item_qty + FROM web_returns, item, date_dim + WHERE wr_item_sk = i_item_sk AND CAST(d_date as DATE) IN + (SELECT d_date + FROM date_dim + WHERE d_week_seq IN + (SELECT d_week_seq + FROM date_dim + WHERE CAST(d_date as DATE) IN (CAST('2000-06-30' AS DATE), CAST('2000-09-27' AS DATE), CAST('2000-11-17' AS DATE)))) + AND wr_returned_date_sk = d_date_sk + GROUP BY i_item_id) +SELECT + sr_items.item_id, + sr_item_qty, + sr_item_qty / (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 * 100 sr_dev, + cr_item_qty, + cr_item_qty / (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 * 100 cr_dev, + wr_item_qty, + wr_item_qty / (sr_item_qty + cr_item_qty + wr_item_qty) / 3.0 * 100 wr_dev, + (sr_item_qty + 
cr_item_qty + wr_item_qty) / 3.0 average +FROM sr_items, cr_items, wr_items +WHERE sr_items.item_id = cr_items.item_id + AND sr_items.item_id = wr_items.item_id +ORDER BY sr_items.item_id, sr_item_qty +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q84.sql b/athena-tpcds/src/main/resources/queries/q84.sql new file mode 100644 index 0000000000..a1076b57ce --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q84.sql @@ -0,0 +1,19 @@ +SELECT + c_customer_id AS customer_id, + concat(c_last_name, ', ', c_first_name) AS customername +FROM customer + , customer_address + , customer_demographics + , household_demographics + , income_band + , store_returns +WHERE ca_city = 'Edgewood' + AND c_current_addr_sk = ca_address_sk + AND ib_lower_bound >= 38128 + AND ib_upper_bound <= 38128 + 50000 + AND ib_income_band_sk = hd_income_band_sk + AND cd_demo_sk = c_current_cdemo_sk + AND hd_demo_sk = c_current_hdemo_sk + AND sr_cdemo_sk = cd_demo_sk +ORDER BY c_customer_id +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q85.sql b/athena-tpcds/src/main/resources/queries/q85.sql new file mode 100644 index 0000000000..cf718b0f8a --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q85.sql @@ -0,0 +1,82 @@ +SELECT + substr(r_reason_desc, 1, 20), + avg(ws_quantity), + avg(wr_refunded_cash), + avg(wr_fee) +FROM web_sales, web_returns, web_page, customer_demographics cd1, + customer_demographics cd2, customer_address, date_dim, reason +WHERE ws_web_page_sk = wp_web_page_sk + AND ws_item_sk = wr_item_sk + AND ws_order_number = wr_order_number + AND ws_sold_date_sk = d_date_sk AND d_year = 2000 + AND cd1.cd_demo_sk = wr_refunded_cdemo_sk + AND cd2.cd_demo_sk = wr_returning_cdemo_sk + AND ca_address_sk = wr_refunded_addr_sk + AND r_reason_sk = wr_reason_sk + AND + ( + ( + cd1.cd_marital_status = 'M' + AND + cd1.cd_marital_status = cd2.cd_marital_status + AND + cd1.cd_education_status = 'Advanced Degree' + AND + cd1.cd_education_status = cd2.cd_education_status + AND + ws_sales_price BETWEEN 100.00 AND 150.00 + ) + OR + ( + cd1.cd_marital_status = 'S' + AND + cd1.cd_marital_status = cd2.cd_marital_status + AND + cd1.cd_education_status = 'College' + AND + cd1.cd_education_status = cd2.cd_education_status + AND + ws_sales_price BETWEEN 50.00 AND 100.00 + ) + OR + ( + cd1.cd_marital_status = 'W' + AND + cd1.cd_marital_status = cd2.cd_marital_status + AND + cd1.cd_education_status = '2 yr Degree' + AND + cd1.cd_education_status = cd2.cd_education_status + AND + ws_sales_price BETWEEN 150.00 AND 200.00 + ) + ) + AND + ( + ( + ca_country = 'United States' + AND + ca_state IN ('IN', 'OH', 'NJ') + AND ws_net_profit BETWEEN 100 AND 200 + ) + OR + ( + ca_country = 'United States' + AND + ca_state IN ('WI', 'CT', 'KY') + AND ws_net_profit BETWEEN 150 AND 300 + ) + OR + ( + ca_country = 'United States' + AND + ca_state IN ('LA', 'IA', 'AR') + AND ws_net_profit BETWEEN 50 AND 250 + ) + ) +GROUP BY r_reason_desc +ORDER BY substr(r_reason_desc, 1, 20) + , avg(ws_quantity) + , avg(wr_refunded_cash) + , avg(wr_fee) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q87.sql b/athena-tpcds/src/main/resources/queries/q87.sql new file mode 100644 index 0000000000..4aaa9f39dc --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q87.sql @@ -0,0 +1,28 @@ +SELECT count(*) +FROM ((SELECT DISTINCT + c_last_name, + c_first_name, + d_date +FROM store_sales, date_dim, customer +WHERE store_sales.ss_sold_date_sk = date_dim.d_date_sk + AND store_sales.ss_customer_sk = 
customer.c_customer_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11) + EXCEPT + (SELECT DISTINCT + c_last_name, + c_first_name, + d_date + FROM catalog_sales, date_dim, customer + WHERE catalog_sales.cs_sold_date_sk = date_dim.d_date_sk + AND catalog_sales.cs_bill_customer_sk = customer.c_customer_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11) + EXCEPT + (SELECT DISTINCT + c_last_name, + c_first_name, + d_date + FROM web_sales, date_dim, customer + WHERE web_sales.ws_sold_date_sk = date_dim.d_date_sk + AND web_sales.ws_bill_customer_sk = customer.c_customer_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11) + ) cool_cust diff --git a/athena-tpcds/src/main/resources/queries/q88.sql b/athena-tpcds/src/main/resources/queries/q88.sql new file mode 100644 index 0000000000..25bcd90f41 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q88.sql @@ -0,0 +1,122 @@ +SELECT * +FROM + (SELECT count(*) h8_30_to_9 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 8 + AND time_dim.t_minute >= 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s1, + (SELECT count(*) h9_to_9_30 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 9 + AND time_dim.t_minute < 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s2, + (SELECT count(*) h9_30_to_10 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 9 + AND time_dim.t_minute >= 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s3, + (SELECT count(*) h10_to_10_30 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 10 + AND time_dim.t_minute < 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s4, + (SELECT count(*) h10_30_to_11 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = 
household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 10 + AND time_dim.t_minute >= 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s5, + (SELECT count(*) h11_to_11_30 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 11 + AND time_dim.t_minute < 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s6, + (SELECT count(*) h11_30_to_12 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 11 + AND time_dim.t_minute >= 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s7, + (SELECT count(*) h12_to_12_30 + FROM store_sales, household_demographics, time_dim, store + WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 12 + AND time_dim.t_minute < 30 + AND ( + (household_demographics.hd_dep_count = 4 AND household_demographics.hd_vehicle_count <= 4 + 2) + OR + (household_demographics.hd_dep_count = 2 AND household_demographics.hd_vehicle_count <= 2 + 2) + OR + (household_demographics.hd_dep_count = 0 AND + household_demographics.hd_vehicle_count <= 0 + 2)) + AND store.s_store_name = 'ese') s8 diff --git a/athena-tpcds/src/main/resources/queries/q89.sql b/athena-tpcds/src/main/resources/queries/q89.sql new file mode 100644 index 0000000000..75408cb032 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q89.sql @@ -0,0 +1,30 @@ +SELECT * +FROM ( + SELECT + i_category, + i_class, + i_brand, + s_store_name, + s_company_name, + d_moy, + sum(ss_sales_price) sum_sales, + avg(sum(ss_sales_price)) + OVER + (PARTITION BY i_category, i_brand, s_store_name, s_company_name) + avg_monthly_sales + FROM item, store_sales, date_dim, store + WHERE ss_item_sk = i_item_sk AND + ss_sold_date_sk = d_date_sk AND + ss_store_sk = s_store_sk AND + d_year IN (1999) AND + ((i_category IN ('Books', 'Electronics', 'Sports') AND + i_class IN ('computers', 'stereo', 'football')) + OR (i_category IN ('Men', 'Jewelry', 'Women') AND + i_class IN ('shirts', 'birdal', 'dresses'))) + GROUP BY i_category, i_class, i_brand, + s_store_name, s_company_name, d_moy) tmp1 +WHERE CASE WHEN (avg_monthly_sales <> 0) + THEN (abs(sum_sales - avg_monthly_sales) / avg_monthly_sales) + ELSE NULL END > 0.1 +ORDER BY sum_sales - avg_monthly_sales, s_store_name +LIMIT 100 diff --git 
a/athena-tpcds/src/main/resources/queries/q9.sql b/athena-tpcds/src/main/resources/queries/q9.sql new file mode 100644 index 0000000000..539407094e --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q9.sql @@ -0,0 +1,48 @@ +SELECT + CASE WHEN (SELECT count(*) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 1 AND 20) > 62316685 + THEN (SELECT avg(ss_ext_discount_amt) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 1 AND 20) + ELSE (SELECT avg(ss_net_paid) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 1 AND 20) END bucket1, + CASE WHEN (SELECT count(*) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 21 AND 40) > 19045798 + THEN (SELECT avg(ss_ext_discount_amt) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 21 AND 40) + ELSE (SELECT avg(ss_net_paid) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 21 AND 40) END bucket2, + CASE WHEN (SELECT count(*) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 41 AND 60) > 365541424 + THEN (SELECT avg(ss_ext_discount_amt) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 41 AND 60) + ELSE (SELECT avg(ss_net_paid) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 41 AND 60) END bucket3, + CASE WHEN (SELECT count(*) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 61 AND 80) > 216357808 + THEN (SELECT avg(ss_ext_discount_amt) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 61 AND 80) + ELSE (SELECT avg(ss_net_paid) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 61 AND 80) END bucket4, + CASE WHEN (SELECT count(*) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 81 AND 100) > 184483884 + THEN (SELECT avg(ss_ext_discount_amt) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 81 AND 100) + ELSE (SELECT avg(ss_net_paid) + FROM store_sales[ TABLE_SUFFIX ] + WHERE ss_quantity BETWEEN 81 AND 100) END bucket5 +FROM reason[ TABLE_SUFFIX ] +WHERE r_reason_sk = 1 diff --git a/athena-tpcds/src/main/resources/queries/q90.sql b/athena-tpcds/src/main/resources/queries/q90.sql new file mode 100644 index 0000000000..85e35bf8bf --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q90.sql @@ -0,0 +1,19 @@ +SELECT cast(amc AS DECIMAL(15, 4)) / cast(pmc AS DECIMAL(15, 4)) am_pm_ratio +FROM (SELECT count(*) amc +FROM web_sales, household_demographics, time_dim, web_page +WHERE ws_sold_time_sk = time_dim.t_time_sk + AND ws_ship_hdemo_sk = household_demographics.hd_demo_sk + AND ws_web_page_sk = web_page.wp_web_page_sk + AND time_dim.t_hour BETWEEN 8 AND 8 + 1 + AND household_demographics.hd_dep_count = 6 + AND web_page.wp_char_count BETWEEN 5000 AND 5200) at, + (SELECT count(*) pmc + FROM web_sales, household_demographics, time_dim, web_page + WHERE ws_sold_time_sk = time_dim.t_time_sk + AND ws_ship_hdemo_sk = household_demographics.hd_demo_sk + AND ws_web_page_sk = web_page.wp_web_page_sk + AND time_dim.t_hour BETWEEN 19 AND 19 + 1 + AND household_demographics.hd_dep_count = 6 + AND web_page.wp_char_count BETWEEN 5000 AND 5200) pt +ORDER BY am_pm_ratio +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q91.sql b/athena-tpcds/src/main/resources/queries/q91.sql new file mode 100644 index 0000000000..9ca7ce00ac --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q91.sql @@ -0,0 +1,23 @@ +SELECT + cc_call_center_id Call_Center, + cc_name Call_Center_Name, + cc_manager Manager, + sum(cr_net_loss) Returns_Loss +FROM + call_center, catalog_returns, 
date_dim, customer, customer_address, + customer_demographics, household_demographics +WHERE + cr_call_center_sk = cc_call_center_sk + AND cr_returned_date_sk = d_date_sk + AND cr_returning_customer_sk = c_customer_sk + AND cd_demo_sk = c_current_cdemo_sk + AND hd_demo_sk = c_current_hdemo_sk + AND ca_address_sk = c_current_addr_sk + AND d_year = 1998 + AND d_moy = 11 + AND ((cd_marital_status = 'M' AND cd_education_status = 'Unknown') + OR (cd_marital_status = 'W' AND cd_education_status = 'Advanced Degree')) + AND hd_buy_potential LIKE 'Unknown%' + AND ca_gmt_offset = -7 +GROUP BY cc_call_center_id, cc_name, cc_manager, cd_marital_status, cd_education_status +ORDER BY sum(cr_net_loss) DESC diff --git a/athena-tpcds/src/main/resources/queries/q92.sql b/athena-tpcds/src/main/resources/queries/q92.sql new file mode 100644 index 0000000000..65cb68d5f4 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q92.sql @@ -0,0 +1,16 @@ +SELECT sum(ws_ext_discount_amt) AS "Excess Discount Amount " +FROM web_sales, item, date_dim +WHERE i_manufact_id = 350 + AND i_item_sk = ws_item_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-01-27' AS DATE) AND (cast('2000-01-27' AS DATE) + INTERVAL '90' day) + AND d_date_sk = ws_sold_date_sk + AND ws_ext_discount_amt > + ( + SELECT 1.3 * avg(ws_ext_discount_amt) + FROM web_sales, date_dim + WHERE ws_item_sk = i_item_sk + AND CAST(d_date as DATE) BETWEEN cast('2000-01-27' AS DATE) AND (cast('2000-01-27' AS DATE) + INTERVAL '90' day) + AND d_date_sk = ws_sold_date_sk + ) +ORDER BY sum(ws_ext_discount_amt) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q93.sql b/athena-tpcds/src/main/resources/queries/q93.sql new file mode 100644 index 0000000000..222dc31c1f --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q93.sql @@ -0,0 +1,19 @@ +SELECT + ss_customer_sk, + sum(act_sales) sumsales +FROM (SELECT + ss_item_sk, + ss_ticket_number, + ss_customer_sk, + CASE WHEN sr_return_quantity IS NOT NULL + THEN (ss_quantity - sr_return_quantity) * ss_sales_price + ELSE (ss_quantity * ss_sales_price) END act_sales +FROM store_sales + LEFT OUTER JOIN store_returns + ON (sr_item_sk = ss_item_sk AND sr_ticket_number = ss_ticket_number) + , + reason +WHERE sr_reason_sk = r_reason_sk AND r_reason_desc = 'reason 28') t +GROUP BY ss_customer_sk +ORDER BY sumsales, ss_customer_sk +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q94.sql b/athena-tpcds/src/main/resources/queries/q94.sql new file mode 100644 index 0000000000..86ae5d6be5 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q94.sql @@ -0,0 +1,23 @@ +SELECT + count(DISTINCT ws_order_number) AS "order count ", + sum(ws_ext_ship_cost) AS "total shipping cost ", + sum(ws_net_profit) AS "total net profit " +FROM + web_sales ws1, date_dim, customer_address, web_site +WHERE + CAST(d_date as DATE) BETWEEN CAST('1999-02-01' AS DATE) AND + (CAST('1999-02-01' AS DATE) + INTERVAL '60' day) + AND ws1.ws_ship_date_sk = d_date_sk + AND ws1.ws_ship_addr_sk = ca_address_sk + AND ca_state = 'IL' + AND ws1.ws_web_site_sk = web_site_sk + AND web_company_name = 'pri' + AND EXISTS(SELECT * + FROM web_sales ws2 + WHERE ws1.ws_order_number = ws2.ws_order_number + AND ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk) + AND NOT EXISTS(SELECT * + FROM web_returns wr1 + WHERE ws1.ws_order_number = wr1.wr_order_number) +ORDER BY count(DISTINCT ws_order_number) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q95.sql b/athena-tpcds/src/main/resources/queries/q95.sql new file mode 100644 index 
0000000000..ae6220a8c4 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q95.sql @@ -0,0 +1,29 @@ +WITH ws_wh AS +(SELECT + ws1.ws_order_number, + ws1.ws_warehouse_sk wh1, + ws2.ws_warehouse_sk wh2 + FROM web_sales ws1, web_sales ws2 + WHERE ws1.ws_order_number = ws2.ws_order_number + AND ws1.ws_warehouse_sk <> ws2.ws_warehouse_sk) +SELECT + count(DISTINCT ws_order_number) AS "order count ", + sum(ws_ext_ship_cost) AS "total shipping cost ", + sum(ws_net_profit) AS "total net profit " +FROM + web_sales ws1, date_dim, customer_address, web_site +WHERE + CAST(d_date as DATE) BETWEEN CAST('1999-02-01' AS DATE) AND + (CAST('1999-02-01' AS DATE) + INTERVAL '60' DAY) + AND ws1.ws_ship_date_sk = d_date_sk + AND ws1.ws_ship_addr_sk = ca_address_sk + AND ca_state = 'IL' + AND ws1.ws_web_site_sk = web_site_sk + AND web_company_name = 'pri' + AND ws1.ws_order_number IN (SELECT ws_order_number + FROM ws_wh) + AND ws1.ws_order_number IN (SELECT wr_order_number + FROM web_returns, ws_wh + WHERE wr_order_number = ws_wh.ws_order_number) +ORDER BY count(DISTINCT ws_order_number) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q96.sql b/athena-tpcds/src/main/resources/queries/q96.sql new file mode 100644 index 0000000000..7ab17e7bc4 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q96.sql @@ -0,0 +1,11 @@ +SELECT count(*) +FROM store_sales, household_demographics, time_dim, store +WHERE ss_sold_time_sk = time_dim.t_time_sk + AND ss_hdemo_sk = household_demographics.hd_demo_sk + AND ss_store_sk = s_store_sk + AND time_dim.t_hour = 20 + AND time_dim.t_minute >= 30 + AND household_demographics.hd_dep_count = 7 + AND store.s_store_name = 'ese' +ORDER BY count(*) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q97.sql b/athena-tpcds/src/main/resources/queries/q97.sql new file mode 100644 index 0000000000..e7e0b1a052 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q97.sql @@ -0,0 +1,30 @@ +WITH ssci AS ( + SELECT + ss_customer_sk customer_sk, + ss_item_sk item_sk + FROM store_sales, date_dim + WHERE ss_sold_date_sk = d_date_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + GROUP BY ss_customer_sk, ss_item_sk), + csci AS ( + SELECT + cs_bill_customer_sk customer_sk, + cs_item_sk item_sk + FROM catalog_sales, date_dim + WHERE cs_sold_date_sk = d_date_sk + AND d_month_seq BETWEEN 1200 AND 1200 + 11 + GROUP BY cs_bill_customer_sk, cs_item_sk) +SELECT + sum(CASE WHEN ssci.customer_sk IS NOT NULL AND csci.customer_sk IS NULL + THEN 1 + ELSE 0 END) store_only, + sum(CASE WHEN ssci.customer_sk IS NULL AND csci.customer_sk IS NOT NULL + THEN 1 + ELSE 0 END) catalog_only, + sum(CASE WHEN ssci.customer_sk IS NOT NULL AND csci.customer_sk IS NOT NULL + THEN 1 + ELSE 0 END) store_and_catalog +FROM ssci + FULL OUTER JOIN csci ON (ssci.customer_sk = csci.customer_sk + AND ssci.item_sk = csci.item_sk) +LIMIT 100 diff --git a/athena-tpcds/src/main/resources/queries/q98.sql b/athena-tpcds/src/main/resources/queries/q98.sql new file mode 100644 index 0000000000..4412d07d2d --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q98.sql @@ -0,0 +1,21 @@ +SELECT + i_item_desc, + i_category, + i_class, + i_current_price, + sum(ss_ext_sales_price) AS itemrevenue, + sum(ss_ext_sales_price) * 100 / sum(sum(ss_ext_sales_price)) + OVER + (PARTITION BY i_class) AS revenueratio +FROM + store_sales, item, date_dim +WHERE + ss_item_sk = i_item_sk + AND i_category IN ('Sports', 'Books', 'Home') + AND ss_sold_date_sk = d_date_sk + AND CAST(d_date as DATE) BETWEEN 
cast('1999-02-22' AS DATE) + AND (cast('1999-02-22' AS DATE) + INTERVAL '30' day) +GROUP BY + i_item_id, i_item_desc, i_category, i_class, i_current_price +ORDER BY + i_category, i_class, i_item_id, i_item_desc, revenueratio diff --git a/athena-tpcds/src/main/resources/queries/q99.sql b/athena-tpcds/src/main/resources/queries/q99.sql new file mode 100644 index 0000000000..18aee25c77 --- /dev/null +++ b/athena-tpcds/src/main/resources/queries/q99.sql @@ -0,0 +1,34 @@ +SELECT + substr(w_warehouse_name, 1, 20), + sm_type, + cc_name, + sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk <= 30) + THEN 1 + ELSE 0 END) AS "30 days ", + sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 30) AND + (cs_ship_date_sk - cs_sold_date_sk <= 60) + THEN 1 + ELSE 0 END) AS "31 - 60 days ", + sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 60) AND + (cs_ship_date_sk - cs_sold_date_sk <= 90) + THEN 1 + ELSE 0 END) AS "61 - 90 days ", + sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 90) AND + (cs_ship_date_sk - cs_sold_date_sk <= 120) + THEN 1 + ELSE 0 END) AS "91 - 120 days ", + sum(CASE WHEN (cs_ship_date_sk - cs_sold_date_sk > 120) + THEN 1 + ELSE 0 END) AS ">120 days " +FROM + catalog_sales, warehouse, ship_mode, call_center, date_dim +WHERE + d_month_seq BETWEEN 1200 AND 1200 + 11 + AND cs_ship_date_sk = d_date_sk + AND cs_warehouse_sk = w_warehouse_sk + AND cs_ship_mode_sk = sm_ship_mode_sk + AND cs_call_center_sk = cc_call_center_sk +GROUP BY + substr(w_warehouse_name, 1, 20), sm_type, cc_name +ORDER BY substr(w_warehouse_name, 1, 20), sm_type, cc_name +LIMIT 100 diff --git a/athena-tpcds/src/test/java/com/amazonaws/athena/connectors/tpcds/TPCDSMetadataHandlerTest.java b/athena-tpcds/src/test/java/com/amazonaws/athena/connectors/tpcds/TPCDSMetadataHandlerTest.java new file mode 100644 index 0000000000..2fa6389807 --- /dev/null +++ b/athena-tpcds/src/test/java/com/amazonaws/athena/connectors/tpcds/TPCDSMetadataHandlerTest.java @@ -0,0 +1,224 @@ +/*- + * #%L + * athena-tpcds + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ * #L% + */ +package com.amazonaws.athena.connectors.tpcds; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetSplitsResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableLayoutResponse; +import com.amazonaws.athena.connector.lambda.metadata.GetTableRequest; +import com.amazonaws.athena.connector.lambda.metadata.GetTableResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListSchemasResponse; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesRequest; +import com.amazonaws.athena.connector.lambda.metadata.ListTablesResponse; +import com.amazonaws.athena.connector.lambda.metadata.MetadataRequestType; +import com.amazonaws.athena.connector.lambda.metadata.MetadataResponse; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.util.Collections; +import java.util.HashMap; +import java.util.Map; + +import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_NUMBER_FIELD; +import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_SCALE_FACTOR_FIELD; +import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_TOTAL_NUMBER_FIELD; +import static org.junit.Assert.*; + +@RunWith(MockitoJUnitRunner.class) +public class TPCDSMetadataHandlerTest +{ + private static final Logger logger = LoggerFactory.getLogger(TPCDSMetadataHandlerTest.class); + + private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account"); + private TPCDSMetadataHandler handler; + private BlockAllocator allocator; + + @Mock + private AWSSecretsManager mockSecretsManager; + + @Mock + private AmazonAthena mockAthena; + + @Before + public void setUp() + throws Exception + { + handler = new TPCDSMetadataHandler(new LocalKeyFactory(), mockSecretsManager, mockAthena, "spillBucket", "spillPrefix"); + allocator = new BlockAllocatorImpl(); + } + + @After + public void tearDown() + throws Exception + { + allocator.close(); + } + + @Test + public void doListSchemaNames() + { + logger.info("doListSchemas - enter"); + + ListSchemasRequest req = new ListSchemasRequest(identity, "queryId", "default"); + ListSchemasResponse res = 
handler.doListSchemaNames(allocator, req);
+        logger.info("doListSchemas - {}", res.getSchemas());
+
+        assertTrue(res.getSchemas().size() == 5);
+        logger.info("doListSchemas - exit");
+    }
+
+    @Test
+    public void doListTables()
+    {
+        logger.info("doListTables - enter");
+
+        ListTablesRequest req = new ListTablesRequest(identity, "queryId", "default", "tpcds1");
+        ListTablesResponse res = handler.doListTables(allocator, req);
+        logger.info("doListTables - {}", res.getTables());
+
+        assertTrue(res.getTables().contains(new TableName("tpcds1", "customer")));
+
+        assertTrue(res.getTables().size() == 25);
+
+        logger.info("doListTables - exit");
+    }
+
+    @Test
+    public void doGetTable()
+    {
+        logger.info("doGetTable - enter");
+        String expectedSchema = "tpcds1";
+
+        GetTableRequest req = new GetTableRequest(identity,
+                "queryId",
+                "default",
+                new TableName(expectedSchema, "customer"));
+
+        GetTableResponse res = handler.doGetTable(allocator, req);
+        logger.info("doGetTable - {} {}", res.getTableName(), res.getSchema());
+
+        assertEquals(new TableName(expectedSchema, "customer"), res.getTableName());
+        assertTrue(res.getSchema() != null);
+
+        logger.info("doGetTable - exit");
+    }
+
+    @Test
+    public void doGetTableLayout()
+            throws Exception
+    {
+        logger.info("doGetTableLayout - enter");
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+
+        Schema schema = SchemaBuilder.newBuilder().build();
+
+        GetTableLayoutRequest req = new GetTableLayoutRequest(identity,
+                "queryId",
+                "default",
+                new TableName("tpcds1", "customer"),
+                new Constraints(constraintsMap),
+                schema,
+                Collections.EMPTY_SET);
+
+        GetTableLayoutResponse res = handler.doGetTableLayout(allocator, req);
+
+        logger.info("doGetTableLayout - {}", res.getPartitions().getSchema());
+        logger.info("doGetTableLayout - {}", res.getPartitions());
+
+        assertTrue(res.getPartitions().getRowCount() == 1);
+
+        logger.info("doGetTableLayout - exit");
+    }
+
+    @Test
+    public void doGetSplits()
+    {
+        logger.info("doGetSplits: enter");
+
+        Schema schema = SchemaBuilder.newBuilder()
+                .addIntField("partitionId")
+                .build();
+
+        Block partitions = BlockUtils.newBlock(allocator, "partitionId", Types.MinorType.INT.getType(), 1);
+
+        String continuationToken = null;
+        GetSplitsRequest originalReq = new GetSplitsRequest(identity,
+                "queryId",
+                "catalog_name",
+                new TableName("tpcds1", "customer"),
+                partitions,
+                Collections.EMPTY_LIST,
+                new Constraints(new HashMap<>()),
+                continuationToken);
+
+        int numContinuations = 0;
+        do {
+            GetSplitsRequest req = new GetSplitsRequest(originalReq, continuationToken);
+            logger.info("doGetSplits: req[{}]", req);
+
+            MetadataResponse rawResponse = handler.doGetSplits(allocator, req);
+            assertEquals(MetadataRequestType.GET_SPLITS, rawResponse.getRequestType());
+
+            GetSplitsResponse response = (GetSplitsResponse) rawResponse;
+            continuationToken = response.getContinuationToken();
+
+            logger.info("doGetSplits: continuationToken[{}] - numSplits[{}]", continuationToken, response.getSplits().size());
+
+            for (Split nextSplit : response.getSplits()) {
+                assertNotNull(nextSplit.getProperty(SPLIT_NUMBER_FIELD));
+                assertNotNull(nextSplit.getProperty(SPLIT_TOTAL_NUMBER_FIELD));
+                assertNotNull(nextSplit.getProperty(SPLIT_SCALE_FACTOR_FIELD));
+            }
+
+            if (continuationToken != null) {
+                numContinuations++;
+            }
+        }
+        while (continuationToken != null);
+
+        assertTrue(numContinuations == 0);
+
+        logger.info("doGetSplits: exit");
+    }
+} diff --git a/athena-tpcds/src/test/java/com/amazonaws/athena/connectors/tpcds/TPCDSRecordHandlerTest.java
b/athena-tpcds/src/test/java/com/amazonaws/athena/connectors/tpcds/TPCDSRecordHandlerTest.java new file mode 100644 index 0000000000..01baaafa64 --- /dev/null +++ b/athena-tpcds/src/test/java/com/amazonaws/athena/connectors/tpcds/TPCDSRecordHandlerTest.java @@ -0,0 +1,277 @@ +/*- + * #%L + * athena-tpcds + * %% + * Copyright (C) 2019 Amazon Web Services + * %% + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + * #L% + */ +package com.amazonaws.athena.connectors.tpcds; + +import com.amazonaws.athena.connector.lambda.data.Block; +import com.amazonaws.athena.connector.lambda.data.BlockAllocator; +import com.amazonaws.athena.connector.lambda.data.BlockAllocatorImpl; +import com.amazonaws.athena.connector.lambda.data.BlockUtils; +import com.amazonaws.athena.connector.lambda.data.S3BlockSpillReader; +import com.amazonaws.athena.connector.lambda.data.SchemaBuilder; +import com.amazonaws.athena.connector.lambda.domain.Split; +import com.amazonaws.athena.connector.lambda.domain.TableName; +import com.amazonaws.athena.connector.lambda.domain.predicate.Constraints; +import com.amazonaws.athena.connector.lambda.domain.predicate.EquatableValueSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.Range; +import com.amazonaws.athena.connector.lambda.domain.predicate.SortedRangeSet; +import com.amazonaws.athena.connector.lambda.domain.predicate.ValueSet; +import com.amazonaws.athena.connector.lambda.domain.spill.S3SpillLocation; +import com.amazonaws.athena.connector.lambda.domain.spill.SpillLocation; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsRequest; +import com.amazonaws.athena.connector.lambda.records.ReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.records.RecordResponse; +import com.amazonaws.athena.connector.lambda.records.RemoteReadRecordsResponse; +import com.amazonaws.athena.connector.lambda.security.EncryptionKeyFactory; +import com.amazonaws.athena.connector.lambda.security.FederatedIdentity; +import com.amazonaws.athena.connector.lambda.security.LocalKeyFactory; +import com.amazonaws.services.athena.AmazonAthena; +import com.amazonaws.services.s3.AmazonS3; +import com.amazonaws.services.s3.model.PutObjectResult; +import com.amazonaws.services.s3.model.S3Object; +import com.amazonaws.services.s3.model.S3ObjectInputStream; +import com.amazonaws.services.secretsmanager.AWSSecretsManager; +import com.google.common.io.ByteStreams; +import com.teradata.tpcds.Table; +import com.teradata.tpcds.column.Column; +import org.apache.arrow.vector.types.Types; +import org.apache.arrow.vector.types.pojo.Schema; +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.junit.runner.RunWith; +import org.mockito.Mock; +import org.mockito.invocation.InvocationOnMock; +import org.mockito.runners.MockitoJUnitRunner; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayInputStream; +import java.io.InputStream; +import java.util.ArrayList; +import java.util.HashMap; +import java.util.List; +import 
java.util.Map;
+import java.util.UUID;
+
+import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_NUMBER_FIELD;
+import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_SCALE_FACTOR_FIELD;
+import static com.amazonaws.athena.connectors.tpcds.TPCDSMetadataHandler.SPLIT_TOTAL_NUMBER_FIELD;
+import static org.junit.Assert.*;
+import static org.mockito.Matchers.anyObject;
+import static org.mockito.Matchers.anyString;
+import static org.mockito.Mockito.mock;
+import static org.mockito.Mockito.when;
+
+@RunWith(MockitoJUnitRunner.class)
+public class TPCDSRecordHandlerTest
+{
+    private static final Logger logger = LoggerFactory.getLogger(TPCDSRecordHandlerTest.class);
+
+    private FederatedIdentity identity = new FederatedIdentity("id", "principal", "account");
+    private List<ByteHolder> mockS3Storage;
+    private TPCDSRecordHandler handler;
+    private S3BlockSpillReader spillReader;
+    private BlockAllocator allocator;
+    private EncryptionKeyFactory keyFactory = new LocalKeyFactory();
+    private Table table;
+    private Schema schemaForRead;
+
+    @Mock
+    private AmazonS3 mockS3;
+
+    @Mock
+    private AWSSecretsManager mockSecretsManager;
+
+    @Mock
+    private AmazonAthena mockAthena;
+
+    @Before
+    public void setUp()
+            throws Exception
+    {
+        for (Table next : Table.getBaseTables()) {
+            if (next.getName().equals("customer")) {
+                table = next;
+            }
+        }
+
+        SchemaBuilder schemaBuilder = SchemaBuilder.newBuilder();
+        for (Column nextCol : table.getColumns()) {
+            schemaBuilder.addField(TPCDSUtils.convertColumn(nextCol));
+        }
+        schemaForRead = schemaBuilder.build();
+
+        mockS3Storage = new ArrayList<>();
+        allocator = new BlockAllocatorImpl();
+        handler = new TPCDSRecordHandler(mockS3, mockSecretsManager, mockAthena);
+        spillReader = new S3BlockSpillReader(mockS3, allocator);
+
+        when(mockS3.putObject(anyObject(), anyObject(), anyObject(), anyObject()))
+                .thenAnswer((InvocationOnMock invocationOnMock) ->
+                {
+                    synchronized (mockS3Storage) {
+                        InputStream inputStream = (InputStream) invocationOnMock.getArguments()[2];
+                        ByteHolder byteHolder = new ByteHolder();
+                        byteHolder.setBytes(ByteStreams.toByteArray(inputStream));
+                        mockS3Storage.add(byteHolder);
+                        return mock(PutObjectResult.class);
+                    }
+                });
+
+        when(mockS3.getObject(anyString(), anyString()))
+                .thenAnswer((InvocationOnMock invocationOnMock) ->
+                {
+                    synchronized (mockS3Storage) {
+                        S3Object mockObject = mock(S3Object.class);
+                        ByteHolder byteHolder = mockS3Storage.get(0);
+                        mockS3Storage.remove(0);
+                        when(mockObject.getObjectContent()).thenReturn(
+                                new S3ObjectInputStream(
+                                        new ByteArrayInputStream(byteHolder.getBytes()), null));
+                        return mockObject;
+                    }
+                });
+    }
+
+    @After
+    public void tearDown()
+            throws Exception
+    {
+        allocator.close();
+    }
+
+    @Test
+    public void doReadRecordsNoSpill()
+            throws Exception
+    {
+        logger.info("doReadRecordsNoSpill: enter");
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("c_customer_id", EquatableValueSet.newBuilder(allocator, Types.MinorType.VARCHAR.getType(), true, false)
+                .add("AAAAAAAABAAAAAAA")
+                .add("AAAAAAAACAAAAAAA")
+                .add("AAAAAAAADAAAAAAA").build());
+
+        ReadRecordsRequest request = new ReadRecordsRequest(identity,
+                "catalog",
+                "queryId-" + System.currentTimeMillis(),
+                new TableName("tpcds1", table.getName()),
+                schemaForRead,
+                Split.newBuilder(S3SpillLocation.newBuilder()
+                        .withBucket(UUID.randomUUID().toString())
+                        .withSplitId(UUID.randomUUID().toString())
+                        .withQueryId(UUID.randomUUID().toString())
+                        .withIsDirectory(true)
+                        .build(),
keyFactory.create())
+                        .add(SPLIT_NUMBER_FIELD, "0")
+                        .add(SPLIT_TOTAL_NUMBER_FIELD, "1000")
+                        .add(SPLIT_SCALE_FACTOR_FIELD, "1")
+                        .build(),
+                new Constraints(constraintsMap),
+                100_000_000_000L,
+                100_000_000_000L // 100GB, so we don't expect this request to spill
+        );
+
+        RecordResponse rawResponse = handler.doReadRecords(allocator, request);
+
+        assertTrue(rawResponse instanceof ReadRecordsResponse);
+
+        ReadRecordsResponse response = (ReadRecordsResponse) rawResponse;
+        logger.info("doReadRecordsNoSpill: rows[{}]", response.getRecordCount());
+
+        assertTrue(response.getRecords().getRowCount() == 3);
+        logger.info("doReadRecordsNoSpill: {}", BlockUtils.rowToString(response.getRecords(), 0));
+
+        logger.info("doReadRecordsNoSpill: exit");
+    }
+
+    @Test
+    public void doReadRecordsSpill()
+            throws Exception
+    {
+        logger.info("doReadRecordsSpill: enter");
+
+        Map<String, ValueSet> constraintsMap = new HashMap<>();
+        constraintsMap.put("c_current_cdemo_sk", SortedRangeSet.of(
+                Range.range(allocator, Types.MinorType.BIGINT.getType(), 100L, true, 100_000_000L, true)));
+
+        ReadRecordsRequest request = new ReadRecordsRequest(identity,
+                "catalog",
+                "queryId-" + System.currentTimeMillis(),
+                new TableName("tpcds1", table.getName()),
+                schemaForRead,
+                Split.newBuilder(S3SpillLocation.newBuilder()
+                        .withBucket(UUID.randomUUID().toString())
+                        .withSplitId(UUID.randomUUID().toString())
+                        .withQueryId(UUID.randomUUID().toString())
+                        .withIsDirectory(true)
+                        .build(),
+                        keyFactory.create())
+                        .add(SPLIT_NUMBER_FIELD, "0")
+                        .add(SPLIT_TOTAL_NUMBER_FIELD, "10000")
+                        .add(SPLIT_SCALE_FACTOR_FIELD, "1")
+                        .build(),
+                new Constraints(constraintsMap),
+                1_500_000L, // ~1.5MB, so we should see some spill
+                0
+        );
+
+        RecordResponse rawResponse = handler.doReadRecords(allocator, request);
+
+        assertTrue(rawResponse instanceof RemoteReadRecordsResponse);
+
+        try (RemoteReadRecordsResponse response = (RemoteReadRecordsResponse) rawResponse) {
+            logger.info("doReadRecordsSpill: remoteBlocks[{}]", response.getRemoteBlocks().size());
+
+            assertTrue(response.getNumberBlocks() > 1);
+
+            int blockNum = 0;
+            for (SpillLocation next : response.getRemoteBlocks()) {
+                S3SpillLocation spillLocation = (S3SpillLocation) next;
+                try (Block block = spillReader.read(spillLocation, response.getEncryptionKey(), response.getSchema())) {
+
+                    logger.info("doReadRecordsSpill: blockNum[{}] and recordCount[{}]", blockNum++, block.getRowCount());
+                    // assertTrue(++blockNum < response.getRemoteBlocks().size() && block.getRowCount() > 10_000);
+
+                    logger.info("doReadRecordsSpill: {}", BlockUtils.rowToString(block, 0));
+                    assertNotNull(BlockUtils.rowToString(block, 0));
+                }
+            }
+        }
+
+        logger.info("doReadRecordsSpill: exit");
+    }
+
+    private class ByteHolder
+    {
+        private byte[] bytes;
+
+        public void setBytes(byte[] bytes)
+        {
+            this.bytes = bytes;
+        }
+
+        public byte[] getBytes()
+        {
+            return bytes;
+        }
+    }
+} diff --git a/athena-udfs/LICENSE.txt b/athena-udfs/LICENSE.txt new file mode 100644 index 0000000000..67db858821 --- /dev/null +++ b/athena-udfs/LICENSE.txt @@ -0,0 +1,175 @@ +
+                                 Apache License
+                           Version 2.0, January 2004
+                        http://www.apache.org/licenses/
+
+   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
+
+   1. Definitions.
+
+      "License" shall mean the terms and conditions for use, reproduction,
+      and distribution as defined by Sections 1 through 9 of this document.
+
+      "Licensor" shall mean the copyright owner or entity authorized by
+      the copyright owner that is granting the License.
+ + "Legal Entity" shall mean the union of the acting entity and all + other entities that control, are controlled by, or are under common + control with that entity. For the purposes of this definition, + "control" means (i) the power, direct or indirect, to cause the + direction or management of such entity, whether by contract or + otherwise, or (ii) ownership of fifty percent (50%) or more of the + outstanding shares, or (iii) beneficial ownership of such entity. + + "You" (or "Your") shall mean an individual or Legal Entity + exercising permissions granted by this License. + + "Source" form shall mean the preferred form for making modifications, + including but not limited to software source code, documentation + source, and configuration files. + + "Object" form shall mean any form resulting from mechanical + transformation or translation of a Source form, including but + not limited to compiled object code, generated documentation, + and conversions to other media types. + + "Work" shall mean the work of authorship, whether in Source or + Object form, made available under the License, as indicated by a + copyright notice that is included in or attached to the work + (an example is provided in the Appendix below). + + "Derivative Works" shall mean any work, whether in Source or Object + form, that is based on (or derived from) the Work and for which the + editorial revisions, annotations, elaborations, or other modifications + represent, as a whole, an original work of authorship. For the purposes + of this License, Derivative Works shall not include works that remain + separable from, or merely link (or bind by name) to the interfaces of, + the Work and Derivative Works thereof. + + "Contribution" shall mean any work of authorship, including + the original version of the Work and any modifications or additions + to that Work or Derivative Works thereof, that is intentionally + submitted to Licensor for inclusion in the Work by the copyright owner + or by an individual or Legal Entity authorized to submit on behalf of + the copyright owner. For the purposes of this definition, "submitted" + means any form of electronic, verbal, or written communication sent + to the Licensor or its representatives, including but not limited to + communication on electronic mailing lists, source code control systems, + and issue tracking systems that are managed by, or on behalf of, the + Licensor for the purpose of discussing and improving the Work, but + excluding communication that is conspicuously marked or otherwise + designated in writing by the copyright owner as "Not a Contribution." + + "Contributor" shall mean Licensor and any individual or Legal Entity + on behalf of whom a Contribution has been received by Licensor and + subsequently incorporated within the Work. + + 2. Grant of Copyright License. Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + copyright license to reproduce, prepare Derivative Works of, + publicly display, publicly perform, sublicense, and distribute the + Work and such Derivative Works in Source or Object form. + + 3. Grant of Patent License. 
Subject to the terms and conditions of + this License, each Contributor hereby grants to You a perpetual, + worldwide, non-exclusive, no-charge, royalty-free, irrevocable + (except as stated in this section) patent license to make, have made, + use, offer to sell, sell, import, and otherwise transfer the Work, + where such license applies only to those patent claims licensable + by such Contributor that are necessarily infringed by their + Contribution(s) alone or by combination of their Contribution(s) + with the Work to which such Contribution(s) was submitted. If You + institute patent litigation against any entity (including a + cross-claim or counterclaim in a lawsuit) alleging that the Work + or a Contribution incorporated within the Work constitutes direct + or contributory patent infringement, then any patent licenses + granted to You under this License for that Work shall terminate + as of the date such litigation is filed. + + 4. Redistribution. You may reproduce and distribute copies of the + Work or Derivative Works thereof in any medium, with or without + modifications, and in Source or Object form, provided that You + meet the following conditions: + + (a) You must give any other recipients of the Work or + Derivative Works a copy of this License; and + + (b) You must cause any modified files to carry prominent notices + stating that You changed the files; and + + (c) You must retain, in the Source form of any Derivative Works + that You distribute, all copyright, patent, trademark, and + attribution notices from the Source form of the Work, + excluding those notices that do not pertain to any part of + the Derivative Works; and + + (d) If the Work includes a "NOTICE" text file as part of its + distribution, then any Derivative Works that You distribute must + include a readable copy of the attribution notices contained + within such NOTICE file, excluding those notices that do not + pertain to any part of the Derivative Works, in at least one + of the following places: within a NOTICE text file distributed + as part of the Derivative Works; within the Source form or + documentation, if provided along with the Derivative Works; or, + within a display generated by the Derivative Works, if and + wherever such third-party notices normally appear. The contents + of the NOTICE file are for informational purposes only and + do not modify the License. You may add Your own attribution + notices within Derivative Works that You distribute, alongside + or as an addendum to the NOTICE text from the Work, provided + that such additional attribution notices cannot be construed + as modifying the License. + + You may add Your own copyright statement to Your modifications and + may provide additional or different license terms and conditions + for use, reproduction, or distribution of Your modifications, or + for any such Derivative Works as a whole, provided Your use, + reproduction, and distribution of the Work otherwise complies with + the conditions stated in this License. + + 5. Submission of Contributions. Unless You explicitly state otherwise, + any Contribution intentionally submitted for inclusion in the Work + by You to the Licensor shall be under the terms and conditions of + this License, without any additional terms or conditions. + Notwithstanding the above, nothing herein shall supersede or modify + the terms of any separate license agreement you may have executed + with Licensor regarding such Contributions. + + 6. Trademarks. 
This License does not grant permission to use the trade
+      names, trademarks, service marks, or product names of the Licensor,
+      except as required for reasonable and customary use in describing the
+      origin of the Work and reproducing the content of the NOTICE file.
+
+   7. Disclaimer of Warranty. Unless required by applicable law or
+      agreed to in writing, Licensor provides the Work (and each
+      Contributor provides its Contributions) on an "AS IS" BASIS,
+      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
+      implied, including, without limitation, any warranties or conditions
+      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
+      PARTICULAR PURPOSE. You are solely responsible for determining the
+      appropriateness of using or redistributing the Work and assume any
+      risks associated with Your exercise of permissions under this License.
+
+   8. Limitation of Liability. In no event and under no legal theory,
+      whether in tort (including negligence), contract, or otherwise,
+      unless required by applicable law (such as deliberate and grossly
+      negligent acts) or agreed to in writing, shall any Contributor be
+      liable to You for damages, including any direct, indirect, special,
+      incidental, or consequential damages of any character arising as a
+      result of this License or out of the use or inability to use the
+      Work (including but not limited to damages for loss of goodwill,
+      work stoppage, computer failure or malfunction, or any and all
+      other commercial damages or losses), even if such Contributor
+      has been advised of the possibility of such damages.
+
+   9. Accepting Warranty or Additional Liability. While redistributing
+      the Work or Derivative Works thereof, You may choose to offer,
+      and charge a fee for, acceptance of support, warranty, indemnity,
+      or other liability obligations and/or rights consistent with this
+      License. However, in accepting such obligations, You may act only
+      on Your own behalf and on Your sole responsibility, not on behalf
+      of any other Contributor, and only if You agree to indemnify,
+      defend, and hold each Contributor harmless for any liability
+      incurred by, or claims asserted against, such Contributor by reason
+      of your accepting any such warranty or additional liability. diff --git a/athena-udfs/README.md b/athena-udfs/README.md new file mode 100644 index 0000000000..a2dde7c694 --- /dev/null +++ b/athena-udfs/README.md @@ -0,0 +1,22 @@ +# Amazon Athena UDF Connector
+
+This connector extends Amazon Athena's capability by adding customizable UDFs via Lambda.
+
+## Supported UDFs
+
+TODO: Add supported UDFs
+
+### Deploying The Connector
+
+To use this connector in your queries, navigate to AWS Serverless Application Repository and deploy a pre-built version of this connector. Alternatively, you can build and deploy this connector from source by following the steps below, or use the more detailed tutorial in the athena-example module:
+
+1. From the athena-federation-sdk dir, run `mvn clean install` if you haven't already.
+2. From the athena-udfs dir, run `mvn clean install`.
+3. From the athena-udfs dir, run `../tools/publish.sh S3_BUCKET_NAME athena-udfs` to publish the connector to your private AWS Serverless Application Repository. The S3_BUCKET_NAME in the command is the bucket where a copy of the connector's code will be stored for Serverless Application Repository to retrieve it. This allows users with the requisite permissions to deploy instances of the connector via a 1-Click form. Then navigate to [Serverless Application Repository](https://aws.amazon.com/serverless/serverlessrepo). A hypothetical example query is sketched below.
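+
+Once an instance is deployed, UDFs are invoked from Athena with the `USING EXTERNAL FUNCTION` syntax. The following is a minimal sketch, not part of this module: it assumes you kept the sample `example_udf` methods in `AthenaUDFHandler` and chose `athena_udfs` as the Lambda function name at deploy time.
+
+```sql
+USING EXTERNAL FUNCTION example_udf(input VARCHAR) RETURNS VARCHAR LAMBDA 'athena_udfs'
+SELECT example_udf('test')
+-- the sample VARCHAR overload appends '_dada', so this should return 'test_dada'
+```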
+
+## License
+
+This project is licensed under the Apache-2.0 License.
\ No newline at end of file
diff --git a/athena-udfs/athena-udfs.yaml b/athena-udfs/athena-udfs.yaml
new file mode 100644
index 0000000000..6c9a135a49
--- /dev/null
+++ b/athena-udfs/athena-udfs.yaml
@@ -0,0 +1,40 @@
+Transform: 'AWS::Serverless-2016-10-31'
+Metadata:
+  'AWS::ServerlessRepo::Application':
+    Name: AthenaUserDefinedFunctions
+    Description: 'This connector enables Amazon Athena to leverage common UDFs made available via Lambda.'
+    Author: 'Amazon Athena'
+    SpdxLicenseId: Apache-2.0
+    LicenseUrl: LICENSE.txt
+    ReadmeUrl: README.md
+    Labels:
+      - athena-federation
+    HomePageUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+    SemanticVersion: 1.0.0
+    SourceCodeUrl: 'https://github.com/awslabs/aws-athena-query-federation'
+Parameters:
+  UDFFunctionName:
+    Description: 'The name you want to give the Lambda function that will contain your UDFs.'
+    Type: String
+  LambdaTimeout:
+    Description: 'Maximum Lambda invocation runtime in seconds. (min 1 - 900 max)'
+    Default: 900
+    Type: Number
+  LambdaMemory:
+    Description: 'Lambda memory in MB (min 128 - 3008 max).'
+    Default: 3008
+    Type: Number
+Resources:
+  ConnectorConfig:
+    Type: 'AWS::Serverless::Function'
+    Properties:
+      FunctionName: !Ref UDFFunctionName
+      Handler: "com.amazonaws.athena.connectors.udfs.AthenaUDFHandler"
+      CodeUri: "./target/athena-udfs-1.0.jar"
+      Description: "This connector enables Amazon Athena to leverage common UDFs made available via Lambda."
+      Runtime: java8
+      Timeout: !Ref LambdaTimeout
+      MemorySize: !Ref LambdaMemory
\ No newline at end of file
diff --git a/athena-udfs/pom.xml b/athena-udfs/pom.xml
new file mode 100644
index 0000000000..b9cdfe163c
--- /dev/null
+++ b/athena-udfs/pom.xml
@@ -0,0 +1,57 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <parent>
+        <artifactId>aws-athena-query-federation</artifactId>
+        <groupId>com.amazonaws</groupId>
+        <version>1.0</version>
+    </parent>
+    <modelVersion>4.0.0</modelVersion>
+
+    <artifactId>athena-udfs</artifactId>
+
+    <dependencies>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-athena-federation-sdk</artifactId>
+            <version>${aws-athena-federation-sdk.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.google.guava</groupId>
+            <artifactId>guava</artifactId>
+            <version>21.0</version>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <configuration>
+                    <createDependencyReducedPom>false</createDependencyReducedPom>
+                    <filters>
+                        <filter>
+                            <artifact>*:*</artifact>
+                            <excludes>
+                                <exclude>META-INF/*.SF</exclude>
+                                <exclude>META-INF/*.DSA</exclude>
+                                <exclude>META-INF/*.RSA</exclude>
+                            </excludes>
+                        </filter>
+                    </filters>
+                </configuration>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
\ No newline at end of file
diff --git a/athena-udfs/src/main/java/com/amazonaws/athena/connectors/udfs/AthenaUDFHandler.java b/athena-udfs/src/main/java/com/amazonaws/athena/connectors/udfs/AthenaUDFHandler.java
new file mode 100644
index 0000000000..ee6211438a
--- /dev/null
+++ b/athena-udfs/src/main/java/com/amazonaws/athena/connectors/udfs/AthenaUDFHandler.java
@@ -0,0 +1,117 @@
+/*-
+ * #%L
+ * athena-udfs
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.udfs;
+
+import com.amazonaws.athena.connector.lambda.udf.UserDefinedFunctionHandler;
+import com.google.common.collect.ImmutableMap;
+
+import java.math.BigDecimal;
+import java.math.RoundingMode;
+import java.time.LocalDate;
+import java.time.LocalDateTime;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * Example UDFs. Each public method defines a UDF with the same (lowercase) name;
+ * the overloads of example_udf cover the supported argument types.
+ */
+public class AthenaUDFHandler
+        extends UserDefinedFunctionHandler
+{
+    public Boolean example_udf(Boolean value)
+    {
+        return !value;
+    }
+
+    public Byte example_udf(Byte value)
+    {
+        return (byte) (value + 1);
+    }
+
+    public Short example_udf(Short value)
+    {
+        return (short) (value + 1);
+    }
+
+    public Integer example_udf(Integer value)
+    {
+        return value + 1;
+    }
+
+    public Long example_udf(Long value)
+    {
+        return value + 1;
+    }
+
+    public Float example_udf(Float value)
+    {
+        return value + 1;
+    }
+
+    public Double example_udf(Double value)
+    {
+        return value + 1;
+    }
+
+    public BigDecimal example_udf(BigDecimal value)
+    {
+        // setScale returns a new BigDecimal rather than mutating in place,
+        // so the result must be assigned.
+        BigDecimal one = BigDecimal.ONE.setScale(value.scale(), RoundingMode.HALF_UP);
+        return value.add(one);
+    }
+
+    public String example_udf(String value)
+    {
+        return value + "_dada";
+    }
+
+    public LocalDateTime example_udf(LocalDateTime value)
+    {
+        return value.minusDays(1);
+    }
+
+    public LocalDate example_udf(LocalDate value)
+    {
+        return value.minusDays(1);
+    }
+
+    public List<Integer> example_udf(List<Integer> value)
+    {
+        System.out.println("Array input: " + value);
+        List<Integer> result = value.stream().map(o -> o + 1).collect(Collectors.toList());
+        System.out.println("Array output: " + result);
+        return result;
+    }
+
+    public Map<String, Object> example_udf(Map<String, Object> value)
+    {
+        Long longVal = (Long) value.get("x");
+        Double doubleVal = (Double) value.get("y");
+
+        return ImmutableMap.of("x", longVal + 1, "y", doubleVal + 1.0);
+    }
+
+    public byte[] example_udf(byte[] value)
+    {
+        byte[] output = new byte[value.length];
+        for (int i = 0; i < value.length; ++i) {
+            output[i] = (byte) (value[i] + 1);
+        }
+        return output;
+    }
+}
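The handler above illustrates the extension pattern: each public method defines a UDF whose SQL name matches the method name, and the overloads distinguish argument types. A hedged sketch of a custom UDF one might add to this class (the mask_pii method below is hypothetical, not part of this commit):

```java
// Hypothetical UDF: masks all but the last four characters of a VARCHAR.
// Assumes the SDK resolves it by the method name "mask_pii" and the String
// argument type, as the example_udf overloads imply.
public String mask_pii(String value)
{
    if (value == null || value.length() <= 4) {
        return value;
    }
    StringBuilder masked = new StringBuilder();
    for (int i = 0; i < value.length() - 4; i++) {
        masked.append('*');
    }
    return masked.append(value.substring(value.length() - 4)).toString();
}
```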
diff --git a/athena-udfs/src/test/java/com/amazonaws/athena/connectors/udfs/AthenaUDFHandlerTest.java b/athena-udfs/src/test/java/com/amazonaws/athena/connectors/udfs/AthenaUDFHandlerTest.java
new file mode 100644
index 0000000000..8306c08c95
--- /dev/null
+++ b/athena-udfs/src/test/java/com/amazonaws/athena/connectors/udfs/AthenaUDFHandlerTest.java
@@ -0,0 +1,27 @@
+/*-
+ * #%L
+ * athena-udfs
+ * %%
+ * Copyright (C) 2019 Amazon Web Services
+ * %%
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *      http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ * #L%
+ */
+package com.amazonaws.athena.connectors.udfs;
+
+import org.junit.Test;
+
+import static org.junit.Assert.assertEquals;
+
+public class AthenaUDFHandlerTest
+{
+    @Test
+    public void exampleUdfIncrementsInteger()
+    {
+        // Minimal smoke test of the Integer overload; a placeholder until fuller coverage is added.
+        assertEquals(Integer.valueOf(2), new AthenaUDFHandler().example_udf(1));
+    }
+}
diff --git a/checkstyle.xml b/checkstyle.xml
new file mode 100644
index 0000000000..ccd3c1f750
--- /dev/null
+++ b/checkstyle.xml
@@ -0,0 +1,171 @@
+<!-- 171 lines of Checkstyle rule configuration; the XML element content was lost in extraction and is not recoverable here. -->
diff --git a/docs/img/athena_federation_demo.png b/docs/img/athena_federation_demo.png
new file mode 100644
index 0000000000..4c4666201f
Binary files /dev/null and b/docs/img/athena_federation_demo.png differ
diff --git a/docs/img/athena_federation_flow.png b/docs/img/athena_federation_flow.png
new file mode 100644
index 0000000000..6b5dbdc5ce
Binary files /dev/null and b/docs/img/athena_federation_flow.png differ
diff --git a/docs/img/athena_federation_summary.png b/docs/img/athena_federation_summary.png
new file mode 100644
index 0000000000..72fb5f037d
Binary files /dev/null and b/docs/img/athena_federation_summary.png differ
diff --git a/docs/img/hbase_glue_example.png b/docs/img/hbase_glue_example.png
new file mode 100644
index 0000000000..60300f7150
Binary files /dev/null and b/docs/img/hbase_glue_example.png differ
diff --git a/pom.xml b/pom.xml
new file mode 100644
index 0000000000..08f88d13d5
--- /dev/null
+++ b/pom.xml
@@ -0,0 +1,187 @@
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0"
+         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
+         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+    <modelVersion>4.0.0</modelVersion>
+
+    <groupId>com.amazonaws</groupId>
+    <artifactId>aws-athena-query-federation</artifactId>
+    <packaging>pom</packaging>
+    <version>1.0</version>
+    <properties>
+        <maven.compiler.source>1.8</maven.compiler.source>
+        <maven.compiler.target>1.8</maven.compiler.target>
+        <aws-sdk.version>1.11.490</aws-sdk.version>
+        <aws-athena-federation-sdk.version>2019.47.1</aws-athena-federation-sdk.version>
+        <aws.lambda-java-core.version>1.2.0</aws.lambda-java-core.version>
+        <slf4j-log4j.version>1.7.28</slf4j-log4j.version>
+        <mockito.version>1.10.19</mockito.version>
+        <junit.version>4.11</junit.version>
+    </properties>
+
+    <organization>
+        <name>Amazon Web Services</name>
+        <url>https://aws.amazon.com/</url>
+    </organization>
+
+    <inceptionYear>2019</inceptionYear>
+
+    <licenses>
+        <license>
+            <name>Apache License 2.0</name>
+            <url>http://www.apache.org/licenses/LICENSE-2.0</url>
+            <distribution>repo</distribution>
+        </license>
+    </licenses>
+
+    <scm>
+        <connection>scm:git@github.com:awslabs/aws-athena-query-federation.git</connection>
+        <url>https://github.com/awslabs/aws-athena-query-federation</url>
+        <tag>HEAD</tag>
+    </scm>
+
+    <modules>
+        <module>athena-cloudwatch</module>
+        <module>athena-docdb</module>
+        <module>athena-redis</module>
+        <module>athena-bigquery</module>
+        <module>athena-aws-cmdb</module>
+        <module>athena-android</module>
+        <module>athena-dynamodb</module>
+        <module>athena-hbase</module>
+        <module>athena-cloudwatch-metrics</module>
+        <module>athena-example</module>
+        <module>athena-federation-sdk</module>
+        <module>athena-tpcds</module>
+        <module>athena-jdbc</module>
+        <module>athena-federation-sdk-tools</module>
+        <module>athena-udfs</module>
+    </modules>
+
+    <dependencies>
+        <dependency>
+            <groupId>org.slf4j</groupId>
+            <artifactId>slf4j-log4j12</artifactId>
+            <version>${slf4j-log4j.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>com.amazonaws</groupId>
+            <artifactId>aws-lambda-java-core</artifactId>
+            <version>${aws.lambda-java-core.version}</version>
+        </dependency>
+        <dependency>
+            <groupId>junit</groupId>
+            <artifactId>junit</artifactId>
+            <version>${junit.version}</version>
+            <scope>test</scope>
+        </dependency>
+        <dependency>
+            <groupId>org.mockito</groupId>
+            <artifactId>mockito-all</artifactId>
+            <version>${mockito.version}</version>
+            <scope>test</scope>
+        </dependency>
+    </dependencies>
+
+    <build>
+        <plugins>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-shade-plugin</artifactId>
+                <version>3.2.1</version>
+                <executions>
+                    <execution>
+                        <phase>package</phase>
+                        <goals>
+                            <goal>shade</goal>
+                        </goals>
+                        <configuration>
+                            <artifactSet>
+                                <excludes>
+                                    <exclude>classworlds:classworlds</exclude>
+                                    <exclude>junit:junit</exclude>
+                                    <exclude>jmock:*</exclude>
+                                    <exclude>*:xml-apis</exclude>
+                                    <exclude>org.apache.maven:lib:tests</exclude>
+                                    <exclude>java.*</exclude>
+                                </excludes>
+                            </artifactSet>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.jacoco</groupId>
+                <artifactId>jacoco-maven-plugin</artifactId>
+                <version>0.8.4</version>
+                <executions>
+                    <execution>
+                        <goals>
+                            <goal>prepare-agent</goal>
+                        </goals>
+                    </execution>
+                    <execution>
+                        <id>report</id>
+                        <phase>test</phase>
+                        <goals>
+                            <goal>report</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.apache.maven.plugins</groupId>
+                <artifactId>maven-checkstyle-plugin</artifactId>
+                <version>3.1.0</version>
+                <configuration>
+                    <configLocation>checkstyle.xml</configLocation>
+                    <encoding>UTF-8</encoding>
+                    <consoleOutput>true</consoleOutput>
+                    <failsOnError>true</failsOnError>
+                    <linkXRef>false</linkXRef>
+                </configuration>
+                <executions>
+                    <execution>
+                        <id>validate</id>
+                        <phase>validate</phase>
+                        <goals>
+                            <goal>check</goal>
+                        </goals>
+                    </execution>
+                </executions>
+            </plugin>
+            <plugin>
+                <groupId>org.codehaus.mojo</groupId>
+                <artifactId>license-maven-plugin</artifactId>
+                <version>2.0.0</version>
+                <configuration>
+                    <verbose>false</verbose>
+                    <addSvnKeyWords>false</addSvnKeyWords>
+                    <canUpdateCopyright>false</canUpdateCopyright>
+                </configuration>
+                <executions>
+                    <execution>
+                        <id>first</id>
+                        <goals>
+                            <goal>update-file-header</goal>
+                        </goals>
+                        <phase>process-sources</phase>
+                        <configuration>
+                            <licenseName>apache_v2</licenseName>
+                            <roots>
+                                <root>src/main/java</root>
+                                <root>src/test</root>
+                            </roots>
+                        </configuration>
+                    </execution>
+                </executions>
+            </plugin>
+        </plugins>
+    </build>
+</project>
diff --git a/tools/prepare_dev_env.sh b/tools/prepare_dev_env.sh
new file mode 100755
index 0000000000..55444d49af
--- /dev/null
+++ b/tools/prepare_dev_env.sh
@@ -0,0 +1,74 @@
+#!/bin/bash
+
+# Copyright (C) 2019 Amazon Web Services
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+cat << EOF
+#
+# This script will prepare your development environment by installing certain pre-requisites, namely:
+#   1. Apache Maven
+#   2. Homebrew - a package manager that will be used to fetch the next two pre-requisites.
+#   3. AWS CLI (latest version)
+#   4. AWS SAM Build Tool (latest version)
+#
+# This script has been designed and tested to work on Amazon Linux but may require slight adjustment for other Operating Systems.
+# All tools used here (except Homebrew) are supported on all major Operating Systems: Windows, Linux, Mac OS.
+#
+# This script may prompt you for yes/no responses or permission to continue at various points. It is not meant to run unattended.
+#
+EOF
+
+while true; do
+    read -p "Do you wish to proceed? (yes or no) " yn
+    case $yn in
+        [Yy]* ) echo "Proceeding..."; break;;
+        [Nn]* ) exit;;
+        * ) echo "Please answer yes or no.";;
+    esac
+done
+
+set -e
+sudo wget https://archive.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.tar.gz -O /tmp/apache-maven-3.5.4-bin.tar.gz
+sudo tar xf /tmp/apache-maven-3.5.4-bin.tar.gz -C /opt
+echo "export M2_HOME=/opt/apache-maven-3.5.4" >> ~/.profile
+echo "export PATH=\${M2_HOME}/bin:\${PATH}" >> ~/.profile
+echo "export M2_HOME=/opt/apache-maven-3.5.4" >> ~/.bash_profile
+echo "export PATH=\${M2_HOME}/bin:\${PATH}" >> ~/.bash_profile
+
+sudo yum -y install java-1.8.0-openjdk-devel
+sudo update-alternatives --set java /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java
+sudo update-alternatives --set javac /usr/lib/jvm/java-1.8.0-openjdk.x86_64/bin/javac
+
+# If the above update-alternatives doesn't work and you don't know your path, try
+# sudo update-alternatives --config java
+# sudo update-alternatives --config javac
+
+sh -c "$(curl -fsSL https://raw.githubusercontent.com/Linuxbrew/install/master/install.sh)"
+test -d ~/.linuxbrew && eval $(~/.linuxbrew/bin/brew shellenv)
+test -d /home/linuxbrew/.linuxbrew && eval $(/home/linuxbrew/.linuxbrew/bin/brew shellenv)
+test -r ~/.bash_profile && echo "eval \$($(brew --prefix)/bin/brew shellenv)" >>~/.bash_profile
+echo "eval \$($(brew --prefix)/bin/brew shellenv)" >>~/.profile
+echo "eval \$($(brew --prefix)/bin/brew shellenv)" >>~/.bash_profile
+
+source ~/.profile
+
+brew tap aws/tap
+brew reinstall awscli
+brew reinstall aws-sam-cli
+
+aws --version
+sam --version
+
+echo ""
+echo ""
+# Single quotes here: with backticks inside double quotes, the original line
+# would have executed 'source ~/.profile' instead of printing it.
+echo "To ensure your terminal can see the new tools we installed, run 'source ~/.profile' or open a fresh terminal."
\ No newline at end of file
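Once the script finishes, a quick sanity check of the installed toolchain might look like the sketch below; the two AWS version commands are the same ones the script itself runs, with Maven and Java added:

```bash
# Confirm the tools installed by prepare_dev_env.sh are on the PATH.
source ~/.profile
mvn -version
java -version
aws --version
sam --version
```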
diff --git a/tools/publish.sh b/tools/publish.sh
new file mode 100755
index 0000000000..e1becba6dd
--- /dev/null
+++ b/tools/publish.sh
@@ -0,0 +1,97 @@
+#!/bin/bash
+
+# Copyright (C) 2019 Amazon Web Services
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+cat << EOF
+# Run this script from the directory of the module (e.g. athena-example) that you wish to publish.
+# This script performs the following actions:
+#   1. Builds the maven project
+#   2. Creates a Serverless Application Package using the athena-example.yaml
+#   3. Produces a final packaged.yaml which can be used to publish the application to your
+#      private Serverless Application Repository or deployed via Cloudformation.
+#   4. Uploads the packaged connector code to the S3 bucket you specified.
+#   5. Uses sar_bucket_policy.json to grant Serverless Application Repository access to our connector code in S3.
+#   6. Publishes the connector to your private Serverless Application Repository where you can 1-click deploy it.
+EOF
+
+while true; do
+    read -p "Do you wish to proceed? (yes or no) " yn
+    case $yn in
+        [Yy]* ) echo "Proceeding..."; break;;
+        [Nn]* ) exit;;
+        * ) echo "Please answer yes or no.";;
+    esac
+done
+
+if [ "$#" -lt 2 ]; then
+    echo -e "\n\nERROR: Script requires at least 2 arguments \n"
+    echo -e "\n1. S3_BUCKET used for publishing artifacts to Lambda/Serverless App Repo.\n"
+    echo -e "\n2. The connector module to publish (e.g. athena-example or athena-cloudwatch) \n"
+    echo -e "\n3. (Optional) The AWS REGION to target (e.g. us-east-1 or us-east-2), defaults to us-east-1 \n"
+    echo -e "\n\n USAGE from the module's directory: ../tools/publish.sh my_s3_bucket athena-example [region] \n"
+    exit 1;
+fi
+
+if test -f "$2".yaml; then
+    echo "SAR yaml found. We appear to be in the right directory."
+else
+    echo "SAR yaml not found, attempting to switch to module directory."
+    cd "$2"
+fi
+
+REGION=$3
+if [ -z "$REGION" ]
+then
+    REGION="us-east-1"
+fi
+
+echo "Using AWS Region $REGION"
+
+
+if ! aws s3api get-bucket-policy --bucket "$1" --region "$REGION" | grep 'Statement' ; then
+    echo "No bucket policy is set on $1, would you like to add a Serverless Application Repository Bucket Policy?"
+    apply_policy="yes"
+    while true; do
+        read -p "Do you wish to proceed? (yes or no) " yn
+        case $yn in
+            [Yy]* ) echo "Proceeding..."; break;;
+            [Nn]* ) echo "Skipping bucket policy; note that this may result in failed attempts to publish to Serverless Application Repository"; apply_policy="no"; break;;
+            * ) echo "Please answer yes or no.";;
+        esac
+    done
+
+    # Only write and apply the policy if the user agreed above; the original
+    # script applied it regardless of the answer.
+    if [ "$apply_policy" = "yes" ]; then
+cat > sar_bucket_policy.json <<- EOM
+{
+    "Version": "2012-10-17",
+    "Statement": [
+        {
+            "Effect": "Allow",
+            "Principal": {
+                "Service": "serverlessrepo.amazonaws.com"
+            },
+            "Action": "s3:GetObject",
+            "Resource": "arn:aws:s3:::$1/*"
+        }
+    ]
+}
+EOM
+        set -e
+        aws s3api put-bucket-policy --bucket "$1" --region "$REGION" --policy file://sar_bucket_policy.json
+    fi
+fi
+
+set -e
+mvn clean install
+
+sam package --template-file "$2".yaml --output-template-file packaged.yaml --s3-bucket "$1" --region "$REGION"
+sam publish --template packaged.yaml --region "$REGION"
+
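The region argument is optional and defaults to us-east-1. A hypothetical end-to-end invocation that publishes this module to another region (the bucket name is a placeholder for one you own in that region):

```bash
# Publish the athena-udfs connector to your private Serverless Application
# Repository in us-west-2.
cd athena-udfs
../tools/publish.sh my-artifact-bucket athena-udfs us-west-2
```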
diff --git a/tools/validate_connector.sh b/tools/validate_connector.sh
new file mode 100755
index 0000000000..654a6c3164
--- /dev/null
+++ b/tools/validate_connector.sh
@@ -0,0 +1,50 @@
+#!/bin/bash
+
+# Copyright (C) 2019 Amazon Web Services
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+#     http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+cat << EOF
+# Run this script from any directory. It performs the following actions:
+#   1. Builds the maven project, if needed.
+#   2. Simulates an Athena query running against your connector that is deployed as a Lambda function.
+#
+# NOTE: This test may cause a full table scan against your data source. If prompted to provide a
+# query predicate, doing so will avoid a full table scan. You can also opt to stop the simulated query
+# after the 'planning phase' so that it does not simulate processing any splits.
+#
+# Use the -h or --help args to print usage information.
+#
+# Use 'yes | tools/validate_connector.sh [args]' to bypass this check. USE CAUTION.
+#
+EOF
+
+while true; do
+    read -p "Do you wish to proceed? (yes or no) " yn
+    case $yn in
+        [Yy]* ) echo "Proceeding..."; break;;
+        [Nn]* ) exit;;
+        * ) echo "Please answer yes or no.";;
+    esac
+done
+
+dir=$(cd -P -- "$(dirname -- "$0")" && pwd -P)
+
+cd "$dir/../athena-federation-sdk-tools"
+
+# Check for the shaded jar that is actually run below, not just the plain jar.
+if test -f "target/athena-federation-sdk-tools-1.0-withdep.jar"; then
+    echo "athena-federation-sdk-tools is already built, skipping compilation."
+else
+    mvn clean install
+fi
+
+java -cp target/athena-federation-sdk-tools-1.0-withdep.jar com.amazonaws.athena.connector.validation.ConnectorValidator "$@"
\ No newline at end of file
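The script forwards its arguments unmodified to ConnectorValidator, so the authoritative list of options comes from the tool itself. As the script's header notes, `-h`/`--help` prints usage, and piping `yes` bypasses the confirmation prompt:

```bash
# Print ConnectorValidator's usage information without the interactive confirmation.
yes | tools/validate_connector.sh --help
```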