SExI System SQL Scripts and data loading implementation #51
Overview
This PR implements the Silverbullet Expenses and Invoices (SExI) system using Trino SQL. The implementation processes data from HR and Finance files to create a database structure for tracking employees, expenses, and supplier invoices.
Implementation Details
Data Processing
load_data.py: Processes source data from HR and Finance directories, generating SQL files for database creation and data loading. Handles employee records, expense receipts, and supplier invoices with proper SQL escaping for special characters.
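As a rough illustration of the escaping step, the sketch below shows one way generated INSERT statements could quote their values. It is not taken from load_data.py: the helper names `sql_literal` and `insert_statement` and the column names are hypothetical, and it assumes only the standard SQL rule that a single quote inside a string literal is written as two single quotes.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (hypothetical helper)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "TRUE" if value else "FALSE"
    if isinstance(value, (int, float)):
        return str(value)
    # Standard SQL escapes a single quote inside a string by doubling it.
    return "'" + str(value).replace("'", "''") + "'"


def insert_statement(table, row):
    """Build one INSERT statement from a dict of column name -> value."""
    columns = ", ".join(row)
    values = ", ".join(sql_literal(v) for v in row.values())
    return f"INSERT INTO {table} ({columns}) VALUES ({values});"


# Column names here are assumptions made for the example only.
stmt = insert_statement(
    "EMPLOYEE",
    {"employee_id": 7, "first_name": "Conan", "last_name": "O'Brien", "manager_id": None},
)
print(stmt)
```

Because the output is a static SQL file rather than a live query, literal escaping is the relevant safeguard here; code that talks to a database directly would normally use bound parameters instead.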
Database Structure
create_employees.sql: Creates the EMPLOYEE table with employee details and management relationships.
create_expenses.sql: Creates the EXPENSE table to track all company expenses with employee attribution.
create_invoices.sql: Creates SUPPLIER and INVOICE tables to manage vendor relationships and payment obligations.
Analysis Queries
find_manager_cycles.sql: Identifies cycles in the management hierarchy that could cause issues with expense approval workflows.
find_employee_expenses.sql: Retrieves detailed expense information for specific employees, including the approval chain.
find_employee_managers.sql: Traverses the complete management chain for any employee using recursive CTEs.
calculate_largest_expensors.sql: Identifies employees whose expenses exceed the spending thresholds and reports each one's manager.
generate_supplier_payment_plans.sql: Creates structured payment plans for all suppliers, ensuring invoices are paid before due dates.
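To make the cycle check concrete, here is a minimal Python sketch of the idea behind find_manager_cycles.sql. It is an illustration only, not the SQL in this PR: it assumes the hierarchy is available as a mapping from employee id to manager id, and the function name `find_manager_cycles` and the sample data are hypothetical.

```python
def find_manager_cycles(manager_of):
    """Return each management cycle once, as a list of employee ids.

    manager_of maps employee id -> manager id (None when there is no manager).
    """
    cycles = []
    finished = set()  # employees whose chain has already been fully explored
    for start in manager_of:
        path = []
        position = {}  # employee id -> index in the current path
        current = start
        while current is not None and current not in finished:
            if current in position:
                # We walked back onto the current path: that tail is a cycle.
                cycles.append(path[position[current]:])
                break
            position[current] = len(path)
            path.append(current)
            current = manager_of.get(current)
        finished.update(path)
    return cycles


# Employees 3 -> 4 -> 5 -> 3 form a cycle; 6 reports into it but is not part of it.
manager_of = {1: None, 2: 1, 3: 4, 4: 5, 5: 3, 6: 3}
cycles = find_manager_cycles(manager_of)
print(cycles)
```

Each employee is visited once, so the check runs in time linear in the number of employees. An employee listed as their own manager is reported as a cycle of length one.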
Technical Approach
Used recursive Common Table Expressions (CTEs) for hierarchical data analysis
Implemented proper SQL string escaping to prevent injection vulnerabilities
Created modular SQL files for each specific business requirement
Ensured all data is loaded exclusively from provided source files
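The recursive CTE approach can be sketched as follows. This example runs against an in-memory SQLite database through Python's `sqlite3` module so that it is self-contained; the PR itself targets Trino, whose `WITH RECURSIVE` syntax is similar but not identical. The table layout, column names, sample rows, and the depth cap of 50 are assumptions for illustration and are not taken from the SQL files in this PR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT,
        manager_id  INTEGER
    );
    INSERT INTO employee VALUES (1, 'Ada', NULL), (2, 'Ben', 1), (3, 'Cy', 2);
""")

# Walk upward from one employee to the top of the hierarchy.
# The depth cap stops the recursion if the data contains a management cycle.
CHAIN_QUERY = """
WITH RECURSIVE chain (employee_id, name, manager_id, depth) AS (
    SELECT employee_id, name, manager_id, 0
    FROM employee
    WHERE employee_id = ?
    UNION ALL
    SELECT m.employee_id, m.name, m.manager_id, c.depth + 1
    FROM chain AS c
    JOIN employee AS m ON m.employee_id = c.manager_id
    WHERE c.depth < 50
)
SELECT name, depth FROM chain WHERE depth > 0 ORDER BY depth
"""

managers = conn.execute(CHAIN_QUERY, (3,)).fetchall()
print(managers)
```

The anchor row (depth 0) is the employee being looked up, so filtering on `depth > 0` leaves only the managers, ordered from the direct manager upward.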