Welcome to the Paradime dbt™ Data Modeling Challenge - Movie Edition!
- Submit Your Application: Fill out the registration form.
- Verification by Paradime: We'll review your application against the entry requirements.
After verification, you'll receive two emails from Paradime:
- Snowflake Account Credentials: Contains your Snowflake account details. Search for an email with subject line "Start Your Movie Data Modeling Challenge – Your Snowflake Credentials."
- Paradime Platform Invitation: An invitation to access the Paradime Platform. Search for an email with the subject line "[Paradime] Activate your account."
- Access Paradime: Use the provided credentials to log into your account. Join the Paradime workspace using the invite email.
- Snowflake Integration: Add Snowflake credentials (Username, Password, Role, Database) to Paradime.
- Act Fast - Limited Time Activation: The links to activate your Paradime account expire within 24 hours!
Note: A step-by-step tutorial is available in your Snowflake credentials email, "Start Your Movie Data Modeling Challenge – Your Snowflake Credentials".
- Create a New Branch: Open the Paradime Editor and create a new branch. Your branch name should follow this format: "movie-<your_email>"
- Start Developing: Begin crafting SQL queries, developing dbt™ models, and generating insights!
- Follow this documentation or watch this step-by-step tutorial.
Note: If you log in to Snowflake, your default role is PUBLIC. Switch your role to the one we provide in the Snowflake email (e.g., "[your_name]_transformer").
Need Help?: Check out this step-by-step video tutorial, and join the #movie-competition channel on Slack for assistance.
Now that you're set up, you have until May 26th, 2024, to complete and submit your project!
- Paradime:
- Dive into the Paradime Editor with this step-by-step, interactive guide. It's designed to familiarize you with the core functionalities of the editor and with the project. You can also watch our YouTube videos:
- All features in our intuitive IDE Apps Panel
- AI-enabled IDE for dbt™ development | DinoAI | Paradime.io
- Paradime Help Docs: For a comprehensive understanding of all the features and how to make the most of Paradime for your project, explore the Paradime Help Docs.
- Snowflake Data Warehouse: Learn about the data warehouse and the pre-loaded data in this step-by-step, interactive guide.
- Lightdash: Learn about Lightdash via this instructional video.
Paradime has pre-loaded your Snowflake account with 3 Movie datasets. These datasets contain roughly 1,700,000 rows of detailed Movie and TV Show data. Please understand that these datasets are not entirely accurate; they're simply a starting point. You will need to bring in your own datasets to truly excel in this challenge.
- In Snowflake: Directly explore the datasets in Snowflake for hands-on analysis.
- GitHub Repository Resources:
- Staging Files: These files provide a preliminary view and structure of the datasets available in this repository.
- schema.yml File: This file contains schema definitions, helping you understand the data models and their relationships.
- Paradime Catalog UI: Use the Paradime Catalog UI for an interactive exploration of the datasets, featuring intuitive search and navigation.
Your primary goal is to construct dbt™ models that unearth compelling insights and captivate Movie fans. These three datasets are your starting point, and as you bring in additional data, the possibilities for discovery are virtually limitless. This is your playground to innovate and explore the depths of Movie and TV data.
Before diving in, ensure you're familiar with the Judging Criteria so you've got a chance to win the $500-$1500 Amazon gift cards!
Check out the example submission from Paradime's recent NBA Data Modeling Challenge, as well as the winning submissions from that challenge:
- First Place - Spence Perry's Submission
- Second Place - Chris Hughes' Submission
- Third Place - István Mózes' Submission
Additionally, here are some questions you might consider answering:
- Highest grossing films of all time. Data Required: omdb_movies and/or tmdb_movies. You might also consider bringing in third-party data to understand highest grossing films by country.
- Highest/lowest ROI films of all time. Data Required: omdb_movies and/or tmdb_movies. See columns "budget", "revenue", and "box office".
- Actors who appear in most films. Data Required: omdb_movies, column "actors".
- Highest grossing directors and writers. Data Required: omdb_movies, columns "director" and "writer".
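To make two of these questions concrete, here is a minimal Python sketch of the underlying logic. It is only an illustration: in the challenge itself you would express this in SQL inside dbt™ models. The sample rows are made up, and the sketch assumes that "budget" and "revenue" are numeric and that "actors" is a comma-separated string. Check the actual column types in your Snowflake account before relying on either assumption.

```python
from collections import Counter

# Hypothetical sample rows for illustration only. The column names come from
# the challenge description; the values and types are assumptions.
movies = [
    {"title": "Film A", "budget": 100, "revenue": 350, "actors": "Ann Lee, Bo Cruz"},
    {"title": "Film B", "budget": 200, "revenue": 100, "actors": "Bo Cruz, Cy Diaz"},
    {"title": "Film C", "budget": 0, "revenue": 50, "actors": None},
]


def roi(budget, revenue):
    """Return (revenue - budget) / budget, or None when it cannot be computed."""
    # A missing or zero budget would make the ratio meaningless (or divide by zero).
    if not budget or revenue is None:
        return None
    return (revenue - budget) / budget


def actor_film_counts(rows):
    """Count how many films each actor appears in."""
    counts = Counter()
    for row in rows:
        names = (row["actors"] or "").split(",")
        # Use a set so an actor listed twice on one film is counted once.
        counts.update({name.strip() for name in names if name.strip()})
    return counts


roi_by_title = {m["title"]: roi(m["budget"], m["revenue"]) for m in movies}
top_actors = actor_film_counts(movies).most_common(1)
```

Returning `None` for films with a zero or missing budget keeps incomplete records out of the ROI ranking instead of letting them distort it. Since the datasets are described as not entirely accurate, a similar guard is probably worth having in your SQL as well.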
Submission Deadline: May 26th, 2024
Once your project is complete, please submit the following materials to Parker Rogers ([email protected]) with Subject Line "<your_name> - Movie Data Modeling Challenge Submission":
- GitHub Branch: Send the link to your GitHub branch containing your dbt™ models.
- README.md: Include a README file that narrates your project's story, methodology, and insights. Check out this example README from our previous NBA Data Modeling challenge.
- Data Visualizations and Insights: Showcase your findings, ideally within your README.md. For inspiration, refer to these example visualizations from our previous NBA Data Modeling challenge.
If you're having issues submitting your project, watch this interactive tutorial.
We look forward to seeing your creative and insightful analyses!
Here's an example project that fulfills all requirements and would be eligible for cash prizes. Feel free to use this template for your submission. We also recommend diving into the winners' submissions from our recent NBA Data Modeling Challenge for inspiration.
A simple intro. Example - "Explore my project for the dbt™ data modeling challenge - Movie Edition, Hosted by Paradime! This project dives into the analysis and visualization of Movie and TV data!"
My analysis leverages four key data sets:
- data set name #1
- data set name #2
- data set name #3
- data set name #4
- Copy and paste your data lineage image here. Watch this YouTube Tutorial to learn how.
- Paradime for SQL, dbt™.
- Snowflake for data storage and computing.
- Lightdash for data visualization.
- Other tool(s) used and why.
[Image]
Share a clear and concise conclusion of your findings!