What is this Book?   How to Contribute   YouTube   Twitter   Amazon Shop
Check out my Data Engineering Academy at LearnDataEngineering.com trusted by almost 2,000 students!
Visit learndataengineering.com: Click Here
- Learn Data Engineering with our online Academy
 - Perfect for becoming a Data Engineer or adding Data Engineering to your skillset
 - Proven process based on years of experience and hundreds of hours of personal coaching
 - Over 30 prepared courses on the most important techniques, fundamental tools and platforms, plus our Associate Data Engineer Certification
 - Academy Discord server with over 1,000 members
 
- Amazon: Click Here to buy whatever you like from Amazon using this link* (also check out my complete podcast gear and books)
 
Find the change log with all recent updates here: SEE UPDATES
- Introduction
 - Basic Engineering Skills
 - Advanced Engineering Skills
 - Free Hands On Courses / Tutorials
 - Case Studies
 - Best Practices Cloud Platforms
 - 130+ Data Sources Data Science
 - 1001 Interview Questions
 - Recommended Books, Courses, and Podcasts
 - Updates
 
- What is this Cookbook
 - Data Engineers
 - My Data Science Platform Blueprint
 - Who Companies Need
 - How to Learn Data Engineering
 - Data Engineers Skills Matrix
 - How to Become a Senior Data Engineer
 
- Learn To Code
 - Get Familiar With Git
 - Agile Development
 - Software Engineering Culture
 - Learn how a Computer Works
 - Data Network Transmission
 - Security and Privacy
 - Linux
 - Docker
 - The Cloud
 - Security Zone Design
 
- Data Science Platform
 - 81 Platform & Pipeline Design Questions
 - Connect
 - Buffer
 - Processing Frameworks
- Lambda and Kappa Architecture
 - Batch Processing
 - Stream Processing
 - Should You do Stream or Batch Processing
 - Is ETL still relevant for Analytics?
 - MapReduce
 - Apache Spark
- What is the Difference to MapReduce?
 - How Spark Fits to Hadoop
 - Spark vs Hadoop
 - Spark and Hadoop a Perfect Fit
 - Spark on YARN
 - My Simple Rule of Thumb
 - Available Languages
 - Spark Driver Executor and SparkContext
 - Spark Batch vs Stream processing
 - How Spark uses Data From Hadoop
 - What are RDDs and How to Use Them
 - SparkSQL How and Why to Use It
 - What are Dataframes and How to Use Them
 - Machine Learning on Spark (TensorFlow)
 - MLlib
 - Spark Setup
 - Spark Resource Management
 
 - AWS Lambda
 - Apache Flink
 - Elasticsearch
 - Apache Drill
 - StreamSets
 
 - Store
 - Visualize
 - Machine Learning
- How to do Machine Learning in production
 - Why machine learning in production is harder than you think
 - Models Do Not Work Forever
 - Where are The Platforms That Support Machine Learning
 - Training Parameter Management
 - How to Convince People That Machine Learning Works
 - No Rules No Physical Models
 - You Have The Data. Use It!
 - Data is Stronger Than Opinions
 - AWS Sagemaker
 
 
- Free Data Engineering Course with AWS, TDengine, Docker and Grafana
 - Monitor your data in dbt & detect quality issues with Elementary
 - Solving Engineers' 4 Biggest Airflow Problems
 - The best alternative to Airflow? Mage.ai
 
- Data Science @Airbnb
 - Data Science @Amazon
 - Data Science @Baidu
 - Data Science @Blackrock
 - Data Science @BMW
 - Data Science @Booking.com
 - Data Science @CERN
 - Data Science @Disney
 - Data Science @DLR
 - Data Science @Drivetribe
 - Data Science @Dropbox
 - Data Science @Ebay
 - Data Science @Expedia
 - Data Science @Facebook
 - Data Science @Google
 - Data Science @Grammarly
 - Data Science @ING Fraud
 - Data Science @Instagram
 - Data Science @LinkedIn
 - Data Science @Lyft
 - Data Science @NASA
 - Data Science @Netflix
 - Data Science @OLX
 - Data Science @OTTO
 - Data Science @Paypal
 - Data Science @Pinterest
 - Data Science @Salesforce
 - Data Science @Siemens Mindsphere
 - Data Science @Slack
 - Data Science @Spotify
 - Data Science @Symantec
 - Data Science @Tinder
 - Data Science @Twitter
 - Data Science @Uber
 - Data Science @Upwork
 - Data Science @Woot
 - Data Science @Zalando
 
- Student Favorites
 - General And Academic
 - Content Marketing
 - Crime
 - Drugs
 - Education
 - Entertainment
 - Environmental And Weather Data
 - Financial And Economic Data
 - Government And World
 - Health
 - Human Rights
 - Labor And Employment Data
 - Politics
 - Retail
 - Social
 - Travel And Transportation
 - Various Portals
 - Source Articles and Blog Posts
 - Free Data Sources Data Science
 
If you have some cool links or topics for the cookbook, please become a contributor.
Simply pull the repo, add your ideas and create a pull request. You can also open an issue and put your thoughts there.
Please use the "Issues" function for comments.
Subscribe to my YouTube channel for regular updates: Link to YouTube
I have a Medium publication where you can publish your data engineering articles to reach more people: Medium publication
*(As an Amazon Associate I earn from qualifying purchases from Amazon. This is free of charge for you, but super helpful for supporting this channel.)
