This capstone project gives you a taste of what data scientists go through when working with real datasets. You will assume the role of a data scientist working for a startup that intends to compete with SpaceX and, in the process, follow the data science methodology: data collection, data wrangling, exploratory data analysis, data visualization, model development, model evaluation, and reporting your results to stakeholders.
- Introduction
- Week 1
- Week 2
- Week 3
- Week 4
- Week 5
- Out of the Box Thinking
- Technology Stack
- Tips & Tricks
- Extra Study Materials
The commercial space age is here, and companies are making space travel affordable for everyone. Virgin Galactic provides suborbital spaceflights. Rocket Lab is a small satellite provider. Blue Origin manufactures suborbital and orbital reusable rockets. Perhaps the most successful is SpaceX, whose accomplishments include sending spacecraft to the International Space Station, launching Starlink (a satellite constellation providing Internet access), and sending crewed missions to space. One reason SpaceX can do this is that its rocket launches are relatively inexpensive. SpaceX advertises Falcon 9 rocket launches on its website at a cost of 62 million dollars; other providers cost upwards of 165 million dollars each. Much of the savings comes from SpaceX being able to reuse the first stage.

[Figure: SpaceX Launch Failure Instances]
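To make the cost argument concrete, here is a minimal sketch of how the two advertised figures could be blended into an expected launch cost. It rests on an assumption that is not in the figures themselves: the 165 million dollar number, quoted above as a lower bound for other providers, is used as a stand-in for the cost of a launch whose first stage is not reused. The function and constant names are hypothetical.

```python
# Figures quoted above, in US dollars.
REUSED_LAUNCH_COST = 62_000_000       # advertised Falcon 9 price
EXPENDABLE_LAUNCH_COST = 165_000_000  # ASSUMPTION: lower bound for other providers,
                                      # used here as the cost without first-stage reuse


def expected_launch_cost(p_reuse,
                         reused_cost=REUSED_LAUNCH_COST,
                         expendable_cost=EXPENDABLE_LAUNCH_COST):
    """Blend the two costs by the probability that the first stage is reused."""
    if not 0.0 <= p_reuse <= 1.0:
        raise ValueError("p_reuse must be a probability between 0 and 1")
    return p_reuse * reused_cost + (1.0 - p_reuse) * expendable_cost


# Difference between the two quoted figures.
savings_per_launch = EXPENDABLE_LAUNCH_COST - REUSED_LAUNCH_COST

if __name__ == "__main__":
    print(f"Savings per launch: ${savings_per_launch:,}")
    for p in (0.0, 0.5, 1.0):
        print(f"p_reuse={p:.1f} -> expected cost ${expected_launch_cost(p):,.0f}")
```

The probability `p_reuse` is exactly what a landing-prediction model would supply, which is why the prediction matters for pricing.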
In this project, I take the role of a data scientist working for a new rocket company, Space Y, founded by billionaire industrialist Allon Musk, which would like to compete with SpaceX. My job is to determine the price of each launch. I do this by gathering information about SpaceX and creating dashboards for my team. I also determine whether SpaceX will reuse the first stage. Instead of using rocket science to determine whether the first stage will land successfully, I train a machine learning model on public information to predict whether SpaceX will reuse the first stage.
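As a rough illustration of the prediction step, the sketch below trains a tiny logistic regression classifier with plain gradient descent, using only the standard library. Everything specific in it is an assumption made for illustration: the two features (payload mass and prior booster flights), the toy rows, and the hyperparameters are hypothetical and are not real SpaceX records or the model used in this project.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(x, weights, bias):
    """Probability that the first stage lands, for one feature row."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_logistic(rows, labels, lr=0.05, epochs=3000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(rows)
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(rows, labels):
            err = predict_proba(x, weights, bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# HYPOTHETICAL toy data, not real launch records.
# Features: [payload mass in tonnes, previous flights of the booster]
X = [[2.0, 3], [3.0, 2], [2.5, 4], [4.0, 3],
     [9.0, 0], [10.0, 0], [8.5, 1], [9.5, 0]]
y = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = first stage landed, 0 = it did not

weights, bias = train_logistic(X, y)

if __name__ == "__main__":
    for row in ([2.0, 3], [10.0, 0]):
        print(row, round(predict_proba(row, weights, bias), 3))
```

In practice a library such as scikit-learn would normally replace the hand-written training loop; the loop is spelled out here only to show what the model is doing.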
- Advanced Folium function: MeasureControl plugin with Folium
- IBM Cognos visualization with Player
- Python - Programming Language
- IBM Watson Studio - IBM’s software platform for data science
- IBM Db2 - Family of data management products, including database servers, developed by IBM
- Jupyter Notebooks - Open-source web application that allows data scientists to create and share documents that integrate live code, equations, computational output, visualizations, and other multimedia resources.
- Anaconda - Local environment for practice
- pythonanywhere.com - Host your Python App via Flask
- plotly.com - Plotly maintains Python data visualization and UI libraries. With Dash Open Source, Dash apps run on your local laptop or server.
- IBM Cognos Dashboard - IBM Cognos Analytics provides dashboards and stories to communicate your insights and analysis. You can assemble a view that contains visualizations such as a graph, chart, plot, table, map, or any other visual representation of data.
- GitHub - Repository for storing all files