Problem statement:
Below is a link to JSON profiles representing fictional customers of an e-commerce company. Each profile contains information about the customer, their orders, their transactions, the payment methods they used, and whether the customer is fraudulent. Your task is one of the following.
- Option A
  - Transform the JSON profiles into feature vectors.
  - Construct a model to predict whether a customer is fraudulent based on their profile.
  - Report on the model's success and show which features are most important.
- Option B
  - Transform the JSON profiles into a CSV or DataFrame and provide exploratory analysis of the dataset.
  - Summarise and explain the key trends in the data, providing visualisations and tabular representations as necessary.
  - Explain which factors you think are significant and insignificant in contributing to fraud.
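Both options start by turning a nested profile into a flat record. The following is a minimal sketch of that step, not a prescribed solution. The field names used here ("orders", "transactions", "paymentMethods", "orderAmount", "transactionFailed", "fraudulent") are assumptions made for illustration; check them against the actual data before relying on them.

```python
from statistics import mean


def profile_to_features(profile: dict) -> dict:
    """Flatten one nested customer profile into a flat feature record.

    All keys read from `profile` are hypothetical and may not match the
    real dataset; missing keys fall back to empty lists or defaults.
    """
    orders = profile.get("orders", [])
    transactions = profile.get("transactions", [])
    payment_methods = profile.get("paymentMethods", [])

    # Order amounts, defaulting to 0 when the assumed key is absent.
    amounts = [order.get("orderAmount", 0) for order in orders]
    # Count transactions flagged as failed (assumed boolean field).
    failed = sum(1 for t in transactions if t.get("transactionFailed"))

    return {
        "n_orders": len(orders),
        "n_transactions": len(transactions),
        "n_payment_methods": len(payment_methods),
        "mean_order_amount": mean(amounts) if amounts else 0.0,
        "failed_transaction_rate": failed / len(transactions) if transactions else 0.0,
        # Target label: 1 for fraudulent, 0 otherwise.
        "label": int(bool(profile.get("fraudulent", False))),
    }


# A made-up profile, used only to show the shape of the output.
example = {
    "fraudulent": True,
    "orders": [{"orderAmount": 10}, {"orderAmount": 30}],
    "transactions": [{"transactionFailed": True}, {"transactionFailed": False}],
    "paymentMethods": [{"paymentMethodType": "card"}],
}
features = profile_to_features(example)
print(features)
```

A list of such records can be passed directly to a DataFrame constructor for Option B, or split into a feature matrix and a label column for Option A.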
Link to the data:
https://drive.google.com/file/d/1NEoLthft8j7-BLVs7NH4pApFIcb7-g0z/view
Expected outcome:
Please use Python for this exercise. You can use whatever external software libraries you think are appropriate.
Please don't spend more than 3-4 hours on this test.
You can download the data and store it locally in your program.
Check your code and analysis in to GitHub and send us the link.
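For the local copy of the data mentioned above, a loader along these lines may be a useful starting point. It is a sketch under assumptions: the brief does not say whether the file is a single JSON array or one JSON object per line, so the hypothetical helper `load_profiles` tries the first and falls back to the second.

```python
import json
from pathlib import Path


def load_profiles(path) -> list:
    """Load customer profiles from a locally stored file.

    Assumption: the file is either one JSON document (an array of
    profiles or a single profile) or newline-delimited JSON.
    """
    text = Path(path).read_text(encoding="utf-8")
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Not a single JSON document: parse one object per non-empty line.
        return [json.loads(line) for line in text.splitlines() if line.strip()]
    # Wrap a single profile so callers always receive a list.
    return data if isinstance(data, list) else [data]
```

Usage would look like `profiles = load_profiles("customers.json")`, where the filename is a placeholder for wherever the download was saved.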