A sample data science project that uses a logistic regression model built in R to predict whether loans from the German Credit dataset default or are paid off. Specifically, this example demonstrates how to create ModelOp Center (MOC)-compliant code.
- The R source file houses the MOC-compliant code to predict and compute metrics on data.
- `trained_model.RData` is the trained model artifact that is loaded at prediction time. In this case, the artifact is a workflow built on top of a recipe that includes a few data cleaning steps and a call to a logistic regression model.
- The datasets used for scoring represent raw data that would first be run through a batch scoring job. A sample of the scoring job's output is provided in the `output_action_sample.json` file.
- The datasets for metrics represent data to which the predictions from a scoring job have been appended. The columns are renamed to be compliant with MOC: `label` is renamed to `label_value`, and the prediction column is named `score`.
- For a scoring job, use one of the scoring datasets, such as `df_sample.json`. The output is a JSON string object that has the original `label` for each input row. `output_action_sample.json` is the output of the scoring job run on the `df_sample.json` file.
- For a metrics job, use one of the metrics datasets, such as `df_sample_scored.json`. The output is a list of the relevant metrics (F1 score, accuracy, sensitivity, specificity, precision) for the classification model. `output_metrics_sample.json` is the output of the metrics job run on the `df_sample_scored.json` file.
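The metrics named for the metrics job can all be derived from a confusion matrix over the `label_value` and `score` columns. The following is a minimal base R sketch, not the repository's actual metrics code: the function name `classification_metrics` and the choice of "Default" as the positive class are assumptions made for illustration.

```r
# Hypothetical helper: not taken from the repository's R source.
# Assumes "Default" is the positive class; pass a different `positive`
# value if the project treats "Pay Off" as the positive class.
classification_metrics <- function(label_value, score, positive = "Default") {
  tp <- sum(score == positive & label_value == positive)
  fp <- sum(score == positive & label_value != positive)
  fn <- sum(score != positive & label_value == positive)
  tn <- sum(score != positive & label_value != positive)

  precision   <- tp / (tp + fp)
  sensitivity <- tp / (tp + fn)  # also called recall
  specificity <- tn / (tn + fp)
  accuracy    <- (tp + tn) / (tp + fp + fn + tn)
  f1          <- 2 * precision * sensitivity / (precision + sensitivity)

  list(f1_score = f1, accuracy = accuracy, sensitivity = sensitivity,
       specificity = specificity, precision = precision)
}

# Toy vectors for illustration only (not drawn from the sample files).
actual    <- c("Default", "Default", "Pay Off", "Pay Off", "Pay Off")
predicted <- c("Default", "Pay Off", "Pay Off", "Pay Off", "Default")
m <- classification_metrics(actual, predicted)
```

If a class never appears in the predictions or the labels, some denominators are zero and the corresponding metrics come out as `NaN`, so a production version would likely need to guard against that case.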
The input data to the scoring job is `df_sample.json`, which is a JSONS file (one JSON record per line). Here is a sample record:
{"id":20,"duration_months":9,"credit_amount":2134,"installment_rate":4,"present_residence_since":4,"age_years":48,"number_existing_credits":3,"checking_status":"A14","credit_history":"A34","purpose":"A40","savings_account":"A61","present_employment_since":"A73","debtors_guarantors":"A101","property":"A123","installment_plans":"A143","housing":"A152","job":"A173","number_people_liable":1,"telephone":"A192","foreign_worker":"A201","gender":"male","label":"Pay Off"}
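Because each line is a standalone JSON record, a file in this format can be parsed line by line into a data frame. A minimal sketch, assuming the `jsonlite` package is installed (the source does not say which JSON library the project uses). The inline record is the one above, trimmed to a few fields to keep the example short.

```r
library(jsonlite)  # assumption: jsonlite is available

# One JSON record per line; trimmed copy of the record shown above.
jsons_lines <- c(
  '{"id":20,"duration_months":9,"credit_amount":2134,"age_years":48,"gender":"male","label":"Pay Off"}'
)

# stream_in() parses line-delimited JSON from a connection into a data frame.
# For the real file: stream_in(file("df_sample.json"), verbose = FALSE)
con <- textConnection(jsons_lines)
df <- stream_in(con, verbose = FALSE)
close(con)

str(df)
```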
The input data to the metrics job is `df_sample_scored.json`, which is a JSONS file (one JSON record per line). Here are two sample records:
{"label_value":"Default","score":"Pay Off","id":1,"duration_months":48,"credit_amount":5951,"installment_rate":2,"present_residence_since":2,"age_years":22,"number_existing_credits":1,"checking_status":"A12","credit_history":"A32","purpose":"A43","savings_account":"A61","present_employment_since":"A73","debtors_guarantors":"A101","property":"A121","installment_plans":"A143","housing":"A152","job":"A173","number_people_liable":1,"telephone":"A191","foreign_worker":"A201","gender":"female"}
{"label_value":"Pay Off","score":"Pay Off","id":20,"duration_months":9,"credit_amount":2134,"installment_rate":4,"present_residence_since":4,"age_years":48,"number_existing_credits":3,"checking_status":"A14","credit_history":"A34","purpose":"A40","savings_account":"A61","present_employment_since":"A73","debtors_guarantors":"A101","property":"A123","installment_plans":"A143","housing":"A152","job":"A173","number_people_liable":1,"telephone":"A192","foreign_worker":"A201","gender":"male"}
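Records in this shape can be produced from scoring results by renaming `label` to `label_value` and attaching the predictions as `score`. A minimal base R sketch using the `id`, label, and prediction values of the two records above. The variable names `scoring_input` and `predictions` are hypothetical, and the remaining columns are omitted for brevity.

```r
# Trimmed scoring input: only `id` and `label` from the two records above.
scoring_input <- data.frame(
  id = c(1, 20),
  label = c("Default", "Pay Off"),
  stringsAsFactors = FALSE
)

# Hypothetical vector of model predictions, one per input row.
predictions <- c("Pay Off", "Pay Off")

# Rename `label` to `label_value` and add the prediction as `score`.
scored <- scoring_input
names(scored)[names(scored) == "label"] <- "label_value"
scored$score <- predictions

# Share of rows where the prediction matches the actual label.
agreement <- mean(scored$score == scored$label_value)
```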