
CS 5542 BigData Lab Report #07

Amy Lin edited this page Mar 29, 2017 · 1 revision

TensorFlow PROGRAMMING - Linear Regression + Training/Testing Cost

[ QUESTION ]

Implement linear regression for a dataset that was not covered in class.

[ IMPLEMENTATION ]

  • Set up training datasets x & y.
  • Create symbolic variables ( Inputs for tf Graph ).
  • Create a shared variable for the weight matrix ( Set Model Weights & Bias ).
  • Prediction function ( Build a Linear Model ).
  • Cost & Optimizer - Reduce error to improve accuracy.
  • Calculate Mean Squared Error.
  • Gradient Descent Optimizer - Build an optimizer to minimize the cost and fit a line to the data.
  • Initialize the session & local variables.
  • Launch the graph in a session.
  • Initialize the variables.
  • Train the model by passing the training data through the feed dictionary in Python.
  • Plot the Training Cost graph.
  • Input the test data for the Mean Squared Loss comparison.
  • Calculate the Testing Cost and the Absolute Mean Squared Loss Difference.
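
The core of the steps above (linear model, mean squared error cost, gradient descent updates, training cost history) can be sketched as follows. This is a minimal illustration in plain Python rather than the TensorFlow graph used in the lab, so it runs without any dependencies. The data is synthetic, and the function names, learning rate, and epoch count are assumptions chosen for the example, not values from the report.

```python
import random


def predict(w, b, x):
    """Linear model: y_hat = w * x + b."""
    return w * x + b


def mse(w, b, xs, ys):
    """Mean squared error of the model over a dataset."""
    return sum((predict(w, b, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def train(xs, ys, learning_rate=0.05, epochs=1000):
    """Fit w and b by batch gradient descent on the MSE cost."""
    w, b = 0.0, 0.0  # model weight and bias, both starting at zero
    n = len(xs)
    costs = []  # training cost after each epoch, for plotting
    for _ in range(epochs):
        errors = [predict(w, b, x) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the MSE with respect to w and b.
        grad_w = sum(2 * e * x for e, x in zip(errors, xs)) / n
        grad_b = sum(2 * e for e in errors) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
        costs.append(mse(w, b, xs, ys))
    return w, b, costs


# Synthetic data (hypothetical, NOT the report's dataset): y = 3x + 2 plus noise.
rng = random.Random(0)
train_x = [i / 10 for i in range(50)]
train_y = [3.0 * x + 2.0 + rng.gauss(0, 0.1) for x in train_x]

w, b, costs = train(train_x, train_y)
print("w =", round(w, 3), "b =", round(b, 3), "final cost =", round(costs[-1], 4))
```

The `costs` list plays the role of the Training Cost plot: it should fall steadily as the line is fitted to the data.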

< DATASET >

Forest File from UCI Machine Learning Repository

< APPROACH >

I split the dataset into 85% training data and 15% testing data, then ran linear regression with gradient descent optimization using TensorFlow in Python. The data is still noisy, so the Absolute Mean Squared Loss Difference between training and testing remains high. One way to improve accuracy is to retrain the model and adjust it based on the testing results. The training and testing plots show that the fit is improving (the error is lower than for the raw data), but it is not yet good enough. More training is needed.
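
The 85/15 split and the training/testing cost comparison can be sketched like this. It is only an illustration under stated assumptions: the data is synthetic, the helper names (`split`, `fit_line`, `mse`) are hypothetical, and the model is fitted with closed-form least squares as a stand-in for the TensorFlow gradient descent training used in the lab.

```python
import random


def split(pairs, train_fraction=0.85, seed=0):
    """Shuffle (x, y) pairs and split them into training and testing sets."""
    shuffled = pairs[:]
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def fit_line(pairs):
    """Closed-form least squares for y = w * x + b (stand-in for training)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    w = sxy / sxx
    return w, mean_y - w * mean_x


def mse(w, b, pairs):
    """Mean squared error of the fitted line over a set of (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in pairs) / len(pairs)


# Synthetic noisy data (hypothetical, NOT the report's dataset).
rng = random.Random(1)
data = [(i / 10, 0.5 * (i / 10) + 1.0 + rng.gauss(0, 0.3)) for i in range(100)]

train_set, test_set = split(data)
w, b = fit_line(train_set)

training_cost = mse(w, b, train_set)
testing_cost = mse(w, b, test_set)
cost_gap = abs(training_cost - testing_cost)  # absolute mean squared loss difference
print("training:", training_cost, "testing:", testing_cost, "gap:", cost_gap)
```

A small gap suggests the model generalizes about as well as it fits the training data. A large gap, as reported here, points to noise or to a model that needs further tuning.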

[ RESULTS ]

  • Training Cost
  • Training Result
  • Testing Result
  • TensorFlow Final Result
