
CS 5542 BigData Lab Report #03

Amy Lin edited this page Mar 29, 2017 · 2 revisions

SPARK PROGRAMMING - Machine Learning Tasks : Chimpanzee's Daily Activities


[ QUESTION ]

Q1. Build a Linear Regression model on two selected parameters of a chimpanzee's daily movement, activities and interactions. Define your own data set.


[ IMPLEMENTATION ]

  • Import the needed libraries and set up access to Spark.
  • Create an object called "LinearRegressionwithSGD" and define main with a String array parameter.
  • Initialize Spark -> setMaster and AppName.
  • Turn off the Info logger for the console -> Reduce Spark's runtime output.
  • Read in data from the user-defined data set "Chimpanzee_data.data". -> The data set covers only 6 chimpanzees (Chimpanzee Label, Activity Level, Location X-Axis, Location Y-Axis)
  • Parse the chimpanzee data.
  • Split the data randomly into 95% training and 5% testing data.
  • Build the model from the training data, the number of iterations and the step size. -> val model = LinearRegressionWithSGD.train(training, numIterations, stepSize)
  • Evaluate the model on the training and testing samples and calculate the training mean squared error.
  • Save the Linear Regression Model. -> model.save(sc, "data\\LinearRegressionChimpanzees")
  • Load the model. -> val sameModel = LinearRegressionModel.load(sc, "data\\LinearRegressionChimpanzees")
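
The split, train and evaluate steps above can be sketched without Spark. The following is a minimal plain-Java illustration of gradient-descent linear regression, not MLlib's implementation: the class name, the row layout {label, features...}, the constant step size, the absence of an intercept term and all data values are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class SgdLinearRegressionSketch {

    /** Randomly assigns rows to a training list (index 0) and a testing list (index 1). */
    static List<List<double[]>> randomSplit(double[][] rows, double trainFraction, long seed) {
        Random rng = new Random(seed);
        List<double[]> training = new ArrayList<>();
        List<double[]> testing = new ArrayList<>();
        for (double[] row : rows) {
            if (rng.nextDouble() < trainFraction) {
                training.add(row);
            } else {
                testing.add(row);
            }
        }
        List<List<double[]>> parts = new ArrayList<>();
        parts.add(training);
        parts.add(testing);
        return parts;
    }

    /** Prediction for one row; row[0] is the label, row[1..] are the features. */
    static double predict(double[] weights, double[] row) {
        double sum = 0.0;
        for (int j = 0; j < weights.length; j++) {
            sum += weights[j] * row[j + 1];
        }
        return sum;
    }

    /** Full-batch gradient descent on the squared error, starting from zero weights. */
    static double[] train(List<double[]> data, int numIterations, double stepSize) {
        int numFeatures = data.get(0).length - 1;
        double[] weights = new double[numFeatures];
        for (int iter = 0; iter < numIterations; iter++) {
            double[] gradient = new double[numFeatures];
            for (double[] row : data) {
                double error = predict(weights, row) - row[0];
                for (int j = 0; j < numFeatures; j++) {
                    gradient[j] += error * row[j + 1];
                }
            }
            for (int j = 0; j < numFeatures; j++) {
                weights[j] -= stepSize * gradient[j] / data.size();
            }
        }
        return weights;
    }

    /** Mean of the squared differences between predictions and labels. */
    static double meanSquaredError(double[] weights, List<double[]> data) {
        double total = 0.0;
        for (double[] row : data) {
            double error = predict(weights, row) - row[0];
            total += error * error;
        }
        return total / data.size();
    }

    public static void main(String[] args) {
        // Hypothetical rows: {label, activity level, location x, location y}.
        double[][] rows = {
            {1, 0.2, 0.1, 0.3}, {2, 0.4, 0.3, 0.2}, {3, 0.5, 0.6, 0.4},
            {4, 0.7, 0.5, 0.8}, {5, 0.8, 0.9, 0.6}, {6, 0.9, 0.8, 0.9}
        };
        List<List<double[]>> parts = randomSplit(rows, 0.95, 11L);
        double[] weights = train(parts.get(0), 100, 0.1);
        System.out.println("training MSE = " + meanSquaredError(weights, parts.get(0)));
        System.out.println("testing rows = " + parts.get(1).size());
    }
}
```

With only 6 rows, a 5% testing fraction will often leave the testing list empty, so an error computed on it should be guarded against division by zero.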

I also include a second way of doing Linear Regression, using a Scala class. Each Person instance represents one chimpanzee, for a total of 6. The data set is the same as in the approach above: Person("Chimpanzee Label", Activity Level, Location X-Axis, Location Y-Axis)
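
The page does not show this second version, so the following is only one possible reading of it, written in plain Java rather than Scala: a small Person type with the four fields listed above, plus a closed-form least-squares fit between two selected fields. The choice of Activity Level as the dependent variable and Location X-Axis as the predictor, the class names and the fit method are all assumptions.

```java
final class PersonRegressionSketch {

    /** One chimpanzee record with the four fields named in the report. */
    static final class Person {
        final String label;
        final double activityLevel;
        final double locationX;
        final double locationY;

        Person(String label, double activityLevel, double locationX, double locationY) {
            this.label = label;
            this.activityLevel = activityLevel;
            this.locationX = locationX;
            this.locationY = locationY;
        }
    }

    /** Returns {slope, intercept} for activityLevel = slope * locationX + intercept. */
    static double[] fit(Person[] people) {
        double meanX = 0.0;
        double meanY = 0.0;
        for (Person p : people) {
            meanX += p.locationX;
            meanY += p.activityLevel;
        }
        meanX /= people.length;
        meanY /= people.length;

        double covariance = 0.0;
        double variance = 0.0;
        for (Person p : people) {
            double dx = p.locationX - meanX;
            covariance += dx * (p.activityLevel - meanY);
            variance += dx * dx;
        }
        double slope = covariance / variance;
        return new double[] {slope, meanY - slope * meanX};
    }

    public static void main(String[] args) {
        // Hypothetical values, not taken from the report's data set.
        Person[] people = {
            new Person("Chimp 1", 4.0, 1.0, 2.0),
            new Person("Chimp 2", 7.0, 2.0, 1.0),
            new Person("Chimp 3", 10.0, 3.0, 5.0)
        };
        double[] line = fit(people);
        System.out.println("slope = " + line[0] + ", intercept = " + line[1]);
    }
}
```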


Q2. Implement K-Means Clustering to cluster the chimpanzee's activities. Define your own data set.

[ IMPLEMENTATION ]

  • Import the needed libraries and set up access to Spark.
  • Create an object called "kMeansClustering" and define main with a String array parameter.
  • Initialize Spark -> setMaster and AppName.
  • Turn off the Info logger for the console -> Reduce Spark's runtime output.
  • Load the data set from "chimapanzee_KmreanData.txt" (Activity Level, Location X-Axis, Location Y-Axis) and parse the data.
  • Set the number of clusters and iterations, then cluster the data into 2 classes using K-Means. -> val clusters = KMeans.train(parsedData, numClusters, numIterations)
  • Evaluate the clustering by computing the Within Set Sum of Squared Errors (WSSSE). -> val WSSSE = clusters.computeCost(parsedData)
  • Predict the cluster of each training point. -> clusters.predict(parsedData).zip(parsedData).foreach(f=>println(f._2,f._1))
  • Save the model as "KMeansModelChimpanzee" and load it again. -> clusters.save(sc, "data/KMeansModelChimpanzee")
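
What the train, computeCost and predict steps above compute can be sketched in plain Java. This is a minimal Lloyd-style K-Means, not MLlib's implementation: the class name, the seeding from the first k points and the sample points are assumptions made for the example.

```java
final class KMeansSketch {

    static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int d = 0; d < a.length; d++) {
            double diff = a[d] - b[d];
            sum += diff * diff;
        }
        return sum;
    }

    /** Index of the center closest to the point. */
    static int predict(double[][] centers, double[] point) {
        int best = 0;
        for (int c = 1; c < centers.length; c++) {
            if (squaredDistance(centers[c], point) < squaredDistance(centers[best], point)) {
                best = c;
            }
        }
        return best;
    }

    /** Alternates between assigning points to centers and recomputing the centers. */
    static double[][] train(double[][] points, int numClusters, int numIterations) {
        // Assumption: the first numClusters points seed the centers.
        double[][] centers = new double[numClusters][];
        for (int c = 0; c < numClusters; c++) {
            centers[c] = points[c].clone();
        }
        int dims = points[0].length;
        for (int iter = 0; iter < numIterations; iter++) {
            double[][] sums = new double[numClusters][dims];
            int[] counts = new int[numClusters];
            for (double[] p : points) {
                int c = predict(centers, p);
                counts[c]++;
                for (int d = 0; d < dims; d++) {
                    sums[c][d] += p[d];
                }
            }
            for (int c = 0; c < numClusters; c++) {
                if (counts[c] == 0) {
                    continue; // keep an empty cluster's center where it is
                }
                for (int d = 0; d < dims; d++) {
                    centers[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centers;
    }

    /** Within Set Sum of Squared Errors: squared distance of each point to its center. */
    static double computeCost(double[][] centers, double[][] points) {
        double total = 0.0;
        for (double[] p : points) {
            total += squaredDistance(centers[predict(centers, p)], p);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical rows: {activity level, location x, location y}.
        double[][] points = {{1, 1, 1}, {9, 9, 9}, {1, 2, 1}, {9, 8, 9}};
        double[][] centers = train(points, 2, 20);
        System.out.println("WSSSE = " + computeCost(centers, points));
        for (double[] p : points) {
            System.out.println(java.util.Arrays.toString(p) + " -> cluster " + predict(centers, p));
        }
    }
}
```

A lower WSSSE means tighter clusters, but it always decreases as the number of clusters grows, so it is most useful for comparing runs with the same number of clusters.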

Clarifai API - Video Annotation

[ QUESTION ]

Build a simple application that summarizes a video by using the Clarifai API. Use the OpenIMAJ library to extract the key-frame images that are sent to the Clarifai API.


[ IMPLEMENTATION : KeyFrameDetection.java ]

  • Import the needed libraries and set up access to Spark, the Clarifai API and OpenIMAJ.
  • Create public class KeyFrameDetection -> Get all the frames from the video.
  • Frames Extraction: Iterate over the video frames, with error handling for writing out the images. -> public static void Frames(String path)
  • Go through all frames and pick out the main frames in this video. -> public static void MainFrames() then perform:
  • Shot Transition Detection: Compare the SIFT features of adjacent frames; if the number of common features is below the threshold, treat it as a shot transition.
  • Find the number of key points.
  • Collect all the main points together and write the results to the mainframes file.
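
The threshold rule in the shot transition step can be illustrated in isolation. The sketch below assumes the number of common SIFT features between each pair of adjacent frames has already been computed (the report does this with OpenIMAJ, which is not used here), and it assumes that the first frame of each shot is taken as the main frame. The class name, method name and sample counts are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class ShotTransitionSketch {

    /**
     * commonFeatures[i] is the number of features shared by frame i and frame i + 1.
     * Returns the indices of the frames that start a new shot.
     */
    static List<Integer> mainFrames(int[] commonFeatures, int threshold) {
        List<Integer> result = new ArrayList<>();
        result.add(0); // the first frame opens the first shot
        for (int i = 0; i < commonFeatures.length; i++) {
            // Few shared features means the scene changed between frame i and frame i + 1.
            if (commonFeatures[i] < threshold) {
                result.add(i + 1);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical match counts for a six-frame clip.
        int[] commonFeatures = {50, 48, 3, 40, 2};
        System.out.println(mainFrames(commonFeatures, 10));
    }
}
```

The threshold trades missed transitions against false ones: a higher value reports more shot changes, a lower value fewer.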

[ IMPLEMENTATION : ImageAnnotation.java ]

  • Connect to the Clarifai API server, using your own API key and access code to get the token.
  • Access the mainframes file.
  • Scan each image in detail and predict what it is likely to contain.
  • Print the possible contents of each image to the console.
  • Write the possible contents onto the image.

[ EXTRA EXAMPLE : Image Annotation ] -- A simple image annotation example. Use the Clarifai API to predict the contents of the image "animal.jpg". Write all the predicted information onto that image and display it to the user.
