Project 1: Taylor Nelms #13

Open · wants to merge 43 commits into base: master

Changes from all commits · 43 commits
d07b77d
First change to README (test)
taylornelms15 Aug 29, 2019
8f89fd0
naive additions, sans build environment
taylornelms15 Sep 1, 2019
0ae6d43
No-test implementations of some of the flocking functions
taylornelms15 Sep 2, 2019
62cfc59
test-less implementation of 1.2 secton
taylornelms15 Sep 2, 2019
2255089
Naive implementation looking fine, some notes added to README
taylornelms15 Sep 5, 2019
fb2dce8
Putting some progress into the repo out of habit; nearly have the fir…
taylornelms15 Sep 6, 2019
b237e5f
frustrating near-lack of progress
taylornelms15 Sep 6, 2019
91d8603
Finally got through 2.1
taylornelms15 Sep 6, 2019
8d84492
bugfixes on 2.1 discovered when toying with parameters
taylornelms15 Sep 6, 2019
5606081
Part 2.3 working
taylornelms15 Sep 6, 2019
dfc75cd
ready to start the analysis and writeup
taylornelms15 Sep 6, 2019
46c7652
Update README.md
taylornelms15 Sep 8, 2019
3bb47aa
Update README.md
taylornelms15 Sep 8, 2019
697ba91
some timing tools
taylornelms15 Sep 8, 2019
bee8de8
Result data
taylornelms15 Sep 8, 2019
f20a9a4
Merge branch 'master' of github.com:taylornelms15/Project1-CUDA-Flocking
taylornelms15 Sep 8, 2019
cf86e60
animated gifs at top
taylornelms15 Sep 8, 2019
2af67fb
Update README.md
taylornelms15 Sep 8, 2019
da759b1
Update README.md
taylornelms15 Sep 8, 2019
23ef140
Update README.md
taylornelms15 Sep 8, 2019
6e09833
Update README.md
taylornelms15 Sep 8, 2019
419e44d
Update README.md
taylornelms15 Sep 8, 2019
8b2bb1f
more data
taylornelms15 Sep 8, 2019
b66168d
Merge branch 'master' of github.com:taylornelms15/Project1-CUDA-Flocking
taylornelms15 Sep 8, 2019
22d072a
Update README.md
taylornelms15 Sep 8, 2019
b162692
more images
taylornelms15 Sep 8, 2019
d39c6af
Update README.md
taylornelms15 Sep 8, 2019
1de15df
images
taylornelms15 Sep 8, 2019
f94f9b5
imagE
taylornelms15 Sep 8, 2019
afbfa8f
Bad graph for fun
taylornelms15 Sep 8, 2019
c191c09
README analysis
taylornelms15 Sep 8, 2019
d28d39d
graphing info
taylornelms15 Sep 10, 2019
dd4463b
graphing info
taylornelms15 Sep 10, 2019
bc597ca
graphing info
taylornelms15 Sep 10, 2019
06bb6ee
graphing info
taylornelms15 Sep 10, 2019
0d5e513
graphing info
taylornelms15 Sep 10, 2019
12d3893
graphing info
taylornelms15 Sep 10, 2019
c5c02fe
graphing info
taylornelms15 Sep 10, 2019
38f223d
graphing info
taylornelms15 Sep 10, 2019
a5dc05f
Update GRAPHING.md
taylornelms15 Sep 10, 2019
773ed30
graphing info
taylornelms15 Sep 10, 2019
5e19218
graphing info
taylornelms15 Sep 10, 2019
9789726
graphing info
taylornelms15 Sep 10, 2019
120 changes: 120 additions & 0 deletions GRAPHING.md
GRAPHING WITH PYTHON
====================

## Files

The Python script may be found [here](outputData/csvReader.py). The other relevant code is in the `main` file within `src`; if anything here has gaps, or you're curious about the support structure behind some of this functionality, feel free to look there.

## Recording the CSV file

After collecting the data across a number of iterations into a `std::vector` of structs called `eventRecords`, I looped through each record to extract the iteration number, the time elapsed over that block of frames, and the total simulation time up to that point:

```C++
void writeTime(const char* fileName) {
    FILE* of = fopen(fileName, "w");
    if (!of) return;//bail out if the output file could not be opened

    for (auto& record : eventRecords) {
        //record.time holds the elapsed milliseconds across the last TIMEKEEPING_FRAMESIZE frames
        double millisPerFrame = record.time / TIMEKEEPING_FRAMESIZE;
        millisPerFrame /= 1000.0;//seconds per frame
        double fps = 1.0 / millisPerFrame;
        double seconds = record.totalTime / 1000.0;
        fprintf(of, "%d,%0.3f,%f\n", record.frameNo, seconds, fps);
    }//for

    fclose(of);
}//writeTime
```

The loop writes each record straight into the CSV: one row per record, with rows separated by newlines and columns separated by commas.
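
For reference, the first few rows of one of the recorded files, [CoherentGrid_HighDensity_128.csv](outputData/CoherentGrid_HighDensity_128.csv), look like this (frame number, elapsed seconds, then FPS over the last block of frames):

```
300,0.698,429.799427
600,1.326,477.707006
900,1.976,462.249615
```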

For my particular code, the filenames were hard-coded, like the following:

```C++
const char timingFileName[] = "../outputData/CoherentGrid_HighDensity_128.csv";
```

## Using the Data

### Invoking the Python script

The script I wrote takes at least two arguments: the first is the title for the graph (and, by extension, the name of the image to save), and every argument after that is the name of a CSV file in the format recorded above.

For example, a way to use this script would be to call `python csvReader.py "Coherent Grid, High Density, All Block Sizes" CoherentGrid_HighDensity_*.csv`

The main function, along with the imports the snippets here rely on, is as follows:

```Python
import csv
import sys

import matplotlib.pyplot as plt
import numpy as np


def main():
    # argv[0] is the script name and argv[1] is the graph title;
    # every argument after that is treated as a CSV file
    if len(sys.argv) < 3:
        print("Please input a title and file names")
        exit(0)

    resultSets = []
    for fileName in sys.argv[2:]:
        resultSets.append((fileName, readCSV(fileName)))

    makeGraphs(resultSets, sys.argv[1])
```

It takes each CSV file, transforms it into a NumPy array, and pairs that array in a tuple with the name of the file the data came from. It then hands those sets off to a function that graphs them and saves out the resulting graph (or, optionally, displays it onscreen).

### Reading in the CSV data

I made use of the `csv` module, but the file format is so simple that you could quickly write your own parser. The function was the following:

```Python
def readCSV(filename):
    """
    takes in a filename
    returns a numpy array, with the first row as the
    timestamp in seconds, and the second row
    as the fps across the last time block
    """
    results = []
    with open(filename) as csv_file:
        reader = csv.reader(csv_file, delimiter=',')
        for line in reader:
            timestamp = float(line[1])
            fps = float(line[2])
            results.append([timestamp, fps])

    return np.array(results).T
```

If you're not familiar with the `numpy` library, it's a handy library for making array operations more efficient, while also making some simple things entirely too complicated. In this case, the `.T` within the return statement transposes the 2D array, making our two columns into two rows, which will be essential for displaying them nicely.
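
If the transpose feels abstract, here is a tiny self-contained illustration (the numbers are made up for the example, not taken from the real data):

```Python
import numpy as np

# each inner list is one (timestamp, fps) sample, as appended in readCSV
results = np.array([[0.7, 430.0], [1.3, 478.0], [2.0, 462.0]])

print(results.shape)    # (3, 2): three samples, two columns
print(results.T.shape)  # (2, 3): row 0 is all the timestamps, row 1 is all the fps values
```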

### Graphing the data

I used `matplotlib` to graph the data. It has a billion fun features, along with occasionally frustrating documentation, meaning all of my matplotlib code is a hacked-together mess of snippets stolen from forum posts. That said, the goals here were relatively simple: I wanted to take a bunch of 2D arrays representing pairs of `(timestamp, fps)` data points and graph those series onto a line plot.

The code was as follows:

```Python
def makeGraphs(resultSets, title):
    """
    Displays the resultant data sets, along with a given title
    """
    fig, ax = plt.subplots(1)
    for filename, data in resultSets:
        # cleanFileName tidies the filename into a legend label; it isn't shown in this excerpt
        ax.plot(data[0], data[1], label=cleanFileName(filename))

    ax.legend()
    plt.xlabel("Time (seconds)")
    plt.ylabel("Ticks/Frames per Second")

    fig.suptitle(title)
    fig.set_size_inches(10, 6)

    #plt.show() #uncomment this to display the graph on your screen
    filePath = makeSavePath(title)  # makeSavePath builds the output image path; also not shown here
    plt.savefig(filePath)
```

The core functionality is the `plot` call: you provide a series of `x` data and a series of `y` data, along with a handful of other optional arguments, such as a label for the series. I did this for each set of data, stuck some labels onto the plot, and then saved it all out to an image file.
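
The snippet above also calls two small helpers, `cleanFileName` and `makeSavePath`, that aren't shown in this excerpt. A minimal sketch of what they might look like, purely as an assumption for illustration rather than the script's actual implementation:

```Python
import os

def cleanFileName(filename):
    # assumption: strip the directory and extension so the legend shows just the run name
    return os.path.splitext(os.path.basename(filename))[0]

def makeSavePath(title):
    # assumption: save the figure into the images folder, named after the graph title
    return os.path.join("images", title + ".png")
```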

Happy coding!
68 changes: 62 additions & 6 deletions README.md
**University of Pennsylvania, CIS 565: GPU Programming and Architecture,
Project 1 - Flocking**

* Taylor Nelms
* [LinkedIn](https://www.linkedin.com/in/taylor-k-7b2110191/), [twitter](https://twitter.com/nelms_taylor)
* Tested on: Windows 10, Intel i3 Coffee Lake 4-core 3.6GHz processor, 16GB RAM, NVIDIA GeForce GTX 1650 4GB

## Results
![](images/mLo_dMed_gMed.gif)
*No Grid used for implementation*
![](images/mMed_dMed_gMed.gif)
*Uniform grid used within implementation*
![](images/mHi_dMed_gMed.gif)
*Coherent grid used within implementation*

## Analysis

### Implementation Strategy

Unsurprisingly, the grid implementations ended up significantly more efficient than the naive implementation. For runs with 5000 boids and a block size of 128, tracking FPS over a 45-ish-second run yielded the following results:

![](images/All&#32;Grids,&#32;Medium&#32;Density,&#32;Block&#32;Size&#32;128.png)

There are a few things to unpack here. First, there is the spike in the initial framerate of the naive implementation. Frankly, I have no idea why it exists; I was taking data points every 300 ticks, so I can't imagine it being some fluke of initial set-up taking less time. In all honesty, I would need to do significantly more debugging to figure it out.

Of course, the more interesting behavior lies within the meat of the simulation. The grid-based solutions performed better overall, with a slight further improvement for the coherent grid over the uniform grid. Algorithmically, this makes sense: the grid-based approaches cut the per-boid work on the GPU from order `O(N)` to `O(n)`, where `n` is the number of boids within the grid-neighborhood of each boid. (On a CPU, the naive approach would run in time `O(N^2)`, while the grid approaches would run in time `O(Nn)`.)
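
To make the `O(N)` versus `O(n)` point concrete, here is a plain CPU-side Python sketch of the two neighbor searches (this is not the project's CUDA code; the cell width of one neighborhood radius, which makes a 3×3×3 block of cells sufficient, is an assumption of the sketch):

```Python
import math
import random
from collections import defaultdict

RADIUS = 5.0   # assumed neighborhood distance for the sketch
CELL = RADIUS  # assumed grid cell width of one neighborhood radius

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def neighbors_naive(boid, boids):
    # naive approach: every boid checks every other boid, O(N) work per boid
    return [b for b in boids if b is not boid and dist(boid, b) < RADIUS]

def build_grid(boids):
    # bucket each boid by the integer coordinates of its grid cell
    grid = defaultdict(list)
    for b in boids:
        grid[tuple(int(c // CELL) for c in b)].append(b)
    return grid

def neighbors_grid(boid, grid):
    # grid approach: only the 27 surrounding cells are searched, O(n) work per boid
    cx, cy, cz = (int(c // CELL) for c in boid)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for other in grid[(cx + dx, cy + dy, cz + dz)]:
                    if other is not boid and dist(boid, other) < RADIUS:
                        found.append(other)
    return found

boids = [tuple(random.uniform(0.0, 100.0) for _ in range(3)) for _ in range(5000)]
grid = build_grid(boids)
# both searches find the same neighbors; the grid version just checks far fewer candidates
assert sorted(neighbors_naive(boids[0], boids)) == sorted(neighbors_grid(boids[0], grid))
```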

For another example, let's look at each of the models with a higher density of boids; in this case, we're operating with 10,000 boids, rather than 5,000, in that same space:

![](images/All&#32;Grids,&#32;High&#32;Density,&#32;Block&#32;Size&#32;128.png)

Another notable feature here is that the framerate drops off over time for the uniform grid, while the coherent grid stays relatively steady. The best I can figure for the drop is that, over time, each boid has more neighbors (as more of the boids settle into flocks), and so the number of data accesses to those neighbors increases. This magnifies the penalty of the boid data being more scattered in the uniform grid implementation, as cached data accesses become less favorable.

### Number of Boids

Unsurprisingly, as the number of boids increases, the execution speed of the simulation decreases. Here are some comparisons for all the models, running with `2000` boids, `5000` boids, and `10000` boids:

![](images/No&#32;Grid,&#32;All&#32;Densities,&#32;Block&#32;Size&#32;128.png) ![](images/Uniform&#32;Grid,&#32;All&#32;Densities,&#32;Block&#32;Size&#32;128.png) ![](images/Coherent&#32;Grid,&#32;All&#32;Densities,&#32;Block&#32;Size&#32;128.png)

As expected, the naive implementation shows a roughly linear relationship between simulation speed and the number of boids. The others have a more complex relationship, but the overall trend is clear, and they seem to scale slightly better than linearly with boid count.

### Block Size

The differences between block sizes were very interesting. I ran a series of simulations with block sizes of `32`, `128`, and `512`. Here are a couple of graphs comparing runs with various block sizes:

![](images/No&#32;Grid,&#32;Medium&#32;Density,&#32;All&#32;Block&#32;Sizes.png)![](images/Coherent&#32;Grid,&#32;Medium&#32;Density,&#32;All&#32;Block&#32;Sizes.png)

Notably, there is not much difference in performance for the naive implementation based on block size. This makes some sense: the operations are so simple and easily parallelizable that it is hard to imagine the various levels of scheduling or memory caching making a significant performance difference.

However, for the grid implementations, block sizes made huge differences in outcome.

The block size of 32 ran the worst. This makes some sense: there must be some overhead in launching a block and getting access to the relevant memory, and given how many blocks get spun up at various points in the simulation step, those penalties add up.
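
For a rough sense of how many blocks get spun up, here is a quick back-of-the-envelope count, assuming one thread per boid and the 5000-boid runs from above:

```Python
import math

boids = 5000
for blockSize in (32, 128, 512):
    blocks = math.ceil(boids / blockSize)
    print(f"block size {blockSize:>3}: {blocks} blocks per one-thread-per-boid kernel launch")

# block size  32: 157 blocks per one-thread-per-boid kernel launch
# block size 128: 40 blocks per one-thread-per-boid kernel launch
# block size 512: 10 blocks per one-thread-per-boid kernel launch
```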

This would imply that a larger block size should improve performance; however, we see performance dip when the block size increases from `128` to `512`. The best explanation I can think of is that, because a block's warps must all finish before a new block can take its place, an entire block could be held up by a few rogue warps. In those cases, a whole section of processing power could sit idle while the scheduler keeps that block running.

### Bonus Graph

Everyone needs a little graph gore in their life every now and then:

![](images/All&#32;Test&#32;Runs.png)

### Miscellaneous Notes

During the naive implementation, I switched my `distBetween` function, which computed the distance between two vectors,
between the `glm::distance` function and a simple `sqrt(xdiff * xdiff + ydiff * ydiff + zdiff * zdiff)`.
Though I would have expected `glm::distance` to be highly optimized in some fashion,
I saw the framerate drop from around 10fps to around 2.5fps in the simulation window when using it.
Binary file added images/All Test Runs.png
Binary file added images/No Grid, All Densities, Block Size 128.png
Binary file added images/mHi_dMed_gMed.gif
Binary file added images/mLo_dMed_gMed.gif
Binary file added images/mMed_dMed_gMed.gif
69 changes: 69 additions & 0 deletions outputData/CoherentGrid_HighDensity_128.csv
300,0.698,429.799427
600,1.326,477.707006
900,1.976,462.249615
1200,2.618,468.750000
1500,3.268,461.538462
1800,3.912,466.562986
2100,4.568,458.015267
2400,5.214,465.116279
2700,5.870,458.015267
3000,6.519,462.962963
3300,7.193,445.765230
3600,7.853,455.235205
3900,8.510,457.317073
4200,9.166,458.015267
4500,9.826,455.235205
4800,10.487,454.545455
5100,11.161,445.765230
5400,11.841,441.826215
5700,12.503,453.857791
6000,13.168,451.807229
6300,13.814,465.116279
6600,14.463,462.962963
6900,15.133,449.101796
7200,15.802,449.101796
7500,16.460,455.927052
7800,17.124,452.488688
8100,17.782,456.621005
8400,18.435,459.418070
8700,19.082,465.116279
9000,19.766,439.238653
9300,20.421,458.715596
9600,21.074,460.122699
9900,21.723,462.962963
10200,22.380,457.317073
10500,23.052,447.093890
10800,23.715,453.857791
11100,24.386,447.761194
11400,25.038,460.122699
11700,25.701,452.488688
12000,26.366,452.488688
12300,27.036,448.430493
12600,27.690,459.418070
12900,28.353,453.172205
13200,29.018,451.807229
13500,29.673,458.015267
13800,30.327,459.418070
14100,30.987,455.235205
14400,31.643,458.015267
14700,32.314,447.093890
15000,32.973,456.621005
15300,33.626,460.122699
15600,34.292,451.127820
15900,34.958,451.127820
16200,35.641,439.882698
16500,36.305,452.488688
16800,36.975,448.430493
17100,37.640,451.807229
17400,38.296,458.715596
17700,38.944,462.962963
18000,39.616,447.093890
18300,40.291,445.765230
18600,40.941,462.249615
18900,41.591,461.538462
19200,42.249,456.621005
19500,42.924,445.103858
19800,43.597,447.093890
20100,44.261,452.488688
20400,44.928,449.775112
20700,45.593,452.488688
64 changes: 64 additions & 0 deletions outputData/CoherentGrid_HighDensity_512.csv
300,0.728,412.087912
600,1.408,441.826215
900,2.087,442.477876
1200,2.779,434.153401
1500,3.502,414.937759
1800,4.204,427.960057
2100,4.907,426.742532
2400,5.594,437.317784
2700,6.288,432.900433
3000,6.988,429.184549
3300,7.685,430.416069
3600,8.400,420.168067
3900,9.119,417.827298
4200,9.820,428.571429
4500,10.515,431.654676
4800,11.220,426.742532
5100,11.929,423.131171
5400,12.630,428.571429
5700,13.330,429.799427
6000,14.014,438.596491
6300,14.727,421.940928
6600,15.431,426.742532
6900,16.120,436.046512
7200,16.801,441.176471
7500,17.476,444.444444
7800,18.184,424.328147
8100,18.884,429.184549
8400,19.583,429.799427
8700,20.297,420.757363
9000,21.007,423.131171
9300,21.713,425.531915
9600,22.431,418.410042
9900,23.131,429.184549
10200,23.840,423.728814
10500,24.537,431.034483
10800,25.239,427.960057
11100,25.958,417.246175
11400,26.672,421.348315
11700,27.366,432.900433
12000,28.066,428.571429
12300,28.764,430.416069
12600,29.476,421.940928
12900,30.146,448.430493
13200,30.857,422.535211
13500,31.548,434.153401
13800,32.258,423.131171
14100,32.947,436.046512
14400,33.643,431.654676
14700,34.342,429.799427
15000,35.020,442.477876
15300,35.734,421.348315
15600,36.460,413.223140
15900,37.155,432.276657
16200,37.882,413.223140
16500,38.586,426.742532
16800,39.296,423.131171
17100,40.014,417.827298
17400,40.721,424.929178
17700,41.443,416.088766
18000,42.145,428.571429
18300,42.837,434.153401
18600,43.552,420.168067
18900,44.277,414.937759
19200,44.986,423.728814